64-bit atomic load on x86 (!AMD64)

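movq xm0, ptr [addr]; // one 64-bit load; addr must be 8-byte aligned
movd eax, xm0;        // eax = low 32 bits
psrldq xm0, 4;        // shift right by 4 bytes: xm0 >>= 32
movd edx, xm0;        // edx = high 32 bits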

A colleague taught me the pattern above. I learned something new.

Yes, I see that in Volume 3, the System Programming Guide, on page 7-3, section 7.1.1, "Guaranteed Atomic Operations," they write

The Pentium 4, Intel Xeon, P6 family, and Pentium processors guarantee that the following additional memory operations will always be carried out atomically:

  • reading or writing a quadword aligned on a 64-bit boundary
  • 16-bit accesses to uncached memory locations that fit within a 32-bit data bus
http://www.exit.com/blog/archives/000361.html
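
For reference, the same load-and-split sequence can be sketched in C++ with SSE2 intrinsics. This is only an illustration under assumptions: the helper name load64_sse2 is made up here, the pointer is assumed to be 8-byte aligned (the condition in the quoted guarantee), and a compiler is not obliged to emit exactly one movq for _mm_loadl_epi64, so the generated code should be checked before relying on atomicity.

```cpp
#include <cassert>
#include <cstdint>
#include <emmintrin.h>  // SSE2 intrinsics

// Hypothetical helper (not from the original post): read 64 bits with a
// single SSE2 load, then split the value into its two 32-bit halves.
// Assumption: p is 8-byte aligned, matching the quoted atomicity condition.
inline std::uint64_t load64_sse2(const std::uint64_t* p) {
    assert(reinterpret_cast<std::uintptr_t>(p) % 8 == 0);

    // Corresponds to: movq xmm0, [addr]
    __m128i v = _mm_loadl_epi64(reinterpret_cast<const __m128i*>(p));

    // Corresponds to: movd eax, xmm0 (low 32 bits)
    std::uint32_t lo = static_cast<std::uint32_t>(_mm_cvtsi128_si32(v));

    // Corresponds to: psrldq xmm0, 4 (shift right by 4 bytes)
    v = _mm_srli_si128(v, 4);

    // Corresponds to: movd edx, xmm0 (high 32 bits)
    std::uint32_t hi = static_cast<std::uint32_t>(_mm_cvtsi128_si32(v));

    return (static_cast<std::uint64_t>(hi) << 32) | lo;
}
```

Calling load64_sse2(&value) on an alignas(8) std::uint64_t returns the full 64-bit value. In portable code, std::atomic<std::uint64_t> would normally be the first choice; the sketch only mirrors the instruction sequence shown at the top.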