Yeah, use of SSE registers does not imply SIMD, since x87 is gone in x86-64, so even scalar FP has to use SSE registers. The asm snippets for v::operator*() in the "Optimization level 1" section use scalar SSE arithmetic only (mulss). (There's some use of movaps to move data around, but it's a stretch to call that SIMD.)
I think the "leverage" sentence you quoted and the "with SIMD taken care of" one shortly after are maybe a bit misleading, since the asm snippets there don't really demonstrate SIMD.
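To make the scalar-vs-packed distinction concrete, here is a minimal sketch. The `v` struct is a hypothetical stand-in for the article's type, and the helper names (`scale_scalar`, `scale_packed`, `check_scale_paths_agree`) are mine, not the article's. The first function is the kind of code that typically compiles to three `mulss` instructions on x86-64; the second uses the real SSE intrinsics from `<xmmintrin.h>` to do one packed `mulps`. What a given compiler actually emits depends on flags and version, so treat the comments as the usual outcome rather than a guarantee.

```cpp
#include <cassert>
#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>  // SSE intrinsics (_mm_mul_ps and friends)
#define HAVE_SSE_SKETCH 1
#endif

// Hypothetical stand-in for the article's vector type.
struct v {
    float x, y, z;
};

// Scalar version: three independent float multiplies. On x86-64 a compiler
// typically emits one mulss per component, using only the low lane of an
// XMM register each time.
inline v scale_scalar(const v& a, float k) {
    return v{a.x * k, a.y * k, a.z * k};
}

// Packed version: a single multiply across four lanes at once (mulps).
inline v scale_packed(const v& a, float k) {
#ifdef HAVE_SSE_SKETCH
    const float in[4] = {a.x, a.y, a.z, 0.0f};  // pad to four lanes
    float out[4];
    _mm_storeu_ps(out, _mm_mul_ps(_mm_loadu_ps(in), _mm_set1_ps(k)));
    return v{out[0], out[1], out[2]};
#else
    return scale_scalar(a, k);  // no SSE available: fall back to scalar code
#endif
}

// Both paths compute the same values; only the instructions differ.
inline void check_scale_paths_agree() {
    const v a{1.5f, 2.0f, -0.25f};
    const v s = scale_scalar(a, 2.0f);
    const v p = scale_packed(a, 2.0f);
    assert(s.x == p.x && s.y == p.y && s.z == p.z);
}
```

Both functions touch XMM registers, but only the second one is SIMD in any meaningful sense.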
> since x87 is gone in x86-64, so even scalar FP has to use SSE registers.
No, it’s still there. What’s actually going on is that all x86-64 CPUs support SSE2, so there is little reason to use x87 in 64-bit code.
(You can use it for 80-bit precision. OTOH, for most purposes, 80-bit precision is actively harmful, and x87 is an incredible mess, so almost no one wants it.)
Exactly. The conversions happen at unexpected and unpredictable times depending on when the compiler needs to spill registers, which has surprising effects.
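A rough way to see why that matters, emulated with explicit types rather than real x87 codegen: the sketch below uses `long double` to play the role of a value kept in an extended-precision register, and a `volatile double` store to play the role of a spill to a 64-bit slot. The function names are made up for illustration, and it assumes `long double` is wider than `double` on your platform (true for x86-64 GCC/Clang, not for MSVC), which is why there is a helper to check.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// True when long double carries more mantissa bits than double.
constexpr bool long_double_is_wider() {
    return std::numeric_limits<long double>::digits >
           std::numeric_limits<double>::digits;
}

// Intermediate stays in extended precision, like a value held in a register.
inline long double sum_kept_extended(double a, double b) {
    long double t = static_cast<long double>(a) + b;
    return t - a;
}

// Intermediate is rounded to double first, like a value spilled to memory.
inline long double sum_spilled_to_double(double a, double b) {
    volatile double t = static_cast<double>(static_cast<long double>(a) + b);
    return static_cast<long double>(t) - a;
}

// Same source-level arithmetic, different answer depending on the "spill".
inline bool spill_changes_result() {
    const double tiny = std::ldexp(1.0, -60);  // 2^-60, too small to survive 1.0 + tiny in double
    assert(sum_spilled_to_double(1.0, tiny) == 0.0L);
    return sum_kept_extended(1.0, tiny) != sum_spilled_to_double(1.0, tiny);
}
```

With real x87 code you don't get to choose which of the two paths you are on; the register allocator does.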