SIMD: Difference between revisions

Line 17:

* '''word''': 32-bit two's complement integer

* '''doubleword''', '''dword''': 64-bit two's complement integer

==x86 YMM==

The [http://software.intel.com/en-us/avx/ AVX] (Advanced Vector eXtensions) were introduced on Intel's [[Sandy Bridge]] (2010) and AMD's [[Bulldozer]] (2011). They operate on the 256-bit YMM registers (YMM0..YMM15), which are aliased by the [[SIMD#x86_XMM|XMM]] registers. They are encoded using the [[VEX]] scheme. Support for AVX can be determined via:

* Determine that [[CPUID]].1:ECX.OSXSAVE[bit 27] is set (XGETBV is enabled for application use)

* Issue XGETBV and verify that XFEATURE_ENABLED_MASK[2:1] = 0x3 (XMM state and YMM state are enabled by OS)

* Determine that [[CPUID]].1:ECX.AVX[bit 28] is set (AVX instructions supported)

==x86 XMM==

The Streaming SIMD Extensions operate on the 128-bit XMM registers (XMM0..XMM7 in 32-bit mode, XMM0..XMM15 in 64-bit mode). In its original incarnation on the PIII, execution units (but not registers) were shared with the x87 floating-point architecture. The execution units were separated in the NetBurst microarchitecture. In the Core microarchitecture, the execution engine has been widened for greater SSE throughput.

Line 78:

Line 82:

===Future Directions===

* [http://software.intel.com/en-us/avx/ AVX] (Advanced Vector eXtensions) -- to be introduced on Intel's [[Sandy Bridge]] (2010) and AMD's [[Bulldozer]] (2011), and implemented within the [http://en.wikipedia.org/wiki/VEX_prefix VEX coding scheme]

* The [http://en.wikipedia.org/wiki/FMA_instruction_set FMA instruction set] extension to x86 should hit around 2011, providing floating-point fused multiply-add

** AMD appears to call this [http://en.wikipedia.org/wiki/FMA_instruction_set FMA4], part of what was SSE5

@@ Line 17: / Line 17: @@
 * '''word''': 32-bit two's complement integer
 * '''doubleword''', '''dword''': 64-bit two's complement integer
+==x86 YMM==
+The [http://software.intel.com/en-us/avx/ AVX] (Advanced Vector eXtensions) were introduced on Intel's [[Sandy Bridge]] (2010) and AMD's [[Bulldozer]] (2011). They operate on the 256-bit YMM registers (YMM0..YMM15), which are aliased by the [[SIMD#x86_XMM|XMM]] registers. They are encoded using the [[VEX]] scheme. Support for AVX can be determined via:
+* Determine that [[CPUID]].1:ECX.OSXSAVE[bit 27] is set (XGETBV is enabled for application use)
+* Issue XGETBV and verify that XFEATURE_ENABLED_MASK[2:1] = 0x3 (XMM state and YMM state are enabled by OS)
+* Determine that [[CPUID]].1:ECX.AVX[bit 28] is set (AVX instructions supported)
 ==x86 XMM==
 The Streaming SIMD Extensions operate on the 128-bit XMM registers (XMM0..XMM7 in 32-bit mode, XMM0..XMM15 in 64-bit mode). In its original incarnation on the PIII, execution units (but not registers) were shared with the x87 floating-point architecture. The execution units were separated in the NetBurst microarchitecture. In the Core microarchitecture, the execution engine has been widened for greater SSE throughput.
@@ Line 78: / Line 82: @@
 ===Future Directions===
-* [http://software.intel.com/en-us/avx/ AVX] (Advanced Vector eXtensions) -- to be introduced on Intel's [[Sandy Bridge]] (2010) and AMD's [[Bulldozer]] (2011), and implemented within the [http://en.wikipedia.org/wiki/VEX_prefix VEX coding scheme]
 * The [http://en.wikipedia.org/wiki/FMA_instruction_set FMA instruction set] extension to x86 should hit around 2011, providing floating-point fused multiply-add
 ** AMD appears to call this [http://en.wikipedia.org/wiki/FMA_instruction_set FMA4], part of what was SSE5

Dank (talk | contribs)