SIMD: Difference between revisions
No edit summary |
No edit summary |
||
| Line 25: | Line 25: | ||
====SSE4a==== | ====SSE4a==== | ||
====SSE4.1==== | ====SSE4.1==== | ||
*Introduced on Penryn | |||
[[File:Dppd.gif|thumb|DPPD instruction dataflow]] | [[File:Dppd.gif|thumb|DPPD instruction dataflow]] | ||
*<tt>dpps</tt> -- dot product of two vectors having four single components each | *<tt>dpps</tt> -- dot product of two vectors having four single components each | ||
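The <tt>dpps</tt> entry above can be illustrated with a small scalar model in C. This is a sketch of the arithmetic only, not the instruction itself: the helper name <tt>dpps_model</tt> is hypothetical, and the mask layout (high four bits of the immediate select which products are summed, low four bits select which result lanes receive the sum) is stated here as an assumption about the encoding and should be checked against the vendor manual. In real code the operation is normally reached through the <tt>_mm_dp_ps</tt> intrinsic from <tt>smmintrin.h</tt>.

```c
#include <stddef.h>

/* Scalar reference model of a 4-lane single-precision dot product with an
 * 8-bit control mask. Hypothetical helper for illustration only.
 *   bits 7..4 of imm8: lane i's product a[i]*b[i] is included if bit (4+i) is set
 *   bits 3..0 of imm8: result lane i receives the sum if bit i is set, else 0
 */
static void dpps_model(const float a[4], const float b[4],
                       unsigned imm8, float out[4])
{
    float p[4];

    /* Conditionally form the four products; masked-out lanes contribute 0. */
    for (size_t i = 0; i < 4; ++i)
        p[i] = (imm8 & (0x10u << i)) ? a[i] * b[i] : 0.0f;

    /* Sum pairwise: (p0 + p1) + (p2 + p3). */
    float sum = (p[0] + p[1]) + (p[2] + p[3]);

    /* Broadcast the sum to the selected result lanes, zero the rest. */
    for (size_t i = 0; i < 4; ++i)
        out[i] = (imm8 & (1u << i)) ? sum : 0.0f;
}
```

With <tt>a = {1, 2, 3, 4}</tt>, <tt>b = {5, 6, 7, 8}</tt> and a mask of <tt>0xF1</tt>, all four products are summed and only lane 0 of the result is written, so the model yields <tt>{70, 0, 0, 0}</tt>.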
| Line 31: | Line 32: | ||
====SSE4.2==== | ====SSE4.2==== | ||
*Introduced on Nehalem | |||
===SSE3=== | ===SSE3 (PNI)=== | ||
*Originally known as Prescott New Instructions, and introduced on P4-Prescott | |||
*<tt>[http://www.intel.com/software/products/documentation/vlin/mergedprojects/analyzer_ec/mergedprojects/reference_olh/mergedProjects/instructions/instruct32_hh/movddup--move_one_double-fp_and_duplicate.htm movddup]</tt> -- move a double from an 8-byte-aligned memory location or the lower half of an XMM register to the upper half, then duplicate the upper half to the lower half | *<tt>[http://www.intel.com/software/products/documentation/vlin/mergedprojects/analyzer_ec/mergedprojects/reference_olh/mergedProjects/instructions/instruct32_hh/movddup--move_one_double-fp_and_duplicate.htm movddup]</tt> -- move a double from an 8-byte-aligned memory location or the lower half of an XMM register to the upper half, then duplicate the upper half to the lower half | ||