SIMD: Difference between revisions

No edit summary
No edit summary
Line 25: Line 25:
====SSE4a====
====SSE4a====
====SSE4.1====
====SSE4.1====
*Introduced on Penryn
[[File:Dppd.gif|thumb|DPPD instruction dataflow]]
[[File:Dppd.gif|thumb|DPPD instruction dataflow]]
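*<tt>dpps</tt> -- dot product of two vectors of four single-precision floating-point components each; an 8-bit immediate selects which products are summed and which destination lanes receive the result
*<tt>dpps</tt> -- dot product of two vectors of four single-precision floating-point components each; an 8-bit immediate selects which products are summed and which destination lanes receive the result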
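The effect of <tt>dpps</tt> can be sketched as a scalar reference model in C. This is only an illustration, not the hardware definition: the function name <tt>dpps_model</tt> is hypothetical, the arrays stand in for XMM registers (index 0 is the lowest lane), and the model adds the products sequentially, so its rounding may differ slightly from the real instruction.

```c
#include <stdint.h>

/* Hypothetical scalar model of "dpps dst, src, imm8".
 * Bits 4..7 of imm8 select which lane products enter the sum;
 * bits 0..3 select which destination lanes receive the sum
 * (unselected destination lanes are set to 0). */
void dpps_model(float dst[4], const float src[4], uint8_t imm8)
{
    float sum = 0.0f;

    /* Conditionally accumulate the four products. */
    for (int i = 0; i < 4; i++) {
        if (imm8 & (0x10u << i))
            sum += dst[i] * src[i];
    }

    /* Broadcast the sum to the selected lanes, zero the rest. */
    for (int i = 0; i < 4; i++)
        dst[i] = (imm8 & (1u << i)) ? sum : 0.0f;
}
```

With an immediate of <tt>0xF1</tt> all four products are summed and the result is written to lane 0 only. In compiled code the instruction is normally reached through the <tt>_mm_dp_ps</tt> intrinsic rather than a loop like this.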
Line 31: Line 32:


====SSE4.2====
====SSE4.2====
*Introduced on Nehalem


===SSE3===
===SSE3 (PNI)===
*Originally known as Prescott New Instructions (PNI); introduced on the Pentium 4 (Prescott)
*<tt>[http://www.intel.com/software/products/documentation/vlin/mergedprojects/analyzer_ec/mergedprojects/reference_olh/mergedProjects/instructions/instruct32_hh/movddup--move_one_double-fp_and_duplicate.htm movddup]</tt> -- load a double from a 64-bit memory location or from the lower half of an XMM register, and write it to both the lower and upper halves of the destination XMM register
*<tt>[http://www.intel.com/software/products/documentation/vlin/mergedprojects/analyzer_ec/mergedprojects/reference_olh/mergedProjects/instructions/instruct32_hh/movddup--move_one_double-fp_and_duplicate.htm movddup]</tt> -- load a double from a 64-bit memory location or from the lower half of an XMM register, and write it to both the lower and upper halves of the destination XMM register
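As a rough illustration of what <tt>movddup</tt> does, the following C function models the instruction on plain arrays. The name <tt>movddup_model</tt> is hypothetical, and a two-element <tt>double</tt> array is assumed to stand in for an XMM register, with index 0 as the lower half.

```c
/* Hypothetical scalar model of "movddup dst, src".
 * src points at one double (the memory form) or at the lower half
 * of a source register (the register form); only src[0] is read.
 * Both halves of dst end up holding that value. */
void movddup_model(double dst[2], const double *src)
{
    double lo = src[0];   /* read first, so dst may alias src */

    dst[0] = lo;          /* lower half */
    dst[1] = lo;          /* upper half */
}
```

In C code the instruction is usually generated through the SSE3 intrinsics <tt>_mm_movedup_pd</tt> (register source) or <tt>_mm_loaddup_pd</tt> (memory source).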