PTX: Difference between revisions

Revision as of 07:52, 14 December 2010

PTX is the ISA to which CUDA's nvcc compiles source code. After the CUDA runtime loads a PTX module, the device driver JIT-compiles it into architecture-specific machine code, which can then be scheduled for execution on CUDA devices. From Version 2.1 of the PTX ISA Reference:

PTX defines a virtual machine and ISA for general purpose parallel thread execution. PTX programs are translated at install time to the target hardware instruction set. The PTX-to-GPU translator and driver enable NVIDIA GPUs to be used as programmable parallel computers.

Versions

PTX Version	CUDA Toolkit Version	Changes
2.2	3.2	New kernel parameter directives for pointer arguments; flat address space for constants (backwards compatibility for constant banks); texture changes for OpenCL, bilerp (bilinear interpolation), and high-bw loads
2.1	3.1	Stack-based ABI, indirect branches, and function pointers for sm_2x targets; .branchtargets, .calltargets, and .callprototype directives; 32 driver-specific execution environment registers, %envreg0..%envreg31; new instruction rcp.approx.ftz.f64 for fast approximate reciprocal
2.0	3.0
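
A PTX module declares its ISA version in a leading `.version` directive, so the table above can be used to work out which toolkit release a given module corresponds to. The following is a minimal sketch in Python, not part of any NVIDIA tool: the names `PTX_TO_TOOLKIT`, `ptx_version`, and `toolkit_for` are hypothetical, and the sample module is an illustrative assumption of which only the header is inspected.

```python
import re

# PTX ISA version -> CUDA Toolkit version, copied from the table above.
PTX_TO_TOOLKIT = {"2.0": "3.0", "2.1": "3.1", "2.2": "3.2"}

# Hypothetical sample module; only the .version line is read by this sketch.
SAMPLE_PTX = """\
.version 2.1
.target sm_20
.entry noop ()
{
    ret;
}
"""


def ptx_version(source):
    """Return the 'major.minor' string from the module's .version directive."""
    match = re.search(r"^\s*\.version\s+(\d+\.\d+)", source, re.MULTILINE)
    if match is None:
        raise ValueError("no .version directive found")
    return match.group(1)


def toolkit_for(source):
    """Look up the CUDA Toolkit version listed for the module's PTX version."""
    version = ptx_version(source)
    if version not in PTX_TO_TOOLKIT:
        raise ValueError("PTX version %s is not in the table" % version)
    return PTX_TO_TOOLKIT[version]


if __name__ == "__main__":
    print(ptx_version(SAMPLE_PTX), toolkit_for(SAMPLE_PTX))
```

The lookup covers only the three versions listed on this page; any other version raises an error rather than guessing.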

Tools

Marcin Wilhelm Kościelnicki's nv50dis, a disassembler

@@ Line 7: / Line 7: @@
 ! CUDA Toolkit Version
 ! Changes
+|-
+| 2.2
+| 3.2
+|
+* New kernel parameter directives for pointer arguments
+* Flat address space for constants (backwards compatibility for constant banks)
+* Texture changes for OpenCL, bilerp (bilinear interpolation) and high-bw loads
 |-
 | 2.1
