PTX: Difference between revisions

From dankwiki
PTX is the ISA to which [[CUDA]]'s nvcc compiles source code. When a PTX module is loaded through the CUDA runtime, the hardware driver JIT-compiles it into architecture-specific machine language, which can then be scheduled for execution on CUDA devices. From Version 2.1 of the [http://developer.download.nvidia.com/compute/cuda/3_1/toolkit/docs/ptx_isa_2.1.pdf PTX ISA Reference]:
:''PTX defines a virtual machine and ISA for general purpose parallel thread execution. PTX programs are translated at install time to the target hardware instruction set. The PTX-to-GPU translator and driver enable NVIDIA GPUs to be used as programmable parallel computers.''
==Versions==
{|
|-
! PTX Version
! CUDA Toolkit Version
! Changes
|-
| 2.1
| 3.1
|
* Stack-based ABI, indirect branches and function pointers for sm_2x targets
* .branchtargets, .calltargets, and .callprototype directives
* 32 driver-specific execution environment registers %envreg0..%envreg31
* New instruction rcp.approx.ftz.f64 for fast approximate reciprocal
|-
| 2.0
| 3.0
|
|-
|}
==Tools==
* Marcin Wilhelm Kościelnicki's [http://0x04.net/cgit/index.cgi/nv50dis/ nv50dis], a disassembler
[[CATEGORY:GPGPU]]

Revision as of 07:41, 18 July 2010

PTX is the ISA to which CUDA's nvcc compiles source code. When a PTX module is loaded through the CUDA runtime, the hardware driver JIT-compiles it into architecture-specific machine language, which can then be scheduled for execution on CUDA devices. From Version 2.1 of the PTX ISA Reference:

PTX defines a virtual machine and ISA for general purpose parallel thread execution. PTX programs are translated at install time to the target hardware instruction set. The PTX-to-GPU translator and driver enable NVIDIA GPUs to be used as programmable parallel computers.

Versions

PTX Version | CUDA Toolkit Version | Changes
2.1 | 3.1 |
  • Stack-based ABI, indirect branches and function pointers for sm_2x targets
  • .branchtargets, .calltargets, and .callprototype directives
  • 32 driver-specific execution environment registers %envreg0..%envreg31
  • New instruction rcp.approx.ftz.f64 for fast approximate reciprocal
2.0 | 3.0 |
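
To tell which row of the Versions table a given PTX module falls under, one can read the .version directive that opens the module. The following is a minimal Python sketch, not part of any NVIDIA tooling: the names ptx_version, toolkit_for, PTX_TO_TOOLKIT, and SAMPLE are hypothetical, the sample header is hand-written for illustration, and the mapping covers only the two releases listed in the table.

```python
import re

# Mapping copied from the Versions table; other releases are deliberately omitted.
PTX_TO_TOOLKIT = {"2.1": "3.1", "2.0": "3.0"}

# A PTX module opens with a `.version major.minor` directive on its own line.
_VERSION_RE = re.compile(r"^\s*\.version\s+(\d+)\.(\d+)", re.MULTILINE)


def ptx_version(ptx_text):
    """Return the 'major.minor' string from the .version directive, or None."""
    match = _VERSION_RE.search(ptx_text)
    if match is None:
        return None
    return f"{match.group(1)}.{match.group(2)}"


def toolkit_for(ptx_text):
    """Return the CUDA Toolkit version for a PTX module, or None if not in the table."""
    return PTX_TO_TOOLKIT.get(ptx_version(ptx_text))


# Hand-written header for illustration only (assumption: not real nvcc output).
SAMPLE = """\
// illustrative header
.version 2.1
.target sm_20
"""

if __name__ == "__main__":
    print(ptx_version(SAMPLE), toolkit_for(SAMPLE))
```

Versions missing from the table return None rather than a guess, so the lookup stays honest as new PTX releases appear.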

Tools

  • Marcin Wilhelm Kościelnicki's nv50dis, a disassembler