PTX: Difference between revisions
From dankwiki
Revision as of 07:41, 18 July 2010
PTX (Parallel Thread Execution) is the virtual ISA to which CUDA's nvcc compiles device code. The hardware driver JITs it into architecture-specific machine language once the CUDA runtime has loaded a PTX module, after which it can be scheduled for execution on CUDA devices. From Version 2.1 of the PTX ISA Reference:
- PTX defines a virtual machine and ISA for general purpose parallel thread execution. PTX programs are translated at install time to the target hardware instruction set. The PTX-to-GPU translator and driver enable NVIDIA GPUs to be used as programmable parallel computers.
Versions
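PTX Version | CUDA Toolkit Version | Changes |
---|---|---|
2.1 | 3.1 | Stack-based ABI, indirect branches and function pointers for sm_2x targets; .branchtargets, .calltargets, and .callprototype directives; 32 driver-specific execution environment registers %envreg0..%envreg31; new instruction rcp.approx.ftz.f64 for fast approximate reciprocal |
2.0 | 3.0 | |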
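
A PTX module declares its ISA version in a leading .version directive, so the table above can be used to guess which toolkit produced a given file. The following is a minimal Python sketch, not part of any NVIDIA tool: the names PTX_TO_TOOLKIT, SAMPLE_PTX, ptx_version and toolkit_for are hypothetical, the sample header is invented for illustration, and the mapping covers only the two rows listed here.

```python
import re

# Mapping copied from the Versions table; anything else is unknown to this sketch.
PTX_TO_TOOLKIT = {"2.1": "3.1", "2.0": "3.0"}

# Hypothetical, minimal PTX module header (illustration only, not real nvcc output).
SAMPLE_PTX = """\
//
// hypothetical module
//
.version 2.1
.target sm_20
"""


def ptx_version(ptx_text):
    """Return the operand of the first .version directive, or None if absent."""
    match = re.search(r"^\s*\.version\s+(\d+\.\d+)", ptx_text, re.MULTILINE)
    return match.group(1) if match else None


def toolkit_for(ptx_text):
    """Return the CUDA Toolkit version from the table, or None if not listed."""
    return PTX_TO_TOOLKIT.get(ptx_version(ptx_text))


if __name__ == "__main__":
    print(ptx_version(SAMPLE_PTX), toolkit_for(SAMPLE_PTX))
```

A lookup that returns None means only that the version is missing from this table, not that the module is invalid.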
Tools
- Marcin Wilhelm Kościelnicki's nv50dis, a disassembler