PTX
From dankwiki
Revision as of 08:06, 14 December 2010
PTX is the ISA to which CUDA's nvcc compiles source code. After the CUDA runtime loads a PTX module, the hardware driver JIT-compiles it into architecture-specific machine language, which can then be scheduled for execution on CUDA devices. From version 2.1 of the PTX ISA Reference:
- PTX defines a virtual machine and ISA for general purpose parallel thread execution. PTX programs are translated at install time to the target hardware instruction set. The PTX-to-GPU translator and driver enable NVIDIA GPUs to be used as programmable parallel computers.
Versions
PTX Version | CUDA Toolkit Version | Changes |
---|---|---|
2.2 | 3.2 | |
2.1 | 3.1 | |
2.0 | 3.0 | |
Cooperative Thread Arrays
- Equivalent to a block in CUDA -- broken up into warps, its threads can communicate with one another, CTAs can be grouped into a grid, one kernel per grid
- tid: thread ID within CTA
- ntid: 3D shape of CTA
- ctaid: CTA ID within grid
- nctaid: 3D shape of grid
- gridid: grid ID
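As a rough illustration of how these values relate, the sketch below emulates them on the host in plain C++ and combines them into a grid-wide linear thread ID. The `Dim3` struct and the function names `flatten`, `volume`, and `global_thread_id` are hypothetical and are not part of PTX or CUDA. The x-fastest ordering follows the usual CUDA convention for threads within a block; applying the same ordering to CTAs within the grid is an assumption made here for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical host-side stand-in for a three-component value such as
// tid, ntid, ctaid, or nctaid. Not a PTX or CUDA type.
struct Dim3 {
    std::uint32_t x, y, z;
};

// Flatten a 3D index inside a 3D shape, with x varying fastest.
std::uint64_t flatten(Dim3 idx, Dim3 shape) {
    // Every component of the index must lie inside the shape.
    assert(idx.x < shape.x && idx.y < shape.y && idx.z < shape.z);
    return idx.x +
           static_cast<std::uint64_t>(shape.x) *
               (idx.y + static_cast<std::uint64_t>(shape.y) * idx.z);
}

// Number of elements described by a 3D shape.
std::uint64_t volume(Dim3 shape) {
    return static_cast<std::uint64_t>(shape.x) * shape.y * shape.z;
}

// Grid-wide linear thread ID: the CTA's linear position in the grid times
// the number of threads per CTA, plus the thread's linear position in its CTA.
// The CTA ordering used here is an assumption, not something PTX mandates.
std::uint64_t global_thread_id(Dim3 tid, Dim3 ntid, Dim3 ctaid, Dim3 nctaid) {
    return flatten(ctaid, nctaid) * volume(ntid) + flatten(tid, ntid);
}
```

For example, with a CTA shape (ntid) of 4x5x6 and a grid shape (nctaid) of 2x2x1, the last thread of the last CTA gets ID 479, one less than the 480 threads in the grid.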
Tools
- Marcin Wilhelm Kościelnicki's nv50dis (http://0x04.net/cgit/index.cgi/nv50dis/), a disassembler