CUDA: Difference between revisions

Revision as of 12:22, 1 February 2010

Hardware/Emulation

NVIDIA maintains a list of supported hardware. Without a supported device, SDK samples built in emulation mode (the emurelease binaries) run on the CPU instead:

[recombinator](0) $ ~/local/cuda/C/bin/linux/emurelease/deviceQuery
CUDA Device Query (Runtime API) version (CUDART static linking)
There is no device supporting CUDA.

Device 0: "Device Emulation (CPU)"
  CUDA Driver Version:                           2.30
  CUDA Runtime Version:                          2.30
  CUDA Capability Major revision number:         9999
  CUDA Capability Minor revision number:         9999
  Total amount of global memory:                 4294967295 bytes
  Number of multiprocessors:                     16
  Number of cores:                               128
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       16384 bytes
  Total number of registers available per block: 8192
  Warp size:                                     1
  Maximum number of threads per block:           512
  Maximum sizes of each dimension of a block:    512 x 512 x 64
  Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
  Maximum memory pitch:                          262144 bytes
  Texture alignment:                             256 bytes
  Clock rate:                                    1.35 GHz
  Concurrent copy and execution:                 No
  Run time limit on kernels:                     No
  Integrated:                                    Yes
  Support host page-locked memory mapping:       Yes
  Compute mode:                                  Default (multiple host threads can use this device simultaneously)

Test PASSED

CUDA model

A given host thread can execute code on only one device at once (but multiple host threads can execute code on the same device).
A group of threads which share a memory and can "synchronize their execution to coördinate accesses to memory" (use a barrier) form a block. Each thread has a threadId within its (three-dimensional) block.
- For a block of dimensions <D_x, D_y, D_z>, the threadId of the thread having index <x, y, z> is (x + y * D_x + z * D_y * D_x).
A group of blocks which share a kernel form a grid. Each block (and each thread within that block) has a blockId within its (three-dimensional) grid, though the device queried above limits the grid's third dimension to 1.
- For a grid of dimensions <D_x, D_y, D_z>, the blockId of the block having index <x, y, z> is (x + y * D_x + z * D_y * D_x).
Thus, a given thread's <blockId, threadId> dyad is unique across the device. All the threads of a block share a blockId, and corresponding threads of various blocks share a threadId.
Each time the kernel is launched, new grid and block dimensions may be provided.
A block's threads, starting from threadId 0, are partitioned into contiguous warps, each containing a number of threads equal to the warp size.
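
The linearization formulas above, and the split of a block into warps, can be sketched with plain sh arithmetic. This is only an illustration: the function names linear_id and warp_of are made up here, and the warp size of 32 is an assumption (typical of NVIDIA hardware; the emulated device above reports 1).

```shell
#!/bin/sh
# linear_id X Y Z DX DY: flatten index <x, y, z> within dimensions <DX, DY, DZ>
# as x + y*DX + z*DY*DX. DZ never appears in the formula, so it is not passed.
linear_id() {
    echo $(( $1 + $2 * $4 + $3 * $5 * $4 ))
}

# warp_of THREADID WARPSIZE: index of the contiguous warp holding THREADID,
# counting warps from 0 starting at threadId 0.
warp_of() {
    echo $(( $1 / $2 ))
}

# Thread <3, 2, 1> in a block of dimensions <8, 4, 2>:
tid=$(linear_id 3 2 1 8 4)    # 3 + 2*8 + 1*4*8
echo "threadId: $tid"
echo "warp:     $(warp_of "$tid" 32)"    # 32 is an assumed warp size
```

The same linear_id call serves for blockIds, since the grid formula has the identical shape. For the sample index it yields threadId 51, which falls in warp 1 under the assumed warp size.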

Memory type	Replication	Device access	Host access
Registers	Per-thread	Read-write	None
Local memory	Per-thread	Read-write	None
Shared memory	Per-block	Read-write	None
Global memory	Per-grid	Read-write	Read-write
Constant memory	Per-grid	Read	Read-write
Texture memory	Per-grid	Read	Read-write

Installation on Debian

libcuda-dev packages exist in the non-free archive area, and supply the core library libcuda.so. Together with the upstream toolkit and SDK from NVIDIA, this provides a full CUDA development environment for 64-bit Debian Unstable systems. I installed CUDA 2.3 on 2010-01-25 (hand-rolled 2.6.32.6 kernel, built with gcc-4.4). This machine did not have CUDA-compatible hardware (it uses Intel 965).

Download the Ubuntu 9.04 files from NVIDIA's "CUDA Zone".
Run the toolkit installer (sh cudatoolkit_2.3_linux_64_ubuntu9.04.run)
- For a user-mode install, supply $HOME/local or somesuch

* Please make sure your PATH includes /home/dank/local/cuda/bin
* Please make sure your LD_LIBRARY_PATH
*   for 32-bit Linux distributions includes /home/dank/local/cuda/lib
*   for 64-bit Linux distributions includes /home/dank/local/cuda/lib64
* OR
*   for 32-bit Linux distributions add /home/dank/local/cuda/lib
*   for 64-bit Linux distributions add /home/dank/local/cuda/lib64
* to /etc/ld.so.conf and run ldconfig as root

* Please read the release notes in /home/dank/local/cuda/doc/

* To uninstall CUDA, delete /home/dank/local/cuda
* Installation Complete

Run the SDK installer (sh cudasdk_2.3_linux.run)
- I just installed it to the same directory as the toolkit, which seems to work fine.

========================================

Configuring SDK Makefile (/home/dank/local/cuda/shared/common.mk)...

========================================

* Please make sure your PATH includes /home/dank/local/cuda/bin
* Please make sure your LD_LIBRARY_PATH includes /home/dank/local/cuda/lib

* To uninstall the NVIDIA GPU Computing SDK, please delete /home/dank/local/cuda
* Installation Complete
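
Both installers ask that PATH include the toolkit's bin directory. A minimal sketch of one way to do that without piling up duplicate entries on repeated sourcing follows; the helper name path_prepend is hypothetical, and the prefix assumes the user-mode install to $HOME/local described above.

```shell
#!/bin/sh
# path_prepend DIR: put DIR at the front of PATH unless it is already listed
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;                     # already present; leave PATH alone
        *) PATH="$1${PATH:+:$PATH}" ;;   # prepend, avoiding a stray ':'
    esac
}

CUDA="$HOME/local/cuda"   # assumed install prefix (user-mode install)
path_prepend "$CUDA/bin"
export PATH
```

Wrapping PATH in colons before matching means a directory is only treated as present when it appears as a whole entry, not as a substring of some longer path.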

Building

SDK's common.mk

This assumes use of the SDK's common.mk, as recommended by the documentation.

Add the library path to LD_LIBRARY_PATH if CUDA has been installed to a non-standard directory.
Set CUDA_INSTALL_PATH and ROOTDIR (yeargh!) if building outside the SDK.
I keep the following in bin/cudasetup of my home directory. Source it, using sh's . cudasetup syntax:

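# Root of the user-mode CUDA install (toolkit and SDK share this directory)
CUDA="$HOME/local/cuda"

# Needed by common.mk when building outside the SDK tree
export CUDA_INSTALL_PATH="$CUDA"
export ROOTDIR="$CUDA/C/common"
# Append the 64-bit libraries, avoiding a leading ':' when the variable is empty
if [ -n "$LD_LIBRARY_PATH" ] ; then
	export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$CUDA/lib64"
else
	export LD_LIBRARY_PATH="$CUDA/lib64"
fi

unset CUDA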
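
Since the script only takes effect when sourced (running it as a command changes nothing in the calling shell), a quick check before invoking make can save some confusion. The following is a sketch only; check_cuda_env is a hypothetical helper, not part of the SDK, and it merely tests that the three variables set above are non-empty.

```shell
#!/bin/sh
# check_cuda_env: report any of the cudasetup variables that are unset or
# empty. Returns 0 when all three are set, 1 otherwise.
check_cuda_env() {
    status=0
    for var in CUDA_INSTALL_PATH ROOTDIR LD_LIBRARY_PATH; do
        eval "val=\${$var:-}"            # indirect lookup of the variable named by $var
        if [ -z "$val" ]; then
            echo "$var is not set; did you source cudasetup?"
            status=1
        fi
    done
    return $status
}
```

Typical use would be something like check_cuda_env && make in the project directory.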

