The key things that you need to know are:
* One '''kernel''' is executed at a time on a device.
* Many '''threads''' execute each kernel - each thread runs the same code but on different data (based on its thread ID). See the vector-add sketch after this list.
* Threads are grouped into '''blocks''', and a kernel runs on a '''grid''' of blocks.
* Blocks can't synchronize with one another. They can run concurrently or sequentially.
* Threads have local memory ('''registers''', ~1 clock cycle), blocks have '''shared memory''' (~10 clock cycles), and kernels have '''per-device global memory''' (~100s-1000 clock cycles). See the shared-memory sketch after this list.
* Per-device memory can transfer data to/from the CPU, and includes '''global''', '''local''' (for consecutive access by a thread), '''constant''' (much faster than the other per-device memories), and some specialized memories for graphics ('''texture''' and surface).
* Transfers from global memory to local registers are done in 4-, 8-, or 16-byte units; accesses that don't match these units can incur a penalty, which slows things down. Threads can read from constant and texture memory.
* Blocks should have dimension >= 32 (see warps below).
* A GPU device is a set of '''[https://en.wikipedia.org/wiki/Single_instruction,_multiple_threads SIMT multiprocessors]'''.
* The number of threads in a '''warp''' is the "warp size". It's usually 32. You can find yours by running the deviceQuery utility provided in the samples folder (see [[DIGITS DevBox#Test the installation]]) or programmatically, as in the device-query sketch after this list. Warps are then grouped into blocks.
* At each clock cycle, a multiprocessor executes the same instruction on a warp. Threads within a warp are executed physically in parallel; warps and blocks are executed logically in parallel.
* Kernel launches are asynchronous - the CPU hands off the kernel and moves on. The kernel only executes once all previous CUDA calls have completed.
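
The kernel/grid/block/thread hierarchy can be made concrete with a short vector-add example. This is a minimal sketch, not code from the CUDA samples: the kernel name <code>add</code>, the array size, and the launch configuration of 256 threads per block are all illustrative choices.

<syntaxhighlight lang="cuda">
#include <cstdio>
#include <cuda_runtime.h>

// Each thread computes one element, selected by its global thread ID.
__global__ void add(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;
    float *a, *b, *c;
    cudaMallocManaged(&a, n * sizeof(float));
    cudaMallocManaged(&b, n * sizeof(float));
    cudaMallocManaged(&c, n * sizeof(float));
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    // 256 threads per block (a multiple of the warp size, 32);
    // the grid has enough blocks to cover all n elements.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    add<<<blocks, threads>>>(a, b, c, n);

    // The launch is asynchronous: wait for completion before reading c.
    cudaDeviceSynchronize();
    printf("c[0] = %f\n", c[0]);
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
</syntaxhighlight>

Compile with <code>nvcc add.cu -o add</code> (any filename works; <code>.cu</code> tells nvcc to expect CUDA code).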
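The memory hierarchy shows up directly in kernel code. The following block-wise sum is a sketch (the names <code>blockSum</code>, <code>scale</code>, and <code>tile</code> are illustrative) that touches registers, shared memory, constant memory, and global memory, and uses <code>__syncthreads()</code> to synchronize threads within a block - recall that blocks themselves can't synchronize.

<syntaxhighlight lang="cuda">
#include <cstdio>
#include <cuda_runtime.h>

__constant__ float scale;  // constant memory: cached, fast, read-only from kernels

__global__ void blockSum(const float *in, float *out, int n) {
    __shared__ float tile[256];           // shared memory: one copy per block, ~10 cycles
    int tid = threadIdx.x;                // registers: per-thread storage, ~1 cycle
    int i = blockIdx.x * blockDim.x + tid;

    tile[tid] = (i < n) ? in[i] * scale : 0.0f;  // read from global memory (~100s of cycles)
    __syncthreads();                      // threads *within* a block can synchronize

    // Tree reduction in shared memory (blockDim.x must be a power of two here).
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s) tile[tid] += tile[tid + s];
        __syncthreads();
    }
    if (tid == 0) out[blockIdx.x] = tile[0];  // write result back to global memory
}

int main() {
    const int n = 1024, threads = 256, blocks = n / threads;
    float h_scale = 2.0f;
    float *in, *out;
    cudaMallocManaged(&in, n * sizeof(float));
    cudaMallocManaged(&out, blocks * sizeof(float));
    for (int i = 0; i < n; ++i) in[i] = 1.0f;
    cudaMemcpyToSymbol(scale, &h_scale, sizeof(float));  // fill constant memory from the host
    blockSum<<<blocks, threads>>>(in, out, n);
    cudaDeviceSynchronize();
    printf("block 0 sum = %f\n", out[0]);  // expect 512 (256 elements * 1.0 * 2.0)
    cudaFree(in); cudaFree(out);
    return 0;
}
</syntaxhighlight>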
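Besides running deviceQuery, the warp size and multiprocessor count can be read with the runtime API's <code>cudaGetDeviceProperties</code>. A minimal sketch for device 0:

<syntaxhighlight lang="cuda">
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);    // properties of device 0
    printf("Device:            %s\n", prop.name);
    printf("Warp size:         %d\n", prop.warpSize);            // usually 32
    printf("Multiprocessors:   %d\n", prop.multiProcessorCount); // SIMT multiprocessors
    printf("Max threads/block: %d\n", prop.maxThreadsPerBlock);
    return 0;
}
</syntaxhighlight>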
==CUDA and Python==
