Changes

Python GPU programming (view source)

Revision as of 16:14, 13 November 2020

149 bytes added, 16:14, 13 November 2020

*The streaming multiprocessor and the CUDA core: https://i.stack.imgur.com/kvu4M.jpg

*CUDA memory hierarchy: https://www.researchgate.net/profile/Marco_Nobile/publication/261069154/figure/fig1/AS:296718735298563@1447754667270/Schematization-of-CUDA-architecture-Schematic-representation-of-CUDA-threads-and-memory.png

*Various slides from the CUDA tutorial by Cyril Zeller (NVIDIA Developer Technology): https://www.slideshare.net/angelamm2012/nvidia-cuda-tutorialnondaapr08

The key things that you need to know are:

* Blocks should contain at least 32 threads (one full warp); a smaller block leaves part of its warp idle

* A GPU device is a set of SIMT multiprocessors

* At each clock cycle, a multiprocessor executes the same instruction on a warp. The number of threads in a warp is the "warp size", which is usually 32. You can find yours by running the deviceQuery utility provided in the samples folder. See [[DIGITS DevBox#Test the installation]].
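
The block-size rule above can be illustrated with a small host-side sketch in plain Python (no GPU needed). The helper names <code>round_up_to_warp</code>, <code>idle_lanes</code> and <code>launch_config</code> are hypothetical, and the warp size of 32 is an assumption that you should check against the output of deviceQuery on your own device.

```python
# Assumed warp size; confirm with deviceQuery on your own GPU.
WARP_SIZE = 32


def round_up_to_warp(threads, warp_size=WARP_SIZE):
    """Round a thread count up to the next multiple of the warp size."""
    return ((threads + warp_size - 1) // warp_size) * warp_size


def idle_lanes(threads_per_block, warp_size=WARP_SIZE):
    """Number of warp lanes left without a thread for a given block size."""
    return round_up_to_warp(threads_per_block, warp_size) - threads_per_block


def launch_config(n_items, threads_per_block=128, warp_size=WARP_SIZE):
    """Return (blocks, threads) covering n_items with one thread per item.

    The block size is rounded up to a whole number of warps, and the
    number of blocks is the ceiling of n_items / threads.
    """
    threads = round_up_to_warp(threads_per_block, warp_size)
    blocks = (n_items + threads - 1) // threads
    return blocks, threads


if __name__ == "__main__":
    # A 20-thread block still occupies a whole warp, so some lanes sit idle.
    print(idle_lanes(20))
    # A requested block size of 100 is rounded up to a multiple of 32.
    print(launch_config(1000, threads_per_block=100))
```

Because the grid usually covers slightly more threads than there are items, a kernel launched with such a configuration would normally guard its body with a bounds check on the thread index.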

==CUDA and Python==

Ed
