{{Project|Has project output=Content|Has sponsor=McNair Center
|Has title=GPU Build
|Has owner=Oliver Chang,Kyran Adams
}}
==Final Decision==
We decided to clone the NVIDIA [[DIGITS DevBox]] [https://developer.nvidia.com/devbox]. To start with we are trying to use our existing ASUS Z10 server board rather than switching to the Asus X99-E WS workstation class motherboard, and rather than four TITAN X GPUs, we've got a TITAN XP and a TITAN RTX. Note that the Asus X99-E WS is available from NewEgg for $500 now.

==Motherboard/CPU/Fan==
*Not a huge deal, but used for data prep
*Motherboard: ASUS Z10PE-D16 [http://www.newegg.com/Product/Product.aspx?Item=N82E16813132257&Tpk=N82E16813132257], Dual LGA 2011 R3, DDR4 - up to 32GB RDIMM, 16 slots
*Chips: Intel Haswell Xeon E5-2620 v3, 6 cores @ 2.4GHz, 6x256KB L2 cache, 15MB L3 cache, socket LGA 2011-v3 [https://www.amazon.com/gp/product/B00M1BUUMO/ref=oh_aui_detailpage_o04_s00?ie=UTF8&psc=1]
*CPU Fans: Intel Thermal Solution Cooling Fan for E5-2600 Processors BXSTS200C [https://www.amazon.com/gp/product/B007HJAM50/ref=oh_aui_detailpage_o03_s00?ie=UTF8&psc=1]
==Single vs. Multi GPU==
*[https://www.nvidia.com/en-us/geforce/products/10series/geforce-gtx-1080-ti/ GTX 1080 Ti Specs]
* Since we are using TensorFlow, note that it does not scale well to multiple GPUs for a single model
* [http://timdettmers.com/2017/04/09/which-gpu-for-deep-learning/ Which GPU for deep learning (04/09/2017)]
# "I quickly found that it is not only very difficult to parallelize neural networks on multiple GPUs efficiently, but also that the speedup was only mediocre for dense neural networks. Small neural networks could be parallelized rather efficiently using data parallelism, but larger neural networks... received almost no speedup."
# If the network doesn't fit in the memory of one GPU (11 GB), splitting it across multiple GPUs (model parallelism) may be the only option
* [https://devtalk.nvidia.com/default/topic/743814/cuda-setup-and-installation/advice-on-single-vs-multi-gpu-system/ Advice on single vs multi-GPU system]
# Want to get two graphics cards: one for development/compute, and one cheap or onboard card to drive the operating system's display [https://stackoverflow.com/questions/21911560/how-can-i-set-one-nvidia-graphics-card-for-display-and-other-for-computingin-li]
*[https://stackoverflow.com/questions/37732196/tensorflow-difference-between-multi-gpus-and-distributed-tensorflow Different uses of multiple GPUs]
# Intra-model parallelism: If a model has long, independent computation paths, then you can split the model across multiple GPUs and have each compute a part of it. This requires careful understanding of the model and the computational dependencies.
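As a rough illustration of what intra-model parallelism looks like in practice, here is a minimal sketch, assuming TensorFlow 1.x and two visible GPUs; the layer sizes and device names are made up for illustration and are not part of this build.

<syntaxhighlight lang="python">
# Minimal sketch (TensorFlow 1.x): place independent parts of a graph on
# different GPUs with tf.device. Shapes and layer sizes are illustrative only.
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None, 1024], name="x")

# Two independent computation paths, one per GPU (intra-model parallelism).
with tf.device("/gpu:0"):
    branch_a = tf.layers.dense(x, 512, activation=tf.nn.relu)

with tf.device("/gpu:1"):
    branch_b = tf.layers.dense(x, 512, activation=tf.nn.relu)

# Results are combined on one device; this transfer is where PCIe bandwidth matters.
with tf.device("/gpu:0"):
    merged = tf.concat([branch_a, branch_b], axis=1)
    logits = tf.layers.dense(merged, 10)

# allow_soft_placement falls back to another device if a requested GPU is missing.
config = tf.ConfigProto(allow_soft_placement=True, log_device_placement=True)
with tf.Session(config=config) as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(logits, feed_dict={x: [[0.0] * 1024]}).shape)
</syntaxhighlight>

Note the cross-device <code>tf.concat</code>: every split point like this forces a transfer over PCIe, which is part of why the speedup for dense networks is often mediocre, as quoted above.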
====TL;DR====
Using multiple GPUs adds a lot of complexity. It has a few benefits: possible speed ups if the network can be split up (and is big enough), able to train multiple networks at once (either copies of the same network or modified networks), more memory for huge batches. Some frameworks have much better performance with multiple GPUs (pytorch, caffe 2) while others are catching up.
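Of these benefits, the simplest to exploit is training several independent networks at once, since it needs no framework support at all: each experiment runs in its own process pinned to one card. A minimal sketch, assuming a hypothetical <code>train.py</code> training script (not part of this project) and two GPUs:

<syntaxhighlight lang="python">
# Hypothetical sketch: run two independent experiments at once, one per GPU,
# by pinning each process to a single card with CUDA_VISIBLE_DEVICES.
# "train.py" and its flags are placeholders, not an actual script on this page.
import os
import subprocess

experiments = [
    {"gpu": "0", "args": ["--lr", "0.001"]},
    {"gpu": "1", "args": ["--lr", "0.0001"]},
]

procs = []
for exp in experiments:
    env = os.environ.copy()
    # Each process only sees the listed GPU, which it addresses as /gpu:0.
    env["CUDA_VISIBLE_DEVICES"] = exp["gpu"]
    procs.append(subprocess.Popen(["python", "train.py"] + exp["args"], env=env))

for p in procs:
    p.wait()
</syntaxhighlight>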
Pros of multiple GPUs:
*Able to train multiple networks at once (either copies of the same network or modified networks). Allows for running long experiments while starting new ones.
*Possible speed ups if the network can be split up (and is big enough), but tensorflow is not great for this.
*More memory for huge batches (not sure if necessary).

Cons of multiple GPUs:
*Adds a lot of complexity.

===K80, NVLink===
*NVLink can link between CPU and GPU for an increase in speed, but only with the IBM POWER8+ CPU.
*NVLink can link between GPU and GPU as a replacement for SLI with other CPUs, but this is not super relevant to tensorflow, even if trying to parallelize across one model.
*[https://www.quora.com/Which-GPU-is-better-for-Deep-Learning-GTX-1080-or-Tesla-K80 This source] says to get the 1080 because the K80 is basically two K40s, which have less memory bandwidth than the 1080. [https://www.reddit.com/r/deeplearning/comments/5mc7s6/performance_difference_between_nvidia_k80_and_gtx/ This source] agrees.

==RAM==
*RAM: Crucial DDR4 RDIMM [http://www.newegg.com/Product/Product.aspx?Item=9SIA0ZX39C3002], 2133MHz, Registered (buffered) and ECC, comes in packs of 4 x 32GB

==PSU==
*PSU: Corsair RM Series 850 Watt ATX/EPS 80PLUS Gold-Certified Power Supply - CP-9020056-NA RM850 [https://www.amazon.com/gp/product/B00EB7UIXM/ref=oh_aui_detailpage_o03_s00?ie=UTF8&psc=1]

==Storage==
*M.2 Drives: Samsung 950 PRO-Series 512GB PCIe NVMe M.2 Internal SSD MZ-V5P512BW [https://www.amazon.com/gp/product/B01639694M/ref=oh_aui_detailpage_o03_s01?ie=UTF8&psc=1]
*Solid State Drives: Intel Solid-State Drive 750 Series SSDPEDMW400G4R5 PCI-Express 3.0 MLC 400GB [https://www.amazon.com/gp/product/B00UHJJQAY/ref=oh_aui_detailpage_o07_s00?ie=UTF8&psc=1] or 800GB [https://www.amazon.com/gp/product/B013QP8XUE/ref=oh_aui_detailpage_o00_s00?ie=UTF8&psc=1]
*Regular Hard drives: WD Red 3TB NAS Hard Disk Drive [https://www.amazon.com/gp/product/B008JJLW4M/ref=oh_aui_detailpage_o00_s00?ie=UTF8&psc=1] - 5400 RPM Class, SATA 6 Gb/s interface, 64MB Cache, 3.5 Inch

==Misc. Parts==
*Cases: Rosewill 1.0 mm Thickness 4U Rackmount Server Chassis, Black Metal/Steel RSV-L4000 [https://www.amazon.com/gp/product/B0056OUTBK/ref=oh_aui_detailpage_o04_s00?ie=UTF8&psc=1]
*Consider this case: Corsair Carbide Series Air 540 High Airflow ATX Cube Case [https://www.amazon.com/dp/B00D6GINF4/ref=twister_B00JRYFVAO?_encoding=UTF8&psc=1]
*DVDRW (needed?): Asus 24x DVD-RW Serial-ATA Internal OEM Optical Drive DRW-24B1ST [http://www.amazon.com/Asus-Serial-ATA-Internal-Optical-DRW-24B1ST/dp/B0033Z2BAQ/ref=sr_1_2?s=pc&ie=UTF8&qid=1452399113&sr=1-2&keywords=dvdrw]
*Optical drive: HP - DVD1265I DVD/CD Writer [https://www.newegg.com/Product/Product.aspx?Item=N82E16827140098&ignorebbr=1&nm_mc=AFC-C8Junction&cm_mmc=AFC-C8Junction-PCPartPicker,%20LLC-_-na-_-na-_-na&cm_sp=&AID=10446076&PID=3938566&SID=]
*Keyboard and Mouse: AmazonBasics Wired Keyboard and Wired Mouse Bundle Pack [http://www.amazon.com/AmazonBasics-Wired-Keyboard-Mouse-Bundle/dp/B00B7GV802/ref=sr_1_2?s=pc&rps=1&ie=UTF8&qid=1452402108&sr=1-2&keywords=keyboard+and+mouse&refinements=p_72%3A1248879011%2Cp_85%3A2470955011]
*KVM Switch: IOGEAR 4-Port USB Cable KVM Switch GCS24U (Black) [https://www.amazon.com/gp/product/B001S2PJO6/ref=oh_aui_detailpage_o02_s00?ie=UTF8&psc=1]

==Other Builds/Guides==
*[https://blog.slavv.com/the-1700-great-deep-learning-box-assembly-setup-and-benchmarks-148c5ebe6415 Deep learning box for $1700] ([https://news.ycombinator.com/item?id=14438472 Discussion])
*[http://timdettmers.com/2015/03/09/deep-learning-hardware-guide/ A Full Hardware Guide to Deep Learning]
*[https://www.oreilly.com/learning/build-a-super-fast-deep-learning-machine-for-under-1000 Cheap build]
*[https://medium.com/@SocraticDatum/getting-started-with-gpu-driven-deep-learning-part-1-building-a-machine-d24a3ed1ab1e How to build a GPU deep learning machine]
*[https://www.slideshare.net/PetteriTeikariPhD/deep-learning-workstation Deep Learning Computer Build] (useful tips, long)
*[https://www.tooploox.com/blog/deep-learning-with-gpu Another box]
*[http://graphific.github.io/posts/building-a-deep-learning-dream-machine/ Expensive deep learning box]

==Double GPU Server Build==
[https://pcpartpicker.com/user/kyranadams/saved/gDzFdC PC Partpicker build]
*[https://www.quora.com/Can-I-double-the-PCIe-lanes-in-a-dual-CPU-motherboard This article] says that it may be necessary to get both CPUs to get all of the PCIe lanes

==Double GPU Build==
[https://pcpartpicker.com/user/kyranadams/saved/ykK7hM PC Partpicker build]

===Motherboard===
*Needs enough PCIe slots to support both GPUs and other units
*Motherboard: MSI - Z170A GAMING M7 ATX LGA1151 Motherboard [https://www.newegg.com/Product/Product.aspx?Item=9SIA85V4SC7911&nm_mc=AFC-C8Junction&cm_mmc=AFC-C8Junction-PCPartPicker,%20LLC-_-na-_-na-_-na&cm_sp=&AID=10446076&PID=3938566&SID=], LGA 1151, 3x PCIe 3.0 x16, 4x PCIe 3.0 x1, 6x SATA 6Gb/s, also used in [https://medium.com/@SocraticDatum/getting-started-with-gpu-driven-deep-learning-part-1-building-a-machine-d24a3ed1ab1e this build]
===CPU/Fan===
*At least one core (two threads) per GPU
*Chips: Intel - Core i7-6700 3.4GHz Quad-Core Processor [https://www.amazon.com/dp/B0136JONG8/?tag=pcpapi-20]
*CPU Fans: Cooler Master - Hyper 212 EVO 82.9 CFM Sleeve Bearing CPU Cooler [https://www.newegg.com/Product/Product.aspx?Item=N82E16835103099&ignorebbr=1&nm_mc=AFC-C8Junction&cm_mmc=AFC-C8Junction-PCPartPicker,%20LLC-_-na-_-na-_-na&cm_sp=&AID=10446076&PID=3938566&SID=]
*Buying this fan because it's very cheap for the reviews it got; the stock cooler for the CPU has had mixed reviews
===GPU===
*2x GTX 1080 Ti [https://www.newegg.com/Product/Product.aspx?Item=N82E16814487338&ignorebbr=1&nm_mc=AFC-C8Junction&cm_mmc=AFC-C8Junction-PCPartPicker,%20LLC-_-na-_-na-_-na&cm_sp=&AID=10446076&PID=3938566&SID=]
*Integrated graphics on CPU: Intel HD Graphics 530
===RAM===
*At least as much RAM as total GPU memory (2 * 11 GB [GTX 1080 Ti] = 22 GB, so 32GB); see the sketch below
*Does not have to be fast for deep learning: "CPU-RAM-to-GPU-RAM is the true bottleneck – this step makes use of direct memory access (DMA). As quoted above, the memory bandwidth for my RAM modules are 51.2GB/s, but the DMA bandwidth is only 12GB/s!" [http://timdettmers.com/2015/03/09/deep-learning-hardware-guide/]
*RAM: Crucial - 32GB (2 x 16GB) DDR4-2133 Memory [https://www.newegg.com/Product/Product.aspx?Item=9SIA8PV5HF1514&nm_mc=AFC-C8Junction&cm_mmc=AFC-C8Junction-PCPartPicker,%20LLC-_-na-_-na-_-na&cm_sp=&AID=10446076&PID=3938566&SID=]
*If not enough, should be able to extend this by buying two more sticks
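A minimal sketch of the RAM rule of thumb above (match or exceed total GPU memory, then round up); the listed kit sizes are just common retail configurations, not specific recommendations:

<syntaxhighlight lang="python">
# Rule of thumb from above: system RAM >= total GPU memory, rounded up to a
# standard kit size. Numbers below are for this build (2x GTX 1080 Ti).
GPU_MEM_GB = 11
NUM_GPUS = 2
COMMON_KITS_GB = [16, 32, 64, 128]   # typical retail kit sizes (illustrative)

needed = GPU_MEM_GB * NUM_GPUS                        # 2 * 11 = 22 GB
kit = next(k for k in COMMON_KITS_GB if k >= needed)  # -> 32 GB
print("Need at least %d GB of RAM; nearest standard kit is %d GB" % (needed, kit))
</syntaxhighlight>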
===PSU===
*Some say the PSU should be 1.5x-2x the wattage of the system, some say wattage + 100W
*PSU: EVGA - SuperNOVA G2 1000W 80+ Gold Certified Fully-Modular ATX Power Supply [https://www.newegg.com/Product/Product.aspx?Item=N82E16817438010&ignorebbr=1&nm_mc=AFC-C8Junction&cm_mmc=AFC-C8Junction-PCPartPicker,%20LLC-_-na-_-na-_-na&cm_sp=&AID=10446076&PID=3938566&SID=]
===Storage===
*SSD should be fast enough, no need for M.2 [http://timdettmers.com/2015/03/09/deep-learning-hardware-guide]
*SSD: Samsung - 850 EVO-Series 500GB 2.5" Solid State Drive [https://www.newegg.com/Product/Product.aspx?Item=N82E16820147373&ignorebbr=1&nm_mc=AFC-C8Junction&cm_mmc=AFC-C8Junction-PCPartPicker,%20LLC-_-na-_-na-_-na&cm_sp=&AID=10446076&PID=3938566&SID=]
*HDD: Seagate - Barracuda 3TB 3.5" 7200RPM Internal Hard Drive [https://www.newegg.com/Product/Product.aspx?Item=9SIADG25GT7889&nm_mc=AFC-C8Junction&cm_mmc=AFC-C8Junction-PCPartPicker,%20LLC-_-na-_-na-_-na&cm_sp=&AID=10446076&PID=3938566&SID=]
===Other things to consider===
*Water cooling? [http://timdettmers.com/2015/03/09/deep-learning-hardware-guide/ This guide] has a good section on cooling
*Case is not rack mounted
==Software tips==
*Setting up Ubuntu and Docker [https://medium.com/@SocraticDatum/getting-started-with-gpu-driven-deep-learning-part-2-environment-setup-and-fd1947aab29]
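Once Ubuntu, the NVIDIA driver, and CUDA/cuDNN are installed (see the guide above), a quick sanity check is to confirm that TensorFlow actually sees both cards. A minimal sketch, assuming TensorFlow 1.x is already installed:

<syntaxhighlight lang="python">
# Quick post-install check: list the GPUs TensorFlow can see.
# Assumes TensorFlow 1.x; in 2.x, tf.config.list_physical_devices('GPU') is the equivalent.
from tensorflow.python.client import device_lib

gpus = [d.name for d in device_lib.list_local_devices() if d.device_type == "GPU"]
print("GPUs visible to TensorFlow:", gpus)
assert gpus, "No GPU visible - check the NVIDIA driver, CUDA, and cuDNN installs"
</syntaxhighlight>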
