This page details the build of our [[DIGITS DevBox]]. There's also a page giving information on [[Using the DevBox]]. NVIDIA, famous for their incredibly poor supply-chain and inventory management, have been saying [https://developer.nvidia.com/devbox "Please note that we are sold out of our inventory of the DIGITS DevBox, and no new systems are being built"] since shortly after the [https://en.wikipedia.org/wiki/GeForce_10_series Titan X] was the latest and greatest thing (i.e., somewhere around 2016). But it's pretty straightforward to update [https://www.azken.com/download/DIGITS_DEVBOX_DESIGN_GUIDE.pdf their spec].

==Introduction==

===Specification===
<onlyinclude>[[File:Top1000.jpg|right|300px]] Our [[DIGITS DevBox]], affectionately named after Lois McMaster Bujold's fifth God, has a Xeon e5-2620v3 processor, 256GB of DDR4 RAM, two GPUs - one Titan RTX and one Titan Xp - with room for two more, a 500GB SSD (mounting /), and an 8TB RAID5 array bcached with a 512GB m.2 drive (mounting the /bulk share, which is available over Samba). It runs Ubuntu 18.04, CUDA 10.0, cuDNN 7.6.1, Anaconda3-2019.03, Python 3.7, TensorFlow 1.13, DIGITS 6, and other useful machine learning tools/libraries.</onlyinclude>

===Documentation===
The documentation from NVIDIA is here:
*https://docs.nvidia.com/dgx/digits-devbox-user-guide/index.html
*https://developer.nvidia.com/devbox
 
Hardware specs from other builds:
*https://www.azken.com/download/DIGITS_DEVBOX_DESIGN_GUIDE.pdf
**https://www.reddit.com/r/buildapc/comments/3gewmz/build_complete_nvidia_digits_devbox/
*https://cellmatiq.com/?p=155
*http://graphific.github.io/posts/building-a-deep-learning-dream-machine/
However, unfortunately, the form to get help from NVIDIA is closed [https://info.nvidianews.com/early_access_nvidia_3_15.html][https://www.reddit.com/r/buildapc/comments/3gewmz/build_complete_nvidia_digits_devbox/][https://www.pyimagesearch.com/2016/06/06/hands-on-with-the-nvidia-digits-devbox-for-deep-learning/]. And most of the other specs are limited to just the hardware [https://www.reddit.com/r/buildapc/comments/3gewmz/build_complete_nvidia_digits_devbox/][https://cellmatiq.com/?p=155][http://graphific.github.io/posts/building-a-deep-learning-dream-machine/][https://pcpartpicker.com/b/FGP323]. The best instructions that I could find were:
*https://medium.com/yanda/building-your-own-deep-learning-dream-machine-4f02ccdb0460
The DevBox is currently unavailable from Amazon [https://www.amazon.com/Lambda-Deep-Learning-DevBox-Preinstalled/dp/B01BCDK1KC]. Some firms, including Lambda Labs [https://lambdalabs.com/deep-learning/workstations/4-gpu][https://pcpartpicker.com/b/FGP323] and Bizon-tech [https://bizon-tech.com/us/bizon-g3000][http://deeplearningbox.com/], are selling variants on them, but their prices are high too and the details on their specs are limited (the MoBo and config details are missing entirely).

At around $15k (the Lambda variants go from $10k to $23k), buying one is prohibitive for most people. But the parts cost is perhaps $4-5k now for the original spec. So this page goes through everything required to put one together and get it up and running.

==Hardware==
===Description===
[[File:Front1000.jpg|right|300px]]
We mostly followed the original hardware spec from NVIDIA, updating the capacity of the drives and other minor things, as we had many of these parts available as salvage from other boxes. We had to buy the ASUS X99-E WS motherboard (we got the ASUS X99-E WS/USB variant as the original wasn't available and this one has USB 3.1), as well as some new drives, just for this project.

We opted to use a Xeon e5-2620v3 processor, rather than the Core i7-5930K. We had both available, and both support 40 PCIe lanes, mount in the LGA 2011-v3 socket, have 6 cores, 15MB caches, etc. Although the i7 has a faster clock speed, the Xeon takes registered (buffered) ECC DDR4 RDIMMs, which means we can put 256GB on the board, rather than just 64GB. For the GPUs, we have a TITAN RTX and an older TITAN Xp available to start, and we can add a 1080Ti later, or buy some additional GPUs if needed. We also put the whole thing in a Rosewill RSV-L4000 case.

===Parts List===
{| class="wikitable sortable"
! Qty !! Part
|-
| 2 || ARCTIC F8 PWM Fluid Dynamic Bearing Case Fan, 80mm PWM Speed Control, 31 CFM at 22dBA
|}
 
===Build notes===
Old notes on a prior look at a [[GPU Build]] are on the wiki too.
[[File:Back1000.jpg|right|300px]]
There weren't any particularly noteworthy things about the hardware build. The GPUs need to go in slots 1 and 3, which means they sit tight on each other. We put the Titan Xp in slot 1 (and plugged the monitor into its HDMI port), because then the fans for the Titan RTX (which we expect will get heavier use) are in the clear for now. The case fans were set up in a push-and-pull arrangement, and the hot-swap bay was put in the center position to allow as much airflow past the GPUs as possible.

===BIOS===
The initial BIOS boot was weird - the machine ran at full power for a short period then powered off multiple times before finally giving a single system beep and loading the BIOS. It may have been memory checking or some such.
'''We did NOT update the BIOS.''' It didn't need it. The m.2 drive is visible in the BIOS and will be used as a cache for the RAID 5 array (using bcache). The GPUs are recognized as PCIe devices in the tool section. And all of the SATA drives are being recognized.
We then made the following changes:
*Set the three hard disks to hot-swap enable
*Set the fans to PWM, which drastically cuts down the noise, and set the lower thresholds to 200 (not that it seemed to matter, they seem to be idling at around 1k)
*List the OS as "Other OS" rather than Windows, and set enhanced mode to disabled
*Delete the PK to disable secure boot
*Change the boot order to be CD first (not as UEFI), and then the Samsung 850
Notes:
*We will do the RAID 5 array in software, rather than using the X99 RAID through the BIOS (a minimal mdadm sketch follows below).
*The m.2 drive will be used as a cache for the RAID 5 array, using bcache.
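If you ever need to (re-)create the array by hand (the Ubuntu installer sets it up for us during the partitioning step below), a minimal mdadm sketch, assuming the three SATA drives appear as /dev/sda1, /dev/sdb1 and /dev/sdc1 (device names may differ on your box):
 # create a 3-disk software RAID5 array
 mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sda1 /dev/sdb1 /dev/sdc1
 cat /proc/mdstat                                  # watch the initial sync
 mdadm --detail --scan >> /etc/mdadm/mdadm.conf    # persist the array definition
 update-initramfs -u                               # so the array assembles at boot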
What's really crucial is that all the hardware is visible and that we are NOT using UEFI. With UEFI, there is an issue with the drivers not being properly signed under secure boot.
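Because of this, it is worth confirming how the box actually booted and whether secure boot is enforced. A minimal check (mokutil comes from the mokutil package and only reports anything useful under UEFI):
 # if /sys/firmware/efi exists the kernel booted via UEFI, otherwise legacy BIOS
 [ -d /sys/firmware/efi ] && echo "UEFI boot" || echo "legacy BIOS boot"
 # report whether secure boot is enabled
 mokutil --sb-state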
==Software==
===Main OS Install===
Install [http://cdimage.ubuntu.com/releases/18.04.2/release/?_ga=2.30548799.1041204444.1558044875-2114387110.1558044875 Ubuntu 18.04] (note that the original DIGITS DevBox ran 14.04), '''not the live version''', from a freshly burnt DVD. If you install the HWE version, you don't need to run apt-get install --install-recommends linux-generic-hwe-18.04 at the end.

====In the installer====
Choose the first network hardware option and make sure that the second (rightmost) network port is connected to a DHCP broadcasting router.

Under partitions:
[[File:Partitions1000.jpg|right|300px]]
# Put one large partition, formatted as ext4, mounted as /, bootable, on the 850
# Partition each SATA drive as RAID
# Put one large partition, formatted as ext4, not mounted, on the 970 (for later)
# Put software RAID5 over the 3 SATA drives, format the RAID as ext4 and mount as /bulk

Install SSH and Samba. When prompted, add the MBR to the front of the 850.

====First boot====
After a reboot, the screen freezes if you didn't install HWE. Either change the bootloader, adding nomodeset (see https://www.pugetsystems.com/labs/hpc/The-Best-Way-To-Install-Ubuntu-18-04-with-NVIDIA-Drivers-and-any-Desktop-Flavor-1178/#step-4-potential-problem-number-1), or just SSH onto the box and fix that now.

Run as root:
 apt-get update
 apt-get dist-upgrade
 apt-get install --install-recommends linux-generic-hwe-18.04
Check the release:
 lsb_release -a
Give the box a reboot!

===X Windows===
If you install the video driver before installing X Windows, you will need to manually edit the X Windows config files. So, now install the X window system. The easiest way is:
 tasksel
And choose your favorite. We used Ubuntu Desktop. And reboot again to make sure that everything is working nicely.

===Video Drivers===
The first build of this box was done with an installation of CUDA 10.1, which automatically installed version 418.67 of the NVIDIA driver. We then installed CUDA 10.0 under conda to support Tensorflow 1.13. All went mostly well, and the history of this page contains the instructions. However, at some point, likely because of an OS update, the video driver(s) stopped working. This page now describes the second build (as if it were a build from scratch). [[Addressing Ubuntu NVIDIA Issues]] provides additional information.

===Hardware and Drivers===
Check the hardware is being seen and what driver is being used with:
 lspci -vk
Currently we are using the nouveau driver for the Xp, and have no driver loaded for the RTX.

You can also list the drivers using ubuntu-drivers, which is supposed to tell you which NVIDIA driver is recommended:
 apt-get install ubuntu-drivers-common
 ubuntu-drivers devices
 == /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/0000:04:10.0/0000:05:00.0 ==
 modalias : pci:v000010DEd00001B02sv000010DEsd000011DFbc03sc00i00
 vendor   : NVIDIA Corporation
 model    : GP102 [TITAN Xp]
 driver   : nvidia-driver-390 - distro non-free recommended
 driver   : xserver-xorg-video-nouveau - distro free builtin
But the 390 is the only driver available from the main repo.
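As a quick sanity check, apt-cache policy shows which repository each available driver version would come from (version numbers on your box may differ):
 apt-cache policy nvidia-driver-390
 # the Candidate and version table list the archive (e.g. bionic/restricted) each version comes from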
Add the experimental repo for more options:
 add-apt-repository ppa:graphics-drivers/ppa
 apt update
 ubuntu-drivers devices
 == /sys/devices/pci0000:00/0000:00:03.0/0000:03:00.0/0000:04:10.0/0000:05:00.0 ==
 modalias : pci:v000010DEd00001B02sv000010DEsd000011DFbc03sc00i00
 vendor   : NVIDIA Corporation
 model    : GP102 [TITAN Xp]
 driver   : nvidia-driver-418 - third-party free
 driver   : nvidia-driver-415 - third-party free
 driver   : nvidia-driver-430 - third-party free recommended
 driver   : nvidia-driver-396 - third-party free
 driver   : nvidia-driver-390 - distro non-free
 driver   : nvidia-driver-410 - third-party free
 driver   : xserver-xorg-video-nouveau - distro free builtin

Then blacklist the nouveau driver (see https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#runfile-nouveau) and reboot to a text terminal so that it isn't loaded.
 apt-get install build-essential
 gcc --version
 vi /etc/modprobe.d/blacklist-nouveau.conf
Add to the file:
 blacklist nouveau
 options nouveau modeset=0
Then:
 update-initramfs -u
 shutdown -r now
Reboot to a text terminal:
 lspci -vk
Shows no kernel driver in use! Install the driver!
 apt install nvidia-driver-430

====CUDA====
Get CUDA 10.0, rather than 10.1. Although 10.1 is the latest version at the time of writing, it won't work with Tensorflow 1.13, so you'll just end up installing 10.0 under conda anyway.
*The installation instructions are here: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html
*You can download CUDA 10.0 from here: https://developer.nvidia.com/cuda-10.0-download-archive?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1804&target_type=runfilelocal

Essentially, first install build-essential, which gets you gcc. Then run the installer script and DO NOT install the driver (don't worry about the warning, it will work fine!):
 sh cuda_10.0.130_410.48_linux.run
 
 Do you accept the previously read EULA?
 accept/decline/quit: accept
 Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 410.48?
 (y)es/(n)o/(q)uit: n
 Install the CUDA 10.0 Toolkit?
 (y)es/(n)o/(q)uit: y
 Enter Toolkit Location
  [ default is /usr/local/cuda-10.0 ]:
 Do you want to install a symbolic link at /usr/local/cuda?
 (y)es/(n)o/(q)uit: y
 Install the CUDA 10.0 Samples?
 (y)es/(n)o/(q)uit: y
 Enter CUDA Samples Location
  [ default is /home/ed ]:
 Installing the CUDA Toolkit in /usr/local/cuda-10.0 ...
 Missing recommended library: libGLU.so
 Missing recommended library: libX11.so
 Missing recommended library: libXi.so
 Missing recommended library: libXmu.so
 Missing recommended library: libGL.so
 Installing the CUDA Samples in /home/ed ...
 Copying samples to /home/ed/NVIDIA_CUDA-10.0_Samples now...
 Finished copying samples.
 ===========
 = Summary =
 ===========
 Driver:   Not Selected
 Toolkit:  Installed in /usr/local/cuda-10.0
 Samples:  Installed in /home/ed, but missing recommended libraries
 Please make sure that
  - PATH includes /usr/local/cuda-10.0/bin
  - LD_LIBRARY_PATH includes /usr/local/cuda-10.0/lib64, or, add /usr/local/cuda-10.0/lib64 to /etc/ld.so.conf and run ldconfig as root
 To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-10.0/bin
 Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-10.0/doc/pdf for detailed information on setting up CUDA.
 ***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 384.00 is required for CUDA 10.0 functionality to work.
 To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
     sudo <CudaInstaller>.run -silent -driver
 Logfile is /tmp/cuda_install_2807.log

Now fix the paths. To do this for a single user do:
 export PATH=/usr/local/cuda-10.0/bin:/usr/local/cuda-10.0${PATH:+:${PATH}}
 export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
But it is better to fix it for everyone by editing your environment file:
 vi /etc/environment
 PATH="/usr/local/cuda-10.0/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games"
 LD_LIBRARY_PATH="/usr/local/cuda-10.0/lib64"
With CUDA 10.0, you don't need to edit rc.local to start the persistence daemon (/usr/bin/nvidia-persistenced --verbose); instead, nvidia-persistenced runs as a service.

====Test the installation====
Make the samples...
 cd /usr/local/cuda-10.0/samples
 make
And change into the sample directory and run the tests:
 cd /usr/local/cuda-10.0/samples/bin/x86_64/linux/release
 ./deviceQuery
 ./bandwidthTest
Everything should be good at this point!

===Bcache===
The RAID5 array is set up and mounted as /bulk. We need to add the cache on the m.2 drive.

Begin by installing bcache:
 apt-get install bcache-tools
It was already installed and the newest version.

See what we have:
 fdisk -l
This gives us:
*/dev/nvme0n1p1 m.2
*/dev/sda RAID disk
*/dev/sdb RAID disk
*/dev/sdc RAID disk
*/dev/md0 RAID array
*/dev/sdd 850 (mounting /)

The m.2 is not mounted. This can be seen by checking lsblk (or mount or df):
 lsblk
 NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
 sda           8:0    0   3.7T  0 disk
 └─sda1        8:1    0   3.7T  0 part
   └─md0       9:0    0   7.3T  0 raid5 /bulk
 sdb           8:16   0   3.7T  0 disk
 └─sdb1        8:17   0   3.7T  0 part
   └─md0       9:0    0   7.3T  0 raid5 /bulk
 sdc           8:32   0   3.7T  0 disk
 └─sdc1        8:33   0   3.7T  0 part
   └─md0       9:0    0   7.3T  0 raid5 /bulk
 sdd           8:48   0 465.8G  0 disk
 └─sdd1        8:49   0 465.8G  0 part  /
 sr0          11:0    1  1024M  0 rom
 nvme0n1     259:0    0 465.8G  0 disk
 └─nvme0n1p1 259:1    0 465.8G  0 part

Check the mdadm.conf file and fstab:
 cat /etc/mdadm/mdadm.conf
 ...
 ARRAY /dev/md/0 metadata=1.2 UUID=af515d37:8a0e05a1:59338d18:23f5af21 name=bastard:0
 cat /etc/fstab
 UUID=475ad41e-3d64-4c90-8fbc-9289c050acea / ext4 errors=remount-ro 0 1
 UUID=aa65554a-24d9-450a-b10c-63c5c6a4b48a /bulk ext4 defaults 0 2
 /swapfile none swap sw 0 0
Note that the second UUID refers to /dev/md0, whereas the UUID in mdadm.conf is the UUID of the 3 RAID5 drives together:
 blkid /dev/md0
 /dev/md0: UUID="aa65554a-24d9-450a-b10c-63c5c6a4b48a" TYPE="ext4"
Note we have an active RAID5 array:
 cat /proc/mdstat

Instructions for taking apart and/or (re-)creating a RAID array are here:
*https://www.digitalocean.com/community/tutorials/how-to-create-raid-arrays-with-mdadm-on-ubuntu-18-04
Instructions on building a bcache are here:
*https://wiki.ubuntu.com/ServerTeam/Bcache
*https://www.kernel.org/doc/Documentation/bcache.txt

Unmount the RAID array:
 umount /dev/md0
Wipe both the m.2 and the RAID5 array:
 wipefs -a /dev/nvme0n1p1
 wipefs -a /dev/md0
Make the bcache, formatting both drives (md0 as backing, m.2 as cache). Note that when you do it in one command the assignment is automatic.
 make-bcache -B /dev/md0 -C /dev/nvme0n1p1
If you screw up, cd to /sys/fs/bcache/whatever and then ls -l cache0. If there is an entry in there, echo 1 > stop. This unregisters the cache and should let you start over.
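Before formatting the new device, it is worth confirming the cache actually attached. A minimal sketch using the standard bcache sysfs knobs (switching to writeback is optional and trades safety for speed):
 bcache-super-show /dev/nvme0n1p1                        # superblock of the caching device
 cat /sys/block/bcache0/bcache/state                     # "clean" or "dirty" once attached; "no cache" otherwise
 cat /sys/block/bcache0/bcache/cache_mode                # default is writethrough
 echo writeback > /sys/block/bcache0/bcache/cache_mode   # optional: enable writeback caching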
Check the new bcache array is there, format it and mount it:
 ls /dev/bcache*
 mkfs.ext4 /dev/bcache0
 mount /dev/bcache0 /bulk
Now we need to update fstab (see https://help.ubuntu.com/community/Fstab) with the right UUID and spec:
 blkid /dev/bcache0
 UUID="4c63f20b-ad35-477d-bfaa-82571beba841" TYPE="ext4"
 cp /etc/fstab /etc/fstab.org
 vi /etc/fstab
Comment out the old RAID array entry and add a new entry:
 UUID=4c63f20b-ad35-477d-bfaa-82571beba841 /bulk ext4 rw 0 0
And update your boot image and give it a reboot to check the new bcache array comes back up OK:
 update-initramfs -u
 shutdown -r now

===Samba===
These instructions are taken from the [[Research_Computing_Configuration#Samba]] page with only minor modifications. This guide is helpful: https://linuxconfig.org/how-to-configure-samba-server-share-on-ubuntu-18-04-bionic-beaver-linux

Check samba is running:
 samba --version
Then fix the conf file:
 cp /etc/samba/smb.conf /etc/samba/smb.conf.bak
 vi /etc/samba/smb.conf
 workgroup=BASTARDGROUP
 usershare allow guests = no
 ;comment out the [printers] and [print$] sections
 [bulk]
    comment = Bulk RAID Array
    path = /bulk
    browseable = yes
    create mask = 0775
    directory mask = 0775
    read only = no
    guest ok = no
Test the parameters, change the permissions and ownership:
 testparm /etc/samba/smb.conf
 chmod 770 /bulk
 groupadd smbusers
 chown :smbusers /bulk
Now create the researcher account, and add it to the samba share group:
 cat /etc/group
 groupadd -g 1002 researcher
 useradd -g researcher -G smbusers -s /bin/bash -p 1234 -d /home/researcher -m researcher
 passwd researcher
  hint: littleamount
 smbpasswd -a researcher
Finally restart samba:
 systemctl restart smbd
 systemctl restart nmbd
Check it works (no root password):
 smbclient -L localhost
And add users to the samba group (if not already):
 usermod -G smbusers researcher
Note that this sets the group and will overwrite sudo or other group assignments, so don't do it with your main account. Instead just:
 adduser ed smbusers

===Dev Tools===

====DIGITS====
This section follows https://developer.nvidia.com/rdp/digits-download.

Install Docker CE first, following https://docs.docker.com/install/linux/docker-ce/ubuntu/

Then follow https://github.com/NVIDIA/nvidia-docker#quick-start to install nvidia-docker2, but change the last command to use CUDA 10.0:
 sudo apt-get install -y nvidia-docker2
 sudo pkill -SIGHUP dockerd
 # Test nvidia-smi with the latest official CUDA image
 docker run --runtime=nvidia --rm nvidia/cuda:10.0-base nvidia-smi
Then pull DIGITS using docker (https://hub.docker.com/r/nvidia/digits/):
 docker pull nvidia/digits
Finally run DIGITS inside a docker container (see https://github.com/NVIDIA/nvidia-docker/wiki/DIGITS for other options):
 docker run --runtime=nvidia --name digits -d -p 5000:5000 nvidia/digits
And open a browser to http://localhost:5000/ to see DIGITS.

Documentation:
*https://github.com/NVIDIA/DIGITS/blob/digits-6.0/docs/GettingStarted.md
*https://developer.nvidia.com/digits
Note: you can kill docker containers with
 docker system prune

====cuDNN====
Documentation on installing cuDNN is here:
*https://developer.nvidia.com/cuDNN
*https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html
First, make an installs directory in /bulk and copy the installation files over from the RDP (E:\installs\DIGITS DevBox).
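For example, a minimal sketch of staging the packages (the scp line is only an illustration; copying them over the /bulk Samba share works just as well):
 mkdir -p /bulk/install
 # run from the machine holding the downloads; adjust host and filenames as needed
 scp libcudnn7*_amd64.deb researcher@bastard:/bulk/install/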
Then:
 cd /bulk/install/
 dpkg -i libcudnn7_7.6.1.34-1+cuda10.0_amd64.deb
 dpkg -i libcudnn7-dev_7.6.1.34-1+cuda10.0_amd64.deb
 dpkg -i libcudnn7-doc_7.6.1.34-1+cuda10.0_amd64.deb
And test it:
 cp -r /usr/src/cudnn_samples_v7/ $HOME
 cd $HOME/cudnn_samples_v7/mnistCUDNN
 make clean && make
 ./mnistCUDNN
 Test passed!

====Python Based====
Now install Anaconda, so that we have Python 3, and can pip and conda install things. Instructions for installing Anaconda on Ubuntu 18.04 LTS (e.g., https://docs.anaconda.com/anaconda/install/linux/) all recommend using the shell script. From https://www.anaconda.com/distribution/ the latest version is 3.7, so:
 cd /bulk/install
 curl -O https://repo.anaconda.com/archive/Anaconda3-2019.03-Linux-x86_64.sh
 sha256sum Anaconda3-2019.03-Linux-x86_64.sh
As user researcher, run the installation (this installs Python 3.7.3):
 bash Anaconda3-2019.03-Linux-x86_64.sh
*accept the install location: /home/researcher/anaconda3
*accept the initialization by running conda init
Flush the local env:
 source ~/.bashrc

=====Tensorflow=====
Now install tensorflow using pip (see https://www.tensorflow.org/install/pip).

As root:
 apt install python3-pip
 apt install virtualenv
 pip3 install -U virtualenv
As researcher:
 cd /home/researcher
 virtualenv --system-site-packages -p python3 ./venv
 source ./venv/bin/activate  # sh, bash, ksh, or zsh
 pip install --upgrade tensorflow-gpu
 python -c "import tensorflow as tf; tf.enable_eager_execution(); print(tf.reduce_sum(tf.random_normal([1000, 1000])))"
Note: to deactivate the virtual environment:
 deactivate
Note that adding the anaconda path to /etc/environment makes the virtual environment redundant.

=====PyTorch and SciKit=====
Run the following as researcher (in venv):
 conda install -c anaconda numpy
 conda install pytorch torchvision cudatoolkit=10.0 -c pytorch
 conda install -c anaconda scikit-learn
Refs:
*https://anaconda.org/anaconda/scikit-learn
*https://anaconda.org/anaconda/numpy
*https://pytorch.org/

====Other packages====
The following are not yet installed:
*Caffe: http://caffe.berkeleyvision.org/
*BIDMach: https://github.com/BIDData/BIDMach/wiki/Installing-and-Running

=====Theano=====
Theano v.1 requires Python >=3.4 and <3.6. We are currently running 3.7. If we decide to install Theano, we'll need to set up another version of Python and another virtual environment. See:
*http://deeplearning.net/software/theano/install_ubuntu.html

===VNC===
In order to use the graphical interface for Matlab and other applications, we need a VNC server.

First, install the VNC client remotely. We use the standalone exe from TigerVNC.

Now install TightVNC, following the instructions: https://www.digitalocean.com/community/tutorials/how-to-install-and-configure-vnc-on-ubuntu-18-04

As root:
 cd /root
 apt-get install xfce4 xfce4-goodies
As user:
 sudo apt-get install tightvncserver
 vncserver
 (set password for user)
 vncserver -kill :1
 mv ~/.vnc/xstartup ~/.vnc/xstartup.bak
 vi ~/.vnc/xstartup
 #!/bin/bash
 xrdb $HOME/.Xresources
 startxfce4 &
Then set up the systemd unit:
 vncserver
 sudo vi /etc/systemd/system/vncserver@.service
 [Unit]
 Description=Start TightVNC server at startup
 After=syslog.target network.target
 
 [Service]
 Type=forking
 User=uname
 Group=uname
 WorkingDirectory=/home/uname
 PIDFile=/home/ed/.vnc/%H:%i.pid
 ExecStartPre=-/usr/bin/vncserver -kill :%i > /dev/null 2>&1
 ExecStart=/usr/bin/vncserver -depth 24 -geometry 1280x800 :%i
 ExecStop=/usr/bin/vncserver -kill :%i
 
 [Install]
 WantedBy=multi-user.target
Note that changing the color depth breaks it!
To make the changes take effect (or after editing the unit file):
 sudo systemctl daemon-reload
 sudo systemctl enable vncserver@2.service
 vncserver -kill :2
 sudo systemctl start vncserver@2
 sudo systemctl status vncserver@2
Stop the server with:
 sudo systemctl stop vncserver@2
Note that we are using :2 because :1 is running our regular X Windows GUI.

Instructions on how to set up an IP tunnel using PuTTY: https://helpdeskgeek.com/how-to/tunnel-vnc-over-ssh/

====Connection Issues====
Coming back to this, I had issues connecting. I set up the tunnel using the saved profile in puTTY.exe and checked to see which local port was listening (it was 5901) and not firewalled, using the listening ports tab under network on resmon.exe (it said allowed, not restricted, under firewall status). VNC seemed to be running fine on Bastard, and I tried connecting to localhost::1 (that is 5901 on the localhost, through the tunnel to 5902 on Bastard) using VNC Connect by RealVNC. The connection was refused.

I checked it was listening and there was no firewall:
 netstat -tlpn
 tcp 0 0 0.0.0.0:5902 0.0.0.0:* LISTEN 2025/Xtightvnc
 ufw status
 Status: inactive
The localhost port seems to be open and listening just fine:
 Test-NetConnection 127.0.0.1 -p 5901
So, presumably, there must be something wrong with the tunnel itself. '''Ignoring the SSH tunnel worked fine: connect to 192.168.2.202::5902 using the TightVNC (or RealVNC, etc.) client.'''

===RDP===
I also installed xrdp:
 apt install xrdp
 adduser xrdp ssl-cert
 #Check the status and that it is listening on 3389
 systemctl status xrdp
 netstat -tln
 #It is listening...
 vi /etc/xrdp/xrdp.ini
 #See https://linux.die.net/man/5/xrdp.ini
 systemctl restart xrdp
This gave a dead session (a flat light blue screen with nothing on it), which finally yielded a connection log which said "login successful for display 10, start connecting, connection problems, giving up, some problem."
 cat /var/log/xrdp-sesman.log
There could be some conflict between VNC and RDP. systemctl status xrdp shows "xrdp_wm_log_msg: connection problem, giving up".

I tried the following without success:
 gsettings set org.gnome.Vino require-encryption false
(https://askubuntu.com/questions/797973/error-problem-connecting-windows-10-rdp-into-xrdp)
 vi /etc/X11/Xwrapper.config
 allowed_users = anybody
This was promising as it was previously set to console (https://www.linuxquestions.org/questions/linux-software-2/xrdp-under-debian-9-connection-problem-4175623357/#post5817508).
 apt-get install xorgxrdp-hwe-18.04
Couldn't find the package... This lead was promising as it applies to 18.04.02 HWE, which is what I'm running: https://www.nakivo.com/blog/how-to-use-remote-desktop-connection-ubuntu-linux-walkthrough/
 dpkg -l | grep xserver-xorg-core
 ii xserver-xorg-core 2:1.19.6-1ubuntu4.3 amd64 Xorg X server - core server
Which seems OK, despite there being a problem with XRDP and Ubuntu 18.04 HWE documented very clearly here: http://c-nergy.be/blog/?p=13972

There is clearly an issue with Ubuntu 18.04 and XRDP. The solution seems to be to downgrade xserver-xorg-core and some related packages, which can be done with an install script (https://c-nergy.be/blog/?p=13933) or manually. But I don't want to do that, so I removed xrdp and went back to VNC!
 apt remove xrdp

===Other Software===
I installed the community edition of PyCharm:
 snap install pycharm-community --classic
 #Restart the local terminal so that it has updated paths (after a snap install, etc.)
 /snap/pycharm-community/214/bin/pycharm.sh
On launch, you get some config options.
I chose to install and enable:
*IdeaVim (a VI editor emulator)
*R
*AWS Toolkit

Make a launcher in /usr/share/applications:
 vi pycharm.desktop
 [Desktop Entry]
 Version=2020.2.3
 Type=Application
 Name=PyCharm
 Icon=/snap/pycharm-community/214/bin/pycharm.png
 Exec="/snap/pycharm-community/214/bin/pycharm.sh" %f
 Comment=The Drive to Develop
 Categories=Development;IDE;
 Terminal=false
 StartupWMClass=jetbrains-pycharm
Also, create a launcher on the desktop with the same info.
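To sanity-check the launcher file, desktop-file-validate (from the desktop-file-utils package) can be run against it:
 apt install desktop-file-utils
 desktop-file-validate /usr/share/applications/pycharm.desktop
 update-desktop-database /usr/share/applications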
