Installing TensorFlow


Old

Tensorflow 1.9.0 with GPU Installation Log

Important note:
Install the exact versions of software and packages specified in the Tensorflow instructions. A different version, for example CUDA Toolkit 9.2 instead of 9.0, may cause tensorflow to fail. Upgrade tensorflow very carefully: as of July 2018, a careless installation easily breaks it. DO NOT attempt to install Tensorflow under your own user account. Tensorflow has been installed for all users, and a new local install will interfere with it.

Synopsis

Tensorflow was previously installed. In summer 2018, a new graphics card was installed on DB Server. Wei and Minh then installed and configured tensorflow-gpu 1.9.0 for Python3.6 for all users of DB Server.

Using Tensorflow

On DB Server, tensorflow-gpu 1.9.0 is installed for python3.6, not for the default python3 (Python 3.5) or the default python (Python 2.7). Because the system default python3 might change, check which interpreters are in use by typing the following in a terminal:

which python3

and

which python3.6   
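The same check can be scripted. The following is a minimal sketch, not part of the original log: `describe_interpreters` is a hypothetical helper name, and it only reports what is found on the current PATH.

```python
import shutil
import sys


def describe_interpreters(names=("python", "python3", "python3.6")):
    """Map each command name to the executable found on PATH (or None)."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # The interpreter actually running this script.
    print("running under:", sys.executable, sys.version.split()[0])
    for name, path in describe_interpreters().items():
        print("{:<10} -> {}".format(name, path or "not found on PATH"))
```

Running it with python3.6 and again with python3 shows whether the two commands resolve to different interpreters.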

To quickly test whether tensorflow-gpu is working for python3.6, type the following into a terminal:

python3.6 -c "import tensorflow as tf; sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))"

This reports which CPU and GPU devices tensorflow is using. If no GPU device is listed, something is wrong with the installation.
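For a less noisy check than reading the device placement log, the device list can be filtered for GPUs. This is a sketch under assumptions: it relies on `device_lib.list_local_devices()` from the TensorFlow 1.x Python package, and `gpu_device_names` is a hypothetical helper name introduced here.

```python
def gpu_device_names(devices):
    """Return the names of all devices whose device_type is "GPU"."""
    return [d.name for d in devices if d.device_type == "GPU"]


if __name__ == "__main__":
    try:
        # device_lib ships with the TensorFlow 1.x Python package.
        from tensorflow.python.client import device_lib
    except ImportError:
        print("tensorflow is not importable from this interpreter")
    else:
        gpus = gpu_device_names(device_lib.list_local_devices())
        print("GPU devices:", gpus if gpus else "none found")
```

Run it with python3.6, since that is the interpreter tensorflow-gpu was installed for.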

NVIDIA configuration

Before installing tensorflow with GPU support, configure the NVIDIA® software by following the instructions at: https://www.tensorflow.org/install/install_linux#NVIDIARequirements

Install CUDA Toolkit 9.0

  • 1. Installed CUDA Toolkit 9.0 Base Installer with the Runfile option. The toolkit is in
/usr/local/cuda-9.0 

Did NOT install the NVIDIA Accelerated Graphics Driver for Linux-x86_64 384.81, because we believe we already have a much newer graphics driver (396.26). Installed the CUDA 9.0 samples in

HOME/MCNAIR/CUDA-SAMPLES.
  • 2. Installed Patch 1, 2 and 3. The command to install was
sudo sh cuda_9.0.176.2_linux.run # (9.0.176.1 for patch 1 and 9.0.176.3 for patch 3)
  • 3. Set up the environment variables:

The PATH variable needs to include /usr/local/cuda-9.0/bin. To add this path to the PATH variable:

export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}

In addition, when using the runfile installation method, the LD_LIBRARY_PATH variable needs to contain /usr/local/cuda-9.0/lib64 on a 64-bit system. To change the environment variable on a 64-bit operating system:

export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

Note that the above paths change when using a custom install path with the runfile installation method.
To make these settings persist across logins, add them to .bashrc:

nano /home/mcnair/.bashrc

Add

export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

Save and exit. Close and open the terminal (or source .bashrc).

  • 4. To verify CUDA Toolkit 9.0 is installed, type
nvcc -V
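The verification in step 4 can also be scripted. The sketch below assumes that `nvcc -V` prints a line of the form "Cuda compilation tools, release 9.0, V9.0.176"; `parse_nvcc_release` is a hypothetical helper name, not part of any NVIDIA tool.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(output):
    """Extract the release string (e.g. "9.0") from `nvcc -V` output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", output)
    return match.group(1) if match else None


if __name__ == "__main__":
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        print("nvcc not found on PATH; check the PATH export in .bashrc")
    else:
        # stdout=PIPE with universal_newlines keeps this usable on Python 3.6.
        result = subprocess.run([nvcc, "-V"], stdout=subprocess.PIPE,
                                universal_newlines=True)
        print("CUDA release:", parse_nvcc_release(result.stdout))
```

If the reported release is not 9.0, a different toolkit's bin directory probably comes first on PATH.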

 

Install cuDNN v7.1.4

  • 5. Downloaded cuDNN v7.1.4 for CUDA 9.0:

To download cuDNN, first register for the NVIDIA Developer Program. Then go to the NVIDIA cuDNN home page -> click Download -> complete the short survey and click Submit -> accept the Terms and Conditions. A list of available cuDNN versions is displayed -> select the cuDNN version you want to install. We chose the tar file.

  • 6. Install cuDNN. In the commands below, the CUDA directory path is
/usr/local/cuda/

and the cuDNN download path is referred to as

<cudnnpath>

Run the following commands: a. Navigate to the <cudnnpath> directory containing the cuDNN tar file. b. Unpack the cuDNN package.

$ tar -xzvf cudnn-9.0-linux-x64-v7.tgz

c. Copy the following files into the CUDA Toolkit directory.

$ sudo cp cuda/include/cudnn.h /usr/local/cuda/include
$ sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
$ sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
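To confirm that the copied header is the intended cuDNN version, the version macros in cudnn.h can be read back. This is a sketch assuming the cuDNN 7 header layout, in which cudnn.h defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL; `parse_cudnn_version` is a hypothetical helper name.

```python
import os
import re


def parse_cudnn_version(header_text):
    """Return "major.minor.patch" from cudnn.h text, or None if incomplete."""
    parts = []
    for macro in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"#define\s+" + macro + r"\s+(\d+)", header_text)
        if match is None:
            return None
        parts.append(match.group(1))
    return ".".join(parts)


if __name__ == "__main__":
    header = "/usr/local/cuda/include/cudnn.h"  # destination used in step 6
    if os.path.exists(header):
        with open(header) as handle:
            print("cuDNN version:", parse_cudnn_version(handle.read()))
    else:
        print("not found:", header)
```

For the installation described here, the expected result would be 7.1.4.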

Install GPU drivers

  • 7. Did not need to install the GPU drivers because we already had the correct version.

Install libcupti-dev library

  • 8. Tried to install the libcupti-dev library with:
sudo apt-get install cuda-command-line-tools-9-0

but apparently it was already installed. (How surprising!)

LD_LIBRARY_PATH environment variable modification

  • 9. Added the following path to the LD_LIBRARY_PATH environment variable by editing .bashrc as described above:
 export LD_LIBRARY_PATH=${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}/usr/local/cuda/extras/CUPTI/lib64
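The `${VAR:+...}` expansions used in these export lines add a colon only when the variable is already set and non-empty, which avoids a stray leading or trailing colon. The sketch below mimics both forms in Python for illustration; `prepend_path` and `append_path` are hypothetical names, not shell or library functions.

```python
def prepend_path(new_dir, current):
    """Mimic  NEW${VAR:+:${VAR}} : new_dir first, colon only if VAR is non-empty."""
    return new_dir + (":" + current if current else "")


def append_path(new_dir, current):
    """Mimic  ${VAR:+${VAR}:}NEW : new_dir last, colon only if VAR is non-empty."""
    return (current + ":" if current else "") + new_dir


if __name__ == "__main__":
    lib64 = "/usr/local/cuda-9.0/lib64"
    cupti = "/usr/local/cuda/extras/CUPTI/lib64"
    value = prepend_path(lib64, "")    # variable initially unset or empty
    print(value)
    value = append_path(cupti, value)  # CUPTI directory added at the end
    print(value)
```

The first form (used for the toolkit's bin and lib64 directories) gives the new directory priority, while the second form (used for CUPTI) places it after the existing entries.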

Install TensorRT 3.0 (optional)

  • 10. Did not install TensorRT 3.0

Problem encountered

1. In /usr/local/ we found 'CUDA-9.2' and 'CUDA-8.0'. These were probably installed in the past.
2. Executing the following command in a terminal returns 'PATH: command not found'.

$ export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}

3. If CUDA is installed correctly, typing nvcc -V should verify the installation. But currently it returns 'the program nvcc is currently not installed'.
4. When adding the libcupti-dev library, after adding the path:

 export LD_LIBRARY_PATH=${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}/usr/local/cuda/extras/CUPTI/lib64

Running source .bashrc then returns the following:

 -bash: export: `:/usr/local/cuda-9.0/lib64:/usr/local/cuda-9.0/extras/CUPTI/lib64': not a valid identifier

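So far this does not affect the functionality of tensorflow, but it will probably affect the libcupti-dev library. The leading colon in the rejected identifier suggests that export received the ${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}} expansion as a separate argument, which happens when an export line in .bashrc is split with a backslash and the continuation line starts with spaces. Writing each export on a single line should avoid this.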
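Because several toolkit versions were found under /usr/local (problem 1), it can help to list them and see where /usr/local/cuda points, since the cuDNN files were copied to /usr/local/cuda while PATH refers to /usr/local/cuda-9.0. The following is a minimal sketch; `find_cuda_dirs` is a hypothetical helper, and whether /usr/local/cuda is a symlink on DB Server is an assumption to verify.

```python
import os


def find_cuda_dirs(root="/usr/local"):
    """List entries under root whose name starts with "cuda" (any case)."""
    try:
        names = os.listdir(root)
    except OSError:
        return []
    return sorted(n for n in names if n.lower().startswith("cuda"))


if __name__ == "__main__":
    for name in find_cuda_dirs():
        path = os.path.join("/usr/local", name)
        # /usr/local/cuda is often a symlink to one specific toolkit version.
        target = " -> " + os.path.realpath(path) if os.path.islink(path) else ""
        print(path + target)
```

If /usr/local/cuda resolves to a version other than 9.0, the cuDNN files from step 6 ended up in a different toolkit than the one on PATH.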

Tensorflow (with GPU support) Installation

We followed the instructions at https://www.tensorflow.org/install/install_linux#InstallingVirtualenv to install tensorflow.

Install Tensorflow using the Virtual Environment

Install on DBServer under the user McNair. Password: askEd

  • 1. Install virtualenv:

Surprise again! Someone had already installed it, so we did not install virtualenv again.

  • 2. Create a directory for the virtual environment and choose python 3 interpreter
 mkdir ~/tensorflow  # somewhere to work out of
 cd ~/tensorflow
 # Choose one of the following Python environments for the ./venv directory:
 virtualenv --system-site-packages -p python3 venv # Use Python 3.n

NOTE: python2 DOES NOT WORK WITH GPU

  • 3. Activate the Virtualenv environment:
 source ~/tensorflow/venv/bin/activate      # bash
  • 4. Upgrade pip:
pip install -U pip
  • 5. Install TensorFlow within the virtual environment:
 pip install -U tensorflow-gpu
  • 6. Validate the installation with:
(venv)$ python -c "import tensorflow as tf; print(tf.__version__)"
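Before validating, it is worth confirming that the shell is really using the virtual environment's interpreter and not the system one. This is a minimal sketch; `in_virtualenv` is a hypothetical helper that relies only on attributes of the standard `sys` module.

```python
import sys


def in_virtualenv():
    """True if this interpreter runs inside a virtualenv or venv."""
    # Older virtualenv releases set sys.real_prefix; venv and newer
    # virtualenv releases make sys.prefix differ from sys.base_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside virtual environment:", in_virtualenv())
```

With the environment from step 3 activated, the interpreter path should be under ~/tensorflow/venv.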

 

Testing Tensorflow with GPU in virtual environment

Create a python file with the following:

import tensorflow as tf
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

Run it in the virtual environment.  

Installing Tensorflow as root for All users

Important note: Currently on DB Server, pip/pip3 works with Python3.6 rather than the default python3. Hence the following installs a copy of tensorflow-gpu 1.9.0 for Python3.6 for all users.

Installation

  • 0. Deleted previously installed tensorflow with CPU support:
sudo pip3 uninstall tensorflow
  • 1. Used this command to install tensorflow-gpu:
 sudo pip3 install -U tensorflow-gpu
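After these two steps, only tensorflow-gpu should remain installed for Python3.6. The sketch below reports the installed version of each package; `installed_version` is a hypothetical helper that uses `importlib.metadata` where available and falls back to `pkg_resources` (from setuptools) on older interpreters such as 3.6.

```python
def installed_version(dist_name):
    """Return the installed version of a pip distribution, or None."""
    try:
        from importlib import metadata  # Python 3.8+
    except ImportError:
        metadata = None
    if metadata is not None:
        try:
            return metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            return None
    import pkg_resources  # fallback for older Pythons such as 3.6
    try:
        return pkg_resources.get_distribution(dist_name).version
    except pkg_resources.DistributionNotFound:
        return None


if __name__ == "__main__":
    for name in ("tensorflow", "tensorflow-gpu"):
        print("{:<15} {}".format(name, installed_version(name) or "not installed"))
```

Run it with python3.6, the interpreter that pip3 installs into on DB Server. If the CPU-only tensorflow package still shows a version, step 0 did not take effect.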

Path variable (crucial)

If you are logged in as a user who is using tensorflow for the first time, you need to set the CUDA Toolkit 9.0 environment variables. Type into a terminal:

nano .bashrc

Add the following:

export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

Save and exit (CTRL + O and CTRL + X). Type

source .bashrc 
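To check that the new settings took effect in the current session, the two variables can be inspected for the required directories. This is a sketch: `missing_cuda_entries` and `REQUIRED` are hypothetical names, and the directories are the ones given in this section.

```python
import os

# Directories this page says must appear in each variable.
REQUIRED = {
    "PATH": "/usr/local/cuda-9.0/bin",
    "LD_LIBRARY_PATH": "/usr/local/cuda-9.0/lib64",
}


def missing_cuda_entries(env):
    """Return {variable: directory} for each required entry that is absent."""
    missing = {}
    for var, directory in REQUIRED.items():
        entries = env.get(var, "").split(":")
        if directory not in entries:
            missing[var] = directory
    return missing


if __name__ == "__main__":
    problems = missing_cuda_entries(os.environ)
    if not problems:
        print("CUDA 9.0 environment variables look complete")
    for var, directory in problems.items():
        print("{} is missing {}".format(var, directory))
```

The comparison is an exact string match, so an entry written with a trailing slash would be reported as missing.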

Testing Tensorflow with GPU as (non-root) user

After connecting to DB Server via ssh, type the following command into a terminal:

python3.6 -c "import tensorflow as tf; print(tf.__version__);sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))"