Installing TensorFlow GPU

EXTRA:
Handy commands: nvidia-smi (GPU and driver status), neofetch (system summary)

*** THIS IS THE LAST STEP, AFTER INSTALLING THE NVIDIA DRIVER, CUDA AND cuDNN ***
cd /tmp
curl -O https://repo.anaconda.com/archive/Anaconda3-5.2.0-Linux-x86_64.sh
bash Anaconda3-5.2.0-Linux-x86_64.sh 
(Answer yes when the installer asks whether to add Anaconda to PATH in ~/.bashrc.)
source ~/.bashrc
conda list 

conda create --name tfgpu
conda env list 
source activate tfgpu 
conda install -c anaconda tensorflow-gpu
To delete the environment and start over (only if needed):
conda remove -n tfgpu --all
sudo apt install neofetch
neofetch
 
***************************************************************************
 
https://github.com/markjay4k/Install-Tensorflow-on-Ubuntu-17.10-/blob/master/Tensorflow%20Install%20instructions.ipynb
https://www.youtube.com/watch?v=vxjbL5iN1XY

Overview (steps from the guide linked above; the versions I actually installed are under My Case)

    Step 1: Update your GPU driver (should be higher than version 390)
    Step 2: Install the CUDA Toolkit version 9.0 (with all the patches)
    Step 3: Install cuDNN 7.0.5
    Step 4: Install TensorFlow GPU
    Step 5: Test it!

My Case:
 
Step 1: NVIDIA® GPU drivers (CUDA 9.0 requires 384.x or higher)
GPU: RTX 2070
Driver: Linux x64 (AMD64/EM64T) Display Driver
Version:  418.43
Release Date:  2019.2.22
Operating System:  Linux 64-bit
Language:  English (US)
File Size:  101.71 MB
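
A quick way to compare the driver version against the stated minimum, as a minimal Python sketch. The helper names are my own, and I treat "384.x" or "390" as the plain numbers 384 and 390, which is an assumption about how the requirement should be read.

```python
def parse_version(text):
    """Turn a dotted version string such as '418.43' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_meets_minimum(installed, minimum):
    """Return True if the installed driver version is at least the minimum."""
    a = parse_version(installed)
    b = parse_version(minimum)
    # Pad with zeros so that '390' and '390.0' compare as equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# 418.43 is the driver from these notes; 390 is the minimum from the overview.
print(driver_meets_minimum("418.43", "390"))
```

Comparing tuples of integers avoids the trap of comparing version strings as text, where "99.0" would sort above "418.43".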

bash /home/rb/Downloads/NVIDIA-Linux-x86_64-418.43.run [error]
sudo bash /home/rb/Downloads/NVIDIA-Linux-x86_64-418.43.run [error]
sudo apt install gcc
sudo bash /home/rb/Downloads/NVIDIA-Linux-x86_64-418.43.run [error]
sudo apt install make 
sudo bash /home/rb/Downloads/NVIDIA-Linux-x86_64-418.43.run [error]
*** The installer then reported a 32-bit related error ***
So enable the i386 architecture and install the build tools:
sudo dpkg --add-architecture i386
sudo apt update
sudo apt install build-essential libc6:i386 

Disable the default open-source nouveau driver:
sudo bash -c "echo blacklist nouveau > /etc/modprobe.d/blacklist-nvidia-nouveau.conf"
sudo bash -c "echo options nouveau modeset=0 >> /etc/modprobe.d/blacklist-nvidia-nouveau.conf"
This creates a file named blacklist-nvidia-nouveau.conf in /etc/modprobe.d with the contents:
blacklist nouveau
options nouveau modeset=0
 
Confirm with:
cat /etc/modprobe.d/blacklist-nvidia-nouveau.conf
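
The same check can be scripted. This is only a sketch: the function name is hypothetical, and it assumes the file should contain exactly the two directives written above.

```python
def nouveau_is_blacklisted(conf_text):
    """Check that both expected directives appear in the modprobe config text."""
    # Collapse runs of whitespace and skip blank lines and comments.
    lines = {
        " ".join(line.split())
        for line in conf_text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    }
    return {"blacklist nouveau", "options nouveau modeset=0"} <= lines


# Sample text matching what the two echo commands write.
sample = "blacklist nouveau\noptions nouveau modeset=0\n"
print(nouveau_is_blacklisted(sample))
```

To check the real file, pass in the text read from /etc/modprobe.d/blacklist-nvidia-nouveau.conf instead of the sample string.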
 
Then rebuild the initramfs so the blacklist is picked up at boot:
sudo update-initramfs -u
 
The update-initramfs script manages the initramfs images on your local machine.
It keeps track of the existing initramfs archives in /boot.
There are three modes of operation: create, update or delete (-u selects update). ...
At boot time, the kernel unpacks that archive into a RAM disk, mounts it, and uses it as the initial root file system.

Then reboot:
sudo reboot
 
To install the new Nvidia driver we need to stop the current display server.
The easiest way to do this is to switch to runlevel 3 using the telinit command.
Running the following command stops the display server,
so save all your current work (if any) before you proceed:

sudo telinit 3

Press CTRL+ALT+F1 and log in with your username and password to open a new TTY1 session.

Then:
sudo bash /home/rb/Downloads/NVIDIA-Linux-x86_64-418.43.run
 
Installer prompts and the answers I gave:
The distribution-provided pre-install script failed! Are you sure you want to continue? -> CONTINUE INSTALLATION
Would you like to run the nvidia-xconfig utility? -> YES
  
sudo reboot
 
After the reboot you should be able to start the NVIDIA X Server Settings app from the Activities menu.

https://linuxconfig.org/how-to-install-the-nvidia-drivers-on-ubuntu-18-04-bionic-beaver-linux
https://www.tensorflow.org/install/gpu
 
Verify the driver with:
nvidia-smi
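
If you want the GPU name and driver version in a script rather than reading the nvidia-smi table, something like the following should work. It is a sketch, not a tested tool: the function names are my own, and it relies on nvidia-smi's --query-gpu and --format=csv,noheader options being available in the installed driver.

```python
import subprocess


def parse_gpu_rows(csv_text):
    """Parse 'name, driver_version' rows from nvidia-smi CSV output."""
    rows = []
    for line in csv_text.splitlines():
        if not line.strip():
            continue
        # Split on the last comma only, in case a GPU name contains one.
        name, driver = (field.strip() for field in line.rsplit(",", 1))
        rows.append({"name": name, "driver_version": driver})
    return rows


def query_gpus():
    """Run nvidia-smi and return one dict per GPU (requires the NVIDIA driver)."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return parse_gpu_rows(result.stdout)
```

Calling query_gpus() on a machine where the driver loaded correctly should return one entry per card; if nvidia-smi is missing or fails, subprocess raises an exception instead, which is itself a useful signal that the driver install did not take.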


Step 2: CUDA Toolkit (10.1 in my case)
https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1804&target_type=deblocal


sudo dpkg -i cuda-repo-ubuntu1804-10-1-local-10.1.105-418.39_1.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-10-1-local-10.1.105-418.39/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda

Error:
The following packages have unmet dependencies:
 cuda : Depends: cuda-10-1 (>= 10.1.105) but it is not going to be installed
  
This means the cuda-10-1 package is missing. You can try installing it directly:
sudo apt-get install cuda-10-1
but that only reports the next missing dependency, and so on down the chain.
A better approach:
 
Solution:
sudo apt-get install aptitude
sudo aptitude install cuda
 
aptitude uses a different dependency resolver than apt-get. It appears to work back through the chain, install all the missing dependencies, and then install the main package.

 
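Update your PATH variable
 
Add these two lines to ~/.bashrc (no sudo needed, it is your own file):
gedit ~/.bashrc
export PATH=/usr/local/cuda-10.1/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
[Yes, each line ends with two closing braces: }}]
This prepends /usr/local/cuda-10.1/bin to PATH and /usr/local/cuda-10.1/lib64 to LD_LIBRARY_PATH.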
 
source ~/.bashrc
or
. ~/.bashrc 
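
The ${VAR:+:${VAR}} part of the export lines is easy to misread, so here is a small Python sketch of what it does: the colon and the old value are appended only when the variable is already set and non-empty. The function name is hypothetical and the paths in the demo are just examples.

```python
import shutil


def prepend_path(new_dir, existing):
    """Mimic NEW${VAR:+:${VAR}}: add ':existing' only if existing is non-empty."""
    if existing:
        return new_dir + ":" + existing
    return new_dir


# With an existing PATH the new directory goes in front, separated by a colon.
print(prepend_path("/usr/local/cuda-10.1/bin", "/usr/bin:/bin"))
# With an empty variable there is no stray trailing colon.
print(prepend_path("/usr/local/cuda-10.1/lib64", ""))

# After reloading ~/.bashrc, nvcc should be found on PATH (prints None if not).
print(shutil.which("nvcc"))
```

The point of the conditional form is the second case: a plain `dir:$VAR` with an empty variable leaves a trailing colon, which the shell treats as the current directory.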


Step 3: cuDNN install

Download cuDNN Library for Linux
 
https://developer.nvidia.com/cudnn 

 
Extract the archive, then copy the header and libraries into the CUDA install:
tar -xzvf cudnn-10.1-linux-x64-v7.5.0.56.tgz
 
sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
 
cuDNN is now installed.
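
To confirm which cuDNN version actually got copied, you can read the version macros out of the header. This is a sketch with a hypothetical function name; it assumes the cuDNN 7.x layout, where cudnn.h defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL.

```python
import re


def cudnn_version(header_text):
    """Extract (major, minor, patch) from the text of cudnn.h, or None."""
    parts = []
    for key in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        pattern = r"^\s*#define\s+" + key + r"\s+(\d+)"
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


# Sample lines in the shape the 7.x header uses; not copied from a real file.
sample = "#define CUDNN_MAJOR 7\n#define CUDNN_MINOR 5\n#define CUDNN_PATCHLEVEL 0\n"
print(cudnn_version(sample))
```

For the real check, pass in the contents of /usr/local/cuda/include/cudnn.h. A None result means the macros were not found, for example because the header was not copied.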
 
 
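Step 5: Test it

After installing tensorflow-gpu in the tfgpu environment (see the conda commands at the top):
source ~/.bashrc
Restart the terminal, then: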
source activate tfgpu 
conda install jupyter notebook
jupyter notebook

# Run in a notebook cell (TensorFlow 1.x API; tf.Session is not in the 2.x top-level API)
import tensorflow as tf
print(tf.__version__)
hello = tf.constant('hello tensorflow')
with tf.Session() as sess:
    print(sess.run(hello))

Output:
1.12.0
b'hello tensorflow'
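
The hello test passes on a CPU-only install too, so it does not by itself show that the GPU is being used. A possible extra check, as a sketch: list the devices TensorFlow 1.x reports and keep the GPUs. The helper names and the FakeDevice stand-in are my own; device_lib.list_local_devices() is the TensorFlow 1.x call this relies on.

```python
from collections import namedtuple

# Stand-in for TensorFlow's device records, used only for the demo below.
FakeDevice = namedtuple("FakeDevice", ["name", "device_type"])


def gpu_names(devices):
    """Return the names of devices whose device_type is 'GPU'."""
    return [d.name for d in devices if d.device_type == "GPU"]


def list_tensorflow_gpus():
    """Ask TensorFlow 1.x which GPUs it can see (needs tensorflow-gpu)."""
    # Imported lazily so gpu_names() can be used without TensorFlow installed.
    from tensorflow.python.client import device_lib
    return gpu_names(device_lib.list_local_devices())


# Demo with made-up records in the shape TensorFlow reports them.
demo = [FakeDevice("/device:CPU:0", "CPU"), FakeDevice("/device:GPU:0", "GPU")]
print(gpu_names(demo))
```

In the notebook, call list_tensorflow_gpus(); an empty list suggests TensorFlow fell back to the CPU, which usually points at a driver, CUDA or cuDNN mismatch.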