This is a companion piece to my instructions on building TensorFlow from source. The aim is to install the following pieces of software on Ubuntu 20.04:
- NVIDIA graphics card driver (v440.82)
- CUDA (v10.2)
- cuDNN (v7.6.5)
At the time of writing (2020-05-23), these were the latest available versions. As a disclaimer, please note that I am not interested in running an outdated Ubuntu version or installing a CUDA/cuDNN version that is not the latest, so the instructions below may or may not be useful to you. Please also note that they are likely outdated, since I only update them occasionally. Don't just copy these instructions; check what the respective latest versions are and use those instead!
Every CUDA version requires a minimum version of the NVIDIA graphics driver; check this beforehand.
Ubuntu 20.04 currently offers NVIDIA driver version 440.82 through its built-in 'Additional Drivers' mechanism, which should be sufficient for CUDA 10.2. Otherwise, download and install the latest NVIDIA graphics driver from here: https://www.nvidia.com/en-us/drivers/unix/.
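To check the minimum-driver requirement before touching CUDA, something like the following sketch may help. It asks nvidia-smi for the installed driver version and compares it against a minimum. The value 440.33 is an assumption taken from the driver version embedded in the CUDA 10.2 runfile name; the helper name version_ge is made up for this example, and the comparison relies on GNU sort -V.

```shell
#!/bin/sh
# Assumed minimum Linux driver for CUDA 10.2 (taken from the 440.33.01 in the runfile name).
MIN_DRIVER="440.33"

# version_ge A B: succeeds if version A >= version B (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# nvidia-smi ships with the driver; if it is missing, the driver probably is too.
if command -v nvidia-smi >/dev/null 2>&1; then
    installed=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
    if version_ge "$installed" "$MIN_DRIVER"; then
        echo "Driver $installed is new enough for CUDA 10.2"
    else
        echo "Driver $installed is older than $MIN_DRIVER; upgrade it first"
    fi
else
    echo "nvidia-smi not found; is the driver installed?"
fi
```

For a different CUDA version, look up the required driver in the release notes and adjust MIN_DRIVER accordingly.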
The CUDA runfile also includes a version of the NVIDIA graphics driver, but I prefer to install the two separately, since installing them in combination can be more brittle on distributions that CUDA considers "unsupported".
Download the latest CUDA version here. For example, I downloaded:
$ wget http://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda_10.2.89_440.33.01_linux.run
Here's the first roadblock: Ubuntu 20.04 ships with GCC 9.3.0 by default, but CUDA 10.2 pretends to only support Ubuntu 18.04 and GCC versions up to version 8. When trying to install CUDA on an up-to-date system, it will fail.
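If you want to know in advance whether your compiler will trip the check, a rough sketch like this compares the major version reported by gcc -dumpversion against the limit of 8 mentioned above. The function names gcc_major and cuda_accepts_gcc are invented for this example and are not part of any NVIDIA tooling.

```shell
#!/bin/sh
# Highest GCC major version that CUDA 10.2 accepts, as stated above.
MAX_GCC_MAJOR=8

# gcc_major VERSION: print the major component of a version string such as "9.3.0".
gcc_major() {
    printf '%s\n' "${1%%.*}"
}

# cuda_accepts_gcc VERSION: succeeds if the major version does not exceed the limit.
cuda_accepts_gcc() {
    [ "$(gcc_major "$1")" -le "$MAX_GCC_MAJOR" ]
}

# Check the system default compiler, if there is one.
if command -v gcc >/dev/null 2>&1; then
    ver=$(gcc -dumpversion)
    if cuda_accepts_gcc "$ver"; then
        echo "GCC $ver passes the CUDA 10.2 compiler check"
    else
        echo "GCC $ver is too new for the CUDA 10.2 compiler check"
    fi
fi
```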
Uhm... this is insane. I understand when code needs to be built with a certain minimum version of a compiler, but no well-written piece of software should ever specify a maximum version.
You would now think that you can simply install GCC 8 (something along the lines of sudo apt install gcc-8 and running CC=$(which gcc-8) CXX=$(which g++-8) ./cuda_10.2.89_440.33.01_linux.run as root) and be happy, but alas, no. The CUDA installer conveniently disregards any such environment variables. And changing the system default compiler with the inadequate update-alternatives mechanism, as suggested in various places on the internet, should clearly not be an option for anyone!
Time for more desperate measures. Go ahead and install CUDA like this:
$ sudo sh cuda_10.2.89_440.33.01_linux.run --override
The --override flag overrides the compiler check, and you can now go on. Deselect the driver if it was installed earlier, but install the rest. Try to build the samples. You will notice that this fails, again with a message such as
unsupported GNU version! gcc versions later than 8 are not supported!
Thanks for nothing, NVIDIA. Thankfully we can disable this error by commenting out the #error directive in /usr/local/cuda/include/crt/host_config.h. Do so. This is what it looks like for me:
#if defined(__GNUC__)
#if __GNUC__ > 8
//#error -- unsupported GNU version! gcc versions later than 8 are not supported!
#endif /* __GNUC__ > 8 */
I have no idea what the implications are, but so far I haven't found any. There's a similar section on Clang just below, in case you decide to compile TensorFlow with Clang. (I have not tried yet, but it should be a good adventure.)
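If you would rather not edit the header by hand, the change described above can be sketched with sed. The function name patch_host_config is made up for this example, and the pattern assumes the #error line reads as quoted above. The sketch only patches a scratch copy in /tmp so that you can inspect the result first.

```shell
#!/bin/sh
# patch_host_config FILE: comment out the GCC version "#error" line in FILE.
# sed keeps a backup next to it as FILE.bak.
patch_host_config() {
    sed -i.bak '/unsupported GNU version/ s|^#error|//#error|' "$1"
}

# Try it on a scratch copy first, then look at the patched line.
HEADER=/usr/local/cuda/include/crt/host_config.h
if [ -r "$HEADER" ]; then
    cp "$HEADER" /tmp/host_config.h
    patch_host_config /tmp/host_config.h
    grep -n 'unsupported GNU version' /tmp/host_config.h
fi
```

Once the scratch copy looks right, the same sed expression can be run with sudo against the real header. Running it twice does not double-comment the line, but the second run overwrites the .bak file with the already patched version, so keep a separate copy of the original if you want one.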
Just go here and follow the instructions. You'll have to log in, so downloading of the right cuDNN binary packages cannot be easily automated. Meh.
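After the installation, one way to confirm which cuDNN version ended up on the system is to read the version macros from cudnn.h. This is only a sketch: the function name cudnn_version is invented here, the header path assumes the files were copied into /usr/local/cuda (a package-based install may put them elsewhere), and the macro layout is the one used by cuDNN 7.x.

```shell
#!/bin/sh
# cudnn_version HEADER: print MAJOR.MINOR.PATCHLEVEL as defined in a cuDNN 7.x cudnn.h.
cudnn_version() {
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
         $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
         $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
         END { print major "." minor "." patch }' "$1"
}

# Assumed install location; adjust if your cudnn.h lives somewhere else.
CUDNN_HEADER=/usr/local/cuda/include/cudnn.h
if [ -r "$CUDNN_HEADER" ]; then
    echo "cuDNN $(cudnn_version "$CUDNN_HEADER")"
fi
```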