@mfcabrera
Forked from erikbern/install-tensorflow.sh
Last active November 20, 2016 20:20
Revisions

  1. mfcabrera revised this gist Nov 20, 2016. 1 changed file with 13 additions and 4 deletions.
    17 changes: 13 additions & 4 deletions install-tensorflow.sh
    Original file line number Diff line number Diff line change
    @@ -10,6 +10,9 @@ sudo apt-get update
    sudo apt-get upgrade -y # choose “install package maintainers version”
    sudo apt-get install -y build-essential python-pip python-dev git python-numpy swig python-dev default-jdk zip zlib1g-dev

    sudo apt-get install -y build-essential git python-pip libfreetype6-dev libxft-dev libncurses-dev libopenblas-dev gfortran python-matplotlib libblas-dev liblapack-dev libatlas-base-dev python-dev python-pydot linux-headers-generic linux-image-extra-virtual unzip python-numpy swig python-pandas python-sklearn unzip wget pkg-config zip g++ zlib1g-dev


    # Blacklist Nouveau, which conflicts with the NVIDIA driver
    echo -e "blacklist nouveau\nblacklist lbm-nouveau\noptions nouveau modeset=0\nalias nouveau off\nalias lbm-nouveau off\n" | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
    echo options nouveau modeset=0 | sudo tee -a /etc/modprobe.d/nouveau-kms.conf
    @@ -43,8 +46,8 @@ cd
    # After filling out an annoying questionnaire, you’ll download a file named cudnn-8.0-linux-x64-v5.0-ga.tgz. You need to transfer it to your EC2 instance: I did this by adding it to my Dropbox folder and using wget to download it on the instance. Once it is in your home directory, run the following:
    # Install cuDNN for CUDA 8.0
    tar -vxzf cudnn-8.0-linux-x64-v5.0-ga.tgz
    sudo cp cuda/libcudnn* /usr/local/cuda/lib64
    sudo cp cuda/cudnn.h /usr/local/cuda/include/
    sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
    sudo cp cuda/include/cudnn.h /usr/local/cuda/include/

    # Next up, we’ll add some environment variables. You may wish to add these to your ~/.bashrc.
    export CUDA_HOME=/usr/local/cuda
    @@ -53,9 +56,12 @@ export PATH=$PATH:$CUDA_ROOT/bin
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CUDA_ROOT/lib64

    # 1. Install JDK 8
    sudo add-apt-repository ppa:webupd8team/java
    sudo add-apt-repository -y ppa:webupd8team/java
    sudo apt-get update
    sudo apt-get install oracle-java8-installer
    # Hack to silently accept the license agreement
    echo debconf shared/accepted-oracle-license-v1-1 select true | sudo debconf-set-selections
    echo debconf shared/accepted-oracle-license-v1-1 seen true | sudo debconf-set-selections
    sudo apt-get install -y oracle-java8-installer
    # Note: You might need to sudo apt-get install software-properties-common if you don't have the add-apt-repository command. See here.

    #sudo apt-get install openjdk-8-jdk. Inst
    @@ -80,6 +86,9 @@ wget https://github.com/bazelbuild/bazel/releases/download/0.4.0/bazel-0.4.0-jdk
    sudo bash bazel-0.4.0-jdk7-installer-linux-x86_64.sh

    # Install Tensorflow
    sudo apt-get install pkg-config zip g++ zlib1g-dev
    git clone --recurse-submodules https://github.com/tensorflow/tensorflow
    git checkout tags/v0.11.0
    TF_UNOFFICIAL_SETTING=1 ./configure

    # Please specify a list of comma-separated Cuda compute capabilities you want to build with.
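The revision above switches the cuDNN copy commands to the `cuda/lib64` and `cuda/include` layout used inside the cudnn-8.0-linux-x64-v5.0-ga.tgz archive. A minimal sketch for confirming the files ended up where the build looks for them; `check_cudnn` is a hypothetical helper (not part of the gist), and `/usr/local/cuda` is assumed to be the CUDA root as in the script.

```shell
# Hypothetical helper: succeed only if the cuDNN header and at least one
# libcudnn* file exist under the given CUDA root (default: /usr/local/cuda).
check_cudnn() (
    cuda_root="${1:-/usr/local/cuda}"
    if [ ! -f "$cuda_root/include/cudnn.h" ]; then
        echo "missing $cuda_root/include/cudnn.h" >&2
        exit 1
    fi
    for lib in "$cuda_root"/lib64/libcudnn*; do
        # If the glob matched nothing, $lib is the literal pattern and -e fails.
        if [ -e "$lib" ]; then
            exit 0
        fi
    done
    echo "no libcudnn* under $cuda_root/lib64" >&2
    exit 1
)

# Usage, after the two "sudo cp" commands:
#   check_cudnn /usr/local/cuda && echo "cuDNN files are in place"
```

The function body runs in a subshell, so its `exit` calls only end the check, not the calling shell.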
  2. mfcabrera revised this gist Nov 20, 2016. 1 changed file with 1 addition and 3 deletions.
    4 changes: 1 addition & 3 deletions install-tensorflow.sh
    @@ -1,8 +1,6 @@
    # Note – this is not a bash script (some of the steps require reboot)
    # I named it .sh just so Github does correct syntax highlighting.
    #
    # This is also available as an AMI in us-east-1 (virginia): ami-cf5028a5
    #
    # This installs Tensorflow 0.11, Cuda 8.0 and cudnn-8.0
    # The CUDA part is mostly based on this excellent blog post:
    # http://tleyden.github.io/blog/2014/10/25/cuda-6-dot-5-on-aws-gpu-instance-running-ubuntu-14-dot-04/
    # I extended Erik's gist using additional instructions from http://ramhiser.com/2016/01/05/installing-tensorflow-on-an-aws-ec2-instance-with-gpu-support/
  3. mfcabrera revised this gist Nov 20, 2016. 1 changed file with 11 additions and 9 deletions.
    20 changes: 11 additions & 9 deletions install-tensorflow.sh
    @@ -103,18 +103,20 @@ cd tensorflow
    # Instead, you need to run ./configure like below (not tested yet)
    TF_UNOFFICIAL_SETTING=1 ./configure
    bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer

    # Build Python package
    # Note: you have to specify --config=cuda here - this is not mentioned in the official docs
    # https://github.com/tensorflow/tensorflow/issues/25#issuecomment-156173717
    bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
    bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
    sudo pip install /tmp/tensorflow_pkg/tensorflow-0.5.0-cp27-none-linux_x86_64.whl
    sudo pip install --upgrade /tmp/tensorflow_pkg/tensorflow-0.11.0-cp27-cp27mu-linux_x86_64.whl

    # Test in a Python shell
    # import tensorflow as tf
    # tf_session = tf.Session()
    # x = tf.constant(1)
    # y = tf.constant(1)
    # tf_session.run(x + y)

    # Test it!
    # Test it with CIFAR
    cd tensorflow/models/image/cifar10/
    python cifar10_multi_gpu_train.py

    # On a g2.2xlarge: step 100, loss = 4.50 (325.2 examples/sec; 0.394 sec/batch)
    # On a g2.8xlarge: step 100, loss = 4.49 (337.9 examples/sec; 0.379 sec/batch)
    # doesn't seem like it is able to use the 4 GPU cards unfortunately :(
    # You can also check that TensorFlow is working by training a CNN on the MNIST data set.
    python ~/tensorflow/tensorflow/models/image/mnist/convolutional.py
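The wheel file name changed between revisions (tensorflow-0.5.0-cp27-none-... became tensorflow-0.11.0-cp27-cp27mu-...), so a hardcoded name breaks whenever the version or Python ABI tag changes. A minimal sketch that installs whatever wheel `build_pip_package` produced; `newest_wheel` is a hypothetical helper, and it assumes file names without newlines.

```shell
# Hypothetical helper: print the most recently modified tensorflow-*.whl in
# directory $1 (default: /tmp/tensorflow_pkg); fail if there is none.
newest_wheel() (
    dir="${1:-/tmp/tensorflow_pkg}"
    # ls -t sorts by modification time, newest first; errors are silenced
    # so an empty directory simply yields no output.
    wheel=$(ls -t "$dir"/tensorflow-*.whl 2>/dev/null | head -n 1)
    [ -n "$wheel" ] || exit 1
    printf '%s\n' "$wheel"
)

# Usage (assumes the wheel was already built):
#   sudo pip install --upgrade "$(newest_wheel /tmp/tensorflow_pkg)"
```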
  4. mfcabrera revised this gist Nov 20, 2016. 1 changed file with 54 additions and 22 deletions.
    76 changes: 54 additions & 22 deletions install-tensorflow.sh
    @@ -5,6 +5,7 @@
    #
    # The CUDA part is mostly based on this excellent blog post:
    # http://tleyden.github.io/blog/2014/10/25/cuda-6-dot-5-on-aws-gpu-instance-running-ubuntu-14-dot-04/
    # I extended Erik's gist using additional instructions from http://ramhiser.com/2016/01/05/installing-tensorflow-on-an-aws-ec2-instance-with-gpu-support/

    # Install various packages
    sudo apt-get update
    @@ -18,28 +19,53 @@ sudo update-initramfs -u
    sudo reboot # Reboot (annoying you have to do this in 2015!)

    # Some other annoying thing we have to do
    sudo apt-get install -y linux-image-extra-virtual
    sudo reboot # Not sure why this is needed
    # sudo apt-get install -y linux-image-extra-virtual
    #sudo reboot # Not sure why this is needed

    # Install latest Linux headers
    sudo apt-get install -y linux-source linux-headers-`uname -r`

    # Install CUDA 7.0 (note – don't use any other version)
    wget http://developer.download.nvidia.com/compute/cuda/7_0/Prod/local_installers/cuda_7.0.28_linux.run
    chmod +x cuda_7.0.28_linux.run
    ./cuda_7.0.28_linux.run -extract=`pwd`/nvidia_installers
    cd nvidia_installers
    sudo ./NVIDIA-Linux-x86_64-346.46.run
    sudo modprobe nvidia
    sudo ./cuda-linux64-rel-7.0.28-19326674.run
    # Install CUDA 8.0 (note – don't use any other version)
    mkdir packages
    cd packages
    wget https://developer.nvidia.com/compute/cuda/8.0/prod/local_installers/cuda-repo-ubuntu1604-8-0-local_8.0.44-1_amd64-deb
    sudo dpkg -i cuda-repo-ubuntu1604-8-0-local_8.0.44-1_amd64-deb
    rm cuda-repo-ubuntu1604-8-0-local_8.0.44-1_amd64-deb
    sudo apt-get update
    sudo apt-get install -y cuda

    # chmod +x cuda_7.0.28_linux.run
    # ./cuda_7.0.28_linux.run -extract=`pwd`/nvidia_installers
    # cd nvidia_installers
    # sudo ./NVIDIA-Linux-x86_64-346.46.run
    # sudo modprobe nvidia
    # sudo ./cuda-linux64-rel-7.0.28-19326674.run
    cd

    # Install CUDNN 6.5 (note – don't use any other version)
    # YOU NEED TO SCP THIS ONE FROM SOMEWHERE ELSE – it's not available online.
    # You need to register and get approved to get a download link. Very annoying.
    tar -xzf cudnn-6.5-linux-x64-v2.tgz
    sudo cp cudnn-6.5-linux-x64-v2/libcudnn* /usr/local/cuda/lib64
    sudo cp cudnn-6.5-linux-x64-v2/cudnn.h /usr/local/cuda/include/
    # After filling out an annoying questionnaire, you’ll download a file named cudnn-8.0-linux-x64-v5.0-ga.tgz. You need to transfer it to your EC2 instance: I did this by adding it to my Dropbox folder and using wget to download it on the instance. Once it is in your home directory, run the following:
    # Install cuDNN for CUDA 8.0
    tar -vxzf cudnn-8.0-linux-x64-v5.0-ga.tgz
    sudo cp cuda/libcudnn* /usr/local/cuda/lib64
    sudo cp cuda/cudnn.h /usr/local/cuda/include/

    # Next up, we’ll add some environment variables. You may wish to add these to your ~/.bashrc.
    export CUDA_HOME=/usr/local/cuda
    export CUDA_ROOT=/usr/local/cuda
    export PATH=$PATH:$CUDA_ROOT/bin
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CUDA_ROOT/lib64

    # 1. Install JDK 8
    sudo add-apt-repository ppa:webupd8team/java
    sudo apt-get update
    sudo apt-get install oracle-java8-installer
    # Note: You might need to sudo apt-get install software-properties-common if you don't have the add-apt-repository command. See here.

    # sudo apt-get install openjdk-8-jdk
    # Install all other required packages

    sudo apt-get install pkg-config zip g++ zlib1g-dev unzip



    # At this point the root mount is getting a bit full
    # I had a lot of issues where the disk would fill up and then Bazel would end up in this weird state complaining about random things
    @@ -51,12 +77,18 @@ sudo ln -s /mnt/tmp /tmp
    # Note that /mnt is not saved when building an AMI, so don't put anything crucial on it

    # Install Bazel
    cd /mnt/tmp
    git clone https://github.com/bazelbuild/bazel.git
    cd bazel
    git checkout tags/0.1.0
    ./compile.sh
    sudo cp output/bazel /usr/bin
    cd /tmp
    wget https://github.com/bazelbuild/bazel/releases/download/0.4.0/bazel-0.4.0-jdk7-installer-linux-x86_64.sh
    sudo bash bazel-0.4.0-jdk7-installer-linux-x86_64.sh

    # Install Tensorflow
    TF_UNOFFICIAL_SETTING=1 ./configure

    # Please specify a list of comma-separated Cuda compute capabilities you want to build with.
    # You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
    # Please note that each additional compute capability significantly increases your build time and binary size.
    # [Default is: "3.5,5.2"]: 3.0


    # Install TensorFlow
    cd /mnt/tmp
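The environment exports in this revision append to `PATH` and `LD_LIBRARY_PATH` unconditionally, so placing them in `~/.bashrc` adds a duplicate entry for every nested shell, and an empty `LD_LIBRARY_PATH` ends up with a leading colon. A minimal sketch of an idempotent variant; `append_once` is a hypothetical helper, and the CUDA location is the one the script already assumes.

```shell
# Hypothetical helper: print colon-separated list $1 with directory $2
# appended, unless $2 is already an element of the list.
append_once() {
    case ":$1:" in
        *":$2:"*) printf '%s\n' "$1" ;;
        *)
            if [ -n "$1" ]; then
                printf '%s\n' "$1:$2"
            else
                printf '%s\n' "$2"
            fi
            ;;
    esac
}

CUDA_ROOT=/usr/local/cuda
CUDA_HOME="$CUDA_ROOT"
export CUDA_ROOT CUDA_HOME
PATH=$(append_once "$PATH" "$CUDA_ROOT/bin")
export PATH
LD_LIBRARY_PATH=$(append_once "${LD_LIBRARY_PATH:-}" "$CUDA_ROOT/lib64")
export LD_LIBRARY_PATH
```

Sourcing this block repeatedly leaves both variables unchanged after the first time.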
  5. @erikbern erikbern revised this gist Nov 13, 2015. 1 changed file with 3 additions and 1 deletion.
    4 changes: 3 additions & 1 deletion install-tensorflow.sh
    @@ -1,6 +1,8 @@
    # Note – this is not a bash script (some of the steps require reboot)
    # I named it .sh just so Github does correct syntax highlighting.

    #
    # This is also available as an AMI in us-east-1 (virginia): ami-cf5028a5
    #
    # The CUDA part is mostly based on this excellent blog post:
    # http://tleyden.github.io/blog/2014/10/25/cuda-6-dot-5-on-aws-gpu-instance-running-ubuntu-14-dot-04/

  6. @erikbern erikbern revised this gist Nov 12, 2015. 1 changed file with 5 additions and 3 deletions.
    8 changes: 5 additions & 3 deletions install-tensorflow.sh
    @@ -63,9 +63,11 @@ export CUDA_HOME=/usr/local/cuda
    git clone --recurse-submodules https://github.com/tensorflow/tensorflow
    cd tensorflow
    # Patch to support older K520 devices on AWS
    wget "https://gist.github.com/infojunkie/cb6d1a4e8bf674c6e38e/raw/5e01e5b2b1f7afd3def83810f8373fbcf6e47e02/cuda_30.patch"
    git apply cuda_30.patch
    ./configure
    # wget "https://gist.github.com/infojunkie/cb6d1a4e8bf674c6e38e/raw/5e01e5b2b1f7afd3def83810f8373fbcf6e47e02/cuda_30.patch"
    # git apply cuda_30.patch
    # According to https://github.com/tensorflow/tensorflow/issues/25#issuecomment-156234658 this patch is no longer needed
    # Instead, you need to run ./configure like below (not tested yet)
    TF_UNOFFICIAL_SETTING=1 ./configure
    bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer

    # Build Python package
  7. @erikbern erikbern revised this gist Nov 12, 2015. 1 changed file with 3 additions and 0 deletions.
    3 changes: 3 additions & 0 deletions install-tensorflow.sh
    @@ -1,3 +1,6 @@
    # Note – this is not a bash script (some of the steps require reboot)
    # I named it .sh just so Github does correct syntax highlighting.

    # The CUDA part is mostly based on this excellent blog post:
    # http://tleyden.github.io/blog/2014/10/25/cuda-6-dot-5-on-aws-gpu-instance-running-ubuntu-14-dot-04/

  8. @erikbern erikbern revised this gist Nov 12, 2015. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions install-tensorflow.sh
    @@ -66,7 +66,7 @@ git apply cuda_30.patch
    bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer

    # Build Python package
    # Note: you ahve to specify --config=cuda here - this is not mentioned in the official docs
    # Note: you have to specify --config=cuda here - this is not mentioned in the official docs
    # https://github.com/tensorflow/tensorflow/issues/25#issuecomment-156173717
    bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
    bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
    @@ -77,5 +77,5 @@ cd tensorflow/models/image/cifar10/
    python cifar10_multi_gpu_train.py

    # On a g2.2xlarge: step 100, loss = 4.50 (325.2 examples/sec; 0.394 sec/batch)
    # On a g2.2xlarge: step 100, loss = 4.49 (337.9 examples/sec; 0.379 sec/batch)
    # On a g2.8xlarge: step 100, loss = 4.49 (337.9 examples/sec; 0.379 sec/batch)
    # doesn't seem like it is able to use the 4 GPU cards unfortunately :(
  9. @erikbern erikbern revised this gist Nov 12, 2015. 1 changed file with 5 additions and 3 deletions.
    8 changes: 5 additions & 3 deletions install-tensorflow.sh
    @@ -1,4 +1,4 @@
    # This is mostly based on this excellent blog post:
    # The CUDA part is mostly based on this excellent blog post:
    # http://tleyden.github.io/blog/2014/10/25/cuda-6-dot-5-on-aws-gpu-instance-running-ubuntu-14-dot-04/

    # Install various packages
    @@ -19,7 +19,7 @@ sudo reboot # Not sure why this is needed
    # Install latest Linux headers
    sudo apt-get install -y linux-source linux-headers-`uname -r`

    # Install CUDA 7.0
    # Install CUDA 7.0 (note – don't use any other version)
    wget http://developer.download.nvidia.com/compute/cuda/7_0/Prod/local_installers/cuda_7.0.28_linux.run
    chmod +x cuda_7.0.28_linux.run
    ./cuda_7.0.28_linux.run -extract=`pwd`/nvidia_installers
    @@ -29,7 +29,9 @@ sudo modprobe nvidia
    sudo ./cuda-linux64-rel-7.0.28-19326674.run
    cd

    # Install cudnn (YOU NEED TO SCP THIS ONE FROM SOMEWHERE ELSE)
    # Install CUDNN 6.5 (note – don't use any other version)
    # YOU NEED TO SCP THIS ONE FROM SOMEWHERE ELSE – it's not available online.
    # You need to register and get approved to get a download link. Very annoying.
    tar -xzf cudnn-6.5-linux-x64-v2.tgz
    sudo cp cudnn-6.5-linux-x64-v2/libcudnn* /usr/local/cuda/lib64
    sudo cp cudnn-6.5-linux-x64-v2/cudnn.h /usr/local/cuda/include/
  10. @erikbern erikbern revised this gist Nov 12, 2015. 1 changed file with 3 additions and 1 deletion.
    4 changes: 3 additions & 1 deletion install-tensorflow.sh
    @@ -74,4 +74,6 @@ sudo pip install /tmp/tensorflow_pkg/tensorflow-0.5.0-cp27-none-linux_x86_64.whl
    cd tensorflow/models/image/cifar10/
    python cifar10_multi_gpu_train.py

    # On a g2.2xlarge: step 100, loss = 4.50 (325.2 examples/sec; 0.394 sec/batch)
    # On a g2.2xlarge: step 100, loss = 4.50 (325.2 examples/sec; 0.394 sec/batch)
    # On a g2.2xlarge: step 100, loss = 4.49 (337.9 examples/sec; 0.379 sec/batch)
    # doesn't seem like it is able to use the 4 GPU cards unfortunately :(
  11. @erikbern erikbern revised this gist Nov 12, 2015. 1 changed file with 4 additions and 0 deletions.
    4 changes: 4 additions & 0 deletions install-tensorflow.sh
    @@ -35,6 +35,8 @@ sudo cp cudnn-6.5-linux-x64-v2/libcudnn* /usr/local/cuda/lib64
    sudo cp cudnn-6.5-linux-x64-v2/cudnn.h /usr/local/cuda/include/

    # At this point the root mount is getting a bit full
    # I had a lot of issues where the disk would fill up and then Bazel would end up in this weird state complaining about random things
    # Make sure you don't run out of disk space when building Tensorflow!
    sudo mkdir /mnt/tmp
    sudo chmod 777 /mnt/tmp
    sudo rm -rf /tmp
    @@ -71,3 +73,5 @@ sudo pip install /tmp/tensorflow_pkg/tensorflow-0.5.0-cp27-none-linux_x86_64.whl
    # Test it!
    cd tensorflow/models/image/cifar10/
    python cifar10_multi_gpu_train.py

    # On a g2.2xlarge: step 100, loss = 4.50 (325.2 examples/sec; 0.394 sec/batch)
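The comments added in this revision warn that Bazel fails in confusing ways once the disk fills up. A minimal sketch of a pre-build check; `free_kb` is a hypothetical helper, and the 10 GiB threshold is an assumption for illustration, not a measured requirement of the TensorFlow build.

```shell
# Hypothetical helper: print the free space, in KiB, of the filesystem
# holding directory $1 (default: /tmp). Relies on POSIX "df -P -k" output,
# where the fourth column of the second line is the available space.
free_kb() {
    df -P -k "${1:-/tmp}" | awk 'NR == 2 { print $4 }'
}

# Assumed threshold: 10 GiB expressed in KiB.
required_kb=$((10 * 1024 * 1024))
if [ "$(free_kb /tmp)" -lt "$required_kb" ]; then
    echo "warning: less than 10 GiB free under /tmp" >&2
fi
```

Because `/tmp` is symlinked to `/mnt/tmp` in the script, `df` reports the space of the filesystem mounted at `/mnt`.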
  12. @erikbern erikbern revised this gist Nov 12, 2015. 1 changed file with 3 additions and 5 deletions.
    8 changes: 3 additions & 5 deletions install-tensorflow.sh
    @@ -31,7 +31,7 @@ cd

    # Install cudnn (YOU NEED TO SCP THIS ONE FROM SOMEWHERE ELSE)
    tar -xzf cudnn-6.5-linux-x64-v2.tgz
    sudo cp cudnn-6.5-linux-x64-v2/libcudnn* /usr/local/cuda/
    sudo cp cudnn-6.5-linux-x64-v2/libcudnn* /usr/local/cuda/lib64
    sudo cp cudnn-6.5-linux-x64-v2/cudnn.h /usr/local/cuda/include/

    # At this point the root mount is getting a bit full
    @@ -62,14 +62,12 @@ git apply cuda_30.patch
    bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer

    # Build Python package
    # TODO: I think you have to specify --config=cuda here - that's why it wasn't working last time I tried
    # Note: you ahve to specify --config=cuda here - this is not mentioned in the official docs
    # https://github.com/tensorflow/tensorflow/issues/25#issuecomment-156173717
    bazel build -c opt //tensorflow/tools/pip_package:build_pip_package
    bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
    bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
    sudo pip install /tmp/tensorflow_pkg/tensorflow-0.5.0-cp27-none-linux_x86_64.whl

    # Test it!
    cd tensorflow/models/image/cifar10/
    python cifar10_multi_gpu_train.py

    # Hmm... this runs, but doesn't use the GPU's. Not sure why :()
  13. @erikbern erikbern revised this gist Nov 12, 2015. 1 changed file with 1 addition and 0 deletions.
    1 change: 1 addition & 0 deletions install-tensorflow.sh
    @@ -39,6 +39,7 @@ sudo mkdir /mnt/tmp
    sudo chmod 777 /mnt/tmp
    sudo rm -rf /tmp
    sudo ln -s /mnt/tmp /tmp
    # Note that /mnt is not saved when building an AMI, so don't put anything crucial on it

    # Install Bazel
    cd /mnt/tmp
  14. @erikbern erikbern revised this gist Nov 12, 2015. 1 changed file with 2 additions and 0 deletions.
    2 changes: 2 additions & 0 deletions install-tensorflow.sh
    @@ -61,6 +61,8 @@ git apply cuda_30.patch
    bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer

    # Build Python package
    # TODO: I think you have to specify --config=cuda here - that's why it wasn't working last time I tried
    # https://github.com/tensorflow/tensorflow/issues/25#issuecomment-156173717
    bazel build -c opt //tensorflow/tools/pip_package:build_pip_package
    bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
    sudo pip install /tmp/tensorflow_pkg/tensorflow-0.5.0-cp27-none-linux_x86_64.whl
  15. @erikbern erikbern revised this gist Nov 11, 2015. 1 changed file with 30 additions and 19 deletions.
    49 changes: 30 additions & 19 deletions install-tensorflow.sh
    @@ -4,7 +4,7 @@
    # Install various packages
    sudo apt-get update
    sudo apt-get upgrade -y # choose “install package maintainers version”
    sudo apt-get install -y build-essential python-pip python-dev git python-numpy swig python-dev default-java-sdk zip zlib1g-dev
    sudo apt-get install -y build-essential python-pip python-dev git python-numpy swig python-dev default-jdk zip zlib1g-dev

    # Blacklist Nouveau, which conflicts with the NVIDIA driver
    echo -e "blacklist nouveau\nblacklist lbm-nouveau\noptions nouveau modeset=0\nalias nouveau off\nalias lbm-nouveau off\n" | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
    @@ -29,33 +29,44 @@ sudo modprobe nvidia
    sudo ./cuda-linux64-rel-7.0.28-19326674.run
    cd

    # Install tensorflow
    sudo pip install https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.5.0-cp27-none-linux_x86_64.whl
    export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64"
    export CUDA_HOME=/usr/local/cuda
    # Install cudnn (YOU NEED TO SCP THIS ONE FROM SOMEWHERE ELSE)
    tar -xzf cudnn-6.5-linux-x64-v2.tgz
    sudo cp cudnn-6.5-linux-x64-v2/libcudnn* /usr/local/cuda/
    sudo cp cudnn-6.5-linux-x64-v2/cudnn.h /usr/local/cuda/include/

    # At this point the root mount is getting a bit full
    sudo mkdir /mnt/tmp
    sudo chmod 777 /mnt/tmp
    sudo rm -rf /tmp
    sudo ln -s /mnt/tmp /tmp

    # Install Bazel
    cd /mnt/tmp
    git clone https://github.com/bazelbuild/bazel.git
    cd bazel
    git checkout tags/0.1.0
    ./compile.sh
    sudo cp output/bazel /usr/bin
    cd


    # Install TensorFlow
    git clone https://github.com/tensorflow/tensorflow
    cd tensorflow/tensorflow/models/image/mnist
    python convolutional.py
    cd /mnt/tmp
    export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64"
    export CUDA_HOME=/usr/local/cuda
    git clone --recurse-submodules https://github.com/tensorflow/tensorflow
    cd tensorflow
    # Patch to support older K520 devices on AWS
    wget "https://gist.github.com/infojunkie/cb6d1a4e8bf674c6e38e/raw/5e01e5b2b1f7afd3def83810f8373fbcf6e47e02/cuda_30.patch"
    git apply cuda_30.patch
    ./configure
    bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer

    # At this point, it breaks down. It works, but doesn't use the GPU. On g2.2xlarge:
    # I tensorflow/core/common_runtime/gpu/gpu_device.cc:611] Ignoring gpu device (device: 0, name: GRID K520, pci bus id: 0000:00:03.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
    # Build Python package
    bazel build -c opt //tensorflow/tools/pip_package:build_pip_package
    bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
    sudo pip install /tmp/tensorflow_pkg/tensorflow-0.5.0-cp27-none-linux_x86_64.whl

    # On g2.8xlarge:
    # I tensorflow/core/common_runtime/gpu/gpu_device.cc:611] Ignoring gpu device (device: 0, name: GRID K520, pci bus id: 0000:00:03.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
    # I tensorflow/core/common_runtime/gpu/gpu_device.cc:611] Ignoring gpu device (device: 1, name: GRID K520, pci bus id: 0000:00:04.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
    # I tensorflow/core/common_runtime/gpu/gpu_device.cc:611] Ignoring gpu device (device: 2, name: GRID K520, pci bus id: 0000:00:05.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
    # I tensorflow/core/common_runtime/gpu/gpu_device.cc:611] Ignoring gpu device (device: 3, name: GRID K520, pci bus id: 0000:00:06.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
    # Test it!
    cd tensorflow/models/image/cifar10/
    python cifar10_multi_gpu_train.py

    # TODO: Seems like there's a discussion on GitHub about CUDA 3.0 support
    # https://github.com/tensorflow/tensorflow/issues/25
    # Hmm... this runs, but doesn't use the GPU's. Not sure why :()
  16. @erikbern erikbern revised this gist Nov 10, 2015. 1 changed file with 11 additions and 2 deletions.
    13 changes: 11 additions & 2 deletions install-tensorflow.sh
    @@ -4,7 +4,7 @@
    # Install various packages
    sudo apt-get update
    sudo apt-get upgrade -y # choose “install package maintainers version”
    sudo apt-get install -y build-essential python-pip python-dev git # same
    sudo apt-get install -y build-essential python-pip python-dev git python-numpy swig python-dev default-java-sdk zip zlib1g-dev

    # Blacklist Nouveau, which conflicts with the NVIDIA driver
    echo -e "blacklist nouveau\nblacklist lbm-nouveau\noptions nouveau modeset=0\nalias nouveau off\nalias lbm-nouveau off\n" | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
    @@ -34,7 +34,16 @@ sudo pip install https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-
    export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64"
    export CUDA_HOME=/usr/local/cuda

    # Clone source as well (for examples)
    # Install Bazel
    git clone https://github.com/bazelbuild/bazel.git
    cd bazel
    git checkout tags/0.1.0
    ./compile.sh
    sudo cp output/bazel /usr/bin
    cd


    # Install TensorFlow
    git clone https://github.com/tensorflow/tensorflow
    cd tensorflow/tensorflow/models/image/mnist
    python convolutional.py
  17. @erikbern erikbern revised this gist Nov 10, 2015. 1 changed file with 3 additions and 0 deletions.
    3 changes: 3 additions & 0 deletions install-tensorflow.sh
    @@ -1,3 +1,6 @@
    # This is mostly based on this excellent blog post:
    # http://tleyden.github.io/blog/2014/10/25/cuda-6-dot-5-on-aws-gpu-instance-running-ubuntu-14-dot-04/

    # Install various packages
    sudo apt-get update
    sudo apt-get upgrade -y # choose “install package maintainers version”
  18. @erikbern erikbern revised this gist Nov 10, 2015. 1 changed file with 4 additions and 1 deletion.
    5 changes: 4 additions & 1 deletion install-tensorflow.sh
    @@ -43,4 +43,7 @@ python convolutional.py
    # I tensorflow/core/common_runtime/gpu/gpu_device.cc:611] Ignoring gpu device (device: 0, name: GRID K520, pci bus id: 0000:00:03.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
    # I tensorflow/core/common_runtime/gpu/gpu_device.cc:611] Ignoring gpu device (device: 1, name: GRID K520, pci bus id: 0000:00:04.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
    # I tensorflow/core/common_runtime/gpu/gpu_device.cc:611] Ignoring gpu device (device: 2, name: GRID K520, pci bus id: 0000:00:05.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
    # I tensorflow/core/common_runtime/gpu/gpu_device.cc:611] Ignoring gpu device (device: 3, name: GRID K520, pci bus id: 0000:00:06.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
    # I tensorflow/core/common_runtime/gpu/gpu_device.cc:611] Ignoring gpu device (device: 3, name: GRID K520, pci bus id: 0000:00:06.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.

    # TODO: Seems like there's a discussion on GitHub about CUDA 3.0 support
    # https://github.com/tensorflow/tensorflow/issues/25
  19. @erikbern erikbern revised this gist Nov 10, 2015. No changes.
  20. @erikbern erikbern revised this gist Nov 10, 2015. 1 changed file with 8 additions and 0 deletions.
    8 changes: 8 additions & 0 deletions install-tensorflow.sh
    @@ -35,4 +35,12 @@ export CUDA_HOME=/usr/local/cuda
    git clone https://github.com/tensorflow/tensorflow
    cd tensorflow/tensorflow/models/image/mnist
    python convolutional.py

    # At this point, it breaks down. It works, but doesn't use the GPU. On g2.2xlarge:
    # I tensorflow/core/common_runtime/gpu/gpu_device.cc:611] Ignoring gpu device (device: 0, name: GRID K520, pci bus id: 0000:00:03.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.

    # On g2.8xlarge:
    # I tensorflow/core/common_runtime/gpu/gpu_device.cc:611] Ignoring gpu device (device: 0, name: GRID K520, pci bus id: 0000:00:03.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
    # I tensorflow/core/common_runtime/gpu/gpu_device.cc:611] Ignoring gpu device (device: 1, name: GRID K520, pci bus id: 0000:00:04.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
    # I tensorflow/core/common_runtime/gpu/gpu_device.cc:611] Ignoring gpu device (device: 2, name: GRID K520, pci bus id: 0000:00:05.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
    # I tensorflow/core/common_runtime/gpu/gpu_device.cc:611] Ignoring gpu device (device: 3, name: GRID K520, pci bus id: 0000:00:06.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
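The log lines recorded in this revision show TensorFlow skipping every GRID K520 because its compute capability (3.0) is below the required 3.5. A minimal sketch for spotting this automatically; `count_ignored_gpus` is a hypothetical helper that only matches the message text quoted above, and the usage line assumes TensorFlow writes these messages to stderr.

```shell
# Hypothetical helper: read TensorFlow log text on stdin and print how many
# "Ignoring gpu device" lines it contains.
count_ignored_gpus() {
    # grep -c prints 0 but exits non-zero when nothing matches; "|| true"
    # keeps the helper's exit status at 0 in that case.
    grep -c 'Ignoring gpu device' || true
}

# Usage:
#   python convolutional.py 2>&1 | count_ignored_gpus
```

A non-zero count means the build is running on the CPU for at least one device.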
  21. @erikbern erikbern revised this gist Nov 10, 2015. 1 changed file with 3 additions and 1 deletion.
    4 changes: 3 additions & 1 deletion install-tensorflow.sh
    @@ -28,9 +28,11 @@ cd

    # Install tensorflow
    sudo pip install https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.5.0-cp27-none-linux_x86_64.whl
    export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64"
    export CUDA_HOME=/usr/local/cuda

    # Clone source as well (for examples)
    git clone https://github.com/tensorflow/tensorflow
    cd tensorflow/tensorflow/model/image/mnist
    cd tensorflow/tensorflow/models/image/mnist
    python convolutional.py
    # I tensorflow/core/common_runtime/gpu/gpu_device.cc:611] Ignoring gpu device (device: 0, name: GRID K520, pci bus id: 0000:00:03.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
  22. @erikbern erikbern revised this gist Nov 10, 2015. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion install-tensorflow.sh
    @@ -23,7 +23,7 @@ chmod +x cuda_7.0.28_linux.run
    cd nvidia_installers
    sudo ./NVIDIA-Linux-x86_64-346.46.run
    sudo modprobe nvidia
    sudo ./cuda-linux64-rel-7.5.18-19867135.run
    sudo ./cuda-linux64-rel-7.0.28-19326674.run
    cd

    # Install tensorflow
  23. @erikbern erikbern revised this gist Nov 10, 2015. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion install-tensorflow.sh
    @@ -17,7 +17,7 @@ sudo reboot # Not sure why this is needed
    sudo apt-get install -y linux-source linux-headers-`uname -r`

    # Install CUDA 7.0
    wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1404/x86_64/cuda-repo-ubuntu1404_7.0-28_amd64.deb
    wget http://developer.download.nvidia.com/compute/cuda/7_0/Prod/local_installers/cuda_7.0.28_linux.run
    chmod +x cuda_7.0.28_linux.run
    ./cuda_7.0.28_linux.run -extract=`pwd`/nvidia_installers
    cd nvidia_installers
  24. @erikbern erikbern revised this gist Nov 10, 2015. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions install-tensorflow.sh
    @@ -10,11 +10,11 @@ sudo update-initramfs -u
    sudo reboot # Reboot (annoying you have to do this in 2015!)

    # Some other annoying thing we have to do
    sudo apt-get install linux-image-extra-virtual
    sudo apt-get install -y linux-image-extra-virtual
    sudo reboot # Not sure why this is needed

    # Install latest Linux headers
    sudo apt-get install linux-source linux-headers-`uname -r`
    sudo apt-get install -y linux-source linux-headers-`uname -r`

    # Install CUDA 7.0
    wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1404/x86_64/cuda-repo-ubuntu1404_7.0-28_amd64.deb
  25. @erikbern erikbern revised this gist Nov 10, 2015. No changes.
  26. @erikbern erikbern revised this gist Nov 10, 2015. 1 changed file with 5 additions and 2 deletions.
    7 changes: 5 additions & 2 deletions install-tensorflow.sh
    @@ -1,15 +1,18 @@
    # Install various packages
    sudo apt-get update
    sudo apt-get upgrade -y # choose “install package maintainers version”
    sudo apt-get install -y build-essential python-pip python-dev git linux-image-extra-virtual # same
    sudo reboot # Not sure why this is needed
    sudo apt-get install -y build-essential python-pip python-dev git # same

    # Blacklist Nouveau, which conflicts with the NVIDIA driver
    echo -e "blacklist nouveau\nblacklist lbm-nouveau\noptions nouveau modeset=0\nalias nouveau off\nalias lbm-nouveau off\n" | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
    echo options nouveau modeset=0 | sudo tee -a /etc/modprobe.d/nouveau-kms.conf
    sudo update-initramfs -u
    sudo reboot # Reboot (annoying you have to do this in 2015!)

    # Some other annoying thing we have to do
    sudo apt-get install linux-image-extra-virtual
    sudo reboot # Not sure why this is needed

    # Install latest Linux headers
    sudo apt-get install linux-source linux-headers-`uname -r`

  27. @erikbern erikbern revised this gist Nov 10, 2015. 1 changed file with 3 additions and 8 deletions.
    11 changes: 3 additions & 8 deletions install-tensorflow.sh
    @@ -1,19 +1,14 @@
    # Install various packages
    sudo apt-get update
    sudo apt-get upgrade -y # choose “install package maintainers version”
    sudo apt-get install -y build-essential python-pip python-dev git

    # Not sure why this is needed
    sudo apt-get install linux-image-extra-virtual
    sudo reboot # especially this??
    sudo apt-get install -y build-essential python-pip python-dev git linux-image-extra-virtual # same
    sudo reboot # Not sure why this is needed

    # Blacklist Nouveau, which conflicts with the NVIDIA driver
    echo -e "blacklist nouveau\nblacklist lbm-nouveau\noptions nouveau modeset=0\nalias nouveau off\nalias lbm-nouveau off\n" | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
    echo options nouveau modeset=0 | sudo tee -a /etc/modprobe.d/nouveau-kms.conf

    # Reboot (annoying you have to do this in 2015!)
    sudo update-initramfs -u
    sudo reboot
    sudo reboot # Reboot (annoying you have to do this in 2015!)

    # Install latest Linux headers
    sudo apt-get install linux-source linux-headers-`uname -r`
  28. @erikbern erikbern revised this gist Nov 10, 2015. 1 changed file with 4 additions and 4 deletions.
    8 changes: 4 additions & 4 deletions install-tensorflow.sh
    @@ -3,6 +3,10 @@ sudo apt-get update
    sudo apt-get upgrade -y # choose “install package maintainers version”
    sudo apt-get install -y build-essential python-pip python-dev git

    # Not sure why this is needed
    sudo apt-get install linux-image-extra-virtual
    sudo reboot # especially this??

    # Blacklist Nouveau, which conflicts with the NVIDIA driver
    echo -e "blacklist nouveau\nblacklist lbm-nouveau\noptions nouveau modeset=0\nalias nouveau off\nalias lbm-nouveau off\n" | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
    echo options nouveau modeset=0 | sudo tee -a /etc/modprobe.d/nouveau-kms.conf
    @@ -11,10 +15,6 @@ echo options nouveau modeset=0 | sudo tee -a /etc/modprobe.d/nouveau-kms.conf
    sudo update-initramfs -u
    sudo reboot

    # Not sure why this is needed
    sudo apt-get install linux-image-extra-virtual
    sudo reboot # especially this??

    # Install latest Linux headers
    sudo apt-get install linux-source linux-headers-`uname -r`

  29. @erikbern erikbern revised this gist Nov 10, 2015. 1 changed file with 27 additions and 7 deletions.
    34 changes: 27 additions & 7 deletions install-tensorflow.sh
    @@ -1,18 +1,38 @@
    # Install various packages
    sudo apt-get update
    sudo apt-get upgrade -y # choose “choose package maintainers version”
    sudo apt-get install -y build-essential linux-source linux-headers-`uname -r` python-pip python-dev
    sudo apt-get upgrade -y # choose “install package maintainers version”
    sudo apt-get install -y build-essential python-pip python-dev git

    # Blacklist Nouveau, which conflicts with the NVIDIA driver
    echo -e "blacklist nouveau\nblacklist lbm-nouveau\noptions nouveau modeset=0\nalias nouveau off\nalias lbm-nouveau off\n" | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
    echo options nouveau modeset=0 | sudo tee -a /etc/modprobe.d/nouveau-kms.conf

    # Reboot (annoying you have to do this in 2015!)
    sudo update-initramfs -u
    sudo reboot

    # Not sure why this is needed
    sudo apt-get install linux-image-extra-virtual
    sudo reboot # especially this??

    # Install latest Linux headers
    sudo apt-get install linux-source linux-headers-`uname -r`

    wget http://developer.download.nvidia.com/compute/cuda/7.5/Prod/local_installers/cuda_7.5.18_linux.run
    chmod +x cuda_7.5.18_linux.run
    ./cuda_7.5.18_linux.run -extract=`pwd`/nvidia_installers
    # Install CUDA 7.0
    wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1404/x86_64/cuda-repo-ubuntu1404_7.0-28_amd64.deb
    chmod +x cuda_7.0.28_linux.run
    ./cuda_7.0.28_linux.run -extract=`pwd`/nvidia_installers
    cd nvidia_installers
    sudo ./NVIDIA-Linux-x86_64-352.39.run
    sudo ./NVIDIA-Linux-x86_64-346.46.run
    sudo modprobe nvidia
    sudo ./cuda-linux64-rel-7.5.18-19867135.run
    sudo pip install https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.5.0-cp27-none-linux_x86_64.whl
    cd

    # Install tensorflow
    sudo pip install https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.5.0-cp27-none-linux_x86_64.whl

    # Clone source as well (for examples)
    git clone https://github.com/tensorflow/tensorflow
    cd tensorflow/tensorflow/model/image/mnist
    python convolutional.py
    # I tensorflow/core/common_runtime/gpu/gpu_device.cc:611] Ignoring gpu device (device: 0, name: GRID K520, pci bus id: 0000:00:03.0) with Cuda compute capability 3.0. The minimum required Cuda capability is 3.5.
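The log line above says the GRID K520 is ignored because its compute capability (3.0) is below the required minimum (3.5). As a rough sketch of that comparison, the hypothetical helper `meets_min_capability` below checks a `major.minor` capability against a minimum. The name and the strict `major.minor` input format are assumptions for illustration; this is not how TensorFlow itself performs the check.

```shell
# Hypothetical helper: succeed if capability $1 (e.g. "3.0") is at least
# the minimum $2 (e.g. "3.5"). Both arguments must look like "major.minor".
meets_min_capability() {
    have_major=${1%%.*}; have_minor=${1#*.}
    need_major=${2%%.*}; need_minor=${2#*.}
    [ "$have_major" -gt "$need_major" ] ||
        { [ "$have_major" -eq "$need_major" ] && [ "$have_minor" -ge "$need_minor" ]; }
}

# Values taken from the log line: device capability 3.0, required minimum 3.5.
if meets_min_capability 3.0 3.5; then
    echo "GPU meets the minimum capability"
else
    echo "GPU is below the minimum capability and would be ignored"
fi
```

With the values from the log, the second branch runs, which matches the "Ignoring gpu device" message.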
  30. @erikbern erikbern renamed this gist Nov 10, 2015. 1 changed file with 0 additions and 0 deletions.
    File renamed without changes.