
@Markus92
Last active August 26, 2025 10:51
Revisions

  1. Markus92 revised this gist Apr 21, 2020. 1 changed file with 11 additions and 2 deletions.
    13 changes: 11 additions & 2 deletions serversetup.md
    @@ -242,8 +242,8 @@ TaskPlugin=task/cgroup
    # TIMERS
    #KillWait=30
    #MinJobAge=300
    #SlurmctldTimeout=120
    #SlurmdTimeout=300
    SlurmctldTimeout=600
    SlurmdTimeout=600
    #
    #
    # SCHEDULING
    @@ -372,3 +372,12 @@ user devices /
    This will move all users in the group *gpu* to GPU access, and everyone else to no GPU access. Exactly what we want.

    Now reboot for the final time and you're done!

    ## Post-mortem
    This system has been up and running for around a year now and it works well:
    there have been only two short outages. One was caused by a time-out of the SLURM
    daemon, which for some reason killed all running jobs (new jobs were fine). This
    is now mitigated by setting the time-outs a bit less tight.
    The other one we still can't explain: it was a total hardware lockup, where
    even the physical console didn't respond. A quick physical reboot later and
    everything was up and running again like before!
  2. Markus92 revised this gist Apr 21, 2020. 1 changed file with 8 additions and 6 deletions.
    14 changes: 8 additions & 6 deletions serversetup.md
    @@ -7,13 +7,13 @@ One challenge is, is how to manage these GPUs. There are many approaches, but gi

    A group at a previous affiliation of mine had the same problems and used Docker containers with a job scheduler to mitigate most of these problem. Unfortunately I had never used it myself and was thus not familiar with the exact details of their implementation. This approach solves most of our problems: no conflicting software versions (just roll a container per research paper and archive it), no competing for GPUs and, most importantly, people can't accidentally screw up colleagues' experiments.

    There was one more constraint I had: our system should be as easy to use as possible. When I talked about 'job scheduling' and 'GPU allocation' to my colleagues, the reaction I got was that they were scared it'd be either too complicated to use, or too limited to use. As I really didn't want to go the 'Google Sheets' route for GPU scheduling, I kept this in mind during the design of the system. Another constraint was set by our sysadmin: no root for users, as we got some old NFSv3 fileservers which authenticate on UID/GID level. This immediately excluded Docker, as described [here](https://docs.docker.com/engine/security/security/#docker-daemon-attack-surface). As users would be allowed to control the Docker daemon, it's pretty much the same as putting everyone in the group 'sudoers'. Not something we want, to be honest.
    There was one more constraint I had: our system should be as easy to use as possible. When I talked about 'job scheduling' and 'GPU allocation' to my colleagues, the reaction I got was that they were scared it'd be either too complicated to use, or too limited to use. As I really didn't want to go the 'Google Sheets' route for GPU scheduling, I kept this in mind during the design of the system. Another constraint was set by our sysadmin: no root for users, as we got some legacy NFSv3 fileservers which authenticate on UID/GID level. This immediately excluded Docker, as described [here](https://docs.docker.com/engine/security/security/#docker-daemon-attack-surface). As users would be allowed to control the Docker daemon, it's pretty much the same as putting everyone in the group 'sudoers'. Not something we want, to be honest.

    In the end I decided to use a combination of [Singularity](https://sylabs.io/singularity/) and [SLURM](https://slurm.schedmd.com/). Singularity is a container tool created for and used a lot by HPC facilities. SLURM is an industry-standard job scheduler, also used by many HPC instances. An advantage of these tools is that they are industry-standard, thus used a lot and well-documented. Always helpful when running into problems. To enforce proper usage of these tools, control groups (cgroups) are used to lock access down: by default there are no GPU permissions.

    As most tutorials I found were quite outdated, here's a new one. It's in typical 'follow-along' style, so you can copy/paste the commands onto your own terminal and you'll end up with a similar system! Root access is required, obviously.

    Note: we are running a new install of Ubuntu 18.04 LTS.
    Note: we are running a new, fresh install of Ubuntu 18.04 LTS.

    ## Installing Singularity
    As most debian packages for Singularity are quite outdated, we'll compile it ourselves. It's written in Go, so we'll also install a recent Go version.
    @@ -90,7 +90,9 @@ We're going to make a few changes to the default configuration, mainly to make i
    $sudo nano /usr/local/etc/singularity/singularity.conf
    ```

    First, change `always use nv = no` to `yes`. It doesn't really have any downsides, just saves you from typing --nv every time. Second, we add a few bind paths. Obviously these are user specific, though `/run/user` is useful for everyone running a systemd-based distribution like Ubuntu or Debian. I added these below the standard bind paths, you'll find it easily in the config file.
    First, to bind the NVIDIA binaries into every container, change `always use nv = no` to `yes`. It doesn't really have any downsides, just saves you from typing --nv every time.

    Second, we add a few bind paths. Obviously these are user specific, though `/run/user` is useful for everyone running a systemd-based distribution like Ubuntu or Debian. I added these below the standard bind paths, you'll find it easily in the config file.
    ```sh
    # For temporary files
    bind path = /run/user
    @@ -105,7 +107,7 @@ $ singularity exec docker://nvcr.io/nvidia/pytorch:19.05-py3 jupyter notebook
    ```

    ## SLURM
    Unfortunately the packages in Ubuntu and Debian are a bit too outdated, so we'll compile our own version. First install some dependenices. Note that we'll install the cgroup stuff right away.
    For GPU scheduling, we use SLURM. Unfortunately the packages in Ubuntu and Debian are a bit too outdated, so we'll compile our own version. First install some dependencies. Note that we'll install the cgroup stuff right away.

    ```sh
    sudo apt-get install build-essential ruby-dev libpam0g-dev libmysqlclient-dev munge libmunge-dev libmysqld-dev cgroup-bin libpam-cgroup cgroup-tools
    @@ -309,7 +311,7 @@ sudo groupadd gpu
    sudo usermod -aG gpu mark
    ```

    I'd advise to add every user with root access to this group for administration tasks. Do *not* add any regular users to it, or it'll break the purpose of the scheduling system.
    I'd advise to add every user with root access to this group for administration tasks. Do *not* add any regular users to it, or it'll break the purpose of the scheduling system as they'll have unlimited GPU access, always.

    To load these `cgroups` every time the system boots, we'll run `cgconfigparser` on boot. Let's create a small `systemd` script to do this:

    @@ -369,4 +371,4 @@ user devices /

    This will move all users in the group *gpu* to GPU access, and everyone else to no GPU access. Exactly what we want.

    Now reboot for the final time and you're done!
    Now reboot for the final time and you're done!
  3. Markus92 revised this gist Apr 21, 2020. 1 changed file with 17 additions and 15 deletions.
    32 changes: 17 additions & 15 deletions serversetup.md
    @@ -1,19 +1,19 @@
    # Setting up a GPU server with scheduling and containers


    Our group recently acquired a new server to do some deep learning: a [SuperMicro 4029GP-TRT2](https://www.supermicro.com/products/system/4U/4029/SYS-4029GP-TRT2.cfm), stuffed with 8x NVIDIA RTX 2080 Ti. Though maybe a bit overpowered, with upcoming networks like BigGAN and fully 3D networks, as well as students joining our group, this machine will be used quite a lot in the future.
    Our group recently acquired a new server to do some deep learning: a [SuperMicro 4029GP-TRT2](https://www.supermicro.com/products/system/4U/4029/SYS-4029GP-TRT2.cfm), stuffed with 8x NVidia RTX 2080 Ti. Though maybe a bit overpowered, with upcoming networks like BigGAN and fully 3D networks, as well as students joining our group, this machine will be used quite a lot in the future.

    One challenge is, is how to manage these GPUs. There are many approaches, but given that most PhD candidates aren't sysadmins, these range from 'free-for-all', leading to one person hogging all GPUs for weeks due to a bug in the code, to Excel sheets that noone understands and noone adheres to because changing GPU ids in code is hard. This leads to a lot of frustration, low productivity and under-utilisation of these expensive servers. Another issue is conflicting software versions. TensorFlow and Keras, for example, tend to do breaking API changes every now and then. As these always happen right before a conference deadline, this leads to even more frustration when trying to run a few extra experiments.

    A group at a previous affiliation of mine had the same problems and used Docker containers with a job scheduler to mitigate most of these problem. Unfortunately I had never used it myself and was thus not familiar with the exact details of their implementation. This approach solves most of our problems: no conflicting software versions (just roll a container per research paper and archive it), no competing for GPUs and, most importantly, people can't accidentally screw up colleagues' experiments.

    There was one more constraint I had: our system should be as easy to use as possible. When I talked about 'job scheduling' and 'GPU allocation' to my colleagues, the reaction I got was that they were scared it'd be either too complicated to use, or too limited to use. As I really didn't want to go the 'Google Sheets' route for GPU scheduling, I kept this in mind during the design of the system. Another constraint was set by our sysadmin: no root for users, as we use some old NFSv3 fileservers which authenticate on UID/GID level. This immediately excluded Docker, as described (here)[https://docs.docker.com/engine/security/security/#docker-daemon-attack-surface]. As users would be allowed to control the Docker daemon, it's pretty much the same as putting everyone in the group 'sudoers'. Not something we want, to be honest.
    There was one more constraint I had: our system should be as easy to use as possible. When I talked about 'job scheduling' and 'GPU allocation' to my colleagues, the reaction I got was that they were scared it'd be either too complicated to use, or too limited to use. As I really didn't want to go the 'Google Sheets' route for GPU scheduling, I kept this in mind during the design of the system. Another constraint was set by our sysadmin: no root for users, as we got some old NFSv3 fileservers which authenticate on UID/GID level. This immediately excluded Docker, as described [here](https://docs.docker.com/engine/security/security/#docker-daemon-attack-surface). As users would be allowed to control the Docker daemon, it's pretty much the same as putting everyone in the group 'sudoers'. Not something we want, to be honest.

    In the end I decided to use a combination of [Singularity](https://sylabs.io/singularity/) and [SLURM](https://slurm.schedmd.com/). Singularity is a container tool and used a lot by HPC facilities. SLURM is an industry-standard job scheduler, also used by many HPC instances. An advantage of these tools is that they are industry-standard, thus used a lot and well-documented. Always helpful when running into problems. To enforce proper usage of these tools, control groups (cgroups) are used to lock access down.
    In the end I decided to use a combination of [Singularity](https://sylabs.io/singularity/) and [SLURM](https://slurm.schedmd.com/). Singularity is a container tool created for and used a lot by HPC facilities. SLURM is an industry-standard job scheduler, also used by many HPC instances. An advantage of these tools is that they are industry-standard, thus used a lot and well-documented. Always helpful when running into problems. To enforce proper usage of these tools, control groups (cgroups) are used to lock access down: by default there are no GPU permissions.

    As most tutorials I found were quite outdated, here's a new one. It's in typical 'follow-along' style, so you can copy/paste the commands onto your own terminal and you'll end up with a similar system! Note: root access is required of course.
    As most tutorials I found were quite outdated, here's a new one. It's in typical 'follow-along' style, so you can copy/paste the commands onto your own terminal and you'll end up with a similar system! Root access is required, obviously.

    Note: we are running Ubuntu 18.04 LTS.
    Note: we are running a new install of Ubuntu 18.04 LTS.

    ## Installing Singularity
    As most debian packages for Singularity are quite outdated, we'll compile it ourselves. It's written in Go, so we'll also install a recent Go version.
    @@ -62,18 +62,18 @@ go env
    ```
    This should give some output of Go.

    Next step is compiling Singularity itself. First get dep, then Singularity. Obviously change v3.2.1 to any later version if you want. Take a look at their github tags for more info.
    Next step is compiling Singularity itself. First get dep, then Singularity. Obviously change v3.5.2 to any later version if you want. Take a look at their github tags for more info.
    ```sh
    go get -u github.com/golang/dep/cmd/dep
    go get -d github.com/sylabs/singularity
    cd $GOPATH/src/github.com/sylabs/singularity
    git checkout v3.2.1
    git checkout v3.5.2
    ```
    It'll complain a bit about no Go files being there, but still does its job.
    Now compile time, this will take a few minutes:
    ```sh
    ./mconfig
    make -C builddir
    make -j10 -C builddir
    sudo make -C ./builddir install
    ```

    @@ -82,7 +82,7 @@ You should be done now! Let's test it:
    singularity version
    ```

    And the output should be `3.2.1` or the version you picked before.
    And the output should be `3.5.2` or the version you picked before.

    We're going to make a few changes to the default configuration, mainly to make it easier for our users. We'll add a few bind points and change a few defaults to make the containers as transparent as possible.

    @@ -147,9 +147,9 @@ sudo systemctl enable slurmdbd
    ```

    We can't start them yet because we don't have a slurm.conf file yet.
    There is a generator to make one, but I'll drop my own slurm.conf file here below.
    There is a generator to make one, but I'll drop my own slurm.conf file here below later.

    We also need mysql for accounting. This isn't the most desirable application you can install (for security reasons), but nowadays the defaults of mysql 5.7 at Ubuntu 18.04 are pretty sane.
    We also need mysql for accounting. This isn't the most desirable application you can install (for security reasons), but nowadays the defaults of mysql 5.7 at Ubuntu 18.04 are pretty sane (no more guest access, no empty root password).

    ```sh
    sudo DEBIAN_FRONTEND=noninteractive apt-get install -y mysql-server pwgen
    @@ -160,12 +160,13 @@ Use pwgen two generate two passwords: one for the mysql root user, one for the s
    ```sh
    pwgen 16 2
    ```
    Write them down or store them somewhere.
    Write them down or store them somewhere. Now open a mysql shell:

    ```sh
    mysql
    ```
    Then run these commands in the mysql shell:
    Then run these commands in the shell: Replace your_secure_password
    with one of the password generated by `pwgen` above.
    ```sql
    create user 'slurm'@'localhost';
    set password for 'slurm'@'localhost' = 'your_secure_password';
    @@ -181,7 +182,7 @@ Now it's time for the configuration files. There's two:
    2. `slurmd.conf` which is the generic slurm configuration

    I'll start with `slurmdbd.conf` and will just copypaste them here.
    Put them in `/etc/slurm/`
    Put them in `/etc/slurm/`. Don't forget to replace the password!

    ```
    # SLURMDB config file
    @@ -338,7 +339,8 @@ And run the command `sudo systemctl enable cgconfigparser.service` after.

    This will now be run every time on boot. So reboot the system.

    To move user processes into the right group, we edit `/etc/pam.d/common-session`.
    To move user processes into the right group, we edit
    `/etc/pam.d/common-session`.
    Add below line to the bottom of the file:

    ```
  4. Markus92 revised this gist Oct 21, 2019. 1 changed file with 3 additions and 3 deletions.
    6 changes: 3 additions & 3 deletions serversetup.md
    @@ -1,15 +1,15 @@
    # Setting up a GPU server with scheduling and containers


    Our group recently acquired a new server to do some deep learning: a (https://www.supermicro.com/products/system/4U/4029/SYS-4029GP-TRT2.cfm)[SuperMicro 4029GP-TRT2], stuffed with 8x NVIDIA RTX 2080 Ti. Though maybe a bit overpowered, with upcoming networks like BigGAN and fully 3D networks, as well as students joining our group, this machine will be used quite a lot in the future.
    Our group recently acquired a new server to do some deep learning: a [SuperMicro 4029GP-TRT2](https://www.supermicro.com/products/system/4U/4029/SYS-4029GP-TRT2.cfm), stuffed with 8x NVIDIA RTX 2080 Ti. Though maybe a bit overpowered, with upcoming networks like BigGAN and fully 3D networks, as well as students joining our group, this machine will be used quite a lot in the future.

    One challenge is, is how to manage these GPUs. There are many approaches, but given that most PhD candidates aren't sysadmins, these range from 'free-for-all', leading to one person hogging all GPUs for weeks due to a bug in the code, to Excel sheets that noone understands and noone adheres to because changing GPU ids in code is hard. This leads to a lot of frustration, low productivity and under-utilisation of these expensive servers. Another issue is conflicting software versions. TensorFlow and Keras, for example, tend to do breaking API changes every now and then. As these always happen right before a conference deadline, this leads to even more frustration when trying to run a few extra experiments.

    A group at a previous affiliation of mine had the same problems and used Docker containers with a job scheduler to mitigate most of these problem. Unfortunately I had never used it myself and was thus not familiar with the exact details of their implementation. This approach solves most of our problems: no conflicting software versions (just roll a container per research paper and archive it), no competing for GPUs and, most importantly, people can't accidentally screw up colleagues' experiments.

    There was one more constraint I had: our system should be as easy to use as possible. When I talked about 'job scheduling' and 'GPU allocation' to my colleagues, the reaction I got was that they were scared it'd be either too complicated to use, or too limited to use. As I really didn't want to go the 'Google Sheets' route for GPU scheduling, I kept this in mind during the design of the system. Another constraint was set by our sysadmin: no root for users, as we use some old NFSv3 fileservers which authenticate on UID/GID level. This immediately excluded Docker, as described [https://docs.docker.com/engine/security/security/#docker-daemon-attack-surface](here). As users would be allowed to control the Docker daemon, it's pretty much the same as putting everyone in the group 'sudoers'. Not something we want, to be honest.
    There was one more constraint I had: our system should be as easy to use as possible. When I talked about 'job scheduling' and 'GPU allocation' to my colleagues, the reaction I got was that they were scared it'd be either too complicated to use, or too limited to use. As I really didn't want to go the 'Google Sheets' route for GPU scheduling, I kept this in mind during the design of the system. Another constraint was set by our sysadmin: no root for users, as we use some old NFSv3 fileservers which authenticate on UID/GID level. This immediately excluded Docker, as described (here)[https://docs.docker.com/engine/security/security/#docker-daemon-attack-surface]. As users would be allowed to control the Docker daemon, it's pretty much the same as putting everyone in the group 'sudoers'. Not something we want, to be honest.

    In the end I decided to use a combination of (https://sylabs.io/singularity/)[Singularity] and (https://slurm.schedmd.com/)[SLURM]. Singularity is a container tool and used a lot by HPC facilities. SLURM is an industry-standard job scheduler, also used by many HPC instances. An advantage of these tools is that they are industry-standard, thus used a lot and well-documented. Always helpful when running into problems. To enforce proper usage of these tools, control groups (cgroups) are used to lock access down.
    In the end I decided to use a combination of [Singularity](https://sylabs.io/singularity/) and [SLURM](https://slurm.schedmd.com/). Singularity is a container tool and used a lot by HPC facilities. SLURM is an industry-standard job scheduler, also used by many HPC instances. An advantage of these tools is that they are industry-standard, thus used a lot and well-documented. Always helpful when running into problems. To enforce proper usage of these tools, control groups (cgroups) are used to lock access down.

    As most tutorials I found were quite outdated, here's a new one. It's in typical 'follow-along' style, so you can copy/paste the commands onto your own terminal and you'll end up with a similar system! Note: root access is required of course.

  5. Markus92 created this gist Oct 21, 2019.
    370 changes: 370 additions & 0 deletions serversetup.md
    @@ -0,0 +1,370 @@
    # Setting up a GPU server with scheduling and containers


    Our group recently acquired a new server to do some deep learning: a [SuperMicro 4029GP-TRT2](https://www.supermicro.com/products/system/4U/4029/SYS-4029GP-TRT2.cfm), stuffed with 8x NVIDIA RTX 2080 Ti. Though maybe a bit overpowered, with upcoming networks like BigGAN and fully 3D networks, as well as students joining our group, this machine will be used quite a lot in the future.

    One challenge is how to manage these GPUs. There are many approaches, but given that most PhD candidates aren't sysadmins, these range from 'free-for-all', leading to one person hogging all GPUs for weeks due to a bug in the code, to Excel sheets that no one understands and no one adheres to because changing GPU IDs in code is hard. This leads to a lot of frustration, low productivity and under-utilisation of these expensive servers. Another issue is conflicting software versions. TensorFlow and Keras, for example, tend to make breaking API changes every now and then. As these always happen right before a conference deadline, this leads to even more frustration when trying to run a few extra experiments.

    A group at a previous affiliation of mine had the same problems and used Docker containers with a job scheduler to mitigate most of them. Unfortunately I had never used that setup myself and was thus not familiar with the exact details of their implementation. This approach solves most of our problems: no conflicting software versions (just roll a container per research paper and archive it), no competing for GPUs and, most importantly, people can't accidentally screw up colleagues' experiments.

    There was one more constraint I had: our system should be as easy to use as possible. When I talked about 'job scheduling' and 'GPU allocation' to my colleagues, the reaction I got was that they were scared it'd be either too complicated or too limited to use. As I really didn't want to go the 'Google Sheets' route for GPU scheduling, I kept this in mind during the design of the system. Another constraint was set by our sysadmin: no root for users, as we use some old NFSv3 fileservers which authenticate at the UID/GID level. This immediately excluded Docker, as described [here](https://docs.docker.com/engine/security/security/#docker-daemon-attack-surface). As users would be allowed to control the Docker daemon, it's pretty much the same as putting everyone in the group 'sudoers'. Not something we want, to be honest.

    In the end I decided to use a combination of [Singularity](https://sylabs.io/singularity/) and [SLURM](https://slurm.schedmd.com/). Singularity is a container tool used a lot by HPC facilities. SLURM is an industry-standard job scheduler, also used by many HPC sites. An advantage of these tools is that they are industry-standard, thus widely used and well-documented. Always helpful when running into problems. To enforce proper usage of these tools, control groups (cgroups) are used to lock access down.

    As most tutorials I found were quite outdated, here's a new one. It's in typical 'follow-along' style, so you can copy/paste the commands into your own terminal and end up with a similar system! Note: root access is required, of course.

    Note: we are running Ubuntu 18.04 LTS.

    ## Installing Singularity
    As most Debian packages for Singularity are quite outdated, we'll compile it ourselves. It's written in Go, so we'll also install a recent Go version.

    First, install some standard packages for compiling stuff.
    ```sh
    $ sudo apt-get update && \
    sudo apt-get install -y \
    python \
    git \
    dh-autoreconf \
    build-essential \
    libarchive-dev \
    libssl-dev \
    uuid-dev \
    libgpgme11-dev \
    squashfs-tools
    ```

    Next, download and unpack a recent Go release and check that the binary works:
    ```sh
    $ wget https://dl.google.com/go/go1.12.6.linux-amd64.tar.gz
    $ sudo tar -xvf go1.12.6.linux-amd64.tar.gz
    $ sudo mv go /usr/local
    $ /usr/local/go/bin/go version
    ```

    To make sure the GOPATH is set for everyone, I created a new script in `/etc/profile.d`:
    ```sh
    $ sudo nano /etc/profile.d/dl_paths.sh
    ```

    And the script:
    ```sh
    GOROOT="/usr/local/go"

    export GOROOT=${GOROOT}
    export GOPATH=$HOME/go
    export PATH=$GOROOT/bin:$PATH
    ```

    To test this, log out and log back in again (or just reboot), then run:

    ```sh
    export
    go env
    ```
    This should print the Go environment, confirming the paths are set.

    Next step is compiling Singularity itself. First get dep, then Singularity. Obviously change v3.2.1 to any later version if you want. Take a look at their github tags for more info.
    ```sh
    go get -u github.com/golang/dep/cmd/dep
    go get -d github.com/sylabs/singularity
    cd $GOPATH/src/github.com/sylabs/singularity
    git checkout v3.2.1
    ```
    It'll complain a bit about no Go files being there, but still does its job.
    Now it's time to compile; this will take a few minutes:
    ```sh
    ./mconfig
    make -C builddir
    sudo make -C ./builddir install
    ```

    You should be done now! Let's test it:
    ```sh
    singularity version
    ```

    And the output should be `3.2.1` or the version you picked before.

    We're going to make a few changes to the default configuration, mainly to make it easier for our users. We'll add a few bind points and change a few defaults to make the containers as transparent as possible.

    ```sh
    $ sudo nano /usr/local/etc/singularity/singularity.conf
    ```

    First, change `always use nv = no` to `yes`. It doesn't really have any downsides, it just saves you from typing `--nv` every time. Second, we add a few bind paths. Obviously these are user specific, though `/run/user` is useful for everyone running a systemd-based distribution like Ubuntu or Debian. I added these below the standard bind paths; you'll find them easily in the config file.
    ```sh
    # For temporary files
    bind path = /run/user
    # Mounts to data
    bind path = /raid
    ```

    And finally a test run (this might take a while, as the container is HUGE):
    ```sh
    $ cd ~
    $ singularity exec docker://nvcr.io/nvidia/pytorch:19.05-py3 jupyter notebook
    ```
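
    If you want a quicker sanity check than a full Jupyter notebook, something like the line below should confirm that the GPUs are visible inside the container. This is just an illustrative check (same PyTorch image as above, `always use nv` set to `yes`), not part of the original walkthrough:
    ```sh
    # Should print "True 8" on a healthy 8-GPU machine
    $ singularity exec docker://nvcr.io/nvidia/pytorch:19.05-py3 \
        python -c "import torch; print(torch.cuda.is_available(), torch.cuda.device_count())"
    ```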

    ## SLURM
    Unfortunately the packages in Ubuntu and Debian are a bit too outdated, so we'll compile our own version. First install some dependencies. Note that we'll install the cgroup stuff right away.

    ```sh
    sudo apt-get install build-essential ruby-dev libpam0g-dev libmysqlclient-dev munge libmunge-dev libmysqld-dev cgroup-bin libpam-cgroup cgroup-tools
    ```

    Then download, extract and compile. My machine has many cores so we'll use some multi-threading in the make. Depending on your computer, you might have enough time to grab and drink some coffee.
    ```sh
    wget https://download.schedmd.com/slurm/slurm-19.05.0.tar.bz2
    tar -xaf slurm-19.05.0.tar.bz2
    cd slurm-19.05.0/
    ./configure --sysconfdir=/etc/slurm --enable-pam --localstatedir=/var --with-munge --with-ssl
    make -j10
    sudo make install
    ```

    Logout/login, then check if it actually does something.
    ```sh
    srun
    ```
    You'll get an error about the configuration file not existing. That's expected at this point.

    Now start and enable munge.

    ```sh
    sudo systemctl enable munge
    sudo systemctl start munge
    sudo systemctl status munge
    ```

    Copy the systemd unit files, enable them, and create a user for SLURM.

    ```sh
    cd ~/slurm-19.05.0/etc
    sudo cp *.service /lib/systemd/system/
    sudo adduser --system --no-create-home --group slurm
    sudo systemctl enable slurmd
    sudo systemctl enable slurmctld
    sudo systemctl enable slurmdbd
    ```

    We can't start them yet because there is no slurm.conf file.
    There is a generator to make one, but I'll drop my own slurm.conf file below.

    We also need MySQL for accounting. This isn't the most desirable application to install (for security reasons), but nowadays the defaults of MySQL 5.7 on Ubuntu 18.04 are pretty sane.

    ```sh
    sudo DEBIAN_FRONTEND=noninteractive apt-get install -y mysql-server pwgen
    ```

    Use pwgen to generate two passwords: one for the mysql root user, one for the slurm user.

    ```sh
    pwgen 16 2
    ```
    Write them down or store them somewhere safe. Then open a mysql shell:

    ```sh
    mysql
    ```
    Then run these commands in the mysql shell, replacing `your_secure_password` with one of the passwords generated by `pwgen` above:
    ```sql
    create user 'slurm'@'localhost';
    set password for 'slurm'@'localhost' = 'your_secure_password';
    grant usage on *.* to 'slurm'@'localhost';
    create database slurm_acct_db;
    grant all privileges on slurm_acct_db.* to 'slurm'@'localhost';
    flush privileges;
    exit
    ```
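
    To confirm the grants work, you can connect as the new user; it should prompt for the password you just set and drop you into the (still empty) accounting database. Just a sanity check, not a required step:
    ```sh
    mysql -u slurm -p slurm_acct_db -e 'show tables;'
    ```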

    Now it's time for the configuration files. There are two:
    1. `slurmdbd.conf`, which is for the database daemon
    2. `slurm.conf`, which is the generic SLURM configuration

    I'll start with `slurmdbd.conf` and will just copy-paste it here.
    Put both files in `/etc/slurm/`, and don't forget to replace the password!

    ```
    # SLURMDB config file
    # Created by Mark Janse 2019-06-18
    # logging level
    ArchiveEvents=no
    ArchiveJobs=yes
    ArchiveSteps=no
    ArchiveSuspend=no
    # service
    DbdHost=localhost
    SlurmUser=slurm
    AuthType=auth/munge
    # logging; remove this to use syslog
    LogFile=/var/log/slurm-llnl/slurmdbd.log
    # database backend
    StoragePass=your_secure_password
    StorageUser=slurm
    StorageType=accounting_storage/mysql
    StorageLoc=slurm_acct_db
    ```

    And here's the `slurm.conf`. I'll assume hostname `turing` for the main machine. The name of the cluster is `bip-cluster`, but that isn't really too important.
    At the bottom I also define the node: ours has 8 GPUs, 2 CPUs, 10 cores per CPU and 2 threads per core. Change these to match your own hardware.

    ```
    # slurm.conf file generated by configurator easy.html.
    # Put this file on all nodes of your cluster.
    # See the slurm.conf man page for more information.
    #
    # Set your hostname here!
    SlurmctldHost=turing
    #
    #MailProg=/bin/mail
    MpiDefault=none
    #MpiParams=ports=#-#
    ProctrackType=proctrack/cgroup
    ReturnToService=1
    SlurmctldPidFile=/var/run/slurmctld.pid
    #SlurmctldPort=6817
    SlurmdPidFile=/var/run/slurmd.pid
    #SlurmdPort=6818
    SlurmdSpoolDir=/var/spool/slurmd
    SlurmUser=slurm
    #SlurmdUser=root
    StateSaveLocation=/var/spool/slurm
    SwitchType=switch/none
    TaskPlugin=task/cgroup
    #
    #
    # TIMERS
    #KillWait=30
    #MinJobAge=300
    #SlurmctldTimeout=120
    #SlurmdTimeout=300
    #
    #
    # SCHEDULING
    FastSchedule=1
    SchedulerType=sched/backfill
    SelectType=select/cons_tres
    SelectTypeParameters=CR_Core
    #
    #
    # LOGGING AND ACCOUNTING
    AccountingStorageType=accounting_storage/slurmdbd
    AccountingStorageEnforce=associations
    ClusterName=bip-cluster
    #JobAcctGatherFrequency=30
    JobAcctGatherType=jobacct_gather/linux
    #SlurmctldDebug=info
    SlurmctldLogFile=/var/log/slurm/slurmctld.log
    #SlurmdDebug=info
    SlurmdLogFile=/var/log/slurm/slurmd.log
    #
    #
    # COMPUTE NODES
    # NodeName=turing Sockets=2 CoresPerSocket=10 ThreadsPerCore=2 State=UNKNOWN
    # Partitions
    GresTypes=gpu
    NodeName=turing Gres=gpu:8 Sockets=2 CoresPerSocket=10 ThreadsPerCore=2 State=UNKNOWN
    PartitionName=tu102 Nodes=turing Default=YES MaxTime=96:00:00 MaxNodes=2 DefCpuPerGPU=5 State=UP
    ```
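
    If you're unsure about the `Sockets`/`CoresPerSocket`/`ThreadsPerCore` numbers for your own machine, `slurmd` can report what it detects. A quick sanity check, not part of the original guide:
    ```sh
    # Prints a NodeName=... line with the hardware slurmd sees on this host
    sudo slurmd -C
    ```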

    For GPU scheduling you also need a `gres.conf`. This file differs per machine if your machines have different numbers of GPUs. In our case, there is only one machine with 8 GPUs.
    ```
    # Defines all 8 GPUs on Turing
    Name=gpu File=/dev/nvidia[0-7]
    ```
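
    With `slurmdbd.conf`, `slurm.conf` and `gres.conf` in place, the daemons we enabled earlier can be started. Roughly, and assuming everything above went fine, I'd expect this to work:
    ```sh
    sudo systemctl start slurmdbd
    sudo systemctl start slurmctld
    sudo systemctl start slurmd
    # The node and partition defined above should now show up
    sinfo
    ```
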
    ## Restricting unauthorized GPU access
    Previously, we already installed several tools needed for `cgroups`. Now we're going to put them to work.
    First we create the file `cgconfig.conf`. See below for contents. We create a group `nogpu` for processes without gpu access, and a group `gpu` for processes which can access the GPU.

    Location of the file is `/etc/cgconfig.conf`
    ```
    # Below restricts access to NVIDIA devices for all users in this cgroup
    # Number 195 is documented in kernel for NVIDIA driver stuff
    group nogpu {
        devices {
            devices.deny = "c 195:* rwm";
        }
    }
    # Opposite of above, just to be sure
    group gpu {
        devices {
            devices.allow = "c 195:* rwm";
        }
    }
    ```

    For admin tasks, you might want to create a usergroup which will always have GPU access.

    ```sh
    sudo groupadd gpu
    sudo usermod -aG gpu mark
    ```
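
    To double-check the membership (for the example user `mark` from above; the change only applies to new logins):
    ```sh
    id mark   # 'gpu' should now appear in the list of groups
    ```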

    I'd advise adding every user with root access to this group for administration tasks. Do *not* add any regular users to it, or it'll defeat the purpose of the scheduling system.

    To load these `cgroups` every time the system boots, we'll run `cgconfigparser` on boot. Let's create a small `systemd` script to do this:

    ```sh
    sudo nano /lib/systemd/system/cgconfigparser.service
    ```

    And copy-paste the file below in there:

    ```
    [Unit]
    Description=cgroup config parser
    After=network.target
    [Service]
    User=root
    Group=root
    ExecStart=/usr/sbin/cgconfigparser -l /etc/cgconfig.conf
    Type=oneshot
    [Install]
    WantedBy=multi-user.target
    ```

    And run the command `sudo systemctl enable cgconfigparser.service` after.

    This will now run on every boot, so reboot the system.
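
    After rebooting, you can quickly verify that both groups exist (just a sanity check):
    ```sh
    # Both 'gpu' and 'nogpu' should show up here
    ls /sys/fs/cgroup/devices/
    ```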

    To move user processes into the right cgroup, we edit `/etc/pam.d/common-session`.
    Add the line below to the bottom of the file:

    ```
    session optional pam_cgroup.so
    ```

    The PAM module reads the file `/etc/cgrules.conf`, so create that. Mine is below:

    ```
    # /etc/cgrules.conf
    #The format of this file is described in cgrules.conf(5)
    #manual page.
    #
    # Example:
    #<user> <controllers> <destination>
    #@student cpu,memory usergroup/student/
    #peter cpu test1/
    #% memory test2/
    # End of file
    root devices /
    user devices /
    @gpu devices /gpu
    * devices /nogpu
    ```

    This will move all users in the group *gpu* to GPU access, and everyone else to no GPU access. Exactly what we want.
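
    Once you've done the final reboot below, a quick way to check that the lockdown behaves as intended (a sketch, assuming the cgroup v1 layout that Ubuntu 18.04 uses):
    ```sh
    # As a regular user who is not in the gpu group:
    grep devices /proc/self/cgroup   # should end in /nogpu
    nvidia-smi                       # should fail, as /dev/nvidia* can no longer be opened
    # For a member of the gpu group, the same commands should show /gpu and a normal nvidia-smi listing.
    ```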

    Now reboot for the final time and you're done!
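
    And to see the whole setup in action, here's roughly what a user's job submission could look like. This is an illustrative sketch rather than part of the setup above; the script name, container and resource numbers are just examples:
    ```sh
    #!/bin/bash
    # train.sbatch -- request one GPU and five CPU cores for a training run
    #SBATCH --job-name=train
    #SBATCH --gres=gpu:1
    #SBATCH --cpus-per-task=5
    #SBATCH --time=24:00:00
    singularity exec docker://nvcr.io/nvidia/pytorch:19.05-py3 python train.py
    ```
    Submitted with `sbatch train.sbatch`, SLURM queues the job until a GPU is free and exposes only the allocated card(s) to it via `CUDA_VISIBLE_DEVICES`, while interactive use outside the scheduler stays locked out by the cgroups above.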