@Matthieubmt
Forked from lucndm/Host
Created April 17, 2023 14:15

Revisions

  1. @lucndm lucndm created this gist Dec 21, 2019.
    49 changes: 49 additions & 0 deletions Host
    @@ -0,0 +1,49 @@
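    Note: Proxmox 6.1 (based on Debian buster; the jessie suite names below date from Proxmox 4.x, so adjust them to the installed release)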

    VI : /etc/apt/sources.list

    # security updates
    deb http://security.debian.org jessie/updates main contrib

    # PVE pve-no-subscription repository provided by proxmox.com,
    # NOT recommended for production use
    deb http://download.proxmox.com/debian jessie pve-no-subscription
    # jessie-backports
    deb http://httpredir.debian.org/debian jessie-backports main contrib non-free
    =======================
    VI : /etc/modules-load.d/modules.conf
    # /etc/modules: kernel modules to load at boot time.

    # This file contains the names of kernel modules that should be loaded
    # at boot time, one per line. Lines beginning with "#" are ignored.
    nvidia
    nvidia_uvm
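
    The two module names above can also be appended from a script instead of by hand. The following is a minimal sketch, assuming a hypothetical helper named `ensure_module` (not part of Proxmox or Debian) that adds a line only when it is missing. It is demonstrated on a scratch file; on the host it would be pointed at /etc/modules-load.d/modules.conf and run as root.

    ```bash
    # Append a module name to a modules-load file only if it is not already listed.
    # ensure_module is a hypothetical helper name used for illustration.
    ensure_module() {
        file=$1
        module=$2
        # -x matches the whole line, -F treats the name literally, -q stays quiet
        if ! grep -qxF "$module" "$file" 2>/dev/null; then
            printf '%s\n' "$module" >> "$file"
        fi
    }

    # Demonstration on a scratch file rather than the real configuration.
    conf=$(mktemp)
    ensure_module "$conf" nvidia
    ensure_module "$conf" nvidia_uvm
    ensure_module "$conf" nvidia        # second call is a no-op
    cat "$conf"
    ```

    Matching whole lines keeps `nvidia` from being mistaken for `nvidia_uvm`, so the helper can be re-run safely.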
    ========================
    ```bash
    update-initramfs -u
    ```
    ========================
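    apt-get update && apt-get dist-upgrade # Proxmox expects dist-upgrade rather than a plain upgrade
    apt-cache search pve-header # list the kernel header packages; install the one matching `uname -r`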
    apt-get install -t jessie-backports nvidia-driver-440 # use driver version 440
    apt-get install i7z nvidia-smi htop iotop
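
    The driver is built against the headers of the running kernel, so the headers package has to match `uname -r`. The sketch below derives that package name, assuming the `pve-headers-<release>` naming used by Proxmox VE 6.x kernels; `headers_pkg` is a hypothetical helper name. The install command itself is left as a comment because it needs root on the host.

    ```bash
    # Build the name of the kernel headers package for a given kernel release.
    # headers_pkg is a hypothetical helper; the pve-headers-<release> naming is
    # an assumption that holds for Proxmox VE 6.x kernels.
    headers_pkg() {
        printf 'pve-headers-%s\n' "$1"
    }

    pkg=$(headers_pkg "$(uname -r)")
    echo "headers package for the running kernel: $pkg"

    # On the host, as root, the package would then be installed with:
    #   apt-get install "$pkg"
    ```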
    ========================
    VI : /etc/udev/rules.d/70-nvidia.rules
    # /etc/udev/rules.d/70-nvidia.rules
    # Create /dev/nvidia0, /dev/nvidia1 … and /dev/nvidiactl when the nvidia module is loaded
    KERNEL=="nvidia", RUN+="/bin/bash -c '/usr/bin/nvidia-smi -L && /bin/chmod 0666 /dev/nvidia*'"
    # Create the CUDA node when the nvidia_uvm module is loaded
    KERNEL=="nvidia_uvm", RUN+="/bin/bash -c '/usr/bin/nvidia-modprobe -c0 -u && /bin/chmod 0666 /dev/nvidia-uvm*'"
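
    Both rules end with a chmod to 0666, which is what lets an unprivileged container open the device nodes. A quick way to confirm the rules took effect is to look for nodes with any other mode. This is a sketch only: `not_world_rw` is a hypothetical helper, and `stat -c` is the GNU coreutils form.

    ```bash
    # Print every given path whose permission bits are not 666.
    # not_world_rw is a hypothetical helper name used for illustration.
    not_world_rw() {
        for node in "$@"; do
            [ -e "$node" ] || continue          # skip a glob that matched nothing
            mode=$(stat -c '%a' "$node")
            [ "$mode" = "666" ] || printf '%s (mode %s)\n' "$node" "$mode"
        done
    }

    # After the reboot, no output means every existing node is world read/write.
    not_world_rw /dev/nvidia*
    ```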
    =========================
    REBOOT
    =========================
    ```bash
    nvidia-smi
    ```
    The output should list the driver version and the card, confirming that both are working.
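
    If `nvidia-smi` fails, it helps to check whether the modules listed in modules.conf were actually loaded at boot. The sketch below reads /proc/modules; `module_loaded` is a hypothetical helper, and its optional second argument exists only so the check can be pointed at a sample file.

    ```bash
    # Succeed if a module name appears in a /proc/modules-style listing.
    # module_loaded is a hypothetical helper name used for illustration.
    module_loaded() {
        grep -q "^$1 " "${2:-/proc/modules}" 2>/dev/null
    }

    for m in nvidia nvidia_uvm; do
        if module_loaded "$m"; then
            echo "$m: loaded"
        else
            echo "$m: not loaded"
        fi
    done
    ```

    The trailing space in the pattern keeps `nvidia` from matching the `nvidia_uvm` line.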
    ==========================
    ```bash
    modprobe nvidia-uvm
    ls -l /dev/nvidia*
    ```
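
    The container configuration bind-mounts five device nodes, so it is worth checking which of them exist on the host before starting the container. A minimal sketch, assuming a hypothetical helper `missing_nodes` whose directory argument defaults to /dev:

    ```bash
    # List which of the expected NVIDIA device nodes are absent under a directory.
    # missing_nodes is a hypothetical helper name used for illustration.
    missing_nodes() {
        dir=${1:-/dev}
        for name in nvidia0 nvidiactl nvidia-uvm nvidia-uvm-tools nvidia-modeset; do
            [ -e "$dir/$name" ] || echo "$name"
        done
    }

    # Prints one missing node per line; no output means all five are present.
    missing_nodes
    ```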

    7 changes: 7 additions & 0 deletions LXC- ubuntu 18.04
    @@ -0,0 +1,7 @@
    bash ./cuda_10.2.89_440.33.01_linux.run

    Please make sure that
    - PATH includes /usr/local/cuda-10.2/bin
    - LD_LIBRARY_PATH includes /usr/local/cuda-10.2/lib64, or, add /usr/local/cuda-10.2/lib64 to /etc/ld.so.conf and run ldconfig as root

    To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-10.2/bin
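
    The two path requirements printed by the installer can be met with a few lines in a shell profile inside the container. This is a sketch, not the installer's own method: the variable name `CUDA_HOME` is an assumption chosen for illustration, while the directories are the ones quoted above.

    ```bash
    # Directories quoted in the CUDA 10.2 installer output.
    # CUDA_HOME is an illustrative variable name, not something the installer sets.
    CUDA_HOME=/usr/local/cuda-10.2

    # Prepend each directory only once, so re-sourcing the file stays harmless.
    case ":$PATH:" in
        *":$CUDA_HOME/bin:"*) ;;
        *) PATH="$CUDA_HOME/bin:$PATH" ;;
    esac
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$CUDA_HOME/lib64:"*) ;;
        *) LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export PATH LD_LIBRARY_PATH
    ```

    The alternative named by the installer, adding the lib64 directory to /etc/ld.so.conf and running ldconfig as root, applies system-wide instead of per shell.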
    5 changes: 5 additions & 0 deletions REF
    @@ -0,0 +1,5 @@
    https://medium.com/repro-repo/install-cuda-10-1-and-cudnn-7-5-0-for-pytorch-on-ubuntu-18-04-lts-9b6124c44cc
    https://tutorials.technology/tutorials/85-How-to-remove-Nouveau-kernel-driver-Nvidia-install-error.html
    https://medium.com/@MARatsimbazafy/journey-to-deep-learning-nvidia-gpu-passthrough-to-lxc-container-97d0bc474957
    https://gist.github.com/MakiseKurisu/21b08e5f6537a5b0a08a34c2382dd244/raw/94e1c9653d934241f6e04955afa823c0e5bafee4/setup.sh
    https://askubuntu.com/questions/5417/how-to-get-the-gpu-info
    24 changes: 24 additions & 0 deletions container_id.conf in pve
    @@ -0,0 +1,24 @@
    #VI : /etc/pve/lxc/<CONTAINER_ID>.conf
    # Deep Learning Container (CUDA, cuDNN, OpenCL support)

    arch: amd64
    cpulimit: 8
    cpuunits: 1024
    hostname: MachineLearning
    memory: 16384
    net0: bridge=vmbr0,gw=192.168.1.1,hwaddr=36:39:64:66:36:66,ip=192.168.1.200/24,name=eth0,type=veth
    onboot: 0
    ostype: archlinux
    rootfs: local-lvm:vm-400-disk-1,size=192G
    swap: 16384
    unprivileged: 1


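    # GPU Passthrough config
    # 195 is the major number of /dev/nvidia0 and /dev/nvidiactl. The nvidia-uvm major
    # (243 here) is assigned dynamically, so check it on the host with: ls -l /dev/nvidia-uvm
    lxc.cgroup.devices.allow: c 195:* rwm
    lxc.cgroup.devices.allow: c 243:* rwm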
    lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
    lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
    lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
    lxc.mount.entry: /dev/nvidia-modeset dev/nvidia-modeset none bind,optional,create=file
    lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file
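
    The major numbers in the `lxc.cgroup.devices.allow` lines have to match what the host actually assigned, and the one for nvidia-uvm can differ between machines. The sketch below prints the matching line for a device node. `allow_line` is a hypothetical helper, and `stat -c '%t'` is the GNU coreutils format for the major number in hexadecimal.

    ```bash
    # Print an lxc.cgroup.devices.allow line for the major number of a device node.
    # allow_line is a hypothetical helper name used for illustration.
    allow_line() {
        major_hex=$(stat -c '%t' "$1") || return 1
        # printf converts the 0x-prefixed hexadecimal value to decimal
        printf 'lxc.cgroup.devices.allow: c %d:* rwm\n' "0x$major_hex"
    }

    # Run on the Proxmox host after the modules are loaded.
    for dev in /dev/nvidiactl /dev/nvidia-uvm; do
        if [ -e "$dev" ]; then
            allow_line "$dev"
        fi
    done
    ```

    The printed lines can be compared with, or pasted over, the two `lxc.cgroup.devices.allow` entries in the container configuration.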