- Install all required software: docker, nvidia-docker, gitlab-ci-multi-runner
- Execute: curl -s http://localhost:3476/docker/cli (sample output below)
- Use that data to fill the devices/volumes/volume_driver fields in /etc/gitlab-runner/config.toml
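For reference, the nvidia-docker 1.x plugin answers that request with the docker CLI flags it would inject. On a four-GPU box with driver 384.81 the output looks roughly like this (a single line, wrapped here for readability; your device list and driver-volume name will differ):
curl -s http://localhost:3476/docker/cli
--volume-driver=nvidia-docker --volume=nvidia_driver_384.81:/usr/local/nvidia:ro
--device=/dev/nvidiactl --device=/dev/nvidia-uvm --device=/dev/nvidia-uvm-tools
--device=/dev/nvidia0 --device=/dev/nvidia1 --device=/dev/nvidia2 --device=/dev/nvidia3
These values map one-to-one onto the devices, volumes, and volume_driver fields in the config.toml below.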
concurrent = 1
check_interval = 0
[[runners]]
  name = "Docker runner <---complete-me--->"
  url = "https://<---complete-me---->"
  token = "28ce17edc8ea7437f3e49969c86341"
  executor = "docker"
  [runners.docker]
    tls_verify = false
    image = "nvidia/cuda"
    privileged = false
    disable_cache = false
    devices = ["/dev/nvidiactl", "/dev/nvidia-uvm", "/dev/nvidia-uvm-tools", "/dev/nvidia3", "/dev/nvidia2", "/dev/nvidia1", "/dev/nvidia0"]
    volumes = ["/cache", "nvidia_driver_384.81:/usr/local/nvidia:ro"]
    volume_driver = "nvidia-docker"
    shm_size = 0
  [runners.cache]
Is there a newer method?
I have tried installing nvidia-docker + docker + the runner itself, then setting only the runner's runtime parameter to "nvidia" and the executor to "docker", but TensorFlow, for example, doesn't detect the GPUs at all.
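As a quick sanity check (an illustrative command, not part of the original setup), you can confirm the nvidia runtime is actually registered with the Docker daemon before blaming the runner config:
docker info | grep -i runtimes
# expected output includes something like: Runtimes: nvidia runc
If nvidia is missing here, no config.toml setting will help.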
The following config.toml provides GPU support (notice the runtime parameter).
concurrent = 1
check_interval = 0
[[runners]]
  name = "Docker runner <---complete-me--->"
  url = "https://<---complete-me---->"
  token = "28ce17edc8ea7437f3e49969c86341"
  executor = "docker"
  [runners.docker]
    tls_verify = false
    image = "nvidia/cuda"
    privileged = false
    disable_cache = false
    volumes = ["/cache"]
    shm_size = 0
    runtime = "nvidia"
  [runners.cache]
Yet it is not clear to me how to restrict the GPUs assigned to the runner on a multi-GPU server. This functionality is called "GPU isolation".
The docker run command for GPU isolation follows; please notice the -e NVIDIA_VISIBLE_DEVICES=0. How can this be set for the runner in config.toml?
docker run --runtime=nvidia --rm -e NVIDIA_VISIBLE_DEVICES=0 nvidia/cuda:9.0-base nvidia-smi
In the [[runners]] section there's an environment keyword to define environment variables, but I guess it won't work, because you have to pass that environment variable on to docker.
So the only way I see is to specify NVIDIA_VISIBLE_DEVICES directly in the Dockerfile
https://github.com/NVIDIA/nvidia-docker/wiki/Usage#dockerfiles
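For what it's worth, here is a minimal sketch of that Dockerfile approach (the cuda-gpu0 image name is hypothetical), using the ENV convention from the wiki page above:
cat > Dockerfile <<'EOF'
FROM nvidia/cuda:9.0-base
# Bake the GPU restriction into the image itself
ENV NVIDIA_VISIBLE_DEVICES 0
EOF
docker build -t cuda-gpu0 .
docker run --runtime=nvidia --rm cuda-gpu0 nvidia-smi
The obvious drawback is that the image, not the runner, decides which GPU it sees.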
It seems that environment in the [[runners]] section is exactly what we were looking for.
Actually, any environment variable that is set before the script section of the .gitlab-ci.yml file runs will do. See the following two examples; both of them worked for me.
Example 1: using gitlab-runner configuration only
In /etc/gitlab-runner/config.toml:
[[runners]]
  name = "runner-gpu0-test"
  url = "<url>"
  token = "<token>"
  executor = "docker"
  environment = ["NVIDIA_VISIBLE_DEVICES=0"] # <== Notice this
  [runners.docker]
    runtime = "nvidia" # <== Notice this
    tls_verify = false
    image = "nvidia/cuda:9.0-base"
    privileged = false
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    volumes = ["/cache"]
    shm_size = 0
  [runners.cache]
    [runners.cache.s3]
    [runners.cache.gcs]

[[runners]]
  name = "runner-gpu1-test"
  url = "<url>"
  token = "<token>"
  executor = "docker"
  environment = ["NVIDIA_VISIBLE_DEVICES=1"] # <== Notice this
  [runners.docker]
    runtime = "nvidia" # <== Notice this
    tls_verify = false
    image = "nvidia/cuda:9.0-base"
    privileged = false
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    volumes = ["/cache"]
    shm_size = 0
  [runners.cache]
    [runners.cache.s3]
    [runners.cache.gcs]
The .gitlab-ci.yml file:
image: nvidia/cuda:9.0-base

test:run_on_gpu0:
  stage: test
  script:
    - echo NVIDIA_VISIBLE_DEVICES=${NVIDIA_VISIBLE_DEVICES}
    - nvidia-smi
    - sleep 10s
  tags:
    - docker
    - gpu0

test:run_on_gpu1:
  stage: test
  script:
    - echo NVIDIA_VISIBLE_DEVICES=${NVIDIA_VISIBLE_DEVICES}
    - nvidia-smi
    - sleep 7s
  tags:
    - docker
    - gpu1
The two runners have been tagged with docker, gpu0 and docker, gpu1 respectively.
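To sanity-check the isolation outside of CI (an illustrative command that mirrors what the gpu0 runner injects):
docker run --runtime=nvidia --rm -e NVIDIA_VISIBLE_DEVICES=0 nvidia/cuda:9.0-base nvidia-smi
# nvidia-smi inside the container should report exactly one GPU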
Example 2: using GitLab CI custom environment variables
The /etc/gitlab-runner/config.toml is the same as in Example 1.
The .gitlab-ci.yml file:
image: nvidia/cuda:9.0-base

variables:
  NVIDIA_VISIBLE_DEVICES: "3" # This is going to override definition(s) in /etc/gitlab-runner/config.toml

test:run_on_gpu0:
  stage: test
  script:
    - echo NVIDIA_VISIBLE_DEVICES=${NVIDIA_VISIBLE_DEVICES}
    - nvidia-smi
    - sleep 10s
  tags:
    - docker
    - gpu0

test:run_on_gpu1:
  stage: test
  script:
    - echo NVIDIA_VISIBLE_DEVICES=${NVIDIA_VISIBLE_DEVICES}
    - nvidia-smi
    - sleep 7s
  tags:
    - docker
    - gpu1
Do you guys know how to make it work with docker v19.03.2, which integrates native support for NVIDIA GPUs?
The runtime = "nvidia" setting does not work anymore; containers should be executed with the --gpus flag now:
docker run -it --rm --gpus all ubuntu nvidia-smi
It is an open issue and, looking at the comments, it does not seem likely to be fixed soon.
I am using Docker 19.03 together with nvidia-docker2. This provides the new --gpus switch while keeping compatibility with the old --runtime switch (refer to https://github.com/NVIDIA/nvidia-docker/tree/master#upgrading-with-nvidia-docker2-deprecated).
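With that setup, both invocations below work side by side (a quick check, assuming a CUDA-capable host; the image tag is only an example):
# New native syntax (Docker >= 19.03)
docker run --rm --gpus all nvidia/cuda:9.0-base nvidia-smi
# Legacy nvidia-docker2 syntax, which gitlab-runner's runtime = "nvidia" still relies on
docker run --rm --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=all nvidia/cuda:9.0-base nvidia-smi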
This method is outdated.