Add GPU support (#1289)
---------

Signed-off-by: Dan Webb <[email protected]>
damacus authored Dec 11, 2024
1 parent b5012c5 commit d62eec5
Showing 8 changed files with 479 additions and 9 deletions.
2 changes: 2 additions & 0 deletions .rubocop.yml
@@ -0,0 +1,2 @@
require:
- cookstyle
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## Unreleased

- Added GPU support for the `docker_container` resource

## 11.6.1 - *2024-12-10*

## 11.6.0 - *2024-12-03*
24 changes: 24 additions & 0 deletions documentation/docker_container.md
@@ -42,6 +42,8 @@ Most `docker_container` properties are the `snake_case` version of the `CamelCas
- `env_file` - Read environment variables from a file and set in the container. Accepts an Array or String to the file location. lazy evaluator must be set if the file passed is created by Chef.
- `extra_hosts` - An array of hosts to add to the container's `/etc/hosts` in the form `['host_a:10.9.8.7', 'host_b:10.9.8.6']`
- `force` - A boolean to use in container operations that support a `force` option. Defaults to `false`
- `gpus` - GPU devices to add to the container. Use 'all' to pass all GPUs to the container.
- `gpu_driver` - GPU driver to use for container. Defaults to 'nvidia'.
- `health_check` - A hash containing the health check options - [healthcheck reference](https://docs.docker.com/engine/reference/run/#healthcheck)
- `host` - A string containing the host the API should communicate with. Defaults to ENV['DOCKER_HOST'] if set
- `host_name` - The hostname for the container.
@@ -521,3 +523,25 @@ docker_container 'health_check' do
action :run
end
```

### Run a container with GPU support

```ruby
# Using default NVIDIA driver
docker_container 'gpu_container' do
repo 'nvidia/cuda'
tag 'latest'
command 'nvidia-smi'
gpus 'all'
action :run_if_missing
end

# Using a custom GPU driver
docker_container 'custom_gpu_container' do
repo 'custom/gpu-image'
tag 'latest'
gpus 'all'
gpu_driver 'custom_driver'
action :run_if_missing
end
```
14 changes: 11 additions & 3 deletions resources/container.rb
@@ -77,6 +77,8 @@
property :volumes_from, [String, Array], coerce: proc { |v| v.nil? ? nil : Array(v) }
property :volume_driver, String
property :working_dir, String
property :gpus, [String, nil], description: 'GPU devices to add to the container (e.g., all or device=0)'
property :gpu_driver, String, default: 'nvidia', description: 'GPU driver to use for container (e.g., nvidia)'

# Used to store the bind property since binds is an alias to volumes
property :volumes_binds, Array, coerce: proc { |v| v.sort }
@@ -326,19 +328,19 @@ def coerce_port_bindings(v)
#
# If you say: `image 'repo/blah'`
# Repo will be: `repo/blah`
- # Tag will be: `latest`
+ # Tag will be: `latest'
#
# If you say: `image 'repo/blah:3.1'`
# Repo will be: `repo/blah`
- # Tag will be: `3.1`
+ # Tag will be: `3.1'
#
# If you say: `image 'repo:1337/blah'`
# Repo will be: `repo:1337/blah`
# Tag will be: `latest'
#
# If you say: `image 'repo:1337/blah:3.1'`
# Repo will be: `repo:1337/blah`
- # Tag will be: `3.1`
+ # Tag will be: `3.1'
#
def image(image = nil)
if image
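
The comment block in this hunk describes how an image string is split into repo and tag. A minimal sketch of those rules in plain Ruby follows; `split_image` is a hypothetical helper written for illustration, not the cookbook's `image` method, whose body is elided in this diff.

```ruby
# Hypothetical helper illustrating the repo/tag rules from the comment block.
# Only the last path segment may carry a tag, so a ':' that appears before
# the final '/' (a registry port such as 'repo:1337') stays in the repo.
def split_image(image)
  dirname, slash, basename = image.rpartition('/')
  name, tag = basename.split(':', 2)
  ["#{dirname}#{slash}#{name}", tag || 'latest']
end

p split_image('repo/blah')          # => ["repo/blah", "latest"]
p split_image('repo:1337/blah:3.1') # => ["repo:1337/blah", "3.1"]
```

Splitting on the last `/` first is what keeps a registry port from being mistaken for a tag.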
@@ -544,6 +546,12 @@ def load_container_labels

# Store the state of the options and create the container
new_resource.create_options = config
config['HostConfig']['DeviceRequests'] = [{
'Driver' => new_resource.gpu_driver,
'Count' => -1, # -1 means no limit
'Capabilities' => [['gpu']],
}] if new_resource.gpus == 'all'

Docker::Container.create(config, connection)
end
end
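
Read on its own, the hunk above attaches a single device request only when `gpus` is exactly `'all'`. The sketch below mirrors that logic in plain Ruby so the resulting `HostConfig` payload can be inspected without a Docker daemon. `device_requests_for` is a hypothetical helper name, not part of the cookbook.

```ruby
# Hypothetical helper (not part of the cookbook) mirroring the logic this
# commit adds just before Docker::Container.create.
def device_requests_for(gpus, gpu_driver = 'nvidia')
  # Only the literal string 'all' produces a device request in this commit.
  return nil unless gpus == 'all'

  [{
    'Driver' => gpu_driver,
    'Count' => -1, # -1 means no limit
    'Capabilities' => [['gpu']],
  }]
end

config = { 'HostConfig' => {} }
requests = device_requests_for('all')
config['HostConfig']['DeviceRequests'] = requests if requests

p config
p device_requests_for('device=0')
```

Under this reading, a value such as `'device=0'` (the example given in the property description) yields no device request, so only `gpus 'all'` changes the container configuration.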
13 changes: 13 additions & 0 deletions spec/docker_test/container_spec.rb
@@ -945,4 +945,17 @@
)
end
end

context 'testing GPU support' do
let(:chef_run) { ChefSpec::SoloRunner.new(platform: 'ubuntu', version: '18.04').converge(described_recipe) }

it 'creates a container with GPU support' do
expect(chef_run).to run_if_missing_docker_container('gpu_test').with(
repo: 'nvidia/cuda',
tag: 'latest',
gpus: 'all',
gpu_driver: 'nvidia'
)
end
end
end