Containerized workspace for general PyTorch deep learning prototyping.
Before using the Docker environment, make sure you have the following dependencies on your local machine:
Please follow the official guide to install Docker, which provides the container runtime behind the isolated workspace, on your local machine.
In order to run Docker commands without sudo:
- Execute the following command to add the current user to the docker group:
sudo usermod -aG docker $USER
- Log out, then log back in.
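To confirm you can now run Docker without sudo, try the standard hello-world test image:
# verify docker works without sudo:
docker run --rm hello-world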
Please follow the official guide to install Docker-Compose, the Docker orchestrator, on your local machine.
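Once installed, a quick sanity check that the binary is on your PATH:
# check docker-compose version:
docker-compose --version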
For prototyping, the following libraries for Graph Neural Network development are used:
- CUDA 11.3, powered by NVIDIA-driver-R470
Note: for prototyping you only need a CUDA-compatible card. For training on large-scale datasets, cloud resources should be used.
Note: make sure you have installed the exact NVIDIA driver version above on your local machine. Otherwise the driver could be incompatible with the GPU container.
- First, use the command below to make sure you have a CUDA-compatible NVIDIA display card on your local machine:
sudo lshw -C display
- Follow this guide, Install Nvidia driver using GUI method # 1 on Ubuntu Linux, to enable the latest NVIDIA driver on your local machine.
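After installation, nvidia-smi on the host should report an R470 driver. A quick check using standard nvidia-smi query options:
# confirm the host driver version and GPU name:
nvidia-smi --query-gpu=driver_version,name --format=csv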
Use the command below to install the NVIDIA Container Toolkit, which enables the NVIDIA runtime for Docker, on your local machine:
# add repository:
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
# install:
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit nvidia-docker2
# restart docker:
sudo systemctl restart docker
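Before pulling any GPU image, you can confirm that Docker now knows about the NVIDIA runtime; the grep filter below is just illustrative:
# 'nvidia' should be listed among the runtimes:
docker info | grep -i runtime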
Note: for developers in China, please also add a Docker registry mirror from Alibaba Cloud to accelerate image fetching:
- Open the Docker daemon config:
sudo gedit /etc/docker/daemon.json
- Add registry-mirrors, keeping the NVIDIA runtime entry:
{
    "registry-mirrors": ["https://[YOUR_IMAGE_ACCELERATOR_ID].mirror.aliyuncs.com/"],
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
- Restart Docker:
# restart docker:
sudo systemctl restart docker
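After the restart, you can confirm the mirror configuration was picked up; the grep pattern is illustrative:
# the configured mirror should appear under Registry Mirrors:
docker info | grep -A 1 "Registry Mirrors"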
Use the following command to ensure you have all the dependencies ready:
# run nvidia-smi inside base gpu docker:
docker run --rm --gpus all nvidia/cuda:11.3-cudnn8-devel-ubuntu18.04 nvidia-smi
# expected output:
Mon Feb 21 16:04:39 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.103.01 Driver Version: 470.103.01 CUDA Version: 11.4 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Quadro RTX 5000 Off | 00000000:01:00.0 Off | N/A |
| N/A 58C P8 2W / N/A | 550MiB / 16125MiB | 6% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 2329 G /usr/lib/xorg/Xorg 49MiB |
| 0 N/A N/A 2882 G /usr/bin/gnome-shell 85MiB |
| 0 N/A N/A 3813 G /usr/lib/xorg/Xorg 202MiB |
| 0 N/A N/A 3992 G /usr/bin/gnome-shell 54MiB |
| 0 N/A N/A 5596 G ...AAAAAAAAA= --shared-files 88MiB |
| 0 N/A N/A 14806 G ...AAAAAAAAA= --shared-files 62MiB |
+-----------------------------------------------------------------------------+
Now you are ready to explore the web workspace for PyTorch deep learning prototyping development.
Execute the command below at the root of the repo to launch the development environment.
Note: a bash script is still used due to Docker-Compose's limited support for the NVIDIA Container Toolkit.
# launch GPU workspace:
bash ./workspace-bionic-gpu-vnc.sh
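If you are curious what the script does under the hood, below is a minimal sketch of the core docker run call; the image name and volume mount are assumptions for illustration, and the actual values live in workspace-bionic-gpu-vnc.sh:
# hypothetical core of workspace-bionic-gpu-vnc.sh:
docker run -d --gpus all \
    -p 40080:80 \
    -v "$(pwd)/workspace:/workspace" \
    [YOUR_WORKSPACE_IMAGE]:bionic-gpu-vnc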
You can identify the running workspace with the following command:
# list the running docker instances:
docker ps -a
Now go to http://localhost:40080/ to access the web workspace. This is a virtual desktop powered by noVNC.
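If the desktop does not load, a quick reachability check from the host, assuming the default port mapping above:
# the noVNC endpoint should answer with an HTTP response:
curl -I http://localhost:40080/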
First, add a remote server configuration as follows:
Then, add a remote Python interpreter as follows:
Finally:
- Add PYTHONPATH to the Run/Debug Configuration (a one-liner equivalent is sketched after this list):
# execute this inside the Docker web workspace, with the target conda env activated:
(object-detection) root@4fc082c3a132:~# python
Python 3.6.12 |Anaconda, Inc.| (default, Sep  8 2020, 23:10:56)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import detectron2
>>> detectron2.__version__
'0.3'
>>> detectron2.__file__
'/opt/anaconda/envs/object-detection/lib/python3.6/site-packages/detectron2/__init__.py'
# set PYTHONPATH in Run/Debug Configuration using the value found above:
export PYTHONPATH=/opt/anaconda/envs/object-detection/lib/python3.6/site-packages
- Map the local path to the remote path according to the volume mount config.
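As referenced above, the interactive session can be replaced with a one-liner; a minimal sketch, assuming detectron2 is importable in the activated conda env:
# print the site-packages directory to use as PYTHONPATH:
python -c "import os.path, detectron2; print(os.path.dirname(os.path.dirname(detectron2.__file__)))"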
Verify that everything is in place by running a demo detection:
Input | Output
---|---
![]() | ![]()
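For reference, a demo detection can be run with detectron2's bundled demo script. The example below follows the upstream detectron2 getting-started instructions; the config file, weights URL, and image file names are illustrative and assume you run it from a detectron2 checkout:
# run a demo detection on a sample image (paths are illustrative):
python demo/demo.py \
    --config-file configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml \
    --input input.jpg --output output.jpg \
    --opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl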