AlexGeControl/Graph-Neural-Network

Development Workspace for General PyTorch Deep Learning Prototyping

Containerized workspace for General PyTorch Deep Learning Prototyping development.


Prerequisites

Before using the Docker environment, make sure you have the following dependencies on your local machine:

Docker

Please follow the official guide to install Docker on your local machine. Docker provides the virtualization technology behind the isolated workspace.

IMPORTANT: Change Current User Group

In order to run Docker commands without sudo:

  • Execute the following command to add the current user to the docker group:

    sudo usermod -aG docker $USER
  • Log out, then log back in so that the new group membership takes effect
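
To confirm that the group change is active after logging back in, a quick check along the following lines may help. This is only a minimal sketch: `has_group` is a hypothetical helper name, not part of Docker, and it simply inspects the output of `id -nG` for the current session.

```shell
# Hypothetical helper: succeeds if group "$2" appears in the
# space-separated group list "$1".
has_group() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# id -nG prints the group names of the current login session.
if has_group "$(id -nG)" docker; then
    echo "docker group is active for this session"
else
    echo "docker group is not active yet; log out and log in again"
fi
```

If the second message appears right after running usermod, the session most likely predates the group change.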

Docker-Compose

Please follow the official guide to install Docker-Compose, the Docker orchestrator, on your local machine.

NVIDIA Driver for GPU Workspace

Introduction

For prototyping, the following libraries are used for Graph Neural Network development:

  • CUDA 11.3, powered by NVIDIA-driver-R470

Note For prototyping you only need a CUDA-compatible card. For training on large-scale datasets, cloud utilities should be used.

NVIDIA Driver, R470

Note Make sure you have installed this exact version of the NVIDIA driver on your machine. Otherwise the driver could be incompatible with the GPU container.

  • First, use the command below to make sure you have a CUDA-compatible NVIDIA display card on your local machine.

    sudo lshw -C display
  • Follow this guide, Install Nvidia driver using GUI method # 1 on Ubuntu Linux, to enable the R470 NVIDIA driver on your local machine.

NVIDIA Container Toolkit

Use the commands below to install the NVIDIA Container Toolkit, which enables the NVIDIA runtime for Docker, on your local machine:

# add repository:
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)

curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

# install:
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit nvidia-docker2

# restart docker:
sudo systemctl restart docker
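
The `distribution` variable in the install step is built by sourcing `/etc/os-release` in a subshell and concatenating its `ID` and `VERSION_ID` fields. The sketch below reproduces that step against a sample file, so the resulting value can be inspected before it is used in the repository URL. The sample contents (Ubuntu 18.04) and the temporary file are assumptions for illustration only.

```shell
# Sample os-release content (assumed values, for illustration only).
sample=$(mktemp)
cat > "$sample" <<'EOF'
NAME="Ubuntu"
ID=ubuntu
VERSION_ID="18.04"
EOF

# Same pattern as the install step: source the file in a subshell,
# then print ID immediately followed by VERSION_ID.
distribution=$(. "$sample"; echo "$ID$VERSION_ID")
echo "$distribution"   # prints: ubuntu18.04

rm -f "$sample"
```

Running the original one-liner and then `echo $distribution` on the host shows the value that will actually be substituted into the URL.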

Note For developers in China, please also add a Docker registry mirror from Alibaba Cloud to accelerate image fetching:

  • Open the Docker daemon config:

    sudo gedit /etc/docker/daemon.json
  • Add registry-mirrors

    {
        "registry-mirrors": ["https://[YOUR_IMAGE_ACCELERATOR_ID].mirror.aliyuncs.com/"],
        "runtimes": {
            "nvidia": {
                "path": "nvidia-container-runtime",
                "runtimeArgs": []
            }
        }
    }
    
  • Restart docker

    # restart docker:
    sudo systemctl restart docker
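
A malformed `daemon.json` can prevent the Docker daemon from starting, so it may be worth checking the syntax before the restart. The sketch below assumes `python3` is available on the host and uses its standard `json.tool` module. `check_json` is a hypothetical helper name, and `/etc/docker/daemon.json` is the default config location on Linux.

```shell
# Hypothetical helper: succeeds if the file "$1" parses as JSON.
check_json() {
    python3 -m json.tool "$1" > /dev/null 2>&1
}

config=/etc/docker/daemon.json   # default location on Linux
if [ ! -f "$config" ]; then
    echo "$config not found"
elif check_json "$config"; then
    echo "$config is valid JSON"
else
    echo "$config has a syntax error; fix it before restarting docker"
fi
```

This only checks JSON syntax. It does not verify that the mirror URL or the runtime entry is correct.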

Final Verification

Use the following command to ensure you have all the dependencies ready:

# run nvidia-smi inside base gpu docker:
docker run --rm --gpus all nvidia/cuda:11.3-cudnn8-devel-ubuntu18.04 nvidia-smi
# expected output:
Mon Feb 21 16:04:39 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.103.01   Driver Version: 470.103.01   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro RTX 5000     Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   58C    P8     2W /  N/A |    550MiB / 16125MiB |      6%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2329      G   /usr/lib/xorg/Xorg                 49MiB |
|    0   N/A  N/A      2882      G   /usr/bin/gnome-shell               85MiB |
|    0   N/A  N/A      3813      G   /usr/lib/xorg/Xorg                202MiB |
|    0   N/A  N/A      3992      G   /usr/bin/gnome-shell               54MiB |
|    0   N/A  N/A      5596      G   ...AAAAAAAAA= --shared-files       88MiB |
|    0   N/A  N/A     14806      G   ...AAAAAAAAA= --shared-files       62MiB |
+-----------------------------------------------------------------------------+
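
Since the workspace expects the R470 driver branch, the driver version can also be checked on the host with a short script. This is a sketch under assumptions: `driver_major` is a hypothetical helper, and the query relies on the `--query-gpu=driver_version` option of `nvidia-smi`.

```shell
# Hypothetical helper: print the part of a driver version string
# before the first dot, e.g. the major branch number.
driver_major() {
    echo "${1%%.*}"
}

if command -v nvidia-smi > /dev/null; then
    # The query prints one line per GPU; take the first.
    version=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
    if [ "$(driver_major "$version")" = "470" ]; then
        echo "driver $version matches the R470 branch"
    else
        echo "driver $version is not on the R470 branch"
    fi
else
    echo "nvidia-smi not found on this machine"
fi
```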

Up & Running

Now you are ready to explore the web workspace for PyTorch deep learning prototyping development.

Launch Environment

Execute the command below at the root of the repo to launch the development environment.

Note A bash script is still used because Docker-Compose has limited support for the NVIDIA Container Toolkit.

# launch GPU workspace:
bash ./workspace-bionic-gpu-vnc.sh

You can identify the running workspace with the following command:

# list all docker containers (-a also includes stopped ones):
docker ps -a

Access Web Workspace

Now go to http://localhost:40080/ to access the web workspace. This is a virtual desktop powered by noVNC.
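
The workspace can take a moment to come up after the launch script returns. The helper below is a minimal sketch for waiting until the portal responds. It assumes `curl` is installed and that the portal answers plain HTTP requests at the address above. `wait_for_url` is a hypothetical name, not something the repo provides.

```shell
# Hypothetical helper: poll URL "$1" up to "$2" times, sleeping "$3"
# seconds between attempts. Succeeds as soon as one request succeeds.
wait_for_url() {
    url=$1
    attempts=$2
    delay=$3
    i=1
    while [ "$i" -le "$attempts" ]; do
        if curl -fsS -o /dev/null "$url" 2> /dev/null; then
            return 0
        fi
        # No need to sleep after the final failed attempt.
        [ "$i" -lt "$attempts" ] && sleep "$delay"
        i=$((i + 1))
    done
    return 1
}
```

For example, `wait_for_url http://localhost:40080/ 30 2 && echo "workspace is up"` polls for up to roughly one minute.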

PyTorch Deep Learning Prototyping Workspace Portal


PyCharm Remote Debug

Remote Server Config

First, add a remote server configuration as follows:

PyCharm Remote Server Configuration

Remote Python Interpreter

Then, add a remote Python interpreter as follows:

PyCharm Remote Python Interpreter

Run/Debug Configurations

Finally:

  • Add PYTHONPATH to Run/Debug Configuration

    # execute this inside Docker web workspace, with target conda env activated:
    (object-detection) root@4fc082c3a132:~# python
    Python 3.6.12 |Anaconda, Inc.| (default, Sep  8 2020, 23:10:56) 
    [GCC 7.3.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import detectron2
    >>> detectron2.__version__
    '0.3'
    >>> detectron2.__file__
    '/opt/anaconda/envs/object-detection/lib/python3.6/site-packages/detectron2/__init__.py'
    # set PYTHONPATH in Run/Debug Configuration using the value found above:
    export PYTHONPATH=/opt/anaconda/envs/object-detection/lib/python3.6/site-packages
  • Map local path to remote path according to volume mount config:

    PyCharm Run Configuration
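
The PYTHONPATH lookup in the first step can also be done non-interactively. The sketch below assumes `python3` resolves to the interpreter of the activated conda env. `package_root` is a hypothetical helper name. It prints the directory that contains an importable package, which for a package installed into the env is its `site-packages` directory.

```shell
# Hypothetical helper: print the directory containing package "$1",
# i.e. two levels up from the package's __init__.py.
package_root() {
    python3 -c 'import importlib, os, sys
module = importlib.import_module(sys.argv[1])
print(os.path.dirname(os.path.dirname(module.__file__)))' "$1"
}

# Example with a standard-library package; inside the workspace,
# pass detectron2 instead to get the value for PYTHONPATH.
package_root json
```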

Verification

Verify that everything is in place by running a demo detection:

PyCharm Remote Detectron2

Input and output of the demo detection

About

Sandbox for graph neural network learning.
