jigangkim/nvidia-gpu-scheduler



Manage multiple NVIDIA GPU compute tasks

Supports per-GPU compute limits (number of processes, utilization rate, memory usage) on a per-UNIX-user or per-worker basis, load balancing, multiple nodes (machines), and more.

Tested on tensorflow-gpu tasks.
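
To make the idea of per-GPU limits and load balancing concrete, here is a minimal sketch in plain Python. It is not this package's API: the names `GpuLimits`, `GpuStatus`, `within_limits` and `pick_gpu` are hypothetical, and the selection rule (fewest processes first, then lowest utilization, then lowest memory usage) is an assumption chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GpuLimits:
    """Hypothetical per-GPU caps for one user or worker."""
    max_processes: int      # maximum number of compute processes on a GPU
    max_utilization: float  # utilization rate cap, in percent (0-100)
    max_memory: float       # memory usage cap, in percent (0-100)


@dataclass
class GpuStatus:
    """Hypothetical snapshot of one GPU's current load."""
    index: int
    num_processes: int
    utilization: float  # percent
    memory: float       # percent


def within_limits(gpu: GpuStatus, limits: GpuLimits) -> bool:
    """Return True if the GPU is strictly below every cap."""
    return (
        gpu.num_processes < limits.max_processes
        and gpu.utilization < limits.max_utilization
        and gpu.memory < limits.max_memory
    )


def pick_gpu(gpus: List[GpuStatus], limits: GpuLimits) -> Optional[int]:
    """Pick the least-loaded GPU that is within limits, or None if none qualify."""
    candidates = [g for g in gpus if within_limits(g, limits)]
    if not candidates:
        return None
    # Assumed load-balancing rule: fewest processes, then utilization, then memory.
    best = min(candidates, key=lambda g: (g.num_processes, g.utilization, g.memory))
    return best.index
```

With a cap of 2 processes, 80% utilization and 90% memory, a GPU that already runs 2 processes or sits at 95% memory would be skipped, and the task would go to the remaining GPU with the lightest load.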


Installation (a virtual Python environment such as venv or conda is recommended)

cd /path/to/install
git clone https://github.com/jigangkim/nvidia-gpu-scheduler.git
cd /path/to/install/nvidia-gpu-scheduler

# Choose one of the following:
pip install . # standard installation
pip install -e . # editable (develop mode) installation
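
For readers who prefer to script the recommended virtual environment, the following is a minimal sketch using Python's standard `venv` module. The helper name `create_env` and the directory layout in the usage note are assumptions for illustration; the package itself does not require this step.

```python
import os
import venv
from pathlib import Path


def create_env(env_dir: str, with_pip: bool = True) -> Path:
    """Create a virtual environment and return the path to its Python interpreter."""
    venv.create(env_dir, with_pip=with_pip)
    # The interpreter lives in "Scripts" on Windows and "bin" elsewhere.
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    exe = "python.exe" if os.name == "nt" else "python"
    return Path(env_dir) / bin_dir / exe
```

Calling `create_env("/path/to/install/env")` and then running the returned interpreter with `-m pip install .` from the cloned repository would install the package into that environment.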

Usage (dummy example: json)

cd /path/to/install/nvidia-gpu-scheduler
# Run job server
python example.py --identity scheduler --config_ext .json
# Run worker
python example.py --identity worker --config_ext .json

Usage (dummy example: gin)

cd /path/to/install/nvidia-gpu-scheduler
# Run job server
python example.py --identity scheduler --config_ext .gin
# Run worker
python example.py --identity worker --config_ext .gin

Usage (OpenAI baselines example)

cd /path/to/install/nvidia-gpu-scheduler
# Run job server
python example_openaibaselines.py --identity scheduler
# Run worker
python example_openaibaselines.py --identity worker
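
The three usage examples above differ only in the script name and the optional `--config_ext` flag, so the commands can be assembled programmatically. The sketch below assumes a hypothetical helper `build_command`; only the script names and the `--identity` and `--config_ext` flags come from the examples above.

```python
import sys
from typing import List, Optional


def build_command(script: str, identity: str, config_ext: Optional[str] = None) -> List[str]:
    """Build the argument list for one scheduler or worker process."""
    if identity not in ("scheduler", "worker"):
        raise ValueError("identity must be 'scheduler' or 'worker'")
    # Use the current interpreter so the active virtual environment is respected.
    cmd = [sys.executable, script, "--identity", identity]
    if config_ext is not None:
        cmd += ["--config_ext", config_ext]
    return cmd
```

For example, `build_command("example.py", "scheduler", ".gin")` yields the job server command for the gin example, and `build_command("example_openaibaselines.py", "worker")` yields the worker command for the OpenAI baselines example. Passing each list to `subprocess.Popen` from the repository root would start the corresponding process.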