
Commit

Update instructions to build with nvidia cuda runtime image for ONNX (#2435)

* Update instructions to build with nvidia cuda runtime image for docker

* updated deepspeed documentation

* updated deepspeed documentation

* updated deepspeed documentation

* added example command

* Lint failure

* changed variable name

* Exit if -bi and -g are specified

---------

Co-authored-by: Mark Saroufim <[email protected]>
agunapal and msaroufim authored Jul 29, 2023
1 parent e2cd91b commit 35ef00f
Showing 4 changed files with 31 additions and 2 deletions.
14 changes: 13 additions & 1 deletion docker/README.md
@@ -34,6 +34,7 @@ Use `build_image.sh` script to build the docker images. The script builds the `p
|-h, --help|Show script help|
|-b, --branch_name|Specify a branch name to use. Default: master |
|-g, --gpu|Build image with GPU based ubuntu base image|
|-bi, --baseimage|Specify base docker image. Example: nvidia/cuda:11.8.0-cudnn8-runtime-ubuntu20.04|
|-bt, --buildtype|Which type of docker image to build. Can be one of : production, dev, ci, codebuild|
|-t, --tag|Tag name for image. If not specified, script uses torchserve default tag names.|
|-cv, --cudaversion|Specify the cuda version to use. Supported values `cu92`, `cu101`, `cu102`, `cu111`, `cu113`, `cu116`, `cu117`, `cu118`. Default `cu117`|
@@ -55,8 +56,10 @@ Creates a docker image with publicly available `torchserve` and `torch-model-arc

- To create a GPU based image with a specific cuda version. Options are `cu92`, `cu101`, `cu102`, `cu111`, `cu113`, `cu116`, `cu117`, `cu118`

- GPU images are built with an NVIDIA CUDA base image. If you want to use ONNX, please specify the base image as shown in the next section.

```bash
- ./build_image.sh -g -cv cu102
+ ./build_image.sh -g -cv cu117
```

- To create an image with a custom tag
@@ -65,6 +68,15 @@ Creates a docker image with publicly available `torchserve` and `torch-model-arc
./build_image.sh -t torchserve:1.0
```

**NVIDIA CUDA RUNTIME BASE IMAGE**

To make use of ONNX, we need to use the [NVIDIA CUDA runtime](https://github.com/NVIDIA/nvidia-docker/wiki/CUDA) as the base image.
Note that this will increase the size of your Docker image.

```bash
./build_image.sh -bi nvidia/cuda:11.7.0-cudnn8-runtime-ubuntu20.04 -cv cu117
```
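Under the hood, the `-bi` value is forwarded to `docker build` as the `BASE_IMAGE` build argument. A minimal sketch of the command the script ends up assembling (the tag name `torchserve:onnx-gpu` here is illustrative, not a script default), printed as a dry run so it can be inspected before running:

```shell
# Sketch of the docker build invocation build_image.sh assembles when -bi
# overrides the base image; the tag name is illustrative.
BASE_IMAGE="nvidia/cuda:11.7.0-cudnn8-runtime-ubuntu20.04"
CUDA_VERSION="cu117"
DOCKER_TAG="torchserve:onnx-gpu"
BUILD_CMD="DOCKER_BUILDKIT=1 docker build --file Dockerfile \
--build-arg BASE_IMAGE=${BASE_IMAGE} \
--build-arg CUDA_VERSION=${CUDA_VERSION} \
-t ${DOCKER_TAG} --target production-image ."
# Print rather than execute, so the command can be reviewed first
echo "${BUILD_CMD}"
```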

**DEVELOPER ENVIRONMENT IMAGES**

Creates a docker image with `torchserve` and `torch-model-archiver` installed from source.
14 changes: 14 additions & 0 deletions docker/build_image.sh
@@ -7,6 +7,7 @@ BRANCH_NAME="master"
DOCKER_TAG="pytorch/torchserve:latest-cpu"
BUILD_TYPE="production"
BASE_IMAGE="ubuntu:20.04"
UPDATE_BASE_IMAGE=false
USE_CUSTOM_TAG=false
CUDA_VERSION=""
USE_LOCAL_SERVE_FOLDER=false
@@ -22,6 +23,7 @@ do
echo "-h, --help show brief help"
echo "-b, --branch_name=BRANCH_NAME specify a branch_name to use"
echo "-g, --gpu specify to use gpu"
echo "-bi, --baseimage specify base docker image. Example: nvidia/cuda:11.7.0-cudnn8-runtime-ubuntu20.04 "
echo "-bt, --buildtype           specify the type of image to build. Possible values: production, dev, codebuild."
echo "-cv, --cudaversion         specify the cuda version to use"
echo "-t, --tag specify tag name for docker image"
@@ -49,6 +51,12 @@ do
CUDA_VERSION="cu117"
shift
;;
-bi|--baseimage)
BASE_IMAGE="$2"
UPDATE_BASE_IMAGE=true
shift
shift
;;
-bt|--buildtype)
BUILD_TYPE="$2"
shift
@@ -141,6 +149,12 @@ then
DOCKER_TAG=${CUSTOM_TAG}
fi

if [[ $UPDATE_BASE_IMAGE == true && $MACHINE == "gpu" ]];
then
echo "Incompatible options: -bi doesn't work with -g option"
exit 1
fi

if [ "${BUILD_TYPE}" == "production" ]
then
DOCKER_BUILDKIT=1 docker build --file Dockerfile --build-arg BASE_IMAGE="${BASE_IMAGE}" --build-arg CUDA_VERSION="${CUDA_VERSION}" --build-arg PYTHON_VERSION="${PYTHON_VERSION}" --build-arg BUILD_NIGHTLY="${BUILD_NIGHTLY}" -t "${DOCKER_TAG}" --target production-image .
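The option handling and the new `-bi`/`-g` guard added above can be sketched as a standalone fragment. The parse loop below is simplified from the script, and the function name `parse_build_flags` is ours, not the script's:

```shell
# Simplified sketch of build_image.sh option handling: -bi consumes a value,
# and combining a custom base image with -g aborts the build.
parse_build_flags() {
  BASE_IMAGE="ubuntu:20.04"
  UPDATE_BASE_IMAGE=false
  MACHINE=cpu
  while [ "$#" -gt 0 ]; do
    case "$1" in
      -g|--gpu)
        MACHINE=gpu
        shift
        ;;
      -bi|--baseimage)
        BASE_IMAGE="$2"
        UPDATE_BASE_IMAGE=true
        shift 2
        ;;
      *)
        shift
        ;;
    esac
  done
  # Mirror the script's guard: -bi and -g are mutually exclusive
  if [ "${UPDATE_BASE_IMAGE}" = true ] && [ "${MACHINE}" = gpu ]; then
    echo "Incompatible options: -bi doesn't work with -g option"
    return 1
  fi
  echo "${BASE_IMAGE}"
}
```

For example, `parse_build_flags -bi nvidia/cuda:11.7.0-cudnn8-runtime-ubuntu20.04` prints the chosen base image, while adding `-g` fails with the incompatibility message.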
4 changes: 3 additions & 1 deletion docs/performance_guide.md
@@ -16,6 +16,8 @@ At a high level what TorchServe allows you to do is
2. Load those weights from `base_handler.py` using `ort_session = ort.InferenceSession(self.model_pt_path, providers=providers, sess_options=sess_options)` which supports reasonable defaults for both CPU and GPU inference
3. Allow you to define custom pre and post processing functions with a custom handler, to pass in data in the format your onnx model expects

To use ONNX with GPU on TorchServe Docker, we need to build an image with [NVIDIA CUDA runtime](https://github.com/NVIDIA/nvidia-docker/wiki/CUDA) as the base image as shown [here](https://github.com/pytorch/serve/blob/master/docker/README.md#create-torchserve-docker-image)

<h4>TensorRT</h4>

TorchServe also supports models optimized via TensorRT. To leverage the TensorRT runtime you can convert your model by [following these instructions](https://github.com/pytorch/TensorRT) and once you're done you'll have serialized weights which you can load with [`torch.jit.load()`](https://pytorch.org/TensorRT/getting_started/getting_started_with_python_api.html#getting-started-with-python-api).
@@ -77,7 +79,7 @@ You can find more information on TorchServe benchmarking [here](https://github.c

TorchServe has native support for the PyTorch profiler which will help you find performance bottlenecks in your code.

If you created a custom `handle` or `initialize` method overriding the BaseHandler, you must define the `self.manifest` attribute to be able to run `_infer_with_profiler`.

```bash
export ENABLE_TORCH_PROFILER=TRUE
1 change: 1 addition & 0 deletions ts_scripts/spellcheck_conf/wordlist.txt
@@ -1065,5 +1065,6 @@ ActionSLAM
statins
ci
chatGPT
baseimage
cuDNN
Xformer
