
TorchServe quick start example #3040

Open
wants to merge 4 commits into master
Conversation

agunapal
Collaborator

@agunapal agunapal commented Mar 23, 2024

Description

The goal of this PR is to enable TorchServe users to quickly deploy a model for serving with a single script, without having to install anything manually.

This PR adds the following quick start example for TorchServe:

./examples/getting_started/build_image.sh vit  # optional arg --torch.compile

docker run --rm -it --env TORCH_COMPILE=false --env MODEL_NAME=vit --platform linux/amd64 -p 127.0.0.1:8080:8080 -v /home/ubuntu/serve/model_store:/home/model-server/model-store pytorch/torchserve:demo

# In another terminal, run the following command for inference
curl http://127.0.0.1:8080/predictions/vit -T ./examples/image_classifier/kitten.jpg
  • For BERT models, set the following:
export HUGGINGFACE_TOKEN=<Your token>
  • Supports the following models:
resnet, densenet, vit, fasterrcnn, bertsc, berttc, bertqa, berttg
  • If an NVIDIA GPU is present, the script automatically picks the GPU image and uses the GPU
  • It also supports torch.compile
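As a rough illustration of the torch.compile toggle above, a handler might gate compilation on the TORCH_COMPILE variable passed via `docker run --env TORCH_COMPILE=...`. This is a minimal sketch, not the PR's actual handler code; the accepted truthy values are an assumption.

```python
import os

def use_torch_compile() -> bool:
    # Reads the TORCH_COMPILE flag set via `docker run --env TORCH_COMPILE=...`.
    # The set of truthy values accepted here is an assumption, not the PR's exact logic.
    return os.environ.get("TORCH_COMPILE", "false").strip().lower() in ("1", "true", "yes")

# In a handler, the model would then be wrapped conditionally, e.g.:
#   if use_torch_compile():
#       model = torch.compile(model)
```

Keeping the flag as an env variable means the same image works with and without compilation, switched at `docker run` time.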

Fixes #(issue)

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

Feature/Issue validation/testing

Please describe the Unit or Integration tests that you ran to verify your changes and relevant result summary. Provide instructions so it can be reproduced.
Please also list any relevant details for your test configuration.

  • Inference
curl http://127.0.0.1:8080/predictions/vit -T ./examples/image_classifier/kitten.jpg
{
  "tabby": 0.5469660758972168,
  "tiger_cat": 0.23448117077350616,
  "Egyptian_cat": 0.1045909970998764,
  "lynx": 0.0010765091283246875,
  "Persian_cat": 0.00040055206045508385
}
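For scripting against the endpoint, the JSON response above can be post-processed with the standard library. A minimal sketch (the scores are copied from the sample output above):

```python
import json

# Sample response body from `curl .../predictions/vit -T kitten.jpg` (as shown above)
response = """
{
  "tabby": 0.5469660758972168,
  "tiger_cat": 0.23448117077350616,
  "Egyptian_cat": 0.1045909970998764,
  "lynx": 0.0010765091283246875,
  "Persian_cat": 0.00040055206045508385
}
"""

scores = json.loads(response)
# Pick the highest-confidence label
top_label = max(scores, key=scores.get)
print(top_label)  # tabby
```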

Checklist:

  • Did you have fun?
  • Have you added tests that prove your fix is effective or that this feature works?
  • Has code been commented, particularly in hard-to-understand areas?
  • Have you made corresponding changes to the documentation?

@agunapal agunapal marked this pull request as ready for review March 23, 2024 23:37
@agunapal agunapal requested a review from msaroufim March 23, 2024 23:37
@lxning
Collaborator

lxning commented Mar 25, 2024

This example contains a Dockerfile and the config + handler for BERT, ResNet, and DenseNet. It's not clear how customers should follow this example, since TorchServe already has a directory for the Dockerfile and model handlers for these models.

We need to have clear guidance and an implementation to support model cards.

@msaroufim (Member) left a comment


Personally I like the direction this is going in. Recommending Docker early on to users feels like a natural thing, since TorchServe will just work on any device.

I also like that making the compiler flag an env variable will make trying things out quite simple.

What I don't like is that the BERT examples still have these uninspiring labels for "accepted" vs "not accepted", which is confusing because it might make people think it's a system-level error instead of the expected output of the model. I would also want to see the code duplication issues fixed. There's also still a link to an existing getting started guide on the main page; let's get rid of that and the tutorial.

Otherwise, once this is in I do expect this to be the default way people first use TorchServe. Presumably you're doing llama7b and 70b in a future PR?

@@ -0,0 +1,4 @@
{

these labels have been confusing forever, can we do something else?

@@ -0,0 +1,538 @@
import ast

is this copy-pasted from the examples transformer handler? The duplication will hurt us; I would recommend having this in one place

@@ -0,0 +1,188 @@
import argparse

same point on duplication here

COPY $EXAMPLE_DIR/config.properties /home/model-server/config.properties

WORKDIR /home/model-server/getting_started
RUN chmod +x /usr/local/bin/dockerd-entrypoint.sh \

cool!
