Launch TorchServe without repackaging model contents #117
Comments
Hi @fm1ch4, I am working on the changes you suggested. If I understand correctly, we don't need the `_adapt_to_ts_format` function anymore. This function was invoking the TS model archiver here: sagemaker-pytorch-inference-toolkit/src/sagemaker_pytorch_serving_container/torchserve.py, lines 164 to 167 at d3fd2f3.

What should the model name be in the config file after we remove the formatting function? I don't quite understand how TorchServe processes the parameter `--models model=environment.model_dir` to generate the model name.
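My current reading (unverified against TorchServe's source) is that `--models` takes space-separated `name=path` pairs and registers each model under the name before the `=`. A minimal sketch of that interpretation:

```python
# A minimal sketch (my reading, not TorchServe's actual code) of how the
# --models argument appears to be interpreted: a space-separated list of
# name=path pairs, where the token before '=' becomes the registered model
# name. So "--models model=/opt/ml/model" would register the name "model".

def parse_models_arg(models_arg: str) -> dict:
    """Split a --models value into {model_name: model_path}."""
    registrations = {}
    for pair in models_arg.split():
        name, _, path = pair.partition("=")
        registrations[name] = path
    return registrations

print(parse_models_arg("model=/opt/ml/model"))  # {'model': '/opt/ml/model'}
```

If that reading is right, the model name in the config file would simply be `model`.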
Hey @mseth10, thanks for taking a look.
Hi @mseth10, do we have an estimate for when this will be fixed?
Hi @Qingzi-Lan, we are currently seeing some specific test failures, as mentioned in #118 (comment). Can you help us understand why these tests would pass with the DLC image but fail with the generic image? Also, can we skip them and get the PR merged to unblock the DLC customers?
Which image are the tests using as the generic image?
Currently, the startup code repackages the model contents in `environment.model_dir` into TorchServe format using the TS model archiver: https://github.com/aws/sagemaker-pytorch-inference-toolkit/blob/master/src/sagemaker_pytorch_serving_container/torchserve.py#L78

This causes the model contents to be read and rewritten to disk on container startup, which increases container startup time. For SageMaker Serverless Inference, this makes cold starts longer (and even longer for larger models).
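For illustration, the repackaging step looks roughly like the following (a simplified sketch, not the toolkit's exact code; the handler path is illustrative):

```python
# Simplified sketch of the repackaging described above; not the toolkit's
# exact code. It shells out to the TS model archiver, which reads everything
# under model_dir and rewrites it into TorchServe's layout on disk.
import subprocess

def adapt_to_ts_format(model_dir: str, model_store: str) -> None:
    subprocess.check_call([
        "torch-model-archiver",
        "--model-name", "model",
        "--version", "1.0",
        "--handler", "/path/to/handler_service.py",  # illustrative handler path
        "--extra-files", model_dir,                  # copies all model contents
        "--export-path", model_store,
        "--archive-format", "no-archive",
    ])
```

The cost is that everything under `model_dir` gets read and written back out before TorchServe even starts, which is exactly the startup overhead described above.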
TorchServe is making a change to support loading models from a directory without the need to repackage the model as a .mar file: pytorch/serve#1498
This issue is a request for this inference toolkit to use the new feature and avoid repackaging the model contents. I /think/ this should be as simple as removing the model archiver command execution and setting `--models` in the TorchServe command to `--models model=environment.model_dir`.
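If that works, the startup could reduce to something like this sketch (assuming the directory-loading support from pytorch/serve#1498 is available; the model-store and config paths are illustrative):

```python
# Sketch of the proposed startup: point TorchServe directly at the model
# directory, skipping the archiver entirely. Assumes TorchServe can load a
# model from a plain directory (pytorch/serve#1498). Paths are illustrative.
import subprocess

def start_torchserve(model_dir: str) -> None:
    subprocess.check_call([
        "torchserve",
        "--start",
        "--model-store", model_dir,                     # illustrative model store
        "--models", f"model={model_dir}",               # no .mar repackaging step
        "--ts-config", "/etc/sagemaker-ts.properties",  # illustrative config path
    ])

start_torchserve("/opt/ml/model")
```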