
Serving a model using a custom container, instance runs out of disk #112

Open
HamidShojanazeri opened this issue Nov 19, 2021 · 4 comments
@HamidShojanazeri

Describe the bug
Using a custom container to serve a PyTorch model, defined as below, throws "No space left on device":

# sm is a boto3 SageMaker client, e.g.:
#   import boto3
#   sm = boto3.client("sagemaker")
container = {"Image": image, "ModelDataUrl": model_artifact}

create_model_response = sm.create_model(
    ModelName=model_name, ExecutionRoleArn=role, PrimaryContainer=container
)

create_endpoint_config_response = sm.create_endpoint_config(
    EndpointConfigName=endpoint_config_name,
    ProductionVariants=[
        {
            "InstanceType": "ml.g4dn.8xlarge",
            "InitialVariantWeight": 1,
            "InitialInstanceCount": 1,
            "ModelName": model_name,
            "VariantName": "AllTraffic",
        }
    ],
)

The Docker image is 17 GB and the TorchServe .mar file is 8 GB. I was wondering if there is any way to increase the storage for the instances that serve the model. Going through the docs for endpoint configuration, there seems to be no setting for instance-specific storage.
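As a rough pre-deployment sanity check, the combined footprint of the image and the model artifact can be compared against the host volume before creating the endpoint. A minimal sketch (function name, margin, and the 30 GB volume figure are illustrative; the image and artifact sizes are the ones from this issue):

```python
def fits_on_volume(image_gb, model_gb, volume_gb, safety_margin_gb=2.0):
    """Rough check: do the Docker image and the model artifact fit on the
    host storage volume, with some scratch space to spare?"""
    return image_gb + model_gb + safety_margin_gb <= volume_gb

# Sizes from this issue: 17 GB image, 8 GB .mar, ~30 GB GPU host volume.
fits = fits_on_volume(image_gb=17, model_gb=8, volume_gb=30)
```

With these numbers the check barely passes, which matches how close this deployment is to the limit.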

-- CloudWatch log

(screenshot: "No space left on device" error)

Expected behavior

Having knobs to set the storage for the serving instances.

@HamidShojanazeri
Author

cc @nskool


HamidShojanazeri commented Nov 19, 2021

I believe exposing a few knobs for some of these settings, including storage for the host instances, would be helpful. Thanks @lxning for the offline discussion; it would be great if this could be added as a feature to the SageMaker SDK.


lxning (Contributor) commented Nov 19, 2021

According to the SM hosting team, the SM SDK currently does not support storage size configuration. The only available workaround is to change the instance type. Please refer to the host-instance-storage-volumes-table.
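Since storage is tied to the instance type, one way to apply this workaround is to pick the instance type programmatically from a lookup of host volume sizes. A minimal sketch, assuming a hand-maintained table; the sizes below are placeholders, and the real values must come from the host-instance-storage-volumes table:

```python
# Placeholder volume sizes -- replace with the figures from the official
# host-instance-storage-volumes table before using this for real.
HOST_STORAGE_GB = {
    "ml.g4dn.8xlarge": 30,
    "ml.g4dn.12xlarge": 50,
}

def pick_instance_type(required_gb, candidates=HOST_STORAGE_GB):
    """Return the smallest candidate whose host volume fits the
    deployment footprint, or None if nothing fits."""
    for instance_type, volume_gb in sorted(candidates.items(),
                                           key=lambda kv: kv[1]):
        if volume_gb >= required_gb:
            return instance_type
    return None
```

This only shuffles the problem onto a bigger instance, which is why a real storage knob in the SDK would be preferable.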


HamidShojanazeri commented Nov 19, 2021

@lxning this is a limiting factor, as it is easy to hit the limit, mostly the 30 GB on GPU instances: some NVIDIA Docker images, like the one in this case, can reach 21 GB, and heavier workloads that chain multiple models can end up with a model_artifact large enough to exceed the limit.
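To make the constraint concrete with the numbers from this thread: TorchServe typically extracts the .mar archive into its model store, so the artifact can occupy roughly twice its size on disk (archive plus extracted contents) -- that doubling is an assumption worth verifying for a given setup. A quick back-of-the-envelope sketch:

```python
def headroom_gb(volume_gb, image_gb, mar_gb, mar_unpacks=True):
    """Free space left on the host volume. Assumes the extracted .mar
    roughly doubles the artifact's on-disk footprint."""
    model_footprint = mar_gb * 2 if mar_unpacks else mar_gb
    return volume_gb - image_gb - model_footprint

# Numbers from this thread: 17 GB image, 8 GB .mar, 30 GB volume.
# With extraction, the footprint is 17 + 16 = 33 GB -- already over 30 GB.
deficit = headroom_gb(volume_gb=30, image_gb=17, mar_gb=8)
```

Under that assumption, this deployment is over the limit even before any runtime scratch space is counted.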
