License: MIT

Deploying Google PaliGemma on Amazon SageMaker

This tutorial guides you through deploying the Google PaliGemma vision-language model on Amazon SageMaker as a real-time inference endpoint, using a Jupyter notebook.

What is PaliGemma?

PaliGemma is a vision-language model developed by Google. It pairs a vision encoder with the Gemma language model and can perform a variety of tasks, including image captioning, object detection, and image segmentation, depending on the prompt.
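
The task is selected through the text prompt. For the "mix" checkpoint used in this tutorial, prompts follow simple task prefixes; the list below is purely illustrative, based on the prompt format used later in this tutorial:

# example task prompts for a PaliGemma "mix" checkpoint
example_prompts = [
    "caption en",                           # image captioning in English
    "detect dog",                           # object detection (used later in this tutorial)
    "segment dog",                          # segmentation of the named object
    "answer en how many dogs are there?",   # visual question answering
]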

Prerequisites

Before proceeding with this tutorial, ensure that you have the following:

  - An AWS account with permissions to create SageMaker resources, and a SageMaker execution role
  - A Hugging Face account and access token, with the PaliGemma license terms accepted on the model page
  - A Jupyter environment in which to run the notebook (for example, SageMaker Studio or a SageMaker notebook instance)

Step-by-Step Guide

  1. Set up the environment: The notebook starts by installing the required Python packages, such as huggingface_hub and sagemaker.
!pip install --upgrade huggingface_hub
!pip install --upgrade sagemaker
  2. Authenticate with AWS: The notebook authenticates with your AWS account by obtaining the SageMaker execution role and default bucket. This step is necessary for SageMaker to have the required permissions to create resources.
import sagemaker
import boto3
sess = sagemaker.Session()
# ... (code omitted for brevity)
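
The omitted code resolves the execution role and session details. A minimal sketch, assuming the notebook runs inside a SageMaker-managed environment (where get_execution_role() resolves directly); the fallback IAM role name is purely illustrative:

import sagemaker
import boto3

sess = sagemaker.Session()
try:
    # works inside SageMaker-managed notebooks (Studio, notebook instances)
    role = sagemaker.get_execution_role()
except ValueError:
    # outside SageMaker, look the role up by name; the role name here is illustrative
    iam = boto3.client("iam")
    role = iam.get_role(RoleName="sagemaker_execution_role")["Role"]["Arn"]

print(f"sagemaker role arn: {role}")
print(f"sagemaker bucket: {sess.default_bucket()}")
print(f"sagemaker session region: {sess.boto_region_name}")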
  3. Prepare the model directory: The notebook creates a directory structure to store the PaliGemma model files. It also writes the necessary code files (inference.py and requirements.txt) to this directory.
!mkdir code
%%writefile code/requirements.txt
accelerate
bitsandbytes
git+https://github.com/huggingface/transformers.git
Pillow
%%writefile code/inference.py
# ... (inference script code omitted for brevity)
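
The inference script itself is omitted above. As an illustration of what such a custom handler for PaliGemma might look like (a sketch, not the repository's actual script), using the model_fn/predict_fn hooks of the SageMaker Hugging Face inference toolkit:

import base64
import io

import torch
from PIL import Image
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration


def model_fn(model_dir):
    # load processor and model from the unpacked model.tar.gz
    processor = AutoProcessor.from_pretrained(model_dir)
    model = PaliGemmaForConditionalGeneration.from_pretrained(
        model_dir, torch_dtype=torch.float16, device_map="auto"
    )
    return model, processor


def predict_fn(data, model_and_processor):
    model, processor = model_and_processor
    # decode the base64-encoded image and build the multimodal input
    image = Image.open(io.BytesIO(base64.b64decode(data["image"]))).convert("RGB")
    inputs = processor(text=data["prompt"], images=image, return_tensors="pt").to(model.device)
    prompt_len = inputs["input_ids"].shape[-1]
    with torch.inference_mode():
        output = model.generate(**inputs, max_new_tokens=100)
    # return only the newly generated tokens, not the echoed prompt
    return {"generated_text": processor.decode(output[0][prompt_len:], skip_special_tokens=True)}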
  4. Download the model snapshot: The notebook downloads the PaliGemma model snapshot from the Hugging Face repository using your access token. This step requires you to accept the terms and conditions for the PaliGemma model on the Hugging Face website.
from distutils.dir_util import copy_tree
from pathlib import Path
from huggingface_hub import snapshot_download
import random
HF_MODEL_ID="google/paligemma-3b-mix-224"
HF_TOKEN="YOUR_HF_TOKEN"

# download snapshot
snapshot_dir = snapshot_download(repo_id=HF_MODEL_ID, token=HF_TOKEN)
# ... (code omitted for brevity)
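
The omitted code assembles a local model directory for archiving, which explains the copy_tree, Path, and random imports above. A plausible sketch, where the staging-directory naming scheme is an assumption:

# create a randomly named staging directory (naming scheme is an assumption)
model_tar = Path(f"model-{random.getrandbits(16)}")
model_tar.mkdir(exist_ok=True)

# copy the downloaded weights and config into the staging directory
copy_tree(snapshot_dir, str(model_tar))
# include inference.py and requirements.txt under code/ inside the archive
copy_tree("code/", str(model_tar.joinpath("code")))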
  5. Create a compressed model archive: The notebook compresses the model directory into a .tar.gz file, which will be uploaded to Amazon S3 for deployment.
import tarfile
import os

def compress(tar_dir=None, output_file="model.tar.gz"):
    # ... (code omitted for brevity)

compress(str(model_tar))
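
The body of compress is omitted above; a minimal sketch that produces the layout SageMaker expects, with the model files at the root of the archive and the inference code under code/:

import os
import tarfile

def compress(tar_dir=None, output_file="model.tar.gz"):
    # add each entry at the archive root; SageMaker unpacks model.tar.gz into model_dir
    with tarfile.open(output_file, "w:gz") as tar:
        for item in os.listdir(tar_dir):
            tar.add(os.path.join(tar_dir, item), arcname=item)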
  6. Upload the model to S3: The compressed model archive is uploaded to an S3 bucket associated with your SageMaker session.
from sagemaker.s3 import S3Uploader

s3_model_uri = S3Uploader.upload(local_path="model.tar.gz", desired_s3_uri=f"s3://{sess.default_bucket()}/paligemma")
  7. Deploy the model to SageMaker: The notebook creates a HuggingFaceModel instance and deploys it to a SageMaker endpoint. This step involves specifying the model data location (S3 URI), execution role, and various configuration parameters like instance type and transformers version.
from sagemaker.huggingface.model import HuggingFaceModel
from datetime import datetime

huggingface_model = HuggingFaceModel(
    model_data=s3_model_uri,
    role=role,
    transformers_version="4.37.0",
    pytorch_version="2.1.0",
    py_version="py310",
)

# timestamp gives the endpoint a unique name across repeated runs
timestamp = datetime.now().strftime("%Y-%m-%d-%H-%M-%S")

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g4dn.xlarge",
    endpoint_name=f"PaliGemma-{timestamp}"
)
  8. Test the deployed model: The notebook demonstrates how to send an object detection request to the deployed SageMaker endpoint. It encodes a test image as base64, includes a textual prompt, and sends both to the endpoint for inference. The response from the endpoint is then printed.
import base64
from PIL import Image

# ... (code omitted for brevity)
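# A plausible version of the omitted encoding step (the filename is hypothetical):
with open("dog.jpg", "rb") as f:
    encoded_input_image = base64.b64encode(f.read()).decode("utf-8")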

payload = {
    "prompt": "detect dog",
    "image": encoded_input_image
}

query_response = predictor.predict(data=payload)
print(query_response)

By following this tutorial and referencing the provided code snippets, you will learn how to deploy the PaliGemma vision-language model on Amazon SageMaker and test it with a sample object detection task. Remember to replace placeholders like YOUR_HF_TOKEN with your actual Hugging Face access token.

  9. Delete the deployed endpoint: run the final command to delete the endpoint and avoid incurring unwanted costs:
# delete endpoint
predictor.delete_endpoint()
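
delete_endpoint() removes the running endpoint (and, by default, its endpoint configuration). To also remove the model resource that deploy created, you can additionally call:

# also remove the SageMaker model resource
predictor.delete_model()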

General Disclaimer

This notebook is intended for demonstration and educational purposes only. It is not designed for production use without further modifications and hardening. Before deploying this endpoint to a production environment, it is crucial to conduct thorough testing, security assessments, and optimizations based on your specific requirements and best practices.

Contributing

Contributions are welcome! Please follow the usual Git workflow:

  1. Fork the repository
  2. Create a new branch for your feature or bug fix
  3. Commit your changes
  4. Push to the branch
  5. Create a new pull request

License

This library is licensed under the MIT-0 License. See the LICENSE file.

Legal Disclaimer

You should consider doing your own independent assessment before using the content in this repository for production purposes. This may include (amongst other things) testing, securing, and optimizing the content provided in this repository, based on your specific quality control practices and standards.
