**VertexAI Model-Registry & Model-Deployer #3161** (Open)

Wants to merge 42 commits into `develop`.

**Commits (42):**
- `2fb8a7d` initial commit on vertex ai deployer and model registry (safoinme, Jun 3, 2024)
- `c03f2a0` vertex model (safoinme, Jun 6, 2024)
- `3c6bbe9` Merge branch 'develop' into feature/vertex-ai-deployer-model-registry (safoinme, Jul 14, 2024)
- `4eeeb27` vertex deployer (safoinme, Jul 15, 2024)
- `7881b69` Merge branch 'develop' into feature/vertex-ai-deployer-model-registry (safoinme, Sep 13, 2024)
- `7c0ca3f` vertex registry code (safoinme, Sep 18, 2024)
- `6769b6c` format (safoinme, Sep 18, 2024)
- `9a03f34` Refactor model registration and add URI parameter (safoinme, Sep 20, 2024)
- `afc5c2b` Refactor model registration and add URI parameter (safoinme, Sep 21, 2024)
- `2dc0d2d` Refactor model registration and remove unnecessary code (safoinme, Sep 21, 2024)
- `5c5bb84` Merge branch 'develop' into feature/vertex-ai-deployer-model-registry (safoinme, Oct 21, 2024)
- `54b6748` Refactor GCP service and flavor classes for Vertex AI deployment (safoinme, Oct 25, 2024)
- `a80f71a` Merge branch 'develop' into feature/vertex-ai-deployer-model-registry (safoinme, Oct 25, 2024)
- `a980449` Refactor Vertex AI model registry and deployer configurations (safoinme, Oct 31, 2024)
- `6e2b660` Merge branch 'develop' into feature/vertex-ai-deployer-model-registry (safoinme, Oct 31, 2024)
- `0a13214` Refactor model deployer configurations and add VertexAI model deployer (safoinme, Oct 31, 2024)
- `53da68d` Refactor model deployer configurations and add VertexAI model deployer (safoinme, Oct 31, 2024)
- `ff015e1` Merge branch 'develop' into feature/vertex-ai-deployer-model-registry (safoinme, Nov 4, 2024)
- `ce2019d` Rename VertexAI model registry classes and update documentation for c… (safoinme, Nov 7, 2024)
- `14f2998` Auto-update of LLM Finetuning template (actions-user, Nov 7, 2024)
- `0b30a61` Auto-update of Starter template (actions-user, Nov 7, 2024)
- `83dfe31` Auto-update of E2E template (actions-user, Nov 7, 2024)
- `7888717` Auto-update of NLP template (actions-user, Nov 7, 2024)
- `72cc93c` Merge branch 'develop' into feature/vertex-ai-deployer-model-registry (safoinme, Nov 7, 2024)
- `70cc4a9` Auto-update of LLM Finetuning template (actions-user, Nov 7, 2024)
- `fcdec6e` Auto-update of Starter template (actions-user, Nov 7, 2024)
- `0108c0f` Auto-update of E2E template (actions-user, Nov 7, 2024)
- `0c33f82` Auto-update of NLP template (actions-user, Nov 7, 2024)
- `4f18ba5` Merge branch 'develop' into feature/vertex-ai-deployer-model-registry (safoinme, Nov 8, 2024)
- `012cd6e` Update default filenames and improve backward compatibility for sklea… (safoinme, Nov 12, 2024)
- `ac2e69a` Auto-update of LLM Finetuning template (actions-user, Nov 12, 2024)
- `3194db3` Enhance Vertex AI Model Registry with model conversion utility and do… (safoinme, Nov 12, 2024)
- `3d558ae` Merge branch 'develop' into feature/vertex-ai-deployer-model-registry (safoinme, Nov 13, 2024)
- `1d5b7fa` Merge branch 'feature/vertex-ai-deployer-model-registry' of https://g… (safoinme, Nov 13, 2024)
- `a4e4b45` Merge branch 'develop' into feature/vertex-ai-deployer-model-registry (htahir1, Dec 3, 2024)
- `32e8059` Merge branch 'develop' into feature/vertex-ai-deployer-model-registry (strickvl, Dec 30, 2024)
- `e58c2f7` Merge branch 'develop' into feature/vertex-ai-deployer-model-registry (htahir1, Jan 23, 2025)
- `e236e8a` Merge branch 'feature/vertex-ai-deployer-model-registry' of https://g… (safoinme, Jan 28, 2025)
- `373177b` Refactor Vertex AI model registry and deployer configurations to enha… (safoinme, Feb 3, 2025)
- `2e9b7c4` Merge branch 'develop' into feature/vertex-ai-deployer-model-registry (safoinme, Feb 3, 2025)
- `8f074ef` refactor: remove direct attribute from ModelRegistryModelMetadata and… (safoinme, Feb 4, 2025)
- `c7a8d15` Merge branch 'develop' into feature/vertex-ai-deployer-model-registry (safoinme, Feb 4, 2025)
**`docs/book/component-guide/model-deployers/vertex.md`** (187 additions, 0 deletions)
# Vertex AI Model Deployer

[Vertex AI](https://cloud.google.com/vertex-ai) provides managed infrastructure for deploying machine learning models at scale. The Vertex AI Model Deployer in ZenML allows you to deploy models to Vertex AI endpoints, providing a scalable and fully managed solution for model serving.

## When to use it?

Use the Vertex AI Model Deployer when:

- You are leveraging Google Cloud Platform (GCP) and wish to integrate with its native ML serving infrastructure.
- You need enterprise-grade model serving capabilities complete with autoscaling and GPU acceleration.
- You require a fully managed solution that abstracts away the operational overhead of serving models.
- You need to deploy models directly from your Vertex AI Model Registry—or even from other registries or artifacts.
- You want seamless integration with GCP services like Cloud Logging, IAM, and VPC.

This deployer is especially useful for production deployments, high-availability serving, and dynamic scaling based on workloads.

{% hint style="info" %}
For best results, the Vertex AI Model Deployer works with a Vertex AI Model Registry in your ZenML stack. This allows you to register models with detailed metadata and configuration and then deploy a specific version seamlessly.
{% endhint %}

## How to deploy it?

The Vertex AI Model Deployer is enabled via the ZenML GCP integration. First, install the integration:

```shell
zenml integration install gcp -y
```

### Authentication and Service Connector Configuration

The deployer requires proper GCP authentication. The recommended approach is to use the ZenML Service Connector:

```shell
# Register the service connector with a service account key
zenml service-connector register vertex_deployer_connector \
--type gcp \
--auth-method=service-account \
--project_id=<PROJECT_ID> \
[email protected] \
--resource-type gcp-generic

# Register the model deployer and connect it to the service connector
zenml model-deployer register vertex_deployer \
--flavor=vertex \
--location=us-central1 \
--connector vertex_deployer_connector
```

{% hint style="info" %}
The service account used for deployment must have the following permissions:
- `Vertex AI User` to enable model deployments
- `Vertex AI Service Agent` for model endpoint management
- `Storage Object Viewer` if the model artifacts reside in Google Cloud Storage
{% endhint %}
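
In practice, the role names above correspond to standard IAM role IDs. A quick sketch of granting them with `gcloud` (project ID and service-account email are placeholders):

```shell
# Grant the roles listed above to the deployment service account
gcloud projects add-iam-policy-binding <PROJECT_ID> \
    --member="serviceAccount:[email protected]" \
    --role="roles/aiplatform.user"
gcloud projects add-iam-policy-binding <PROJECT_ID> \
    --member="serviceAccount:[email protected]" \
    --role="roles/aiplatform.serviceAgent"
gcloud projects add-iam-policy-binding <PROJECT_ID> \
    --member="serviceAccount:[email protected]" \
    --role="roles/storage.objectViewer"
```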

## How to use it

A complete usage example is available in the [ZenML Examples repository](https://github.com/zenml-io/zenml-projects/tree/main/vertex-registry-and-deployer).

### Deploying a Model in a Pipeline

Below is an example of a deployment step that uses the updated configuration options. In this example, the deployment configuration supports:

- **Model versioning**: Explicitly provide the model version (using the full resource name from the model registry).
- **Display name and Sync mode**: Fields such as `display_name` (for a friendly endpoint name) and `sync` (to wait for deployment completion) are now available.
- **Traffic configuration**: Route a certain percentage (e.g., 100%) of traffic to this deployment.
- **Advanced options**: You can still specify custom container settings, resource specifications (including GPU options), and explanation configuration via shared classes from `vertex_base_config.py`.

```python
from typing import Optional

from typing_extensions import Annotated
from zenml import ArtifactConfig, get_step_context, step
from zenml.client import Client
from zenml.integrations.gcp.services.vertex_deployment import (
    VertexDeploymentConfig,
    VertexDeploymentService,
)


@step(enable_cache=False)
def model_deployer(
    model_registry_uri: str,
    is_promoted: bool = False,
) -> Annotated[
    Optional[VertexDeploymentService],
    ArtifactConfig(name="vertex_deployment", is_deployment_artifact=True),
]:
    """Model deployer step.

    Args:
        model_registry_uri: The full resource name of the model in the registry.
        is_promoted: Flag indicating whether the model has been promoted to production.

    Returns:
        The deployed model service, or None if the model was not promoted.
    """
    if not is_promoted:
        # Skip deployment if the model has not been promoted.
        return None

    zenml_client = Client()
    current_model = get_step_context().model
    model_deployer = zenml_client.active_stack.model_deployer

    # Create the deployment configuration with advanced options.
    vertex_deployment_config = VertexDeploymentConfig(
        location="europe-west1",
        name=current_model.name,  # Unique endpoint name in Vertex AI.
        display_name="zenml-vertex-quickstart",
        model_name=model_registry_uri,  # Fully qualified model name (from the model registry).
        model_version=current_model.version,  # Specify the model version explicitly.
        description="An example of deploying a model using the Vertex AI Model Deployer",
        sync=True,  # Wait for the deployment to complete before proceeding.
        traffic_percentage=100,  # Route 100% of traffic to this model version.
        # (Optional) Advanced configurations:
        # container=VertexAIContainerSpec(
        #     image_uri="your-custom-image:latest",
        #     ports=[8080],
        #     env={"ENV_VAR": "value"},
        # ),
        # resources=VertexAIResourceSpec(
        #     accelerator_type="NVIDIA_TESLA_T4",
        #     accelerator_count=1,
        #     machine_type="n1-standard-4",
        #     min_replica_count=1,
        #     max_replica_count=3,
        # ),
        # explanation=VertexAIExplanationSpec(
        #     metadata={"method": "integrated-gradients"},
        #     parameters={"num_integral_steps": 50},
        # ),
    )

    service = model_deployer.deploy_model(
        config=vertex_deployment_config,
        service_type=VertexDeploymentService.SERVICE_TYPE,
    )

    return service
```

*Example: [`model_deployer.py`](../../examples/vertex-registry-and-deployer/steps/model_deployer.py)*
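
Once the step has run, the service can be looked up again outside the pipeline. The sketch below uses ZenML's generic model-deployer lookup; the exact filter arguments accepted by the Vertex deployer may differ, so treat the argument names (and the pipeline/step names) as assumptions:

```python
from zenml.client import Client

# Fetch the active stack's model deployer and look up the service that the
# deployment step created (pipeline and step names here are hypothetical).
model_deployer = Client().active_stack.model_deployer
services = model_deployer.find_model_server(
    pipeline_name="deployment_pipeline",
    pipeline_step_name="model_deployer",
)
if services and services[0].is_running:
    print(f"Vertex AI endpoint is serving: {services[0]}")
```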

### Configuration Options

The Vertex AI Model Deployer leverages a comprehensive configuration system defined in the shared base configuration and deployer-specific settings:

- **Basic Settings:**
- `location`: The GCP region for deployment (e.g., "us-central1" or "europe-west1").
- `name`: Unique identifier for the deployed endpoint.
- `display_name`: A human-friendly name for the endpoint.
- `model_name`: The fully qualified model name from the model registry.
- `model_version`: The version of the model to deploy.
- `description`: A textual description of the deployment.
- `sync`: A flag to indicate whether the deployment should wait until completion.
- `traffic_percentage`: The percentage of incoming traffic to route to this deployment.

- **Container and Resource Configuration:**
- Configurations provided via [VertexAIContainerSpec](../../integrations/gcp/flavors/vertex_base_config.py) allow you to specify a custom serving container image, HTTP routes (`predict_route`, `health_route`), environment variables, and port exposure.
- [VertexAIResourceSpec](../../integrations/gcp/flavors/vertex_base_config.py) lets you override the default machine type, number of replicas, and even GPU options.

- **Advanced Settings:**
- Service account, network configuration, and customer-managed encryption keys.
- Model explanation settings via `VertexAIExplanationSpec` if you need integrated model interpretability.

These options are defined across the [Vertex AI Base Config](../../integrations/gcp/flavors/vertex_base_config.py) and the deployer-specific configuration in [VertexModelDeployerFlavor](../../integrations/gcp/flavors/vertex_model_deployer_flavor.py).
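
For the advanced settings, here is a hedged sketch; the field names (`service_account`, `network`, `encryption_spec_key_name`) mirror the underlying Vertex AI SDK and are assumptions about the config class, so check `vertex_base_config.py` for the authoritative names:

```python
from zenml.integrations.gcp.services.vertex_deployment import (
    VertexDeploymentConfig,
)

# Hypothetical values throughout; shown only to illustrate where the
# advanced options plug in.
config = VertexDeploymentConfig(
    location="europe-west1",
    name="secure-endpoint",
    model_name="projects/my-project/locations/europe-west1/models/123",
    service_account="[email protected]",  # assumed field name
    network="projects/123456/global/networks/my-vpc",  # assumed field name
    encryption_spec_key_name=(  # assumed field name (customer-managed key)
        "projects/my-project/locations/europe-west1/keyRings/my-ring/cryptoKeys/my-key"
    ),
)
```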

### Limitations and Considerations

1. **Stack Requirements:**
- It is recommended to pair the deployer with a Vertex AI Model Registry in your stack.
- Compatible with both local and remote orchestrators.
- Requires valid GCP credentials and permissions.

2. **Authentication:**
- Best practice is to use service connectors for secure and managed authentication.
- Supports multiple authentication methods (service accounts, local credentials).

3. **Costs:**
- Vertex AI endpoints will incur costs based on machine type and uptime.
- Utilize autoscaling (via configured `min_replica_count` and `max_replica_count`) to manage cost.

4. **Region Consistency:**
   - Ensure that the model and its deployment are created in the same GCP region; the stack sketch below shows one way to keep the two components aligned.
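
A sketch of registering both components with the same region when assembling the stack (component names are placeholders; check `zenml stack register --help` for the exact flags):

```shell
# Register both components with the same --location ...
zenml model-registry register vertex_registry --flavor=vertex --location=europe-west1
zenml model-deployer register vertex_deployer --flavor=vertex --location=europe-west1

# ... and wire them into one stack
zenml stack register vertex_stack \
    -o default -a default \
    -r vertex_registry \
    -d vertex_deployer \
    --set
```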

For more details, please refer to the [SDK docs](https://sdkdocs.zenml.io) and the relevant implementation files:
- [`vertex_model_deployer.py`](../../integrations/gcp/model_deployers/vertex_model_deployer.py)
- [`vertex_base_config.py`](../../integrations/gcp/flavors/vertex_base_config.py)
- [`vertex_model_deployer_flavor.py`](../../integrations/gcp/flavors/vertex_model_deployer_flavor.py)
**`docs/book/component-guide/model-registries/vertex.md`** (207 additions, 0 deletions)
# Vertex AI Model Registry

[Vertex AI](https://cloud.google.com/vertex-ai) is Google Cloud's unified ML platform that helps you build, deploy, and scale ML models. The Vertex AI Model Registry is a centralized repository for managing your ML models throughout their lifecycle. With ZenML's Vertex AI Model Registry integration, you can register model versions—with extended configuration options—track metadata, and seamlessly deploy your models using Vertex AI's managed infrastructure.

## When would you want to use it?

You should consider using the Vertex AI Model Registry when:

- You're already using Google Cloud Platform (GCP) and want to leverage its native ML infrastructure.
- You need enterprise-grade model management with fine-grained access control.
- You want to track model lineage and metadata in a centralized location.
- You're building ML pipelines that integrate with other Vertex AI services.
- You need to deploy models with custom configurations such as defined container images, resource specifications, and additional metadata.

This registry is particularly useful in scenarios where you:
- Build production ML pipelines that require deployment to Vertex AI endpoints.
- Manage multiple versions of models across development, staging, and production.
- Need to register model versions with detailed configuration for robust deployment.

{% hint style="warning" %}
**Important:** The Vertex AI Model Registry implementation only supports the model **version** interface—not the model interface. This means that you cannot directly register, update, or delete models; you only have operations for model versions. A model container is automatically created with the first version, and subsequent uploads with the same display name create new versions.
{% endhint %}
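
A minimal sketch of this behavior, assuming a Vertex AI model registry is already in your active stack (setup is covered below) and using hypothetical names and URIs:

```python
from zenml.client import Client

model_registry = Client().active_stack.model_registry

# The first call creates the model container *and* version 1 ...
model_registry.register_model_version(
    name="my-classifier",
    model_source_uri="gs://my-bucket/models/run-1/",
    description="first version",
)

# ... a second call with the same name adds version 2 to that container.
model_registry.register_model_version(
    name="my-classifier",
    model_source_uri="gs://my-bucket/models/run-2/",
    description="second version",
)
```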

## How do you deploy it?

The Vertex AI Model Registry flavor is enabled through the ZenML GCP integration. First, install the integration:

```shell
zenml integration install gcp -y
```

### Authentication and Service Connector Configuration

Vertex AI requires proper GCP authentication. The recommended configuration is via the ZenML Service Connector, which supports both service-account-based authentication and local gcloud credentials.

1. **Using a GCP Service Connector with a service account (Recommended):**
```shell
# Register the service connector with a service account key
zenml service-connector register vertex_registry_connector \
--type gcp \
--auth-method=service-account \
--project_id=<PROJECT_ID> \
[email protected] \
--resource-type gcp-generic

# Register the model registry
zenml model-registry register vertex_registry \
--flavor=vertex \
--location=us-central1

# Connect the model registry to the service connector
zenml model-registry connect vertex_registry --connector vertex_registry_connector
```
2. **Using local gcloud credentials:**
```shell
# Register the model registry using local gcloud auth
zenml model-registry register vertex_registry \
--flavor=vertex \
--location=us-central1
```

{% hint style="info" %}
The service account needs the following permissions:
- `Vertex AI User` role for creating and managing model versions.
- `Storage Object Viewer` role if accessing models stored in Google Cloud Storage.
{% endhint %}
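
Whichever method you choose, it is worth verifying that the credentials actually reach Vertex AI before wiring the registry into a stack:

```shell
# Check that the connector can access GCP with the granted roles
zenml service-connector verify vertex_registry_connector --resource-type gcp-generic
```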

## How do you use it?

### Registering Models inside a Pipeline with Extended Configuration

The Vertex AI Model Registry supports extended configuration options via the `VertexAIModelConfig` class (defined in the [vertex_base_config.py](../../integrations/gcp/flavors/vertex_base_config.py) file). This means you can specify additional details for your deployments such as:

- **Container configuration**: Use the `VertexAIContainerSpec` to define a custom serving container (e.g., specifying the `image_uri`, `predict_route`, `health_route`, and exposed ports).
- **Resource configuration**: Use the `VertexAIResourceSpec` to specify compute resources like `machine_type`, `min_replica_count`, and `max_replica_count`.
- **Additional metadata and labels**: Annotate your model registrations with pipeline details, stage information, and custom labels.

Below is an example of how you might register a model version in your ZenML pipeline:

```python
from typing_extensions import Annotated

from zenml import ArtifactConfig, get_step_context, step
from zenml.client import Client
from zenml.integrations.gcp.flavors.vertex_base_config import (
VertexAIContainerSpec,
VertexAIModelConfig,
VertexAIResourceSpec,
)
from zenml.logger import get_logger
from zenml.model_registries.base_model_registry import (
ModelRegistryModelMetadata,
)

logger = get_logger(__name__)


@step(enable_cache=False)
def model_register(
is_promoted: bool = False,
) -> Annotated[str, ArtifactConfig(name="model_registry_uri")]:
"""Model registration step.

Registers a model version in the Vertex AI Model Registry with extended configuration
and returns the full resource name of the registered model.

Extended configuration includes settings for container, resources, and metadata which can then be reused in
subsequent model deployments.
"""
if is_promoted:
# Get the current model from the step context
current_model = get_step_context().model

client = Client()
model_registry = client.active_stack.model_registry
# Create an extended model configuration using Vertex AI base settings
model_config = VertexAIModelConfig(
location="europe-west1",
container=VertexAIContainerSpec(
image_uri="europe-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-5:latest",
predict_route="predict",
health_route="health",
ports=[8080],
),
resources=VertexAIResourceSpec(
machine_type="n1-standard-4",
min_replica_count=1,
max_replica_count=1,
),
labels={"env": "production"},
description="Extended model configuration for Vertex AI",
)

# Register the model version with the extended configuration as metadata
model_version = model_registry.register_model_version(
name=current_model.name,
version=str(current_model.version),
model_source_uri=current_model.get_model_artifact("sklearn_classifier").uri,
description="ZenML model version registered with extended configuration",
metadata=ModelRegistryModelMetadata(
zenml_pipeline_name=get_step_context().pipeline.name,
zenml_pipeline_run_uuid=str(get_step_context().pipeline_run.id),
zenml_step_name=get_step_context().step_run.name,
),
config=model_config,
)
logger.info(f"Model version {model_version.version} registered in Model Registry")

# Return the full resource name of the registered model
return model_version.registered_model.name
else:
return ""
```

*Example: [`model_register.py`](../../examples/vertex-registry-and-deployer/steps/model_register.py)*

### Working with Model Versions

Since the Vertex AI Model Registry supports only version-level operations, here are some commands to manage model versions:

```shell
# List all model versions
zenml model-registry models list-versions <model-name>

# Get details of a specific model version
zenml model-registry models get-version <model-name> -v <version>

# Delete a model version
zenml model-registry models delete-version <model-name> -v <version>
```
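
The same operations are available programmatically. A sketch using the registry API (method names follow ZenML's `BaseModelRegistry` interface, of which the Vertex flavor implements the version-level subset; the model name is hypothetical):

```python
from zenml.client import Client

model_registry = Client().active_stack.model_registry

# List, inspect, and delete versions of a registered model
versions = model_registry.list_model_versions(name="my-classifier")
version = model_registry.get_model_version(name="my-classifier", version="1")
model_registry.delete_model_version(name="my-classifier", version="1")
```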

### Configuration Options

The Vertex AI Model Registry accepts several configuration options, now enriched with extended settings:

- **location**: The GCP region where your resources will be created (e.g., "us-central1" or "europe-west1").
- **project_id**: (Optional) A GCP project ID override.
- **credentials**: (Optional) GCP credentials configuration.
- **container**: (Optional) Detailed container settings (defined via `VertexAIContainerSpec`) for the model's serving container such as:
- `image_uri`
- `predict_route`
- `health_route`
- `ports`
- **resources**: (Optional) Compute resource settings (using `VertexAIResourceSpec`) like `machine_type`, `min_replica_count`, and `max_replica_count`.
- **labels** and **metadata**: Additional annotation data for organizing and tracking your model versions.

These configuration options are specified in the [Vertex AI Base Config](../../integrations/gcp/flavors/vertex_base_config.py) and further extended in the [Vertex AI Model Registry Flavor](../../integrations/gcp/flavors/vertex_model_registry_flavor.py).

### Key Differences from Other Model Registries

1. **Version-Only Interface**: Vertex AI only supports version-level operations for model registration.
2. **Authentication**: Uses GCP service connectors and local credentials integrated via ZenML.
3. **Extended Configuration**: Register model versions with detailed settings for container, resources, and metadata through `VertexAIModelConfig`.
4. **Managed Service**: As a fully managed service, Vertex AI handles infrastructure management while you focus on your ML models.

## Limitations

- The methods `register_model()`, `update_model()`, and `delete_model()` are not implemented; you can only work with model versions.
- It is recommended to specify a serving container image URI rather than rely on the default scikit-learn container to ensure compatibility with Vertex AI endpoints.
- All models registered through this integration are automatically labeled with `managed_by="zenml"` for consistent tracking.
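
That `managed_by="zenml"` label also makes it easy to find ZenML-registered models with the Vertex AI SDK directly. A small sketch (project and region are placeholders):

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="europe-west1")

# Filter the Vertex AI Model Registry down to models registered by ZenML
for model in aiplatform.Model.list(filter='labels.managed_by="zenml"'):
    print(model.display_name, model.resource_name)
```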

For more detailed information, check out the [SDK docs](https://sdkdocs.zenml.io/latest/integration_code_docs/integrations-gcp/#zenml.integrations.gcp.model_registry).
