diff --git a/notebooks/community/model_garden/model_garden_translation_llm_translation_and_evaluation.ipynb b/notebooks/community/model_garden/model_garden_translation_llm_translation_and_evaluation.ipynb
new file mode 100644
index 000000000..885ae47d5
--- /dev/null
+++ b/notebooks/community/model_garden/model_garden_translation_llm_translation_and_evaluation.ipynb
@@ -0,0 +1,497 @@
+{
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "cellView": "form",
+ "id": "OsXAs2gcIpbC"
+ },
+ "outputs": [],
+ "source": [
+ "# Copyright 2024 Google LLC\n",
+ "#\n",
+ "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
+ "# you may not use this file except in compliance with the License.\n",
+ "# You may obtain a copy of the License at\n",
+ "#\n",
+ "# https://www.apache.org/licenses/LICENSE-2.0\n",
+ "#\n",
+ "# Unless required by applicable law or agreed to in writing, software\n",
+ "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
+ "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
+ "# See the License for the specific language governing permissions and\n",
+ "# limitations under the License."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "99c1c3fc2ca5"
+ },
+ "source": [
+ " # Vertex AI Model Garden - TranslationLLM Translation and Evaluation (Demo)\n",
+ "\n",
+ "
\n",
+ " \n",
+ " \n",
+ " Run in Colab Enterprise\n",
+ " \n",
+ " | \n",
+ " \n",
+ " \n",
+ " View on GitHub\n",
+ " \n",
+ " | \n",
+ "
"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "AtUbwvxier8E"
+ },
+ "source": [
+ "## Overview\n",
+ "\n",
+ "In this tutorial, you will learn how to use the *Vertex AI Python SDK* to generate translation responses and then use the *Gen AI Evaluation Service* to measure the translation quality of your LLM responses using [BLEU](https://en.wikipedia.org/wiki/BLEU), [MetricX](https://github.com/google-research/metricx) and [COMET](https://unbabel.github.io/COMET/html/index.html).\n",
+ "\n",
+ "### Costs\n",
+ "\n",
+ "This tutorial uses billable components of Google Cloud:\n",
+ "\n",
+ "* Vertex AI\n",
+ "* Cloud Storage\n",
+ "\n",
+ "Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing), [Cloud Storage pricing](https://cloud.google.com/storage/pricing), and use the [Pricing Calculator](https://cloud.google.com/products/calculator/) to generate a cost estimate based on your projected usage."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "CCkzFPrEer8F"
+ },
+ "source": [
+ "## Getting Started"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "N0Wchgzqer8F"
+ },
+ "source": [
+ "### Install Vertex AI Python SDK for Gen AI Evaluation Service and Cloud translation Python client"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "cellView": "form",
+ "id": "T-Cgoq37er8F"
+ },
+ "outputs": [],
+ "source": [
+ "%pip install --upgrade --user --quiet google-cloud-aiplatform[evaluation] google-cloud-translate"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "cellView": "form",
+ "id": "qP4ihOCkEBje"
+ },
+ "outputs": [],
+ "source": [
+ "# @title Import libraries\n",
+ "import os\n",
+ "\n",
+ "import pandas as pd\n",
+ "import vertexai\n",
+ "from google.cloud import aiplatform, translate_v3\n",
+ "from IPython.display import Markdown, display\n",
+ "from vertexai import evaluation\n",
+ "from vertexai.evaluation.metrics import pointwise_metric"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "6UbmJ_MyTpw_"
+ },
+ "source": [
+ "### Define Google Cloud Project Information"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "cellView": "form",
+ "id": "N-i6OtB9er8G"
+ },
+ "outputs": [],
+ "source": [
+ "# Get the default project id and region.\n",
+ "PROJECT_ID = os.environ[\"GOOGLE_CLOUD_PROJECT\"]\n",
+ "# @markdown If you want to use a different region, please make sure the region is supported by Vertex AI Evaluation.\n",
+ "# @markdown Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/learn/locations#eval-locations.\n",
+ "REGION = os.environ[\"GOOGLE_CLOUD_REGION\"]\n",
+ "\n",
+ "# @markdown **[Optional]** Set the experiment name for your experiment.\n",
+ "EXPERIMENT_NAME = \"my-eval-task-experiment\" # @param {type:\"string\"}"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "ka4wP_ljer8G"
+ },
+ "source": [
+ "### Initialize Vertex AI SDK and Google Cloud Translation client."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "cellView": "form",
+ "id": "kfTC7tqka78-"
+ },
+ "outputs": [],
+ "source": [
+ "client = translate_v3.TranslationServiceClient()\n",
+ "vertexai.init(project=PROJECT_ID, location=REGION)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "JyCI4QfPgm0R"
+ },
+ "source": [
+ "## Helper Functions"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "cellView": "form",
+ "id": "gT_OJBHfCg4Q"
+ },
+ "outputs": [],
+ "source": [
+ "# @title Display evaluation result.\n",
+ "def display_eval_result(eval_result, metrics=None, model_name=None, rows=0):\n",
+ " \"\"\"Display the evaluation results.\"\"\"\n",
+ " if model_name is not None:\n",
+ " display(Markdown(\"## Eval Result for %s\" % model_name))\n",
+ "\n",
+ " summary_metrics, metrics_table = (\n",
+ " eval_result.summary_metrics,\n",
+ " eval_result.metrics_table,\n",
+ " )\n",
+ "\n",
+ " metrics_df = pd.DataFrame.from_dict(summary_metrics, orient=\"index\").T\n",
+ " if metrics:\n",
+ " metrics_df = metrics_df.filter(\n",
+ " [\n",
+ " metric\n",
+ " for metric in metrics_df.columns\n",
+ " if any(selected_metric in metric for selected_metric in metrics)\n",
+ " ]\n",
+ " )\n",
+ " metrics_table = metrics_table.filter(\n",
+ " [\n",
+ " metric\n",
+ " for metric in metrics_table.columns\n",
+ " if any(selected_metric in metric for selected_metric in metrics)\n",
+ " ]\n",
+ " )\n",
+ "\n",
+ " # Display the summary metrics\n",
+ " display(Markdown(\"### Summary Metrics\"))\n",
+ " display(metrics_df)\n",
+ " if rows > 0:\n",
+ " # Display samples from the metrics table\n",
+ " display(Markdown(\"### Row-based Metrics\"))\n",
+ " display(metrics_table.head(rows))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "cellView": "form",
+ "id": "TPvnBvTdgm0R"
+ },
+ "outputs": [],
+ "source": [
+ "# @title Translate text.\n",
+ "def translate_text(\n",
+ " text: str,\n",
+ " source_language_code: str,\n",
+ " target_language_code: str,\n",
+ ") -> translate_v3.TranslationServiceClient:\n",
+ " \"\"\"Translating Text from English.\n",
+ "\n",
+ " Args:\n",
+ " text: The content to translate.\n",
+ " source_language_code: The language code for the text.\n",
+ " target_language_code: The language code for the translation. E.g. \"fr\" for\n",
+ " French, \"es\" for Spanish, etc. Available languages:\n",
+ " https://cloud.google.com/translate/docs/languages#neural_machine_translation_model\n",
+ " \"\"\"\n",
+ " parent = f\"projects/{PROJECT_ID}/locations/us-central1\"\n",
+ " # @markdown Translate text from English to `target_language_code` (your chosen language) using the Translate LLM model.\n",
+ " # @markdown 1. Translate LLM is available in us-central1.\n",
+ " # @markdown 2. Supported mime types are listed in https://cloud.google.com/translate/docs/supported-formats.\n",
+ " response = client.translate_text(\n",
+ " contents=[text],\n",
+ " target_language_code=target_language_code,\n",
+ " parent=parent,\n",
+ " mime_type=\"text/plain\",\n",
+ " source_language_code=source_language_code,\n",
+ " model=f\"{parent}/models/general/translation-llm\", # Use Translate LLM.\n",
+ " )\n",
+ "\n",
+ " # Display the translation for each input text provided\n",
+ " for translation in response.translations:\n",
+ " print(f\"Translated text: {translation.translated_text}\")\n",
+ " # Example response:\n",
+ " # Translated text: Bonjour comment vas-tu aujourd'hui?\n",
+ "\n",
+ " return response"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "fQP9YjD_g0Fw"
+ },
+ "source": [
+ "## Getting Translations"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "cellView": "form",
+ "id": "HKDL4oCFbeEc"
+ },
+ "outputs": [],
+ "source": [
+ "# @title Try out a translation.\n",
+ "translations = translate_text(\n",
+ " text=\"Dem Feuer konnte Einhalt geboten werden\",\n",
+ " source_language_code=\"de\",\n",
+ " target_language_code=\"en\",\n",
+ ")\n",
+ "translations"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "cellView": "form",
+ "id": "YIB0wnHfgbVN"
+ },
+ "outputs": [],
+ "source": [
+ "# @title Generate translations.\n",
+ "\n",
+ "# Define original text.\n",
+ "sources = [\n",
+ " \"Dem Feuer konnte Einhalt geboten werden\",\n",
+ " \"Schulen und Kindergärten wurden eröffnet.\",\n",
+ "]\n",
+ "\n",
+ "# Generate responses.\n",
+ "translations = []\n",
+ "for source in sources:\n",
+ " translation = (\n",
+ " translate_text(\n",
+ " text=source, target_language_code=\"en\", source_language_code=\"de\"\n",
+ " )\n",
+ " .translations[0]\n",
+ " .translated_text\n",
+ " )\n",
+ " translations.append(translation)\n",
+ "\n",
+ "translations"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "KwgQlLkSg5XP"
+ },
+ "source": [
+ "## Evaluating Your translations"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "cellView": "form",
+ "id": "XGY40wjrQWOc"
+ },
+ "outputs": [],
+ "source": [
+ "# @title Prepare evaluation dataset.\n",
+ "\n",
+ "# These are the references we will send for evaluation.\n",
+ "references = [\n",
+ " \"They were able to control the fire.\",\n",
+ " \"Schools and kindergartens opened\",\n",
+ "]\n",
+ "\n",
+ "# Define evaluation dataset using the responses generated.\n",
+ "eval_dataset = pd.DataFrame(\n",
+ " {\n",
+ " \"source\": sources,\n",
+ " \"response\": translations,\n",
+ " \"reference\": references,\n",
+ " }\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "jo7ahGbnnYfp"
+ },
+ "source": [
+ "### Set up eval metrics for your data."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "aG3kUfTmoAwb"
+ },
+ "source": [
+ "You can evaluate the translation quality of your data generated from an LLM using any of the metrics below.\n",
+ "\n",
+ "- [BLEU](https://en.wikipedia.org/wiki/BLEU):\\\n",
+ "BLEU calculates a score from 0 to 1 based on how many matching words and phrases appear in a machine translation compared to a human reference, with higher scores indicating better quality.\n",
+ "\n",
+ "- [COMET](https://unbabel.github.io/COMET/html/index.html):\\\n",
+ "COMET uses a neural network to produce a score typically between 0 and 1, reflecting the similarity between a machine translation and a human reference, where higher scores mean better quality.\n",
+ "\n",
+ "- [MetricX](https://github.com/google-research/metricx):\\\n",
+ "Metric-X is a LLM-based evaluation metric for translation quality measurement that aims at maching the Multidimensional Quality Metrics (MQM) score range of 0 (best) to 25 (worst). It is a newer and improved version of Bluert-X that was published publicly by Google.\n",
+ "\n",
+ "See [documentations](https://github.com/googleapis/python-aiplatform/blob/main/vertexai/evaluation/metrics/pointwise_metric.py) for more information about supported COMET and MetricX versions."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "cellView": "form",
+ "id": "EGe_vlUvPVOM"
+ },
+ "outputs": [],
+ "source": [
+ "metrics = [\n",
+ " \"bleu\",\n",
+ " pointwise_metric.Comet(),\n",
+ " pointwise_metric.MetricX(),\n",
+ "]"
+ ]
+ },
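+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Both `Comet()` and `MetricX()` fall back to a default model version when constructed without arguments. The cell below is a minimal sketch of pinning specific versions; the `version` argument and the version strings are assumptions, so confirm them against the `pointwise_metric` documentation linked above."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Sketch: pin specific COMET and MetricX versions instead of the defaults.\n",
+ "# The version strings below are assumptions; see the pointwise_metric docs.\n",
+ "versioned_metrics = [\n",
+ " \"bleu\",\n",
+ " pointwise_metric.Comet(version=\"COMET_22_SRC_REF\"),\n",
+ " pointwise_metric.MetricX(version=\"METRICX_24_SRC_REF\"),\n",
+ "]"
+ ]
+ },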
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "ZJg5FdWfnhjz"
+ },
+ "source": [
+ "### Run evaluation\n",
+ "\n",
+ "With the evaluation dataset and metrics defined, you can run evaluation for an `EvalTask` on different models and applications, and many other use cases."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "cellView": "form",
+ "id": "rOvo-LpsQTIj"
+ },
+ "outputs": [],
+ "source": [
+ "eval_task = evaluation.EvalTask(\n",
+ " dataset=eval_dataset, metrics=metrics, experiment=EXPERIMENT_NAME\n",
+ ")\n",
+ "eval_result = eval_task.evaluate()"
+ ]
+ },
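+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "As a minimal sketch of reusing the same `EvalTask`, you can re-run the evaluation under a named experiment run, for example to compare different translation models within the same experiment. The `experiment_run_name` argument is an assumption; check the `EvalTask.evaluate` documentation. The flag keeps the optional re-run from executing by default."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Sketch: re-run the same EvalTask under a named experiment run.\n",
+ "# Set the flag to True to trigger another (billable) evaluation run.\n",
+ "run_named_evaluation = False\n",
+ "if run_named_evaluation:\n",
+ " named_eval_result = eval_task.evaluate(\n",
+ " experiment_run_name=\"translation-llm-demo-run\" # Hypothetical run name.\n",
+ " )"
+ ]
+ },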
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "GwnZMsXBnjS7"
+ },
+ "source": [
+ "You can view the summary metrics and row-based metrics for each response in the `EvalResult`.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "cellView": "form",
+ "id": "2QOFq9YZROPr"
+ },
+ "outputs": [],
+ "source": [
+ "display_eval_result(eval_result, rows=2)"
+ ]
+ },
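+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Because `eval_result.metrics_table` is a pandas DataFrame, you can also keep the row-based metrics for offline analysis. A minimal sketch follows; the output filename is illustrative."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Sketch: save the row-based metrics to a CSV file (filename is illustrative).\n",
+ "eval_result.metrics_table.to_csv(\"translation_eval_results.csv\", index=False)"
+ ]
+ },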
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "BLMKgL2OnCyt"
+ },
+ "source": [
+ "## Clean up"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "cellView": "form",
+ "id": "Dv8drshKnEf2"
+ },
+ "outputs": [],
+ "source": [
+ "# @title Delete ExperimentRun\n",
+ "delete_experiment = False\n",
+ "if delete_experiment:\n",
+ " aiplatform.ExperimentRun(\n",
+ " run_name=eval_result.metadata[\"experiment_run\"],\n",
+ " experiment=eval_result.metadata[\"experiment\"],\n",
+ " ).delete()"
+ ]
+ }
+ ],
+ "metadata": {
+ "colab": {
+ "name": "model_garden_translation_llm_translation_and_evaluation.ipynb",
+ "toc_visible": true
+ },
+ "kernelspec": {
+ "display_name": "Python 3",
+ "name": "python3"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}