docs: Update documentation for segment_anything_2_image (#595)

HumanSignal · Aug 7, 2024 · 541c94f
1 parent 145f328
commit 541c94f

Showing 16 changed files with 36 additions and 24 deletions.
diff --git a/.github/docker-build-config.yml b/.github/docker-build-config.yml
@@ -22,7 +22,7 @@
   backend_tag_prefix: nemoasr-
 - backend_dir_name: segment_anything_model
   backend_tag_prefix: sam-
-- backend_dir_name: segment_anything_2
+- backend_dir_name: segment_anything_2_image
   backend_tag_prefix: sa2-
 - backend_dir_name: sklearn_text_classifier
   backend_tag_prefix: sklearntxtclass-

diff --git a/.github/workflows/build.yml b/.github/workflows/build.yml
@@ -11,7 +11,7 @@ jobs:
     needs: calculate_matrix
     runs-on: ${{
       matrix.ml_backend.backend_dir_name == 'mmdetection-3' && 'ubuntu-latest-4c-16gb' ||
-      matrix.ml_backend.backend_dir_name == 'segment_anything_2' && 'ubuntu-latest-4c-16gb' ||
+      matrix.ml_backend.backend_dir_name == 'segment_anything_2_image' && 'ubuntu-latest-4c-16gb' ||
       matrix.ml_backend.backend_dir_name == 'grounding_dino' && 'ubuntu-latest-4c-16gb' ||
       matrix.ml_backend.backend_dir_name == 'grounding_sam' && 'ubuntu-latest-4c-16gb' ||
       'ubuntu-latest' }} # Use larger runner for some backends, as we need >20GB during build time.
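
Reviewer note: the `runs-on` expression above is a chain of `condition && value || fallback` terms that compares the directory name as an exact string, which is why the rename has to be mirrored here. A minimal Python sketch of the same selection logic follows; the function and constant names are hypothetical and only illustrate the behavior, they are not part of the workflow.

```python
# Sketch of the runner selection performed by the workflow expression.
# Names below are illustrative assumptions, not code from the repository.
LARGE_RUNNER = "ubuntu-latest-4c-16gb"
DEFAULT_RUNNER = "ubuntu-latest"

# Backend directory names listed in the expression after this commit.
LARGE_RUNNER_BACKENDS = {
    "mmdetection-3",
    "segment_anything_2_image",
    "grounding_dino",
    "grounding_sam",
}


def select_runner(backend_dir_name: str) -> str:
    """Return the runner label; matching is exact, as in the `==` checks."""
    if backend_dir_name in LARGE_RUNNER_BACKENDS:
        return LARGE_RUNNER
    return DEFAULT_RUNNER


if __name__ == "__main__":
    # The old directory name no longer matches and would fall back
    # to the default runner.
    print(select_runner("segment_anything_2_image"))
    print(select_runner("segment_anything_2"))
```

Because the match is exact, a directory renamed without updating this expression would silently build on the smaller default runner.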

diff --git a/.github/workflows/tests.yml b/.github/workflows/tests.yml
@@ -24,7 +24,7 @@ jobs:
     needs: calculate_matrix
     runs-on: ${{
       matrix.ml_backend == 'mmdetection-3' && 'ubuntu-latest-4c-16gb' ||
-      matrix.ml_backend == 'segment_anything_2' && 'ubuntu-latest-4c-16gb' ||
+      matrix.ml_backend == 'segment_anything_2_image' && 'ubuntu-latest-4c-16gb' ||
       matrix.ml_backend == 'grounding_dino' && 'ubuntu-latest-4c-16gb' ||
       matrix.ml_backend == 'grounding_sam' && 'ubuntu-latest-4c-16gb' ||
       'ubuntu-latest' }} # Use larger runner for some backends, as we need >20GB during build time.

diff --git a/README.md b/README.md
@@ -40,24 +40,27 @@ Check the **Required parameters** column to see if you need to set any additiona
 - **Training** column indicates if the model can be used for training in Label Studio: update the model state based on the
   submitted annotations.
 
-| MODEL_NAME                                                                                 | Description                                                                                                                               | Pre-annotation | Interactive mode | Training | Required parameters                           |
-|--------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------|-----------------|------------------|----------|-----------------------------------------------|
-| [segment_anything_model](/label_studio_ml/examples/segment_anything_model)                 | Image segmentation by [Meta](https://segment-anything.com/)                                                                               | ❌               | ✅                |   ❌       | None                                          |
-| [llm_interactive](/label_studio_ml/examples/llm_interactive)                               | Prompt engineering with [OpenAI](https://platform.openai.com/), Azure LLMs.                                                               | ✅               | ✅                | ✅        | OPENAI_API_KEY                                |
-| [grounding_dino](/label_studio_ml/examples/grounding_dino)                                 | Object detection with prompts. [Details](https://github.com/IDEA-Research/GroundingDINO)                                                  | ❌               | ✅                | ❌        | None                                          |
-| [tesseract](/label_studio_ml/examples/tesseract)                                           | Interactive OCR. [Details](https://github.com/tesseract-ocr/tesseract)                                                                    | ❌               | ✅                | ❌        | None                                          |
-| [easyocr](/label_studio_ml/examples/easyocr)                                               | Automated OCR. [EasyOCR](https://github.com/JaidedAI/EasyOCR)                                                                             | ✅               | ❌                | ❌        | None                                          |
-| [spacy](/label_studio_ml/examples/spacy)                                                   | NER by [SpaCy](https://spacy.io/)                                                                                                         | ✅               | ❌                | ❌        | None                                          |
-| [flair](/label_studio_ml/examples/flair)                                                   | NER by [flair](https://flairnlp.github.io/)                                                                                               | ✅               | ❌                | ❌        | None                                          |
-| [bert_classifier](/label_studio_ml/examples/bert_classifier)                               | Text classification with [Huggingface](https://huggingface.co/transformers/v3.0.2/model_doc/auto.html#automodelforsequenceclassification) | ✅               | ❌                | ✅        | None                                          |
-| [huggingface_llm](/label_studio_ml/examples/huggingface_llm)                               | LLM inference with [Hugging Face](https://huggingface.co/tasks/text-generation)                                                           | ✅               | ❌                | ❌        | None                                          |
-| [huggingface_ner](/label_studio_ml/examples/huggingface_ner)                               | NER by [Hugging Face](https://huggingface.co/docs/transformers/en/tasks/token_classification)                                             | ✅               | ❌                | ✅        | None                                          |
-| [nemo_asr](/label_studio_ml/examples/nemo_asr)                                             | Speech ASR by [NVIDIA NeMo](https://github.com/NVIDIA/NeMo)                                                                               | ✅               | ❌                | ❌        | None                                          |
-| [mmdetection](/label_studio_ml/examples/mmdetection-3)                                     | Object Detection with [OpenMMLab](https://github.com/open-mmlab/mmdetection)                                                              | ✅               | ❌                | ❌        | None                                          |
-| [sklearn_text_classifier](/label_studio_ml/examples/sklearn_text_classifier)               | Text classification with [scikit-learn](https://scikit-learn.org/stable/)                                                                 | ✅               | ❌                | ✅        | None                                          |
-| [interactive_substring_matching](/label_studio_ml/examples/interactive_substring_matching) | Simple keywords search                                                                                                                    | ❌               | ✅                | ❌        | None                                          |
-| [langchain_search_agent](/label_studio_ml/examples/langchain_search_agent)                 | RAG pipeline with Google Search and [Langchain](https://langchain.com/)                                                                   | ✅               | ✅                | ✅        | OPENAI_API_KEY, GOOGLE_CSE_ID, GOOGLE_API_KEY |
-
+| MODEL_NAME                                                                                 | Description                                                                                                                                          | Pre-annotation | Interactive mode | Training |  Required parameters  | Arbitrary or Set Labels?                                                   | 
+|--------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------|----------------|------------------|----------|----------------------|----------------------------------------------------------------------------|
+| [bert_classifier](/label_studio_ml/examples/bert_classifier)                               | Text classification with [Huggingface](https://huggingface.co/transformers/v3.0.2/model_doc/auto.html#automodelforsequenceclassification)            | ✅              | ❌                | ✅        | None                       | Arbitrary|
+| [easyocr](/label_studio_ml/examples/easyocr)                                               | Automated OCR. [EasyOCR](https://github.com/JaidedAI/EasyOCR)                                                                                        | ✅              | ❌                | ❌        | None                       | Set (characters)                                                           | 
+| [flair](/label_studio_ml/examples/flair)                                                   | NER by [flair](https://flairnlp.github.io/)                                                                                                          | ✅              | ❌                | ❌        | None                       | Arbitrary|
+| [gliner](/label_studio_ml/examples/gliner)                                                 | NER by [GLiNER](https://huggingface.co/spaces/tomaarsen/gliner_medium-v2.1)                                                                          | ❌              |  ✅  |  ✅  | None | Arbitrary|
+| [grounding_dino](/label_studio_ml/examples/grounding_dino)                                 | Object detection with prompts. [Details](https://github.com/IDEA-Research/GroundingDINO)                                                             | ❌              | ✅                | ❌        | None                       | Arbitrary                                                                  |
+| [grounding_sam](/label_studio_ml/examples/grounding_sam) | Object Detection with [Prompts](https://github.com/IDEA-Research/GroundingDINO) and [SAM 2](https://github.com/facebookresearch/segment-anything-2) |    ❌              | ✅                | ❌        | None                       | Arbitrary                                                                  |
+| [huggingface_llm](/label_studio_ml/examples/huggingface_llm)                               | LLM inference with [Hugging Face](https://huggingface.co/tasks/text-generation)                                                                      | ✅              | ❌                | ❌        | None                       | Arbitrary | 
+| [huggingface_ner](/label_studio_ml/examples/huggingface_ner)                               | NER by [Hugging Face](https://huggingface.co/docs/transformers/en/tasks/token_classification)                                                        | ✅              | ❌                | ✅        | None                       | Arbitrary | 
+| [interactive_substring_matching](/label_studio_ml/examples/interactive_substring_matching) | Simple keywords search                                                                                                                               | ❌              | ✅                | ❌        | None                       | Arbitrary | 
+| [langchain_search_agent](/label_studio_ml/examples/langchain_search_agent)                 | RAG pipeline with Google Search and [Langchain](https://langchain.com/)                                                                              | ✅              | ✅                | ✅        | OPENAI_API_KEY, GOOGLE_CSE_ID, GOOGLE_API_KEY | Arbitrary | 
+| [llm_interactive](/label_studio_ml/examples/llm_interactive)                               | Prompt engineering with [OpenAI](https://platform.openai.com/), Azure LLMs.                                                                          | ✅              | ✅                | ✅        | OPENAI_API_KEY             | Arbitrary                                                                  | 
+| [mmdetection](/label_studio_ml/examples/mmdetection-3)                                     | Object Detection with [OpenMMLab](https://github.com/open-mmlab/mmdetection)                                                                         | ✅              | ❌                | ❌        | None                       | Arbitrary | 
+| [nemo_asr](/label_studio_ml/examples/nemo_asr)                                             | Speech ASR by [NVIDIA NeMo](https://github.com/NVIDIA/NeMo)                                                                                          | ✅              | ❌                | ❌        | None                       | Set (vocabulary and characters) | 
+| [segment_anything_2_image](/label_studio_ml/examples/segment_anything_2_image)             | Image segmentation with [SAM 2](https://github.com/facebookresearch/segment-anything-2)                                                              | ❌              | ✅ | ❌ | None| Arbitrary|
+| [segment_anything_model](/label_studio_ml/examples/segment_anything_model)                 | Image segmentation by [Meta](https://segment-anything.com/)                                                                                          | ❌              | ✅                |   ❌       | None                       | Arbitrary                                                                  |
+| [sklearn_text_classifier](/label_studio_ml/examples/sklearn_text_classifier)               | Text classification with [scikit-learn](https://scikit-learn.org/stable/)                                                                            | ✅              | ❌                | ✅        | None                        | Arbitrary | 
+| [spacy](/label_studio_ml/examples/spacy)                                                   | NER by [SpaCy](https://spacy.io/)                                                                                                                    | ✅              | ❌                | ❌        | None                       | Set      [(see documentation)](https://spacy.io/usage/linguistic-features) |
+| [tesseract](/label_studio_ml/examples/tesseract)                                           | Interactive OCR. [Details](https://github.com/tesseract-ocr/tesseract)                                                                               | ❌              | ✅                | ❌        | None                       | Set (characters)                                                           | 
+| [watsonX](/label_studio_ml/examples/watsonx)| LLM inference with [WatsonX](https://www.ibm.com/products/watsonx-ai) and integration with [WatsonX.data](watsonx.data)| ✅ | ✅| ❌ | None| Arbitrary|
 # (Advanced usage) Develop your model
 
 To start developing your own ML backend, follow the instructions below.

diff --git a/...examples/segment_anything_2/.dockerignore → ...es/segment_anything_2_image/.dockerignore b/...examples/segment_anything_2/.dockerignore → ...es/segment_anything_2_image/.dockerignore
diff --git a/...ml/examples/segment_anything_2/Dockerfile → ...mples/segment_anything_2_image/Dockerfile b/...ml/examples/segment_anything_2/Dockerfile → ...mples/segment_anything_2_image/Dockerfile
diff --git a/..._ml/examples/segment_anything_2/README.md → ...amples/segment_anything_2_image/README.md b/..._ml/examples/segment_anything_2/README.md → ...amples/segment_anything_2_image/README.md
@@ -1,6 +1,13 @@
-This guide describes the simplest way to start using **SegmentAnything 2** with Label Studio.
+
+Segment Anything 2, or SAM 2, is a model released by Meta in July 2024. An update to the original Segment Anything Model,
+SAM 2 provides even better object segmentation for both images and video. In this guide, we'll show you how to use
+SAM 2 for better image labeling with Label Studio.
 
 ## Using SAM2 with Label Studio (tutorial)
+Click on the image below to watch our ML Evangelist Micaela Kaplan explain how to link SAM 2 to your Label Studio Project.
+You'll need to follow the instructions below to stand up an instance of SAM2 before you can link your model! 
+
+
 [![Connecting SAM2 Model to Label Studio for Image Annotation ](https://img.youtube.com/vi/FTg8P8z4RgY/0.jpg)](https://www.youtube.com/watch?v=FTg8P8z4RgY)
 
 Note that as of 8/1/2024, SAM2 only runs on GPU.
@@ -13,7 +20,7 @@ Note that as of 8/1/2024, SAM2 only runs on GPU.
 git clone https://github.com/HumanSignal/label-studio-ml-backend.git
 cd label-studio-ml-backend
 pip install -e .
-cd label_studio_ml/examples/segment_anything_2
+cd label_studio_ml/examples/segment_anything_2_image
 pip install -r requirements.txt
 ```
 
@@ -24,7 +31,7 @@ pip install -r requirements.txt
 
 ```bash
 cd ../
-label-studio-ml start ./segment_anything_2
+label-studio-ml start ./segment_anything_2_image
 ```
 
 4. Connect running ML backend server to Label Studio: go to your project `Settings -> Machine Learning -> Add Model` and specify `http://localhost:9090` as a URL. Read more in the official [Label Studio documentation](https://labelstud.io/guide/ml#Connect-the-model-to-Label-Studio).
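
Reviewer note: before adding the URL in the project settings, it can help to confirm that the backend started in step 3 is reachable. The sketch below uses only the Python standard library and assumes the server answers `GET /health` on the URL from step 4; that endpoint path and the helper names are assumptions for illustration, so adjust them if your version of `label-studio-ml` differs.

```python
# Minimal reachability check for a locally started ML backend.
# Assumption: the backend exposes a GET /health endpoint.
import urllib.request


def health_url(base_url: str) -> str:
    """Build the health-check URL from the backend base URL."""
    return base_url.rstrip("/") + "/health"


def backend_is_up(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if the backend answers the health check with HTTP 200."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection refused, timeouts, and HTTP error statuses
        # (urllib's URLError and HTTPError both derive from OSError).
        return False


if __name__ == "__main__":
    print(backend_is_up("http://localhost:9090"))
```

If the check prints `False`, verify that `label-studio-ml start ./segment_anything_2_image` is still running and that the port matches the URL entered in Label Studio.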

diff --git a/...o_ml/examples/segment_anything_2/_wsgi.py → ...xamples/segment_anything_2_image/_wsgi.py b/...o_ml/examples/segment_anything_2/_wsgi.py → ...xamples/segment_anything_2_image/_wsgi.py
diff --git a/...les/segment_anything_2/docker-compose.yml → ...gment_anything_2_image/docker-compose.yml b/...les/segment_anything_2/docker-compose.yml → ...gment_anything_2_image/docker-compose.yml
diff --git a/...o_ml/examples/segment_anything_2/model.py → ...xamples/segment_anything_2_image/model.py b/...o_ml/examples/segment_anything_2/model.py → ...xamples/segment_anything_2_image/model.py
diff --git a/.../segment_anything_2/requirements-base.txt → ...nt_anything_2_image/requirements-base.txt b/.../segment_anything_2/requirements-base.txt → ...nt_anything_2_image/requirements-base.txt
diff --git a/.../segment_anything_2/requirements-test.txt → ...nt_anything_2_image/requirements-test.txt b/.../segment_anything_2/requirements-test.txt → ...nt_anything_2_image/requirements-test.txt
diff --git a/...mples/segment_anything_2/requirements.txt → ...segment_anything_2_image/requirements.txt b/...mples/segment_anything_2/requirements.txt → ...segment_anything_2_image/requirements.txt
diff --git a/...o_ml/examples/segment_anything_2/start.sh → ...xamples/segment_anything_2_image/start.sh b/...o_ml/examples/segment_anything_2/start.sh → ...xamples/segment_anything_2_image/start.sh
diff --git a/...l/examples/segment_anything_2/test_api.py → ...ples/segment_anything_2_image/test_api.py b/...l/examples/segment_anything_2/test_api.py → ...ples/segment_anything_2_image/test_api.py
diff --git a/label_studio_ml/examples/segment_anything_model/README.md b/label_studio_ml/examples/segment_anything_model/README.md
@@ -24,6 +24,8 @@ image: "/tutorials/segment-anything.png"
 https://github.com/shondle/label-studio-ml-backend/assets/106922533/42a8a535-167c-404a-96bd-c2e2382df99a
 
 Use Facebook's Segment Anything Model with Label Studio!
+In July 2024, Facebook released an update to the Segment Anything model, called SAM 2. To use this newer model for
+labeling, see [the segment_anything_2_image example](https://github.com/HumanSignal/label-studio-ml-backend/tree/master/label_studio_ml/examples/segment_anything_2_image).
 
 ## Quickstart