Segment anything 2 pipeline image #185

Merged
merged 33 commits into from
Sep 4, 2024

@pschroedl pschroedl commented Aug 30, 2024

This pull request introduces an implementation of the Segment Anything v2 (SAM2) pipeline within the AI worker.

It covers the basic functionality needed to segment an image, including specifying points and labels, and returns the scores, logits, and low-res masks.

In addition, this PR introduces a new method of adding pipeline dependencies without breaking existing implementations present in ai-runner. Example usage for local development:

cd ai-worker/runner
docker build -t livepeer/ai-runner:base .
docker build -f ../cmd/segment-anything-2/Dockerfile.segment_anything_2 -t livepeer/ai-runner:segment-anything-2 .

docker run --name sam2-runnner -e MODEL_DIR=/models -e PIPELINE=segment-anything-2 -e MODEL_ID=facebook/sam2-hiera-large -e HUGGINGFACE_TOKEN={token} --gpus 0 -p 8002:8000 -v ~/.lpData/models:/models livepeer/ai-runner:segment-anything-2

The image built from the existing Dockerfile is used as the 'base' image (i.e. FROM livepeer/ai-runner:base) in cmd/segment-anything-2/Dockerfile.segment_anything_2.

The go-livepeer PR that implements the logic necessary to launch this pipeline-specific container when appropriate is here: livepeer/go-livepeer#3131

This PR has been updated to move the mapping of pipelines to docker images into the createContainer method in Docker.go
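
The pipeline-to-image mapping described above can be pictured roughly as follows. This is only an illustrative Python sketch: the actual logic lives in Go inside createContainer, and the names PIPELINE_IMAGES, DEFAULT_IMAGE, and image_for_pipeline, as well as the default tag, are assumptions made for the example. Only the segment-anything-2 image tag is taken from the commands in this description.

```python
# Illustrative sketch only; the real mapping is implemented in Go (createContainer).
# DEFAULT_IMAGE is an assumed tag, not taken from the PR.
DEFAULT_IMAGE = "livepeer/ai-runner:latest"

# Pipelines that need their own image; every other pipeline uses the default.
PIPELINE_IMAGES = {
    "segment-anything-2": "livepeer/ai-runner:segment-anything-2",
}


def image_for_pipeline(pipeline: str) -> str:
    """Return the Docker image to launch for the given pipeline name."""
    return PIPELINE_IMAGES.get(pipeline, DEFAULT_IMAGE)
```

With a lookup of this shape, adding a pipeline with conflicting dependencies only requires a new entry in the map and a new pipeline-specific Dockerfile, while existing pipelines keep using the default image.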

rickstaa and others added 20 commits August 13, 2024 22:53
This commit introduces a prototype implementation of the
[Segment Anything v2](https://github.com/facebookresearch/segment-anything-2)
(SAM2) pipeline within the AI worker. The prototype demonstrates the basic
functionality needed to perform segmentation on an image. Note that video
segmentation is not yet implemented. Additionally, the dependencies were
updated quickly, which may temporarily break other pipelines.
This commit allows nested arrays to be supplied as JSON strings for SAM2
input. It also implements robust error handling to return a 400 error with
a descriptive message when incorrect parameters are provided.
This commit ensures that we return the masks, iou_predictions and
low_res_masks in json format.
This commit replaces the JSON.dump method with a simple str method since
it is highly unlikely that the string contains invalid data.

Co-authored-by: Peter Schroedl <[email protected]>
pschroedl and others added 7 commits August 30, 2024 13:13
This commit moves the SAM2 docker file inside the docker container.
This commit cleans up the codebase and adds FastAPI parameter and
pipeline descriptions.
This commit improves the sam2 route function name so that it is more
pythonic and shows up nicer in the OpenAPI spec pipeline summary.
This commit updates the golang bindings so that the runner changes are
reflected.
This commit adds the media type content MIME type to the segment
anything 2 pipeline.
This commit removes the debug patch which was accidentally added to the
code.
This commit adds the SAM2 model download command so that orchestrators
can pre-download the model.
This commit ensures that the parameters are in the same order as the
pipeline parameters.
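
One of the commits above lets nested arrays be supplied as JSON strings and returns a 400 error with a descriptive message on bad input. A minimal sketch of that idea is shown below. It is not the code from this PR: the helper name parse_nested_array and the field name point_coords are hypothetical, and the sketch raises a plain ValueError where a FastAPI route would presumably translate the failure into an HTTP 400 response.

```python
import json
from typing import Any, List, Optional


def parse_nested_array(raw: Optional[str], name: str) -> Optional[List[Any]]:
    """Parse a JSON string such as "[[120, 100], [120, 50]]" into nested lists.

    Raises ValueError with a descriptive message; a route handler could map
    this to a 400 response (assumption, not shown here).
    """
    if raw is None:
        # Optional parameter that was not supplied.
        return None
    try:
        value = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"{name} is not valid JSON: {exc.msg}") from exc
    if not isinstance(value, list):
        raise ValueError(
            f"{name} must be a JSON array, got {type(value).__name__}"
        )
    return value
```

For example, parse_nested_array("[[1, 2], [3, 4]]", "point_coords") yields a list of two coordinate pairs, while a truncated string such as "[[1, 2" produces an error message that names the offending field.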

@rickstaa rickstaa left a comment

LGTM 🚀!

rickstaa commented Sep 3, 2024

@pschroedl model caching is not supported until this upstream pull request is merged
facebookresearch/sam2#285 (comment).

pschroedl and others added 2 commits September 3, 2024 21:43

@rickstaa rickstaa left a comment

LGTM 🚀.

@pschroedl pschroedl merged commit c4d02e9 into livepeer:main Sep 4, 2024
2 checks passed
rickstaa added a commit to livepeer/go-livepeer that referenced this pull request Sep 4, 2024
This commit adds support for the new [segment anything 2](https://ai.meta.com/sam2/) pipeline (SAM2) that was added to the AI-worker in [this pull request](livepeer/ai-worker#185). While the new SAM pipeline can also do video segmentation this will be done in a subsequent pull request.

Co-authored-by: John | Elite Encoder <[email protected]>
Co-authored-by: Peter Schroedl <[email protected]>
Co-authored-by: Rick Staa <[email protected]>
JJassonn69 pushed a commit to JJassonn69/ai-worker that referenced this pull request Sep 18, 2024
* feat(pipeline): add SAM2 image segmentation prototype

This commit introduces a prototype implementation of the
[Segment Anything v2](https://github.com/facebookresearch/segment-anything-2)
(SAM2) pipeline within the AI worker. The prototype demonstrates the basic
functionality needed to perform segmentation on an image. Note that video
segmentation is not yet implemented. Additionally, the dependencies were
updated quickly, which may temporarily break other pipelines.

* revert Dockerfile, requirements, add sam2 Dockerfile

* refactor: enhance SAM2 input handling and error management

This commit allows nested arrays to be supplied as JSON strings for SAM2
input. It also implements robust error handling to return a 400 error with
a descriptive message when incorrect parameters are provided.

* refactor: improve SAM2 return type

This commit ensures that we return the masks, iou_predictions and
low_res_masks in json format.

* Sam2 -> SegmentAnything2

* update go bindings

* update multipart.go binding with NewSegmentAnything2Writer

* update worker and multipart methods

* predictions -> scores, mask -> logits

* add sam2 specific multipartwriter fields

* add segment-anything-2 to containerHostPorts

* fix pipeline name in worker.go

* revert Dockerfile, requirements, add sam2 Dockerfile

* Sam2 -> SegmentAnything2

* predictions -> scores, mask -> logits

* feat: replace JSON.dump with str

This commit replaces the JSON.dump method with a simple str method since
it is highly unlikely that the string contains invalid data.

Co-authored-by: Peter Schroedl <[email protected]>

* move pipeline-specific dockerfile

* update openapi yaml

* add segment anything specific readme

* update go bindings

* refactor: move SAM2 docker

This commit moves the SAM2 docker file inside the docker container.

* refactor: add FastAPI descriptions

This commit cleans up the codebase and adds FastAPI parameter and
pipeline descriptions.

* refactor: improve sam2 route function name

This commit improves the sam2 route function name so that it is more
pythonic and shows up nicer in the OpenAPI spec pipeline summary.

* chore(worker): update golang bindings

This commit updates the golang bindings so that the runner changes are
reflected.

* refactor(runner): add media_type

This commit adds the media type content MIME type to the segment
anything 2 pipeline.

* chore(worker): remove debug patch

This commit removes the debug patch which was accidentally added to the
code.

* feat(runner): add SAM2 model download command

This commit adds the SAM2 model download command so that orchestrators
can pre-download the model.

* refactor(worker): change SAM2 multipart reader param order

This commit ensures that the parameters are in the same order as the
pipeline parameters.

* determine docker image in createContainer

* fix: fix examples

This commit fixes the example scripts.

---------

Co-authored-by: Rick Staa <[email protected]>
Co-authored-by: Elite Encoder <[email protected]>
Co-authored-by: Peter Schroedl <[email protected]>
JJassonn69 added a commit to JJassonn69/ai-worker that referenced this pull request Sep 20, 2024
* feat(model): add Realistic Vision model T2I support (livepeer#136)

This commit ensures that the https://huggingface.co/SG161222/Realistic_Vision_V6.0_B1_noVAE
model is supported in the T2I pipeline.

* ci: add JS/TS SDK update trigger (livepeer#138)

This commit adds an update trigger to the OpenAPI sync action that
triggers an update of the JS/TS SDK.

* ci: add TS/JS SDK OpenAPI spec update trigger (livepeer#139)

This commit adds a trigger to update the OpenAPI spec in https://github.com/livepeer/ai-sdk-js. Further, it improves the OpenAPI spec upstream sync action to forward more information.

* refactor: add T2I parameter annotations (livepeer#141)

This commit adds parameter annotations to the T2I pipeline similar to
how it is done in the rest of the pipelines. Descriptions will be added
in a subsequent commit.

* refactor: sort imports using isort (livepeer#142)

This commit sorts the python imports using the
https://pypi.org/project/isort package.

* ci: update OpenAPI spec trigger repos (livepeer#143)

This commit ensures the right upstream repos are triggered in the
trigger upstream OpenAPI sync action.

* feat: improve prompt splitter (livepeer#146)

This commit ensures that an empty dict is returned by the prompt
splitter when no valid prompt was found.

* feat(T2I): add Black Forest Labs Flux 1 support (livepeer#147)

This commit adds support for the [Black Forest Labs Flux 1 Schnell](https://huggingface.co/black-forest-labs/FLUX.1-schnell) model to the T2I pipeline. It is important to note that this model can only run on GPUs with more than 33 GB of VRAM.

* refactor: fix unet load deprecation warnings (livepeer#151)

This commit fixes some unet deprecation warnings by changing the way the
stable diffusion base model is loaded.

* refactor: resolve CLIPFeatureExtractor deprecation warning (livepeer#152)

This commit resolves a CLIPFeatureExtractor deprecation warning thrown
by the NSFW check logic.

* refactor: added descriptions to the pipeline parameters. (livepeer#144)

* Added descriptions to the parameters.

All parameters needing descriptions across: A2T, I2I, I2V, T2I, and Upscale have had their descriptions added.

* The descriptions have been updated to better apply to the current implementation.

* refactor: shorten parameter descriptions

This commit shortens some of the parameter descriptions since the longer
description is found on huggingface.

* chore: update OpenAPI spec and golang bindings

This commit ensures that the OpenAPI spec and golang bindings are
updated with the new descriptions.

---------

Co-authored-by: Rick Staa <[email protected]>

* chore: apply black formatter (livepeer#153)

This commit applies the black formatter to the codebase to ensure the
code formatting is consistent.

* feat: improve OpenAPI spec generation and naming (livepeer#155)

This commit improves the naming and generation for the OpenAPI spec so
that they are easier to work with.

* chore: remove redundant OpenAPI spec (livepeer#156)

This commit removes a redundant OpenAPI spec that was introduced some
time ago.

* Revert "chore: remove redundant OpenAPI spec (livepeer#156)" (livepeer#157)

This reverts commit 41fa3f4.

* chore: remove redundant OpenAPI spec (livepeer#158)

This commit removes a redundant OpenAPI spec file that was introduced
some time ago.

* refactor: cleanup Gateway OpenAPI spec (livepeer#160)

This commit removes the health endpoint schema from the generated
Gateway OpenAPI spec.

* chore: fix flake8 errors (livepeer#159)

This commit fixes the flake8 errors that were introduced into the
codebase in the last months.

* refactor: remove old OpenAPI spec (livepeer#161)

This commit removes the old unused OpenAPI spec.

* feat: update ByteDance/SDXL-Lightning to default to 8step (livepeer#162)

* update ByteDance/SDXL-Lightning to default to 8 step unet

* update I2I to 8step default for ByteDance/SDXL-Lightning model

* feat: apply git release version to OpenAPI spec (livepeer#164)

This commit ensures that the latest git release flag is applied to the OpenAPI spec.

* refactor: add pipeline descriptions (livepeer#169)

This commit adds pipeline descriptions so that each pipeline is clearly
explained on the docs.

* refactor(openapi): replace json with yaml (livepeer#170)

This commit replaces the default OpenAPI spec with yaml.

* refactor: add response type descriptions (livepeer#171)

This commit ensures that descriptions show up for the route response
types in the docs.

* chore(worker): update go bindings (livepeer#172)

This commit updates the go bindings to include the right docstrings.

* ci: fix OpenAPI spec check action (livepeer#173)

This commit fixes the OpenAPI spec check action. This action can be used
to ensure the OpenAPI spec and go bindings are up to date.

* ci: remove manual SDK/Docs update trigger (livepeer#174)

This commit replaces the manual update trigger for the docs and SDKs by
Speakeasy actions.

* refactor: type gen_openapi file (livepeer#175)

This commit ensures that the functions in the gen_openapi file are
typed.

* chore: remove redundant OpenAPI specs (livepeer#177)

This commit removes the JSON versions of the OpenAPI spec since they are
no longer used.

* refactor: rename A2T pipeline attribute (livepeer#179)

This commit renames the self.ldm (Latent Diffusion Model) to self.tm
(Transformer model) to make the distinction clearer.

* chore: update make go-bindings generation command (livepeer#180)

This commit ensures that the make file uses the right OpenAPI spec to
generate the go bindings.

* add studio api url (livepeer#178)

* feat: add Studio Gateway

This commit adds the studio Gateway to the list of servers.

* chore: update OpenAPI spec

This commit updates the OpenAPI spec to add the Studio gateway to the
list of servers and thus the documentation.

* feat: enable multiple containers for pipeline/model_id (livepeer#148)

This commit makes the container map keys more specific, which allows
users to run multiple pipelines behind one external endpoint.

Co-authored-by: Rick Staa <[email protected]>

* feat: add OpenAPI gen version arg (livepeer#184)

This commit provides developers with a `--version` argument they can use
when generating the OpenAPI spec using the `gen_openapi.py` script.

* Segment anything 2 pipeline image (livepeer#185)


* fix(pipeline): add FLUX.1-dev and disable negative_prompt on flux (livepeer#167)

This commit adds the black-forest-labs/FLUX.1 model download commands.
The dev model is placed under the `--restricted` flag since it can not be
used for commercial purposes.

Co-authored-by: Rick Staa <[email protected]>

* chore: update OpenAPI spec version

This commit updates the version set in the OpenAPI spec.

* ci(docker): add ai-runner base Docker tag (livepeer#194)

This commit ensures that the main Docker container is also tagged as the
base container so that it can be used as the base for the pipeline
specific containers.

Co-authored-by: ad-astra-video <[email protected]>

* ci(docker): add workflow dispatch (livepeer#195)

This commit ensures that developers can trigger docker image building.

* ci(docker): ensure docker ci dispatch works (livepeer#196)

This commit ensures that the workflow dispatch triggers the docker job.

* ci: add pipeline docker ci (livepeer#193)

* chore(docker): add 'base' tag and segment-anything-2 docker image build

* update segment-anything-2 to dynamic base image

* make more space on runner

* refactor(ci): split Docker CI

This commit ensures that the pipeline docker build ci is found in a
separate action from the base.

* ci(docker): enable pipeline docker workflow dispatch

This commit ensures that maintainers can trigger the pipeline specific
Docker action using a workflow dispatch.

* ci(docker): fix out of space error

This commit switches to the oxfort runner to see if it can fix the OS
storage error.

* ci: cleanup hosted runner

This commit cleans up the hosted runner so that we don't run into OS
storage issues when trying to build the container.

---------

Co-authored-by: Brad P <[email protected]>

* ci(docker): add sam2 docker tags (livepeer#197)

This commit ensures that the SAM2 docker has the right tags.

* ci(docker): enable pipeline docker workflow dispatch (livepeer#198)

This commit ensures that maintainers can manually run the pipeline
docker ci.

* feat(sdks): implement SDK-specific API customizations (livepeer#191)

* feat(sdks): implement SDK-specific API customizations

This commit introduces several SDK-specific OpenAPI configurations to the runner
API configuration. These customizations will enhance the SDKs we are planning
to release.

Co-authored-by: Victor Elias <[email protected]>

* feat: enable speakeasy retries (livepeer#201)

This commit enables the [speakeasy
retries](https://www.speakeasy.com/docs/customize-sdks/retries#global-retries)
feature for the SDKs.

* Revert "feat: enable speakeasy retries (livepeer#201)" (livepeer#202)

This reverts commit caa4bb7.

* chore: release v0.5.0

This commit releases a new minor version since we had to revert the SDK
related changes.

* chore: update alpha to beta phase (livepeer#203)

This commit updates the code and documentation to signal we are entering
the Beta phase of the AI network journey.

* fix(runner): improve 'num_inference_steps' logic (livepeer#205)

This commit prevents a Key Error from being thrown when the pipelines
are called directly.

* fix(runner): fix benchmarking script (livepeer#206)

This commit removes the 'batch_size' argument from the benchmarking script since our current pipelines don't support batching requests, because we have no way to estimate VRAM usage and prevent out-of-memory errors. For more information see livepeer#66. We can add this option back in when we have solved this.

* readme: update with note (livepeer#207)

* docs: remove AI Realtime Video note from main branch

This commit removes the AI Realtime video warning note from the
main branch as it should have been on the
https://github.com/livepeer/ai-worker/tree/realtime-ai-experimental
branch.

---------

Co-authored-by: Rick Staa <[email protected]>
Co-authored-by: ea_superstar <[email protected]>
Co-authored-by: ad-astra-video <[email protected]>
Co-authored-by: PSchroedl <[email protected]>
Co-authored-by: Elite Encoder <[email protected]>
Co-authored-by: Peter Schroedl <[email protected]>
Co-authored-by: Brad P <[email protected]>
Co-authored-by: Victor Elias <[email protected]>
Co-authored-by: Emran M <[email protected]>