
[Usage]: Can it support using the siglip model as the vision model in a multimodal model? #7144

Closed
BrenchCC opened this issue Aug 5, 2024 · 3 comments · Fixed by #6942
Labels: usage (How to use vllm)

BrenchCC commented Aug 5, 2024

Your current environment

The output of `python collect_env.py`

How would you like to use vllm

I want to use the SigLIP model as the vision encoder in a multimodal model. I don't know how to integrate it with vLLM.

BrenchCC added the usage (How to use vllm) label on Aug 5, 2024
jeejeelee (Collaborator) commented Aug 5, 2024

FYI: #6942

ywang96 (Member) commented Aug 5, 2024

Just as a reference: generally speaking, you're welcome to import any ViT directly from transformers if the VLM uses it (though it would also be great if you ported the ViT over to vLLM). A sketch of the transformers approach is below.
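As a minimal sketch of that suggestion (this is not vLLM's actual integration code; the in-tree port is what #6942 adds), a VLM implementation can instantiate the SigLIP vision tower from transformers and feed its patch embeddings to a multimodal projector:

```python
# Minimal sketch: use transformers' SiglipVisionModel as the vision tower.
# This only illustrates the idea; it is not how vLLM wires up the encoder.
import torch
from transformers import SiglipVisionConfig, SiglipVisionModel

# A default config keeps the example self-contained; in practice the
# config/weights would come from the VLM checkpoint, e.g. via
# SiglipVisionModel.from_pretrained(...).
config = SiglipVisionConfig()
vision_tower = SiglipVisionModel(config).eval()

# pixel_values: (batch, 3, image_size, image_size), as produced by the
# model's image processor.
pixel_values = torch.randn(1, 3, config.image_size, config.image_size)
with torch.no_grad():
    outputs = vision_tower(pixel_values=pixel_values)

# (batch, num_patches, hidden_size) patch embeddings; a multimodal
# projector would map these into the language model's embedding space.
image_features = outputs.last_hidden_state
print(image_features.shape)  # torch.Size([1, 196, 768]) with defaults
```

Importing the encoder this way gets a model working quickly; porting the ViT into vLLM itself is what allows it to benefit from vLLM's own optimizations.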

BrenchCC (Author) commented Aug 5, 2024

Thank you so much.
