[Model] Dynamic image size support for LLaVA-NeXT #5279

DarkLight1337 · 2024-06-05T14:39:45Z

A proof of concept showing how an input processor can insert a dynamic number of image tokens into the prompt. This is required to take full advantage of the multi-resolution feature in LLaVA-NeXT.
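To make the idea concrete, here is a minimal sketch of expanding a single image placeholder into a number of image tokens that depends on the image size. It is not the code from this PR and does not use the vLLM API. The names `select_resolution`, `num_image_tokens`, and `expand_image_tokens`, the placeholder token id, the patch and tile sizes, and the candidate resolutions are all assumptions made for illustration. The token count is deliberately simplified: the real model also removes padding and adds newline tokens, both of which are omitted here.

```python
from typing import List, Sequence, Tuple

# All constants below are illustrative assumptions, not values taken from this PR.
IMAGE_TOKEN_ID = 32000  # hypothetical id of the image placeholder token
PATCH_SIZE = 14         # assumed vision encoder patch size
TILE_SIZE = 336         # assumed vision encoder input resolution
GRID_PINPOINTS: List[Tuple[int, int]] = [  # assumed candidate (width, height) grids
    (336, 672), (672, 336), (672, 672), (1008, 336), (336, 1008),
]


def select_resolution(size: Tuple[int, int],
                      candidates: Sequence[Tuple[int, int]]) -> Tuple[int, int]:
    """Pick the candidate that preserves the most image pixels,
    breaking ties by the least unused canvas area."""
    width, height = size
    best, best_key = None, None
    for cand_w, cand_h in candidates:
        # Scale the image to fit inside the candidate, keeping aspect ratio.
        scale = min(cand_w / width, cand_h / height)
        scaled_w, scaled_h = int(width * scale), int(height * scale)
        effective = min(scaled_w * scaled_h, width * height)
        wasted = cand_w * cand_h - effective
        key = (effective, -wasted)
        if best_key is None or key > best_key:
            best, best_key = (cand_w, cand_h), key
    return best


def num_image_tokens(size: Tuple[int, int]) -> int:
    """Simplified count: one base tile plus one tile per grid cell."""
    tokens_per_tile = (TILE_SIZE // PATCH_SIZE) ** 2
    grid_w, grid_h = select_resolution(size, GRID_PINPOINTS)
    num_tiles = (grid_w // TILE_SIZE) * (grid_h // TILE_SIZE)
    return tokens_per_tile * (1 + num_tiles)


def expand_image_tokens(token_ids: Sequence[int],
                        size: Tuple[int, int]) -> List[int]:
    """Replace each image placeholder with a size-dependent run of placeholders."""
    repeat = num_image_tokens(size)
    expanded: List[int] = []
    for token_id in token_ids:
        if token_id == IMAGE_TOKEN_ID:
            expanded.extend([IMAGE_TOKEN_ID] * repeat)
        else:
            expanded.append(token_id)
    return expanded
```

Under these assumptions, a prompt with one placeholder grows by a different amount for each image size. That is why the expansion has to run per request instead of relying on a fixed token count.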

Related contributions

This PR is essentially #5276 applied on top of #4199. It will leave draft status when both dependencies are merged.

Update: Since #4199 has been merged, #5276 now supersedes this PR.

DarkLight1337 · 2024-06-11T07:05:05Z

Closing, since LLaVA-NeXT has already been added to vLLM. #5276 should be able to take care of this now.

For reference, I'm deferring the deletion of the branch for this PR until #5276 is merged.

DarkLight1337 · 2024-06-27T09:25:58Z

#5276 is in a pretty good spot now so I'll delete this branch.

DarkLight1337 added 18 commits June 3, 2024 06:34

Introduce a higher level INPUT_REGISTRY

34bfa79

Move dummy data generation to input registry

df2aa19

Update docs

c72d2b3

Rename process_input to map_input

d8c6488

Reorder arguments

f18de48

Apply input processor

653537d

Remove VisionLanguageConfig from input mapper

a2f5a3c

Fix bad use of functools.partial

378ad80

Use default input processor

7aa3778

Merge branch 'upstream' into mm-image-tokenizer

c774168

Fix wrong arguments

532f863

Use pillow image instead of tensor to avoid bypassing the processor by default

080d40c

Update interface of dummy data factory and input processor

662693a

Add input processor for injecting image tokens; fix docs

911cac7

Add new documentation pages

a38b347

Fix LLaVA-NeXT input processor and cleanup code

9cfbcce

Merge branch 'upstream' into mm-image-tokenizer-2

1dbab81

Add LLaVA-NeXT model and test

4fda1a1

This was referenced Jun 5, 2024

[RFC]: Multi-modality Support on vLLM #4194

Open

[Model] Initial support for LLaVA-NeXT #4199

Merged

DarkLight1337 closed this Jun 11, 2024

DarkLight1337 deleted the mm-image-tokenizer-llava branch June 27, 2024 09:26
