[Model][LoRA]LoRA support added for MolmoForCausalLM #11439

ayylemao · 2024-12-23T16:14:24Z

[Model][LoRA]LoRA support added for MolmoForCausalLM
LoRA supported for inference in MolmoForCausalLM

I need some help verifying that this works correctly, since my own tests produced the same results for base_model and base_model + lora.

ping @jeejeelee

FIX #11431

github-actions · 2024-12-23T16:14:35Z

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on Buildkite UI (linked in the PR checks section) and unblock them. If you do not have permission to unblock, ping simon-mo or khluu to add you in our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

Add ready label to the PR
Enable auto-merge.

🚀

jeejeelee · 2024-12-24T00:55:27Z

Need some help with verifying this works correctly since my own tests revealed the same results for base_model and base_model+ lora.

I roughly know what the reason is. I will resolve it today. We also need to update the documentation; see https://docs.vllm.ai/en/latest/models/supported_models.html#id3

mergify · 2024-12-24T08:36:44Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @ayylemao.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

ayylemao · 2024-12-24T08:49:40Z

I might have goofed by syncing my fork and pushing to the PR branch.
I fixed what I could; sorry for all the review requests...

mergify · 2024-12-24T13:06:36Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @ayylemao.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

jeejeelee · 2024-12-25T01:46:37Z

@ayylemao could you please resolve the branch conflict?

vllm/model_executor/models/molmo.py

ayylemao · 2024-12-25T14:27:39Z

Thank you for your review.

I will get back to it after Christmas!

jeejeelee · 2024-12-27T02:16:09Z

@ayylemao I've added modules with LoRA support. Could you test if it generates reasonable results?

ayylemao · 2024-12-27T09:11:59Z

@jeejeelee After testing, I still get the same results for the base model and base_model + lora when serving via vllm serve.
I think my LoRA adapter is still not being applied correctly, even though it works as expected when evaluated via transformers + peft.

I am not sure how to test more accurately whether the LoRA adapter is applied correctly, beyond just comparing the actual model output.
Do you have any idea what the issue might be, or how to pinpoint it further?
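
One way to narrow this down, beyond comparing generations, is to check whether the modules the adapter targets are among the modules the serving side wraps with LoRA layers. The sketch below is a minimal, hypothetical diagnostic, not part of this PR: it assumes a PEFT-style `adapter_config.json` whose `target_modules` field is a list of module-name suffixes, and `supported_modules` is a placeholder set that would have to be filled in from the model implementation. The function name is made up for illustration.

```python
import json
from pathlib import Path


def uncovered_target_modules(adapter_dir, supported_modules):
    """Return adapter target modules that the serving side would not wrap.

    Assumes a PEFT-style adapter_config.json in which `target_modules`
    is a list of module-name suffixes (PEFT also allows a regex string,
    which this sketch does not handle).
    """
    config = json.loads((Path(adapter_dir) / "adapter_config.json").read_text())
    targets = config.get("target_modules") or []
    if isinstance(targets, str):
        raise ValueError("regex-style target_modules is not handled here")
    supported = set(supported_modules)
    # Compare on the last dotted component so that "q_proj" and
    # "model.layers.0.attn.q_proj" are treated the same way.
    return sorted(t for t in targets if t.split(".")[-1] not in supported)
```

A non-empty result would suggest that some adapter weights have no matching LoRA layer in the served model, which is one plausible way for base and base + LoRA outputs to come out identical. The module names passed in are up to the caller; none are assumed here to be Molmo's actual names.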

jeejeelee · 2024-12-27T16:07:49Z

I'm training a LoRA model for this model and will test it tomorrow.

jeejeelee · 2024-12-30T02:56:28Z

@ayylemao After merging with #11551, I tested 100 samples: 99 of the generated results matched the transformers output exactly, and the remaining one was also a reasonable result.
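
A comparison like the one described above can be reduced to an exact-match count over paired outputs. The following is only a sketch of that bookkeeping: how the two lists of generations are produced (vLLM with the adapter on one side, transformers + peft on the other) is outside the snippet, and the function name is hypothetical.

```python
def exact_match_report(vllm_outputs, reference_outputs):
    """Compare two equally long lists of generated strings.

    Returns (match_count, mismatch_indices) so that mismatching samples
    can be inspected by hand.
    """
    if len(vllm_outputs) != len(reference_outputs):
        raise ValueError("both lists must contain one output per sample")
    mismatches = [
        i for i, (a, b) in enumerate(zip(vllm_outputs, reference_outputs))
        if a.strip() != b.strip()  # ignore leading/trailing whitespace only
    ]
    return len(vllm_outputs) - len(mismatches), mismatches
```

Returning the mismatch indices rather than only a rate makes it easy to look at the few differing samples and judge whether they are still reasonable.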

ayylemao · 2024-12-30T09:06:15Z

Thank you for testing.

After your merge with #11551, I also get correct results!
Thank you very much for your help with this PR.

jeejeelee

@ayylemao Thanks for your contribution

ayylemao · 2024-12-30T17:36:45Z

Do I still need to do something here because of the failing buildkite/ci/pr/lora-test-1 check?
It seems that a Qwen2-VL test has failed.

[2024-12-30T10:24:42Z] =================================== FAILURES ===================================
[2024-12-30T10:24:42Z] ______________________________ test_qwen2vl_lora _______________________________
[2024-12-30T10:24:42Z]
[2024-12-30T10:24:42Z] qwen2vl_lora_files = '/root/.cache/huggingface/hub/models--jeeejeee--qwen2-vl-lora-pokemon/snapshots/6c85687748763f7449fa15c345520b43ba6e522f'
[2024-12-30T10:24:42Z]
[2024-12-30T10:24:42Z]     @pytest.mark.xfail(current_platform.is_rocm(),
[2024-12-30T10:24:42Z]                        reason="Qwen2-VL dependency xformers incompatible with ROCm"
[2024-12-30T10:24:42Z]                        )
[2024-12-30T10:24:42Z]     def test_qwen2vl_lora(qwen2vl_lora_files):
[2024-12-30T10:24:42Z]         llm = vllm.LLM(
[2024-12-30T10:24:42Z]             MODEL_PATH,
[2024-12-30T10:24:42Z]             max_num_seqs=2,
[2024-12-30T10:24:42Z]             enable_lora=True,
[2024-12-30T10:24:42Z]             max_loras=2,
[2024-12-30T10:24:42Z]             max_lora_rank=16,
[2024-12-30T10:24:42Z]             trust_remote_code=True,
[2024-12-30T10:24:42Z]             mm_processor_kwargs={
[2024-12-30T10:24:42Z]                 "min_pixels": 28 * 28,
[2024-12-30T10:24:42Z]                 "max_pixels": 1280 * 28 * 28,
[2024-12-30T10:24:42Z]             },
[2024-12-30T10:24:42Z]             max_model_len=4096,
[2024-12-30T10:24:42Z]         )
[2024-12-30T10:24:42Z]         output1 = do_sample(llm, qwen2vl_lora_files, lora_id=1)
[2024-12-30T10:24:42Z]         for i in range(len(EXPECTED_OUTPUT)):
[2024-12-30T10:24:42Z] >           assert EXPECTED_OUTPUT[i].startswith(output1[i])
[2024-12-30T10:24:42Z] E           AssertionError: assert False
[2024-12-30T10:24:42Z] E            +  where False = <built-in method startswith of str object at 0x7f462c0639f0>('A vibrant street scene with')
[2024-12-30T10:24:42Z] E            +    where <built-in method startswith of str object at 0x7f462c0639f0> = 'A red stop sign stands prominently in the foreground, with a traditional Chinese gate and a black SUV in the background, illustrating a blend of modern and cultural elements.'.startswith
[2024-12-30T10:24:42Z]
[2024-12-30T10:24:42Z] lora/test_qwen2vl.py:78: AssertionError
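
For context on the assertion itself: the test passes only if each generated string is a prefix of the corresponding expected string. A minimal reproduction of the failing comparison, using the two strings from the log above (the variable names are chosen here for illustration):

```python
# Strings copied from the CI log above.
expected = (
    "A red stop sign stands prominently in the foreground, with a "
    "traditional Chinese gate and a black SUV in the background, "
    "illustrating a blend of modern and cultural elements."
)
generated = "A vibrant street scene with"

# The test asserts that the generated text is a prefix of the expected text.
is_prefix = expected.startswith(generated)
print(is_prefix)  # False
```

Because the generated text diverges from the expected text after the first word, the prefix check fails regardless of how long the generation is.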

jeejeelee · 2024-12-31T01:15:40Z

@ayylemao This is a known issue and is not related to this PR.

DarkLight1337 requested a review from jeejeelee December 23, 2024 17:37

mergify bot added the documentation and needs-rebase labels Dec 24, 2024

ayylemao force-pushed the molmo_lora branch from f7d0dcb to f1bf4fe Compare December 24, 2024 08:38

ayylemao requested review from mgoin, robertgshaw2-neuralmagic, DarkLight1337, simon-mo, WoosukKwon, njhill, ywang96, comaniac, alexm-neuralmagic, zhuohan123 and youkaichao as code owners December 24, 2024 08:47

mergify bot added ci/build frontend labels Dec 24, 2024

Matthias Vogler added 3 commits December 24, 2024 09:54

added molmo lora

5f4bbf7

Signed-off-by: Matthias Vogler <[email protected]>

added Molmo Lora Support

1ca7a96

Signed-off-by: Matthias Vogler <[email protected]>

format code, edit docs

2bb466c

Signed-off-by: Matthias Vogler <[email protected]>

ayylemao force-pushed the molmo_lora branch from 95f8036 to 2bb466c Compare December 24, 2024 08:56

mergify bot removed the needs-rebase label Dec 24, 2024

format code

037bec8

Signed-off-by: Matthias Vogler <[email protected]>

jeejeelee removed request for mgoin, comaniac and zhuohan123 December 24, 2024 12:38

jeejeelee removed the request for review from njhill December 24, 2024 12:38

mergify bot added the needs-rebase label Dec 24, 2024

jeejeelee reviewed Dec 25, 2024

vllm/model_executor/models/molmo.py (outdated, resolved)

ayylemao added 2 commits December 26, 2024 13:16

Merge remote-tracking branch 'upstream/main' into molmo_lora

245aa8e

obtain lora_config directly from vllm_config

3952737

Signed-off-by: Matthias Vogler <[email protected]>

ayylemao marked this pull request as draft December 26, 2024 12:31

mergify bot removed the needs-rebase label Dec 26, 2024

ayylemao added 2 commits December 26, 2024 13:37

format code

c81e868

clean up code

fa01a17

ayylemao marked this pull request as ready for review December 26, 2024 12:59

jeejeelee added 2 commits December 27, 2024 02:02

Merge branch 'main' of https://github.com/vllm-project/vllm into molmo_lora

06c15c1

Add modules

b2c16c7

Signed-off-by: Jee Jee Li <[email protected]>

format

789c888

Signed-off-by: Jee Jee Li <[email protected]>

Sync main

2135d63

Signed-off-by: Jee Jee Li <[email protected]>

jeejeelee approved these changes Dec 30, 2024

jeejeelee enabled auto-merge (squash) December 30, 2024 10:08

github-actions bot added the ready label Dec 30, 2024

simon-mo disabled auto-merge December 31, 2024 01:33

simon-mo merged commit a2a40bc into vllm-project:main Dec 31, 2024
68 of 71 checks passed
