support llava1.5 lora finetuning. #1487

Open
wants to merge 8 commits into base: main

Conversation

lkk12014402
Contributor

What does this PR do?

add llava1.5 finetuning and add an example

@lkk12014402 lkk12014402 requested a review from regisss as a code owner November 14, 2024 17:32
@lkk12014402
Contributor Author

lkk12014402 commented Nov 14, 2024

performance comparison

finetuning on gaudi

Before optimization

image

After optimization (this pr)

image

finetuning on a100

image

@yao-matrix
Contributor

@libinta, please help review. Validated as passing on the 1.19.0-410 build, thanks.


The code quality check failed, please run make style.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@yafshar
Contributor

yafshar commented Dec 2, 2024

@lkk12014402 thanks for your contribution.

Can you explain why you created a new script, run_llava_lora_finetune.py, which is very similar to run_image2text_lora_finetune.py, instead of modifying the original script? Introducing new code can increase the possibility of errors and make maintenance more challenging.

@yafshar
Contributor

yafshar commented Dec 2, 2024

@lkk12014402 in your performance comparison above, what kind of optimization did you do? I do not see any optimization. Can you elaborate?

@lkk12014402
Contributor Author

> @lkk12014402 thanks for your contribution.
>
> Can you explain why you created a new script, run_llava_lora_finetune.py, which is very similar to run_image2text_lora_finetune.py, instead of modifying the original script? Introducing new code can increase the possibility of errors and make maintenance more challenging.

Hi, LLaVA's Processor, DataCollator, evaluation code, and padding style all differ from those in run_image2text_lora_finetune.py. If we merged the LLaVA code into run_image2text_lora_finetune.py, we would have to add model-specific if-else branches to that file. I also think each model may need its own Processor and DataCollator, or its own suitable dataset, so a separate script is clearer.
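For illustration only, here is a minimal sketch of how the branching concern could be contained if the two scripts were merged: a lookup table keyed by model type instead of scattered if-else checks. The names `COLLATORS`, `get_collator`, `default_collate`, and `llava_collate` are hypothetical and do not come from this PR, and the collators are placeholders that only report which path was taken.

```python
from typing import Callable, Dict, List

# A collator takes a list of examples and returns one batch dictionary.
Collator = Callable[[List[dict]], dict]


def default_collate(examples: List[dict]) -> dict:
    # Placeholder for the collator already used by the generic script.
    return {"kind": "default", "size": len(examples)}


def llava_collate(examples: List[dict]) -> dict:
    # Placeholder for a LLaVA-specific collator (hypothetical).
    return {"kind": "llava", "size": len(examples)}


# Hypothetical registry: model type -> collator. Adding a model means adding
# one entry here rather than another if-else branch in the training script.
COLLATORS: Dict[str, Collator] = {"llava": llava_collate}


def get_collator(model_type: str) -> Collator:
    # Fall back to the generic collator for model types without an entry.
    return COLLATORS.get(model_type, default_collate)


collate_fn = get_collator("llava")
batch = collate_fn([{"text": "a"}, {"text": "b"}])
```

Whether this is preferable to a separate script depends on how much of the Processor and evaluation code also differs per model; the sketch only covers the collator.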

@lkk12014402
Contributor Author

lkk12014402 commented Dec 4, 2024

> @lkk12014402 in your performance comparison above, what kind of optimization did you do? I do not see any optimization. Can you elaborate?

Padding inputs to a static shape in the training data collator.
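A rough sketch of what static-shape padding in a collator can look like, using plain Python lists so it runs without any framework. This is not the PR's implementation: the function name `pad_to_static_shape`, the right-side padding, the default `pad_token_id=0`, and the choice of `max_length` are all assumptions made for the example. The idea is that every batch comes out with the same sequence length, so the device does not see a new input shape on each step.

```python
from typing import Dict, List

IGNORE_INDEX = -100  # label value conventionally ignored by the loss


def pad_to_static_shape(
    features: List[Dict[str, List[int]]],
    max_length: int,
    pad_token_id: int = 0,  # assumed pad id, for illustration only
) -> Dict[str, List[List[int]]]:
    """Pad (or truncate) every sample to exactly max_length tokens."""
    batch: Dict[str, List[List[int]]] = {
        "input_ids": [],
        "attention_mask": [],
        "labels": [],
    }
    for feature in features:
        ids = feature["input_ids"][:max_length]
        # Fall back to the input ids when no explicit labels are given.
        labels = feature.get("labels", feature["input_ids"])[:max_length]
        n_pad = max_length - len(ids)
        # Right padding (an assumption; the PR may pad differently).
        batch["input_ids"].append(ids + [pad_token_id] * n_pad)
        batch["attention_mask"].append([1] * len(ids) + [0] * n_pad)
        batch["labels"].append(labels + [IGNORE_INDEX] * (max_length - len(labels)))
    return batch


samples = [{"input_ids": [5, 6, 7]}, {"input_ids": [8, 9, 10, 11, 12, 13]}]
batch = pad_to_static_shape(samples, max_length=5)
# Every row now has length 5, regardless of the original sample length.
```

The trade-off is wasted compute on padding tokens when `max_length` is much larger than the typical sample, so the value is usually chosen from the dataset's length distribution.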

@yafshar
Contributor

yafshar commented Dec 5, 2024

@lkk12014402 can you also add the test results? Does your change break any unit tests (test_image_to_text_example)?

>>> RUN_SLOW=true GAUDI2_CI=1 python -m pytest tests/test_image_to_text_example.py -v -s -k llava

before and after your changes?

@yafshar
Contributor

yafshar commented Dec 5, 2024

I ran the test

main

>>> RUN_SLOW=true GAUDI2_CI=1 python -m pytest tests/test_image_to_text_example.py -v -s -k llava

10 passed, 4 deselected in 4419.67s (1:13:39)

this PR

>>> RUN_SLOW=true GAUDI2_CI=1 python -m pytest tests/test_image_to_text_example.py -v -s -k llava

10 passed, 4 deselected in 2320.45s (0:38:40)
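To put the two wall-clock times above side by side, the arithmetic can be sketched as follows. The helper names `speedup` and `time_saved_pct` are made up for this example, and the numbers come from a single run of each branch, so run-to-run variation is not accounted for.

```python
def speedup(baseline_s: float, candidate_s: float) -> float:
    """How many times faster the candidate run is than the baseline."""
    return baseline_s / candidate_s


def time_saved_pct(baseline_s: float, candidate_s: float) -> float:
    """Percentage of the baseline wall-clock time that was saved."""
    return 100.0 * (baseline_s - candidate_s) / baseline_s


main_s = 4419.67  # main: 10 passed, 4 deselected (1:13:39)
pr_s = 2320.45    # this PR: 10 passed, 4 deselected (0:38:40)

print(f"speedup: {speedup(main_s, pr_s):.2f}x")            # speedup: 1.90x
print(f"time saved: {time_saved_pct(main_s, pr_s):.1f}%")  # time saved: 47.5%
```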

Contributor

@yafshar yafshar left a comment


I am still not sure about adding a new script but other than that it LGTM!

@regisss would you please check this PR.

@lkk12014402 lkk12014402 closed this Dec 6, 2024
@lkk12014402 lkk12014402 reopened this Dec 6, 2024
@lkk12014402
Contributor Author

> I ran the test
>
> main
>
> >>> RUN_SLOW=true GAUDI2_CI=1 python -m pytest tests/test_image_to_text_example.py -v -s -k llava
>
> 10 passed, 4 deselected in 4419.67s (1:13:39)
>
> this PR
>
> >>> RUN_SLOW=true GAUDI2_CI=1 python -m pytest tests/test_image_to_text_example.py -v -s -k llava
>
> 10 passed, 4 deselected in 2320.45s (0:38:40)

Thanks~ @yafshar, I will merge run_llava_lora_finetune.py into run_image2text_lora_finetune.py.

@lkk12014402
Contributor Author

> I am still not sure about adding a new script but other than that it LGTM!
>
> @regisss would you please check this PR.

Hi @yafshar, I have merged the two scripts.

Thanks~

@yafshar
Contributor

yafshar commented Dec 6, 2024

@lkk12014402 thanks. Can you run the full test suite to make sure your changes did not break anything?

>>> RUN_SLOW=true GAUDI2_CI=1 python -m pytest tests/test_image_to_text_example.py -v -s 

@yafshar
Contributor

yafshar commented Dec 6, 2024

OK, I ran the tests. They seem to be fine. Would you please try some other examples?

main branch

>>> RUN_SLOW=true GAUDI2_CI=1 python -m pytest tests/test_image_to_text_example.py -v -s 
14 passed in 1957.63s (0:32:37)

this PR

>>> RUN_SLOW=true GAUDI2_CI=1 python -m pytest tests/test_image_to_text_example.py -v -s 
14 passed in 1428.22s (0:23:48)

@lkk12014402
Contributor Author

hi, @regisss please review~

Contributor

@vidyasiv vidyasiv left a comment


Based on regisss's feedback: please add or update the relevant test(s) for the new script/model: https://github.com/huggingface/optimum-habana/blob/main/tests/test_image_to_text_example.py

5 participants