[WIP] Llava #2639
The overall design/change looks very good! I only made some minor style suggestions.
Make sure your changes do not break existing functions by running some unit tests here (https://github.com/lm-sys/FastChat/tree/main/tests#unit-tests-for-fastchat)
- Rename fastchat/serve/examples/dog.jpeg -> fastchat/serve/example_images/dog.jpeg
- Delete playground/images/python.png, playground/images/sunset.jpg
Do we need to add an additional dependency? I got the below error for
You can add new dependency here Line 22 in 99d19ac
model_worker or create a new tag vision =
Sorry, could you resolve the conflicts?
if image_file.startswith("http://") or image_file.startswith("https://"):
    response = requests.get(image_file)
    image = Image.open(BytesIO(response.content)).convert("RGB")
elif base64.b64encode(base64.b64decode(image_file)) == image_file.encode():
If the input is not in the base64 format, this line will raise an error. Maybe swap the last two branches, or catch the error?
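One way to act on this suggestion is to catch the decoding error so that non-base64 input falls through to the file-path branch. The sketch below is only an illustration, not the code from this PR: the helper names `is_base64` and `classify_image_source` are hypothetical, and it only decides which branch applies rather than loading the image.

```python
import base64
import binascii


def is_base64(data: str) -> bool:
    """Return True if `data` is a non-empty, canonical base64 string.

    Hypothetical helper: decoding errors are caught instead of propagating.
    """
    if not data:
        return False
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently discarding them.
        decoded = base64.b64decode(data, validate=True)
    except (binascii.Error, ValueError):
        # binascii.Error: bad characters or padding; ValueError: non-ASCII text.
        return False
    # Same round-trip comparison as in the diff above.
    return base64.b64encode(decoded) == data.encode()


def classify_image_source(image_file: str) -> str:
    """Decide how `image_file` should be loaded: 'url', 'base64', or 'path'."""
    if image_file.startswith(("http://", "https://")):
        return "url"
    if is_base64(image_file):
        return "base64"
    # Anything else is treated as a local file path.
    return "path"
```

Note that a short file name made only of base64 characters (for example `data`) would still be classified as base64, so checking whether the path exists on disk before the base64 branch may be the safer ordering.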
@BabyChouSr this is an awesome change, thank you for contributing this. I was trying to test it by running on CPU, but I get the following error when running the test message.
Closing this PR for now. This PR addresses many concerns, including: 1. Multimodal support 2. Gradio web server for multimodal models 3. Support for Huggingface multimodal models 4. GPT-4-V support. I will use this PR as a reference and decompose it into separate PRs.
Why are these changes needed?
Provide the ability to interact with multimodal models.
TODO:
CLI commands:
Testing multimodal worker:
Example output (with --max-new-tokens 256):
![image](https://private-user-images.githubusercontent.com/49086305/280859023-2b8affcb-2da1-46c9-a0c5-5c5e3fca3f6a.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzg3NzcwMzIsIm5iZiI6MTczODc3NjczMiwicGF0aCI6Ii80OTA4NjMwNS8yODA4NTkwMjMtMmI4YWZmY2ItMmRhMS00NmM5LWEwYzUtNWM1ZTNmY2EzZjZhLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMDUlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjA1VDE3MzIxMlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTA4OGRhMzllMTFkZWI4NjljOGNhNzk5ZTdlOGFmYWVhMWU1YTFmMGUxMmM5NTgzYzYzY2Q5NTg1ZDY5Y2YzZWQmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.AKcaYtkIgO14nNIkmRsaqhwKvM1XAHvK7iiiw0NfNRI)
Gradio interface:
![image](https://private-user-images.githubusercontent.com/49086305/280887233-4a911ba0-a5f0-469d-976a-b97c4bf337c4.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzg3NzcwMzIsIm5iZiI6MTczODc3NjczMiwicGF0aCI6Ii80OTA4NjMwNS8yODA4ODcyMzMtNGE5MTFiYTAtYTVmMC00NjlkLTk3NmEtYjk3YzRiZjMzN2M0LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMDUlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjA1VDE3MzIxMlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWIzOTRjNGVlZDFjM2NjNDc0YjA1OGQ0YjYwYWYzYjAzMDM5YzhlNWFhZjg0MzBkZDRkMGRlZThmNDViODlhNDYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.s4-pHR12rx_5I-KQUu86g4hNdyMmGLbn6KDdw3SUVkg)
![image](https://private-user-images.githubusercontent.com/49086305/280889046-08329a04-71fd-430a-8a48-82e5e304ab90.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzg3NzcwMzIsIm5iZiI6MTczODc3NjczMiwicGF0aCI6Ii80OTA4NjMwNS8yODA4ODkwNDYtMDgzMjlhMDQtNzFmZC00MzBhLThhNDgtODJlNWUzMDRhYjkwLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMDUlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjA1VDE3MzIxMlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTg1ZGUyZjM2OTlhNWU3Zjk2YmMwMTY3OTk2MzhmYzZkYzA4YzExMjgzNTViOTc5MzFkZTU5YWVmYzllYjZlOGYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.NWm78upEE5KWe2vM--8nKebwfTb8sxGLEErZcKfnUXs)
Unit test outputs:
Things to think about:
Checks
- Run format.sh to lint the changes in this PR.