[WIP] Llava #2639

Closed · wants to merge 41 commits

Conversation

@BabyChouSr (Collaborator) commented Nov 4, 2023

Why are these changes needed?

Provide the ability to interact with multimodal models.

TODO:

  • adapter.conversation_template is changed
  • adapter.load_model works
  • test text generation on CLI
  • test image prompting on CLI
  • test multimodal worker endpoint
  • add gradio demo

CLI commands:

python3 -m fastchat.serve.cli --model-path llava-hf/llava-1.5-7b-hf --debug

Testing multimodal worker:

python3 -m fastchat.serve.controller
python3 -m fastchat.serve.model_worker --model-path llava-hf/llava-1.5-7b-hf --multimodal
python3 -m fastchat.serve.test_message --model-name llava-hf/llava-1.5-7b-hf --max-new-tokens 256

Example output (with --max-new-tokens 256): (screenshot)

Gradio interface: (screenshots)

Unit test outputs: (screenshot)

Things to think about:

  • I created a separate file for the vision models. Is there a smarter way to do this? Only some functions change, but gradio_web_server_vision and gradio_web_server share a lot of functionality.
  • How do I get the model list for vision models?
  • Should we keep the ability to process images (pad, resize, crop)? (Deleted for now; a sketch of this kind of preprocessing follows this list.)
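
For context, a minimal PIL sketch of the kind of pad/resize/crop preprocessing the last question refers to. This is not code from this PR; the function name and target size are illustrative only.

from PIL import Image, ImageOps

def preprocess_image(image: Image.Image, size: int = 336) -> Image.Image:
    # Pad the shorter side so the image becomes square without distorting it.
    side = max(image.size)
    padded = ImageOps.pad(image, (side, side), color=(0, 0, 0))
    # Resize to the model's expected input resolution.
    resized = padded.resize((size, size))
    # Center-crop to the final size (a no-op here, since the image is already square).
    return ImageOps.fit(resized, (size, size))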

Checks

  • I've run format.sh to lint the changes in this PR.
  • I've included any doc changes needed.
  • I've made sure the relevant tests are passing (if applicable).

@BabyChouSr BabyChouSr marked this pull request as ready for review November 7, 2023 01:25
@BabyChouSr BabyChouSr requested a review from merrymercy November 7, 2023 01:29
@merrymercy (Member) left a comment:

The overall design/change looks very good! I only made some minor style suggestions.

Make sure your changes do not break existing functions by running some unit tests here (https://github.com/lm-sys/FastChat/tree/main/tests#unit-tests-for-fastchat)

Review threads (resolved): fastchat/constants.py, fastchat/conversation.py, fastchat/model/model_adapter.py, fastchat/serve/gradio_web_server_vision.py, fastchat/serve/inference.py (two threads)
@merrymercy (Member) left a comment:

  1. Rename fastchat/serve/examples/dog.jpeg -> fastchat/serve/example_images/dog.jpeg
  2. Delete playground/images/python.png, playground/images/sunset.jpg

Review threads (resolved): fastchat/conversation.py (two threads), fastchat/serve/gradio_web_server_vision.py (two threads), fastchat/utils.py, fastchat/serve/multimodal_model_worker.py (two threads)
@infwinston (Member):

Do we need to add an additional dependency? I got the error below when running python3 -m fastchat.serve.cli --model-path liuhaotian/llava-v1.5-7b --multimodal --debug

Traceback (most recent call last):
  File "/opt/conda/envs/chatbot/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/conda/envs/chatbot/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/gcpuser/sky_workdir/FastChat/fastchat/serve/cli.py", line 310, in <module>
    main(args)
  File "/home/gcpuser/sky_workdir/FastChat/fastchat/serve/cli.py", line 226, in main
    chat_loop(
  File "/home/gcpuser/sky_workdir/FastChat/fastchat/serve/inference.py", line 371, in chat_loop
    model, tokenizer, image_processor = load_model(
  File "/home/gcpuser/sky_workdir/FastChat/fastchat/model/model_adapter.py", line 321, in load_model
    model, tokenizer, image_processor = adapter.load_model(model_path, kwargs)
  File "/home/gcpuser/sky_workdir/FastChat/fastchat/model/model_adapter.py", line 1870, in load_model
    vision_tower.load_model()
  File "/home/gcpuser/sky_workdir/FastChat/fastchat/model/llava/multimodal_encoder/clip_encoder.py", line 23, in load_model
    self.image_processor = CLIPImageProcessor.from_pretrained(
  File "/opt/conda/envs/chatbot/lib/python3.9/site-packages/transformers/utils/import_utils.py", line 1259, in __getattribute__
    requires_backends(cls, cls._backends)
  File "/opt/conda/envs/chatbot/lib/python3.9/site-packages/transformers/utils/import_utils.py", line 1247, in requires_backends
    raise ImportError("".join(failed))
ImportError:
CLIPImageProcessor requires the PIL library but it was not found in your environment. You can install it with pip:
`pip install pillow`. Please note that you may need to restart your runtime after installation.

@merrymercy (Member):

You can add new dependencies here:

model_worker = ["accelerate>=0.21", "peft", "sentencepiece", "torch", "transformers>=4.31.0", "protobuf"]

either by adding them to model_worker or by creating a new vision = tag.
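
For illustration, one possible shape for such a tag, mirroring the line above; only pillow is known to be needed from the traceback above, so any other entries would be additions:

vision = ["pillow"]

which could then be installed with pip3 install -e ".[vision]".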

@infwinston (Member):

Sorry, could you resolve the conflicts?

if image_file.startswith("http://") or image_file.startswith("https://"):
    response = requests.get(image_file)
    image = Image.open(BytesIO(response.content)).convert("RGB")
elif base64.b64encode(base64.b64decode(image_file)) == image_file.encode():
Review comment (Member):

If the input is not in the base64 format, this line will raise an error. Maybe swap the last two branches, or catch the error?
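
For illustration, a minimal sketch (not the code in this PR; the helper name is made up) that catches the decode error and falls back to treating the string as a local file path:

import base64
import binascii
from io import BytesIO

import requests
from PIL import Image

def load_image(image_file: str) -> Image.Image:
    if image_file.startswith("http://") or image_file.startswith("https://"):
        response = requests.get(image_file)
        return Image.open(BytesIO(response.content)).convert("RGB")
    try:
        # validate=True makes b64decode raise on non-base64 input
        # instead of silently discarding invalid characters.
        decoded = base64.b64decode(image_file, validate=True)
        if base64.b64encode(decoded) == image_file.encode():
            return Image.open(BytesIO(decoded)).convert("RGB")
    except (binascii.Error, ValueError):
        pass
    # Otherwise, treat the string as a local file path.
    return Image.open(image_file).convert("RGB")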

@shaunaa126:

@BabyChouSr this is an awesome change, thank you for contributing it. I was trying to test this on CPU, but I get the following error when running the test message.

python -m fastchat.serve.model_worker --model-path llava-1.5-7b-hf --multimodal --device cpu

2023-12-29 11:48:35 | INFO | model_worker | args: Namespace(host='localhost', port=21002, worker_address='http://localhost:21002', controller_address='http://localhost:21001', model_path='llava-1.5-7b-hf', revision='main', device='cpu', gpus=None, num_gpus=1, max_gpu_memory=None, dtype=None, load_8bit=False, cpu_offloading=False, gptq_ckpt=None, gptq_wbits=16, gptq_groupsize=-1, gptq_act_order=False, awq_ckpt=None, awq_wbits=16, awq_groupsize=-1, enable_exllama=False, exllama_max_seq_len=4096, exllama_gpu_split=None, exllama_cache_8bit=False, enable_xft=False, xft_max_seq_len=4096, xft_dtype=None, model_names=None, conv_template=None, embed_in_truncate=False, limit_worker_concurrency=5, stream_interval=2, no_register=False, seed=None, debug=False, ssl=False, multimodal=True)
2023-12-29 11:48:35 | INFO | model_worker | Loading the model ['llava-1.5-7b-hf'] on worker 2082b194 ...
Loading checkpoint shards:   0%|                                                   | 0/3 [00:00<?, ?it/s]
Loading checkpoint shards: 100%|███████████████████████████████████████████| 3/3 [00:00<00:00, 22.60it/s]
Loading checkpoint shards: 100%|███████████████████████████████████████████| 3/3 [00:00<00:00, 22.55it/s]
2023-12-29 11:48:35 | ERROR | stderr | 
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
2023-12-29 11:48:35 | INFO | model_worker | Register to controller
2023-12-29 11:48:35 | ERROR | stderr | INFO:     Started server process [174724]
2023-12-29 11:48:35 | ERROR | stderr | INFO:     Waiting for application startup.
2023-12-29 11:48:35 | ERROR | stderr | INFO:     Application startup complete.
2023-12-29 11:48:35 | ERROR | stderr | INFO:     Uvicorn running on http://localhost:21002 (Press CTRL+C to quit)

python -m fastchat.serve.test_message --model-name llava-1.5-7b-hf --max-new-tokens 256

Models: ['llava-1.5-7b-hf']
worker_addr: http://localhost:21002
USER: Tell me a story with more than 1000 words.
ASSISTANT: **NETWORK ERROR DUE TO HIGH TRAFFIC. PLEASE REGENERATE OR REFRESH THIS PAGE.**

("addmm_impl_cpu_" not implemented for 'Half')

@BabyChouSr (Collaborator, Author):

Closing this PR for now. It covers several concerns at once: 1. multimodal support, 2. a Gradio web server for multimodal models, 3. support for Hugging Face multimodal models, and 4. GPT-4-V support. I will use it as a reference and decompose the work into separate PRs.

@BabyChouSr BabyChouSr closed this Jan 19, 2024