RuntimeError: CUDA error: device-side assert triggered #220

Open
hessaAlawwad opened this issue Nov 14, 2024 · 3 comments

@hessaAlawwad

Hello,
I am trying the following code to test sending multiple images:

import requests
import torch
from PIL import Image
from transformers import MllamaForConditionalGeneration, AutoProcessor

model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"

model = MllamaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(model_id)

url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/0052a70beed5bf71b92610a43a52df6d286cd5f3/diffusers/rabbit.jpg"
image = Image.open(requests.get(url, stream=True).raw)

messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "If I had to write a haiku for this one, it would be: "},
        {"type": "image"},
        {"type": "text", "text": "If I had to write a haiku for this one, it would be: "}
    ]}
]
input_text = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(
    [image, image],  # one image per {"type": "image"} placeholder, in order
    input_text,
    add_special_tokens=False,
    return_tensors="pt"
).to(model.device)

output = model.generate(**inputs, max_new_tokens=30)
print(processor.decode(output[0]))

and got the error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-23-5e73b30f8d1d> in <cell line: 34>()
     32 ).to(model.device)
     33 
---> 34 output = model.generate(**inputs, max_new_tokens=30)
     35 print(processor.decode(output[0]))

3 frames
/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py in _has_unfinished_sequences(self, this_peer_finished, synced_gpus, device, cur_len, max_length)
   2411                 if this_peer_finished_flag.item() == 0.0:
   2412                     return False
-> 2413             elif this_peer_finished:
   2414                 return False
   2415             return True

RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
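
As the message says, setting CUDA_LAUNCH_BLOCKING=1 makes the error surface at the actual failing call instead of asynchronously; a minimal sketch, assuming it runs before CUDA is initialized:

import os
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"  # must be set before any CUDA work, e.g. at the top of the notebook

# ... then re-run the snippet above to see which operation actually asserts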

How can I solve it?

@ashwinb
Contributor

ashwinb commented Nov 14, 2024

cc @init27, this is a Hugging Face-specific issue.

@init27

init27 commented Nov 14, 2024

Thanks Ashwin!
@hessaAlawwad: this is by design; for the current model, we only recommend chatting with one image per session.
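
For reference, a minimal single-image sketch, reusing the model, processor, and image already loaded in the snippet above:

# Recommended pattern: one {"type": "image"} placeholder and one image per session.
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "If I had to write a haiku for this one, it would be: "},
    ]}
]
input_text = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, input_text, add_special_tokens=False, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=30)
print(processor.decode(output[0]))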

@Sosycs

Sosycs commented Nov 15, 2024

@init27 thank you sir, but when you say "we only recommend", do you mean it is possible to chat with multiple images?
Because if so, why would we get an error? Shouldn't it just run and give weak results?
