Weird output for MSVD-QA dataset #354

Open
oximi123 opened this issue Nov 28, 2024 · 1 comment
Comments

@oximi123

Hi, when I was using the following inference code (the same as provided on Hugging Face) to evaluate LLaVA-Video-7B-Qwen2 on the MSVD-QA dataset, it produced strange output, such as an endless repetition of "0" or the word "is" appearing twice in the same sentence. How can I solve this problem?

# Imports and model paths follow the Hugging Face example for LLaVA-Video-7B-Qwen2;
# load_video is the frame-sampling helper defined in that example.
import copy
from llava.model.builder import load_pretrained_model
from llava.mm_utils import tokenizer_image_token
from llava.constants import IMAGE_TOKEN_INDEX, DEFAULT_IMAGE_TOKEN
from llava.conversation import conv_templates

pretrained = "lmms-lab/LLaVA-Video-7B-Qwen2"
model_name = "llava_qwen"
device = "cuda"
device_map = "auto"

tokenizer, model, image_processor, max_length = load_pretrained_model(pretrained, None, model_name, torch_dtype="bfloat16", device_map=device_map)  # Add any other llava_model_args you need
model.eval()

# Sample frames uniformly from the video and preprocess them.
video_path = "XXXX"
max_frames_num = 64
video, frame_time, video_time = load_video(video_path, max_frames_num, 1, force_sample=True)
video = image_processor.preprocess(video, return_tensors="pt")["pixel_values"].cuda().half()
video = [video]

# Build the prompt with the timing instruction and the question.
conv_template = "qwen_1_5"  # Make sure you use the correct chat template for your model
time_instruction = f"The video lasts for {video_time:.2f} seconds, and {len(video[0])} frames are uniformly sampled from it. These frames are located at {frame_time}. Please answer the following questions related to this video."
question = DEFAULT_IMAGE_TOKEN + f"\n{time_instruction}\nPlease describe this video in detail."

conv = copy.deepcopy(conv_templates[conv_template])
conv.append_message(conv.roles[0], question)
conv.append_message(conv.roles[1], None)
prompt_question = conv.get_prompt()

# Tokenize (with the image token placeholder) and generate greedily.
input_ids = tokenizer_image_token(prompt_question, tokenizer, IMAGE_TOKEN_INDEX, return_tensors="pt").unsqueeze(0).to(device)
cont = model.generate(
    input_ids,
    images=video,
    modalities=["video"],
    do_sample=False,
    temperature=0,
    max_new_tokens=4096,
)
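
For reference, the generated ids still need to be decoded to text; a minimal sketch using the standard transformers tokenizer API, as in the Hugging Face example:

# Decode the generated token ids into a plain-text answer.
text_outputs = tokenizer.batch_decode(cont, skip_special_tokens=True)[0].strip()
print(text_outputs)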

By the way, I observe similar weird output on the VStream-QA dataset. Do I need to change the prompt, and if so, how?

@Rachel0901

I also noticed repetitive content in the output. Not sure if this is an issue related to the Qwen model.
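
If it helps narrow this down, one thing worth trying (purely as a workaround, not a confirmed fix) is passing the standard Hugging Face repetition controls through generate(); this sketch assumes LLaVA's generate forwards generation kwargs to the underlying Qwen2 language model:

# Same call as above, with repetition controls added.
cont = model.generate(
    input_ids,
    images=video,
    modalities=["video"],
    do_sample=False,
    repetition_penalty=1.1,    # standard HF generation kwarg: penalizes tokens already generated
    no_repeat_ngram_size=3,    # standard HF generation kwarg: forbids repeating any 3-gram
    max_new_tokens=1024,       # a smaller budget also caps runaway repetition
)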
