A Generation Error #474

Open
tomato996 opened this issue Feb 13, 2025 · 0 comments
tomato996 commented Feb 13, 2025

My model has run into a strange problem. After the model is loaded, the first few inferences produce normal output. After a few more inferences, however, the inference time suddenly increases and the model outputs meaningless tokens like this:

Image

I don't think this is a data problem, because the same query sometimes gets a normal response and sometimes doesn't.

Here is my code for loading the model and running inference:

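import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
# Excerpt from a class: load the model in bfloat16 and move it to the GPU
ckpt_path = "/ceph_data/szy/internlm-xcomposer2d5-7B"
self.model = AutoModelForCausalLM.from_pretrained(ckpt_path, torch_dtype=torch.bfloat16, trust_remote_code=True).cuda()
self.tokenizer = AutoTokenizer.from_pretrained(ckpt_path, trust_remote_code=True)
# Run inference under float16 autocast with beam search (3 beams, no sampling)
with torch.autocast(device_type='cuda', dtype=torch.float16):
    response, _ = self.model.chat(self.tokenizer, query, image, do_sample=False, num_beams=3, use_meta=True)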
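
Since the same query sometimes succeeds and sometimes fails, one way to narrow this down might be to call the model repeatedly with a single fixed query and record, for each call, the latency and a rough measure of how repetitive the output is. The following is a minimal sketch under stated assumptions, not part of the original code: `repetition_ratio` and `probe` are hypothetical helper names, `generate` is a hypothetical wrapper around the `self.model.chat(...)` call above, and splitting on whitespace is only a crude stand-in for real tokenization.

```python
import time
from collections import Counter
from typing import Callable, List, Tuple


def repetition_ratio(text: str, n: int = 3) -> float:
    """Fraction of word n-grams that duplicate an earlier n-gram.

    Values close to 1.0 suggest the output has collapsed into a loop.
    Assumption: whitespace splitting approximates token boundaries.
    """
    words = text.split()
    if len(words) < n:
        return 0.0
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    counts = Counter(ngrams)
    # Every occurrence beyond the first counts as a repeat.
    repeated = sum(c - 1 for c in counts.values())
    return repeated / len(ngrams)


def probe(generate: Callable[[str], str], query: str,
          runs: int = 10) -> List[Tuple[int, float, float]]:
    """Call `generate` on the same query `runs` times.

    Returns one (run index, seconds elapsed, repetition ratio) tuple per call.
    """
    results = []
    for i in range(runs):
        start = time.perf_counter()
        text = generate(query)
        elapsed = time.perf_counter() - start
        results.append((i, elapsed, repetition_ratio(text)))
    return results
```

If the latency and the repetition ratio both jump at the same run index for an unchanged query, that would point toward state accumulated across calls rather than toward the input data.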