
[Bug]: return_dict=False is not working in minicpm3 #263

Open
1 task done
TianmengChen opened this issue Nov 18, 2024 · 0 comments
Labels
bug Something isn't working triage

Comments

@TianmengChen

Is there an existing issue?

  • I have searched, and there is no existing issue.

Describe the bug

Passing return_dict=False to model.generate or to from_pretrained does not change the effective return_dict=True inside the pipeline. If return_dict=False is instead set directly, an error occurs.

To Reproduce

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

path = "MiniCPM3-4B"
device = "cpu"

tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(path, torch_dtype=torch.bfloat16, device_map=device, trust_remote_code=True, return_dict=False)

messages = [
    {"role": "user", "content": "推荐5个北京的景点。"},
]
model_inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True).to(device)
model.config.torchscript = True

model_outputs = model.generate(
    model_inputs,
    max_new_tokens=128,
    return_dict=False
)

output_token_ids = [
    model_outputs[i][len(model_inputs[i]):] for i in range(len(model_inputs))
]

responses = tokenizer.batch_decode(output_token_ids, skip_special_tokens=True)[0]
print(responses)

Then print return_dict immediately after the line return_dict = return_dict if return_dict is not None else self.config.use_return_dict in the modeling code: it is still True.
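The resolution step quoted above follows the standard transformers convention, where use_return_dict is a property on the config. A minimal, self-contained sketch of that logic (assuming the stock PretrainedConfig behavior; the MiniCPM3 remote code may differ, which is the point of this report):

```python
class ConfigSketch:
    """Minimal stand-in for transformers' PretrainedConfig (assumed names)."""

    def __init__(self, return_dict=True, torchscript=False):
        self.return_dict = return_dict
        self.torchscript = torchscript

    @property
    def use_return_dict(self):
        # transformers forces tuple outputs when torchscript is enabled
        return self.return_dict and not self.torchscript


def resolve_return_dict(config, return_dict=None):
    # The line quoted from the modeling code: an explicit kwarg wins,
    # otherwise the config decides.
    return return_dict if return_dict is not None else config.use_return_dict


# Explicit kwarg overrides the config default:
print(resolve_return_dict(ConfigSketch(), return_dict=False))  # False
# No kwarg: falls back to the config (True by default):
print(resolve_return_dict(ConfigSketch()))  # True
# torchscript=True (as set in the repro) flips use_return_dict to False:
print(resolve_return_dict(ConfigSketch(torchscript=True)))  # False
```

If the MiniCPM3 forward pass followed this logic, either the explicit return_dict=False kwarg or the torchscript=True setting in the repro should yield tuple outputs, so a printed value of True suggests the kwarg is not being propagated to the forward call.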

Expected behavior

No response

Screenshots

No response

Environment

- OS: [e.g. Ubuntu 20.04]
- Pytorch: [e.g. torch 2.0.0]
- CUDA: [e.g. CUDA 11.8]
- Device: [e.g. A10, RTX3090]

Additional context

No response

@TianmengChen TianmengChen added bug Something isn't working triage labels Nov 18, 2024