Garbled inference output after fine-tuning DeepSeek #6908

Open
1 task done
HelloWorld506 opened this issue Feb 12, 2025 · 1 comment
Labels
bug (Something isn't working), pending (This problem is yet to be addressed)

Comments

@HelloWorld506

Reminder

  • I have read the above rules and searched the existing issues.

System Info

Latest version of llamafactory

Reproduction

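I fine-tuned the deepseek-qwen-7B model. My target outputs are only A, B, or C, and accuracy during training is very high. At inference time, however, the model outputs a chain of thought, and sometimes even tokens that belong in the input, such as <|im_start|>user. Does training do something that suppresses the chain of thought? Also, why does inference emit tokens from the input, and how can I fix this?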
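
To make the symptom concrete, here is a minimal post-processing sketch. It is not a confirmed fix and not part of llamafactory: `clean_generation` and `extract_choice` are hypothetical helpers, and the marker strings (`<|im_start|>`, `<|im_end|>`, `<think>`, `</think>`) are assumptions about what leaks into the generated text. It only shows how the answer letter could be recovered from such output while the root cause (possibly a template or stop-token mismatch between training and inference) is investigated.

```python
import re

# Assumed markers: ChatML-style control tokens and R1-style reasoning tags.
STOP_MARKERS = ("<|im_start|>", "<|im_end|>")
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def clean_generation(text: str) -> str:
    """Hypothetical helper: strip reasoning and leaked control tokens."""
    # Remove complete <think>...</think> blocks.
    text = THINK_BLOCK.sub("", text)
    # If only a closing tag remains (the opening tag was part of the prompt),
    # keep what follows it.
    if "</think>" in text:
        text = text.split("</think>", 1)[1]
    # Truncate at the earliest leaked control token, if any.
    cut = len(text)
    for marker in STOP_MARKERS:
        idx = text.find(marker)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut].strip()


def extract_choice(text: str, choices=("A", "B", "C")):
    """Hypothetical helper: return the answer letter, or None if absent."""
    cleaned = clean_generation(text)
    return cleaned if cleaned in choices else None


if __name__ == "__main__":
    raw = "<think>some reasoning</think>\nB<|im_start|>user\nnext question"
    print(extract_choice(raw))
```

Post-processing like this only hides the problem; if the model keeps generating past its answer, the stop tokens and chat template used at inference are worth comparing against those used in training.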

Others

No response

HelloWorld506 added the bug and pending labels on Feb 12, 2025
@Haroldhy

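What do you mean by "accuracy is very high during training"? What does training accuracy refer to here? And what do you mean by "at inference time": are you running inference with the webui, transformers, or some other framework?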
