Qwen2.5-72b-instruct Repeating Outputs #988

thiner · 2024-09-27T03:40:24Z

thiner
Sep 27, 2024

我用vllm 0.6.2部署了 Qwen2.5-72b-instruct-GPTQ-Int4模型。但是我发现这个模型在处理长文本时经常输出重复内容。请问是配置上有什么需要注意的吗？
我的vllm参数基本保持默认，只加了rope_scaling。模型设置方面，我已经尝试加大了frequency_penalty, 有一些改善，但是太高又会导致翻译质量显著下降。

Answered by jklj077

Sep 27, 2024

如果是说该停的时候没停，然后后面都是重复内容的话，这个我们目前观察大概率是量化导致的。可以试试换AWQ，能缓解一些，原始精度模型目前看是正常的。

https://qwen.readthedocs.io/zh-cn/latest/quantization/gptq.html#qwen2-5-72b-instruct-gptq-int4-cannot-stop-generation-properly

另外，由于vLLM默认的采样超参并不会读取模型文件中的默认参数，这边也建议一般都加上：https://qwen.readthedocs.io/zh-cn/latest/deployment/vllm.html#openai-compatible-api-service (并不针对该情况)

View full answer

jklj077 · 2024-09-27T04:57:28Z

jklj077
Sep 27, 2024
Maintainer

如果是说该停的时候没停，然后后面都是重复内容的话，这个我们目前观察大概率是量化导致的。可以试试换AWQ，能缓解一些，原始精度模型目前看是正常的。

https://qwen.readthedocs.io/zh-cn/latest/quantization/gptq.html#qwen2-5-72b-instruct-gptq-int4-cannot-stop-generation-properly

另外，由于vLLM默认的采样超参并不会读取模型文件中的默认参数，这边也建议一般都加上：https://qwen.readthedocs.io/zh-cn/latest/deployment/vllm.html#openai-compatible-api-service (并不针对该情况)

2 replies

thiner Sep 27, 2024
Author

你提到的“采样参数”是指下面这几个参数吗？

"temperature": 0.7,
  "top_p": 0.8,
  "repetition_penalty": 1.05,

thiner Sep 27, 2024
Author

模型换成AWQ量化后，又调整了提示词，repetition_penalty保持默认值，现在已经基本不会出现重复的情况了。

thiner · 2024-09-27T06:53:41Z

thiner
Sep 27, 2024
Author

@jklj077 是的，内容完成后又开始重复，有时候是最后一小段，有时候是重复输出最后一个字符。这是vllm的问题吗？如果是的话，那么我换一个推理服务是不是就好了？

0 replies

thiner · 2024-09-27T08:02:46Z

thiner
Sep 27, 2024
Author

我将模型换成了AWQ量化，重复内容的现象的确有缓解。但是依然会出现，尤其是输出内容特别长的时候。我尝试了设置repetition_penalty=1.05，这样会严重干扰后面输出的内容，大部分前面出现过的词会被替换成“*”号。这个方法完全不可行。

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Qwen2.5-72b-instruct Repeating Outputs #988

{{title}}

Replies: 3 comments 2 replies

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

Select a reply

Qwen2.5-72b-instruct Repeating Outputs #988

thiner Sep 27, 2024

Replies: 3 comments · 2 replies

jklj077 Sep 27, 2024 Maintainer

thiner Sep 27, 2024 Author

thiner Sep 27, 2024 Author

thiner Sep 27, 2024 Author

thiner Sep 27, 2024 Author

thiner
Sep 27, 2024

Replies: 3 comments 2 replies

jklj077
Sep 27, 2024
Maintainer

thiner Sep 27, 2024
Author

thiner Sep 27, 2024
Author

thiner
Sep 27, 2024
Author

thiner
Sep 27, 2024
Author