RuntimeError: weight lm_head.weight does not exist when loading a Qwen2 model after DPO fine-tuning #6073

Open
dahaogewsh opened this issue Nov 19, 2024 · 1 comment
Labels
pending This problem is yet to be addressed

Comments

@dahaogewsh

Error message:
Traceback (most recent call last):
  File "/opt/conda/bin/text-generation-server", line 8, in <module>
    sys.exit(app())
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/cli.py", line 92, in serve
    server.serve(
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 246, in serve
    asyncio.run(
  File "/opt/conda/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 205, in serve_inner
    model = get_model(
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/__init__.py", line 622, in get_model
    return FlashQwen2(
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/flash_qwen2.py", line 72, in __init__
    model = Qwen2ForCausalLM(config, weights)
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/custom_modeling/flash_qwen2_modeling.py", line 351, in __init__
    self.lm_head = SpeculativeHead.load(
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/utils/layers.py", line 615, in load
    lm_head = TensorParallelHead.load(config, prefix, weights)
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/utils/layers.py", line 654, in load
    weight = weights.get_tensor(f"{prefix}.weight")
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/utils/weights.py", line 99, in get_tensor
    filename, tensor_name = self.get_filename(tensor_name)
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/utils/weights.py", line 63, in get_filename
    raise RuntimeError(f"weight {tensor_name} does not exist")
RuntimeError: weight lm_head.weight does not exist

Fine-tuning parameters:

### model
model_name_or_path: qwen2-1.5b

### method
stage: dpo
do_train: true
finetuning_type: full
pref_beta: 0.1
pref_loss: simpo # choices: [sigmoid (dpo), orpo, simpo]
pref_ftx: 0.5
#simpo_gamma: 0.6
#dpo_label_smoothing: 0.1

### dataset
dataset: zk_dpo
template: empty
cutoff_len: 1024
overwrite_cache: true
preprocessing_num_workers: 16

### output
output_dir: saves/zk/dpo
logging_steps: 10
save_steps: 5000
plot_loss: true
overwrite_output_dir: true

### train
per_device_train_batch_size: 8
gradient_accumulation_steps: 2
learning_rate: 5.0e-6
num_train_epochs: 1.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000
report_to: wandb

@github-actions github-actions bot added the pending This problem is yet to be addressed label Nov 19, 2024
@dahaogewsh
Author

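What is going on here? I searched around and only found three models listed as omitting lm_head.weight; nothing says that Qwen2 drops this parameter when the model is saved. Could someone experienced help me out?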
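
One way to narrow this down is to look at what the saved checkpoint actually contains. The sketch below is not a confirmed diagnosis. It assumes the checkpoint was saved in safetensors format and that the likely cause is tied embeddings: when `tie_word_embeddings` is true in `config.json`, the output head shares its weight with `model.embed_tokens.weight`, so a saved checkpoint may hold only one copy. The helper names `safetensors_tensor_names` and `inspect_checkpoint` are made up for illustration. Only the standard library is used, relying on the documented safetensors layout (an 8-byte little-endian header length followed by a JSON header).

```python
import json
import struct
from pathlib import Path


def safetensors_tensor_names(path):
    """Return the tensor names in one .safetensors file by reading only its JSON header."""
    with open(path, "rb") as f:
        # The first 8 bytes are an unsigned little-endian integer: the header size in bytes.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional free-form entry, not a tensor.
    return {name for name in header if name != "__metadata__"}


def inspect_checkpoint(checkpoint_dir):
    """Hypothetical helper: report whether the head weight is stored and whether embeddings are tied."""
    checkpoint_dir = Path(checkpoint_dir)
    config = json.loads((checkpoint_dir / "config.json").read_text())
    names = set()
    for file in sorted(checkpoint_dir.glob("*.safetensors")):
        names |= safetensors_tensor_names(file)
    return {
        # Defaulting to False when the key is absent is an assumption of this sketch.
        "tie_word_embeddings": bool(config.get("tie_word_embeddings", False)),
        "has_lm_head": "lm_head.weight" in names,
        "has_embed_tokens": "model.embed_tokens.weight" in names,
    }


# Example call against the output_dir from the config above:
# print(inspect_checkpoint("saves/zk/dpo"))
```

If this reports `tie_word_embeddings` as true and `has_lm_head` as false, the file is probably complete and the loader is simply asking for a tensor that is stored under the embedding name. In that case it would be worth checking whether a newer text-generation-server release handles tied embeddings for Qwen2.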
