Error message:
Traceback (most recent call last):
File "/opt/conda/bin/text-generation-server", line 8, in <module>
sys.exit(app())
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/cli.py", line 92, in serve
server.serve(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 246, in serve
asyncio.run(
File "/opt/conda/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
return future.result()
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 205, in serve_inner
model = get_model(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/__init__.py", line 622, in get_model
return FlashQwen2(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/flash_qwen2.py", line 72, in __init__
model = Qwen2ForCausalLM(config, weights)
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/custom_modeling/flash_qwen2_modeling.py", line 351, in __init__
self.lm_head = SpeculativeHead.load(
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/utils/layers.py", line 615, in load
lm_head = TensorParallelHead.load(config, prefix, weights)
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/utils/layers.py", line 654, in load
weight = weights.get_tensor(f"{prefix}.weight")
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/utils/weights.py", line 99, in get_tensor
filename, tensor_name = self.get_filename(tensor_name)
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/utils/weights.py", line 63, in get_filename
raise RuntimeError(f"weight {tensor_name} does not exist")
RuntimeError: weight lm_head.weight does not exist
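The error says the loader could not find a tensor named `lm_head.weight` in the checkpoint. A minimal sketch for checking which tensor names the saved files actually contain is below. It assumes the weights were saved as `.safetensors` files and reads only the file header (an 8-byte little-endian length followed by JSON), so it needs nothing beyond the standard library. The helper names `safetensors_tensor_names` and `has_weight` are hypothetical and not part of text-generation-server.

```python
import json
import struct
from pathlib import Path


def safetensors_tensor_names(path):
    """Return the tensor names stored in one .safetensors file.

    Only the header is read: 8 bytes (little-endian u64) giving the header
    length, followed by that many bytes of JSON.
    """
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional non-tensor entry in the header.
    return sorted(k for k in header if k != "__metadata__")


def has_weight(checkpoint_dir, name="lm_head.weight"):
    """Check every *.safetensors shard in checkpoint_dir for a tensor name."""
    return any(
        name in safetensors_tensor_names(p)
        for p in sorted(Path(checkpoint_dir).glob("*.safetensors"))
    )
```

For example, `has_weight("saves/zk/dpo")` would report whether the tensor is present, assuming the served checkpoint is the `output_dir` from the config. If it returns `False` while `has_weight("saves/zk/dpo", "model.embed_tokens.weight")` returns `True`, one possible explanation (an assumption, not something the traceback confirms) is that the output head shares its weights with the input embeddings and was therefore not saved under its own name.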
Fine-tuning parameters:
### model
model_name_or_path: qwen2-1.5b
### method
stage: dpo
do_train: true
finetuning_type: full
pref_beta: 0.1
pref_loss: simpo # choices: [sigmoid (dpo), orpo, simpo]
pref_ftx: 0.5
#simpo_gamma: 0.6
#dpo_label_smoothing: 0.1
### dataset
dataset: zk_dpo
template: empty
cutoff_len: 1024
overwrite_cache: true
preprocessing_num_workers: 16
### output
output_dir: saves/zk/dpo
logging_steps: 10
save_steps: 5000
plot_loss: true
overwrite_output_dir: true
### train
per_device_train_batch_size: 8
gradient_accumulation_steps: 2
learning_rate: 5.0e-6
num_train_epochs: 1.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 180000000
report_to: wandb