No adapter_config.json after LoRA fine-tuning the deepseek-chat model #238
Comments
You could try transformers version 4.31.3.
There is no transformers 4.31.3. I used transformers==4.31.0, but it still doesn't work, so I suspect the problem is in my script. Roughly:

```python
MAX_LENGTH = 384  # the Llama tokenizer splits one Chinese character into several tokens, so the max length needs headroom to keep the data intact
model = AutoModelForCausalLM.from_pretrained(path, trust_remote_code=True, torch_dtype=torch.half, device_map="auto")
def process_func(example): ...
dataset = load_dataset("json", data_files="/zhen_huan_dataset_1000.json", split="train")
processed_dataset = dataset.map(...)
config = LoraConfig(...)
args = TrainingArguments(output_dir="/code/lora", ...)
trainer = Trainer(...)
```
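For comparison, here is a minimal sketch of how such a script can be laid out with peft so that adapter_config.json does get written. The base model id, the dataset field names (instruction/output), the prompt template, and the target_modules list are assumptions, not taken from the script above; the essential steps are wrapping the base model with get_peft_model and calling save_pretrained on the PEFT-wrapped model, which writes adapter_config.json next to the adapter weights.

```python
# Minimal LoRA fine-tuning sketch with peft. Field names, prompt template,
# and paths marked "assumed" are illustrative, not from the original script.
import torch
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Trainer, TrainingArguments)
from peft import LoraConfig, TaskType, get_peft_model

path = "deepseek-ai/deepseek-llm-7b-chat"  # assumed base model path
MAX_LENGTH = 384  # Chinese characters map to several tokens, so leave headroom

tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    path, trust_remote_code=True, torch_dtype=torch.half, device_map="auto")

def process_func(example):
    # Build input_ids/labels for one instruction/output pair (assumed fields,
    # generic "User/Assistant" template).
    prompt = tokenizer(f"User: {example['instruction']}\n\nAssistant: ",
                       add_special_tokens=False)
    response = tokenizer(example["output"] + tokenizer.eos_token,
                         add_special_tokens=False)
    input_ids = prompt["input_ids"] + response["input_ids"]
    attention_mask = prompt["attention_mask"] + response["attention_mask"]
    labels = [-100] * len(prompt["input_ids"]) + response["input_ids"]
    return {"input_ids": input_ids[:MAX_LENGTH],
            "attention_mask": attention_mask[:MAX_LENGTH],
            "labels": labels[:MAX_LENGTH]}

dataset = load_dataset("json", data_files="/zhen_huan_dataset_1000.json", split="train")
processed_dataset = dataset.map(process_func, remove_columns=dataset.column_names)

config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # adjust to the model's layer names
    r=8, lora_alpha=32, lora_dropout=0.1)
model = get_peft_model(model, config)  # without this wrapper no adapter is created

args = TrainingArguments(
    output_dir="/code/lora",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    num_train_epochs=3,
    learning_rate=1e-4,
    logging_steps=10,
    save_strategy="no")

trainer = Trainer(
    model=model, args=args, train_dataset=processed_dataset,
    data_collator=DataCollatorForSeq2Seq(tokenizer=tokenizer, padding=True))
trainer.train()

# Saving the PEFT-wrapped model writes adapter_config.json plus the adapter
# weights into the given directory (a hypothetical sub-folder of output_dir).
trainer.model.save_pretrained("/code/lora/adapter")
tokenizer.save_pretrained("/code/lora/adapter")
```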
After LoRA fine-tuning the deepseek-chat model there is no adapter_config.json. Other issues here say this is caused by the transformers version: the LoRA weights end up merged directly into the base model's weights. But when I run the LoRA fine-tuned output directly, every token the model returns is 0, so it doesn't seem to be doing real inference.
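As a quick sanity check for the all-zero-token symptom, here is a sketch of loading the adapter and generating, assuming the adapter was saved to the hypothetical /code/lora/adapter directory from the sketch above. PeftModel.from_pretrained needs adapter_config.json in that directory to attach the LoRA weights to the base model; pointing AutoModelForCausalLM at a directory of raw trainer checkpoints instead would explain nonsensical generations.

```python
# Load the base model plus the saved LoRA adapter and run one generation.
# The adapter path is the hypothetical save location used in the sketch above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_path = "deepseek-ai/deepseek-llm-7b-chat"  # assumed base model path
adapter_path = "/code/lora/adapter"             # must contain adapter_config.json

tokenizer = AutoTokenizer.from_pretrained(base_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    base_path, trust_remote_code=True, torch_dtype=torch.half, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_path)  # reads adapter_config.json
model.eval()

prompt = "User: 你是谁?\n\nAssistant: "  # same template as used for training
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```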