
ValueError: Attention mask should be of size (4, 1, 240, 480), but is torch.Size([4, 1, 240, 240]) #12

Open
LiBinNLP opened this issue Dec 13, 2023 · 3 comments

Comments

@LiBinNLP commented Dec 13, 2023

I hit this issue when fine-tuning LLaMa-7B-Chat-hf with the example dataset:

Traceback (most recent call last):
File "finetune-lora.py", line 656, in
train()
File "finetune-lora.py", line 622, in train
train_result = trainer.train(resume_from_checkpoint=checkpoint)
File "/sda/libin/anaconda3/envs/llama2/lib/python3.8/site-packages/transformers/trainer.py", line 1537, in train
return inner_training_loop(
File "/sda/libin/anaconda3/envs/llama2/lib/python3.8/site-packages/transformers/trainer.py", line 1854, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
File "/sda/libin/anaconda3/envs/llama2/lib/python3.8/site-packages/transformers/trainer.py", line 2732, in training_step
self.accelerator.backward(loss)
File "/sda/libin/anaconda3/envs/llama2/lib/python3.8/site-packages/accelerate/accelerator.py", line 1905, in backward
loss.backward(**kwargs)
File "/sda/libin/anaconda3/envs/llama2/lib/python3.8/site-packages/torch/_tensor.py", line 488, in backward
torch.autograd.backward(
File "/sda/libin/anaconda3/envs/llama2/lib/python3.8/site-packages/torch/autograd/init.py", line 197, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
File "/sda/libin/anaconda3/envs/llama2/lib/python3.8/site-packages/torch/autograd/function.py", line 267, in apply
return user_fn(self, *args)
File "/sda/libin/anaconda3/envs/llama2/lib/python3.8/site-packages/torch/utils/checkpoint.py", line 141, in backward
outputs = ctx.run_function(*detached_inputs)
File "/sda/libin/anaconda3/envs/llama2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/sda/libin/anaconda3/envs/llama2/lib/python3.8/site-packages/accelerate/hooks.py", line 165, in new_forward
output = module._old_forward(*args, **kwargs)
File "/sda/libin/anaconda3/envs/llama2/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 789, in forward
hidden_states, self_attn_weights, present_key_value = self.self_attn(
File "/sda/libin/anaconda3/envs/llama2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/sda/libin/anaconda3/envs/llama2/lib/python3.8/site-packages/accelerate/hooks.py", line 165, in new_forward
output = module._old_forward(*args, **kwargs)
File "/sda/libin/anaconda3/envs/llama2/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 423, in forward
raise ValueError(
ValueError: Attention mask should be of size (4, 1, 240, 480), but is torch.Size([4, 1, 240, 240])
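A note on the shapes: the expected mask size (4, 1, 240, 480) has a key length (480) that is exactly twice the query length (240), which is the pattern that appears when cached key/values get concatenated onto the current sequence during the recomputation that gradient checkpointing performs. The thread does not confirm the cause, but a commonly suggested mitigation is to disable the KV cache while training with gradient checkpointing. A minimal sketch, assuming a standard transformers setup (the model id below is a placeholder, not taken from the thread):

# Sketch only: disable the KV cache when training with gradient checkpointing.
# The root cause of the error above is not confirmed in this thread; this just
# illustrates the usual workaround for a "doubled key length" mask mismatch.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf"  # placeholder model id, adjust as needed
)

# Cached past key/values plus gradient-checkpoint recomputation can make the
# attention layer expect a mask covering twice the sequence length (240 -> 480),
# so the cache is normally turned off for training.
model.config.use_cache = False
model.gradient_checkpointing_enable()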

@mesdaq commented Dec 18, 2023

Has this been solved? I have the same problem. Could it be an issue with the input model?

@LiBinNLP (Author) commented

In reply to @mesdaq: not solved. Using the code from another repository works without this problem: https://github.com/tloen/alpaca-lora.

@yangjianxin1 commented

I'm running into the same problem.
