Error when trying to run Llama2-7b: attention_mask and position_ids are None #60

Open
Philippe-Guyard opened this issue Jun 26, 2024 · 1 comment


@Philippe-Guyard

From what I see in the Llama2 code on Hugging Face, the attention_mask and position_ids variables are never set by the model. As a result, cache['attention_mask'] and cache['position_ids'] are None, and the script fails at lib/prune.py line 144:

    if f"model.layers.{i}" in model.hf_device_map:   ## handle the case for llama-30B and llama-65B, when the device map has multiple GPUs;
            dev = model.hf_device_map[f"model.layers.{i}"]
            inps, outs, attention_mask, position_ids = inps.to(dev), outs.to(dev), attention_mask.to(dev), position_ids.to(dev)
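
A possible workaround (just a sketch I am using locally, assuming the surrounding prune.py code is otherwise unchanged, not an official fix) is to only move the tensors that actually exist and pass None through unchanged:

    # Hypothetical patch for lib/prune.py: only call .to(dev) on tensors
    # that exist; attention_mask and position_ids may legitimately be None.
    if f"model.layers.{i}" in model.hf_device_map:
        dev = model.hf_device_map[f"model.layers.{i}"]
        inps, outs = inps.to(dev), outs.to(dev)
        if attention_mask is not None:
            attention_mask = attention_mask.to(dev)
        if position_ids is not None:
            position_ids = position_ids.to(dev)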

Please note that I do not have access to GPUs with more than 40 GB of VRAM, and the 7B model does not fit in 40 GB for me, so I have to use a device map even for the 7B model, which is what triggers the error above.
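
For reference, I load the model roughly like this (a minimal sketch; the exact checkpoint id and dtype are assumptions on my part):

    import torch
    from transformers import AutoModelForCausalLM

    # device_map="auto" (requires accelerate) shards the layers across the
    # available GPUs and populates model.hf_device_map, which is exactly
    # what makes the script take the multi-GPU code path above.
    model = AutoModelForCausalLM.from_pretrained(
        "meta-llama/Llama-2-7b-hf",
        torch_dtype=torch.float16,
        device_map="auto",
    )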

@Logan-007L

Hello, have you solved it?
