
Saving quantized Aria weights #134

Open
leon-seidel opened this issue Nov 21, 2024 · 2 comments

Comments

@leon-seidel

First of all, great work with the Aria HQQ quant!
Is it somehow possible to save the quantized weights? Neither model.save_pretrained(save_dir) nor AutoHQQHFModel.save_quantized(model, save_dir) seems to work with HQQ Aria for now, as they only save a 2 GB part of the model. It would be ideal to pull the quant from the Hub instead of the full model.
Thanks in advance,
Leon

@mobicham
Collaborator

Hi @leon-seidel
I don't think it's possible with the current code. I could make a hacky version, but I'm waiting for the official PR to be merged; then we should be able to save/load directly via transformers:
huggingface/transformers#34157

@leon-seidel
Author

Alright, thank you!
