First of all, great work with the Aria HQQ quant!
Is it somehow possible to save the quantized weights? Neither model.save_pretrained(save_dir) nor AutoHQQHFModel.save_quantized(model, save_dir) seems to work with HQQ Aria for now, as they only save a ~2 GB part of the model. It would be ideal to pull the quant from the Hub instead of the full model.
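For reference, this is roughly what was tried (a sketch, assuming an Aria model already quantized with HQQ is in memory; the save directory name is a placeholder):

```python
# `model` is assumed to be an Aria model already quantized with HQQ.
from hqq.models.hf.base import AutoHQQHFModel

save_dir = "Aria-HQQ-4bit"  # placeholder path

# Attempt 1: standard transformers serialization
model.save_pretrained(save_dir)  # only writes a ~2 GB part of the model

# Attempt 2: HQQ's own serializer
AutoHQQHFModel.save_quantized(model, save_dir)  # same partial result
```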
Thanks in advance,
Leon
Hi @leon-seidel
I don't think it's possible with the current code. I could make a hacky version, but I'm waiting for the official PR to get merged; then we should be able to save/load directly via transformers: huggingface/transformers#34157
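Once that lands, the flow should look roughly like the sketch below (an assumption based on the standard transformers quantization API, not confirmed behavior; the model id, bit width, and paths are placeholders):

```python
import torch
from transformers import AutoModelForCausalLM, HqqConfig

# Quantize on the fly at load time with HQQ
quant_config = HqqConfig(nbits=4, group_size=64)
model = AutoModelForCausalLM.from_pretrained(
    "rhymes-ai/Aria",               # placeholder model id
    torch_dtype=torch.float16,
    device_map="cuda",
    quantization_config=quant_config,
    trust_remote_code=True,
)

# With the serialization PR merged, this should write the full quantized
# checkpoint, which could then be pushed to the Hub...
model.save_pretrained("Aria-HQQ-4bit")

# ...and reloaded directly, without re-quantizing from the full weights
model = AutoModelForCausalLM.from_pretrained("Aria-HQQ-4bit", device_map="cuda")
```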