First of all, great work with the Aria HQQ quant!
Is it somehow possible to save the quantized weights? Neither model.save_pretrained(save_dir) nor AutoHQQHFModel.save_quantized(model, save_dir) seems to work with HQQ Aria for now, as they only save a ~2 GB part of the model. It would be ideal to pull the quant from the Hub instead of the full model.
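For reference, this is roughly what was tried (a sketch, assuming an Aria model already quantized with HQQ is in memory; the save directory name is a placeholder):

```python
# `model` is assumed to be an Aria model already quantized with HQQ.
from hqq.models.hf.base import AutoHQQHFModel

save_dir = "Aria-HQQ-4bit"  # placeholder path

# Attempt 1: standard transformers serialization
model.save_pretrained(save_dir)  # only writes a ~2 GB part of the model

# Attempt 2: HQQ's own serializer
AutoHQQHFModel.save_quantized(model, save_dir)  # same partial result
```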
Thanks in advance,
Leon
Hi @leon-seidel
I don't think it's possible with the current code. I could make a hacky version, but I'm waiting for the official PR to get merged; then we should be able to save/load directly via transformers: huggingface/transformers#34157
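Once that lands, the flow should look roughly like the sketch below (an assumption based on the standard transformers quantization API, not confirmed behavior; the model id, bit width, and paths are placeholders):

```python
import torch
from transformers import AutoModelForCausalLM, HqqConfig

# Quantize on the fly at load time with HQQ
quant_config = HqqConfig(nbits=4, group_size=64)
model = AutoModelForCausalLM.from_pretrained(
    "rhymes-ai/Aria",               # placeholder model id
    torch_dtype=torch.float16,
    device_map="cuda",
    quantization_config=quant_config,
    trust_remote_code=True,
)

# With the serialization PR merged, this should write the full quantized
# checkpoint, which could then be pushed to the Hub...
model.save_pretrained("Aria-HQQ-4bit")

# ...and reloaded directly, without re-quantizing from the full weights
model = AutoModelForCausalLM.from_pretrained("Aria-HQQ-4bit", device_map="cuda")
```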