rotation+gptq data #20
Hi,
Can you share the rotation+GPTQ PPL data? Is it better than SmoothQuant+GPTQ? Many thanks!

Comments
Ref to #13 (comment).
Hi, thank you for your kind help. In my tests, the wikitext2 results look OK, but the results with the pile calibration dataset do not match your original data. The pile data I used is from https://huggingface.co/datasets/mit-han-lab/pile-val-backup/tree/main.
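For reference, a minimal sketch of loading that pile validation set with Hugging Face `datasets`, assuming its standard `text` field; the sample count is illustrative, not the repo's actual calibration loader:

```python
from datasets import load_dataset

# Pull the validation split of the pile backup linked above.
ds = load_dataset("mit-han-lab/pile-val-backup", split="validation")

# Take a handful of raw text samples for calibration (128 is illustrative).
texts = [row["text"] for row in ds.select(range(128))]
```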
@Andy0422 We used pile for smoothing and wikitext2 for GPTQ in our paper, but the current code has fixed this inconsistency and uses the same dataset for both smoothing and GPTQ. So it is expected that you cannot reproduce the paper's results with the latest code; it is not related to the pile data itself.
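As an illustration of sharing one calibration set across both steps, here is a common way to draw fixed-length wikitext2 samples that both the smoothing pass and GPTQ could consume; the sample count and sequence length are typical defaults, not necessarily what this repo uses:

```python
import torch
from datasets import load_dataset

def build_calib_samples(tokenizer, n_samples=128, seqlen=2048, seed=0):
    """Draw random fixed-length token windows from wikitext2; the same list
    can then be fed to both the smoothing pass and GPTQ."""
    data = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")
    enc = tokenizer("\n\n".join(data["text"]), return_tensors="pt")
    g = torch.Generator().manual_seed(seed)
    samples = []
    for _ in range(n_samples):
        i = torch.randint(0, enc.input_ids.shape[1] - seqlen - 1,
                          (1,), generator=g).item()
        samples.append(enc.input_ids[:, i : i + seqlen])
    return samples
```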
@Andy0422 It is probably correct.
@HandH1998 One more question: do you apply the online Hadamard transform before down_proj, or do you skip all the online transforms in your implementation? If you do use it, have you evaluated its inference overhead? Thanks~
@Andy0422 I don't employ the online Hadamard transform.
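For readers unfamiliar with the technique being discussed, here is a minimal sketch of an online Hadamard rotation before down_proj (in the style of QuaRot); the dimensions are illustrative, and real intermediate sizes that are not powers of two need a blocked construction:

```python
import torch

def hadamard(n: int) -> torch.Tensor:
    """Normalized n x n Hadamard matrix via Sylvester's construction
    (n must be a power of two); rows are orthonormal, so H @ H.T == I."""
    assert n & (n - 1) == 0, "n must be a power of two"
    H = torch.ones(1, 1)
    while H.shape[0] < n:
        H = torch.cat([torch.cat([H, H], dim=1),
                       torch.cat([H, -H], dim=1)], dim=0)
    return H / n ** 0.5

n, d_out = 512, 128           # illustrative sizes
H = hadamard(n)
W_down = torch.randn(d_out, n)
x = torch.randn(n)

# The inverse rotation is folded into the weights once, offline ...
W_folded = W_down @ H.T
# ... while the activation rotation runs per token at inference time;
# this matmul (or a fast Hadamard kernel) is the "online" overhead asked about.
x_rot = H @ x

# Output is unchanged because H is orthogonal: W @ x == (W @ H.T) @ (H @ x).
assert torch.allclose(W_folded @ x_rot, W_down @ x, atol=1e-3)
```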