
rotation+gptq data #20

Open
Andy0422 opened this issue Oct 11, 2024 · 7 comments

Comments

@Andy0422

Hi,

Can you share the rotation+gptq PPL data? Is it better than smoothquant+gptq? Many thanks!

@HandH1998
Owner

Refer to #13 (comment).
In my practice, rotation+gptq is generally better than smooth+gptq for per-channel quantization. However, this is not the case for some models; see #17.

@Andy0422
Author

Andy0422 commented Oct 14, 2024

@HandH1998

Hi, thank you for your kind help.
I ran into another problem, this time with the calibration data.

From my test results below, the wikitext2 results look fine, but the results with the pile calibration dataset do not match your original data. The pile data I used is from https://huggingface.co/datasets/mit-han-lab/pile-val-backup/tree/main.
Could you share your pile dataset with me, or comment on this finding? Email: [email protected]

| Granularity | Method | Llama-2 | Wikitext2 | Pile | Paper data |
| --- | --- | --- | --- | --- | --- |
| per-channel | smooth+gptq | 7B | 5.98 | 6.14 | 5.95 |
| per-group | smooth+gptq | | 5.71 | 5.78 | 5.71 |
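(For reference on how the numbers above are usually defined, not this repo's evaluation script: perplexity is the exponential of the mean per-token negative log-likelihood over the eval set. A minimal sketch:)

```python
import math

def perplexity(token_nlls):
    """Perplexity from per-token negative log-likelihoods (natural log):
    exp of the mean NLL over the whole eval set. Lower is better."""
    assert token_nlls, "need at least one token"
    return math.exp(sum(token_nlls) / len(token_nlls))
```

Small dataset-to-dataset shifts (e.g. wikitext2 vs. pile calibration) show up directly as shifts in this mean NLL.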

@HandH1998
Owner

@Andy0422 In our paper we used pile for smoothing and wikitext2 for gptq. The current code fixes this inconsistency and uses the same dataset for both smoothing and gptq, so it is expected that you cannot reproduce the paper's results with the latest code. It is not related to the pile data itself.

@Andy0422
Author

@HandH1998
Okay, I see. So do you think our test results are correct? Thank you!

@HandH1998
Owner

@Andy0422 It is probably correct.

@Andy0422
Author


@HandH1998 One more question: do you employ the online Hadamard transform before down_proj, or do you skip all online transforms in your implementation? If you do use it, have you evaluated the inference overhead? Thanks~

@HandH1998
Owner

@Andy0422 I don't employ the online Hadamard transform.
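(For context on what the online transform being asked about would involve — a sketch of the general technique, not this repo's code: rotation methods in the QuaRot style apply an orthonormal Walsh-Hadamard transform to the down_proj input at inference time. Because the transform is orthogonal, it preserves norms and its inverse can be folded into the weights offline, so the runtime cost is only the transform itself, roughly O(n log n) per token:)

```python
import math

def fwht(vec):
    """Orthonormal fast Walsh-Hadamard transform of a list whose length
    is a power of two. Scaled by 1/sqrt(n), so the transform is its own
    inverse: applying it twice recovers the input."""
    n = len(vec)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    a = list(vec)
    h = 1
    while h < n:
        # butterfly step: combine pairs (j, j+h) within blocks of size 2h
        for i in range(0, n, h * 2):
            for j in range(i, i + h):
                x, y = a[j], a[j + h]
                a[j], a[j + h] = x + y, x - y
        h *= 2
    scale = math.sqrt(n)
    return [v / scale for v in a]
```

The orthogonality is what makes the trick work: quantizing the rotated activations changes the error distribution (spreading outliers across channels) without changing the layer's exact-arithmetic output.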
