
Quantization Fixes #35

Merged
merged 5 commits into main from sa/observer_fix on Apr 25, 2024

Conversation

@Satrat commented Apr 25, 2024

A few changes so that we match PyTorch quantization more closely:

  • Added an averaging constant and renamed MinMaxObserver to MovingAverageMinMaxObserver (see the sketch after this list)
  • Used torch.aminmax to calculate mins and maxes in the observer (this fixed some mismatches we had against PyTorch quantization; I'm really not sure why)
  • Made slight implementation changes to the scale/zero-point calculation
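
For concreteness, here is a minimal sketch of what a moving-average min/max observer looks like. The class name and the update rule follow PyTorch's MovingAverageMinMaxObserver; the `averaging_constant` default of 0.01 and the `observe` method name are illustrative assumptions, not the actual implementation in this repo.

```python
import torch

class MovingAverageMinMaxObserver:
    """Tracks running min/max of observed tensors with an exponential moving average."""

    def __init__(self, averaging_constant: float = 0.01):
        self.averaging_constant = averaging_constant
        self.min_val = None
        self.max_val = None

    def observe(self, observed: torch.Tensor):
        # single fused pass over the tensor, as in the diff below
        min_val, max_val = torch.aminmax(observed)
        if self.min_val is None:
            # first batch initializes the running values directly
            self.min_val, self.max_val = min_val, max_val
        else:
            # exponential moving average, matching PyTorch's update rule:
            # x = x + c * (new - x)
            c = self.averaging_constant
            self.min_val = self.min_val + c * (min_val - self.min_val)
            self.max_val = self.max_val + c * (max_val - self.max_val)
        return self.min_val, self.max_val
```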

Before these changes we were consistently seeing a 10-15% perplexity regression compared to PyTorch quantization. Now we match them within 2% and aren't consistently worse. We still don't have an exact match on the scale/zero-point calculations, but I'm chalking that up to C++ vs. Python differences (some of the PyTorch quantization is implemented in C++).
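
For context, the scale/zero-point derivation being matched looks roughly like the following asymmetric-quantization formula from PyTorch's observers. The int8 quant_min/quant_max defaults here are assumptions for illustration, and the final round/clamp is exactly the kind of step where C++ and Python implementations can diverge slightly.

```python
import torch

def calculate_qparams(min_val: torch.Tensor, max_val: torch.Tensor,
                      quant_min: int = -128, quant_max: int = 127):
    # extend the observed range to include zero so 0.0 is exactly representable
    min_val = torch.min(min_val, torch.zeros_like(min_val))
    max_val = torch.max(max_val, torch.zeros_like(max_val))

    # scale maps the observed float range onto the integer range
    scale = (max_val - min_val) / float(quant_max - quant_min)
    scale = torch.clamp(scale, min=torch.finfo(torch.float32).eps)

    # zero point is the integer value that float 0.0 maps to
    zero_point = quant_min - torch.round(min_val / scale)
    zero_point = torch.clamp(zero_point, quant_min, quant_max).to(torch.int64)
    return scale, zero_point
```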

The perplexity test is in neuralmagic/sparseml#2246.

:param observed: observed tensor to calculate quantization parameters for
:return: tuple of scale and zero point derived from the observed tensor
"""

-min_val = torch.tensor([observed.min()])
-max_val = torch.tensor([observed.max()])
+min_val, max_val = torch.aminmax(observed)
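
Both versions compute the same values on the same input; the visible differences are the single fused pass and the result shape (1-element tensors before, 0-dim tensors from torch.aminmax). A quick check using only stock PyTorch; whether the shape change explains the mismatch is still an open question, per the comment below:

```python
import torch

observed = torch.randn(4, 8)

# old approach: two separate reductions wrapped in 1-element tensors
old_min = torch.tensor([observed.min()])   # shape (1,)
old_max = torch.tensor([observed.max()])

# new approach: one fused reduction returning 0-dim tensors
new_min, new_max = torch.aminmax(observed)

assert old_min.item() == new_min.item()
assert old_max.item() == new_max.item()
print(old_min.shape, new_min.shape)  # torch.Size([1]) torch.Size([])
```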
A contributor commented:

I wonder why this makes a difference

@bfineran merged commit b3edb25 into main on Apr 25, 2024
2 checks passed
@bfineran deleted the sa/observer_fix branch on April 25, 2024 at 13:59