-
Notifications
You must be signed in to change notification settings - Fork 199
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added some preliminary unit tests to the CNNs 'quantize_model' #927
Conversation
I missed this comment on the other PR. The bias bitwidth can be None, which means that the bias is not quantized. Leaving this here if it could be useful for some tests. |
I'll incorporate it! |
66e029b
to
56daf8d
Compare
from brevitas.nn import QuantReLU | ||
from brevitas.quant_tensor import QuantTensor | ||
from brevitas_examples.imagenet_classification.ptq.ptq_common import quantize_model | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remaining ToDos, but we can add to it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Save for the missing minifloat tests (see below), I think we can add the other two todos and review/merge this.
Great work!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, shall do!
) | ||
|
||
|
||
def test_layerwise_valid_minifloat_bit_widths(minimal_model): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is getting there, but still need to work on this. I am testing if my explicit implementation of minifloat quantization matches the under-the-hood Brevitas one, but could use some guidance on whether this is correctly implemented. I'll be hacking at it either way!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it can be helpful, there are two PRs (#922 and #919) where we are expanding support to minifloat, to match the level of support we have for integer quantization. This means that a minifloat QuantTensor will have the correct metadata to properly characterize it, and I think that could be helpful in the writing of the tests.
If you agree with this, I'm happy to leave the minifloat tests to another PRs after those two have been merged, as not to block this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds good!
@@ -557,5 +557,8 @@ def check_positive_int(*args): | |||
We check that every inputted value is positive, and an integer. | |||
""" | |||
for arg in args: | |||
if not arg: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Redundant to PR: #934
Included here to make the tests pass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR has been merged so you can just rebase now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, have rebased!
b0d1a7d
to
16a275d
Compare
I think all of the ToDos (for this PR) are done, subject to whatever changes are desired! The two tests I added in the latest commit:
|
Some Some debugging:
inside the Graph Mode forward call.
I'm AFK for next week, but will pick this up when I get back! I'll look into why the input scale is missing, and solutions for it. |
Ok, I've narrowed in somewhat on what the issue is with the Python/Torch versioning. I refer to the old version that isn't working as 1.9.1 after Torch 1.9.1. It could also be Python versioning, but I assume not. In 1.9.1, we get an error when we try to feed data through the quantized model:
where the quantized model is given by e.g.:
The issue is in Ultimately, this is because the graph of the model is not quantizing correctly.
Whereas if I print out the graph in a later version of PyTorch (not 1.9.1), it uses the quantized tensor correctly:
The issue particularly happens in There is a red herring: unlike later versions, 1.9.1 throws this warning (from inside
However, that warning is because of a bug in PyTorch, and they've since fixed it. I.e. in 1.9.1 they had:
Instead of:
I.e. The difference in the graph actually manifests here (from
For some reason, in 1.9.1 the graph doesn't change, but it does in late Torch versions. The issue may be in Sorry for the piecemeal update, am working on it in the background, but I leave an update just so in case I end up being busy with something else there'll be some record. Hopefully I can figure this out tomorrow! |
Thanks for the update. I will try again using these insights. Let me know if you manage to find out more, and I'll do the same! |
I haven't really been able to spend a lot of time on this. The issue seems to be rooted in a specific Torch version, and I'm wondering if it's not just simpler for Brevitas to specify that that specific Torch version isn't supported: it might legitimately be a bug in an old version of PyTorch when FX Graph mode was new. If that's not an option I can try and debug this again. |
I think I pushed something to dev that broke some of your tests. Thanks for implementing all these tests, it is an amazing work! |
…st some aspects of 'layerwise' and 'fx' quantization, as well as some invalid inputs, e.g. invalid strings and zero and negative valued bit widths.
…arted work on minifloat quantization
…g MSE for qparam calibration.
147f516
to
1f9e2f2
Compare
Sure, no problem! Thanks, I appreciate it! 🙂 |
This PR is a work in progress, and I expect to add more tests.
As of commit 66e029b, we test some aspects of
layerwise
andfx
quantization, as well as some invalid inputs, e.g. invalid strings and zero and negative valued bit widths.