[PT2E] observers do not handle inputs with different shapes correctly #2112

Open
Xia-Weiwen opened this issue Apr 23, 2025 · 20 comments

Labels: pt2e_quant pt2 export quantization triaged

@Xia-Weiwen
Collaborator

Summary
The observers in Torchao, such as AffineQuantizedMinMaxObserver, support different quantization granularities by recording block sizes. For example, if the granularity is PerTensor and the input shape is (16, 3, 224, 224), the observer records block sizes = (16, 3, 224, 224).
However, the block sizes become wrong if inputs with different shapes are passed in. For example, if another input with shape = (16, 3, 56, 56) comes, the block sizes are updated to (16, 3, 56, 56), which is wrong for inputs with shape = (16, 3, 224, 224).
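
For illustration, a minimal toy sketch of the behavior described above (this is not the actual torchao observer code; the real logic lives behind AffineQuantizedMinMaxObserver):

    import torch

    # Toy stand-in for a per-tensor observer that records block_size
    # from whatever input it last saw.
    class ToyPerTensorObserver:
        def __init__(self):
            self.block_size = None

        def observe(self, x):
            # block_size is overwritten with the full shape of the current input
            self.block_size = tuple(x.shape)

    obs = ToyPerTensorObserver()
    obs.observe(torch.randn(16, 3, 224, 224))
    print(obs.block_size)  # (16, 3, 224, 224)
    obs.observe(torch.randn(16, 3, 56, 56))
    print(obs.block_size)  # (16, 3, 56, 56) -- stale for the first shape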

How to reproduce
The code to reproduce is similar to the reproducer here: #2094
One needs to bypass the #2094 issue by changing the source code manually, and also run the converted_model after convert_pt2e:

    converted_model = convert_pt2e(prepared_model)
    move_exported_model_to_eval(converted_model)
    print("[info] converted_model =\n", converted_model)
    converted_model(*example_inputs)  # add this line
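
A hedged, more complete sketch of these extra steps (prepared_model is assumed to come from the #2094 reproducer, and the import paths assume the PyTorch PT2E entry points; the torchao PT2E flow may expose them under a different module):

    import torch
    from torch.ao.quantization.quantize_pt2e import convert_pt2e
    from torch.ao.quantization import move_exported_model_to_eval

    # `prepared_model` is assumed to come from the #2094 reproducer
    # (export + prepare_pt2e + calibration), so it already carries the
    # observers discussed in the summary above.
    converted_model = convert_pt2e(prepared_model)
    move_exported_model_to_eval(converted_model)
    print("[info] converted_model =\n", converted_model)

    # Running the converted model with two different input shapes makes the
    # stale block sizes recorded by the observer visible.
    converted_model(torch.randn(16, 3, 224, 224))
    converted_model(torch.randn(16, 3, 56, 56))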
@Xia-Weiwen Xia-Weiwen changed the title [PT2E] observers does not handle inputs with different shapes correctly [PT2E] observers do not handle inputs with different shapes correctly Apr 23, 2025
@jerryzh168
Contributor

Yeah, we haven't tested per-tensor that much, I think; we'd need to fix the per-tensor use case.

@jerryzh168
Contributor

I have thought about this scenario a bit before. I think the fix for this one would be to use granularity instead of block_size in the observer. Please feel free to take this if you have time @Xia-Weiwen

@Xia-Weiwen
Collaborator Author

Do you mean using granularity in all cases or only for per-tensor?

@Xia-Weiwen
Collaborator Author

I think using block size = -1 for per-tensor also makes sense. What do you think?

@jerryzh168
Contributor

Do you mean using granularity in all cases or only for per-tensor?

Just for per-tensor for now. We can assert that only one of granularity and block_size is valid, and use the one that's available.

@jerryzh168
Contributor

I think using block size = -1 for per-tensor also makes sense. What do you think?

Yeah, this could make sense, but then the question is: can we express all dynamic shapes with block_size?

@Xia-Weiwen
Collaborator Author

I think so. For a certain dimension:

  • If group size > 1 is specified, we just use that group size as the block size
  • If group size == 1, such as per-channel and per-token, we use block size = 1
  • If it's not group-wise quantization, such as per-tensor and the other dimensions of per-channel, we use block size = -1

All cases can be covered. How does that sound?
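
As an illustration only, a minimal sketch of this proposed mapping with a hypothetical helper (the string-encoded granularities here are just for the example; torchao's real granularity objects are classes such as PerTensor, PerAxis, and PerGroup):

    # Hypothetical helper illustrating the proposal: -1 marks a dimension
    # whose size should follow the input tensor at runtime.
    def block_size_for(shape, granularity):
        if granularity == "per_tensor":
            # every dim follows the input
            return tuple(-1 for _ in shape)
        if granularity.startswith("per_channel:"):
            axis = int(granularity.split(":")[1])
            # size 1 on the channel axis, -1 (dynamic) everywhere else
            return tuple(1 if dim == axis else -1 for dim in range(len(shape)))
        if granularity.startswith("per_group:"):
            group = int(granularity.split(":")[1])
            # group size on the last dim, 1 on the preceding dims
            return tuple([1] * (len(shape) - 1) + [group])
        raise ValueError(f"unknown granularity: {granularity}")

    print(block_size_for((16, 3, 224, 224), "per_tensor"))  # (-1, -1, -1, -1)
    print(block_size_for((128, 256), "per_channel:0"))      # (1, -1)
    print(block_size_for((128, 256), "per_group:64"))       # (1, 64)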

@jerryzh168
Contributor

I think so. For a certain dimension:

  • If group size > 1 is specified, we just use that group size as the block size
  • If group size == 1, such as per-channel and per-token, we use block size = 1
  • If it's not group-wise quantization, such as per-tensor and the other dimensions of per-channel, we use block size = -1

All cases can be covered. How does that sound?

I remember discussing this with @drisspg and @vkuzo before. There is another dimension, which is the number of dims (the rank of the tensor). E.g. for per-token/per-row, all the preceding dims should use size 1, except for the last dim:

elif isinstance(granularity, PerRow):

E.g. for a 2-dim tensor, block_size will be (1, shape[-1]); for a 3-dim tensor, block_size will be (1, 1, shape[-1]).

I'm not sure if we will have an input tensor with variable rank, though.

So in theory this is not enough, but maybe it's enough in practice?
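
For illustration, a toy version of that rank-dependent per-row block_size computation (not the actual torchao code):

    def per_row_block_size(shape):
        # all preceding dims use size 1; the last dim covers the full row
        return tuple([1] * (len(shape) - 1) + [shape[-1]])

    print(per_row_block_size((8, 16)))     # (1, 16)
    print(per_row_block_size((4, 8, 16)))  # (1, 1, 16)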

@Xia-Weiwen
Collaborator Author

I see. Yeah, maybe an observer would not see multiple ranks in practice, but I am not sure. Would you have more discussion and decide what to do? Thanks.

@jerryzh168
Contributor

jerryzh168 commented Apr 24, 2025

Yeah, we'll discuss in our team meeting tomorrow and decide what to do.

@jerryzh168
Contributor

@Xia-Weiwen we discussed this in the meeting, and we feel that adding -1 to represent a dynamic size makes sense. Since we don't anticipate seeing cases that have a dynamic number of dims (rank) as well, we can ignore that for now. Do you want to start adding support for -1 for per-tensor quantization first?

@jerryzh168 jerryzh168 added the pt2e_quant pt2 export quantization label Apr 29, 2025
@Xia-Weiwen
Collaborator Author

@Xia-Weiwen we discussed this in the meeting, and we feel that adding -1 to represent a dynamic size makes sense. Since we don't anticipate seeing cases that have a dynamic number of dims (rank) as well, we can ignore that for now.

Great

Do you want to start adding support for -1 for per-tensor quantization first?

Sure, I will work on that.

BTW, it sounds like it can also fix issue #2094

@Xia-Weiwen Xia-Weiwen self-assigned this Apr 29, 2025
@jerryzh168 jerryzh168 removed their assignment Apr 29, 2025
@Xia-Weiwen
Collaborator Author

Hi @jerryzh168, we do see dynamic dimension size when running Resnet18, so block_size = -1 does not work. Do you want to use granularity instead of block sizes for per-tensor quantization? Thanks.

@jerryzh168
Contributor

Hi @jerryzh168, we do see dynamic dimension size when running Resnet18, so block_size = -1 does not work. Do you want to use granularity instead of block sizes for per-tensor quantization? Thanks.

I see. If it's just per-tensor, I think we can use -1 to represent it.

@drisspg
Contributor

drisspg commented May 8, 2025

The discussion was that dynamic rank is less common, and in that case -1 will likely not work. But most common quantization schemes w/ varying sizes in fixed rank tensors can be expressed w/ -1 + integer

@Xia-Weiwen
Collaborator Author

The discussion was that dynamic rank is less common, and in that case -1 will likely not work. But most common quantization schemes w/ varying sizes in fixed rank tensors can be expressed w/ -1 + integer

Thanks for your comment. Dynamic rank was found when running Resnet18, which is a classic CNN model, and the same issue may exist with all Resnet-like models. May I know what the conclusion is to cover all cases? And what does "-1 + integer" mean? Thanks.

@jerryzh168
Contributor

jerryzh168 commented May 8, 2025

@Xia-Weiwen for the case you found in resnet18, is the dynamic-rank tensor only quantized with per-tensor quantization? If so, I think we can use a single -1 number to represent both dynamic rank and dynamic dim. In other cases, we will only allow fixed rank and dynamic dim. So for example:

Per Tensor: -1
Per Group: (1, 128)
Per Axis (axis = 2): (m, n, 1)
Per Axis (axis = 2 and dim 0 is dynamic): (-1, n, 1)

Would this work?
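
A hedged sketch of how such -1 entries could be resolved against a concrete input shape (hypothetical helper, not the actual torchao quant primitives):

    def resolve_block_size(block_size, shape):
        # A single -1 means per-tensor: the block covers the whole input,
        # whatever its rank.
        if block_size == -1 or block_size == (-1,):
            return tuple(shape)
        # Otherwise the rank is fixed; -1 entries follow the input dim at runtime.
        assert len(block_size) == len(shape)
        return tuple(s if b == -1 else b for b, s in zip(block_size, shape))

    print(resolve_block_size(-1, (16, 3, 224, 224)))       # (16, 3, 224, 224)
    print(resolve_block_size((-1, 256, 1), (8, 256, 64)))  # (8, 256, 1)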

@Xia-Weiwen
Collaborator Author

@jerryzh168 Thanks for the suggestion. As I asked here #2177 (comment), it requires bypassing checks of dimensions for per-tensor quantization. Is that OK with you?

@jerryzh168
Contributor

Yeah, I think that is fine. We need to adapt our quant primitives to work with the -1 semantics overall.

@Xia-Weiwen
Collaborator Author

OK. I have updated the PR here: #2177. Now we use [-1] instead of the integer -1 for per-tensor quant due to some grammar checks. Is that OK? Please review the PR. Thanks.
