Hi,
to support CNN models, I modified the GPTQ code as follows (a rough sketch of both changes is included after the list):
1. support for grouped convolutions;
2. symmetric quantization without a zero-point parameter.
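Roughly, the changes look like this. This is a simplified round-to-nearest sketch of the quantizer and the per-group weight handling only, not the full GPTQ error-compensating update, and `quantize_sym` / `fake_quant_grouped_conv` are just illustrative names, not functions from the GPTQ repo:

```python
import torch
import torch.nn as nn

def quantize_sym(w, bits=4):
    # Symmetric uniform quantization with no zero point: the grid is
    # centered at 0 and the scale is chosen per output channel (per row).
    qmax = 2 ** (bits - 1) - 1                        # 7 for 4-bit
    scale = w.abs().amax(dim=1, keepdim=True) / qmax
    scale = scale.clamp(min=1e-8)                     # guard against all-zero rows
    q = torch.clamp(torch.round(w / scale), -qmax, qmax)
    return q * scale                                  # dequantized ("fake-quant") weights

def fake_quant_grouped_conv(conv: nn.Conv2d, bits=4):
    # Quantize each group's weight slice on its own, since each group
    # only sees its own subset of the input channels.
    w = conv.weight.data                              # [out_c, in_c / groups, kh, kw]
    per_group = w.shape[0] // conv.groups
    chunks = []
    for g in range(conv.groups):
        wg = w[g * per_group:(g + 1) * per_group]     # this group's filters
        wg2d = wg.reshape(per_group, -1)              # rows = output channels
        chunks.append(quantize_sym(wg2d, bits).reshape(wg.shape))
    conv.weight.data = torch.cat(chunks, dim=0)
    return conv
```

In my actual modification the rounding decision is driven by GPTQ's Hessian-based update instead of plain round-to-nearest; the sketch only shows the symmetric grid and the per-group split. Note that for the depthwise layers in mobilenetv2, per_group is 1, so this reduces to one scale per kernel.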
However, the accuracy is noticeably worse on the mobilenetv2/mnasnet1_0 models when quantizing to 4 bits.
Here are my results (top-1 accuracy in %; the value in parentheses is the ratio to FP32):

| model | FP32 | GPTQ W4 (sym) |
| --- | --- | --- |
| mobilenetv2 | 71.88 | 60.84 (84.64%) |
| mnasnet1_0 | 73.47 | 64.71 (88.08%) |
I only saw resnet18/resnet50 quantization results in your paper. Have you tested GPTQ on mobilenetv2 or mnasnet1_0?
Looking forward to your reply...