
Help: why has the model size not decreased after quantization? #564

Open
stricklandye opened this issue May 28, 2024 · 0 comments

stricklandye commented May 28, 2024

I used ProgramEntrance_1.py from this repo to quantize a simple model:

import torch.nn as nn
import torch.nn.functional as F

class GarmentClassifier(nn.Module):
    def __init__(self):
        super(GarmentClassifier, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 4 * 4, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.reshape(-1, 16 * 4 * 4)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

What I don't understand is why the quantized model is the same size as the original. Quantizing with torch.fx shrinks the model significantly. According to issue #68, PPQ's approach is to export an fp32 model plus a config file, and that config file has to be fed to the inference framework. My on-device framework is MNN. How can the JSON exported by PPQ be fed directly to MNN? Is there any sample code?
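As a quick sanity check on the file sizes reported below, the expected sizes can be computed by hand from the layer shapes in the model definition above (pure Python, no torch needed; the figures are rough, since an ONNX file also carries some graph overhead on top of the raw weights):

```python
# Parameter counts per layer, from the GarmentClassifier definition:
# conv weights are in*out*kH*kW plus one bias per output channel,
# linear weights are in*out plus one bias per output feature.
layers = {
    "conv1": 1 * 6 * 5 * 5 + 6,
    "conv2": 6 * 16 * 5 * 5 + 16,
    "fc1":   (16 * 4 * 4) * 120 + 120,
    "fc2":   120 * 84 + 84,
    "fc3":   84 * 10 + 10,
}
total = sum(layers.values())   # 44,426 parameters

fp32_kb = total * 4 / 1024     # 4 bytes/param at fp32
int8_kb = total * 1 / 1024     # 1 byte/param at int8

print(total, round(fp32_kb, 1), round(int8_kb, 1))
# → 44426 173.5 43.4
```

The ~173.5 KB fp32 estimate matches the 175K `fashion_mnist.onnx` below, and the ~43 KB int8 estimate is in the same ballpark as the 58K torch.fx output, which confirms that PPQ's 182K `quantized.onnx` is still storing fp32 weights.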

Here are the results for the example model above after torch.fx quantization and after PPQ:

[4.0K]  .
├── [1.8K]  demo.py
├── [175K]  fashion_mnist.onnx // original model size
├── [ 58K]  fx_quant_fashion_mnist.onnx // model after torch.fx quantization
├── [ 163]  generate_data.py
├── [8.5K]  ppq_runner.py
└── [4.0K]  working
    ├── [175K]  model.onnx
    ├── [ 37K]  quant_cfg.json
    └── [182K]  quantized.onnx // model after PPQ quantization

What should I do to actually reduce the model size?
