
Help: why has the model size not decreased after quantization? #564

Open
stricklandye opened this issue May 28, 2024 · 0 comments

stricklandye commented May 28, 2024

I used ProgramEntrance_1.py from this repo to quantize a simple model:

import torch.nn as nn
import torch.nn.functional as F

class GarmentClassifier(nn.Module):
    def __init__(self):
        super(GarmentClassifier, self).__init__()
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 4 * 4, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.reshape(-1, 16 * 4 * 4)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

What I don't understand is why the quantized model is the same size as the original. Quantizing with torch.fx shrinks the model significantly. According to issue #68, PPQ's approach is to export an fp32 model plus a config file, and that config file has to be fed to the inference framework. My on-device framework is MNN. How can the JSON exported by PPQ be fed directly to MNN? Is there any sample code?
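As a quick sanity check on the file sizes reported below, the expected sizes can be computed by hand from the layer shapes in the model definition above (pure Python, no torch needed; the figures are rough, since an ONNX file also carries some graph overhead on top of the raw weights):

```python
# Parameter counts per layer, from the GarmentClassifier definition:
# conv weights are in*out*kH*kW plus one bias per output channel,
# linear weights are in*out plus one bias per output feature.
layers = {
    "conv1": 1 * 6 * 5 * 5 + 6,
    "conv2": 6 * 16 * 5 * 5 + 16,
    "fc1":   (16 * 4 * 4) * 120 + 120,
    "fc2":   120 * 84 + 84,
    "fc3":   84 * 10 + 10,
}
total = sum(layers.values())   # 44,426 parameters

fp32_kb = total * 4 / 1024     # 4 bytes/param at fp32
int8_kb = total * 1 / 1024     # 1 byte/param at int8

print(total, round(fp32_kb, 1), round(int8_kb, 1))
# → 44426 173.5 43.4
```

The ~173.5 KB fp32 estimate matches the 175K `fashion_mnist.onnx` below, and the ~43 KB int8 estimate is in the same ballpark as the 58K torch.fx output, which confirms that PPQ's 182K `quantized.onnx` is still storing fp32 weights.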

Here are the results for the example model above after torch.fx quantization and after PPQ:

[4.0K]  .
├── [1.8K]  demo.py
├── [175K]  fashion_mnist.onnx // original model size
├── [ 58K]  fx_quant_fashion_mnist.onnx // model after torch.fx quantization
├── [ 163]  generate_data.py
├── [8.5K]  ppq_runner.py
└── [4.0K]  working
    ├── [175K]  model.onnx
    ├── [ 37K]  quant_cfg.json
    └── [182K]  quantized.onnx // model after PPQ quantization

What should I do to actually reduce the model size?
