Yihui He 何宜晖 edited this page Jan 9, 2018 · 2 revisions
Our 3C approach applies three methods sequentially. Given conv weights `W`:
- Spatial Decomposition produces `W_v` and `W_h`.
- Channel Decomposition decomposes `W_h` and outputs `W_h'` and `W_p`.
- Channel Pruning prunes `W_p`.
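To make the shapes in the Spatial Decomposition step concrete, here is a minimal NumPy sketch that factors a conv kernel into a vertical and a horizontal kernel with a truncated SVD. This is one closed-form way to minimize the filter reconstruction error in the Frobenius norm without using any data. It is an illustration only, not the repository's implementation: the function names `spatial_decompose` and `reconstruct`, the `(n, c, kh, kw)` weight layout, and the rank parameter `d` are all assumptions.

```python
import numpy as np


def spatial_decompose(W, d):
    """Factor W (n, c, kh, kw) into W_v (d, c, kh, 1) and W_h (n, d, 1, kw).

    Hypothetical helper: a data-independent factorization via truncated SVD.
    """
    n, c, kh, kw = W.shape
    if d > min(c * kh, n * kw):
        raise ValueError("d must not exceed min(c * kh, n * kw)")
    # Rows index (input channel, vertical tap); columns index
    # (output channel, horizontal tap).
    A = W.transpose(1, 2, 0, 3).reshape(c * kh, n * kw)
    U, S, Vt = np.linalg.svd(A, full_matrices=False)
    root = np.sqrt(S[:d])  # split each singular value between the two factors
    # Vertical filters: d output channels, c input channels, kh x 1 kernels.
    W_v = (U[:, :d] * root).T.reshape(d, c, kh, 1)
    # Horizontal filters: n output channels, d input channels, 1 x kw kernels.
    W_h = (root[:, None] * Vt[:d]).reshape(d, n, kw)
    W_h = W_h.transpose(1, 0, 2).reshape(n, d, 1, kw)
    return W_v, W_h


def reconstruct(W_v, W_h):
    """Compose the two kernels back into an equivalent (n, c, kh, kw) kernel."""
    return np.einsum("kci,nkj->ncij", W_v[..., 0], W_h[:, :, 0, :])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.standard_normal((64, 32, 3, 3))
    W_v, W_h = spatial_decompose(W, d=16)
    rel_err = np.linalg.norm(reconstruct(W_v, W_h) - W) / np.linalg.norm(W)
    print(W_v.shape, W_h.shape, rel_err)
```

Applying `W_v` followed by `W_h` is equivalent to a single convolution with `reconstruct(W_v, W_h)`, so the relative error printed above measures how much of the original filter a given `d` keeps.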
Initially, we adopted Filter Reconstruction in the Spatial Decomposition step, which is data independent.
We found that overall model performance improves when `W_h` is instead fitted to minimize the error on the output feature map after ReLU (that is, in a data-dependent way). The method comes from the nonlinear case (3.2) in Channel Decomposition. The corresponding function in our code is `nonlinear_fc`.
It alternates between two steps.
First, minimize the error on the feature map before ReLU with linear least squares:
Second, minimize the error on the feature map after ReLU:
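The two alternating steps can be sketched in NumPy as follows. This is a minimal illustration, not the code of `nonlinear_fc`. It assumes the relaxed objective `||relu(Y) - relu(Z)||^2 + lam * ||Z - (X @ W + b)||^2` with an auxiliary variable `Z`, where `X` holds the input features (one sample per row) and `Y` the original responses before ReLU. The names `solve_linear`, `update_z`, `alternating_fit`, the penalty `lam`, and its schedule are assumptions made for the example.

```python
import numpy as np


def relu(a):
    return np.maximum(a, 0.0)


def solve_linear(X, Z):
    """Step 1: least-squares fit of W, b so that X @ W + b ~= Z (before ReLU)."""
    X1 = np.hstack([X, np.ones((X.shape[0], 1))])  # extra column for the bias
    sol, *_ = np.linalg.lstsq(X1, Z, rcond=None)
    return sol[:-1], sol[-1]


def update_z(Y, Y_hat, lam):
    """Step 2: elementwise minimizer of
    (relu(Y) - relu(z))^2 + lam * (z - Y_hat)^2 over z (after-ReLU error)."""
    rY = relu(Y)
    # Candidate with z <= 0: relu(z) = 0, so only the penalty term depends on z.
    z_neg = np.minimum(Y_hat, 0.0)
    cost_neg = rY ** 2 + lam * (z_neg - Y_hat) ** 2
    # Candidate with z >= 0: relu(z) = z, a 1-D quadratic clipped at zero.
    z_pos = np.maximum((rY + lam * Y_hat) / (1.0 + lam), 0.0)
    cost_pos = (rY - z_pos) ** 2 + lam * (z_pos - Y_hat) ** 2
    return np.where(cost_neg <= cost_pos, z_neg, z_pos)


def alternating_fit(X, Y, lams=(0.01, 0.1, 1.0), n_iter=10):
    """Alternate the two steps; the lam schedule is an assumed choice."""
    Z = Y.copy()
    for lam in lams:
        for _ in range(n_iter):
            W, b = solve_linear(X, Z)         # fix Z, solve for W and b
            Z = update_z(Y, X @ W + b, lam)   # fix W and b, solve for Z
    return solve_linear(X, Z)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((1000, 8))
    Y = rng.standard_normal((1000, 4))
    W, b = alternating_fit(X, Y)
    err = np.linalg.norm(relu(X @ W + b) - relu(Y)) / np.linalg.norm(relu(Y))
    print(W.shape, b.shape, err)
```

The first step is an ordinary linear least-squares problem, which is why it is cheap to repeat. The second step has a closed form because the objective separates over the elements of `Z`, so each entry only needs a comparison between a non-positive and a non-negative candidate.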