Unverified Commit 30361a2e authored by Ningxin Zheng's avatar Ningxin Zheng Committed by GitHub
Browse files

update compression speedup documentation (#3979)

parent 726a46de
...@@ -104,7 +104,7 @@ Complicated models may have residual connection/concat operations in their model ...@@ -104,7 +104,7 @@ Complicated models may have residual connection/concat operations in their model
If the layers have channel dependency are assigned with different sparsities (here we only discuss the structured pruning by L1FilterPruner/L2FilterPruner), then there will be a shape conflict during these layers. Even the pruned model with mask works fine, the pruned model cannot be speedup to the final model directly that runs on the devices, because there will be a shape conflict when the model tries to add/concat the outputs of these layers. This tool is to find the layers that have channel count dependencies to help users better prune their model. If the layers have channel dependency are assigned with different sparsities (here we only discuss the structured pruning by L1FilterPruner/L2FilterPruner), then there will be a shape conflict during these layers. Even the pruned model with mask works fine, the pruned model cannot be speedup to the final model directly that runs on the devices, because there will be a shape conflict when the model tries to add/concat the outputs of these layers. This tool is to find the layers that have channel count dependencies to help users better prune their model.
Usage Usage
^^^^^ """""
.. code-block:: python .. code-block:: python
...@@ -114,7 +114,7 @@ Usage ...@@ -114,7 +114,7 @@ Usage
channel_depen.export('dependency.csv') channel_depen.export('dependency.csv')
Output Example Output Example
^^^^^^^^^^^^^^ """"""""""""""
The following lines are the output example of torchvision.models.resnet18 exported by ChannelDependency. The layers at the same line have output channel dependencies with each other. For example, layer1.1.conv2, conv1, and layer1.0.conv2 have output channel dependencies with each other, which means the output channel(filters) numbers of these three layers should be same with each other, otherwise, the model may have shape conflict. The following lines are the output example of torchvision.models.resnet18 exported by ChannelDependency. The layers at the same line have output channel dependencies with each other. For example, layer1.1.conv2, conv1, and layer1.0.conv2 have output channel dependencies with each other, which means the output channel(filters) numbers of these three layers should be same with each other, otherwise, the model may have shape conflict.
...@@ -139,11 +139,27 @@ MaskConflict ...@@ -139,11 +139,27 @@ MaskConflict
When the masks of different layers in a model have conflict (for example, assigning different sparsities for the layers that have channel dependency), we can fix the mask conflict by MaskConflict. Specifically, the MaskConflict loads the masks exported by the pruners(L1FilterPruner, etc), and check if there is mask conflict, if so, MaskConflict sets the conflicting masks to the same value. When the masks of different layers in a model have conflict (for example, assigning different sparsities for the layers that have channel dependency), we can fix the mask conflict by MaskConflict. Specifically, the MaskConflict loads the masks exported by the pruners(L1FilterPruner, etc), and check if there is mask conflict, if so, MaskConflict sets the conflicting masks to the same value.
.. code-block:: bash .. code-block:: python
from nni.compression.pytorch.utils.mask_conflict import fix_mask_conflict from nni.compression.pytorch.utils.mask_conflict import fix_mask_conflict
fixed_mask = fix_mask_conflict('./resnet18_mask', net, data) fixed_mask = fix_mask_conflict('./resnet18_mask', net, data)
not_safe_to_prune
^^^^^^^^^^^^^^^^^
If we try to prune a layer whose output tensor is taken as the input by a shape-constraint OP(for example, view, reshape), then such pruning maybe not be safe. For example, we have a convolutional layer followed by a view function.
.. code-block:: python
x = self.conv(x) # output shape is (batch, 1024, 3, 3)
x = x.view(-1, 1024)
If the output shape of the pruned conv layer is not divisible by 1024(for example(batch, 500, 3, 3)), we may meet a shape error. We cannot replace such a function that directly operates on the Tensor. Therefore, we need to be careful when pruning such layers. The function not_safe_to_prune finds all the layers followed by a shape-constraint function. Here is an example for usage. If you meet a shape error when running the forward inference on the speeduped model, you can exclude the layers returned by not_safe_to_prune and try again.
.. code-block:: python
not_safe = not_safe_to_prune(model, dummy_input)
Model FLOPs/Parameters Counter Model FLOPs/Parameters Counter
------------------------------ ------------------------------
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment