update compression speedup documentation (#3979)

30361a2e · Ningxin Zheng · GitHub · 726a46de · 30361a2e
Unverified Commit 30361a2e authored Jul 27, 2021 by Ningxin Zheng Committed by GitHub Jul 27, 2021

Showing with 19 additions and 3 deletions

docs/en_US/Compression/CompressionUtils.rst +19 -3

--- a/docs/en_US/Compression/CompressionUtils.rst
+++ b/docs/en_US/Compression/CompressionUtils.rst
@@ -104,7 +104,7 @@ Complicated models may have residual connection/concat operations in their model
 If layers that have a channel dependency are assigned different sparsities (here we only discuss structured pruning by L1FilterPruner/L2FilterPruner), there will be a shape conflict between these layers. Even if the pruned model works fine with masks, it cannot be sped up directly into the final model that runs on devices, because there will be a shape conflict when the model tries to add/concat the outputs of these layers. This tool finds the layers that have channel count dependencies to help users better prune their models.
 Usage
-^^^^^
+"""""
 .. code-block:: python
@@ -114,7 +114,7 @@ Usage
   channel_depen.export('dependency.csv')
 Output Example
-^^^^^^^^^^^^^^
+""""""""""""""
 The following lines are an example of the output for torchvision.models.resnet18 exported by ChannelDependency. Layers on the same line have output channel dependencies with each other. For example, layer1.1.conv2, conv1, and layer1.0.conv2 depend on each other, which means these three layers must have the same number of output channels (filters); otherwise, the model may have a shape conflict.
@@ -139,11 +139,27 @@ MaskConflict
 When the masks of different layers in a model conflict (for example, when layers with a channel dependency are assigned different sparsities), MaskConflict can fix the conflict. Specifically, MaskConflict loads the masks exported by the pruners (L1FilterPruner, etc.) and checks whether there is a mask conflict; if so, it sets the conflicting masks to the same value.
-.. code-block:: bash
+.. code-block:: python
   from nni.compression.pytorch.utils.mask_conflict import fix_mask_conflict
   fixed_mask = fix_mask_conflict('./resnet18_mask', net, data)
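Editorial note, not part of the commit: the idea of setting conflicting masks to the same value can be sketched in plain Python. This is a minimal illustration under stated assumptions, not NNI's implementation. The helper `unify_masks`, the dict-of-lists mask format, and the "keep a channel if any layer in the group keeps it" policy are all hypothetical; the policy that `fix_mask_conflict` actually applies may differ.

```python
def unify_masks(masks, dependency_groups):
    """Return a copy of `masks` in which every layer of a group shares one mask.

    masks: dict mapping layer name -> list of 0/1 flags, one per output channel
           (hypothetical format, chosen only for this illustration).
    dependency_groups: list of lists of layer names that must agree.
    """
    fixed = {name: list(mask) for name, mask in masks.items()}
    for group in dependency_groups:
        # Assumed policy: a channel survives if at least one layer keeps it.
        merged = [int(any(bits)) for bits in zip(*(masks[name] for name in group))]
        for name in group:
            fixed[name] = list(merged)
    return fixed


# Three layers with a channel dependency but different masks, plus one
# independent layer that should be left untouched.
masks = {
    "conv1": [1, 1, 0, 0],
    "layer1.0.conv2": [1, 0, 1, 0],
    "layer1.1.conv2": [1, 1, 0, 0],
    "layer1.0.conv1": [0, 1, 1, 1],
}
groups = [["conv1", "layer1.0.conv2", "layer1.1.conv2"]]
fixed = unify_masks(masks, groups)
```

After the call, the three dependent layers carry an identical mask, so their outputs can be added or concatenated without a shape conflict, while `layer1.0.conv1` keeps its original mask.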
+not_safe_to_prune
+^^^^^^^^^^^^^^^^^
+If we try to prune a layer whose output tensor is taken as input by a shape-constraint OP (for example, view or reshape), then such pruning may not be safe. For example, suppose we have a convolutional layer followed by a view function.
+.. code-block:: python
+   x = self.conv(x) # output shape is (batch, 1024, 3, 3)
+   x = x.view(-1, 1024)
+If the number of elements in the pruned conv layer's output is not divisible by 1024 (for example, an output of shape (batch, 500, 3, 3)), we may hit a shape error. We cannot replace such a function, because it operates directly on the Tensor. Therefore, we need to be careful when pruning such layers. The function not_safe_to_prune finds all the layers followed by a shape-constraint function. If you hit a shape error when running forward inference on the sped-up model, you can exclude the layers returned by not_safe_to_prune and try again. Here is a usage example.
+.. code-block:: python
+   not_safe = not_safe_to_prune(model, dummy_input)
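Editorial note, not part of the commit: the shape constraint imposed by `x.view(-1, 1024)` can be checked with simple arithmetic, without running the model. The sketch below uses only the standard library; `view_shape` is a hypothetical helper that mimics the element-count rule of a two-dimensional view, and the batch size of 8 is an assumption made for the example.

```python
from math import prod


def view_shape(input_shape, last_dim):
    """Shape that x.view(-1, last_dim) would produce for a tensor of
    `input_shape`, or ValueError if the element count does not fit."""
    total = prod(input_shape)
    if total % last_dim != 0:
        raise ValueError(
            f"cannot view {total} elements as (-1, {last_dim})"
        )
    return (total // last_dim, last_dim)


# Unpruned conv output: 8 * 1024 * 3 * 3 elements fit into rows of 1024.
original = view_shape((8, 1024, 3, 3), 1024)

# Pruned conv output with 500 channels: 8 * 500 * 3 * 3 = 36000 elements,
# which is not a multiple of 1024, so the view fails.
try:
    view_shape((8, 500, 3, 3), 1024)
    pruned_ok = True
except ValueError:
    pruned_ok = False
```

Note that divisibility alone does not make the pruning safe: a pruned shape whose element count happens to be a multiple of 1024 would pass this check while silently changing the number of rows seen by later layers, which is one reason to exclude such layers from pruning rather than rely on the absence of an error.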
 Model FLOPs/Parameters Counter
 ------------------------------