"...composable_kernel_rocm.git" did not exist on "fcfe70f9f443dcd79438cbdcca48e67acc6113a5"
Unverified Commit 0494cae1 authored by colorjam, committed by GitHub

Update readme doc link (#3482)

parent e85f029b
@@ -13,7 +13,7 @@ The experiments are performed with the following pruners/datasets/models:

*
-  Models: :githublink:`VGG16, ResNet18, ResNet50 <examples/model_compress/models/cifar10>`
+  Models: :githublink:`VGG16, ResNet18, ResNet50 <examples/model_compress/pruning/models/cifar10>`

*
  Datasets: CIFAR-10
@@ -96,14 +96,14 @@ Implementation Details

  This avoids potential issues of counting them of masked models.

*
-  The experiment code can be found :githublink:`here <examples/model_compress/auto_pruners_torch.py>`.
+  The experiment code can be found :githublink:`here <examples/model_compress/pruning/auto_pruners_torch.py>`.

Experiment Result Rendering
^^^^^^^^^^^^^^^^^^^^^^^^^^^

*
-  If you follow the practice in the :githublink:`example <examples/model_compress/auto_pruners_torch.py>`\ , for every single pruning experiment, the experiment result will be saved in JSON format as follows:
+  If you follow the practice in the :githublink:`example <examples/model_compress/pruning/auto_pruners_torch.py>`\ , for every single pruning experiment, the experiment result will be saved in JSON format as follows:

  .. code-block:: json

@@ -114,8 +114,8 @@ Experiment Result Rendering

  }

*
-  The experiment results are saved :githublink:`here <examples/model_compress/comparison_of_pruners>`.
-  You can refer to :githublink:`analyze <examples/model_compress/comparison_of_pruners/analyze.py>` to plot new performance comparison figures.
+  The experiment results are saved :githublink:`here <examples/model_compress/pruning/comparison_of_pruners>`.
+  You can refer to :githublink:`analyze <examples/model_compress/pruning/comparison_of_pruners/analyze.py>` to plot new performance comparison figures.

Contribution
------------
...
@@ -14,7 +14,9 @@ NNI provides a model compression toolkit to help user compress and speed up thei...

* Provide friendly and easy-to-use compression utilities for users to dive into the compression process and results.
* Concise interface for users to customize their own compression algorithms.

-*Note that the interface and APIs are unified for both PyTorch and TensorFlow, currently only PyTorch version has been supported, TensorFlow version will be supported in future.*
+.. note::
+  Since NNI compression algorithms are not meant to compress model while NNI speedup tool can truly compress model and reduce latency. To obtain a truly compact model, users should conduct `model speedup <./ModelSpeedup.rst>`__.
+  The interface and APIs are unified for both PyTorch and TensorFlow, currently only PyTorch version has been supported, TensorFlow version will be supported in future.

Supported Algorithms
--------------------

@@ -24,7 +26,7 @@ The algorithms include pruning algorithms and quantization algorithms.

Pruning Algorithms
^^^^^^^^^^^^^^^^^^

Pruning algorithms compress the original network by removing redundant weights or channels of layers, which can reduce model complexity and address the over-fitting issue.

.. list-table::
   :header-rows: 1
...
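The note introduced in this hunk points to the speedup step that actually shrinks a masked model. A minimal sketch of that step, assuming the ``ModelSpeedup`` import path and ``speedup_model()`` call of the NNI 2.x line this commit belongs to (the model and file names are illustrative):

.. code-block:: python

   import torch
   from torchvision.models import resnet18
   from nni.compression.pytorch import ModelSpeedup

   # A pruner only attaches masks, so parameter count and latency are unchanged
   # until speedup replaces the masked structures with genuinely smaller modules.
   model = resnet18()
   dummy_input = torch.rand(1, 3, 224, 224)  # input shape used to trace the graph
   masks_file = 'mask.pth'                   # illustrative path exported by a pruner

   ModelSpeedup(model, dummy_input, masks_file).speedup_model()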
@@ -17,30 +17,37 @@ The ``dict``\ s in the ``list`` are applied one by one, that is, the configurati...

There are different keys in a ``dict``. Some of them are common keys supported by all the compression algorithms:

-* **op_types**\ : This is to specify what types of operations to be compressed. 'default' means following the algorithm's default setting.
+* **op_types**\ : This is to specify what types of operations to be compressed. 'default' means following the algorithm's default setting. All supported module types are defined in :githublink:`default_layers.py <nni/compression/pytorch/default_layers.py>` for pytorch.
* **op_names**\ : This is to specify by name what operations to be compressed. If this field is omitted, operations will not be filtered by it.
* **exclude**\ : Default is False. If this field is True, it means the operations with specified types and names will be excluded from the compression.

Some other keys are often specific to a certain algorithm, users can refer to `pruning algorithms <./Pruner.rst>`__ and `quantization algorithms <./Quantizer.rst>`__ for the keys allowed by each algorithm.

-A simple example of configuration is shown below:
+To prune all ``Conv2d`` layers with the sparsity of 0.6, the configuration can be written as:

.. code-block:: python

-   [
-       {
-           'sparsity': 0.8,
-           'op_types': ['default']
-       },
-       {
-           'sparsity': 0.6,
-           'op_names': ['op_name1', 'op_name2']
-       },
-       {
-           'exclude': True,
-           'op_names': ['op_name3']
-       }
-   ]
+   [{
+       'sparsity': 0.6,
+       'op_types': ['Conv2d']
+   }]
+
+To control the sparsity of specific layers, the configuration can be written as:
+
+.. code-block:: python
+
+   [{
+       'sparsity': 0.8,
+       'op_types': ['default']
+   },
+   {
+       'sparsity': 0.6,
+       'op_names': ['op_name1', 'op_name2']
+   },
+   {
+       'exclude': True,
+       'op_names': ['op_name3']
+   }]

It means following the algorithm's default setting for compressed operations with sparsity 0.8, but for ``op_name1`` and ``op_name2`` use sparsity 0.6, and do not compress ``op_name3``.
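A ``config_list`` like the ones above is passed directly to a pruner's constructor. A minimal sketch, assuming the ``LevelPruner`` import path and the ``compress``/``export_model`` calls of the NNI 2.x releases (the toy model and file paths are illustrative):

.. code-block:: python

   import torch.nn as nn
   from nni.algorithms.compression.pytorch.pruning import LevelPruner

   model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Conv2d(16, 32, 3))

   # Prune every Conv2d layer to 60% sparsity, as in the first example above.
   config_list = [{
       'sparsity': 0.6,
       'op_types': ['Conv2d']
   }]

   pruner = LevelPruner(model, config_list)
   model = pruner.compress()  # applies masks and returns the wrapped model
   pruner.export_model(model_path='pruned.pth', mask_path='mask.pth')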
@@ -62,10 +69,10 @@ bits length of quantization, key is the quantization type, value is the quantiza...

.. code-block:: bash

   {
       quant_bits: {
           'weight': 8,
           'output': 4,
       },
   }

when the value is int type, all quantization types share same bits length. eg.

@@ -73,7 +80,7 @@ when the value is int type, all quantization types share same bits length. eg.

.. code-block:: bash

   {
       quant_bits: 8, # weight or output quantization are all 8 bits
   }

The following example shows a more complete ``config_list``\ , it uses ``op_names`` (or ``op_types``\ ) to specify the target layers along with the quantization bits for those layers.
@@ -81,25 +88,26 @@ The following example shows a more complete ``config_list``\ , it uses ``op_name...

.. code-block:: bash

   config_list = [{
       'quant_types': ['weight'],
       'quant_bits': 8,
       'op_names': ['conv1']
   },
   {
       'quant_types': ['weight'],
       'quant_bits': 4,
       'quant_start_step': 0,
       'op_names': ['conv2']
   },
   {
       'quant_types': ['weight'],
       'quant_bits': 3,
       'op_names': ['fc1']
   },
   {
       'quant_types': ['weight'],
       'quant_bits': 2,
       'op_names': ['fc2']
   }]

In this example, 'op_names' are the layer names, and the four layers will be quantized with different quant_bits.
...
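To show where such a quantization ``config_list`` ends up, a minimal sketch assuming the ``QAT_Quantizer`` import path and ``compress`` call of the NNI 2.x line; the toy model, optimizer, and layer name are placeholders:

.. code-block:: python

   import torch
   import torch.nn as nn
   from nni.algorithms.compression.pytorch.quantization import QAT_Quantizer

   model = nn.Sequential(
       nn.Conv2d(3, 8, kernel_size=3),  # registered as module '0' in this Sequential
       nn.ReLU(),
       nn.Flatten(),
       nn.Linear(8 * 30 * 30, 10)
   )
   optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

   # Quantize the weights of the first conv layer to 8 bits; add one dict per layer.
   config_list = [{
       'quant_types': ['weight'],
       'quant_bits': 8,
       'op_names': ['0']
   }]

   quantizer = QAT_Quantizer(model, config_list, optimizer)
   quantizer.compress()  # wraps the listed layers with fake-quantization logic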
cifar-10-python.tar.gz
cifar-10-batches-py/
\ No newline at end of file