[Model Compression] auto compression (#3631)

Unverified Commit a8879dd6 authored May 28, 2021 by J-shang Committed by GitHub May 28, 2021
14 changed files
--- a/docs/en_US/Compression/AutoCompression.rst
+++ b/docs/en_US/Compression/AutoCompression.rst
+Auto Compression with NNI Experiment
+====================================
+If you want to compress your model but don't know which compression algorithm to choose or what sparsity suits your model, or you just want to try more possibilities, auto compression may help.
+Users choose the compression algorithms to try and define each algorithm's search space; auto compression then launches an NNI experiment and automatically tries the algorithms with varying sparsity.
+In addition to the sparsity rate, users can also introduce other related parameters into the search space.
+If you are not familiar with search spaces or how to write one, see `this <./Tutorial/SearchSpaceSpec.rst>`__ for reference.
+Using auto compression is similar to launching an NNI experiment from Python.
+The main differences are as follows:
+* Use a generator to help generate the search space object.
+* Provide the model to be compressed; the model should already be pre-trained.
+* There is no need to set ``trial_command``; instead, pass your ``auto_compress_module`` as an input to ``AutoCompressionExperiment``.
+Generate search space
+---------------------
+Because nested search spaces are used extensively, we recommend using a generator to configure the search space.
+The following is an example. Use ``add_config()`` to add a sub-config, then ``dumps()`` to output the search space dict.
+.. code-block:: python
+    from nni.algorithms.compression.pytorch.auto_compress import AutoCompressionSearchSpaceGenerator
+    generator = AutoCompressionSearchSpaceGenerator()
+    generator.add_config('level', [
+        {
+            "sparsity": {
+                "_type": "uniform",
+                "_value": [0.01, 0.99]
+            },
+            'op_types': ['default']
+        }
+    ])
+    generator.add_config('qat', [
+        {
+            'quant_types': ['weight', 'output'],
+            'quant_bits': {
+                'weight': 8,
+                'output': 8
+            },
+            'op_types': ['Conv2d', 'Linear']
+        }
+    ])
+    search_space = generator.dumps()
+The following pruners and quantizers are currently supported:
+.. code-block:: python
+    PRUNER_DICT = {
+        'level': LevelPruner,
+        'slim': SlimPruner,
+        'l1': L1FilterPruner,
+        'l2': L2FilterPruner,
+        'fpgm': FPGMPruner,
+        'taylorfo': TaylorFOWeightFilterPruner,
+        'apoz': ActivationAPoZRankFilterPruner,
+        'mean_activation': ActivationMeanRankFilterPruner
+    }
+    QUANTIZER_DICT = {
+        'naive': NaiveQuantizer,
+        'qat': QAT_Quantizer,
+        'dorefa': DoReFaQuantizer,
+        'bnn': BNNQuantizer
+    }
+Provide user model for compression
+----------------------------------
+Users need to inherit from ``AbstractAutoCompressionModule`` and override its abstract class methods.
+.. code-block:: python
+    from typing import Callable
+    import torch.nn as nn
+    from nni.algorithms.compression.pytorch.auto_compress import AbstractAutoCompressionModule
+    class AutoCompressionModule(AbstractAutoCompressionModule):
+        @classmethod
+        def model(cls) -> nn.Module:
+            ...
+            return _model
+        @classmethod
+        def evaluator(cls) -> Callable[[nn.Module], float]:
+            ...
+            return _evaluator
+Users need to implement at least ``model()`` and ``evaluator()``.
+If you use an iterative pruner, you additionally need to implement ``optimizer_factory()``, ``criterion()`` and ``sparsifying_trainer()``.
+If you want to finetune the model after compression, you need to implement ``optimizer_factory()``, ``criterion()``, ``post_compress_finetuning_trainer()`` and ``post_compress_finetuning_epochs()``.
+``optimizer_factory()`` should return a factory function whose input is an iterable, for example your ``model.parameters()``, and whose output is an optimizer instance.
+The two trainer methods, ``sparsifying_trainer()`` and ``post_compress_finetuning_trainer()``, should each return a trainer that takes ``model, optimizer, criterion, current_epoch`` as input.
+For the full abstract interface, see :githublink:`interface.py <nni/algorithms/compression/pytorch/auto_compress/interface.py>`.
+For an example ``AutoCompressionModule`` implementation, see :githublink:`auto_compress_module.py <examples/model_compress/auto_compress/torch/auto_compress_module.py>`.
+Launch NNI experiment
+---------------------
+Launching is similar to launching an experiment from Python. The differences are that there is no need to set ``trial_command``, and that the user-provided ``AutoCompressionModule`` is passed as an input to ``AutoCompressionExperiment``.
+.. code-block:: python
+    from pathlib import Path
+    from nni.algorithms.compression.pytorch.auto_compress import AutoCompressionExperiment
+    from auto_compress_module import AutoCompressionModule
+    experiment = AutoCompressionExperiment(AutoCompressionModule, 'local')
+    experiment.config.experiment_name = 'auto compression torch example'
+    experiment.config.trial_concurrency = 1
+    experiment.config.max_trial_number = 10
+    experiment.config.search_space = search_space
+    experiment.config.trial_code_directory = Path(__file__).parent
+    experiment.config.tuner.name = 'TPE'
+    experiment.config.tuner.class_args['optimize_mode'] = 'maximize'
+    experiment.config.training_service.use_active_gpu = True
+    experiment.run(8088)
--- a/docs/en_US/Compression/AutoPruningUsingTuners.rst
+++ b/docs/en_US/Compression/AutoPruningUsingTuners.rst
-Automatic Model Pruning using NNI Tuners
-========================================
-It's convenient to implement auto model pruning with NNI compression and NNI tuners
-First, model compression with NNI
---------------------------------
-You can easily compress a model with NNI compression. Take pruning for example, you can prune a pretrained model with L2FilterPruner like this
-.. code-block:: python
-   from nni.algorithms.compression.pytorch.pruning import L2FilterPruner
-   config_list = [{ 'sparsity': 0.5, 'op_types': ['Conv2d'] }]
-   pruner = L2FilterPruner(model, config_list)
-   pruner.compress()
-The 'Conv2d' op_type stands for the module types defined in :githublink:`default_layers.py <nni/compression/pytorch/default_layers.py>` for pytorch.
-Therefore ``{ 'sparsity': 0.5, 'op_types': ['Conv2d'] }``\ means that **all layers with specified op_types will be compressed with the same 0.5 sparsity**. When ``pruner.compress()`` called, the model is compressed with masks and after that you can normally fine tune this model and **pruned weights won't be updated** which have been masked.
-Then, make this automatic
-------------------------
-The previous example manually chose L2FilterPruner and pruned with a specified sparsity. Different sparsity and different pruners may have different effects on different models. This process can be done with NNI tuners.
-Firstly, modify our codes for few lines
-.. code-block:: python
-    import nni
-    from nni.algorithms.compression.pytorch.pruning import *
-    params = nni.get_parameters()
-    sparsity = params['sparsity']
-    pruner_name = params['pruner']
-    model_name = params['model']
-    model, pruner = get_model_pruner(model_name, pruner_name, sparsity)
-    pruner.compress()
-    train(model)  # your code for fine-tuning the model
-    acc = test(model)  # test the fine-tuned model
-    nni.report_final_results(acc)
-Then, define a ``config`` file in YAML to automatically tuning model, pruning algorithm and sparsity.
-.. code-block:: yaml
-    searchSpace:
-    sparsity:
-      _type: choice
-      _value: [0.25, 0.5, 0.75]
-    pruner:
-      _type: choice
-      _value: ['slim', 'l2filter', 'fpgm', 'apoz']
-    model:
-      _type: choice
-      _value: ['vgg16', 'vgg19']
-    trainingService:
-    platform: local
-    trialCodeDirectory: .
-    trialCommand: python3 basic_pruners_torch.py --nni
-    trialConcurrency: 1
-    trialGpuNumber: 0
-    tuner:
-      name: GridSearch
-The full example can be found :githublink:`here <examples/model_compress/pruning/config.yml>`
-Finally, start the searching via
-.. code-block:: bash
-   nnictl create -c config.yml
--- a/docs/en_US/Compression/advanced.rst
+++ b/docs/en_US/Compression/advanced.rst
@@ -6,4 +6,4 @@ Advanced Usage
    Framework <./Framework>
    Customize a new algorithm <./CustomizeCompressor>
-    Automatic Model Compression <./AutoPruningUsingTuners>
+    Automatic Model Compression (Beta) <./AutoCompression>
--- a/docs/en_US/Compression/pruning.rst
+++ b/docs/en_US/Compression/pruning.rst
@@ -23,4 +23,3 @@ For details, please refer to the following tutorials:
    Pruners <Pruner>
    Dependency Aware Mode <DependencyAware>
    Model Speedup <ModelSpeedup>
-    Automatic Model Pruning with NNI Tuners <AutoPruningUsingTuners>
--- a/examples/model_compress/auto_compress/torch/auto_compress_module.py
+++ b/examples/model_compress/auto_compress/torch/auto_compress_module.py
+# Copyright (c) Microsoft Corporation.
+# Licensed under the MIT license.
+from typing import Callable, Optional, Iterable
+import torch
+import torch.nn as nn
+import torch.optim as optim
+import torch.nn.functional as F
+from torchvision import datasets, transforms
+from nni.algorithms.compression.pytorch.auto_compress import AbstractAutoCompressionModule
+torch.manual_seed(1)
+class LeNet(nn.Module):
+    def __init__(self):
+        super(LeNet, self).__init__()
+        self.conv1 = nn.Conv2d(1, 32, 3, 1)
+        self.conv2 = nn.Conv2d(32, 64, 3, 1)
+        self.dropout1 = nn.Dropout2d(0.25)
+        self.dropout2 = nn.Dropout2d(0.5)
+        self.fc1 = nn.Linear(9216, 128)
+        self.fc2 = nn.Linear(128, 10)
+    def forward(self, x):
+        x = self.conv1(x)
+        x = F.relu(x)
+        x = self.conv2(x)
+        x = F.relu(x)
+        x = F.max_pool2d(x, 2)
+        x = self.dropout1(x)
+        x = torch.flatten(x, 1)
+        x = self.fc1(x)
+        x = F.relu(x)
+        x = self.dropout2(x)
+        x = self.fc2(x)
+        output = F.log_softmax(x, dim=1)
+        return output
+_use_cuda = torch.cuda.is_available()
+_train_kwargs = {'batch_size': 64}
+_test_kwargs = {'batch_size': 1000}
+if _use_cuda:
+    _cuda_kwargs = {'num_workers': 1,
+                    'pin_memory': True,
+                    'shuffle': True}
+    _train_kwargs.update(_cuda_kwargs)
+    _test_kwargs.update(_cuda_kwargs)
+_transform = transforms.Compose([
+    transforms.ToTensor(),
+    transforms.Normalize((0.1307,), (0.3081,))
+])
+_device = torch.device("cuda" if _use_cuda else "cpu")
+_train_loader = None
+_test_loader = None
+def _train(model, optimizer, criterion, epoch):
+    global _train_loader
+    if _train_loader is None:
+        dataset = datasets.MNIST('./data', train=True, download=True, transform=_transform)
+        _train_loader = torch.utils.data.DataLoader(dataset, **_train_kwargs)
+    model.train()
+    for data, target in _train_loader:
+        data, target = data.to(_device), target.to(_device)
+        optimizer.zero_grad()
+        output = model(data)
+        loss = criterion(output, target)
+        loss.backward()
+        optimizer.step()
+def _test(model):
+    global _test_loader
+    if _test_loader is None:
+        dataset = datasets.MNIST('./data', train=False, transform=_transform)
+        _test_loader = torch.utils.data.DataLoader(dataset, **_test_kwargs)
+    model.eval()
+    test_loss = 0
+    correct = 0
+    with torch.no_grad():
+        for data, target in _test_loader:
+            data, target = data.to(_device), target.to(_device)
+            output = model(data)
+            test_loss += F.nll_loss(output, target, reduction='sum').item()
+            pred = output.argmax(dim=1, keepdim=True)
+            correct += pred.eq(target.view_as(pred)).sum().item()
+    test_loss /= len(_test_loader.dataset)
+    acc = 100 * correct / len(_test_loader.dataset)
+    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
+        test_loss, correct, len(_test_loader.dataset), acc))
+    return acc
+_model = LeNet().to(_device)
+_model.load_state_dict(torch.load('mnist_pretrain_lenet.pth'))
+class AutoCompressionModule(AbstractAutoCompressionModule):
+    @classmethod
+    def model(cls) -> nn.Module:
+        return _model
+    @classmethod
+    def evaluator(cls) -> Callable[[nn.Module], float]:
+        return _test
+    @classmethod
+    def optimizer_factory(cls) -> Optional[Callable[[Iterable], optim.Optimizer]]:
+        def _optimizer_factory(params: Iterable):
+            return torch.optim.SGD(params, lr=0.01)
+        return _optimizer_factory
+    @classmethod
+    def criterion(cls) -> Optional[Callable]:
+        return F.nll_loss
+    @classmethod
+    def sparsifying_trainer(cls, compress_algorithm_name: str) -> Optional[Callable[[nn.Module, optim.Optimizer, Callable, int], None]]:
+        return _train
+    @classmethod
+    def post_compress_finetuning_trainer(cls, compress_algorithm_name: str) -> Optional[Callable[[nn.Module, optim.Optimizer, Callable, int], None]]:
+        return _train
+    @classmethod
+    def post_compress_finetuning_epochs(cls, compress_algorithm_name: str) -> int:
+        return 2
--- a/examples/model_compress/auto_compress/torch/auto_compress_torch.py
+++ b/examples/model_compress/auto_compress/torch/auto_compress_torch.py
+# Copyright (c) Microsoft Corporation.
+# Licensed under the MIT license.
+from pathlib import Path
+from nni.algorithms.compression.pytorch.auto_compress import AutoCompressionExperiment, AutoCompressionSearchSpaceGenerator
+from auto_compress_module import AutoCompressionModule
+generator = AutoCompressionSearchSpaceGenerator()
+generator.add_config('level', [
+    {
+        "sparsity": {
+            "_type": "uniform",
+            "_value": [0.01, 0.99]
+        },
+        'op_types': ['default']
+    }
+])
+generator.add_config('l1', [
+    {
+        "sparsity": {
+            "_type": "uniform",
+            "_value": [0.01, 0.99]
+        },
+        'op_types': ['Conv2d']
+    }
+])
+generator.add_config('qat', [
+    {
+        'quant_types': ['weight', 'output'],
+        'quant_bits': {
+            'weight': 8,
+            'output': 8
+        },
+        'op_types': ['Conv2d', 'Linear']
+    }])
+search_space = generator.dumps()
+experiment = AutoCompressionExperiment(AutoCompressionModule, 'local')
+experiment.config.experiment_name = 'auto compression torch example'
+experiment.config.trial_concurrency = 1
+experiment.config.max_trial_number = 10
+experiment.config.search_space = search_space
+experiment.config.trial_code_directory = Path(__file__).parent
+experiment.config.tuner.name = 'TPE'
+experiment.config.tuner.class_args['optimize_mode'] = 'maximize'
+experiment.config.training_service.use_active_gpu = True
+experiment.run(8088)
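To make the nested search space more concrete, here is a minimal sketch of a hand-built equivalent for the two pruner entries above. It is not the generator's implementation (that code is not fully shown here). The helper `compact_pruner_choice` is hypothetical, and the key layout `config_list::<op_types>::<op_names>::<var_name>` is an assumption inferred from how the trial engine in this commit splits keys on `::`; the real `dumps()` output may differ.

```python
import json


def compact_pruner_choice(name, config_list):
    """Flatten one pruner config_list into a single nested-choice entry.

    Hypothetical helper: the key layout is inferred from the trial engine,
    not copied from AutoCompressionSearchSpaceGenerator.
    """
    choice = {'_name': name}
    for config in config_list:
        config = dict(config)  # copy so the caller's dict is not mutated
        op_types = ':'.join(config.pop('op_types', []))
        op_names = ':'.join(config.pop('op_names', []))
        for var_name, var_space in config.items():
            key = 'config_list::{}::{}::{}'.format(op_types, op_names, var_name)
            choice[key] = var_space
    return choice


level = compact_pruner_choice('level', [{
    'sparsity': {'_type': 'uniform', '_value': [0.01, 0.99]},
    'op_types': ['default'],
}])
l1 = compact_pruner_choice('l1', [{
    'sparsity': {'_type': 'uniform', '_value': [0.01, 0.99]},
    'op_types': ['Conv2d'],
}])

# One top-level choice picks the algorithm; each option carries its own sub-space.
search_space = {'algorithm_name': {'_type': 'choice', '_value': [level, l1]}}
print(json.dumps(search_space, indent=4))
```

The point of the nesting is that the tuner first picks one algorithm and then samples only that algorithm's parameters, so each trial receives a single `_name` plus the keys that belong to it.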
--- a/examples/model_compress/auto_compress/torch/mnist_pretrain_lenet.pth
+++ b/examples/model_compress/auto_compress/torch/mnist_pretrain_lenet.pth
--- a/nni/algorithms/compression/pytorch/auto_compress/__init__.py
+++ b/nni/algorithms/compression/pytorch/auto_compress/__init__.py
+# Copyright (c) Microsoft Corporation.
+# Licensed under the MIT license.
+from .experiment import AutoCompressionExperiment
+from .interface import AbstractAutoCompressionModule
+from .utils import AutoCompressionSearchSpaceGenerator
--- a/nni/algorithms/compression/pytorch/auto_compress/auto_compress_engine.py
+++ b/nni/algorithms/compression/pytorch/auto_compress/auto_compress_engine.py
+# Copyright (c) Microsoft Corporation.
+# Licensed under the MIT license.
+import logging
+from typing import Optional, Callable
+import json_tricks
+from torch.nn import Module
+from torch.optim import Optimizer
+import nni
+from .constants import PRUNER_DICT, QUANTIZER_DICT
+from .interface import BaseAutoCompressionEngine, AbstractAutoCompressionModule
+from .utils import import_
+_logger = logging.getLogger(__name__)
+_logger.setLevel(logging.INFO)
+class AutoCompressionEngine(BaseAutoCompressionEngine):
+    @classmethod
+    def __convert_compact_pruner_params_to_config_list(cls, compact_config: dict) -> list:
+        config_dict = {}
+        for key, value in compact_config.items():
+            _, op_types, op_names, var_name = key.split('::')
+            config_dict.setdefault((op_types, op_names), {})
+            config_dict[(op_types, op_names)][var_name] = value
+        config_list = []
+        for key, config in config_dict.items():
+            op_types, op_names = key
+            op_types = op_types.split(':') if op_types else []
+            op_names = op_names.split(':') if op_names else []
+            if op_types:
+                config['op_types'] = op_types
+            if op_names:
+                config['op_names'] = op_names
+            if 'op_types' in config or 'op_names' in config:
+                config_list.append(config)
+        return config_list
+    @classmethod
+    def __convert_compact_quantizer_params_to_config_list(cls, compact_config: dict) -> list:
+        config_dict = {}
+        for key, value in compact_config.items():
+            _, quant_types, op_types, op_names, var_name = key.split('::')
+            config_dict.setdefault((quant_types, op_types, op_names), {})
+            config_dict[(quant_types, op_types, op_names)][var_name] = value
+        config_list = []
+        for key, config in config_dict.items():
+            quant_types, op_types, op_names = key
+            quant_types = quant_types.split(':') if quant_types else []
+            op_types = op_types.split(':') if op_types else []
+            op_names = op_names.split(':') if op_names else []
+            if quant_types:
+                config['quant_types'] = quant_types
+            else:
+                continue
+            if op_types:
+                config['op_types'] = op_types
+            if op_names:
+                config['op_names'] = op_names
+            if 'op_types' in config or 'op_names' in config:
+                config_list.append(config)
+        return config_list
+    @classmethod
+    def _convert_compact_params_to_config_list(cls, compressor_type: str, compact_config: dict) -> list:
+        func_dict = {
+            'pruner': cls.__convert_compact_pruner_params_to_config_list,
+            'quantizer': cls.__convert_compact_quantizer_params_to_config_list
+        }
+        return func_dict[compressor_type](compact_config)
+    @classmethod
+    def __compress_pruning(cls, algorithm_name: str,
+                           model: Module,
+                           config_list: list,
+                           optimizer_factory: Optional[Callable],
+                           criterion: Optional[Callable],
+                           sparsifying_trainer: Optional[Callable[[Module, Optimizer, Callable, int], None]],
+                           finetuning_trainer: Optional[Callable[[Module, Optimizer, Callable, int], None]],
+                           finetuning_epochs: int,
+                           **compressor_parameter_dict) -> Module:
+        if algorithm_name in ['level', 'l1', 'l2', 'fpgm']:
+            pruner = PRUNER_DICT[algorithm_name](model, config_list, **compressor_parameter_dict)
+        elif algorithm_name in ['slim', 'taylorfo', 'apoz', 'mean_activation']:
+            optimizer = None if optimizer_factory is None else optimizer_factory(model.parameters())
+            pruner = PRUNER_DICT[algorithm_name](model, config_list, optimizer, sparsifying_trainer, criterion, **compressor_parameter_dict)
+        else:
+            raise ValueError('Unsupported compression algorithm: {}.'.format(algorithm_name))
+        compressed_model = pruner.compress()
+        if finetuning_trainer is not None:
+            # note that in pruning process, finetuning will use an un-patched optimizer
+            optimizer = optimizer_factory(compressed_model.parameters())
+            for i in range(finetuning_epochs):
+                finetuning_trainer(compressed_model, optimizer, criterion, i)
+        pruner.get_pruned_weights()
+        return compressed_model
+    @classmethod
+    def __compress_quantization(cls, algorithm_name: str,
+                                model: Module,
+                                config_list: list,
+                                optimizer_factory: Optional[Callable],
+                                criterion: Optional[Callable],
+                                sparsifying_trainer: Optional[Callable[[Module, Optimizer, Callable, int], None]],
+                                finetuning_trainer: Optional[Callable[[Module, Optimizer, Callable, int], None]],
+                                finetuning_epochs: int,
+                                **compressor_parameter_dict) -> Module:
+        optimizer = None if optimizer_factory is None else optimizer_factory(model.parameters())
+        quantizer = QUANTIZER_DICT[algorithm_name](model, config_list, optimizer, **compressor_parameter_dict)
+        compressed_model = quantizer.compress()
+        if finetuning_trainer is not None:
+            # note that in quantization process, finetuning will use a patched optimizer
+            for i in range(finetuning_epochs):
+                finetuning_trainer(compressed_model, optimizer, criterion, i)
+        return compressed_model
+    @classmethod
+    def _compress(cls, compressor_type: str,
+                  algorithm_name: str,
+                  model: Module,
+                  config_list: list,
+                  optimizer_factory: Optional[Callable],
+                  criterion: Optional[Callable],
+                  sparsifying_trainer: Optional[Callable[[Module, Optimizer, Callable, int], None]],
+                  finetuning_trainer: Optional[Callable[[Module, Optimizer, Callable, int], None]],
+                  finetuning_epochs: int,
+                  **compressor_parameter_dict) -> Module:
+        func_dict = {
+            'pruner': cls.__compress_pruning,
+            'quantizer': cls.__compress_quantization
+        }
+        _logger.info('%s compressor config_list:\n%s', algorithm_name, json_tricks.dumps(config_list, indent=4))
+        compressed_model = func_dict[compressor_type](algorithm_name, model, config_list, optimizer_factory, criterion, sparsifying_trainer,
+                                                      finetuning_trainer, finetuning_epochs, **compressor_parameter_dict)
+        return compressed_model
+    @classmethod
+    def trial_execute_compress(cls, module_name):
+        auto_compress_module: AbstractAutoCompressionModule = import_(module_name)
+        algorithm_config = nni.get_next_parameter()['algorithm_name']
+        algorithm_name = algorithm_config['_name']
+        compact_config = {k: v for k, v in algorithm_config.items() if k.startswith('config_list::')}
+        parameter_dict = {k.split('parameter::')[1]: v for k, v in algorithm_config.items() if k.startswith('parameter::')}
+        compressor_type = 'quantizer' if algorithm_name in QUANTIZER_DICT else 'pruner'
+        config_list = cls._convert_compact_params_to_config_list(compressor_type, compact_config)
+        model, evaluator = auto_compress_module.model(), auto_compress_module.evaluator()
+        optimizer_factory, criterion = auto_compress_module.optimizer_factory(), auto_compress_module.criterion()
+        sparsifying_trainer = auto_compress_module.sparsifying_trainer(algorithm_name)
+        finetuning_trainer = auto_compress_module.post_compress_finetuning_trainer(algorithm_name)
+        finetuning_epochs = auto_compress_module.post_compress_finetuning_epochs(algorithm_name)
+        compressed_model = cls._compress(compressor_type, algorithm_name, model, config_list, optimizer_factory,
+                                         criterion, sparsifying_trainer, finetuning_trainer, finetuning_epochs, **parameter_dict)
+        nni.report_final_result(evaluator(compressed_model))
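The parameter handling in `trial_execute_compress` and the pruner conversion can be tried without NNI or PyTorch. The following is a standalone sketch that mirrors that logic on a made-up sample. The function names `split_trial_parameters` and `pruner_config_list`, the sampled values, and the `some_kwarg` parameter are all hypothetical and only serve the illustration.

```python
def split_trial_parameters(algorithm_config):
    """Separate the algorithm name, compact config entries and extra kwargs."""
    name = algorithm_config['_name']
    compact = {k: v for k, v in algorithm_config.items()
               if k.startswith('config_list::')}
    extra = {k.split('parameter::')[1]: v for k, v in algorithm_config.items()
             if k.startswith('parameter::')}
    return name, compact, extra


def pruner_config_list(compact_config):
    """Rebuild a pruner config_list from 'config_list::types::names::var' keys."""
    grouped = {}
    for key, value in compact_config.items():
        _, op_types, op_names, var_name = key.split('::')
        grouped.setdefault((op_types, op_names), {})[var_name] = value
    config_list = []
    for (op_types, op_names), config in grouped.items():
        if op_types:
            config['op_types'] = op_types.split(':')
        if op_names:
            config['op_names'] = op_names.split(':')
        # Entries that target no op type and no op name are dropped.
        if 'op_types' in config or 'op_names' in config:
            config_list.append(config)
    return config_list


# Hypothetical value of nni.get_next_parameter()['algorithm_name'] for one trial.
sampled = {
    '_name': 'l1',
    'config_list::Conv2d::::sparsity': 0.5,
    'parameter::some_kwarg': 3,  # hypothetical extra keyword argument
}
name, compact, extra = split_trial_parameters(sampled)
config_list = pruner_config_list(compact)
print(name, config_list, extra)
```

Entries that share the same `op_types` and `op_names` are merged into one config dict, which is why the conversion groups by that pair before building the list.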
--- a/nni/algorithms/compression/pytorch/auto_compress/constants.py
+++ b/nni/algorithms/compression/pytorch/auto_compress/constants.py
+# Copyright (c) Microsoft Corporation.
+# Licensed under the MIT license.
+from ..pruning import LevelPruner, SlimPruner, L1FilterPruner, L2FilterPruner, FPGMPruner, TaylorFOWeightFilterPruner, \
+    ActivationAPoZRankFilterPruner, ActivationMeanRankFilterPruner
+from ..quantization.quantizers import NaiveQuantizer, QAT_Quantizer, DoReFaQuantizer, BNNQuantizer
+PRUNER_DICT = {
+    'level': LevelPruner,
+    'slim': SlimPruner,
+    'l1': L1FilterPruner,
+    'l2': L2FilterPruner,
+    'fpgm': FPGMPruner,
+    'taylorfo': TaylorFOWeightFilterPruner,
+    'apoz': ActivationAPoZRankFilterPruner,
+    'mean_activation': ActivationMeanRankFilterPruner
+}
+QUANTIZER_DICT = {
+    'naive': NaiveQuantizer,
+    'qat': QAT_Quantizer,
+    'dorefa': DoReFaQuantizer,
+    'bnn': BNNQuantizer
+}
--- a/nni/algorithms/compression/pytorch/auto_compress/experiment.py
+++ b/nni/algorithms/compression/pytorch/auto_compress/experiment.py
+# Copyright (c) Microsoft Corporation.
+# Licensed under the MIT license.
+import inspect
+from pathlib import Path, PurePath
+from typing import overload, Union, List
+from nni.experiment import Experiment, ExperimentConfig
+from nni.algorithms.compression.pytorch.auto_compress.interface import AbstractAutoCompressionModule
+class AutoCompressionExperiment(Experiment):
+    @overload
+    def __init__(self, auto_compress_module: AbstractAutoCompressionModule, config: ExperimentConfig) -> None:
+        """
+        Prepare an experiment.
+        Use `Experiment.run()` to launch it.
+        Parameters
+        ----------
+        auto_compress_module
+            The user-provided module that implements the `AbstractAutoCompressionModule` interface.
+            Remember to put the module file under `trial_code_directory`.
+        config
+            Experiment configuration.
+        """
+        ...
+    @overload
+    def __init__(self, auto_compress_module: AbstractAutoCompressionModule, training_service: Union[str, List[str]]) -> None:
+        """
+        Prepare an experiment, leaving configuration fields to be set later.
+        Example usage::
+            experiment = AutoCompressionExperiment(auto_compress_module, 'remote')
+            experiment.config.machines.append(RemoteMachineConfig(ip=..., user_name=...))
+            ...
+            experiment.run(8080)
+        Parameters
+        ----------
+        auto_compress_module
+            The user-provided module that implements the `AbstractAutoCompressionModule` interface.
+            Remember to put the module file under `trial_code_directory`.
+        training_service
+            Name of training service.
+            Supported value: "local", "remote", "openpai", "aml", "kubeflow", "frameworkcontroller", "adl" and hybrid training service.
+        """
+        ...
+    def __init__(self, auto_compress_module: AbstractAutoCompressionModule, config=None, training_service=None):
+        super().__init__(config, training_service)
+        self.module_file_path = str(PurePath(inspect.getfile(auto_compress_module)))
+        self.module_name = auto_compress_module.__name__
+    def start(self, port: int, debug: bool) -> None:
+        trial_code_directory = str(PurePath(Path(self.config.trial_code_directory).absolute())) + '/'
+        assert self.module_file_path.startswith(trial_code_directory), 'The file path of the user-provided module should be under trial_code_directory.'
+        relative_module_path = self.module_file_path.split(trial_code_directory)[1]
+        # Only Linux is supported for now; this may need refactoring.
+        command = 'python3 -m nni.algorithms.compression.pytorch.auto_compress.trial_entry --module_file_name {} --module_class_name {}'
+        self.config.trial_command = command.format(relative_module_path, self.module_name)
+        super().start(port=port, debug=debug)
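The path handling in `start()` can be illustrated in isolation. The sketch below builds the same trial command but swaps the `startswith`/`split` string operations for `PurePosixPath.relative_to`, which is a different technique from the one in the commit. The function name `build_trial_command` and the example paths are hypothetical.

```python
from pathlib import PurePosixPath

COMMAND = ('python3 -m nni.algorithms.compression.pytorch.auto_compress.trial_entry '
           '--module_file_name {} --module_class_name {}')


def build_trial_command(trial_code_directory, module_file_path, class_name):
    """Return the trial command for a module located under the code directory."""
    code_dir = PurePosixPath(trial_code_directory)
    module_file = PurePosixPath(module_file_path)
    # relative_to raises ValueError when the module is outside the code directory.
    relative = module_file.relative_to(code_dir)
    return COMMAND.format(relative, class_name)


# Hypothetical paths, for illustration only.
cmd = build_trial_command('/home/user/exp',
                          '/home/user/exp/auto_compress_module.py',
                          'AutoCompressionModule')
print(cmd)

try:
    build_trial_command('/home/user/exp', '/tmp/other_module.py', 'AutoCompressionModule')
    outside_rejected = False
except ValueError:
    outside_rejected = True
print(outside_rejected)
```

Either approach enforces the same constraint: the trial process runs from `trial_code_directory`, so the module must be reachable by a path relative to it.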
--- a/nni/algorithms/compression/pytorch/auto_compress/interface.py
+++ b/nni/algorithms/compression/pytorch/auto_compress/interface.py
+# Copyright (c) Microsoft Corporation.
+# Licensed under the MIT license.
+from abc import ABC, abstractmethod
+from typing import Optional, Callable, Iterable
+from torch.nn import Module
+from torch.optim import Optimizer
+class BaseAutoCompressionEngine(ABC):
+    @classmethod
+    @abstractmethod
+    def trial_execute_compress(cls):
+        """
+        Execute the compressing trial.
+        """
+        pass
+class AbstractAutoCompressionModule(ABC):
+    """
+    The abstract container that user need to implement.
+    """
+    @classmethod
+    @abstractmethod
+    def model(cls) -> Module:
+        """
+        Returns
+        -------
+        torch.nn.Module
+            Model to be compressed.
+        """
+        pass
+    @classmethod
+    @abstractmethod
+    def evaluator(cls) -> Callable[[Module], float]:
+        """
+        Returns
+        -------
+        function
+            The function used to evaluate the compressed model; it returns a scalar.
+        """
+        pass
+    @classmethod
+    @abstractmethod
+    def optimizer_factory(cls) -> Optional[Callable[[Iterable], Optimizer]]:
+        """
+        Returns
+        -------
+        Optional[Callable[[Iterable], Optimizer]]
+            Optimizer factory function. Input is an iterable, e.g. `model.parameters()`.
+            Output is a `torch.optim.Optimizer` instance.
+        """
+        pass
+    @classmethod
+    @abstractmethod
+    def criterion(cls) -> Optional[Callable]:
+        """
+        Returns
+        -------
+        Optional[Callable]
+            The criterion function used to train the model.
+        """
+        pass
+    @classmethod
+    @abstractmethod
+    def sparsifying_trainer(cls, compress_algorithm_name: str) -> Optional[Callable[[Module, Optimizer, Callable, int], None]]:
+        """
+        The trainer used in the sparsifying process.
+        Parameters
+        ----------
+        compress_algorithm_name: str
+            The name of the pruner or quantizer, e.g. 'level', 'l1', 'qat'.
+        Returns
+        -------
+        Optional[Callable[[Module, Optimizer, Callable, int], None]]
+            The function used to train the model in the compression stage; it takes `model, optimizer, criterion, current_epoch` as arguments.
+        """
+        pass
+    @classmethod
+    @abstractmethod
+    def post_compress_finetuning_trainer(cls, compress_algorithm_name: str) -> Optional[Callable[[Module, Optimizer, Callable, int], None]]:
+        """
+        The trainer used in the post-compression finetuning process.
+        Parameters
+        ----------
+        compress_algorithm_name: str
+            The name of the pruner or quantizer, e.g. 'level', 'l1', 'qat'.
+        Returns
+        -------
+        Optional[Callable[[Module, Optimizer, Callable, int], None]]
+            The function used to train the model in the finetuning stage; it takes `model, optimizer, criterion, current_epoch` as arguments.
+        """
+        pass
+    @classmethod
+    @abstractmethod
+    def post_compress_finetuning_epochs(cls, compress_algorithm_name: str) -> int:
+        """
+        The number of epochs in the post-compression finetuning process.
+        Parameters
+        ----------
+        compress_algorithm_name: str
+            The name of the pruner or quantizer, e.g. 'level', 'l1', 'qat'.
+        Returns
+        -------
+        int
+            The finetuning epoch number.
+        """
+        pass
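Every method of the interface is both a `classmethod` and an `abstractmethod`, and the trial engine calls them on the class rather than on an instance. A small stand-in (the `ToyModule` and `OnlyModel` classes below are hypothetical, not part of NNI) suggests why a subclass that overrides only some methods can still be called this way: the abstract marker blocks instantiation, not class-level calls, so methods left unimplemented fall through to the base body and return `None`.

```python
from abc import ABC, abstractmethod


class ToyModule(ABC):
    """Hypothetical stand-in for an interface made of abstract classmethods."""

    @classmethod
    @abstractmethod
    def model(cls):
        pass

    @classmethod
    @abstractmethod
    def optimizer_factory(cls):
        pass


class OnlyModel(ToyModule):
    """Overrides model() only; optimizer_factory() stays abstract."""

    @classmethod
    def model(cls):
        return 'pretrained-model'  # placeholder for a real nn.Module


# Class-level calls work even though one abstract method is not overridden.
print(OnlyModel.model())
print(OnlyModel.optimizer_factory())

# Instantiation is what the abstract marker actually prevents.
try:
    OnlyModel()
    instantiated = True
except TypeError:
    instantiated = False
print(instantiated)
```

In practice, a method left unimplemented yields `None`, so it should only be skipped when the chosen algorithms do not need it.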
--- a/nni/algorithms/compression/pytorch/auto_compress/trial_entry.py
+++ b/nni/algorithms/compression/pytorch/auto_compress/trial_entry.py
+# Copyright (c) Microsoft Corporation.
+# Licensed under the MIT license.
+"""
+Entrypoint for trials.
+"""
+import argparse
+from pathlib import Path
+import re
+from .auto_compress_engine import AutoCompressionEngine
+if __name__ == '__main__':
+    parser = argparse.ArgumentParser(description='trial entry for auto compression.')
+    parser.add_argument('--module_file_name', required=True, dest='module_file_name', help='the path of auto compression module file')
+    parser.add_argument('--module_class_name', required=True, dest='module_class_name', help='the name of auto compression module')
+    args = parser.parse_args()
+    module_name = Path(args.module_file_name).as_posix()
+    module_name = re.sub(re.escape('.py') + '$', '', module_name).replace('/', '.') + '.' + args.module_class_name
+    AutoCompressionEngine.trial_execute_compress(module_name)
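The entrypoint above turns a file path plus a class name into a dotted import path. The sketch below isolates that string transformation so its behaviour can be checked on its own; `to_dotted_name` is a hypothetical helper name, and the example path and class name are made up.

```python
import re
from pathlib import Path


def to_dotted_name(module_file_name: str, module_class_name: str) -> str:
    """Mirror the path-to-module-name conversion done in the trial entrypoint."""
    # Normalise separators to '/', e.g. for Windows-style paths on Windows.
    posix_path = Path(module_file_name).as_posix()
    # Strip a trailing '.py', then turn path separators into dots.
    module_path = re.sub(re.escape('.py') + '$', '', posix_path).replace('/', '.')
    return module_path + '.' + module_class_name


relative = to_dotted_name('examples/auto_compress/module.py', 'AutoCompressModule')
absolute = to_dotted_name('/tmp/module.py', 'AutoCompressModule')
```

With a relative path this yields `examples.auto_compress.module.AutoCompressModule`. An absolute path yields a name with a leading dot (`.tmp.module.AutoCompressModule` here), which is not an importable absolute module name, so the conversion appears to assume a path relative to the working directory.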
--- a/nni/algorithms/compression/pytorch/auto_compress/utils.py
+++ b/nni/algorithms/compression/pytorch/auto_compress/utils.py
+# Copyright (c) Microsoft Corporation.
+# Licensed under the MIT license.
+from typing import Any
+from .constants import PRUNER_DICT, QUANTIZER_DICT
+class AutoCompressionSearchSpaceGenerator:
+    """
+    Helper for conveniently generating a search space that can be used by a tuner.
+    """
+    def __init__(self):
+        self.algorithm_choice_list = []
+    def add_config(self, algorithm_name: str, config_list: list, **algo_kwargs):
+        """
+        Determine whether the algorithm is a pruner or a quantizer,
+        then call `self._add_pruner_config()` or `self._add_quantizer_config()` accordingly.
+        """
+        if algorithm_name in PRUNER_DICT:
+            self._add_pruner_config(algorithm_name, config_list, **algo_kwargs)
+        if algorithm_name in QUANTIZER_DICT:
+            self._add_quantizer_config(algorithm_name, config_list, **algo_kwargs)
+    def _add_pruner_config(self, pruner_name: str, config_list: list, **algo_kwargs):
+        """
+        Parameters
+        ----------
+        pruner_name
+            Supported pruner names: 'level', 'slim', 'l1', 'l2', 'fpgm', 'taylorfo', 'apoz', 'mean_activation'.
+        config_list
+            Except for 'op_types' and 'op_names', any config value can be written as `{'_type': ..., '_value': ...}`.
+        **algo_kwargs
+            The additional pruner parameters, excluding 'model', 'config_list', 'optimizer', 'trainer', 'criterion'.
+            E.g., you can set `statistics_batch_num={'_type': 'choice', '_value': [1, 2, 3]}` in TaylorFOWeightFilterPruner or just `statistics_batch_num=1`.
+        """
+        sub_search_space = {'_name': pruner_name}
+        for config in config_list:
+            op_types = config.pop('op_types', [])
+            op_names = config.pop('op_names', [])
+            key_prefix = 'config_list::{}::{}'.format(':'.join(op_types), ':'.join(op_names))
+            for var_name, var_search_space in config.items():
+                sub_search_space['{}::{}'.format(key_prefix, var_name)] = self._wrap_single_value(var_search_space)
+        for parameter_name, parameter_search_space in algo_kwargs.items():
+            key_prefix = 'parameter'
+            sub_search_space['{}::{}'.format(key_prefix, parameter_name)] = self._wrap_single_value(parameter_search_space)
+        self.algorithm_choice_list.append(sub_search_space)
+    def _add_quantizer_config(self, quantizer_name: str, config_list: list, **algo_kwargs):
+        """
+        Parameters
+        ----------
+        quantizer_name
+            Supported quantizer names: 'naive', 'qat', 'dorefa', 'bnn'.
+        config_list
+            Except for 'quant_types', 'op_types' and 'op_names', any config value can be written as `{'_type': ..., '_value': ...}`.
+        **algo_kwargs
+            The additional quantizer parameters, excluding 'model', 'config_list', 'optimizer'.
+        """
+        sub_search_space = {'_name': quantizer_name}
+        for config in config_list:
+            quant_types = config.pop('quant_types', [])
+            op_types = config.pop('op_types', [])
+            op_names = config.pop('op_names', [])
+            key_prefix = 'config_list::{}::{}::{}'.format(':'.join(quant_types), ':'.join(op_types), ':'.join(op_names))
+            for var_name, var_search_space in config.items():
+                sub_search_space['{}::{}'.format(key_prefix, var_name)] = self._wrap_single_value(var_search_space)
+        for parameter_name, parameter_search_space in algo_kwargs.items():
+            key_prefix = 'parameter'
+            sub_search_space['{}::{}'.format(key_prefix, parameter_name)] = self._wrap_single_value(parameter_search_space)
+        self.algorithm_choice_list.append(sub_search_space)
+    def dumps(self) -> dict:
+        """
+        Dump the search space as a dict.
+        """
+        search_space = {
+            'algorithm_name': {
+                '_type': 'choice',
+                '_value': self.algorithm_choice_list
+            }
+        }
+        return search_space
+    @classmethod
+    def loads(cls, search_space: dict):
+        """
+        Return an AutoCompressionSearchSpaceGenerator instance loaded from a search space dict.
+        """
+        generator = cls()
+        generator.algorithm_choice_list = search_space['algorithm_name']['_value']
+        return generator
+    def _wrap_single_value(self, value) -> dict:
+        if not isinstance(value, dict):
+            converted_value = {
+                '_type': 'choice',
+                '_value': [value]
+            }
+        elif '_type' not in value:
+            converted_value = {}
+            for k, v in value.items():
+                converted_value[k] = self._wrap_single_value(v)
+        else:
+            converted_value = value
+        return converted_value
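To make the key layout concrete, here is a standalone sketch of how one pruner config could be flattened into a sub search space. It re-implements the wrapping and key-building logic as free functions so it runs without NNI installed; `wrap_single_value` and `pruner_sub_search_space` are hypothetical names. Unlike the method in the diff, it copies each config before popping keys, so the caller's dicts are left unmodified.

```python
def wrap_single_value(value):
    """Wrap plain values as a one-element 'choice'; leave existing search spaces as-is."""
    if not isinstance(value, dict):
        return {'_type': 'choice', '_value': [value]}
    if '_type' not in value:
        # A nested dict without '_type': wrap each entry recursively.
        return {k: wrap_single_value(v) for k, v in value.items()}
    return value


def pruner_sub_search_space(pruner_name, config_list, **algo_kwargs):
    sub_search_space = {'_name': pruner_name}
    for config in config_list:
        config = dict(config)  # shallow copy: an assumption of this sketch, not in the diff
        op_types = config.pop('op_types', [])
        op_names = config.pop('op_names', [])
        key_prefix = 'config_list::{}::{}'.format(':'.join(op_types), ':'.join(op_names))
        for var_name, var_search_space in config.items():
            key = '{}::{}'.format(key_prefix, var_name)
            sub_search_space[key] = wrap_single_value(var_search_space)
    for parameter_name, parameter_search_space in algo_kwargs.items():
        key = 'parameter::{}'.format(parameter_name)
        sub_search_space[key] = wrap_single_value(parameter_search_space)
    return sub_search_space


user_config = {'sparsity': {'_type': 'uniform', '_value': [0.5, 0.9]}, 'op_types': ['Conv2d']}
space = pruner_sub_search_space('taylorfo', [user_config], statistics_batch_num=1)
```

Here `space` has three keys: `'_name'`, `'config_list::Conv2d::::sparsity'` (the empty `op_names` segment produces the doubled separator), and `'parameter::statistics_batch_num'`, whose plain value `1` has been wrapped as `{'_type': 'choice', '_value': [1]}`.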
+def import_(target: str, allow_none: bool = False) -> Any:
+    if target is None:
+        return None
+    path, identifier = target.rsplit('.', 1)
+    module = __import__(path, globals(), locals(), [identifier])
+    return getattr(module, identifier)
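The `import_` helper resolves a dotted string such as `package.module.Name` to the object it names. A roughly equivalent sketch using `importlib.import_module` from the standard library is shown below; `import_by_name` is a hypothetical name, and a standard-library target is used so the example runs anywhere.

```python
import importlib
import os.path
from typing import Any


def import_by_name(target: str) -> Any:
    """Resolve 'package.module.attribute' to the attribute object."""
    # Split off the last component: everything before it is the module path.
    path, identifier = target.rsplit('.', 1)
    module = importlib.import_module(path)
    return getattr(module, identifier)


join = import_by_name('os.path.join')
```

As in the original, a target without any dot cannot be split into two parts, so the unpacking raises `ValueError`; callers are expected to pass a fully qualified name.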