Commit 9d468d2c authored by Tang Lang, committed by liuzhe-lz

fix pruner bugs and add model compression README (#1624)

* fix builtin pruners bug

* use type_as

* fix pruner bugs and add model compression README

* fix example bugs

* add AutoCompression.md and remove sensitive pruner

* fix tf pruner bugs

* update overview

* Pruner.md
parent 8f778aa5
# Automatic Model Compression on NNI
TBD. It is convenient to implement automatic model compression with NNI compression and NNI tuners.
## First, model compression with NNI
You can easily compress a model with NNI compression. Take pruning for example: you can prune a pretrained model with LevelPruner like this:
```python
from nni.compression.torch import LevelPruner
config_list = [{ 'sparsity': 0.8, 'op_types': 'default' }]
pruner = LevelPruner(config_list)
pruner(model)
```
```{ 'sparsity': 0.8, 'op_types': 'default' }``` means that **all layers with weights will be compressed with the same 0.8 sparsity**. When ```pruner(model)``` is called, the model is compressed with masks, and after that you can fine-tune it normally; **pruned weights, which have been masked, won't be updated**.
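To make the masking semantics concrete, here is a plain-Python sketch (no NNI required; `level_mask` is our own name, not an NNI API) of the computation LevelPruner performs: keep the largest-magnitude fraction of the weights and zero out the rest.

```python
# Conceptual sketch of level pruning: a 0/1 mask that keeps only the
# (1 - sparsity) fraction of weights with the largest absolute values.
def level_mask(weights, sparsity):
    """Return a 0/1 mask over a flat list of weights."""
    k = int(len(weights) * sparsity)  # number of weights to prune
    if k == 0:
        return [1.0] * len(weights)
    # threshold is the k-th smallest absolute value; prune everything <= it
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [1.0 if abs(w) > threshold else 0.0 for w in weights]

weights = [0.05, -0.9, 0.3, -0.01, 0.7]
mask = level_mask(weights, sparsity=0.8)
# with sparsity 0.8 only the single largest-magnitude weight survives
```

Multiplying the weights by this mask is exactly what "pruned weights won't be updated" refers to: the masked entries stay at zero through fine-tuning.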
## Then, make this automatic
The previous example manually chose LevelPruner and pruned all layers with the same sparsity. This is obviously sub-optimal because different layers may have different amounts of redundancy. Layer sparsity should be carefully tuned to minimize performance degradation, and this can be done with NNI tuners.
The first thing we need to do is design a search space. Here we use a nested search space which contains both the choice of pruning algorithm and the sparsity to optimize for each layer.
```json
{
    "prune_method": {
        "_type": "choice",
        "_value": [
            {
                "_name": "agp",
                "conv0_sparsity": {
                    "_type": "uniform",
                    "_value": [0.1, 0.9]
                },
                "conv1_sparsity": {
                    "_type": "uniform",
                    "_value": [0.1, 0.9]
                }
            },
            {
                "_name": "level",
                "conv0_sparsity": {
                    "_type": "uniform",
                    "_value": [0.1, 0.9]
                },
                "conv1_sparsity": {
                    "_type": "uniform",
                    "_value": [0.01, 0.9]
                }
            }
        ]
    }
}
```
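For reference, here is a hypothetical parameter sample that a tuner might draw from this nested search space, and how trial code indexes it. The `_name` key identifies which branch of the `choice` was taken; the concrete values (0.42, 0.77) are made up for illustration.

```python
# A hypothetical dict of the shape that nni.get_parameters() returns for
# the nested search space above (values are illustrative only).
params = {
    "prune_method": {
        "_name": "agp",            # which pruning algorithm was chosen
        "conv0_sparsity": 0.42,    # sampled from uniform [0.1, 0.9]
        "conv1_sparsity": 0.77,
    }
}
method = params["prune_method"]["_name"]
conv0_sparsity = params["prune_method"]["conv0_sparsity"]
```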
Then we need to modify our code by a few lines:
```python
import nni
from nni.compression.torch import *
params = nni.get_parameters()
conv0_sparsity = params['prune_method']['conv0_sparsity']
conv1_sparsity = params['prune_method']['conv1_sparsity']
# these raw sparsities should be rescaled if you need a total sparsity constraint
config_list_level = [{'sparsity': conv0_sparsity, 'op_name': 'conv0'},
                     {'sparsity': conv1_sparsity, 'op_name': 'conv1'}]
config_list_agp = [{'initial_sparsity': 0, 'final_sparsity': conv0_sparsity,
                    'start_epoch': 0, 'end_epoch': 3,
                    'frequency': 1, 'op_name': 'conv0'},
                   {'initial_sparsity': 0, 'final_sparsity': conv1_sparsity,
                    'start_epoch': 0, 'end_epoch': 3,
                    'frequency': 1, 'op_name': 'conv1'}]
PRUNERS = {'level': LevelPruner(config_list_level), 'agp': AGP_Pruner(config_list_agp)}
pruner = PRUNERS[params['prune_method']['_name']]
pruner(model)
... # fine tuning
acc = evaluate(model) # evaluation
nni.report_final_result(acc)
```
Last, define our task and automatically tune the pruning method together with the layer sparsities:
```yaml
authorName: default
experimentName: Auto_Compression
trialConcurrency: 2
maxExecDuration: 100h
maxTrialNum: 500
#choice: local, remote, pai
trainingServicePlatform: local
#choice: true, false
useAnnotation: false
searchSpacePath: search_space.json
tuner:
  #choice: TPE, Random, Anneal...
  builtinTunerName: TPE
  classArgs:
    #choice: maximize, minimize
    optimize_mode: maximize
trial:
  command: bash run_prune.sh
  codeDir: .
  gpuNum: 1
```
@@ -2,13 +2,12 @@
NNI provides an easy-to-use toolkit to help user design and use compression algorithms. It supports Tensorflow and PyTorch with unified interface. For users to compress their models, they only need to add several lines in their code. There are some popular model compression algorithms built-in in NNI. Users could further use NNI's auto tuning power to find the best compressed model, which is detailed in [Auto Model Compression](./AutoCompression.md). On the other hand, users could easily customize their new compression algorithms using NNI's interface, refer to the tutorial [here](#customize-new-compression-algorithms).
## Supported algorithms
We have provided two naive compression algorithms and three popular ones for users, including two pruning algorithms and three quantization algorithms:
|Name|Brief Introduction of Algorithm|
|---|---|
| [Level Pruner](./Pruner.md#level-pruner) | Pruning the specified ratio on each weight based on absolute values of weights |
| [AGP Pruner](./Pruner.md#agp-pruner) | Automated gradual pruning (To prune, or not to prune: exploring the efficacy of pruning for model compression) [Reference Paper](https://arxiv.org/abs/1710.01878)|
| [Naive Quantizer](./Quantizer.md#naive-quantizer) | Quantize weights to default 8 bits |
| [QAT Quantizer](./Quantizer.md#qat-quantizer) | Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference. [Reference Paper](http://openaccess.thecvf.com/content_cvpr_2018/papers/Jacob_Quantization_and_Training_CVPR_2018_paper.pdf)|
| [DoReFa Quantizer](./Quantizer.md#dorefa-quantizer) | DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients. [Reference Paper](https://arxiv.org/abs/1606.06160)|
......
@@ -48,7 +48,7 @@ from nni.compression.tensorflow import AGP_Pruner
config_list = [{
    'initial_sparsity': 0,
    'final_sparsity': 0.8,
    'start_epoch': 0,
    'end_epoch': 10,
    'frequency': 1,
    'op_types': 'default'
@@ -62,7 +62,7 @@ from nni.compression.torch import AGP_Pruner
config_list = [{
    'initial_sparsity': 0,
    'final_sparsity': 0.8,
    'start_epoch': 0,
    'end_epoch': 10,
    'frequency': 1,
    'op_types': 'default'
@@ -86,47 +86,9 @@ You can view example for more information
#### User configuration for AGP Pruner
* **initial_sparsity:** This is to specify the sparsity when compressor starts to compress
* **final_sparsity:** This is to specify the sparsity when compressor finishes to compress
* **start_epoch:** This is to specify the epoch number when compressor starts to compress, default start from epoch 0
* **end_epoch:** This is to specify the epoch number when compressor finishes to compress
* **frequency:** This is to specify every *frequency* number epochs compressor compress once, default frequency=1
***
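The gradual sparsity schedule that AGP Pruner follows between `start_epoch` and `end_epoch` can be sketched in plain Python. This is a sketch of the formula from the reference paper using the configuration keys above, not NNI's own implementation; the function name is ours.

```python
# AGP schedule: sparsity ramps from initial_sparsity to final_sparsity
# along a cubic curve, so most pruning happens early in training.
def agp_target_sparsity(epoch, initial_sparsity=0.0, final_sparsity=0.8,
                        start_epoch=0, end_epoch=10, frequency=1):
    if epoch >= end_epoch:
        return final_sparsity
    # number of epochs over which the schedule ramps up
    span = ((end_epoch - start_epoch - 1) // frequency) * frequency
    base = (epoch - start_epoch) / span
    return final_sparsity + (initial_sparsity - final_sparsity) * (1.0 - base) ** 3

# e.g. at epoch 0 the target is initial_sparsity; by the end it is final_sparsity
```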
## Sensitivity Pruner
In [Learning both Weights and Connections for Efficient Neural Networks](https://arxiv.org/abs/1506.02626), author Song Han provides an algorithm to find the sensitivity of each layer and set a pruning threshold for each layer.
>We used the sensitivity results to find each layer’s threshold: for example, the smallest threshold was applied to the most sensitive layer, which is the first convolutional layer... The pruning threshold is chosen as a quality parameter multiplied by the standard deviation of a layer’s weights
### Usage
You can prune weights step by step and reach a target sparsity with Sensitivity Pruner using the code below.
Tensorflow code
```python
from nni.compression.tensorflow import SensitivityPruner
config_list = [{ 'sparsity':0.8, 'op_types': 'default' }]
pruner = SensitivityPruner(config_list)
pruner(tf.get_default_graph())
```
PyTorch code
```python
from nni.compression.torch import SensitivityPruner
config_list = [{ 'sparsity':0.8, 'op_types': 'default' }]
pruner = SensitivityPruner(config_list)
pruner(model)
```
Like AGP Pruner, you should update mask information every epoch by adding the code below.
Tensorflow code
```python
pruner.update_epoch(epoch, sess)
```
PyTorch code
```python
pruner.update_epoch(epoch)
```
You can view the example for more information
#### User configuration for Sensitivity Pruner
* **sparsity:** This is to specify the target sparsity of the operations to be compressed
***
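The sensitivity rule quoted above can be sketched in plain Python (no NNI or TensorFlow required; `sensitivity_mask` is our own name): the pruning threshold for a layer is the quality parameter (`sparsity` here) multiplied by the standard deviation of that layer's weights.

```python
import statistics

# Sketch of sensitivity-based pruning: the threshold scales with the
# spread of the layer's weights rather than with a fixed keep-ratio.
def sensitivity_mask(weights, sparsity):
    """Prune weights whose magnitude falls below sparsity * std(weights)."""
    threshold = sparsity * statistics.pstdev(weights)
    return [1.0 if abs(w) > threshold else 0.0 for w in weights]

weights = [0.05, -0.9, 0.3, -0.01, 0.7]
mask = sensitivity_mask(weights, 0.8)
```

Note that, unlike LevelPruner, the resulting sparsity is not fixed in advance: layers with tightly clustered weights get a lower threshold and keep more weights.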
# Run model compression examples
You can run these examples easily. Take torch pruning for example:
```bash
python main_torch_pruner.py
```
This example uses AGP Pruner. Initializing a pruner requires a user-provided configuration, which can be given in two ways:
- By reading ```configure_example.yaml```, which can keep your code clean when the configuration is complicated
- By configuring directly in your code
In our example, we simply configure model compression in code like this:
```python
configure_list = [{
    'initial_sparsity': 0,
    'final_sparsity': 0.8,
    'start_epoch': 0,
    'end_epoch': 10,
    'frequency': 1,
    'op_type': 'default'
}]
pruner = AGP_Pruner(configure_list)
```
When ```pruner(model)``` is called, your model is injected with masks as embedded operations. For example, if a layer takes a weight as input, we insert an operation between the weight and the layer; this operation takes the weight as input and outputs the weight with the mask applied. Thus, the masks are applied whenever the computation goes through the operations. You can fine-tune your model **without** any modifications.
```python
for epoch in range(10):
    # update_epoch is for pruner to be aware of epochs, so that it could adjust masks during training.
    pruner.update_epoch(epoch)
    print('# Epoch {} #'.format(epoch))
    train(model, device, train_loader, optimizer)
    test(model, device, test_loader)
```
When fine-tuning is finished, pruned weights are all masked, and you can get the masks like this:
```python
masks = pruner.mask_list
layer_name = xxx
mask = masks[layer_name]
```
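For instance, a small helper can report how sparse each returned mask actually is. This is a plain-Python sketch (`mask_sparsity` is our own name; real masks in `pruner.mask_list` are framework tensors, so you would call `.tolist()` or similar first):

```python
# Report the fraction of weights zeroed out by a 2-D 0/1 mask.
def mask_sparsity(mask):
    flat = [v for row in mask for v in row]  # flatten the mask
    return 1.0 - sum(flat) / len(flat)

# Toy stand-ins for masks keyed by layer name, as in pruner.mask_list.
masks = {'conv0': [[0.0, 1.0], [1.0, 1.0]],
         'conv1': [[0.0, 0.0], [1.0, 0.0]]}
for name, mask in masks.items():
    print(name, mask_sparsity(mask))
```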
AGPruner:
  config:
    -
      start_epoch: 0
      end_epoch: 10
      frequency: 1
      initial_sparsity: 0.05
......
@@ -4,23 +4,26 @@ from tensorflow.examples.tutorials.mnist import input_data
def weight_variable(shape):
    return tf.Variable(tf.truncated_normal(shape, stddev=0.1))

def bias_variable(shape):
    return tf.Variable(tf.constant(0.1, shape=shape))

def conv2d(x_input, w_matrix):
    return tf.nn.conv2d(x_input, w_matrix, strides=[1, 1, 1, 1], padding='SAME')

def max_pool(x_input, pool_size):
    size = [1, pool_size, pool_size, 1]
    return tf.nn.max_pool(x_input, ksize=size, strides=size, padding='SAME')

class Mnist:
    def __init__(self):
        images = tf.placeholder(tf.float32, [None, 784], name='input_x')
        labels = tf.placeholder(tf.float32, [None, 10], name='input_y')
        keep_prob = tf.placeholder(tf.float32, name='keep_prob')
        self.images = images
@@ -35,35 +38,35 @@ class Mnist:
        self.fcw1 = None
        self.cross = None
        with tf.name_scope('reshape'):
            x_image = tf.reshape(images, [-1, 28, 28, 1])
        with tf.name_scope('conv1'):
            w_conv1 = weight_variable([5, 5, 1, 32])
            self.w1 = w_conv1
            b_conv1 = bias_variable([32])
            self.b1 = b_conv1
            h_conv1 = tf.nn.relu(conv2d(x_image, w_conv1) + b_conv1)
        with tf.name_scope('pool1'):
            h_pool1 = max_pool(h_conv1, 2)
        with tf.name_scope('conv2'):
            w_conv2 = weight_variable([5, 5, 32, 64])
            b_conv2 = bias_variable([64])
            h_conv2 = tf.nn.relu(conv2d(h_pool1, w_conv2) + b_conv2)
        with tf.name_scope('pool2'):
            h_pool2 = max_pool(h_conv2, 2)
        with tf.name_scope('fc1'):
            w_fc1 = weight_variable([7 * 7 * 64, 1024])
            self.fcw1 = w_fc1
            b_fc1 = bias_variable([1024])
            h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
            h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, w_fc1) + b_fc1)
        with tf.name_scope('dropout'):
            h_fc1_drop = tf.nn.dropout(h_fc1, 0.5)
        with tf.name_scope('fc2'):
            w_fc2 = weight_variable([1024, 10])
            b_fc2 = bias_variable([10])
            y_conv = tf.matmul(h_fc1_drop, w_fc2) + b_fc2
        with tf.name_scope('loss'):
            cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=y_conv))
            self.cross = cross_entropy
        with tf.name_scope('adam_optimizer'):
            self.train_step = tf.train.AdamOptimizer(0.0001).minimize(cross_entropy)
@@ -75,17 +78,17 @@ class Mnist:
def main():
    tf.set_random_seed(0)
    data = input_data.read_data_sets('data', one_hot=True)
    model = Mnist()
    '''you can change this to LevelPruner to implement it
    pruner = LevelPruner(configure_list)
    '''
    configure_list = [{
        'initial_sparsity': 0,
        'final_sparsity': 0.8,
        'start_epoch': 0,
        'end_epoch': 10,
        'frequency': 1,
        'op_type': 'default'
@@ -100,27 +103,26 @@ def main():
    # you can also use compress(model) or compress_default_graph() for tensorflow compressor
    # pruner.compress(tf.get_default_graph())
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for batch_idx in range(2000):
            batch = data.train.next_batch(2000)
            model.train_step.run(feed_dict={
                model.images: batch[0],
                model.labels: batch[1],
                model.keep_prob: 0.5
            })
            if batch_idx % 10 == 0:
                test_acc = model.accuracy.eval(feed_dict={
                    model.images: data.test.images,
                    model.labels: data.test.labels,
                    model.keep_prob: 1.0
                })
                pruner.update_epoch(batch_idx / 10, sess)
                print('test accuracy', test_acc)
        test_acc = model.accuracy.eval(feed_dict={
            model.images: data.test.images,
            model.labels: data.test.labels,
            model.keep_prob: 1.0
......
@@ -20,7 +20,7 @@ class Mnist(torch.nn.Module):
        x = x.view(-1, 4 * 4 * 50)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)

def train(model, device, train_loader, optimizer):
@@ -35,6 +35,7 @@ def train(model, device, train_loader, optimizer):
        if batch_idx % 100 == 0:
            print('{:2.0f}% Loss {}'.format(100 * batch_idx / len(train_loader), loss.item()))

def test(model, device, test_loader):
    model.eval()
    test_loss = 0
@@ -43,35 +44,36 @@ def test(model, device, test_loader):
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += F.nll_loss(output, target, reduction='sum').item()
            pred = output.argmax(dim=1, keepdim=True)
            correct += pred.eq(target.view_as(pred)).sum().item()
    test_loss /= len(test_loader.dataset)
    print('Loss: {} Accuracy: {}%)\n'.format(
        test_loss, 100 * correct / len(test_loader.dataset)))

def main():
    torch.manual_seed(0)
    device = torch.device('cpu')
    trans = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])
    train_loader = torch.utils.data.DataLoader(
        datasets.MNIST('data', train=True, download=True, transform=trans),
        batch_size=64, shuffle=True)
    test_loader = torch.utils.data.DataLoader(
        datasets.MNIST('data', train=False, transform=trans),
        batch_size=1000, shuffle=True)
    model = Mnist()
    '''you can change this to LevelPruner to implement it
    pruner = LevelPruner(configure_list)
    '''
    configure_list = [{
        'initial_sparsity': 0,
        'final_sparsity': 0.8,
        'start_epoch': 0,
        'end_epoch': 10,
        'frequency': 1,
        'op_type': 'default'
@@ -82,14 +84,13 @@ def main():
    # you can also use compress(model) method
    # like that pruner.compress(model)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.5)
    for epoch in range(10):
        print('# Epoch {} #'.format(epoch))
        train(model, device, train_loader, optimizer)
        test(model, device, test_loader)
        pruner.update_epoch(epoch)

if __name__ == '__main__':
    main()
@@ -2,7 +2,7 @@ import logging
import tensorflow as tf
from .compressor import Pruner

__all__ = ['LevelPruner', 'AGP_Pruner']

_logger = logging.getLogger(__name__)
@@ -14,10 +14,18 @@ class LevelPruner(Pruner):
        - sparsity
        """
        super().__init__(config_list)
        self.mask_list = {}
        self.if_init_list = {}

    def calc_mask(self, weight, config, op_name, **kwargs):
        if self.if_init_list.get(op_name, True):
            threshold = tf.contrib.distributions.percentile(tf.abs(weight), config['sparsity'] * 100)
            mask = tf.cast(tf.math.greater(tf.abs(weight), threshold), weight.dtype)
            self.mask_list.update({op_name: mask})
            self.if_init_list.update({op_name: False})
        else:
            mask = self.mask_list[op_name]
        return mask

class AGP_Pruner(Pruner):
@@ -29,6 +37,7 @@ class AGP_Pruner(Pruner):
    Learning of Phones and other Consumer Devices,
    https://arxiv.org/pdf/1710.01878.pdf
    """

    def __init__(self, config_list):
        """
        config_list: supported keys:
@@ -39,15 +48,25 @@ class AGP_Pruner(Pruner):
        - frequency: if you want update every 2 epoch, you can set it 2
        """
        super().__init__(config_list)
        self.mask_list = {}
        self.if_init_list = {}
        self.now_epoch = tf.Variable(0)
        self.assign_handler = []

    def calc_mask(self, weight, config, op_name, **kwargs):
        start_epoch = config.get('start_epoch', 0)
        freq = config.get('frequency', 1)
        if self.now_epoch >= start_epoch and self.if_init_list.get(op_name, True) and (
                self.now_epoch - start_epoch) % freq == 0:
            target_sparsity = self.compute_target_sparsity(config)
            threshold = tf.contrib.distributions.percentile(weight, target_sparsity * 100)
            # stop gradient in case gradient change the mask
            mask = tf.stop_gradient(tf.cast(tf.math.greater(weight, threshold), weight.dtype))
            self.assign_handler.append(tf.assign(weight, weight * mask))
            self.mask_list.update({op_name: tf.constant(mask)})
            self.if_init_list.update({op_name: False})
        else:
            mask = self.mask_list[op_name]
        return mask

    def compute_target_sparsity(self, config):
@@ -62,49 +81,16 @@ class AGP_Pruner(Pruner):
            return final_sparsity
        now_epoch = tf.minimum(self.now_epoch, tf.constant(end_epoch))
        span = int(((end_epoch - start_epoch - 1) // freq) * freq)
        assert span > 0
        base = tf.cast(now_epoch - start_epoch, tf.float32) / span
        target_sparsity = (final_sparsity +
                           (initial_sparsity - final_sparsity) *
                           (tf.pow(1.0 - base, 3)))
        return target_sparsity

    def update_epoch(self, epoch, sess):
        sess.run(self.assign_handler)
        sess.run(tf.assign(self.now_epoch, int(epoch)))
        for k in self.if_init_list.keys():
            self.if_init_list[k] = True
class SensitivityPruner(Pruner):
    """Use algorithm from "Learning both Weights and Connections for Efficient Neural Networks"
    https://arxiv.org/pdf/1506.02626v3.pdf

    I.e.: "The pruning threshold is chosen as a quality parameter multiplied
    by the standard deviation of a layers weights."
    """
    def __init__(self, config_list):
        """
        config_list: supported keys
        - sparsity: chosen pruning sparsity
        """
        super().__init__(config_list)
        self.layer_mask = {}
        self.assign_handler = []

    def calc_mask(self, weight, config, op_name, **kwargs):
        target_sparsity = config['sparsity'] * tf.math.reduce_std(weight)
        mask = tf.get_variable(op_name + '_mask', initializer=tf.ones(weight.shape), trainable=False)
        self.layer_mask[op_name] = mask

        weight_assign_handler = tf.assign(weight, mask * weight)
        # use control_dependencies so that weight_assign_handler will be executed before mask_update_handler
        with tf.control_dependencies([weight_assign_handler]):
            threshold = tf.contrib.distributions.percentile(weight, target_sparsity * 100)
            # stop gradient in case gradient change the mask
            new_mask = tf.stop_gradient(tf.cast(tf.math.greater(weight, threshold), weight.dtype))
            mask_update_handler = tf.assign(mask, new_mask)
            self.assign_handler.append(mask_update_handler)
        return mask

    def update_epoch(self, epoch, sess):
        sess.run(self.assign_handler)
...@@ -2,7 +2,7 @@ import logging ...@@ -2,7 +2,7 @@ import logging
import torch import torch
from .compressor import Pruner from .compressor import Pruner
__all__ = ['LevelPruner', 'AGP_Pruner', 'SensitivityPruner'] __all__ = ['LevelPruner', 'AGP_Pruner']
logger = logging.getLogger('torch pruner') logger = logging.getLogger('torch pruner')
...@@ -17,14 +17,22 @@ class LevelPruner(Pruner): ...@@ -17,14 +17,22 @@ class LevelPruner(Pruner):
- sparsity - sparsity
""" """
super().__init__(config_list) super().__init__(config_list)
self.mask_list = {}
self.if_init_list = {}
def calc_mask(self, weight, config, **kwargs): def calc_mask(self, weight, config, op_name, **kwargs):
if self.if_init_list.get(op_name, True):
w_abs = weight.abs() w_abs = weight.abs()
k = int(weight.numel() * config['sparsity']) k = int(weight.numel() * config['sparsity'])
if k == 0: if k == 0:
return torch.ones(weight.shape).type_as(weight) return torch.ones(weight.shape).type_as(weight)
threshold = torch.topk(w_abs.view(-1), k, largest=False).values.max() threshold = torch.topk(w_abs.view(-1), k, largest=False).values.max()
return torch.gt(w_abs, threshold).type_as(weight) mask = torch.gt(w_abs, threshold).type_as(weight)
self.mask_list.update({op_name: mask})
self.if_init_list.update({op_name: False})
else:
mask = self.mask_list[op_name]
return mask
class AGP_Pruner(Pruner):
@@ -48,9 +56,14 @@ class AGP_Pruner(Pruner):
        """
        super().__init__(config_list)
        self.mask_list = {}
        self.now_epoch = 0
        self.if_init_list = {}

    def calc_mask(self, weight, config, op_name, **kwargs):
        start_epoch = config.get('start_epoch', 0)
        freq = config.get('frequency', 1)
        if self.now_epoch >= start_epoch and self.if_init_list.get(op_name, True) and (
                self.now_epoch - start_epoch) % freq == 0:
            mask = self.mask_list.get(op_name, torch.ones(weight.shape).type_as(weight))
            target_sparsity = self.compute_target_sparsity(config)
            k = int(weight.numel() * target_sparsity)
@@ -60,12 +73,15 @@ class AGP_Pruner(Pruner):
            w_abs = weight.abs() * mask
            threshold = torch.topk(w_abs.view(-1), k, largest=False).values.max()
            new_mask = torch.gt(w_abs, threshold).type_as(weight)
            self.mask_list.update({op_name: new_mask})
            self.if_init_list.update({op_name: False})
        else:
            new_mask = self.mask_list.get(op_name, torch.ones(weight.shape).type_as(weight))
        return new_mask

    def compute_target_sparsity(self, config):
        end_epoch = config.get('end_epoch', 1)
        start_epoch = config.get('start_epoch', 0)
        freq = config.get('frequency', 1)
        final_sparsity = config.get('final_sparsity', 0)
        initial_sparsity = config.get('initial_sparsity', 0)
@@ -86,35 +102,5 @@ class AGP_Pruner(Pruner):
    def update_epoch(self, epoch):
        if epoch > 0:
            self.now_epoch = epoch
            for k in self.if_init_list.keys():
                self.if_init_list[k] = True

class SensitivityPruner(Pruner):
    """Use algorithm from "Learning both Weights and Connections for Efficient Neural Networks"
    https://arxiv.org/pdf/1506.02626v3.pdf

    I.e.: "The pruning threshold is chosen as a quality parameter multiplied
    by the standard deviation of a layer's weights."
    """
    def __init__(self, config_list):
        """
        config_list: supported keys:
            - sparsity: chosen pruning sparsity
        """
        super().__init__(config_list)
        self.mask_list = {}

    def calc_mask(self, weight, config, op_name, **kwargs):
        mask = self.mask_list.get(op_name, torch.ones(weight.shape).type_as(weight))
        # before generating a new mask, apply the current mask to the weight
        weight = weight * mask
        target_sparsity = config['sparsity'] * torch.std(weight).item()
        k = int(weight.numel() * target_sparsity)
        if k == 0:
            return mask
        w_abs = weight.abs()
        threshold = torch.topk(w_abs.view(-1), k, largest=False).values.max()
        new_mask = torch.gt(w_abs, threshold).type_as(weight)
        self.mask_list[op_name] = new_mask
        return new_mask
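The sensitivity rule above can be illustrated without torch: the fraction of weights to prune scales with the standard deviation of the layer's weights, with `sparsity` acting as the paper's quality parameter, and the smallest-magnitude weights are masked out. A framework-free sketch (function and variable names are illustrative, not NNI's API):

```python
# Plain-Python sketch of the SensitivityPruner rule: prune the k
# smallest-magnitude weights, where k scales with the std of the layer.
import statistics

def sensitivity_mask(weights, quality):
    frac = quality * statistics.stdev(weights)  # std-scaled pruning fraction
    k = int(len(weights) * frac)                # number of weights to prune
    if k == 0:
        return [1.0] * len(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [1.0 if abs(w) > threshold else 0.0 for w in weights]

mask = sensitivity_mask([0.05, -0.1, 0.4, -0.8, 1.2, 0.02, -0.3, 0.6], quality=1.0)
```

Note the side effect of the std scaling: a layer with widely spread weights gets pruned more aggressively than one whose weights cluster near zero, which is the "sensitivity" intuition from the paper.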
@@ -38,7 +38,6 @@ class Compressor:
        if config is not None:
            self._instrument_layer(layer, config)

    def bind_model(self, model):
        """This method is called when a model is bound to the compressor.
        Users can optionally overload this method to do model-specific initialization.
@@ -56,7 +55,6 @@ class Compressor:
        """
        pass

    def _instrument_layer(self, layer, config):
        raise NotImplementedError()
@@ -90,7 +88,6 @@ class Pruner(Compressor):
        """
        raise NotImplementedError("Pruners must overload calc_mask()")

    def _instrument_layer(self, layer, config):
        # TODO: support multiple weight tensors
        # create a wrapper forward function to replace the original one
...
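The `_instrument_layer` body is elided here, but the comment describes its job: wrap the layer's forward function so the weight is multiplied by the mask from `calc_mask` before each call. A minimal framework-free sketch of that wrapping idea (`FakeLayer`, `instrument`, and the mask rule are illustrative stand-ins, not NNI's actual API):

```python
# Sketch of forward-wrapping: before each forward pass, recompute the
# mask and apply it to the layer's weight, so pruned weights stay zero.

class FakeLayer:
    def __init__(self, weight):
        self.weight = weight              # a list of floats stands in for a tensor
    def forward(self, x):
        return sum(w * x for w in self.weight)

def instrument(layer, calc_mask):
    original_forward = layer.forward
    def wrapped(x):
        mask = calc_mask(layer.weight)
        layer.weight = [w * m for w, m in zip(layer.weight, mask)]
        return original_forward(x)
    layer.forward = wrapped

layer = FakeLayer([0.1, -2.0, 0.05, 3.0])
# toy mask rule: keep only weights with magnitude above 1.0
instrument(layer, lambda ws: [1.0 if abs(w) > 1.0 else 0.0 for w in ws])
result = layer.forward(1.0)  # small weights are zeroed before the forward pass
```

Because the wrapper recomputes the mask on every call, weights that are updated during fine-tuning are re-masked automatically, which is how the pruned positions remain zero without touching the training loop.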
@@ -5,24 +5,28 @@ import torch.nn.functional as F
import nni.compression.tensorflow as tf_compressor
import nni.compression.torch as torch_compressor


def weight_variable(shape):
    return tf.Variable(tf.truncated_normal(shape, stddev=0.1))


def bias_variable(shape):
    return tf.Variable(tf.constant(0.1, shape=shape))


def conv2d(x_input, w_matrix):
    return tf.nn.conv2d(x_input, w_matrix, strides=[1, 1, 1, 1], padding='SAME')


def max_pool(x_input, pool_size):
    size = [1, pool_size, pool_size, 1]
    return tf.nn.max_pool(x_input, ksize=size, strides=size, padding='SAME')


class TfMnist:
    def __init__(self):
        images = tf.placeholder(tf.float32, [None, 784], name='input_x')
        labels = tf.placeholder(tf.float32, [None, 10], name='input_y')
        keep_prob = tf.placeholder(tf.float32, name='keep_prob')
        self.images = images
@@ -37,35 +41,35 @@ class TfMnist:
        self.fcw1 = None
        self.cross = None
        with tf.name_scope('reshape'):
            x_image = tf.reshape(images, [-1, 28, 28, 1])
        with tf.name_scope('conv1'):
            w_conv1 = weight_variable([5, 5, 1, 32])
            self.w1 = w_conv1
            b_conv1 = bias_variable([32])
            self.b1 = b_conv1
            h_conv1 = tf.nn.relu(conv2d(x_image, w_conv1) + b_conv1)
        with tf.name_scope('pool1'):
            h_pool1 = max_pool(h_conv1, 2)
        with tf.name_scope('conv2'):
            w_conv2 = weight_variable([5, 5, 32, 64])
            b_conv2 = bias_variable([64])
            h_conv2 = tf.nn.relu(conv2d(h_pool1, w_conv2) + b_conv2)
        with tf.name_scope('pool2'):
            h_pool2 = max_pool(h_conv2, 2)
        with tf.name_scope('fc1'):
            w_fc1 = weight_variable([7 * 7 * 64, 1024])
            self.fcw1 = w_fc1
            b_fc1 = bias_variable([1024])
            h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
            h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, w_fc1) + b_fc1)
        with tf.name_scope('dropout'):
            h_fc1_drop = tf.nn.dropout(h_fc1, 0.5)
        with tf.name_scope('fc2'):
            w_fc2 = weight_variable([1024, 10])
            b_fc2 = bias_variable([10])
            y_conv = tf.matmul(h_fc1_drop, w_fc2) + b_fc2
        with tf.name_scope('loss'):
            cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=y_conv))
            self.cross = cross_entropy
        with tf.name_scope('adam_optimizer'):
            self.train_step = tf.train.AdamOptimizer(0.0001).minimize(cross_entropy)
@@ -73,6 +77,7 @@ class TfMnist:
            correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(labels, 1))
            self.accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))


class TorchMnist(torch.nn.Module):
    def __init__(self):
        super().__init__()
@@ -89,22 +94,22 @@ class TorchMnist(torch.nn.Module):
        x = x.view(-1, 4 * 4 * 50)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)


class CompressorTestCase(TestCase):
    def test_tf_pruner(self):
        model = TfMnist()
        configure_list = [{'sparsity': 0.8, 'op_types': 'default'}]
        tf_compressor.LevelPruner(configure_list).compress_default_graph()

    def test_tf_quantizer(self):
        model = TfMnist()
        tf_compressor.NaiveQuantizer([{'op_types': 'default'}]).compress_default_graph()

    def test_torch_pruner(self):
        model = TorchMnist()
        configure_list = [{'sparsity': 0.8, 'op_types': 'default'}]
        torch_compressor.LevelPruner(configure_list).compress(model)

    def test_torch_quantizer(self):
...