Commit accb40f9 authored by Guoxin, committed by GitHub

compression benchmark (#2742)

parent d0a9b106
# Comparison of Filter Pruning Algorithms
To provide initial insight into the performance of various filter pruning algorithms,
we conducted extensive experiments with these algorithms on several benchmark models and datasets.
This document presents the experiment results.
In addition, we provide instructions for reproducing these experiments to facilitate further contributions to this effort.
## Experiment Setting
The experiments are performed with the following pruners/datasets/models:
* Models: [VGG16, ResNet18, ResNet50](https://github.com/microsoft/nni/tree/master/examples/model_compress/models/cifar10)
* Datasets: CIFAR-10
* Pruners:
    - These pruners are included:
        - Pruners with scheduling: `SimulatedAnnealing Pruner`, `NetAdapt Pruner`, `AutoCompress Pruner`. Given the overall sparsity requirement, these pruners can automatically generate a sparsity distribution among different layers.
        - One-shot pruners: `L1Filter Pruner`, `L2Filter Pruner`, `FPGM Pruner`. The sparsity of each layer is set the same as the overall sparsity in this experiment.
    - Only **filter pruning** performance is compared here. For the pruners with scheduling, `L1Filter Pruner` is used as the base algorithm. That is to say, after the sparsity distribution is decided by the scheduling algorithm, `L1Filter Pruner` is used to perform the actual pruning.
    - All the pruners listed above are implemented in [nni](https://github.com/microsoft/nni/tree/master/docs/en_US/Compressor/Overview.md).
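
The one-shot setting, where every layer receives the same sparsity, can be sketched as follows. This is a minimal illustration, assuming the list-of-dicts `config_list` format (`sparsity`, `op_types`) documented for NNI pruners; the helper name `make_config_list` and the sparsity values are hypothetical and are not taken from the benchmark itself.

```python
def make_config_list(sparsity, op_types=("Conv2d",)):
    """Hypothetical helper: one config entry that applies the same
    sparsity to every layer of the given types."""
    if not 0.0 < sparsity < 1.0:
        raise ValueError("sparsity must be in (0, 1)")
    return [{"sparsity": sparsity, "op_types": list(op_types)}]


# Illustrative sweep of target sparsities (assumed values, not the ones
# used to produce the figures in this document).
target_sparsities = [0.1, 0.3, 0.5, 0.7, 0.9]
configs = {s: make_config_list(s) for s in target_sparsities}

for s, cfg in configs.items():
    print(s, cfg)
```

Each resulting list would be passed as the second argument of a pruner such as `L1FilterPruner(model, config_list)`, as the experiment script does.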
## Experiment Result
For each dataset/model/pruner combination, we prune the model to different levels by setting a series of target sparsities for the pruner.
Here we plot both the **Number of Weights - Performance** curve and the **FLOPs - Performance** curve.
As a reference, we also plot the result reported in the paper [AutoCompress: An Automatic DNN Structured Pruning Framework for Ultra-High Compression Rates](http://arxiv.org/abs/1907.03141) for models VGG16 and ResNet18 on CIFAR-10.
The experiment results are shown in the following figures:
CIFAR-10, VGG16:
![](../../../examples/model_compress/comparison_of_pruners/img/performance_comparison_vgg16.png)
CIFAR-10, ResNet18:
![](../../../examples/model_compress/comparison_of_pruners/img/performance_comparison_resnet18.png)
CIFAR-10, ResNet50:
![](../../../examples/model_compress/comparison_of_pruners/img/performance_comparison_resnet50.png)
## Analysis
From the experiment results, we draw the following conclusions:
* Given the constraint on the number of parameters, the pruners with scheduling (`AutoCompress Pruner`, `SimulatedAnnealing Pruner`) perform better than the others when the constraint is strict. However, they have no such advantage in the FLOPs/Performance comparison, since only the constraint on the number of parameters is considered in the optimization process;
* The basic algorithms `L1Filter Pruner`, `L2Filter Pruner`, and `FPGM Pruner` perform very similarly in these experiments;
* `NetAdapt Pruner` cannot achieve a very high compression rate. This is caused by its mechanism: it prunes only one layer in each pruning iteration. This leads to unacceptable complexity if the sparsity per iteration is much lower than the overall sparsity constraint.
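
To make the complexity argument for `NetAdapt Pruner` concrete, here is a rough back-of-the-envelope sketch. It is a simplification under stated assumptions, not NNI's exact bookkeeping: we assume each iteration removes `sparsity_per_iteration` of the weights from a single layer, and that every prunable layer is tried (pruned, briefly fine-tuned, evaluated) before the best candidate is kept. The function name is hypothetical.

```python
import math


def estimate_netadapt_cost(overall_sparsity, sparsity_per_iteration, num_prunable_layers):
    """Rough estimate of the number of iterations and of candidate
    evaluations (prune + short-term fine-tune + evaluate)."""
    # The small epsilon guards against floating-point noise in the division.
    iterations = math.ceil(overall_sparsity / sparsity_per_iteration - 1e-9)
    candidate_evaluations = iterations * num_prunable_layers
    return iterations, candidate_evaluations


# 13 is the number of convolutional layers in VGG16; 0.05 is the default of
# --sparsity-per-iteration in the experiment script. Targets are illustrative.
for target in (0.1, 0.5, 0.9):
    print(target, estimate_netadapt_cost(target, 0.05, 13))
```

Under these assumptions the cost grows linearly with the ratio of overall sparsity to sparsity per iteration, and each candidate evaluation includes a short fine-tuning run.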
## Experiments Reproduction
### Implementation Details
* The experiment results are all collected with the default configuration of the pruners in nni; that is, when we instantiate a pruner class in nni, we do not change any default class arguments.
* Both FLOPs and the number of parameters are counted with [Model FLOPs/Parameters Counter](https://github.com/microsoft/nni/blob/master/docs/en_US/Compressor/CompressionUtils.md#model-flopsparameters-counter) after [model speed up](https://github.com/microsoft/nni/blob/master/docs/en_US/Compressor/ModelSpeedup.md). This avoids potential issues with counting them on masked models.
* The experiment code can be found [here](https://github.com/microsoft/nni/tree/master/examples/model_compress/auto_pruners_torch.py).
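
The reason for counting after speed up can be illustrated with a toy example in plain Python. This is only a sketch: the "layer" is a list of made-up filters, and `l1_filter_mask` is a hypothetical stand-in for L1-norm filter selection, not NNI's implementation. A masked layer keeps its zeroed filters, so a naive count equals the dense count; only the compacted layer reflects the real reduction.

```python
# Toy "layer": 4 filters with 3 weights each (values are made up).
filters = [
    [0.9, -0.8, 0.7],
    [0.01, 0.02, -0.01],
    [0.5, 0.4, -0.6],
    [0.03, -0.02, 0.01],
]


def l1_norm(filt):
    return sum(abs(w) for w in filt)


def l1_filter_mask(filters, sparsity):
    """Mask out the fraction `sparsity` of filters with the smallest L1 norm."""
    num_pruned = int(len(filters) * sparsity)
    order = sorted(range(len(filters)), key=lambda i: l1_norm(filters[i]))
    pruned = set(order[:num_pruned])
    return [0 if i in pruned else 1 for i in range(len(filters))]


def count_params(filters):
    return sum(len(f) for f in filters)


mask = l1_filter_mask(filters, sparsity=0.5)
# Masked model: pruned filters are zeroed but still occupy memory.
masked = [[w * m for w in f] for f, m in zip(filters, mask)]
# Speed-up (compacted) model: pruned filters are physically removed.
compacted = [f for f, m in zip(filters, mask) if m == 1]

print("mask:", mask)
print("params (masked):", count_params(masked))
print("params (compacted):", count_params(compacted))
```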
### Experiment Result Rendering
* If you follow the practice in the [example](https://github.com/microsoft/nni/tree/master/examples/model_compress/auto_pruners_torch.py), the result of every single pruning experiment will be saved in JSON format as follows:
``` json
{
"performance": {"original": 0.9298, "pruned": 0.1, "speedup": 0.1, "finetuned": 0.7746},
"params": {"original": 14987722.0, "speedup": 167089.0},
"flops": {"original": 314018314.0, "speedup": 38589922.0}
}
```
* The experiment results are saved [here](https://github.com/microsoft/nni/tree/master/examples/model_compress/experiment_data).
You can refer to [analyze](https://github.com/microsoft/nni/tree/master/examples/model_compress/experiment_data/analyze.py) to plot new performance comparison figures.
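
As a small illustration of how such a result record can be consumed, the sketch below computes compression ratios and the accuracy drop from the example JSON shown above. The helper name `summarize` and the derived field names are hypothetical; they are not part of the `analyze` script.

```python
import json

# Example record copied from the JSON shown above.
record = json.loads("""
{
  "performance": {"original": 0.9298, "pruned": 0.1, "speedup": 0.1, "finetuned": 0.7746},
  "params": {"original": 14987722.0, "speedup": 167089.0},
  "flops": {"original": 314018314.0, "speedup": 38589922.0}
}
""")


def summarize(record):
    """Hypothetical helper: compression ratios and accuracy drop for one experiment."""
    return {
        "params_ratio": record["params"]["original"] / record["params"]["speedup"],
        "flops_ratio": record["flops"]["original"] / record["flops"]["speedup"],
        "accuracy_drop": record["performance"]["original"] - record["performance"]["finetuned"],
    }


summary = summarize(record)
for key, value in summary.items():
    print(f"{key}: {value:.4f}")
```

For this record the parameter count shrinks far more than the FLOPs, which is consistent with the observation in the Analysis section that only the number of parameters is constrained during optimization.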
## Contribution
### TODO Items
* Pruners constrained by FLOPs/latency
* More pruning algorithms/datasets/models
### Issues
For algorithm implementation & experiment issues, please [create an issue](https://github.com/microsoft/nni/issues/new/).
@@ -8,4 +8,5 @@ Performance comparison and analysis can help users decide a proper algorithm (e.
    :maxdepth: 1

    Neural Architecture Search Comparison <NasComparison>
    Hyper-parameter Tuning Algorithm Comparsion <HpoComparison>
    Model Compression Algorithm Comparsion <ModelCompressionComparison>
@@ -42,6 +42,7 @@ Pruning algorithms compress the original network by removing redundant weights o
| [SimulatedAnnealing Pruner](https://nni.readthedocs.io/en/latest/Compressor/Pruner.html#simulatedannealing-pruner) | Automatic pruning with a guided heuristic search method, Simulated Annealing algorithm [Reference Paper](https://arxiv.org/abs/1907.03141) |
| [AutoCompress Pruner](https://nni.readthedocs.io/en/latest/Compressor/Pruner.html#autocompress-pruner) | Automatic pruning by iteratively call SimulatedAnnealing Pruner and ADMM Pruner [Reference Paper](https://arxiv.org/abs/1907.03141) |

You can refer to this [benchmark](https://github.com/microsoft/nni/tree/master/docs/en_US/Benchmark.md) for the performance of these pruners on some benchmark problems.

### Quantization Algorithms
......
@@ -9,78 +9,81 @@ import os
import json
import torch
from torch.optim.lr_scheduler import StepLR, MultiStepLR
from torchvision import datasets, transforms
from models.mnist.lenet import LeNet
from models.cifar10.vgg import VGG
from models.cifar10.resnet import ResNet18, ResNet50
from nni.compression.torch import L1FilterPruner, L2FilterPruner, FPGMPruner
from nni.compression.torch import SimulatedAnnealingPruner, ADMMPruner, NetAdaptPruner, AutoCompressPruner
from nni.compression.torch import ModelSpeedup
from nni.compression.torch.utils.counter import count_flops_params


def get_data(dataset, data_dir, batch_size, test_batch_size):
    '''
    get data
    '''
    kwargs = {'num_workers': 1, 'pin_memory': True} if torch.cuda.is_available() else {
    }

    if dataset == 'mnist':
        train_loader = torch.utils.data.DataLoader(
            datasets.MNIST(data_dir, train=True, download=True,
                           transform=transforms.Compose([
                               transforms.ToTensor(),
                               transforms.Normalize((0.1307,), (0.3081,))
                           ])),
            batch_size=batch_size, shuffle=True, **kwargs)
        val_loader = torch.utils.data.DataLoader(
            datasets.MNIST(data_dir, train=False,
                           transform=transforms.Compose([
                               transforms.ToTensor(),
                               transforms.Normalize((0.1307,), (0.3081,))
                           ])),
            batch_size=test_batch_size, shuffle=True, **kwargs)
        criterion = torch.nn.NLLLoss()
    elif dataset == 'cifar10':
        normalize = transforms.Normalize(
            (0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010))
        train_loader = torch.utils.data.DataLoader(
            datasets.CIFAR10(data_dir, train=True, transform=transforms.Compose([
                transforms.RandomHorizontalFlip(),
                transforms.RandomCrop(32, 4),
                transforms.ToTensor(),
                normalize,
            ]), download=True),
            batch_size=batch_size, shuffle=True, **kwargs)

        val_loader = torch.utils.data.DataLoader(
            datasets.CIFAR10(data_dir, train=False, transform=transforms.Compose([
                transforms.ToTensor(),
                normalize,
            ])),
            batch_size=batch_size, shuffle=False, **kwargs)
        criterion = torch.nn.CrossEntropyLoss()
    elif dataset == 'imagenet':
        normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                         std=[0.229, 0.224, 0.225])
        train_loader = torch.utils.data.DataLoader(
            datasets.ImageFolder(os.path.join(data_dir, 'train'),
                                 transform=transforms.Compose([
                                     transforms.RandomResizedCrop(224),
                                     transforms.RandomHorizontalFlip(),
                                     transforms.ToTensor(),
                                     normalize,
                                 ])),
            batch_size=batch_size, shuffle=True, **kwargs)

        val_loader = torch.utils.data.DataLoader(
            datasets.ImageFolder(os.path.join(data_dir, 'val'),
                                 transform=transforms.Compose([
                                     transforms.Resize(256),
                                     transforms.CenterCrop(224),
                                     transforms.ToTensor(),
                                     normalize,
                                 ])),
            batch_size=test_batch_size, shuffle=True, **kwargs)
        criterion = torch.nn.CrossEntropyLoss()
    return train_loader, val_loader, criterion
@@ -127,65 +130,91 @@ def test(model, device, criterion, val_loader):
    return accuracy


def get_trained_model_optimizer(args, device, train_loader, val_loader, criterion):
    if args.model == 'LeNet':
        model = LeNet().to(device)
        if args.load_pretrained_model:
            model.load_state_dict(torch.load(args.pretrained_model_dir))
            optimizer = torch.optim.Adadelta(model.parameters(), lr=1e-4)
        else:
            optimizer = torch.optim.Adadelta(model.parameters(), lr=1)
            scheduler = StepLR(optimizer, step_size=1, gamma=0.7)
    elif args.model == 'vgg16':
        model = VGG(depth=16).to(device)
        if args.load_pretrained_model:
            model.load_state_dict(torch.load(args.pretrained_model_dir))
            optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9, weight_decay=5e-4)
        else:
            optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=5e-4)
            scheduler = MultiStepLR(
                optimizer, milestones=[int(args.pretrain_epochs*0.5), int(args.pretrain_epochs*0.75)], gamma=0.1)
    elif args.model == 'resnet18':
        model = ResNet18().to(device)
        if args.load_pretrained_model:
            model.load_state_dict(torch.load(args.pretrained_model_dir))
            optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9, weight_decay=5e-4)
        else:
            optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
            scheduler = MultiStepLR(
                optimizer, milestones=[int(args.pretrain_epochs*0.5), int(args.pretrain_epochs*0.75)], gamma=0.1)
    elif args.model == 'resnet50':
        model = ResNet50().to(device)
        if args.load_pretrained_model:
            model.load_state_dict(torch.load(args.pretrained_model_dir))
            optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9, weight_decay=5e-4)
        else:
            optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
            scheduler = MultiStepLR(
                optimizer, milestones=[int(args.pretrain_epochs*0.5), int(args.pretrain_epochs*0.75)], gamma=0.1)
    else:
        raise ValueError("model not recognized")

    if not args.load_pretrained_model:
        best_acc = 0
        best_epoch = 0
        for epoch in range(args.pretrain_epochs):
            train(args, model, device, train_loader, criterion, optimizer, epoch)
            scheduler.step()
            acc = test(model, device, criterion, val_loader)
            if acc > best_acc:
                best_acc = acc
                best_epoch = epoch
                state_dict = model.state_dict()
        model.load_state_dict(state_dict)
        print('Best acc:', best_acc)
        print('Best epoch:', best_epoch)

        if args.save_model:
            torch.save(state_dict, os.path.join(args.experiment_data_dir, 'model_trained.pth'))
            print('Model trained saved to %s', args.experiment_data_dir)

    return model, optimizer


def get_dummy_input(args, device):
    if args.dataset == 'mnist':
        dummy_input = torch.randn([args.test_batch_size, 1, 28, 28]).to(device)
    elif args.dataset in ['cifar10', 'imagenet']:
        dummy_input = torch.randn([args.test_batch_size, 3, 32, 32]).to(device)
    return dummy_input


def get_input_size(dataset):
    if dataset == 'mnist':
        input_size = (1, 1, 28, 28)
    elif dataset == 'cifar10':
        input_size = (1, 3, 32, 32)
    elif dataset == 'imagenet':
        input_size = (1, 3, 256, 256)
    return input_size


def main(args):
    # prepare dataset
    torch.manual_seed(0)
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    train_loader, val_loader, criterion = get_data(args.dataset, args.data_dir, args.batch_size, args.test_batch_size)
    model, optimizer = get_trained_model_optimizer(args, device, train_loader, val_loader, criterion)

    def short_term_fine_tuner(model, epochs=1):
        for epoch in range(epochs):
@@ -198,11 +227,15 @@ def main(args):
        return test(model, device, criterion, val_loader)

    # used to save the performance of the original & pruned & finetuned models
    result = {'flops': {}, 'params': {}, 'performance':{}}

    flops, params = count_flops_params(model, get_input_size(args.dataset))
    result['flops']['original'] = flops
    result['params']['original'] = params

    evaluation_result = evaluator(model)
    print('Evaluation result (original model): %s' % evaluation_result)
    result['performance']['original'] = evaluation_result

    # module types to prune, only "Conv2d" supported for channel pruning
    if args.base_algo in ['l1', 'l2']:
@@ -218,6 +251,10 @@ def main(args):
    if args.pruner == 'L1FilterPruner':
        pruner = L1FilterPruner(model, config_list)
    elif args.pruner == 'L2FilterPruner':
        pruner = L2FilterPruner(model, config_list)
    elif args.pruner == 'FPGMPruner':
        pruner = FPGMPruner(model, config_list)
    elif args.pruner == 'NetAdaptPruner':
        pruner = NetAdaptPruner(model, config_list, short_term_fine_tuner=short_term_fine_tuner, evaluator=evaluator,
                                base_algo=args.base_algo, experiment_data_dir=args.experiment_data_dir)
@@ -263,99 +300,123 @@ def main(args):
                                    experiment_data_dir=args.experiment_data_dir)
    else:
        raise ValueError(
            "Pruner not supported.")

    # Pruner.compress() returns the masked model
    # but for AutoCompressPruner, Pruner.compress() returns directly the pruned model
    model = pruner.compress()
    evaluation_result = evaluator(model)
    print('Evaluation result (masked model): %s' % evaluation_result)
    result['performance']['pruned'] = evaluation_result

    if args.save_model:
        pruner.export_model(
            os.path.join(args.experiment_data_dir, 'model_masked.pth'), os.path.join(args.experiment_data_dir, 'mask.pth'))
        print('Masked model saved to %s', args.experiment_data_dir)

    # model speed up
    if args.speed_up:
        if args.pruner != 'AutoCompressPruner':
            if args.model == 'LeNet':
                model = LeNet().to(device)
            elif args.model == 'vgg16':
                model = VGG(depth=16).to(device)
            elif args.model == 'resnet18':
                model = ResNet18().to(device)
            elif args.model == 'resnet50':
                model = ResNet50().to(device)

            model.load_state_dict(torch.load(os.path.join(args.experiment_data_dir, 'model_masked.pth')))
            masks_file = os.path.join(args.experiment_data_dir, 'mask.pth')

            m_speedup = ModelSpeedup(model, dummy_input, masks_file, device)
            m_speedup.speedup_model()
            evaluation_result = evaluator(model)
            print('Evaluation result (speed up model): %s' % evaluation_result)
            result['performance']['speedup'] = evaluation_result

            torch.save(model.state_dict(), os.path.join(args.experiment_data_dir, 'model_speed_up.pth'))
            print('Speed up model saved to %s', args.experiment_data_dir)
        flops, params = count_flops_params(model, get_input_size(args.dataset))
        result['flops']['speedup'] = flops
        result['params']['speedup'] = params

    if args.fine_tune:
        if args.dataset == 'mnist':
            optimizer = torch.optim.Adadelta(model.parameters(), lr=1)
            scheduler = StepLR(optimizer, step_size=1, gamma=0.7)
        elif args.dataset == 'cifar10' and args.model == 'vgg16':
            optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=5e-4)
            scheduler = MultiStepLR(
                optimizer, milestones=[int(args.fine_tune_epochs*0.5), int(args.fine_tune_epochs*0.75)], gamma=0.1)
        elif args.dataset == 'cifar10' and args.model == 'resnet18':
            optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
            scheduler = MultiStepLR(
                optimizer, milestones=[int(args.fine_tune_epochs*0.5), int(args.fine_tune_epochs*0.75)], gamma=0.1)
        elif args.dataset == 'cifar10' and args.model == 'resnet50':
            optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
            scheduler = MultiStepLR(
                optimizer, milestones=[int(args.fine_tune_epochs*0.5), int(args.fine_tune_epochs*0.75)], gamma=0.1)
        best_acc = 0
        for epoch in range(args.fine_tune_epochs):
            train(args, model, device, train_loader, criterion, optimizer, epoch)
            scheduler.step()
            acc = evaluator(model)
            if acc > best_acc:
                best_acc = acc
                torch.save(model.state_dict(), os.path.join(args.experiment_data_dir, 'model_fine_tuned.pth'))

        print('Evaluation result (fine tuned): %s' % best_acc)
        print('Fined tuned model saved to %s', args.experiment_data_dir)
        result['performance']['finetuned'] = best_acc

    with open(os.path.join(args.experiment_data_dir, 'result.json'), 'w+') as f:
        json.dump(result, f)


if __name__ == '__main__':
    def str2bool(s):
        if isinstance(s, bool):
            return s
        if s.lower() in ('yes', 'true', 't', 'y', '1'):
            return True
        if s.lower() in ('no', 'false', 'f', 'n', '0'):
            return False
        raise argparse.ArgumentTypeError('Boolean value expected.')

    parser = argparse.ArgumentParser(description='PyTorch Example for SimulatedAnnealingPruner')

    # dataset and model
    parser.add_argument('--dataset', type=str, default='cifar10',
                        help='dataset to use, mnist, cifar10 or imagenet')
    parser.add_argument('--data-dir', type=str, default='./data/',
                        help='dataset directory')
    parser.add_argument('--model', type=str, default='vgg16',
                        help='model to use, LeNet, vgg16, resnet18 or resnet50')
    parser.add_argument('--load-pretrained-model', type=str2bool, default=False,
                        help='whether to load pretrained model')
    parser.add_argument('--pretrained-model-dir', type=str, default='./',
                        help='path to pretrained model')
    parser.add_argument('--pretrain-epochs', type=int, default=100,
                        help='number of epochs to pretrain the model')
    parser.add_argument('--batch-size', type=int, default=64,
                        help='input batch size for training (default: 64)')
    parser.add_argument('--test-batch-size', type=int, default=64,
                        help='input batch size for testing (default: 64)')
    parser.add_argument('--fine-tune', type=str2bool, default=True,
                        help='whether to fine-tune the pruned model')
    parser.add_argument('--fine-tune-epochs', type=int, default=5,
                        help='epochs to fine tune')
    parser.add_argument('--experiment-data-dir', type=str, default='./experiment_data',
                        help='For saving experiment data')

    # pruner
    parser.add_argument('--pruner', type=str, default='SimulatedAnnealingPruner',
                        help='pruner to use')
    parser.add_argument('--base-algo', type=str, default='l1',
                        help='base pruning algorithm. level, l1 or l2')
    parser.add_argument('--sparsity', type=float, default=0.1,
                        help='target overall target sparsity')
    # param for SimulatedAnnealingPruner
    parser.add_argument('--cool-down-rate', type=float, default=0.9,
                        help='cool down rate')
@@ -363,29 +424,16 @@ if __name__ == '__main__':
    parser.add_argument('--sparsity-per-iteration', type=float, default=0.05,
                        help='sparsity_per_iteration of NetAdaptPruner')

    # speed-up
    parser.add_argument('--speed-up', type=str2bool, default=False,
                        help='Whether to speed-up the pruned model')

    # others
    parser.add_argument('--log-interval', type=int, default=200,
                        help='how many batches to wait before logging training status')
    parser.add_argument('--save-model', type=str2bool, default=True,
                        help='For Saving the current Model')

    args = parser.parse_args()

    if not os.path.exists(args.experiment_data_dir):
......
import argparse
import json
import matplotlib.pyplot as plt


def plot_performance_comparison(args):
    # reference data, performance of the original model and the performance declared in the AutoCompress Paper
    references = {
        'original':{
            'cifar10':{
                'vgg16':{
                    'performance': 0.9298,
                    'params':14987722.0,
                    'flops':314018314.0
                },
                'resnet18':{
                    'performance': 0.9433,
                    'params':11173962.0,
                    'flops':556651530.0
                },
                'resnet50':{
                    'performance': 0.9488,
                    'params':23520842.0,
                    'flops':1304694794.0
                }
            }
        },
        'AutoCompressPruner':{
            'cifar10':{
                'vgg16':{
                    'performance': 0.9321,
                    'params':52.2, # times
                    'flops':8.8
                },
                'resnet18':{
                    'performance': 0.9381,
                    'params':54.2, # times
                    'flops':12.2
                }
            }
        }
    }

    markers = ['v', '^', '<', '1', '2', '3', '4', '8', '*', '+', 'o']

    with open('cifar10/comparison_result_{}.json'.format(args.model), 'r') as jsonfile:
        result = json.load(jsonfile)

    pruners = result.keys()

    performances = {}
    flops = {}
    params = {}
    sparsities = {}
    for pruner in pruners:
        performances[pruner] = [val['performance'] for val in result[pruner]]
        flops[pruner] = [val['flops'] for val in result[pruner]]
        params[pruner] = [val['params'] for val in result[pruner]]
        sparsities[pruner] = [val['sparsity'] for val in result[pruner]]

    fig, axs = plt.subplots(2, 1, figsize=(8, 10))
    fig.suptitle('Channel Pruning Comparison on {}/CIFAR10'.format(args.model))
    fig.subplots_adjust(hspace=0.5)

    for idx, pruner in enumerate(pruners):
        axs[0].scatter(params[pruner], performances[pruner], marker=markers[idx], label=pruner)
        axs[1].scatter(flops[pruner], performances[pruner], marker=markers[idx], label=pruner)

    # references
    params_original = references['original']['cifar10'][args.model]['params']
    performance_original = references['original']['cifar10'][args.model]['performance']
    axs[0].plot(params_original, performance_original, 'rx', label='original model')
    if args.model in ['vgg16', 'resnet18']:
        axs[0].plot(params_original/references['AutoCompressPruner']['cifar10'][args.model]['params'],
                    references['AutoCompressPruner']['cifar10'][args.model]['performance'],
                    'bx', label='AutoCompress Paper')

    axs[0].set_title("Performance v.s. Number of Parameters")
    axs[0].set_xlabel("Number of Parameters")
    axs[0].set_ylabel('Accuracy')
    axs[0].legend()

    # references
    flops_original = references['original']['cifar10'][args.model]['flops']
    performance_original = references['original']['cifar10'][args.model]['performance']
    axs[1].plot(flops_original, performance_original, 'rx', label='original model')
    if args.model in ['vgg16', 'resnet18']:
        axs[1].plot(flops_original/references['AutoCompressPruner']['cifar10'][args.model]['flops'],
                    references['AutoCompressPruner']['cifar10'][args.model]['performance'],
                    'bx', label='AutoCompress Paper')

    axs[1].set_title("Performance v.s. FLOPs")
    axs[1].set_xlabel("FLOPs")
    axs[1].set_ylabel('Accuracy')
    axs[1].legend()

    plt.savefig('img/performance_comparison_{}.png'.format(args.model))
    plt.close()
if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Plot pruner performance comparison on CIFAR-10')
    parser.add_argument('--model', type=str, default='vgg16',
                        help='vgg16, resnet18 or resnet50')
    args = parser.parse_args()
    plot_performance_comparison(args)
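The AutoCompress entries in the `references` dictionary store compression ratios (the `# times` values) rather than absolute counts, and the plotting code divides the original model's numbers by those ratios to place the paper's result on the same axes. A minimal sketch of that conversion follows, using only values copied from the script above. The helper name `to_absolute` and the two module-level dictionaries are hypothetical and are not part of the commit.

```python
# Values copied from the `references` dictionary in the plotting script.
ORIGINAL = {
    'vgg16': {'params': 14987722.0, 'flops': 314018314.0},
    'resnet18': {'params': 11173962.0, 'flops': 556651530.0},
}
AUTOCOMPRESS_RATIO = {
    'vgg16': {'params': 52.2, 'flops': 8.8},
    'resnet18': {'params': 54.2, 'flops': 12.2},
}


def to_absolute(model):
    """Hypothetical helper: turn the paper's compression ratios into absolute counts."""
    return {key: ORIGINAL[model][key] / AUTOCOMPRESS_RATIO[model][key]
            for key in ('params', 'flops')}


if __name__ == '__main__':
    for model in ORIGINAL:
        point = to_absolute(model)
        # Rounded only for display; the plot uses the unrounded quotient.
        print(model, round(point['params']), round(point['flops']))
```

Keeping the paper's numbers as ratios means they stay valid if the baseline counts are re-measured; only the `ORIGINAL` entries would need updating.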
{
"L1FilterPruner": [
{
"sparsity": 0.1,
"params": 9642085.0,
"flops": 496882684.0,
"performance": 0.9436
},
{
"sparsity": 0.2,
"params": 8149126.0,
"flops": 436381222.0,
"performance": 0.9472
},
{
"sparsity": 0.3,
"params": 6705269.0,
"flops": 371666312.0,
"performance": 0.9391
},
{
"sparsity": 0.4,
"params": 5335138.0,
"flops": 307050934.0,
"performance": 0.9433
},
{
"sparsity": 0.5,
"params": 3998122.0,
"flops": 237900244.0,
"performance": 0.9379
},
{
"sparsity": 0.6,
"params": 2767325.0,
"flops": 175308326.0,
"performance": 0.9326
},
{
"sparsity": 0.7,
"params": 1617817.0,
"flops": 108532198.0,
"performance": 0.928
},
{
"sparsity": 0.8,
"params": 801338.0,
"flops": 53808728.0,
"performance": 0.9145
},
{
"sparsity": 0.9,
"params": 229372.0,
"flops": 15304972.0,
"performance": 0.8858
},
{
"sparsity": 0.95,
"params": 61337.0,
"flops": 4305146.0,
"performance": 0.8441
},
{
"sparsity": 0.975,
"params": 17763.0,
"flops": 1561644.0,
"performance": 0.7294
}
],
"L2FilterPruner": [
{
"sparsity": 0.1,
"params": 9680242.0,
"flops": 497492746.0,
"performance": 0.9423
},
{
"sparsity": 0.2,
"params": 8137784.0,
"flops": 436199900.0,
"performance": 0.9471
},
{
"sparsity": 0.3,
"params": 6702679.0,
"flops": 369733768.0,
"performance": 0.9415
},
{
"sparsity": 0.4,
"params": 5330426.0,
"flops": 305512736.0,
"performance": 0.9411
},
{
"sparsity": 0.5,
"params": 3961076.0,
"flops": 236467814.0,
"performance": 0.9349
},
{
"sparsity": 0.6,
"params": 2776512.0,
"flops": 175872204.0,
"performance": 0.9393
},
{
"sparsity": 0.7,
"params": 1622571.0,
"flops": 107994906.0,
"performance": 0.9295
},
{
"sparsity": 0.8,
"params": 797075.0,
"flops": 53534414.0,
"performance": 0.9187
},
{
"sparsity": 0.9,
"params": 232153.0,
"flops": 15385078.0,
"performance": 0.8838
},
{
"sparsity": 0.95,
"params": 58180.0,
"flops": 4510072.0,
"performance": 0.8396
},
{
"sparsity": 0.975,
"params": 16836.0,
"flops": 1429752.0,
"performance": 0.7482
}
],
"FPGMPruner": [
{
"sparsity": 0.1,
"params": 9705680.0,
"flops": 497899454.0,
"performance": 0.9443
},
{
"sparsity": 0.2,
"params": 8160468.0,
"flops": 436562544.0,
"performance": 0.946
},
{
"sparsity": 0.3,
"params": 6710052.0,
"flops": 367960482.0,
"performance": 0.9452
},
{
"sparsity": 0.4,
"params": 5334205.0,
"flops": 306166432.0,
"performance": 0.9412
},
{
"sparsity": 0.5,
"params": 4007259.0,
"flops": 237702210.0,
"performance": 0.9385
},
{
"sparsity": 0.6,
"params": 2782236.0,
"flops": 175813620.0,
"performance": 0.9304
},
{
"sparsity": 0.7,
"params": 1634603.0,
"flops": 108904676.0,
"performance": 0.9249
},
{
"sparsity": 0.8,
"params": 799610.0,
"flops": 53645918.0,
"performance": 0.9203
},
{
"sparsity": 0.9,
"params": 233644.0,
"flops": 15408784.0,
"performance": 0.8856
},
{
"sparsity": 0.95,
"params": 56518.0,
"flops": 4266910.0,
"performance": 0.83
},
{
"sparsity": 0.975,
"params": 17610.0,
"flops": 1441836.0,
"performance": 0.7356
}
],
"NetAdaptPruner": [
{
"sparsity": 0.1,
"params": 11173962.0,
"flops": 556651530.0,
"performance": 0.9474
},
{
"sparsity": 0.2,
"params": 10454958.0,
"flops": 545147466.0,
"performance": 0.9482
},
{
"sparsity": 0.3,
"params": 9299986.0,
"flops": 526681564.0,
"performance": 0.9469
},
{
"sparsity": 0.4,
"params": 8137618.0,
"flops": 508087276.0,
"performance": 0.9451
},
{
"sparsity": 0.5,
"params": 6267654.0,
"flops": 478185102.0,
"performance": 0.947
},
{
"sparsity": 0.6,
"params": 5277444.0,
"flops": 462341742.0,
"performance": 0.9469
},
{
"sparsity": 0.7,
"params": 4854190.0,
"flops": 455580628.0,
"performance": 0.9466
},
{
"sparsity": 0.8,
"params": 3531098.0,
"flops": 434411156.0,
"performance": 0.9472
}
],
"SimulatedAnnealingPruner": [
{
"sparsity": 0.1,
"params": 10307424.0,
"flops": 537697098.0,
"performance": 0.942
},
{
"sparsity": 0.2,
"params": 9264598.0,
"flops": 513101368.0,
"performance": 0.9456
},
{
"sparsity": 0.3,
"params": 7999316.0,
"flops": 489260738.0,
"performance": 0.946
},
{
"sparsity": 0.4,
"params": 6996176.0,
"flops": 450768626.0,
"performance": 0.9413
},
{
"sparsity": 0.5,
"params": 5412616.0,
"flops": 408698434.0,
"performance": 0.9477
},
{
"sparsity": 0.6,
"params": 5106924.0,
"flops": 391735326.0,
"performance": 0.9483
},
{
"sparsity": 0.7,
"params": 3032105.0,
"flops": 269777978.0,
"performance": 0.9414
},
{
"sparsity": 0.8,
"params": 2423230.0,
"flops": 294783862.0,
"performance": 0.9384
},
{
"sparsity": 0.9,
"params": 1151046.0,
"flops": 209639226.0,
"performance": 0.939
},
{
"sparsity": 0.95,
"params": 394406.0,
"flops": 108776618.0,
"performance": 0.923
},
{
"sparsity": 0.975,
"params": 250649.0,
"flops": 84645050.0,
"performance": 0.917
}
],
"AutoCompressPruner": [
{
"sparsity": 0.1,
"params": 10238286.0,
"flops": 536590794.0,
"performance": 0.9406
},
{
"sparsity": 0.2,
"params": 9272049.0,
"flops": 512333916.0,
"performance": 0.9392
},
{
"sparsity": 0.3,
"params": 8099915.0,
"flops": 485418056.0,
"performance": 0.9398
},
{
"sparsity": 0.4,
"params": 6864547.0,
"flops": 449359492.0,
"performance": 0.9406
},
{
"sparsity": 0.5,
"params": 6106994.0,
"flops": 430766432.0,
"performance": 0.9397
},
{
"sparsity": 0.6,
"params": 5338096.0,
"flops": 415085278.0,
"performance": 0.9384
},
{
"sparsity": 0.7,
"params": 3701330.0,
"flops": 351057878.0,
"performance": 0.938
},
{
"sparsity": 0.8,
"params": 2229760.0,
"flops": 269058346.0,
"performance": 0.9388
},
{
"sparsity": 0.9,
"params": 1108564.0,
"flops": 189355930.0,
"performance": 0.9348
},
{
"sparsity": 0.95,
"params": 616893.0,
"flops": 159314256.0,
"performance": 0.93
},
{
"sparsity": 0.975,
"params": 297368.0,
"flops": 113398292.0,
"performance": 0.9072
}
]
}
\ No newline at end of file
{
"L1FilterPruner": [
{
"sparsity": 0.1,
"params": 20378141.0,
"flops": 1134740738.0,
"performance": 0.9456
},
{
"sparsity": 0.2,
"params": 17286560.0,
"flops": 966734852.0,
"performance": 0.9433
},
{
"sparsity": 0.3,
"params": 14403947.0,
"flops": 807114812.0,
"performance": 0.9396
},
{
"sparsity": 0.4,
"params": 11558288.0,
"flops": 656314106.0,
"performance": 0.9402
},
{
"sparsity": 0.5,
"params": 8826728.0,
"flops": 507965924.0,
"performance": 0.9394
},
{
"sparsity": 0.6,
"params": 6319902.0,
"flops": 374211960.0,
"performance": 0.9372
},
{
"sparsity": 0.7,
"params": 4063713.0,
"flops": 246788556.0,
"performance": 0.9304
},
{
"sparsity": 0.8,
"params": 2120717.0,
"flops": 133614422.0,
"performance": 0.9269
},
{
"sparsity": 0.9,
"params": 652524.0,
"flops": 41973714.0,
"performance": 0.9081
},
{
"sparsity": 0.95,
"params": 195468.0,
"flops": 13732020.0,
"performance": 0.8723
},
{
"sparsity": 0.975,
"params": 58054.0,
"flops": 4268104.0,
"performance": 0.7941
}
],
"L2FilterPruner": [
{
"sparsity": 0.1,
"params": 20378141.0,
"flops": 1134740738.0,
"performance": 0.9442
},
{
"sparsity": 0.2,
"params": 17275244.0,
"flops": 966400928.0,
"performance": 0.9463
},
{
"sparsity": 0.3,
"params": 14415409.0,
"flops": 807710914.0,
"performance": 0.9367
},
{
"sparsity": 0.4,
"params": 11564310.0,
"flops": 656653008.0,
"performance": 0.9391
},
{
"sparsity": 0.5,
"params": 8843266.0,
"flops": 508086256.0,
"performance": 0.9381
},
{
"sparsity": 0.6,
"params": 6316815.0,
"flops": 373882614.0,
"performance": 0.9368
},
{
"sparsity": 0.7,
"params": 4054272.0,
"flops": 246477678.0,
"performance": 0.935
},
{
"sparsity": 0.8,
"params": 2129321.0,
"flops": 134527520.0,
"performance": 0.9275
},
{
"sparsity": 0.9,
"params": 667500.0,
"flops": 42927060.0,
"performance": 0.9129
},
{
"sparsity": 0.95,
"params": 192464.0,
"flops": 13669430.0,
"performance": 0.8757
},
{
"sparsity": 0.975,
"params": 58250.0,
"flops": 4365620.0,
"performance": 0.7978
}
],
"FPGMPruner": [
{
"sparsity": 0.1,
"params": 20401570.0,
"flops": 1135114552.0,
"performance": 0.9438
},
{
"sparsity": 0.2,
"params": 17321414.0,
"flops": 967137398.0,
"performance": 0.9427
},
{
"sparsity": 0.3,
"params": 14418221.0,
"flops": 807755756.0,
"performance": 0.9422
},
{
"sparsity": 0.4,
"params": 11565000.0,
"flops": 655412124.0,
"performance": 0.9403
},
{
"sparsity": 0.5,
"params": 8829840.0,
"flops": 506715294.0,
"performance": 0.9355
},
{
"sparsity": 0.6,
"params": 6308085.0,
"flops": 374231682.0,
"performance": 0.9359
},
{
"sparsity": 0.7,
"params": 4054237.0,
"flops": 246511714.0,
"performance": 0.9285
},
{
"sparsity": 0.8,
"params": 2134187.0,
"flops": 134456366.0,
"performance": 0.9275
},
{
"sparsity": 0.9,
"params": 665931.0,
"flops": 42859752.0,
"performance": 0.9083
},
{
"sparsity": 0.95,
"params": 191590.0,
"flops": 13641052.0,
"performance": 0.8762
},
{
"sparsity": 0.975,
"params": 57767.0,
"flops": 4350074.0,
"performance": 0.789
}
],
"NetAdaptPruner": [
{
"sparsity": 0.1,
"params": 22348970.0,
"flops": 1275701258.0,
"performance": 0.9404
},
{
"sparsity": 0.2,
"params": 21177162.0,
"flops": 1256952330.0,
"performance": 0.9445
},
{
"sparsity": 0.3,
"params": 18407434.0,
"flops": 1212636682.0,
"performance": 0.9433
},
{
"sparsity": 0.4,
"params": 16061284.0,
"flops": 1175098282.0,
"performance": 0.9401
}
],
"SimulatedAnnealingPruner": [
{
"sparsity": 0.1,
"params": 20551755.0,
"flops": 1230145122.0,
"performance": 0.9438
},
{
"sparsity": 0.2,
"params": 17766048.0,
"flops": 1159924128.0,
"performance": 0.9432
},
{
"sparsity": 0.3,
"params": 15105146.0,
"flops": 1094478662.0,
"performance": 0.943
},
{
"sparsity": 0.4,
"params": 12378092.0,
"flops": 1008801158.0,
"performance": 0.9398
},
{
"sparsity": 0.5,
"params": 9890487.0,
"flops": 911941770.0,
"performance": 0.9426
},
{
"sparsity": 0.6,
"params": 7638262.0,
"flops": 831218770.0,
"performance": 0.9412
},
{
"sparsity": 0.7,
"params": 5469936.0,
"flops": 691881792.0,
"performance": 0.9405
},
{
"sparsity": 0.8,
"params": 3668951.0,
"flops": 580850666.0,
"performance": 0.941
},
{
"sparsity": 0.9,
"params": 1765284.0,
"flops": 389162310.0,
"performance": 0.9294
}
],
"AutoCompressPruner": [
{
"sparsity": 0.1,
"params": 20660299.0,
"flops": 1228508590.0,
"performance": 0.9337
},
{
"sparsity": 0.2,
"params": 17940465.0,
"flops": 1152868146.0,
"performance": 0.9326
},
{
"sparsity": 0.3,
"params": 15335831.0,
"flops": 1084996094.0,
"performance": 0.9348
},
{
"sparsity": 0.4,
"params": 12821408.0,
"flops": 991305524.0,
"performance": 0.936
},
{
"sparsity": 0.5,
"params": 10695425.0,
"flops": 919638860.0,
"performance": 0.9349
},
{
"sparsity": 0.6,
"params": 8536821.0,
"flops": 802011678.0,
"performance": 0.9339
},
{
"sparsity": 0.7,
"params": 7276898.0,
"flops": 744248114.0,
"performance": 0.9337
},
{
"sparsity": 0.8,
"params": 5557721.0,
"flops": 643881710.0,
"performance": 0.9323
},
{
"sparsity": 0.9,
"params": 3925140.0,
"flops": 512545272.0,
"performance": 0.9304
},
{
"sparsity": 0.95,
"params": 2867004.0,
"flops": 365184762.0,
"performance": 0.9263
},
{
"sparsity": 0.975,
"params": 1773257.0,
"flops": 229320266.0,
"performance": 0.9175
}
]
}
\ No newline at end of file
{
"L1FilterPruner": [
{
"sparsity": 0.1,
"params": 12187336.0,
"flops": 256252606.0,
"performance": 0.9344
},
{
"sparsity": 0.2,
"params": 9660216.0,
"flops": 203049930.0,
"performance": 0.9371
},
{
"sparsity": 0.3,
"params": 7435417.0,
"flops": 155477470.0,
"performance": 0.9341
},
{
"sparsity": 0.4,
"params": 5493954.0,
"flops": 114721578.0,
"performance": 0.9317
},
{
"sparsity": 0.5,
"params": 3820010.0,
"flops": 79155722.0,
"performance": 0.9309
},
{
"sparsity": 0.6,
"params": 2478632.0,
"flops": 51618494.0,
"performance": 0.9229
},
{
"sparsity": 0.7,
"params": 1420600.0,
"flops": 29455306.0,
"performance": 0.9031
},
{
"sparsity": 0.8,
"params": 658553.0,
"flops": 13290974.0,
"performance": 0.8756
},
{
"sparsity": 0.9,
"params": 186178.0,
"flops": 3574570.0,
"performance": 0.8145
},
{
"sparsity": 0.95,
"params": 58680.0,
"flops": 1050570.0,
"performance": 0.6983
},
{
"sparsity": 0.975,
"params": 23408.0,
"flops": 329918.0,
"performance": 0.5573
}
],
"L2FilterPruner": [
{
"sparsity": 0.1,
"params": 12187336.0,
"flops": 256252606.0,
"performance": 0.9357
},
{
"sparsity": 0.2,
"params": 9660216.0,
"flops": 203049930.0,
"performance": 0.9355
},
{
"sparsity": 0.3,
"params": 7435417.0,
"flops": 155477470.0,
"performance": 0.9337
},
{
"sparsity": 0.4,
"params": 5493954.0,
"flops": 114721578.0,
"performance": 0.9308
},
{
"sparsity": 0.5,
"params": 3820010.0,
"flops": 79155722.0,
"performance": 0.9285
},
{
"sparsity": 0.6,
"params": 2478632.0,
"flops": 51618494.0,
"performance": 0.9208
},
{
"sparsity": 0.7,
"params": 1420600.0,
"flops": 29455306.0,
"performance": 0.909
},
{
"sparsity": 0.8,
"params": 658553.0,
"flops": 13290974.0,
"performance": 0.8698
},
{
"sparsity": 0.9,
"params": 186178.0,
"flops": 3574570.0,
"performance": 0.8203
},
{
"sparsity": 0.95,
"params": 58680.0,
"flops": 1050570.0,
"performance": 0.7063
},
{
"sparsity": 0.975,
"params": 23408.0,
"flops": 329918.0,
"performance": 0.5455
}
],
"FPGMPruner": [
{
"sparsity": 0.1,
"params": 12187336.0,
"flops": 256252606.0,
"performance": 0.937
},
{
"sparsity": 0.2,
"params": 9660216.0,
"flops": 203049930.0,
"performance": 0.936
},
{
"sparsity": 0.3,
"params": 7435417.0,
"flops": 155477470.0,
"performance": 0.9359
},
{
"sparsity": 0.4,
"params": 5493954.0,
"flops": 114721578.0,
"performance": 0.9302
},
{
"sparsity": 0.5,
"params": 3820010.0,
"flops": 79155722.0,
"performance": 0.9233
},
{
"sparsity": 0.6,
"params": 2478632.0,
"flops": 51618494.0,
"performance": 0.922
},
{
"sparsity": 0.7,
"params": 1420600.0,
"flops": 29455306.0,
"performance": 0.9022
},
{
"sparsity": 0.8,
"params": 658553.0,
"flops": 13290974.0,
"performance": 0.8794
},
{
"sparsity": 0.9,
"params": 186178.0,
"flops": 3574570.0,
"performance": 0.8276
},
{
"sparsity": 0.95,
"params": 58680.0,
"flops": 1050570.0,
"performance": 0.6967
},
{
"sparsity": 0.975,
"params": 23408.0,
"flops": 329918.0,
"performance": 0.3683
}
],
"NetAdaptPruner": [
{
"sparsity": 0.1,
"params": 13492098.0,
"flops": 308484330.0,
"performance": 0.9376
},
{
"sparsity": 0.2,
"params": 11998408.0,
"flops": 297641410.0,
"performance": 0.9374
},
{
"sparsity": 0.3,
"params": 10504344.0,
"flops": 281928834.0,
"performance": 0.9369
},
{
"sparsity": 0.4,
"params": 8263221.0,
"flops": 272964342.0,
"performance": 0.9382
},
{
"sparsity": 0.5,
"params": 6769885.0,
"flops": 249070966.0,
"performance": 0.9388
},
{
"sparsity": 0.6,
"params": 6022137.0,
"flops": 237106998.0,
"performance": 0.9383
},
{
"sparsity": 0.7,
"params": 4526754.0,
"flops": 222152490.0,
"performance": 0.936
},
{
"sparsity": 0.8,
"params": 3032759.0,
"flops": 162401210.0,
"performance": 0.9362
}
],
"SimulatedAnnealingPruner": [
{
"sparsity": 0.1,
"params": 12691704.0,
"flops": 301467870.0,
"performance": 0.9366
},
{
"sparsity": 0.2,
"params": 10318461.0,
"flops": 275724450.0,
"performance": 0.9362
},
{
"sparsity": 0.3,
"params": 8217127.0,
"flops": 246321046.0,
"performance": 0.9371
},
{
"sparsity": 0.4,
"params": 6458368.0,
"flops": 232948294.0,
"performance": 0.9378
},
{
"sparsity": 0.5,
"params": 4973079.0,
"flops": 217675254.0,
"performance": 0.9362
},
{
"sparsity": 0.6,
"params": 3131526.0,
"flops": 151576878.0,
"performance": 0.9347
},
{
"sparsity": 0.7,
"params": 1891036.0,
"flops": 76575574.0,
"performance": 0.9289
},
{
"sparsity": 0.8,
"params": 1170751.0,
"flops": 107532322.0,
"performance": 0.9325
},
{
"sparsity": 0.9,
"params": 365978.0,
"flops": 46241354.0,
"performance": 0.9167
},
{
"sparsity": 0.95,
"params": 167089.0,
"flops": 38589922.0,
"performance": 0.7746
},
{
"sparsity": 0.975,
"params": 96779.0,
"flops": 26838230.0,
"performance": 0.1
}
],
"AutoCompressPruner": [
{
"sparsity": 0.1,
"params": 12460277.0,
"flops": 290311730.0,
"performance": 0.9352
},
{
"sparsity": 0.2,
"params": 10138147.0,
"flops": 269180938.0,
"performance": 0.9324
},
{
"sparsity": 0.3,
"params": 8033350.0,
"flops": 241789714.0,
"performance": 0.9357
},
{
"sparsity": 0.4,
"params": 6105156.0,
"flops": 213573294.0,
"performance": 0.9367
},
{
"sparsity": 0.5,
"params": 4372604.0,
"flops": 185826362.0,
"performance": 0.9387
},
{
"sparsity": 0.6,
"params": 3029629.0,
"flops": 166285498.0,
"performance": 0.9334
},
{
"sparsity": 0.7,
"params": 1897060.0,
"flops": 134897806.0,
"performance": 0.9359
},
{
"sparsity": 0.8,
"params": 1145509.0,
"flops": 111766450.0,
"performance": 0.9334
},
{
"sparsity": 0.9,
"params": 362546.0,
"flops": 50777246.0,
"performance": 0.9261
},
{
"sparsity": 0.95,
"params": 149735.0,
"flops": 39201770.0,
"performance": 0.8924
},
{
"sparsity": 0.975,
"params": 45378.0,
"flops": 13213974.0,
"performance": 0.8193
}
]
}
\ No newline at end of file
import torch
import torch.nn as nn
import torch.nn.functional as F


class BasicBlock(nn.Module):
    expansion = 1

    def __init__(self, in_planes, planes, stride=1):
        super(BasicBlock, self).__init__()
        self.conv1 = nn.Conv2d(
            in_planes, planes, kernel_size=3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3,
                               stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)

        self.shortcut = nn.Sequential()
        if stride != 1 or in_planes != self.expansion*planes:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_planes, self.expansion*planes,
                          kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(self.expansion*planes)
            )

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out += self.shortcut(x)
        out = F.relu(out)
        return out


class Bottleneck(nn.Module):
    expansion = 4

    def __init__(self, in_planes, planes, stride=1):
        super(Bottleneck, self).__init__()
        self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3,
                               stride=stride, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)
        self.conv3 = nn.Conv2d(planes, self.expansion *
                               planes, kernel_size=1, bias=False)
        self.bn3 = nn.BatchNorm2d(self.expansion*planes)

        self.shortcut = nn.Sequential()
        if stride != 1 or in_planes != self.expansion*planes:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_planes, self.expansion*planes,
                          kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(self.expansion*planes)
            )

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = F.relu(self.bn2(self.conv2(out)))
        out = self.bn3(self.conv3(out))
        out += self.shortcut(x)
        out = F.relu(out)
        return out


class ResNet(nn.Module):
    def __init__(self, block, num_blocks, num_classes=10):
        super(ResNet, self).__init__()
        self.in_planes = 64
        # this layer differs from torchvision.resnet18() because this model is adapted for CIFAR-10
        self.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(64)
        self.layer1 = self._make_layer(block, 64, num_blocks[0], stride=1)
        self.layer2 = self._make_layer(block, 128, num_blocks[1], stride=2)
        self.layer3 = self._make_layer(block, 256, num_blocks[2], stride=2)
        self.layer4 = self._make_layer(block, 512, num_blocks[3], stride=2)
        self.linear = nn.Linear(512*block.expansion, num_classes)

    def _make_layer(self, block, planes, num_blocks, stride):
        # only the first block of a stage may downsample; the rest use stride 1
        strides = [stride] + [1]*(num_blocks-1)
        layers = []
        for stride in strides:
            layers.append(block(self.in_planes, planes, stride))
            self.in_planes = planes * block.expansion
        return nn.Sequential(*layers)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.layer1(out)
        out = self.layer2(out)
        out = self.layer3(out)
        out = self.layer4(out)
        out = F.avg_pool2d(out, 4)
        out = out.view(out.size(0), -1)
        out = self.linear(out)
        return out


def ResNet18():
    return ResNet(BasicBlock, [2, 2, 2, 2])


def ResNet34():
    return ResNet(BasicBlock, [3, 4, 6, 3])


def ResNet50():
    return ResNet(Bottleneck, [3, 4, 6, 3])


def ResNet101():
    return ResNet(Bottleneck, [3, 4, 23, 3])


def ResNet152():
    return ResNet(Bottleneck, [3, 8, 36, 3])
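As a sanity check on the model definition above, the trainable parameter count can be re-derived from the layer shapes alone, without importing PyTorch. The sketch below mirrors the structure of `BasicBlock`, `Bottleneck`, and `ResNet` (bias-free convolutions, two learnable tensors per `BatchNorm2d`, one biased linear layer). The function names `conv_params`, `bn_params`, `block_params`, and `resnet_params` are hypothetical helpers introduced for illustration; they are not part of the commit.

```python
def conv_params(c_in, c_out, k):
    # Every convolution in the model uses bias=False.
    return c_in * c_out * k * k


def bn_params(channels):
    # BatchNorm2d learns one weight and one bias per channel.
    return 2 * channels


def block_params(kind, in_planes, planes, stride):
    """Return (parameter count, output channels) for one residual block."""
    expansion = 1 if kind == 'basic' else 4
    out_planes = expansion * planes
    if kind == 'basic':
        total = conv_params(in_planes, planes, 3) + bn_params(planes)
        total += conv_params(planes, planes, 3) + bn_params(planes)
    else:
        total = conv_params(in_planes, planes, 1) + bn_params(planes)
        total += conv_params(planes, planes, 3) + bn_params(planes)
        total += conv_params(planes, out_planes, 1) + bn_params(out_planes)
    # Projection shortcut, added under the same condition as in the model code.
    if stride != 1 or in_planes != out_planes:
        total += conv_params(in_planes, out_planes, 1) + bn_params(out_planes)
    return total, out_planes


def resnet_params(kind, num_blocks, num_classes=10):
    total = conv_params(3, 64, 3) + bn_params(64)  # stem conv1 + bn1
    in_planes = 64
    for planes, n, stride in zip((64, 128, 256, 512), num_blocks, (1, 2, 2, 2)):
        for s in [stride] + [1] * (n - 1):
            count, in_planes = block_params(kind, in_planes, planes, s)
            total += count
    total += in_planes * num_classes + num_classes  # final linear layer
    return total


if __name__ == '__main__':
    print(resnet_params('basic', [2, 2, 2, 2]))       # ResNet18 -> 11173962
    print(resnet_params('bottleneck', [3, 4, 6, 3]))  # ResNet50 -> 23520842
```

Both totals agree with the `params` values recorded for the original `resnet18` and `resnet50` in the plotting script's `references` dictionary.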