# MobileNetV3 An implementation of MobileNetV3 with pyTorch # Theory  You can find the paper of MobileNetV3 at [Searching for MobileNetV3](https://arxiv.org/abs/1905.02244). # Prepare data * CIFAR-10 * CIFAR-100 * SVHN * Tiny-ImageNet * ImageNet: Please move validation images to labeled subfolders, you can use the script [here](https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh). # Train * Train from scratch: ``` CUDA_VISIBLE_DEVICES=3 python train.py --batch-size=128 --mode=small \ --print-freq=100 --dataset=CIFAR100 --ema-decay=0 --label-smoothing=0.1 \ --lr=0.3 --save-epoch-freq=1000 --lr-decay=cos --lr-min=0 \ --warmup-epochs=5 --weight-decay=6e-5 --num-epochs=200 --width-multiplier=1 \ -nbd -zero-gamma -mixup ``` where the meaning of the parameters are as followed: ``` batch-size mode: using MobileNetV3-Small(if set to small) or MobileNetV3-Large(if set to large). dataset: which dataset to use(CIFAR10, CIFAR100, SVHN, TinyImageNet or ImageNet). ema-decay: decay of EMA, if set to 0, do not use EMA. label-smoothing: $epsilon$ using in label smoothing, if set to 0, do not use label smoothing. lr-decay: learning rate decay schedule, step or cos. lr-min: min lr in cos lr decay. warmup-epochs: warmup epochs using in cos lr deacy. num-epochs: total training epochs. nbd: no bias decay. zero-gamma: zero $gamma$ of last BN in each block. mixup: using Mixup. ``` # Pretrained models  We have provided the pretrained MobileNetV3-Small model in `pretrained`. # Experiments ## Training setting ### on ImageNet ``` CUDA_VISIBLE_DEVICES=5 python train.py --batch-size=128 --mode=small --print-freq=2000 --dataset=imagenet \ --ema-decay=0.99 --label-smoothing=0.1 --lr=0.1 --save-epoch-freq=50 --lr-decay=cos --lr-min=0 --warmup-epochs=5 \ --weight-decay=1e-5 --num-epochs=250 --num-workers=2 --width-multiplier=1 -dali -nbd -mixup -zero-gamma -save ``` ### on CIFAR-10 ``` CUDA_VISIBLE_DEVICES=1 python train.py --batch-size=128 --mode=small --print-freq=100 --dataset=CIFAR10\ --ema-decay=0 --label-smoothing=0 --lr=0.35 --save-epoch-freq=1000 --lr-decay=cos --lr-min=0\ --warmup-epochs=5 --weight-decay=6e-5 --num-epochs=400 --num-workers=2 --width-multiplier=1 ``` ### on CIFAR-100 ``` CUDA_VISIBLE_DEVICES=1 python train.py --batch-size=128 --mode=small --print-freq=100 --dataset=CIFAR100\ --ema-decay=0 --label-smoothing=0 --lr=0.35 --save-epoch-freq=1000 --lr-decay=cos --lr-min=0\ --warmup-epochs=5 --weight-decay=6e-5 --num-epochs=400 --num-workers=2 --width-multiplier=1 ```  Using more tricks: ``` CUDA_VISIBLE_DEVICES=1 python train.py --batch-size=128 --mode=small --print-freq=100 --dataset=CIFAR100\ --ema-decay=0.999 --label-smoothing=0.1 --lr=0.35 --save-epoch-freq=1000 --lr-decay=cos --lr-min=0\ --warmup-epochs=5 --weight-decay=6e-5 --num-epochs=400 --num-workers=2 --width-multiplier=1\ -zero-gamma -nbd -mixup ``` ### on SVHN ``` CUDA_VISIBLE_DEVICES=3 python train.py --batch-size=128 --mode=small --print-freq=1000 --dataset=SVHN\ --ema-decay=0 --label-smoothing=0 --lr=0.35 --save-epoch-freq=1000 --lr-decay=cos --lr-min=0\ --warmup-epochs=5 --weight-decay=6e-5 --num-epochs=20 --num-workers=2 --width-multiplier=1 ``` ### on Tiny-ImageNet ``` CUDA_VISIBLE_DEVICES=7 python train.py --batch-size=128 --mode=small --print-freq=100 --dataset=tinyimagenet\ --data-dir=/media/data2/chenjiarong/ImageData/tiny-imagenet --ema-decay=0 --label-smoothing=0 --lr=0.15\ --save-epoch-freq=1000 --lr-decay=cos --lr-min=0 --warmup-epochs=5 --weight-decay=6e-5 --num-epochs=200\ --num-workers=2 --width-multiplier=1 -dali ```  Using more tricks: ``` CUDA_VISIBLE_DEVICES=7 python train.py --batch-size=128 --mode=small --print-freq=100 --dataset=tinyimagenet\ --data-dir=/media/data2/chenjiarong/ImageData/tiny-imagenet --ema-decay=0.999 --label-smoothing=0.1 --lr=0.15\ --save-epoch-freq=1000 --lr-decay=cos --lr-min=0 --warmup-epochs=5 --weight-decay=6e-5 --num-epochs=200\ --num-workers=2 --width-multiplier=1 -dali -nbd -mixup ``` ## MobileNetV3-Large ### on ImageNet | | Madds | Parameters | Top1-acc | Top5-acc | | ----------- | --------- | ---------- | --------- | --------- | | Offical 1.0 | 219 M | 5.4 M | 75.2% | - | | Ours 1.0 | 216.6 M | 5.47 M | - | - | ### on CIFAR-10 | | Madds | Parameters | Top1-acc | Top5-acc | | ----------- | --------- | ---------- | --------- | --------- | | Ours 1.0 | 66.47 M | 4.21 M | - | - | ### on CIFAR-100 | | Madds | Parameters | Top1-acc | Top5-acc | | ----------- | --------- | ---------- | --------- | --------- | | Ours 1.0 | 66.58 M | 4.32 M | - | - | ## MobileNetV3-Small ### on ImageNet | | Madds | Parameters | Top1-acc | Top5-acc | | ----------- | --------- | ---------- | --------- | --------- | | Offical 1.0 | 56.5 M | 2.53 M | 67.4% | - | | Ours 1.0 | 56.51 M | 2.53 M | 67.52% | 87.58% |  The pretrained model with top-1 accuracy 67.52% is provided in the folder [pretrained](https://github.com/ShowLo/MobileNetV3/tree/master/pretrained). ### on CIFAR-10 (Average accuracy of 5 runs) | | Madds | Parameters | Top1-acc | Top5-acc | | ----------- | --------- | ---------- | --------- | --------- | | Ours 1.0 | 17.51 M | 1.52 M | 92.97% | - | ### on CIFAR-100 (Average accuracy of 5 runs) | | Madds | Parameters | Top1-acc | Top5-acc | | ----------- | --------- | ---------- | --------- | --------- | | Ours 1.0 | 17.60 M | 1.61 M | 73.69% | 92.31% | | More Tricks | same | same | 76.24% | 92.58% | ### on SVHN (Average accuracy of 5 runs) | | Madds | Parameters | Top1-acc | Top5-acc | | ----------- | --------- | ---------- | --------- | --------- | | Ours 1.0 | 17.51 M | 1.52 M | 97.92% | - | ### on Tiny-ImageNet (Average accuracy of 5 runs) | | Madds | Parameters | Top1-acc | Top5-acc | | ----------- | --------- | ---------- | --------- | --------- | | Ours 1.0 | 51.63 M | 1.71 M | 59.32% | 81.38% | | More Tricks | same | same | 62.62% | 84.04% | ## Dependency  This project uses Python 3.7 and PyTorch 1.1.0. The FLOPs and Parameters and measured using [torchsummaryX](https://github.com/nmhkahn/torchsummaryX).