# MaskFeat Pre-training with Video

- [MaskFeat Pre-training with Video](#maskfeat-pre-training-with-video)
  - [Description](#description)
  - [Usage](#usage)
    - [Setup Environment](#setup-environment)
    - [Data Preparation](#data-preparation)
    - [Pre-training Commands](#pre-training-commands)
      - [On Local Single GPU](#on-local-single-gpu)
      - [On Multiple GPUs](#on-multiple-gpus)
      - [On Multiple GPUs with Slurm](#on-multiple-gpus-with-slurm)
    - [Downstream Tasks Commands](#downstream-tasks-commands)
      - [On Multiple GPUs](#on-multiple-gpus-1)
      - [On Multiple GPUs with Slurm](#on-multiple-gpus-with-slurm-1)
  - [Results](#results)
  - [Citation](#citation)
  - [Checklist](#checklist)

## Description

Author: @fangyixiao18

This is the implementation of **MaskFeat** with a video dataset, such as Kinetics400.

## Usage

### Setup Environment

Requirements:

- MMPretrain >= 1.0.0rc0
- MMAction2 >= 1.0.0rc3

Please refer to the [Get Started](https://mmpretrain.readthedocs.io/en/latest/get_started.html) documentation of MMPretrain to finish installation. Besides, to process the video data, we apply transforms from MMAction2. The instructions to install MMAction2 can be found in its [Get Started documentation](https://mmaction2.readthedocs.io/en/1.x/get_started.html).

### Data Preparation

You can refer to the [documentation](https://mmaction2.readthedocs.io/en/1.x/user_guides/2_data_prepare.html) in MMAction2.

### Pre-training Commands

First, add the current folder to `PYTHONPATH` so that Python can find your model files. In the `projects/maskfeat_video/` root directory, run the command below to add it.
```shell
export PYTHONPATH=`pwd`:$PYTHONPATH
```

Then run the following commands to train the model:

#### On Local Single GPU

```bash
# train with mim
mim train mmpretrain ${CONFIG} --work-dir ${WORK_DIR}

# a specific command example
mim train mmpretrain configs/maskfeat_mvit-small_8xb32-amp-coslr-300e_k400.py \
    --work-dir work_dirs/selfsup/maskfeat_mvit-small_8xb32-amp-coslr-300e_k400/

# train with scripts
python tools/train.py configs/maskfeat_mvit-small_8xb32-amp-coslr-300e_k400.py \
    --work-dir work_dirs/selfsup/maskfeat_mvit-small_8xb32-amp-coslr-300e_k400/
```

#### On Multiple GPUs

```bash
# train with mim
# a specific command example, 8 GPUs here
mim train mmpretrain configs/maskfeat_mvit-small_8xb32-amp-coslr-300e_k400.py \
    --work-dir work_dirs/selfsup/maskfeat_mvit-small_8xb32-amp-coslr-300e_k400/ \
    --launcher pytorch --gpus 8

# train with scripts
bash tools/dist_train.sh configs/maskfeat_mvit-small_8xb32-amp-coslr-300e_k400.py 8
```

Note:

- CONFIG: the config files under the directory `configs/`
- WORK_DIR: the working directory to save configs, logs, and checkpoints

#### On Multiple GPUs with Slurm

```bash
# train with mim
mim train mmpretrain configs/maskfeat_mvit-small_16xb32-amp-coslr-300e_k400.py \
    --work-dir work_dirs/selfsup/maskfeat_mvit-small_16xb32-amp-coslr-300e_k400/ \
    --launcher slurm --gpus 16 --gpus-per-node 8 \
    --partition ${PARTITION}

# train with scripts
GPUS_PER_NODE=8 GPUS=16 bash tools/slurm_train.sh ${PARTITION} maskfeat-video \
    configs/maskfeat_mvit-small_16xb32-amp-coslr-300e_k400.py \
    --work-dir work_dirs/selfsup/maskfeat_mvit-small_16xb32-amp-coslr-300e_k400/
```

Note:

- CONFIG: the config files under the directory `configs/`
- WORK_DIR: the working directory to save configs, logs, and checkpoints
- PARTITION: the slurm partition you are using

### Downstream Tasks Commands

To evaluate the **MaskFeat MViT** pretrained with MMPretrain, we recommend running it with MMAction2:

#### On Multiple GPUs

```bash
# command example for training
mim train mmaction2 ${CONFIG} \
    --work-dir ${WORK_DIR} \
    --launcher pytorch --gpus 8 \
    --cfg-options model.backbone.init_cfg.type=Pretrained \
    model.backbone.init_cfg.checkpoint=${CHECKPOINT} \
    model.backbone.init_cfg.prefix="backbone." \
    ${PY_ARGS} [optional args]

mim train mmaction2 configs/mvit-small_ft-8xb8-coslr-100e_k400.py \
    --work-dir work_dirs/benchmarks/maskfeat/training_maskfeat-mvit-k400/ \
    --launcher pytorch --gpus 8 \
    --cfg-options model.backbone.init_cfg.type=Pretrained \
    model.backbone.init_cfg.checkpoint=https://download.openmmlab.com/mmselfsup/1.x/maskfeat/maskfeat_mvit-small_16xb32-amp-coslr-300e_k400/maskfeat_mvit-small_16xb32-amp-coslr-300e_k400_20230131-87d60b6f.pth \
    model.backbone.init_cfg.prefix="backbone." \
    $PY_ARGS

# command example for test
mim test mmaction2 configs/mvit-small_ft-8xb16-coslr-100e_k400.py \
    --checkpoint https://download.openmmlab.com/mmselfsup/1.x/maskfeat/maskfeat_mvit-small_16xb32-amp-coslr-300e_k400/mvit-small_ft-8xb16-coslr-100e_k400/mvit-small_ft-8xb16-coslr-100e_k400_20230131-5e8303f5.pth \
    --work-dir work_dirs/benchmarks/maskfeat/maskfeat-mvit-k400/test/ \
    --launcher pytorch --gpus 8
```

#### On Multiple GPUs with Slurm

```bash
mim train mmaction2 ${CONFIG} \
    --work-dir ${WORK_DIR} \
    --launcher slurm --gpus 8 --gpus-per-node 8 \
    --partition ${PARTITION} \
    --cfg-options model.backbone.init_cfg.type=Pretrained \
    model.backbone.init_cfg.checkpoint=$CHECKPOINT \
    model.backbone.init_cfg.prefix="backbone." \
    $PY_ARGS

mim test mmaction2 ${CONFIG} \
    --checkpoint https://download.openmmlab.com/mmselfsup/1.x/maskfeat/maskfeat_mvit-small_16xb32-amp-coslr-300e_k400/mvit-small_ft-8xb16-coslr-100e_k400/mvit-small_ft-8xb16-coslr-100e_k400_20230131-5e8303f5.pth \
    --work-dir ${WORK_DIR} \
    --launcher slurm --gpus 8 --gpus-per-node 8 \
    --partition ${PARTITION} \
    $PY_ARGS
```

Note:

- CONFIG: the config files under the directory `configs/`
- WORK_DIR: the working directory to save configs, logs, and checkpoints
- PARTITION: the slurm partition you are using
- CHECKPOINT: the pretrained checkpoint of MMPretrain saved in the working directory, like `$WORK_DIR/epoch_300.pth`
- PY_ARGS: other optional args

## Results

The fine-tuning results are based on the Kinetics400 (K400) dataset. Due to differences between K400 versions, our pre-training, fine-tuning, and final test results are based on the MMAction2 version of the dataset, which differs slightly from the PySlowFast version.
| Algorithm | Backbone | Epoch | Batch Size | Fine-tuning (test accuracy %) | Pretrain Links | Fine-tuning Links |
| :------- | :--------- | :---: | :--------: | :---------------------------: | :--------------------- | :--------------------- |
| MaskFeat | MViT-small | 300 | 512 | 81.8 | config \| model \| log | config \| model \| log |
Remarks:

- We converted the pretrained model from PySlowFast and ran fine-tuning with MMAction2; based on the MMAction2 version of K400, we got `81.5` test accuracy. The pretrained model from MMPretrain got `81.8`, as provided above.
- We also tested our model on [another version](https://github.com/facebookresearch/video-nonlocal-net/blob/main/DATASET.md) of K400 and got `82.1` test accuracy.
- Some other details can be found on the [MMAction2 MViT page](https://github.com/open-mmlab/mmaction2/tree/dev-1.x/configs/recognition/mvit).

## Citation

```bibtex
@InProceedings{wei2022masked,
  author    = {Wei, Chen and Fan, Haoqi and Xie, Saining and Wu, Chao-Yuan and Yuille, Alan and Feichtenhofer, Christoph},
  title     = {Masked Feature Prediction for Self-Supervised Visual Pre-Training},
  booktitle = {CVPR},
  year      = {2022},
}
```

## Checklist

Here is a checklist illustrating a usual development workflow of a successful project, which also serves as an overview of this project's progress.

- [x] Milestone 1: PR-ready, and acceptable to be one of the `projects/`.
  - [x] Finish the code
  - [x] Basic docstrings & proper citation
  - [x] Inference correctness
  - [x] A full README
- [x] Milestone 2: Indicates a successful model implementation.
  - [x] Training-time correctness
- [ ] Milestone 3: Good to be a part of our core package!
  - [ ] Type hints and docstrings
  - [ ] Unit tests
  - [ ] Code polishing
  - [ ] `metafile.yml` and `README.md`
  - [ ] Refactor and move your modules into the core package following the codebase's file hierarchy structure.
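A closing note on the downstream commands above: `model.backbone.init_cfg.prefix="backbone."` tells the weight initializer to keep only the checkpoint entries stored under that prefix, since the self-supervised checkpoint also contains pre-training-only modules the recognizer does not need. Below is a minimal sketch of that prefix-selection idea, with hypothetical keys and a hypothetical `select_by_prefix` helper for illustration, not MMPretrain's actual loader:

```python
# Sketch: keep only the sub-module weights stored under a given prefix,
# mimicking the effect of `init_cfg.prefix="backbone."` when a downstream
# model is initialized from a self-supervised checkpoint.
# The keys below are hypothetical examples, not the real state dict.

def select_by_prefix(state_dict, prefix="backbone."):
    """Keep entries starting with `prefix` and strip the prefix itself."""
    return {k[len(prefix):]: v for k, v in state_dict.items()
            if k.startswith(prefix)}

checkpoint = {
    "backbone.patch_embed.proj.weight": "...",  # wanted by the recognizer
    "backbone.blocks.0.attn.qkv.weight": "...",
    "neck.decoder.weight": "...",               # pre-training-only module
}
backbone_weights = select_by_prefix(checkpoint)
print(sorted(backbone_weights))
# ['blocks.0.attn.qkv.weight', 'patch_embed.proj.weight']
```

The surviving keys then match the recognizer backbone's own parameter names, which is why the fine-tuning configs can load the pre-training checkpoint without renaming anything by hand.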