# Introduction
FLAVR is a deep learning model for video frame interpolation that converts low-frame-rate video into high-frame-rate video. It processes the low-frame-rate input frame by frame and uses a deep network to infer the missing intermediate frames, producing smoother high-frame-rate output. Compared with traditional interpolation algorithms, FLAVR generates more natural, realistic results and handles complex scenes and motion.
# Test Procedure
## Install the toolkit
PyTorch 1.10 (1.10.0a0+git2040069-dtk2210), with Python 3.7.11 and numpy 1.19.2
## Load environment variables
```
export PATH={PYTHON3_install_dir}/bin:$PATH
export LD_LIBRARY_PATH={PYTHON3_install_dir}/lib:$LD_LIBRARY_PATH
```
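To confirm the environment variables take effect, a quick sanity check (a minimal sketch; assumes the DTK build of PyTorch is already installed under `{PYTHON3_install_dir}`):
```
which python3
python3 -c "import torch; print(torch.__version__)"   # expect 1.10.0a0+git2040069
```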
## Download the dataset
Vimeo-90K septuplet dataset. Note that the link below points to the triplet subset; the septuplet set used for training can be downloaded from the [project page](http://toflow.csail.mit.edu/).
[vimeo_triplet](http://data.csail.mit.edu/tofu/dataset/vimeo_triplet.zip)
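A typical way to fetch and unpack the archive (a minimal sketch using the triplet link above; `<dataset_path>` is the same placeholder used in the training command below, and you would substitute the septuplet archive for septuplet training):
```
# download and extract into the directory later passed as --data_root
wget http://data.csail.mit.edu/tofu/dataset/vimeo_triplet.zip
unzip vimeo_triplet.zip -d <dataset_path>
```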
# Run Commands
## Training on Vimeo-90K septuplets
To train your own model on the Vimeo-90K dataset, use the following command. You can download the dataset from [this link](http://toflow.csail.mit.edu/). At least 2 DCUs are required.
```
python main.py --batch_size 32 --test_batch_size 32 --dataset vimeo90K_septuplet --loss 1*L1 --max_epoch 200 --lr 0.0002 --data_root <dataset_path> --n_outputs 1
```
Training on the GoPro dataset is similar; change `n_outputs` to 7 for 8x interpolation (a sketch follows below).
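A minimal sketch of the 8x variant; the `--dataset gopro` value and the GoPro data layout are assumptions not confirmed by this document, so check the repository's dataset options before running:
```
python main.py --batch_size 32 --test_batch_size 32 --dataset gopro --loss 1*L1 --max_epoch 200 --lr 0.0002 --data_root <gopro_dataset_path> --n_outputs 7
```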
## Download pretrained models
You can download pretrained FLAVR models from the following links (a command-line download sketch follows the table).

| Method | Trained Model |
| ------- | ------------------------------------------------------------ |
| **2x** | [Link](https://drive.google.com/drive/folders/1M6ec7t59exOSlx_Wp6K9_njBlLH2IPBC?usp=sharing) |
| **4x** | [Link](https://drive.google.com/file/d/1btmNm4LkHVO9gjAaKKN9CXf5vP7h4hCy/view?usp=sharing) |
| **8x** | [Link](https://drive.google.com/drive/folders/1Gd2l69j7UC1Zua7StbUNcomAAhmE-xFb?usp=sharing) |
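On a headless machine, the third-party `gdown` tool can fetch these Google Drive links; this is an optional convenience, not part of the original instructions:
```
pip install gdown
# folder links (2x, 8x) use --folder; the 4x file link can be fetched with --fuzzy
gdown --folder "https://drive.google.com/drive/folders/1Gd2l69j7UC1Zua7StbUNcomAAhmE-xFb"
```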
### Evaluation on Middlebury
To evaluate on the public Middlebury benchmark, run the following command.
```
python Middleburry_Test.py --data_root <data_path> --load_from <model_path>
```
# References
[https://github.com/tarun005/FLAVR](https://github.com/tarun005/FLAVR)
# FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation
## WACV 2023 (Best Paper Finalist)
![Eg1](./figures/baloon.gif)
![Eg2](./figures/sprite.gif)
[[project page](https://tarun005.github.io/FLAVR/)] [[paper](https://arxiv.org/pdf/2012.08512.pdf)] [[Project Video](https://youtu.be/HFOY7CGpJRM)]
FLAVR is a fast, flow-free frame interpolation method capable of single-shot multi-frame prediction. It uses a customized encoder-decoder architecture with spatio-temporal convolutions and channel gating (a 3D U-Net with a ResNet-3D encoder, a 3D transposed-convolution decoder, and spatio-temporal feature gating) to capture and interpolate complex motion trajectories between frames and generate realistic high-frame-rate videos. This repository contains the original source code.
## Inference Times
FLAVR delivers a better trade-off between speed and accuracy compared to prior frame interpolation methods.
| Method | FPS on 512x512 Image (sec) |
| ------------- |:-------------:|
| FLAVR | 3.10 |
| SuperSloMo | 3.33 |
| QVI | 1.02 |
| DAIN | 0.77 |
## Dependencies
We used the following to train and test the model (an install sketch follows the list).
- Ubuntu 18.04
- Python==3.7.4
- numpy==1.19.2
- [PyTorch](http://pytorch.org/)==1.5.0, torchvision==0.6.0, cudatoolkit==10.1
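For reference, these versions correspond to an install along the following lines; the conda channel and exact command are an assumption based on the standard PyTorch 1.5.0 install instructions, not taken from this README:
```bash
# historical PyTorch 1.5.0 + CUDA 10.1 install (assumption)
conda install pytorch==1.5.0 torchvision==0.6.0 cudatoolkit=10.1 -c pytorch
pip install numpy==1.19.2
```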
## Model
<center><img src="./figures/arch_dia.png" width="90%"></center>

## Training model on Vimeo-90K septuplets
For training your own model on the Vimeo-90K dataset, use the following command. You can download the dataset from [this link](http://toflow.csail.mit.edu/). The results reported in the paper are trained using 8 GPUs.
```bash
python main.py --batch_size 32 --test_batch_size 32 --dataset vimeo90K_septuplet --loss 1*L1 --max_epoch 200 --lr 0.0002 --data_root <dataset_path> --n_outputs 1
```
Training on the GoPro dataset is similar; change `n_outputs` to 7 for 8x interpolation.
## Testing using trained models
### Trained Models
You can download the pretrained FLAVR models from the following links.
| Method | Trained Model |
| ------------- |:-----|
| **2x** | [Link](https://drive.google.com/drive/folders/1M6ec7t59exOSlx_Wp6K9_njBlLH2IPBC?usp=sharing) |
| **4x** | [Link](https://drive.google.com/file/d/1btmNm4LkHVO9gjAaKKN9CXf5vP7h4hCy/view?usp=sharing) |
| **8x** | [Link](https://drive.google.com/drive/folders/1Gd2l69j7UC1Zua7StbUNcomAAhmE-xFb?usp=sharing) |
### 2x Interpolation
For testing a pretrained model on the Vimeo-90K septuplet validation set, you can run the following command:
```bash
python test.py --dataset vimeo90K_septuplet --data_root <data_path> --load_from <saved_model> --n_outputs 1
```
### 8x Interpolation
For testing a multi-frame interpolation model, use the same command as above with a multi-frame FLAVR model and `n_outputs` changed accordingly (a sketch follows below).
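For example, an 8x model predicts 7 intermediate frames, so the call would plausibly look like this (the checkpoint path is a placeholder):
```bash
python test.py --dataset vimeo90K_septuplet --data_root <data_path> --load_from <saved_8x_model> --n_outputs 7
```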
### Time Benchmarking
The testing script, in addition to computing PSNR and SSIM values, will also output the inference time and speed for interpolation.
### Evaluation on Middlebury
To evaluate on the public Middlebury benchmark, run the following.
```bash
python Middleburry_Test.py --data_root <data_path> --load_from <model_path>
```
The interpolated images will be saved to the folder `Middleburry` in a format that can be readily uploaded to the [leaderboard](https://vision.middlebury.edu/flow/eval/results/results-i2.php).
## SloMo-Filter on custom video
You can use our trained models to apply the slo-mo filter to your own video (requires OpenCV 4.2.0). For example, to convert a 30 FPS video to 240 FPS, simply use the command
```bash
python interpolate.py --input_video <input_video> --factor 8 --load_model <model_path>
```
with our [pretrained model](https://drive.google.com/drive/folders/1Gd2l69j7UC1Zua7StbUNcomAAhmE-xFb?usp=sharing) for 8x interpolation. For converting a 30 FPS video to 60 FPS, use a 2x model with `factor` 2 (a sketch follows below).
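Concretely, the 2x case would look like this; the checkpoint path is a placeholder for a downloaded 2x model:
```bash
python interpolate.py --input_video <input_video> --factor 2 --load_model <2x_model_path>
```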
## Baseline Models
We also trained models for many previous methods in our setting and provide trained models for all of them. Complete benchmarking scripts will also be released soon.
| Method | PSNR on Vimeo | Trained Model |
| ------------- |:-------------:| -----:|
| FLAVR | 36.3 | [Model](https://drive.google.com/drive/folders/1M6ec7t59exOSlx_Wp6K9_njBlLH2IPBC?usp=sharing) |
| AdaCoF | 35.3 | [Model](https://drive.google.com/file/d/19Y2TDZkSbRgNu-OItvqk3qn5cBWGg1RT/view?usp=sharing) |
| QVI* | 35.15 | [Model](https://drive.google.com/file/d/1v2u5diGcvdTLhck8Xwu0baI4zm0JBJhI/view?usp=sharing) |
| DAIN | 34.19 | [Model](https://drive.google.com/file/d/1RfrwrHoSX_3RIdsoQgPg9IfGAJRhOoEp/view?usp=sharing) |
| SuperSloMo* | 32.90 | [Model](https://drive.google.com/file/d/1dR2at5DQO7w5s2tA5stC95Nmu_ezsPth/view?usp=sharing) |
* SuperSloMo is implemented using the code repository from [here](https://github.com/avinashpaliwal/Super-SloMo); other baselines are implemented using the official codebases.
* The numbers presented here for the baselines are slightly better than those reported in the paper.
## Google Colab
A Colab notebook to try 2x slow-motion filtering on custom videos is available in the *notebooks* directory of this repo.
## Model for Motion-Magnification
Unfortunately, we cannot provide the trained models for motion-magnification at this time. We are working towards making a model available soon.
## Acknowledgement
The code is heavily borrowed from Facebook's official [PyTorch video repository](https://github.com/facebookresearch/VMZ) and [CAIN](https://github.com/myungsub/CAIN).
## Cite
If this code helps in your work, please consider citing us.
``` text
@article{kalluri2021flavr,
title={FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation},
author={Kalluri, Tarun and Pathak, Deepak and Chandraker, Manmohan and Tran, Du},
  journal={arXiv preprint arXiv:2012.08512},
year={2021}
}
```