Commit a75d2bda authored by mashun1

evtexture

.eggs
*.egg-info
__pycache__
# datasets
datasets/images
datasets/zbl*
datasets/*h5
datasets/y10
datasets/*.mp4
datasets/*.avi
datasets/videos
datasets/*.zip
*.log
experiments/
others/rpg_vid2e/esim_py/build
# others/rpg_vid2e/esim_py/dist
others/rpg_vid2e/esim_py/esim_py.egg-info
results/
basicsr/version.py
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
# EvTexture
## Paper
**EvTexture: Event-driven Texture Enhancement for Video Super-Resolution**
* https://arxiv.org/abs/2406.13457
## Model Structure
EvTexture adopts a bidirectional recurrent network in which features are propagated forward and backward.
![alt text](readme_imgs/models.png)
## Algorithm
At each timestamp, a motion branch and a parallel texture branch are used to explicitly enhance the restoration of texture regions.
![alt text](readme_imgs/alg.png)
## Environment Setup
### Docker (Method 1)
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dtk24.04.1-py3.10
docker run --shm-size 50g --network=host --name=evtexture --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v <absolute path to this project>:/home/ -v /opt/hyhal:/opt/hyhal:ro -it <your IMAGE ID> bash
pip install -r requirements.txt
python setup.py develop
### Dockerfile (Method 2)
docker build -t <IMAGE_NAME>:<TAG> .
docker run --shm-size 50g --network=host --name=evtexture --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v <absolute path to this project>:/home/ -v /opt/hyhal:/opt/hyhal:ro -it <your IMAGE ID> bash
pip install -r requirements.txt
python setup.py develop
### Anaconda (Method 3)
1. The special deep learning libraries required by this project for DCU cards can be downloaded and installed from the developer community (光合开发者社区):
https://developer.hpccube.com/tool/
DTK driver: dtk24.04.1
python: python3.10
torch: 2.1.0
torchvision: 0.16.0
Tips: the versions of the DTK driver, python, torch and the other DCU-related tools listed above must match one another exactly.
2. Install the other, non-special libraries according to requirements.txt:
pip install -r requirements.txt
python setup.py develop
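A quick, illustrative way to confirm that the installed versions match the table above (not part of the official setup; on DCU, PyTorch is built against ROCm/HIP, so the `torch.cuda` API reports the device):
```python
import torch
import torchvision

print('torch:', torch.__version__)              # expected: 2.1.0
print('torchvision:', torchvision.__version__)  # expected: 0.16.0

# DCU devices are exposed through PyTorch's ROCm/HIP backend.
print('device available:', torch.cuda.is_available())
if torch.cuda.is_available():
    print('device name:', torch.cuda.get_device_name(0))
```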
## Dataset
[OneDrive](https://1drv.ms/f/c/2d90e71fb9eb254f/EnMm8c2mP_FPv6lwt1jy01YB6bQhoPQ25vtzAhycYisERw?e=DiI2Ab) | [SCNet](http://113.200.138.88:18080/aidatasets/project-dependency/evtexture) (high-speed download channel)
### Data Processing
This part only outlines the overall pipeline; for details, see [DataPreparation](datasets/DataPreparation.md).
- Convert the original videos to images (see `datasets/utils/video_to_img.py`);
- Convert the images to HWC-format numpy arrays and store them in an HDF5 structure under `/path/to/xx.h5/images`;
- Convert the videos to events, both forward and backward, stored under `/path/to/xx.h5/voxels_f` and `/path/to/xx.h5/voxels_b` respectively (a minimal h5py sketch of this layout is given at the end of this subsection).
The above steps can be found in `datasets/dataproparation.py`; note that this script is not the official version, so check it carefully before use. Run the following commands to prepare the corresponding environment.
python -m pip install pybind11
export CMAKE_PREFIX_PATH=/usr/local/lib/python3.10/site-packages/pybind11/share/cmake/pybind11/:$CMAKE_PREFIX_PATH
sudo apt update
sudo apt install libopencv-dev
export CMAKE_PREFIX_PATH=/usr/lib/x86_64-linux-gnu/cmake/opencv4/:$CMAKE_PREFIX_PATH
Download eigen3 from https://gitlab.com/libeigen/eigen/-/releases/3.4.0 (the zip archive)
unzip /path/to/eigen3.zip
cd /path/to/eigen3
mkdir build && cd build
cmake .. && make
make install
# Install boost
sudo apt install build-essential libboost-system-dev libboost-thread-dev libboost-program-options-dev libboost-test-dev
sudo apt install libboost-all-dev
cd others/rpg_vid2e/esim_py && python setup.py install
Note: after the data has been processed or downloaded, create the corresponding files following the format of the files in `basicsr/data/meta_info`.
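As a minimal illustration of the HDF5 layout described above (`images`, `voxels_f`, `voxels_b`, each keyed by a zero-padded frame index), a file can be assembled with `h5py` roughly as follows. This is only a sketch with placeholder arrays, not the official preparation script; shapes and dtypes follow the layout described in this section.
```python
import h5py
import numpy as np

# Placeholder data: 5 HWC frames and 4 forward/backward voxel grids with 5 bins each.
frames   = [np.random.randint(0, 256, (180, 320, 3), dtype=np.uint8) for _ in range(5)]
voxels_f = [np.random.randn(5, 180, 320).astype(np.float32) for _ in range(4)]
voxels_b = [np.random.randn(5, 180, 320).astype(np.float32) for _ in range(4)]

with h5py.File('example.h5', 'w') as f:
    for i, img in enumerate(frames):
        f.create_dataset(f'images/{i:06d}', data=img)        # [H, W, C]
    for i, (vf, vb) in enumerate(zip(voxels_f, voxels_b)):
        f.create_dataset(f'voxels_f/{i:06d}', data=vf)       # [Bins, H, W]
        f.create_dataset(f'voxels_b/{i:06d}', data=vb)       # [Bins, H, W]
```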
## Inference
export HIP_VISIBLE_DEVICES=0
# 1 is the number of DCUs to use
bash scripts/dist_test.sh 1 options/test/EvTexture/test_EvTexture_REDS4_BIx4.yml
bash scripts/dist_test.sh 1 options/test/EvTexture/test_EvTexture_Vid4_BIx4.yml
## Results
Original video
<video width="320" height="240" controls>
<source src="readme_imgs/ori_10y.avi" type="video/avi">
Your browser does not support the video tag.
</video>
Restored video
<video width="320" height="240" controls>
<source src="readme_imgs/vid4_10y.avi" type="video/avi">
Your browser does not support the video tag.
</video>
### Accuracy
## Application Scenarios
### Algorithm Category
`Video Super-Resolution`
### Key Application Industries
`Security, Media, Environment`
## Pretrained Weights
[OneDrive](https://1drv.ms/f/c/2d90e71fb9eb254f/EnMm8c2mP_FPv6lwt1jy01YB6bQhoPQ25vtzAhycYisERw?e=DiI2Ab) | [SCNet](http://113.200.138.88:18080/aimodels/findsource-dependency/evtexture) (high-speed download channel)
experiments/
└── pretrained_models
└── EvTexture
├── EvTexture_REDS_BIx4.pth
└── EvTexture_Vimeo90K_BIx4.pth
## Source Repository and Issue Feedback
* https://developer.hpccube.com/codes/modelzoo/evtexture_pytorch
## References
* https://github.com/DachunKai/EvTexture/tree/main
# [EvTexture (ICML 2024)](https://icml.cc/virtual/2024/poster/34032)
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/evtexture-event-driven-texture-enhancement/video-super-resolution-on-vid4-4x-upscaling)](https://paperswithcode.com/sota/video-super-resolution-on-vid4-4x-upscaling?p=evtexture-event-driven-texture-enhancement)
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/evtexture-event-driven-texture-enhancement/video-super-resolution-on-reds4-4x-upscaling)](https://paperswithcode.com/sota/video-super-resolution-on-reds4-4x-upscaling?p=evtexture-event-driven-texture-enhancement)
Official Pytorch implementation for the "EvTexture: Event-driven Texture Enhancement for Video Super-Resolution" paper (ICML 2024).
<p align="center">
🌐 <a href="https://dachunkai.github.io/evtexture.github.io/" target="_blank">Project</a> | 📃 <a href="https://arxiv.org/abs/2406.13457" target="_blank">Paper</a> | 🖼️ <a href="https://docs.google.com/presentation/d/1nbDb39TFb374DzBwdz5v20kIREUA0nBH/edit?usp=sharing" target="_blank">Poster</a> <br>
</p>
**Authors**: [Dachun Kai](https://github.com/DachunKai/)<sup>[:email:️](mailto:dachunkai@mail.ustc.edu.cn)</sup>, Jiayao Lu, [Yueyi Zhang](https://scholar.google.com.hk/citations?user=LatWlFAAAAAJ&hl=zh-CN&oi=ao)<sup>[:email:️](mailto:zhyuey@ustc.edu.cn)</sup>, [Xiaoyan Sun](https://scholar.google.com/citations?user=VRG3dw4AAAAJ&hl=zh-CN), *University of Science and Technology of China*
**Feel free to ask questions. If our work helps, please don't hesitate to give us a :star:!**
## :rocket: News
- [ ] Release training code
- [x] 2024/07/02: Release the colab file for a quick test
- [x] 2024/06/28: Release details to prepare datasets
- [x] 2024/06/08: Publish docker image
- [x] 2024/06/08: Release pretrained models and test sets for quick testing
- [x] 2024/06/07: Video demos released
- [x] 2024/05/25: Initialize the repository
- [x] 2024/05/02: :tada: :tada: Our paper was accepted in ICML'2024
## :bookmark: Table of Content
1. [Video Demos](#video-demos)
2. [Code](#code)
3. [Citation](#citation)
4. [Contact](#contact)
5. [License and Acknowledgement](#license-and-acknowledgement)
## :fire: Video Demos
$4\times$ upsampling results on the [Vid4](https://paperswithcode.com/sota/video-super-resolution-on-vid4-4x-upscaling) and [REDS4](https://paperswithcode.com/dataset/reds) test sets.
https://github.com/DachunKai/EvTexture/assets/66354783/fcf48952-ea48-491c-a4fb-002bb2d04ad3
https://github.com/DachunKai/EvTexture/assets/66354783/ea3dd475-ba8f-411f-883d-385a5fdf7ff6
https://github.com/DachunKai/EvTexture/assets/66354783/e1e6b340-64b3-4d94-90ee-54f025f255fb
https://github.com/DachunKai/EvTexture/assets/66354783/01880c40-147b-4c02-8789-ced0c1bff9c4
## Code
### Installation
* Dependencies: [Miniconda](https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh), [CUDA Toolkit 11.1.1](https://developer.nvidia.com/cuda-11.1.1-download-archive), [torch 1.10.2+cu111](https://download.pytorch.org/whl/cu111/torch-1.10.2%2Bcu111-cp37-cp37m-linux_x86_64.whl), and [torchvision 0.11.3+cu111](https://download.pytorch.org/whl/cu111/torchvision-0.11.3%2Bcu111-cp37-cp37m-linux_x86_64.whl).
* Run in Conda
```bash
conda create -y -n evtexture python=3.7
conda activate evtexture
pip install torch-1.10.2+cu111-cp37-cp37m-linux_x86_64.whl
pip install torchvision-0.11.3+cu111-cp37-cp37m-linux_x86_64.whl
git clone https://github.com/DachunKai/EvTexture.git
cd EvTexture && pip install -r requirements.txt && python setup.py develop
```
* Run in Docker :clap:
Note: before running the Docker image, make sure to install nvidia-docker by following the [official instructions](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html).
[Option 1] Directly pull the published Docker image we have provided from [Alibaba Cloud](https://cr.console.aliyun.com/cn-hangzhou/instances).
```bash
docker pull registry.cn-hangzhou.aliyuncs.com/dachunkai/evtexture:latest
```
[Option 2] We also provide a [Dockerfile](https://github.com/DachunKai/EvTexture/blob/main/docker/Dockerfile) that you can use to build the image yourself.
```bash
cd EvTexture && docker build -t evtexture ./docker
```
The pulled or self-built Docker image contains a complete conda environment named `evtexture`. After running the image, you can mount your data and operate within this environment.
```bash
source activate evtexture && cd EvTexture && python setup.py develop
```
### Test
1. Download the pretrained models from ([Releases](https://github.com/DachunKai/EvTexture/releases) / [Onedrive](https://1drv.ms/f/c/2d90e71fb9eb254f/EnMm8c2mP_FPv6lwt1jy01YB6bQhoPQ25vtzAhycYisERw?e=DiI2Ab) / [Google Drive](https://drive.google.com/drive/folders/1oqOAZbroYW-yfyzIbLYPMJ2ZQmaaCXKy?usp=sharing) / [Baidu Cloud](https://pan.baidu.com/s/161bfWZGVH1UBCCka93ImqQ?pwd=n8hg)(n8hg)) and place them in `experiments/pretrained_models/EvTexture/`. The network architecture code is in [evtexture_arch.py](https://github.com/DachunKai/EvTexture/blob/main/basicsr/archs/evtexture_arch.py).
* *EvTexture_REDS_BIx4.pth*: trained on REDS dataset with BI degradation for $4\times$ SR scale.
* *EvTexture_Vimeo90K_BIx4.pth*: trained on Vimeo-90K dataset with BI degradation for $4\times$ SR scale.
2. Download the preprocessed test sets (including events) for REDS4 and Vid4 from ([Releases](https://github.com/DachunKai/EvTexture/releases) / [Onedrive](https://1drv.ms/f/c/2d90e71fb9eb254f/EnMm8c2mP_FPv6lwt1jy01YB6bQhoPQ25vtzAhycYisERw?e=DiI2Ab) / [Google Drive](https://drive.google.com/drive/folders/1oqOAZbroYW-yfyzIbLYPMJ2ZQmaaCXKy?usp=sharing) / [Baidu Cloud](https://pan.baidu.com/s/161bfWZGVH1UBCCka93ImqQ?pwd=n8hg)(n8hg)), and place them in `datasets/`.
* *Vid4_h5*: HDF5 files containing preprocessed test datasets for Vid4.
* *REDS4_h5*: HDF5 files containing preprocessed test datasets for REDS4.
3. Run the following command:
* Test on Vid4 for 4x VSR:
```bash
./scripts/dist_test.sh [num_gpus] options/test/EvTexture/test_EvTexture_Vid4_BIx4.yml
```
* Test on REDS4 for 4x VSR:
```bash
./scripts/dist_test.sh [num_gpus] options/test/EvTexture/test_EvTexture_REDS4_BIx4.yml
```
This will generate the inference results in `results/`. The output results on REDS4 and Vid4 can be downloaded from ([Releases](https://github.com/DachunKai/EvTexture/releases) / [Onedrive](https://1drv.ms/f/c/2d90e71fb9eb254f/EnMm8c2mP_FPv6lwt1jy01YB6bQhoPQ25vtzAhycYisERw?e=DiI2Ab) / [Google Drive](https://drive.google.com/drive/folders/1oqOAZbroYW-yfyzIbLYPMJ2ZQmaaCXKy?usp=sharing) / [Baidu Cloud](https://pan.baidu.com/s/161bfWZGVH1UBCCka93ImqQ?pwd=n8hg)(n8hg)).
### Data Preparation
* Both video and event data are required as input, as shown in the [snippet](https://github.com/DachunKai/EvTexture/blob/main/basicsr/archs/evtexture_arch.py#L70). We package each video and its event data into an [HDF5](https://docs.h5py.org/en/stable/quick.html#quick) file.
* Example: the structure of the `calendar.h5` file from the Vid4 dataset is shown below; a short `h5py` reading sketch follows this list.
```arduino
calendar.h5
├── images
│ ├── 000000 # frame, ndarray, [H, W, C]
│ ├── ...
├── voxels_f
│ ├── 000000 # forward event voxel, ndarray, [Bins, H, W]
│ ├── ...
├── voxels_b
│ ├── 000000 # backward event voxel, ndarray, [Bins, H, W]
│ ├── ...
```
* To simulate and generate the event voxels, refer to the dataset preparation details in [DataPreparation.md](https://github.com/DachunKai/EvTexture/blob/main/datasets/DataPreparation.md).
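* For reference, such a file can be inspected with `h5py` as in the sketch below. The path is a hypothetical location under `datasets/`; adjust it to wherever the test set was extracted.
```python
import h5py

# Hypothetical local path; adjust to where Vid4_h5 was placed.
with h5py.File('datasets/Vid4_h5/calendar.h5', 'r') as f:
    print(list(f.keys()))               # ['images', 'voxels_b', 'voxels_f']
    frame = f['images/000000'][()]      # ndarray, [H, W, C]
    voxel = f['voxels_f/000000'][()]    # ndarray, [Bins, H, W]
    print(frame.shape, voxel.shape)
```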
### Inference on your own video
> **:heart: Seeking Collaboration**: For issues [#6](https://github.com/DachunKai/EvTexture/issues/6) and [#7](https://github.com/DachunKai/EvTexture/issues/7), our method can indeed perform inference on videos without event data. The solution is to use an event camera simulator, such as [vid2e](https://github.com/uzh-rpg/rpg_vid2e), to generate event data from the video, and then input both the video data and the generated event data into our model. This part, however, may require extensive engineering work to package everything into a script, as detailed in [DataPreparation.md](https://github.com/DachunKai/EvTexture/blob/main/datasets/DataPreparation.md). We currently do not have enough time to undertake this task, so we are looking for collaborators to join us in this effort! :blush:
## :blush: Citation
If you find the code and pre-trained models useful for your research, please consider citing our paper. :smiley:
```
@inproceedings{kai2024evtexture,
title={Ev{T}exture: {E}vent-driven {T}exture {E}nhancement for {V}ideo {S}uper-{R}esolution},
author={Kai, Dachun and Lu, Jiayao and Zhang, Yueyi and Sun, Xiaoyan},
booktitle={International Conference on Machine Learning},
year={2024},
organization={PMLR}
}
```
## Contact
If you run into any problems, please describe them in issues or contact:
* Dachun Kai: <dachunkai@mail.ustc.edu.cn>
## License and Acknowledgement
This project is released under the Apache-2.0 license. Our work is built upon [BasicSR](https://github.com/XPixelGroup/BasicSR), which is an open source toolbox for image/video restoration tasks. Thanks to the inspirations and codes from [RAFT](https://github.com/princeton-vl/RAFT), [event_utils](https://github.com/TimoStoff/event_utils) and [EvTexture-jupyter](https://github.com/camenduru/EvTexture-jupyter).
# https://github.com/xinntao/BasicSR
# flake8: noqa
from .archs import *
from .data import *
from .losses import *
from .metrics import *
from .models import *
from .ops import *
from .test import *
from .train import *
from .utils import *
from .version import __gitsha__, __version__
import importlib
from copy import deepcopy
from os import path as osp
from basicsr.utils import get_root_logger, scandir
from basicsr.utils.registry import ARCH_REGISTRY
__all__ = ['build_network']
# automatically scan and import arch modules for registry
# scan all the files under the 'archs' folder and collect files ending with '_arch.py'
arch_folder = osp.dirname(osp.abspath(__file__))
arch_filenames = [osp.splitext(osp.basename(v))[0] for v in scandir(arch_folder) if v.endswith('_arch.py')]
# import all the arch modules
_arch_modules = [importlib.import_module(f'basicsr.archs.{file_name}') for file_name in arch_filenames]
def build_network(opt):
opt = deepcopy(opt)
network_type = opt.pop('type')
net = ARCH_REGISTRY.get(network_type)(**opt)
logger = get_root_logger()
logger.info(f'Network [{net.__class__.__name__}] is created.')
return net
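# Illustrative usage (not part of the original file): with the registry populated by the
# imports above, a network can be built from an options dict such as the ones parsed from
# the YAML configs, e.g.
#
#     opt = dict(type='EvTexture', num_feat=64, num_block=30, spynet_path=None)
#     net = build_network(opt)  # looks up 'EvTexture' in ARCH_REGISTRY and instantiates it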
import collections.abc
import math
import torch
import torchvision
import warnings
from distutils.version import LooseVersion
from itertools import repeat
from torch import nn as nn
from torch.nn import functional as F
from torch.nn import init as init
from torch.nn.modules.batchnorm import _BatchNorm
from basicsr.ops.dcn import ModulatedDeformConvPack, modulated_deform_conv
from basicsr.utils import get_root_logger
@torch.no_grad()
def default_init_weights(module_list, scale=1, bias_fill=0, **kwargs):
"""Initialize network weights.
Args:
module_list (list[nn.Module] | nn.Module): Modules to be initialized.
scale (float): Scale initialized weights, especially for residual
blocks. Default: 1.
bias_fill (float): The value to fill bias. Default: 0
kwargs (dict): Other arguments for initialization function.
"""
if not isinstance(module_list, list):
module_list = [module_list]
for module in module_list:
for m in module.modules():
if isinstance(m, nn.Conv2d):
init.kaiming_normal_(m.weight, **kwargs)
m.weight.data *= scale
if m.bias is not None:
m.bias.data.fill_(bias_fill)
elif isinstance(m, nn.Linear):
init.kaiming_normal_(m.weight, **kwargs)
m.weight.data *= scale
if m.bias is not None:
m.bias.data.fill_(bias_fill)
elif isinstance(m, _BatchNorm):
init.constant_(m.weight, 1)
if m.bias is not None:
m.bias.data.fill_(bias_fill)
def make_layer(basic_block, num_basic_block, **kwarg):
"""Make layers by stacking the same blocks.
Args:
basic_block (nn.module): nn.module class for basic block.
num_basic_block (int): number of blocks.
Returns:
nn.Sequential: Stacked blocks in nn.Sequential.
"""
layers = []
for _ in range(num_basic_block):
layers.append(basic_block(**kwarg))
return nn.Sequential(*layers)
class ResidualBlockNoBN(nn.Module):
"""Residual block without BN.
Args:
num_feat (int): Channel number of intermediate features.
Default: 64.
res_scale (float): Residual scale. Default: 1.
pytorch_init (bool): If set to True, use pytorch default init,
otherwise, use default_init_weights. Default: False.
"""
def __init__(self, num_feat=64, res_scale=1, pytorch_init=False):
super(ResidualBlockNoBN, self).__init__()
self.res_scale = res_scale
self.conv1 = nn.Conv2d(num_feat, num_feat, 3, 1, 1, bias=True)
self.conv2 = nn.Conv2d(num_feat, num_feat, 3, 1, 1, bias=True)
self.relu = nn.ReLU(inplace=True)
if not pytorch_init:
default_init_weights([self.conv1, self.conv2], 0.1)
def forward(self, x):
identity = x
out = self.conv2(self.relu(self.conv1(x)))
return identity + out * self.res_scale
class Upsample(nn.Sequential):
"""Upsample module.
Args:
scale (int): Scale factor. Supported scales: 2^n and 3.
num_feat (int): Channel number of intermediate features.
"""
def __init__(self, scale, num_feat):
m = []
if (scale & (scale - 1)) == 0: # scale = 2^n
for _ in range(int(math.log(scale, 2))):
m.append(nn.Conv2d(num_feat, 4 * num_feat, 3, 1, 1))
m.append(nn.PixelShuffle(2))
elif scale == 3:
m.append(nn.Conv2d(num_feat, 9 * num_feat, 3, 1, 1))
m.append(nn.PixelShuffle(3))
else:
raise ValueError(f'scale {scale} is not supported. Supported scales: 2^n and 3.')
super(Upsample, self).__init__(*m)
def flow_warp(x, flow, interp_mode='bilinear', padding_mode='zeros', align_corners=True):
"""Warp an image or feature map with optical flow.
Args:
x (Tensor): Tensor with size (n, c, h, w).
flow (Tensor): Tensor with size (n, h, w, 2), normal value.
interp_mode (str): 'nearest' or 'bilinear'. Default: 'bilinear'.
padding_mode (str): 'zeros' or 'border' or 'reflection'.
Default: 'zeros'.
align_corners (bool): Before pytorch 1.3, the default value is
align_corners=True. After pytorch 1.3, the default value is
align_corners=False. Here, we use True as the default.
Returns:
Tensor: Warped image or feature map.
"""
assert x.size()[-2:] == flow.size()[1:3]
_, _, h, w = x.size()
# create mesh grid
grid_y, grid_x = torch.meshgrid(torch.arange(0, h).type_as(x), torch.arange(0, w).type_as(x))
grid = torch.stack((grid_x, grid_y), 2).float() # W(x), H(y), 2
grid.requires_grad = False
vgrid = grid + flow
# scale grid to [-1,1]
vgrid_x = 2.0 * vgrid[:, :, :, 0] / max(w - 1, 1) - 1.0
vgrid_y = 2.0 * vgrid[:, :, :, 1] / max(h - 1, 1) - 1.0
vgrid_scaled = torch.stack((vgrid_x, vgrid_y), dim=3)
output = F.grid_sample(x, vgrid_scaled, mode=interp_mode, padding_mode=padding_mode, align_corners=align_corners)
# TODO, what if align_corners=False
return output
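# Illustrative example (not part of the original file): warping with an all-zero flow is an
# identity mapping up to interpolation error, which makes for a quick sanity check:
#
#     x = torch.randn(1, 64, 48, 64)        # (n, c, h, w)
#     flow = torch.zeros(1, 48, 64, 2)      # (n, h, w, 2)
#     y = flow_warp(x, flow)                # same shape as x, approximately equal to x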
def resize_flow(flow, size_type, sizes, interp_mode='bilinear', align_corners=False):
"""Resize a flow according to ratio or shape.
Args:
flow (Tensor): Precomputed flow. shape [N, 2, H, W].
size_type (str): 'ratio' or 'shape'.
sizes (list[int | float]): the ratio for resizing or the final output
shape.
1) The order of ratio should be [ratio_h, ratio_w]. For
downsampling, the ratio should be smaller than 1.0 (i.e., ratio
< 1.0). For upsampling, the ratio should be larger than 1.0 (i.e.,
ratio > 1.0).
2) The order of output_size should be [out_h, out_w].
interp_mode (str): The mode of interpolation for resizing.
Default: 'bilinear'.
align_corners (bool): Whether align corners. Default: False.
Returns:
Tensor: Resized flow.
"""
_, _, flow_h, flow_w = flow.size()
if size_type == 'ratio':
output_h, output_w = int(flow_h * sizes[0]), int(flow_w * sizes[1])
elif size_type == 'shape':
output_h, output_w = sizes[0], sizes[1]
else:
raise ValueError(f'Size type should be ratio or shape, but got type {size_type}.')
input_flow = flow.clone()
ratio_h = output_h / flow_h
ratio_w = output_w / flow_w
input_flow[:, 0, :, :] *= ratio_w
input_flow[:, 1, :, :] *= ratio_h
resized_flow = F.interpolate(
input=input_flow, size=(output_h, output_w), mode=interp_mode, align_corners=align_corners)
return resized_flow
# TODO: may write a cpp file
def pixel_unshuffle(x, scale):
""" Pixel unshuffle.
Args:
x (Tensor): Input feature with shape (b, c, hh, hw).
scale (int): Downsample ratio.
Returns:
Tensor: the pixel unshuffled feature.
"""
b, c, hh, hw = x.size()
out_channel = c * (scale**2)
assert hh % scale == 0 and hw % scale == 0
h = hh // scale
w = hw // scale
x_view = x.view(b, c, h, scale, w, scale)
return x_view.permute(0, 1, 3, 5, 2, 4).reshape(b, out_channel, h, w)
class DCNv2Pack(ModulatedDeformConvPack):
"""Modulated deformable conv for deformable alignment.
Different from the official DCNv2Pack, which generates offsets and masks
from the preceding features, this DCNv2Pack takes separate
features to generate offsets and masks.
``Paper: Delving Deep into Deformable Alignment in Video Super-Resolution``
"""
def forward(self, x, feat):
out = self.conv_offset(feat)
o1, o2, mask = torch.chunk(out, 3, dim=1)
offset = torch.cat((o1, o2), dim=1)
mask = torch.sigmoid(mask)
offset_absmean = torch.mean(torch.abs(offset))
if offset_absmean > 50:
logger = get_root_logger()
logger.warning(f'Offset abs mean is {offset_absmean}, larger than 50.')
if LooseVersion(torchvision.__version__) >= LooseVersion('0.9.0'):
return torchvision.ops.deform_conv2d(x, offset, self.weight, self.bias, self.stride, self.padding,
self.dilation, mask)
else:
return modulated_deform_conv(x, offset, mask, self.weight, self.bias, self.stride, self.padding,
self.dilation, self.groups, self.deformable_groups)
def _no_grad_trunc_normal_(tensor, mean, std, a, b):
# From: https://github.com/rwightman/pytorch-image-models/blob/master/timm/models/layers/weight_init.py
# Cut & paste from PyTorch official master until it's in a few official releases - RW
# Method based on https://people.sc.fsu.edu/~jburkardt/presentations/truncated_normal.pdf
def norm_cdf(x):
# Computes standard normal cumulative distribution function
return (1. + math.erf(x / math.sqrt(2.))) / 2.
if (mean < a - 2 * std) or (mean > b + 2 * std):
warnings.warn(
'mean is more than 2 std from [a, b] in nn.init.trunc_normal_. '
'The distribution of values may be incorrect.',
stacklevel=2)
with torch.no_grad():
# Values are generated by using a truncated uniform distribution and
# then using the inverse CDF for the normal distribution.
# Get upper and lower cdf values
low = norm_cdf((a - mean) / std)
up = norm_cdf((b - mean) / std)
# Uniformly fill tensor with values from [low, up], then translate to
# [2l-1, 2u-1].
tensor.uniform_(2 * low - 1, 2 * up - 1)
# Use inverse cdf transform for normal distribution to get truncated
# standard normal
tensor.erfinv_()
# Transform to proper mean, std
tensor.mul_(std * math.sqrt(2.))
tensor.add_(mean)
# Clamp to ensure it's in the proper range
tensor.clamp_(min=a, max=b)
return tensor
def trunc_normal_(tensor, mean=0., std=1., a=-2., b=2.):
r"""Fills the input Tensor with values drawn from a truncated
normal distribution.
From: https://github.com/rwightman/pytorch-image-models/blob/master/timm/models/layers/weight_init.py
The values are effectively drawn from the
normal distribution :math:`\mathcal{N}(\text{mean}, \text{std}^2)`
with values outside :math:`[a, b]` redrawn until they are within
the bounds. The method used for generating the random values works
best when :math:`a \leq \text{mean} \leq b`.
Args:
tensor: an n-dimensional `torch.Tensor`
mean: the mean of the normal distribution
std: the standard deviation of the normal distribution
a: the minimum cutoff value
b: the maximum cutoff value
Examples:
>>> w = torch.empty(3, 5)
>>> nn.init.trunc_normal_(w)
"""
return _no_grad_trunc_normal_(tensor, mean, std, a, b)
# From PyTorch
def _ntuple(n):
def parse(x):
if isinstance(x, collections.abc.Iterable):
return x
return tuple(repeat(x, n))
return parse
to_1tuple = _ntuple(1)
to_2tuple = _ntuple(2)
to_3tuple = _ntuple(3)
to_4tuple = _ntuple(4)
to_ntuple = _ntuple
def closest_larger_multiple_of_minimum_size(size, minimum_size):
return int(math.ceil(size / minimum_size) * minimum_size)
class SizeAdapter(object):
"""Converts size of input to standard size.
Practical deep networks often work only with input images
whose height and width are multiples of a minimum size.
This class allows images of arbitrary size to be passed to
the network, by padding the input to the closest multiple
and unpadding the network's output back to the original size.
"""
def __init__(self, minimum_size=64):
self._minimum_size = minimum_size
self._pixels_pad_to_width = None
self._pixels_pad_to_height = None
def _closest_larger_multiple_of_minimum_size(self, size):
return closest_larger_multiple_of_minimum_size(size, self._minimum_size)
def pad(self, network_input):
"""Returns "network_input" padded with zeros to the "standard" size.
The "standard" size corresponds to the height and width that
are the closest multiples of "minimum_size". The method pads the
height and width and saves the padding amounts. These
values are then used by the "unpad" method.
"""
height, width = network_input.size()[-2:]
self._pixels_pad_to_height = (self._closest_larger_multiple_of_minimum_size(height) - height)
self._pixels_pad_to_width = (self._closest_larger_multiple_of_minimum_size(width) - width)
return nn.ZeroPad2d((self._pixels_pad_to_width, 0, self._pixels_pad_to_height, 0))(network_input)
def unpad(self, network_output):
"""Returns "network_output" cropped to the original size.
The cropping is performed using the values saved by the "pad"
method.
"""
return network_output[..., self._pixels_pad_to_height:, self._pixels_pad_to_width:]
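# Illustrative example (not part of the original file): pad an input whose sides are not
# multiples of 32, then crop back to the original size:
#
#     adapter = SizeAdapter(minimum_size=32)
#     x = torch.randn(1, 1, 100, 180)
#     x_pad = adapter.pad(x)        # zero-padded on the top/left to (1, 1, 128, 192)
#     x_out = adapter.unpad(x_pad)  # cropped back to (1, 1, 100, 180)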
class SmallUpdateBlock(nn.Module):
def __init__(self, hidden_dim=64, input_dim=64 * 2):
super(SmallUpdateBlock, self).__init__()
self.gru = ConvGRU(hidden_dim=hidden_dim, input_dim=input_dim)
self.res_head = ConvResidualBlocks(num_in_ch=64, num_out_ch=64, num_block=5)
def forward(self, net, context, motion):
inp = torch.cat([context, motion], dim=1)
net = self.gru(net, inp)
delta_net = self.res_head(net)
return net, delta_net
class ConvGRU(nn.Module):
def __init__(self, hidden_dim=128, input_dim=192 + 128):
super(ConvGRU, self).__init__()
self.convz = nn.Conv2d(hidden_dim + input_dim, hidden_dim, 3, padding=1)
self.convr = nn.Conv2d(hidden_dim + input_dim, hidden_dim, 3, padding=1)
self.convq = nn.Conv2d(hidden_dim + input_dim, hidden_dim, 3, padding=1)
def forward(self, h, x):
hx = torch.cat([h, x], dim=1)
z = torch.sigmoid(self.convz(hx))
r = torch.sigmoid(self.convr(hx))
q = torch.tanh(self.convq(torch.cat([r * h, x], dim=1)))
h = (1 - z) * h + z * q
return h
class ConvResidualBlocks(nn.Module):
"""Conv and residual block used in BasicVSR.
Args:
num_in_ch (int): Number of input channels. Default: 3.
num_out_ch (int): Number of output channels. Default: 64.
num_block (int): Number of residual blocks. Default: 15.
"""
def __init__(self, num_in_ch=3, num_out_ch=64, num_block=15):
super().__init__()
self.main = nn.Sequential(
nn.Conv2d(num_in_ch, num_out_ch, 3, 1, 1, bias=True), nn.LeakyReLU(negative_slope=0.1, inplace=True),
make_layer(ResidualBlockNoBN, num_block, num_feat=num_out_ch))
def forward(self, fea):
return self.main(fea)
import torch
from torch import nn as nn
from torch.nn import functional as F
from basicsr.utils.registry import ARCH_REGISTRY
from .unet_arch import UNet
from .arch_util import flow_warp, ConvResidualBlocks, SmallUpdateBlock
from .spynet_arch import SpyNet
@ARCH_REGISTRY.register()
class EvTexture(nn.Module):
"""EvTexture: Event-driven Texture Enhancement for Video Super-Resolution (ICML 2024)
Note: this class is for 4x VSR.
Args:
num_feat (int): Number of channels. Default: 64.
num_block (int): Number of residual blocks for each branch. Default: 30
spynet_path (str): Path to the pretrained weights of SPyNet. Default: None.
"""
def __init__(self, num_feat=64, num_block=30, spynet_path=None):
super().__init__()
self.num_feat = num_feat
# RGB-based flow alignment
self.spynet = SpyNet(spynet_path)
self.cnet = ConvResidualBlocks(num_in_ch=3, num_out_ch=64, num_block=8)
# iterative texture enhancement module
self.enet = UNet(inChannels=1, outChannels=num_feat)
self.update_block = SmallUpdateBlock(hidden_dim=num_feat, input_dim=num_feat * 2)
self.fusion = nn.Conv2d(num_feat * 2, num_feat, 1, 1, 0, bias=True)
# propagation
self.backward_trunk = ConvResidualBlocks(num_feat + 3, num_feat, num_block)
self.forward_trunk = ConvResidualBlocks(num_feat * 2 + 3, num_feat, num_block)
# reconstruction
self.upconv1 = nn.Conv2d(num_feat, num_feat * 4, 3, 1, 1, bias=True)
self.upconv2 = nn.Conv2d(num_feat, num_feat * 4, 3, 1, 1, bias=True)
self.conv_hr = nn.Conv2d(num_feat, num_feat, 3, 1, 1)
self.conv_last = nn.Conv2d(num_feat, 3, 3, 1, 1)
self.pixel_shuffle = nn.PixelShuffle(2)
# activation functions
self.lrelu = nn.LeakyReLU(negative_slope=0.1, inplace=True)
def get_flow(self, x):
b, n, c, h, w = x.size()
x_1 = x[:, :-1, :, :, :].reshape(-1, c, h, w)
x_2 = x[:, 1:, :, :, :].reshape(-1, c, h, w)
flows_backward = self.spynet(x_1, x_2).view(b, n - 1, 2, h, w)
flows_forward = self.spynet(x_2, x_1).view(b, n - 1, 2, h, w)
return flows_forward, flows_backward
# context feature extractor
def get_feat(self, x):
b, n, c, h, w = x.size()
feats_ = self.cnet(x.view(-1, c, h, w))
h, w = feats_.shape[2:]
feats_ = feats_.view(b, n, -1, h, w)
return feats_
def forward(self, imgs, voxels_f, voxels_b):
"""Forward function of EvTexture
Args:
imgs: Input frames with shape (b, n, c, h, w). n is the number of frames.
voxels_f: forward event voxel grids with shape (b, n-1 , c, h, w).
voxels_b: backward event voxel grids with shape (b, n-1 , c, h, w).
Output:
out_l: output frames with shape (b, n, c, 4h, 4w)
"""
flows_forward, flows_backward = self.get_flow(imgs)
feat_imgs = self.get_feat(imgs)
b, n, _, h, w = imgs.size()
bins = voxels_f.size()[2]
# backward branch
out_l = []
feat_prop = imgs.new_zeros(b, self.num_feat, h, w)
for i in range(n - 1, -1, -1):
x_i = imgs[:, i, :, :, :]
if i < n - 1:
# motion branch by rgb frames
flow = flows_backward[:, i, :, :, :]
feat_prop_coarse = flow_warp(feat_prop, flow.permute(0, 2, 3, 1))
# texture branch by event voxels
hidden_state = feat_prop.clone()
feat_img = feat_imgs[:, i, :, :, :] # [B, num_feat, H, W]
cur_voxel = voxels_f[:, i, :, :, :] # [B, Bins, H, W]
## iterative update block
feat_prop_fine = feat_prop.clone()
for j in range(bins - 1, -1, -1):
voxel_j = cur_voxel[:, j, :, :].unsqueeze(1) # [B, 1, H, W]
feat_motion = self.enet(voxel_j) # [B, num_feat, H, W], enet is UNet(inChannels=1, outChannels=num_feat)
hidden_state, delta_feat = self.update_block(hidden_state, feat_img, feat_motion) # refine coarse hidden state
feat_prop_fine = feat_prop_fine + delta_feat
feat_prop = self.fusion(torch.cat([feat_prop_fine, feat_prop_coarse], dim=1))
feat_prop = torch.cat([x_i, feat_prop], dim=1)
feat_prop = self.backward_trunk(feat_prop)
out_l.insert(0, feat_prop)
# forward branch
feat_prop = torch.zeros_like(feat_prop)
for i in range(0, n):
x_i = imgs[:, i, :, :, :]
if i > 0:
# motion branch by rgb frames
flow = flows_forward[:, i - 1, :, :, :]
feat_prop_coarse = flow_warp(feat_prop, flow.permute(0, 2, 3, 1))
# texture branch by event voxels
hidden_state = feat_prop.clone()
feat_img = feat_imgs[:, i, :, :, :] # [B, num_feat, H, W]
cur_voxel = voxels_b[:, i - 1, :, :, :] # [B, Bins, H, W]
# iterative update block
feat_prop_fine = feat_prop.clone()
for j in range(bins - 1, -1, -1):
voxel_j = cur_voxel[:, j, :, :].unsqueeze(1) # [B, 1, H, W]
feat_motion = self.enet(voxel_j) # [B, num_feat, H, W], enet is UNet(inChannels=1, outChannels=num_feat)
hidden_state, delta_feat = self.update_block(hidden_state, feat_img, feat_motion)
feat_prop_fine = feat_prop_fine + delta_feat
feat_prop = self.fusion(torch.cat([feat_prop_fine, feat_prop_coarse], dim=1))
feat_prop = torch.cat([x_i, out_l[i], feat_prop], dim=1)
feat_prop = self.forward_trunk(feat_prop)
# upsample
out = self.lrelu(self.pixel_shuffle(self.upconv1(feat_prop)))
out = self.lrelu(self.pixel_shuffle(self.upconv2(out)))
out = self.lrelu(self.conv_hr(out))
out = self.conv_last(out)
base = F.interpolate(x_i, scale_factor=4, mode='bilinear', align_corners=False)
out += base
out_l[i] = out
return torch.stack(out_l, dim=1)
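# Illustrative shape-only smoke test (not part of the original file), matching the docstring
# above: n input frames and n-1 forward/backward voxel grids with `Bins` channels each.
#
#     model = EvTexture(num_feat=64, num_block=30)   # randomly initialized SpyNet weights
#     imgs = torch.randn(1, 5, 3, 64, 64)            # (b, n, c, h, w)
#     vox_f = torch.randn(1, 4, 5, 64, 64)           # (b, n-1, Bins, h, w)
#     vox_b = torch.randn(1, 4, 5, 64, 64)
#     out = model(imgs, vox_f, vox_b)                # (b, n, 3, 4*h, 4*w)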
# @ARCH_REGISTRY.register()
# class EvTexture(nn.Module):
# """EvTexture: Event-driven Texture Enhancement for Video Super-Resolution (ICML 2024)
# Note that: this class is for 4x VSR
# Args:
# num_feat (int): Number of channels. Default: 64.
# num_block (int): Number of residual blocks for each branch. Default: 30
# spynet_path (str): Path to the pretrained weights of SPyNet. Default: None.
# """
# def __init__(self, num_feat=64, num_block=30, spynet_path=None):
# super().__init__()
# self.num_feat = num_feat
# # RGB-based flowalignment
# self.spynet = SpyNet(spynet_path)
# self.cnet = ConvResidualBlocks(num_in_ch=3, num_out_ch=64, num_block=8)
# # iterative texture enhancement module
# self.enet = UNet(inChannels=1, outChannels=num_feat)
# self.update_block = SmallUpdateBlock(hidden_dim=num_feat, input_dim=num_feat * 2)
# self.fusion = nn.Conv2d(num_feat * 2, num_feat, 1, 1, 0, bias=True)
# # propogation
# self.backward_trunk = ConvResidualBlocks(num_feat + 3, num_feat, num_block)
# self.forward_trunk = ConvResidualBlocks(num_feat * 2 + 3, num_feat, num_block)
# # reconstruction
# self.upconv1 = nn.Conv2d(num_feat, num_feat * 4, 3, 1, 1, bias=True)
# self.upconv2 = nn.Conv2d(num_feat, num_feat * 4, 3, 1, 1, bias=True)
# self.conv_hr = nn.Conv2d(num_feat, num_feat, 3, 1, 1)
# self.conv_last = nn.Conv2d(num_feat, 3, 3, 1, 1)
# self.pixel_shuffle = nn.PixelShuffle(2)
# # activation functions
# self.lrelu = nn.LeakyReLU(negative_slope=0.1, inplace=True)
# def get_flow(self, x):
# # print(x.size())
# b, n, c, h, w = x.size()
# x_1 = x[:, :-1, :, :, :].reshape(-1, c, h, w)
# x_2 = x[:, 1:, :, :, :].reshape(-1, c, h, w)
# flows_backward = self.spynet(x_1, x_2).view(b, n - 1, 2, h, w)
# flows_forward = self.spynet(x_2, x_1).view(b, n - 1, 2, h, w)
# return flows_forward, flows_backward
# # context feature extractor
# def get_feat(self, x):
# b, n, c, h, w = x.size()
# feats_ = self.cnet(x.view(-1, c, h, w))
# h, w = feats_.shape[2:]
# feats_ = feats_.view(b, n, -1, h, w)
# return feats_
# def forward(self, imgs, voxels_f, voxels_b):
# """Forward function of EvTexture
# Args:
# imgs: Input frames with shape (b, n, c, h, w). n is the number of frames.
# voxels_f: forward event voxel grids with shape (b, n-1 , c, h, w).
# voxels_b: backward event voxel grids with shape (b, n-1 , c, h, w).
# Output:
# out_l: output frames with shape (b, n, c, 4h, 4w)
# """
# imgs = imgs.unsqueeze(0)
# voxels_f = voxels_f.unsqueeze(0)
# voxels_b = voxels_b.unsqueeze(0)
# flows_forward, flows_backward = self.get_flow(imgs)
# feat_imgs = self.get_feat(imgs)
# b, n, _, h, w = imgs.size()
# bins = voxels_f.size()[2]
# # backward branch
# out_l = []
# feat_prop = imgs.new_zeros(b, self.num_feat, h, w)
# for i in range(n - 1, -1, -1):
# x_i = imgs[:, i, :, :, :]
# if i < n - 1:
# # motion branch by rgb frames
# flow = flows_backward[:, i, :, :, :]
# feat_prop_coarse = flow_warp(feat_prop, flow.permute(0, 2, 3, 1))
# # texture branch by event voxels
# hidden_state = feat_prop.clone()
# feat_img = feat_imgs[:, i, :, :, :] # [B, num_feat, H, W]
# cur_voxel = voxels_f[:, i, :, :, :] # [B, Bins, H, W]
# ## iterative update block
# feat_prop_fine = feat_prop.clone()
# for j in range(bins - 1, -1, -1):
# voxel_j = cur_voxel[:, j, :, :].unsqueeze(1) # [B, 1, H, W]
# feat_motion = self.enet(voxel_j) # [B, num_feat, H, W], enet is UNet(inChannels=1, OurChannels=num_feat)
# hidden_state, delta_feat = self.update_block(hidden_state, feat_img, feat_motion) # refine coarse hidden state
# feat_prop_fine = feat_prop_fine + delta_feat
# feat_prop = self.fusion(torch.cat([feat_prop_fine, feat_prop_coarse], dim=1))
# feat_prop = torch.cat([x_i, feat_prop], dim=1)
# feat_prop = self.backward_trunk(feat_prop)
# out_l.insert(0, feat_prop)
# # forward branch
# feat_prop = torch.zeros_like(feat_prop)
# for i in range(0, n):
# x_i = imgs[:, i, :, :, :]
# if i > 0:
# # motion branch by rgb frames
# flow = flows_forward[:, i - 1, :, :, :]
# feat_prop_coarse = flow_warp(feat_prop, flow.permute(0, 2, 3, 1))
# # texture branch by event voxels
# hidden_state = feat_prop.clone()
# feat_img = feat_imgs[:, i, :, :, :] # [B, num_feat, H, W]
# cur_voxel = voxels_b[:, i - 1, :, :, :] # [B, Bins, H, W]
# # iterative update block
# feat_prop_fine = feat_prop.clone()
# for j in range(bins - 1, -1, -1):
# voxel_j = cur_voxel[:, j, :, :].unsqueeze(1) # [B, 1, H, W]
# feat_motion = self.enet(voxel_j) # [B, num_feat, H, W], enet is UNet(inChannels=1, OurChannels=64)
# hidden_state, delta_feat = self.update_block(hidden_state, feat_img, feat_motion)
# feat_prop_fine = feat_prop_fine + delta_feat
# feat_prop = self.fusion(torch.cat([feat_prop_fine, feat_prop_coarse], dim=1))
# feat_prop = torch.cat([x_i, out_l[i], feat_prop], dim=1)
# feat_prop = self.forward_trunk(feat_prop)
# # upsample
# out = self.lrelu(self.pixel_shuffle(self.upconv1(feat_prop)))
# out = self.lrelu(self.pixel_shuffle(self.upconv2(out)))
# out = self.lrelu(self.conv_hr(out))
# out = self.conv_last(out)
# base = F.interpolate(x_i, scale_factor=4, mode='bilinear', align_corners=False)
# out += base
# out_l[i] = out
# return torch.stack(out_l, dim=1)
import math
import torch
from torch import nn as nn
from torch.nn import functional as F
from basicsr.utils.registry import ARCH_REGISTRY
from .arch_util import flow_warp
class BasicModule(nn.Module):
"""Basic Module for SpyNet.
"""
def __init__(self):
super(BasicModule, self).__init__()
self.basic_module = nn.Sequential(
nn.Conv2d(in_channels=8, out_channels=32, kernel_size=7, stride=1, padding=3), nn.ReLU(inplace=False),
nn.Conv2d(in_channels=32, out_channels=64, kernel_size=7, stride=1, padding=3), nn.ReLU(inplace=False),
nn.Conv2d(in_channels=64, out_channels=32, kernel_size=7, stride=1, padding=3), nn.ReLU(inplace=False),
nn.Conv2d(in_channels=32, out_channels=16, kernel_size=7, stride=1, padding=3), nn.ReLU(inplace=False),
nn.Conv2d(in_channels=16, out_channels=2, kernel_size=7, stride=1, padding=3))
def forward(self, tensor_input):
return self.basic_module(tensor_input)
@ARCH_REGISTRY.register()
class SpyNet(nn.Module):
"""SpyNet architecture.
Args:
load_path (str): path for pretrained SpyNet. Default: None.
"""
def __init__(self, load_path=None):
super(SpyNet, self).__init__()
self.basic_module = nn.ModuleList([BasicModule() for _ in range(6)])
if load_path:
self.load_state_dict(torch.load(load_path, map_location=lambda storage, loc: storage)['params'])
self.register_buffer('mean', torch.Tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1))
self.register_buffer('std', torch.Tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1))
def preprocess(self, tensor_input):
tensor_output = (tensor_input - self.mean) / self.std
return tensor_output
def process(self, ref, supp):
flow = []
ref = [self.preprocess(ref)]
supp = [self.preprocess(supp)]
for level in range(5):
ref.insert(0, F.avg_pool2d(input=ref[0], kernel_size=2, stride=2, count_include_pad=False))
supp.insert(0, F.avg_pool2d(input=supp[0], kernel_size=2, stride=2, count_include_pad=False))
flow = ref[0].new_zeros(
[ref[0].size(0), 2,
int(math.floor(ref[0].size(2) / 2.0)),
int(math.floor(ref[0].size(3) / 2.0))])
for level in range(len(ref)):
upsampled_flow = F.interpolate(input=flow, scale_factor=2, mode='bilinear', align_corners=True) * 2.0
if upsampled_flow.size(2) != ref[level].size(2):
upsampled_flow = F.pad(input=upsampled_flow, pad=[0, 0, 0, 1], mode='replicate')
if upsampled_flow.size(3) != ref[level].size(3):
upsampled_flow = F.pad(input=upsampled_flow, pad=[0, 1, 0, 0], mode='replicate')
flow = self.basic_module[level](torch.cat([
ref[level],
flow_warp(
supp[level], upsampled_flow.permute(0, 2, 3, 1), interp_mode='bilinear', padding_mode='border'),
upsampled_flow
], 1)) + upsampled_flow
return flow
def forward(self, ref, supp):
assert ref.size() == supp.size()
h, w = ref.size(2), ref.size(3)
w_floor = math.floor(math.ceil(w / 32.0) * 32.0)
h_floor = math.floor(math.ceil(h / 32.0) * 32.0)
ref = F.interpolate(input=ref, size=(h_floor, w_floor), mode='bilinear', align_corners=False)
supp = F.interpolate(input=supp, size=(h_floor, w_floor), mode='bilinear', align_corners=False)
flow = F.interpolate(input=self.process(ref, supp), size=(h, w), mode='bilinear', align_corners=False)
flow[:, 0, :, :] *= float(w) / float(w_floor)
flow[:, 1, :, :] *= float(h) / float(h_floor)
return flow
## Modified from timelens. https://github.com/uzh-rpg/rpg_timelens/blob/main/timelens/superslomo/unet.py
import torch
import torch.nn.functional as F
from basicsr.utils.registry import ARCH_REGISTRY
from .arch_util import SizeAdapter
from torch import nn
class up(nn.Module):
def __init__(self, inChannels, outChannels):
super(up, self).__init__()
self.conv1 = nn.Conv2d(inChannels, outChannels, 3, stride=1, padding=1)
self.conv2 = nn.Conv2d(2 * outChannels, outChannels, 3, stride=1, padding=1)
def forward(self, x, skpCn):
x = F.interpolate(x, scale_factor=2, mode="bilinear")
x = F.leaky_relu(self.conv1(x), negative_slope=0.1)
x = F.leaky_relu(self.conv2(torch.cat((x, skpCn), 1)), negative_slope=0.1)
return x
class down(nn.Module):
def __init__(self, inChannels, outChannels, filterSize):
super(down, self).__init__()
self.conv1 = nn.Conv2d(
inChannels,
outChannels,
filterSize,
stride=1,
padding=int((filterSize - 1) / 2),
)
self.conv2 = nn.Conv2d(
outChannels,
outChannels,
filterSize,
stride=1,
padding=int((filterSize - 1) / 2),
)
def forward(self, x):
x = F.avg_pool2d(x, 2)
x = F.leaky_relu(self.conv1(x), negative_slope=0.1)
x = F.leaky_relu(self.conv2(x), negative_slope=0.1)
return x
@ARCH_REGISTRY.register()
class UNet(nn.Module):
"""Modified version of Unet from SuperSloMo.
Differences:
1) there is an option to skip ReLU after the last convolution.
2) there is a size adapter module that makes sure that inputs of all sizes
can be processed correctly. It is necessary because the original
UNet can only process inputs with spatial dimensions divisible by 32.
"""
def __init__(self, inChannels, outChannels, ends_with_relu=True, load_path=None):
super(UNet, self).__init__()
self._ends_with_relu = ends_with_relu
self._size_adapter = SizeAdapter(minimum_size=32)
# 5-level
self.conv1 = nn.Conv2d(inChannels, 8, 7, stride=1, padding=3)
self.conv2 = nn.Conv2d(8, 8, 7, stride=1, padding=3)
self.down1 = down(8, 16, 5)
self.down2 = down(16, 32, 3)
self.down3 = down(32, 64, 3)
self.down4 = down(64, 128, 3)
self.down5 = down(128, 128, 3)
self.up1 = up(128, 128)
self.up2 = up(128, 64)
self.up3 = up(64, 32)
self.up4 = up(32, 16)
self.up5 = up(16, 8)
self.conv3 = nn.Conv2d(8, outChannels, 3, stride=1, padding=1)
if load_path:
self.load_state_dict(torch.load(load_path, map_location=lambda storage, loc: storage)['params_ema'])
def forward(self, x):
x = self._size_adapter.pad(x)
x = F.leaky_relu(self.conv1(x), negative_slope=0.1)
s1 = F.leaky_relu(self.conv2(x), negative_slope=0.1)
s2 = self.down1(s1)
s3 = self.down2(s2)
s4 = self.down3(s3)
s5 = self.down4(s4)
x = self.down5(s5)
x = self.up1(x, s5)
x = self.up2(x, s4)
x = self.up3(x, s3)
x = self.up4(x, s2)
x = self.up5(x, s1)
# Note that the original code has ReLU at the end.
if self._ends_with_relu:
x = F.leaky_relu(self.conv3(x), negative_slope=0.1)
else:
x = self.conv3(x)
# Size adapter crops the output to the original size.
x = self._size_adapter.unpad(x)
return x
def patch_chunk_2x(input):
"""
input (Tensor): [B, C, H, W], and H, W are divisible by 2.
return:
result (Tensor): [B, 4C, H/2, W/2]
"""
result = []
split_h = torch.chunk(input, 2, -2)
for sli in split_h:
sli_w = torch.chunk(sli, 2, -1)
for i in range(2):
result.append(sli_w[i])
assert len(result) == 4
result = torch.cat(result, dim=1)
return result
if __name__ == '__main__':
net = UNet(1, 2)
input = torch.randn((4, 1, 64, 64))
out = net(input)
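# Quick check of patch_chunk_2x (illustrative addition): with the (4, 1, 64, 64) input above,
# the result has shape [B, 4C, H/2, W/2], i.e. (4, 4, 32, 32).
patched = patch_chunk_2x(input)
print(out.shape, patched.shape)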
import os
import torch
from collections import OrderedDict
from torch import nn as nn
from torchvision.models import vgg as vgg
from basicsr.utils.registry import ARCH_REGISTRY
VGG_PRETRAIN_PATH = 'experiments/pretrained_models/vgg19-dcbb9e9d.pth'
NAMES = {
'vgg11': [
'conv1_1', 'relu1_1', 'pool1', 'conv2_1', 'relu2_1', 'pool2', 'conv3_1', 'relu3_1', 'conv3_2', 'relu3_2',
'pool3', 'conv4_1', 'relu4_1', 'conv4_2', 'relu4_2', 'pool4', 'conv5_1', 'relu5_1', 'conv5_2', 'relu5_2',
'pool5'
],
'vgg13': [
'conv1_1', 'relu1_1', 'conv1_2', 'relu1_2', 'pool1', 'conv2_1', 'relu2_1', 'conv2_2', 'relu2_2', 'pool2',
'conv3_1', 'relu3_1', 'conv3_2', 'relu3_2', 'pool3', 'conv4_1', 'relu4_1', 'conv4_2', 'relu4_2', 'pool4',
'conv5_1', 'relu5_1', 'conv5_2', 'relu5_2', 'pool5'
],
'vgg16': [
'conv1_1', 'relu1_1', 'conv1_2', 'relu1_2', 'pool1', 'conv2_1', 'relu2_1', 'conv2_2', 'relu2_2', 'pool2',
'conv3_1', 'relu3_1', 'conv3_2', 'relu3_2', 'conv3_3', 'relu3_3', 'pool3', 'conv4_1', 'relu4_1', 'conv4_2',
'relu4_2', 'conv4_3', 'relu4_3', 'pool4', 'conv5_1', 'relu5_1', 'conv5_2', 'relu5_2', 'conv5_3', 'relu5_3',
'pool5'
],
'vgg19': [
'conv1_1', 'relu1_1', 'conv1_2', 'relu1_2', 'pool1', 'conv2_1', 'relu2_1', 'conv2_2', 'relu2_2', 'pool2',
'conv3_1', 'relu3_1', 'conv3_2', 'relu3_2', 'conv3_3', 'relu3_3', 'conv3_4', 'relu3_4', 'pool3', 'conv4_1',
'relu4_1', 'conv4_2', 'relu4_2', 'conv4_3', 'relu4_3', 'conv4_4', 'relu4_4', 'pool4', 'conv5_1', 'relu5_1',
'conv5_2', 'relu5_2', 'conv5_3', 'relu5_3', 'conv5_4', 'relu5_4', 'pool5'
]
}
def insert_bn(names):
"""Insert bn layer after each conv.
Args:
names (list): The list of layer names.
Returns:
list: The list of layer names with bn layers.
"""
names_bn = []
for name in names:
names_bn.append(name)
if 'conv' in name:
position = name.replace('conv', '')
names_bn.append('bn' + position)
return names_bn
@ARCH_REGISTRY.register()
class VGGFeatureExtractor(nn.Module):
"""VGG network for feature extraction.
In this implementation, we allow users to choose whether to use normalization
of the input feature and the type of vgg network. Note that the pretrained
path must match the vgg type.
Args:
layer_name_list (list[str]): Forward function returns the corresponding
features according to the layer_name_list.
Example: {'relu1_1', 'relu2_1', 'relu3_1'}.
vgg_type (str): Set the type of vgg network. Default: 'vgg19'.
        use_input_norm (bool): If True, normalize the input image. Importantly,
            the input must be in the range [0, 1]. Default: True.
        range_norm (bool): If True, normalize images from the range [-1, 1]
            to [0, 1]. Default: False.
requires_grad (bool): If true, the parameters of VGG network will be
optimized. Default: False.
remove_pooling (bool): If true, the max pooling operations in VGG net
will be removed. Default: False.
pooling_stride (int): The stride of max pooling operation. Default: 2.
"""
def __init__(self,
layer_name_list,
vgg_type='vgg19',
use_input_norm=True,
range_norm=False,
requires_grad=False,
remove_pooling=False,
pooling_stride=2):
super(VGGFeatureExtractor, self).__init__()
self.layer_name_list = layer_name_list
self.use_input_norm = use_input_norm
self.range_norm = range_norm
self.names = NAMES[vgg_type.replace('_bn', '')]
if 'bn' in vgg_type:
self.names = insert_bn(self.names)
# only borrow layers that will be used to avoid unused params
max_idx = 0
for v in layer_name_list:
idx = self.names.index(v)
if idx > max_idx:
max_idx = idx
if os.path.exists(VGG_PRETRAIN_PATH):
vgg_net = getattr(vgg, vgg_type)(pretrained=False)
state_dict = torch.load(VGG_PRETRAIN_PATH, map_location=lambda storage, loc: storage)
vgg_net.load_state_dict(state_dict)
else:
vgg_net = getattr(vgg, vgg_type)(pretrained=True)
features = vgg_net.features[:max_idx + 1]
modified_net = OrderedDict()
for k, v in zip(self.names, features):
if 'pool' in k:
# if remove_pooling is true, pooling operation will be removed
if remove_pooling:
continue
else:
# in some cases, we may want to change the default stride
modified_net[k] = nn.MaxPool2d(kernel_size=2, stride=pooling_stride)
else:
modified_net[k] = v
self.vgg_net = nn.Sequential(modified_net)
if not requires_grad:
self.vgg_net.eval()
for param in self.parameters():
param.requires_grad = False
else:
self.vgg_net.train()
for param in self.parameters():
param.requires_grad = True
if self.use_input_norm:
# the mean is for image with range [0, 1]
self.register_buffer('mean', torch.Tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1))
# the std is for image with range [0, 1]
self.register_buffer('std', torch.Tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1))
def forward(self, x):
"""Forward function.
Args:
x (Tensor): Input tensor with shape (n, c, h, w).
Returns:
Tensor: Forward results.
"""
if self.range_norm:
x = (x + 1) / 2
if self.use_input_norm:
x = (x - self.mean) / self.std
output = {}
for key, layer in self.vgg_net._modules.items():
x = layer(x)
if key in self.layer_name_list:
output[key] = x.clone()
return output
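# Hedged usage sketch (illustrative, not part of the original file): extract
# VGG19 relu features from batches in [0, 1] and compare them with an L1
# distance, the typical building block of a perceptual loss. The tensors and
# layer choices below are assumptions.
#
#   extractor = VGGFeatureExtractor(layer_name_list=['relu1_1', 'relu2_1', 'relu3_1'],
#                                   vgg_type='vgg19', use_input_norm=True)
#   sr, gt = torch.rand(4, 3, 64, 64), torch.rand(4, 3, 64, 64)
#   feats_sr, feats_gt = extractor(sr), extractor(gt)
#   loss = sum(nn.functional.l1_loss(feats_sr[k], feats_gt[k]) for k in feats_sr)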
import importlib
import numpy as np
import random
import torch
import torch.utils.data
from copy import deepcopy
from functools import partial
from os import path as osp
from basicsr.data.prefetch_dataloader import PrefetchDataLoader
from basicsr.utils import get_root_logger, scandir
from basicsr.utils.dist_util import get_dist_info
from basicsr.utils.registry import DATASET_REGISTRY
__all__ = ['build_dataset', 'build_dataloader']
# automatically scan and import dataset modules for registry
# scan all the files under the data folder with '_dataset' in file names
data_folder = osp.dirname(osp.abspath(__file__))
dataset_filenames = [osp.splitext(osp.basename(v))[0] for v in scandir(data_folder) if v.endswith('_dataset.py')]
# import all the dataset modules
_dataset_modules = [importlib.import_module(f'basicsr.data.{file_name}') for file_name in dataset_filenames]
def build_dataset(dataset_opt):
"""Build dataset from options.
Args:
dataset_opt (dict): Configuration for dataset. It must contain:
name (str): Dataset name.
type (str): Dataset type.
"""
dataset_opt = deepcopy(dataset_opt)
dataset = DATASET_REGISTRY.get(dataset_opt['type'])(dataset_opt)
logger = get_root_logger()
logger.info(f'Dataset [{dataset.__class__.__name__}] - {dataset_opt["name"]} is built.')
return dataset
def build_dataloader(dataset, dataset_opt, num_gpu=1, dist=False, sampler=None, seed=None):
"""Build dataloader.
Args:
dataset (torch.utils.data.Dataset): Dataset.
dataset_opt (dict): Dataset options. It contains the following keys:
phase (str): 'train' or 'val'.
num_worker_per_gpu (int): Number of workers for each GPU.
batch_size_per_gpu (int): Training batch size for each GPU.
num_gpu (int): Number of GPUs. Used only in the train phase.
Default: 1.
dist (bool): Whether in distributed training. Used only in the train
phase. Default: False.
sampler (torch.utils.data.sampler): Data sampler. Default: None.
seed (int | None): Seed. Default: None
"""
phase = dataset_opt['phase']
rank, _ = get_dist_info()
if phase == 'train':
if dist: # distributed training
batch_size = dataset_opt['batch_size_per_gpu']
num_workers = dataset_opt['num_worker_per_gpu']
else: # non-distributed training
multiplier = 1 if num_gpu == 0 else num_gpu
batch_size = dataset_opt['batch_size_per_gpu'] * multiplier
num_workers = dataset_opt['num_worker_per_gpu'] * multiplier
dataloader_args = dict(
dataset=dataset,
batch_size=batch_size,
shuffle=False,
num_workers=num_workers,
sampler=sampler,
drop_last=True)
if sampler is None:
dataloader_args['shuffle'] = True
dataloader_args['worker_init_fn'] = partial(
worker_init_fn, num_workers=num_workers, rank=rank, seed=seed) if seed is not None else None
elif phase in ['val', 'test']: # validation
dataloader_args = dict(dataset=dataset, batch_size=1, shuffle=False, num_workers=0)
else:
raise ValueError(f"Wrong dataset phase: {phase}. Supported ones are 'train', 'val' and 'test'.")
dataloader_args['pin_memory'] = dataset_opt.get('pin_memory', False)
dataloader_args['persistent_workers'] = dataset_opt.get('persistent_workers', False)
prefetch_mode = dataset_opt.get('prefetch_mode')
if prefetch_mode == 'cpu': # CPUPrefetcher
num_prefetch_queue = dataset_opt.get('num_prefetch_queue', 1)
logger = get_root_logger()
logger.info(f'Use {prefetch_mode} prefetch dataloader: num_prefetch_queue = {num_prefetch_queue}')
return PrefetchDataLoader(num_prefetch_queue=num_prefetch_queue, **dataloader_args)
else:
# prefetch_mode=None: Normal dataloader
# prefetch_mode='cuda': dataloader for CUDAPrefetcher
return torch.utils.data.DataLoader(**dataloader_args)
def worker_init_fn(worker_id, num_workers, rank, seed):
# Set the worker seed to num_workers * rank + worker_id + seed
worker_seed = num_workers * rank + worker_id + seed
np.random.seed(worker_seed)
random.seed(worker_seed)
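# Hedged usage sketch (illustrative; the dataset type and option values below
# are assumptions, not taken from this repository's configs): build a
# registered dataset from an options dict and wrap it in a training dataloader.
#
#   dataset_opt = dict(
#       name='toy', type='SomeRegisteredDataset', phase='train',
#       batch_size_per_gpu=4, num_worker_per_gpu=2)
#   train_set = build_dataset(dataset_opt)
#   train_loader = build_dataloader(
#       train_set, dataset_opt, num_gpu=1, dist=False, sampler=None, seed=0)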
import math
import torch
from torch.utils.data.sampler import Sampler
class EnlargedSampler(Sampler):
"""Sampler that restricts data loading to a subset of the dataset.
Modified from torch.utils.data.distributed.DistributedSampler
    Supports enlarging the dataset for iteration-based training, which saves
    the time of restarting the dataloader after each epoch.
Args:
dataset (torch.utils.data.Dataset): Dataset used for sampling.
num_replicas (int | None): Number of processes participating in
the training. It is usually the world_size.
rank (int | None): Rank of the current process within num_replicas.
ratio (int): Enlarging ratio. Default: 1.
"""
def __init__(self, dataset, num_replicas, rank, ratio=1):
self.dataset = dataset
self.num_replicas = num_replicas
self.rank = rank
self.epoch = 0
self.num_samples = math.ceil(len(self.dataset) * ratio / self.num_replicas)
self.total_size = self.num_samples * self.num_replicas
def __iter__(self):
# deterministically shuffle based on epoch
g = torch.Generator()
g.manual_seed(self.epoch)
indices = torch.randperm(self.total_size, generator=g).tolist()
dataset_size = len(self.dataset)
indices = [v % dataset_size for v in indices]
# subsample
indices = indices[self.rank:self.total_size:self.num_replicas]
assert len(indices) == self.num_samples
return iter(indices)
def __len__(self):
return self.num_samples
def set_epoch(self, epoch):
self.epoch = epoch
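# Hedged usage sketch (illustrative): enlarge a dataset 10x so that one pass of
# the dataloader covers many more iterations, avoiding frequent dataloader
# restarts; num_replicas and rank would normally come from get_dist_info().
# The variables train_set and dataset_opt are assumed to exist.
#
#   sampler = EnlargedSampler(train_set, num_replicas=1, rank=0, ratio=10)
#   sampler.set_epoch(0)  # re-seed the deterministic shuffle each epoch
#   train_loader = build_dataloader(train_set, dataset_opt, sampler=sampler, seed=0)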
import cv2
import numpy as np
import torch
from os import path as osp
from torch.nn import functional as F
from basicsr.data.transforms import mod_crop
from basicsr.utils import img2tensor, scandir
def read_img_seq(path, require_mod_crop=False, scale=1, return_imgname=False):
"""Read a sequence of images from a given folder path.
Args:
path (list[str] | str): List of image paths or image folder path.
require_mod_crop (bool): Require mod crop for each image.
Default: False.
scale (int): Scale factor for mod_crop. Default: 1.
        return_imgname (bool): Whether to return image names. Default: False.
Returns:
Tensor: size (t, c, h, w), RGB, [0, 1].
list[str]: Returned image name list.
"""
if isinstance(path, list):
img_paths = path
else:
img_paths = sorted(list(scandir(path, full_path=True)))
imgs = [cv2.imread(v).astype(np.float32) / 255. for v in img_paths]
if require_mod_crop:
imgs = [mod_crop(img, scale) for img in imgs]
imgs = img2tensor(imgs, bgr2rgb=True, float32=True)
imgs = torch.stack(imgs, dim=0)
if return_imgname:
imgnames = [osp.splitext(osp.basename(path))[0] for path in img_paths]
return imgs, imgnames
else:
return imgs
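# Hedged usage sketch (illustrative; the folder path is an assumption): read all
# frames of one clip into a (t, c, h, w) float tensor in [0, 1], RGB order.
#
#   imgs, names = read_img_seq('datasets/REDS4/000', return_imgname=True)
#   # imgs.shape == (t, 3, h, w); names are the file names without extension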
def generate_frame_indices(crt_idx, max_frame_num, num_frames, padding='reflection'):
"""Generate an index list for reading `num_frames` frames from a sequence
of images.
Args:
crt_idx (int): Current center index.
        max_frame_num (int): Total number of frames in the image sequence
            (counting from 1).
num_frames (int): Reading num_frames frames.
padding (str): Padding mode, one of
'replicate' | 'reflection' | 'reflection_circle' | 'circle'
            Examples: crt_idx = 0, num_frames = 5
The generated frame indices under different padding mode:
replicate: [0, 0, 0, 1, 2]
reflection: [2, 1, 0, 1, 2]
reflection_circle: [4, 3, 0, 1, 2]
circle: [3, 4, 0, 1, 2]
Returns:
list[int]: A list of indices.
"""
assert num_frames % 2 == 1, 'num_frames should be an odd number.'
assert padding in ('replicate', 'reflection', 'reflection_circle', 'circle'), f'Wrong padding mode: {padding}.'
max_frame_num = max_frame_num - 1 # start from 0
num_pad = num_frames // 2
indices = []
for i in range(crt_idx - num_pad, crt_idx + num_pad + 1):
if i < 0:
if padding == 'replicate':
pad_idx = 0
elif padding == 'reflection':
pad_idx = -i
elif padding == 'reflection_circle':
pad_idx = crt_idx + num_pad - i
else:
pad_idx = num_frames + i
elif i > max_frame_num:
if padding == 'replicate':
pad_idx = max_frame_num
elif padding == 'reflection':
pad_idx = max_frame_num * 2 - i
elif padding == 'reflection_circle':
pad_idx = (crt_idx - num_pad) - (i - max_frame_num)
else:
pad_idx = i - num_frames
else:
pad_idx = i
indices.append(pad_idx)
return indices
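# Illustrative note (added): the docstring examples cover the start of a
# sequence; at the end of a 30-frame sequence (valid indices 0..29) with
# num_frames=5 the padding behaves symmetrically, e.g.
#   generate_frame_indices(29, 30, 5, padding='replicate')  -> [27, 28, 29, 29, 29]
#   generate_frame_indices(29, 30, 5, padding='reflection') -> [27, 28, 29, 28, 27]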
def paired_paths_from_lmdb(folders, keys):
"""Generate paired paths from lmdb files.
    Contents of lmdb. Taking `lq.lmdb` as an example, the file structure is:
::
lq.lmdb
├── data.mdb
├── lock.mdb
├── meta_info.txt
The data.mdb and lock.mdb are standard lmdb files and you can refer to
https://lmdb.readthedocs.io/en/release/ for more details.
    The meta_info.txt is a txt file that records the meta information of our
    datasets. It is automatically created when preparing datasets with our
    provided dataset tools.
    Each line in the txt file records
    1) image name (with extension),
    2) image shape,
    3) compression level, separated by a white space.
    Example: `baboon.png (120,125,3) 1`
We use the image name without extension as the lmdb key.
Note that we use the same key for the corresponding lq and gt images.
Args:
folders (list[str]): A list of folder path. The order of list should
be [input_folder, gt_folder].
        keys (list[str]): A list of keys identifying folders. The order should
            be consistent with folders, e.g., ['lq', 'gt'].
            Note that this key is different from lmdb keys.
Returns:
list[str]: Returned path list.
"""
assert len(folders) == 2, ('The len of folders should be 2 with [input_folder, gt_folder]. '
f'But got {len(folders)}')
assert len(keys) == 2, f'The len of keys should be 2 with [input_key, gt_key]. But got {len(keys)}'
input_folder, gt_folder = folders
input_key, gt_key = keys
if not (input_folder.endswith('.lmdb') and gt_folder.endswith('.lmdb')):
        raise ValueError(f'{input_key} folder and {gt_key} folder should both be in lmdb '
                         f'format. But received {input_key}: {input_folder}; '
                         f'{gt_key}: {gt_folder}')
# ensure that the two meta_info files are the same
with open(osp.join(input_folder, 'meta_info.txt')) as fin:
input_lmdb_keys = [line.split('.')[0] for line in fin]
with open(osp.join(gt_folder, 'meta_info.txt')) as fin:
gt_lmdb_keys = [line.split('.')[0] for line in fin]
if set(input_lmdb_keys) != set(gt_lmdb_keys):
raise ValueError(f'Keys in {input_key}_folder and {gt_key}_folder are different.')
else:
paths = []
for lmdb_key in sorted(input_lmdb_keys):
paths.append(dict([(f'{input_key}_path', lmdb_key), (f'{gt_key}_path', lmdb_key)]))
return paths
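# Hedged usage sketch (illustrative folder names): for lmdb-backed datasets the
# returned "paths" are lmdb keys shared by the lq and gt databases.
#
#   paths = paired_paths_from_lmdb(['datasets/lq.lmdb', 'datasets/gt.lmdb'], ['lq', 'gt'])
#   # e.g. paths[0] == {'lq_path': 'baboon', 'gt_path': 'baboon'}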
def paired_paths_from_meta_info_file(folders, keys, meta_info_file, filename_tmpl):
"""Generate paired paths from an meta information file.
Each line in the meta information file contains the image names and
image shape (usually for gt), separated by a white space.
Example of an meta information file:
```
0001_s001.png (480,480,3)
0001_s002.png (480,480,3)
```
Args:
folders (list[str]): A list of folder path. The order of list should
be [input_folder, gt_folder].
        keys (list[str]): A list of keys identifying folders. The order should
            be consistent with folders, e.g., ['lq', 'gt'].
meta_info_file (str): Path to the meta information file.
filename_tmpl (str): Template for each filename. Note that the
template excludes the file extension. Usually the filename_tmpl is
for files in the input folder.
Returns:
list[str]: Returned path list.
"""
assert len(folders) == 2, ('The len of folders should be 2 with [input_folder, gt_folder]. '
f'But got {len(folders)}')
assert len(keys) == 2, f'The len of keys should be 2 with [input_key, gt_key]. But got {len(keys)}'
input_folder, gt_folder = folders
input_key, gt_key = keys
with open(meta_info_file, 'r') as fin:
gt_names = [line.strip().split(' ')[0] for line in fin]
paths = []
for gt_name in gt_names:
basename, ext = osp.splitext(osp.basename(gt_name))
input_name = f'{filename_tmpl.format(basename)}{ext}'
input_path = osp.join(input_folder, input_name)
gt_path = osp.join(gt_folder, gt_name)
paths.append(dict([(f'{input_key}_path', input_path), (f'{gt_key}_path', gt_path)]))
return paths
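# Hedged usage sketch (illustrative; the folder names and the 'x4' suffix are
# assumptions): with filename_tmpl='{}x4', the gt entry '0001_s001.png' is
# paired with the input file '0001_s001x4.png'.
#
#   paths = paired_paths_from_meta_info_file(
#       ['datasets/DIV2K/lq', 'datasets/DIV2K/gt'], ['lq', 'gt'],
#       'datasets/DIV2K/meta_info.txt', filename_tmpl='{}x4')
#   # paths[0] == {'lq_path': 'datasets/DIV2K/lq/0001_s001x4.png',
#   #              'gt_path': 'datasets/DIV2K/gt/0001_s001.png'}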
def paired_paths_from_folder(folders, keys, filename_tmpl):
"""Generate paired paths from folders.
Args:
folders (list[str]): A list of folder path. The order of list should
be [input_folder, gt_folder].
        keys (list[str]): A list of keys identifying folders. The order should
            be consistent with folders, e.g., ['lq', 'gt'].
filename_tmpl (str): Template for each filename. Note that the
template excludes the file extension. Usually the filename_tmpl is
for files in the input folder.
Returns:
list[str]: Returned path list.
"""
assert len(folders) == 2, ('The len of folders should be 2 with [input_folder, gt_folder]. '
f'But got {len(folders)}')
assert len(keys) == 2, f'The len of keys should be 2 with [input_key, gt_key]. But got {len(keys)}'
input_folder, gt_folder = folders
input_key, gt_key = keys
input_paths = list(scandir(input_folder))
gt_paths = list(scandir(gt_folder))
assert len(input_paths) == len(gt_paths), (f'{input_key} and {gt_key} datasets have different number of images: '
f'{len(input_paths)}, {len(gt_paths)}.')
paths = []
for gt_path in gt_paths:
basename, ext = osp.splitext(osp.basename(gt_path))
input_name = f'{filename_tmpl.format(basename)}{ext}'
input_path = osp.join(input_folder, input_name)
assert input_name in input_paths, f'{input_name} is not in {input_key}_paths.'
gt_path = osp.join(gt_folder, gt_path)
paths.append(dict([(f'{input_key}_path', input_path), (f'{gt_key}_path', gt_path)]))
return paths
def paths_from_folder(folder):
"""Generate paths from folder.
Args:
folder (str): Folder path.
Returns:
list[str]: Returned path list.
"""
paths = list(scandir(folder))
paths = [osp.join(folder, path) for path in paths]
return paths
def paths_from_lmdb(folder):
"""Generate paths from lmdb.
Args:
folder (str): Folder path.
Returns:
list[str]: Returned path list.
"""
if not folder.endswith('.lmdb'):
        raise ValueError(f'Folder {folder} should be in lmdb format.')
with open(osp.join(folder, 'meta_info.txt')) as fin:
paths = [line.split('.')[0] for line in fin]
return paths
def generate_gaussian_kernel(kernel_size=13, sigma=1.6):
"""Generate Gaussian kernel used in `duf_downsample`.
Args:
kernel_size (int): Kernel size. Default: 13.
sigma (float): Sigma of the Gaussian kernel. Default: 1.6.
Returns:
np.array: The Gaussian kernel.
"""
    from scipy.ndimage import gaussian_filter
    kernel = np.zeros((kernel_size, kernel_size))
    # set the element at the middle to one, a dirac delta
    kernel[kernel_size // 2, kernel_size // 2] = 1
    # gaussian-smooth the dirac, resulting in a gaussian filter
    return gaussian_filter(kernel, sigma)
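# Illustrative note (added): the kernel is a Gaussian-blurred Dirac delta, so
# its entries sum to (approximately) 1 and it can be applied directly as a
# normalized blur filter by duf_downsample below.
#
#   k = generate_gaussian_kernel(13, 1.6)
#   # k.shape == (13, 13) and abs(k.sum() - 1.0) < 1e-6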
def duf_downsample(x, kernel_size=13, scale=4):
"""Downsamping with Gaussian kernel used in the DUF official code.
Args:
x (Tensor): Frames to be downsampled, with shape (b, t, c, h, w).
kernel_size (int): Kernel size. Default: 13.
scale (int): Downsampling factor. Supported scale: (2, 3, 4).
Default: 4.
Returns:
Tensor: DUF downsampled frames.
"""
assert scale in (2, 3, 4), f'Only support scale (2, 3, 4), but got {scale}.'
squeeze_flag = False
if x.ndim == 4:
squeeze_flag = True
x = x.unsqueeze(0)
b, t, c, h, w = x.size()
x = x.view(-1, 1, h, w)
pad_w, pad_h = kernel_size // 2 + scale * 2, kernel_size // 2 + scale * 2
x = F.pad(x, (pad_w, pad_w, pad_h, pad_h), 'reflect')
gaussian_filter = generate_gaussian_kernel(kernel_size, 0.4 * scale)
gaussian_filter = torch.from_numpy(gaussian_filter).type_as(x).unsqueeze(0).unsqueeze(0)
x = F.conv2d(x, gaussian_filter, stride=scale)
x = x[:, :, 2:-2, 2:-2]
x = x.view(b, t, c, x.size(2), x.size(3))
if squeeze_flag:
x = x.squeeze(0)
return x
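# Hedged usage sketch (illustrative shapes): 4x DUF-style downsampling of a
# 5-frame clip via Gaussian blur followed by a strided convolution.
#
#   hr = torch.rand(1, 5, 3, 256, 256)  # (b, t, c, h, w)
#   lr = duf_downsample(hr, kernel_size=13, scale=4)
#   # lr.shape == (1, 5, 3, 64, 64)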
people_dynamic_wave.h5 759
indoors_foosball_2.h5 269
simple_wires_2.h5 552
people_dynamic_dancing.h5 1175
people_dynamic_jumping.h5 792
simple_fruit_fast.h5 933
outdoor_jumping_infrared_2.h5 665
simple_carpet_fast.h5 602
people_dynamic_armroll.h5 792
indoors_kitchen_2.h5 635
people_dynamic_sitting.h5 1075
simple_rabbits.h5 742
simple_objects_dynamic.h5 1613
simple_color_keyboard_2.h5 160
simple_objects.h5 570
simple_wires_1.h5 370
simple_color_keyboard_3.h5 528
simple_jenga_1.h5 387
simple_color_keyboard_1.h5 263
simple_carpet.h5 1343
simple_flowers_infrared.h5 1039
simple_jenga_2.h5 848
simple_jenga_destroy.h5 355
simple_fruit.h5 891
indoors_foosball_1.h5 193
indoors_window_autoexposure.h5 706
indoors_corridor.h5 1008
indoors_kitchen_1.h5 422
indoors_kitchen_fast.h5 763
indoors_office.h5 1247
indoors_flying_room.h5 1062
indoors_foosball_3.h5 311
indoors_window.h5 766
indoors_dark_25ms.h5 529
indoors_very_dark_250ms.h5 94
indoors_dark_100ms.h5 266
indoors_very_dark_25ms.h5 603
calib_fluorescent_infrared.h5 263
calib_low.h5 217
calib_outdoor_density.h5 286
calib_outdoor.h5 286
calib_fluorescent_dynamic.h5 286
calib_fluorescent_density_infrared.h5 282
calib_fluorescent.h5 286
calib_outdoor_infrared.h5 286
calib_fluorescent_density.h5 286
calib_outdoor_hdr_infrared.h5 286
calib_outdoor_dynamic.h5 286
calib_low_density.h5 217
driving_city_4.h5 17503
driving_tunnel.h5 1498
driving_city_sun_2.h5 7734
driving_country_sun_2.h5 1180
driving_city_3.h5 6383
driving_country.h5 1088
driving_country_sun_1.h5 1722
driving_city_2.h5 6139
driving_city_5.h5 10541
driving_city_1.h5 9384
driving_city_sun_1.h5 1230
driving_tunnel_sun.h5 795
outdoor_jumping_1_infrared.h5 527
outdoor_shadow_1_infrared.h5 572
outdoor_shadow_density_infrared.h5 446
people_static_wave.h5 763
people_static_wave_clockwise.h5 716
people_static_air_guitar.h5 718
people_static_clap.h5 723
people_static_dancing_multiple_2.h5 321
people_static_dancing.h5 1041
people_static_jumping.h5 707
people_dynamic_wave_clockwise.h5 822
people_static_jogging.h5 731
people_static_wave_counterclockwise.h5 713
people_static_sitting.h5 1332
people_dynamic_dancing_multiple.h5 1844
people_dynamic_wave_counterclockwise.h5 827
people_dynamic_clap.h5 783
people_dynamic_jogging.h5 803
people_static_arm_roll.h5 707
people_static_dancing_multiple_1.h5 667
people_dynamic_air_guitar.h5 886
people_dynamic_selfie.h5 811
people_static_dancing_multiple_3.h5 267
test.h5 90
\ No newline at end of file