Update README

22d76282 · yangzhong · da3124dd · 22d76282
Commit 22d76282 authored Dec 17, 2025 by yangzhong

Showing with 89 additions and 21 deletions

README.md README.md +89 -21

--- a/README.md
+++ b/README.md
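-xformers is required, so the image used is image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.3.0-ubuntu22.04-dtk24.04.3-py3.10
+# STAR
-The dtk25.04.1 and dtk25.04.2 images do not include an adapted xformers build
+## Paper
+Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution
+https://arxiv.org/pdf/2501.02976
+## Model Architecture
+STAR is a novel framework for real-world video super-resolution that aims to improve the detail sharpness and temporal coherence of videos. It introduces text-to-video (T2V) diffusion models to address the limitations of traditional methods on complex degradations such as noise and blur, and it proposes a new spatial-temporal augmentation framework that significantly improves the spatial detail and temporal coherence of the restored video.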
+![model](C:\Users\yang\Downloads\model.png)
+![mammoth_github](https://developer.sourcefind.cn/codes/modelzoo/MAmmoTH/-/raw/master/picture/mammoth_github.png?inline=false)
+## Environment Setup
+### Docker (Method 1)
 ```
-# Pull the image
+Pull the Docker image from SourceFind (光源):
 docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.3.0-ubuntu22.04-dtk24.04.3-py3.10
-# Create the container
+Create and start the container:
 docker run -it --network=host --name=dtk24043_torch23 -v /opt/hyhal:/opt/hyhal:ro -v /usr/local/hyhal:/usr/local/hyhal:ro -v /public:/public:ro --privileged --device=/dev/kfd --device=/dev/dri --ipc=host --shm-size=128G  --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -u root --ulimit stack=-1:-1 --ulimit memlock=-1:-1 image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.3.0-ubuntu22.04-dtk24.04.3-py3.10
+docker exec -it dtk24043_torch23 /bin/bash
+Install the dependencies:
+cd STAR/
+pip install -r requirements.txt -i http://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com
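+# Install the dependencies missing from the environment and comment out those already present; open-clip-torch and diffusers must be installed at the pinned versions!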
+sudo apt-get update && sudo apt-get install ffmpeg libsm6 libxext6  -y
 ```
+### Dockerfile (Method 2)
 ```
-git clone https://github.com/NJU-PCALab/STAR.git
+docker build --no-cache -t star:latest .
-cd STAR
+docker run -dit --network=host --name=STAR --privileged --device=/dev/kfd --device=/dev/dri --ipc=host --shm-size=16G  --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -u root --ulimit stack=-1:-1 --ulimit memlock=-1:-1 -v /opt/hyhal/:/opt/hyhal/:ro -v /usr/local/hyhal:/usr/local/hyhal:ro star:latest
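-pip install -r requirements.txt    # Install the dependencies missing from the environment and comment out those already present; open-clip-torch must be installed at the pinned version!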
+docker exec -it STAR /bin/bash
-# Install diffusers
+Install the dependencies:
-git clone -b v0.30.0-release http://developer.sourcefind.cn/codes/OpenDAS/diffusers.git
+cd STAR/
-cd diffusers/
+pip install -r requirements.txt -i http://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com
-python3 setup.py install
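+# Install the dependencies missing from the environment and comment out those already present; open-clip-torch and diffusers must be installed at the pinned versions!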
 sudo apt-get update && sudo apt-get install ffmpeg libsm6 libxext6  -y
 ```
-#### Step 1: Download the pretrained models from [HuggingFace](https://huggingface.co/SherryX/STAR).
+### Anaconda (Method 3)
-We provide two versions of the I2VGen-XL-based model: `heavy_deg.pt` for heavily degraded videos and `light_deg.pt` for lightly degraded videos (e.g., low-resolution videos downloaded from video websites).
+```
+1. Create a conda virtual environment:
+conda create -n STAR python=3.10
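+2. The toolkits and deep learning libraries this project needs on DCU accelerators can be downloaded and installed from the 光合 developer community: https://developer.hpccube.com/tool/
+DTK driver: dtk24.04.3
+python: python3.10
+torch: 2.3.0
+Tips: the versions of the DCU-related packages above (DTK, python, torch) must correspond to one another exactly; torch 2.1, 2.3, or 2.4 can all be used
+3. Install the remaining ordinary libraries according to requirements.txt: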
+pip install -r requirements.txt -i http://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com
+```
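The version note above can be checked with a short script before the requirements are installed. This is a minimal sketch, not part of the commit: `version_matches` is a hypothetical helper, and the expected prefixes simply restate the python 3.10 and torch 2.3.0 values listed above. Adjust them if you use one of the other torch versions.

```shell
# Hypothetical helper (not part of the STAR repository):
# succeeds when the version string in $2 starts with the prefix in $1.
version_matches() {
    case "$2" in
        "$1"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Query the active environment; fall back to "missing" if the import fails.
py_ver="$(python3 -c 'import platform; print(platform.python_version())' 2>/dev/null || echo missing)"
torch_ver="$(python3 -c 'import torch; print(torch.__version__)' 2>/dev/null || echo missing)"

# Report only mismatches so a clean environment prints nothing.
version_matches "3.10." "$py_ver" || echo "python: expected 3.10.x, found $py_ver"
version_matches "2.3.0" "$torch_ver" || echo "torch: expected 2.3.0, found $torch_ver"
```

A prefix comparison is used because vendor builds of torch often append a local suffix to the version string.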
-You can put the weight into `pretrained_weight/`.
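+## Test Dataset
-#### Step 2: Prepare the test data (already included in the PR, so skip this step)
+Put the test videos in input/video/. For the prompt there are three options: 1. No prompt. 2. An automatically generated prompt (for example, using Pllava). 3. A manually written prompt. Put the txt files in input/text/.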
-You can put the testing videos in the `input/video/`.
+Change the paths in video_super_resolution/scripts/inference_sr.sh to your corresponding local paths, including video_folder_path, txt_file_path, model_path, and save_dir.
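Before editing the script, it can help to confirm that the four paths actually exist. The following is a sketch only: `check_inference_paths` is a hypothetical function that is not part of the repository, and it assumes that an empty `txt_file_path` stands for the "no prompt" option, which the source does not confirm.

```shell
# Hypothetical pre-flight check; not part of the STAR repository.
# Arguments: video_folder_path txt_file_path model_path save_dir
# Prints one line per problem and returns the number of problems found.
check_inference_paths() {
    video_folder_path="$1"
    txt_file_path="$2"
    model_path="$3"
    save_dir="$4"
    problems=0

    [ -d "$video_folder_path" ] || {
        echo "missing video folder: $video_folder_path"
        problems=$((problems + 1))
    }
    # Assumption: an empty prompt path means "no prompt", so it is skipped.
    if [ -n "$txt_file_path" ] && [ ! -e "$txt_file_path" ]; then
        echo "missing prompt file: $txt_file_path"
        problems=$((problems + 1))
    fi
    [ -f "$model_path" ] || {
        echo "missing model weights: $model_path"
        problems=$((problems + 1))
    }
    # Create the output directory if it does not exist yet.
    mkdir -p "$save_dir" || {
        echo "cannot create save dir: $save_dir"
        problems=$((problems + 1))
    }
    return "$problems"
}

# Example call (prompt.txt and results/ are placeholder names):
# check_inference_paths input/video input/text/prompt.txt pretrained_weight/light_deg.pt results
```

The same values can then be copied into `video_super_resolution/scripts/inference_sr.sh`.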
-As for the prompt, there are three options: 1. No prompt. 2. Automatically generate a prompt (e.g., [using Pllava](https://github.com/hpcaitech/Open-Sora/tree/main/tools/caption#pllava-captioning)). 3. Manually write the prompt. You can put the txt file in the `input/text/`.
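+## Pretrained Models
-#### Step 3: Change the paths to your own local paths
+We provide two versions of the I2VGen-XL-based model: heavy_deg.pt for heavily degraded videos and light_deg.pt for lightly degraded videos (for example, low-resolution videos downloaded from video websites).
+The models are available from [HuggingFace](https://huggingface.co/SherryX/STAR).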
-You need to change the paths in `video_super_resolution/scripts/inference_sr.sh` to your local corresponding paths, including `video_folder_path`, `txt_file_path`, `model_path`, and `save_dir`.
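+Put the weight files in the pretrained_weight/ directory.
-#### Step 4: Run the inference command
+This project provides a quick download script for HuggingFace; run the following command to download the weight files to the local `./pretrained_weight/` directory: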
+```
+python downmodel.py
+```
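After the download finishes, a quick check can confirm that both weight files are present. This sketch assumes the script saves the files directly under the target directory with the names `heavy_deg.pt` and `light_deg.pt` mentioned above; `list_missing_weights` is a hypothetical helper, not something the repository provides.

```shell
# Hypothetical helper; not part of the STAR repository.
# Prints the name of each expected weight file that is absent from the
# given directory and returns non-zero if any file is missing.
list_missing_weights() {
    weight_dir="$1"
    missing=0
    for name in heavy_deg.pt light_deg.pt; do
        if [ ! -f "$weight_dir/$name" ]; then
            echo "$name"
            missing=1
        fi
    done
    return "$missing"
}

# Example call against the default download location:
# list_missing_weights ./pretrained_weight
```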
+## Inference
+#### Single-node, single-GPU inference
 ```
 bash video_super_resolution/scripts/inference_sr.sh
 ```
\ No newline at end of file
+## result
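+None
+## Accuracy
+None
+### Application Scenarios
+### Algorithm Category
+Text-to-Video
+### Popular Application Industries
+Healthcare, education, scientific research, finance
+## Source Repository and Issue Feedback
+- https://developer.sourcefind.cn/codes/modelzoo/star
+## References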
+- https://github.com/NJU-PCALab/STAR