Commit 2931a72e authored by gushiqiao

Update docs

parent b5e9e9d3
# Model Structure Introduction
## 📖 Overview
This document introduces the model directory structure of the Lightx2v project and helps users organize model files correctly. With a proper directory layout, models can be started with "one-click" convenience, without manually configuring complex path parameters.
## 🗂️ Model Directory Structure
### Lightx2v Official Model List
View all available models: [Lightx2v Official Model Repository](https://huggingface.co/lightx2v)
### Standard Directory Structure
Using `Wan2.1-I2V-14B-480P-Lightx2v` as an example:
```
Model Root Directory/
├── Wan2.1-I2V-14B-480P-Lightx2v/
│   ├── config.json                                              # Model configuration file
│   ├── Wan2.1_VAE.pth                                           # VAE variational autoencoder
│   ├── models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth # CLIP visual encoder (FP16)
│   ├── models_t5_umt5-xxl-enc-bf16.pth                          # T5 text encoder (BF16)
│   ├── taew2_1.pth                                              # Lightweight VAE (optional)
│   ├── fp8/                                                     # FP8 quantized version (DIT/T5/CLIP)
│   ├── int8/                                                    # INT8 quantized version (DIT/T5/CLIP)
│   ├── original/                                                # Original precision version (DIT)
│   ├── xlm-roberta-large/                                       # Multilingual encoder
│   └── google/                                                  # Other shared resources
```
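A minimal download sketch using the `huggingface-cli` tool (installed via `pip install "huggingface_hub[cli]"`); the repository id and target path below are assumptions based on the directory name above, so confirm them against the official model list:
```bash
# Assumed repo id -- check the official model list for the exact name.
MODEL_ROOT=/mnt/ssd/models
huggingface-cli download lightx2v/Wan2.1-I2V-14B-480P-Lightx2v \
    --local-dir "$MODEL_ROOT/Wan2.1-I2V-14B-480P-Lightx2v"
```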
### 💾 Storage Recommendations
**Strongly recommend storing model files on an SSD (solid-state drive)** to significantly improve model loading speed and inference performance.
**Recommended storage paths**:
```bash
/mnt/ssd/models/ # Independent SSD mount point
/data/ssd/models/ # Data SSD directory
/opt/models/ # System optimization directory
```
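If a script expects models under a different path, the actual files can stay on the SSD and be exposed through a symbolic link; the paths below are examples:
```bash
sudo mkdir -p /mnt/ssd/models     # actual storage on the SSD mount
ln -s /mnt/ssd/models ~/models    # convenient path that points to the SSD
```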
## 🔧 Model File Description
### Core Model Files
Each model directory contains the following core files:
| Filename | Size | Purpose | Required |
|----------|------|---------|----------|
| `config.json` | ~250B | Model configuration file | ✅ Required |
| `Wan2.1_VAE.pth` | ~508MB | VAE variational autoencoder | ✅ Required |
| `models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth` | ~4.77GB | CLIP visual encoder (FP16) | ✅ Required |
| `models_t5_umt5-xxl-enc-bf16.pth` | ~11.4GB | T5 text encoder (BF16) | ✅ Required |
| `taew2_1.pth` | ~22.7MB | Lightweight VAE (optional) | ❌ Optional |
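To confirm that a download completed, the on-disk sizes can be checked against the table above (the model path is an example):
```bash
cd /mnt/ssd/models/Wan2.1-I2V-14B-480P-Lightx2v
ls -lh config.json Wan2.1_VAE.pth \
    models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth \
    models_t5_umt5-xxl-enc-bf16.pth
# The lightweight VAE is optional and may be absent.
ls -lh taew2_1.pth 2>/dev/null || echo "taew2_1.pth not present (optional)"
```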
### Quantized Version Directories
Each model contains multiple quantized versions for different hardware configurations:
```
Model Directory/
├── fp8/ # FP8 quantized version (H100/A100 high-end GPUs)
├── int8/ # INT8 quantized version (general GPUs)
└── original/ # Original precision version (DIT)
```
**💡 Using Full Precision Models**: To use full precision models, simply copy the official weight files to the `original/` directory.
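A sketch of that copy, assuming the official Wan2.1 weights (for example the `Wan-AI/Wan2.1-I2V-14B-480P` release) have already been downloaded; the source path and shard filenames are assumptions:
```bash
SRC=/path/to/Wan2.1-I2V-14B-480P                          # assumed location of the official weights
DST=/mnt/ssd/models/Wan2.1-I2V-14B-480P-Lightx2v/original
mkdir -p "$DST"
# The official DIT weights are typically sharded safetensors files plus an index file.
cp "$SRC"/diffusion_pytorch_model*.safetensors* "$DST"/
```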
## 🚀 Usage Methods
### Gradio Interface Startup
When using the Gradio interface, simply specify the model root directory path:
```bash
# Image to Video (I2V)
python gradio_demo_zh.py \
    --model_path /path/to/Wan2.1-I2V-14B-480P-Lightx2v \
    --model_size 14b \
    --task i2v

# Text to Video (T2V)
python gradio_demo_zh.py \
    --model_path /path/to/Wan2.1-T2V-14B-Lightx2v \
    --model_size 14b \
    --task t2v
```
### Configuration File Startup
When starting from a configuration file, such as [this configuration file](https://github.com/ModelTC/LightX2V/tree/main/configs/offload/disk/wan_i2v_phase_lazy_load_480p.json), the following path settings can be omitted:
- `tiny_vae_path`: no need to specify; the code automatically searches the model directory
- `clip_quantized_ckpt`: no need to specify; the code automatically searches the model directory
- `t5_quantized_ckpt`: no need to specify; the code automatically searches the model directory
**💡 Simplified Configuration**: After organizing model files according to the recommended directory structure, most path configurations can be omitted as the code will handle them automatically.
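For illustration, with quantization enabled in the config (e.g. `"clip_quantized": true` with `"clip_quant_scheme": "int8"`, and the T5 equivalents), the loader falls back to checkpoint paths derived from the model directory, so the explicit path entries can be dropped; the model path below is an example:
```bash
MODEL_PATH=/mnt/ssd/models/Wan2.1-I2V-14B-480P-Lightx2v
# Default locations searched when the path entries are omitted from the config:
ls "$MODEL_PATH/int8/models_t5_umt5-xxl-enc-int8.pth"   # t5_quantized_ckpt
ls "$MODEL_PATH/int8/clip-int8.pth"                     # clip_quantized_ckpt
ls "$MODEL_PATH/taew2_1.pth"                            # tiny_vae_path (lightweight VAE)
```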
### Manual Download
1. Visit the [Hugging Face Model Page](https://huggingface.co/lightx2v)
2. Select the required model version
3. Download all files to the corresponding directory
**💡 Download Recommendations**: Use SSD storage and ensure a stable network connection. For large files, use `git lfs` or a download tool such as `aria2c`, for example as shown below.
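A full clone with `git lfs`; the repository URL is assumed from the model name:
```bash
git lfs install
git clone https://huggingface.co/lightx2v/Wan2.1-I2V-14B-480P-Lightx2v \
    /mnt/ssd/models/Wan2.1-I2V-14B-480P-Lightx2v
```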
## 💡 Best Practices
- **Use SSD Storage**: Significantly improve model loading speed and inference performance
- **Unified Directory Structure**: Facilitate management and switching between different model versions
- **Reserve Sufficient Space**: Keep enough free storage for model files (at least 200GB recommended)
- **Regular Cleanup**: Delete unnecessary model versions to save space
- **Network Optimization**: Use stable network connections and download tools
## 🚨 Common Issues
### Q: Model files are too large and download is slow?
A: Use a mirror source, a download tool such as `aria2c`, or consider a cloud storage service
### Q: Model path not found when starting?
A: Check if the model has been downloaded correctly and verify the path configuration
### Q: How to switch between different model versions?
A: Change the model path parameter in the startup command; multiple model instances can also be run simultaneously
### Q: Model loading is very slow?
A: Make sure models are stored on an SSD, enable lazy loading, and use a quantized model version
### Q: How to set paths in configuration files?
A: After organizing according to the recommended directory structure, most path configurations can be omitted as the code will handle them automatically
## 📚 Related Links
- [Lightx2v Official Model Repository](https://huggingface.co/lightx2v)
- [Gradio Deployment Guide](./deploy_gradio.md)
---
Through proper model file organization, users can enjoy the convenience of "one-click startup" without manually configuring complex path parameters. It is recommended to organize model files according to the structure recommended in this document and fully utilize the advantages of SSD storage.
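The code changes in this commit implement the automatic path lookup described above. In `WanModel`, when the DIT is quantized and `dit_quantized_ckpt` is not set, the checkpoint path now defaults to the `<model_path>/<quant_scheme>/` subdirectory derived from `mm_type`: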
 import os
-import sys
 import torch
 import glob
 import json
@@ -37,7 +36,11 @@ class WanModel:
         self.clean_cuda_cache = self.config.get("clean_cuda_cache", False)
         self.dit_quantized = self.config.mm_config.get("mm_type", "Default") != "Default"
-        self.dit_quantized_ckpt = self.config.get("dit_quantized_ckpt", None)
+        if self.dit_quantized:
+            dit_quant_scheme = self.config.mm_config.get("mm_type").split("-")[1]
+            self.dit_quantized_ckpt = self.config.get("dit_quantized_ckpt", os.path.join(model_path, dit_quant_scheme))
+        else:
+            self.dit_quantized_ckpt = None
         self.weight_auto_quant = self.config.mm_config.get("weight_auto_quant", False)
         if self.dit_quantized:
             assert self.weight_auto_quant or self.dit_quantized_ckpt is not None
@@ -143,7 +146,14 @@ class WanModel:
     def _init_weights(self, weight_dict=None):
         use_bf16 = GET_DTYPE() == "BF16"
         # Some layers run with float32 to achieve high accuracy
-        skip_bf16 = {"norm", "embedding", "modulation", "time", "img_emb.proj.0", "img_emb.proj.4"}
+        skip_bf16 = {
+            "norm",
+            "embedding",
+            "modulation",
+            "time",
+            "img_emb.proj.0",
+            "img_emb.proj.4",
+        }
         if weight_dict is None:
             if not self.dit_quantized or self.weight_auto_quant:
                 self.original_weight_dict = self._load_ckpt(use_bf16, skip_bf16)
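In `WanRunner`, the CLIP and T5 loaders similarly derive default quantized checkpoint paths (`<model_path>/<scheme>/clip-<scheme>.pth` and `<model_path>/<scheme>/models_t5_umt5-xxl-enc-<scheme>.pth`) when `clip_quantized_ckpt` / `t5_quantized_ckpt` are omitted from the config: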
@@ -7,7 +7,9 @@ from PIL import Image
 from lightx2v.utils.registry_factory import RUNNER_REGISTER
 from lightx2v.models.runners.default_runner import DefaultRunner
 from lightx2v.models.schedulers.wan.scheduler import WanScheduler
-from lightx2v.models.schedulers.wan.changing_resolution.scheduler import WanScheduler4ChangingResolution
+from lightx2v.models.schedulers.wan.changing_resolution.scheduler import (
+    WanScheduler4ChangingResolution,
+)
 from lightx2v.models.schedulers.wan.feature_caching.scheduler import (
     WanSchedulerTeaCaching,
     WanSchedulerTaylorCaching,
@@ -50,6 +52,22 @@ class WanRunner(DefaultRunner):
     def load_image_encoder(self):
         image_encoder = None
         if self.config.task == "i2v":
+            # quant_config
+            clip_quantized = self.config.get("clip_quantized", False)
+            if clip_quantized:
+                clip_quant_scheme = self.config.get("clip_quant_scheme", None)
+                assert clip_quant_scheme is not None
+                clip_quantized_ckpt = self.config.get(
+                    "clip_quantized_ckpt",
+                    os.path.join(
+                        os.path.join(self.config.model_path, clip_quant_scheme),
+                        f"clip-{clip_quant_scheme}.pth",
+                    ),
+                )
+            else:
+                clip_quantized_ckpt = None
+                clip_quant_scheme = None
             image_encoder = CLIPModel(
                 dtype=torch.float16,
                 device=self.init_device,
@@ -57,18 +75,36 @@ class WanRunner(DefaultRunner):
                     self.config.model_path,
                     "models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth",
                 ),
-                clip_quantized=self.config.get("clip_quantized", False),
-                clip_quantized_ckpt=self.config.get("clip_quantized_ckpt", None),
-                quant_scheme=self.config.get("clip_quant_scheme", None),
+                clip_quantized=clip_quantized,
+                clip_quantized_ckpt=clip_quantized_ckpt,
+                quant_scheme=clip_quant_scheme,
             )
         return image_encoder

     def load_text_encoder(self):
+        # offload config
         t5_offload = self.config.get("t5_cpu_offload", False)
         if t5_offload:
             t5_device = torch.device("cpu")
         else:
             t5_device = torch.device("cuda")
+
+        # quant_config
+        t5_quantized = self.config.get("t5_quantized", False)
+        if t5_quantized:
+            t5_quant_scheme = self.config.get("t5_quant_scheme", None)
+            assert t5_quant_scheme is not None
+            t5_quantized_ckpt = self.config.get(
+                "t5_quantized_ckpt",
+                os.path.join(
+                    os.path.join(self.config.model_path, t5_quant_scheme),
+                    f"models_t5_umt5-xxl-enc-{t5_quant_scheme}.pth",
+                ),
+            )
+        else:
+            t5_quant_scheme = None
+            t5_quantized_ckpt = None
+
         text_encoder = T5EncoderModel(
             text_len=self.config["text_len"],
             dtype=torch.bfloat16,
@@ -78,9 +114,9 @@ class WanRunner(DefaultRunner):
             shard_fn=None,
             cpu_offload=t5_offload,
             offload_granularity=self.config.get("t5_offload_granularity", "model"),
-            t5_quantized=self.config.get("t5_quantized", False),
-            t5_quantized_ckpt=self.config.get("t5_quantized_ckpt", None),
-            quant_scheme=self.config.get("t5_quant_scheme", None),
+            t5_quantized=t5_quantized,
+            t5_quantized_ckpt=t5_quantized_ckpt,
+            quant_scheme=t5_quant_scheme,
         )
         text_encoders = [text_encoder]
         return text_encoders