# Model Conversion Tool

This is a powerful model weight conversion tool that supports format conversion, quantization, LoRA merging, and more.

## Key Features

- **Format conversion**: convert between PyTorch (.pth) and SafeTensors (.safetensors) formats
- **Model quantization**: INT8 and FP8 quantization to significantly reduce model size
- **Architecture conversion**: convert between the LightX2V and Diffusers layouts
- **LoRA merging**: load and merge LoRAs in multiple formats
- **Multi-model support**: Wan DiT, Qwen Image DiT, T5, CLIP, and more
- **Flexible saving**: single-file, per-block, and chunked saving modes
- **Parallel processing**: parallel acceleration for converting large models

## Supported Model Types

- `wan_dit`: Wan DiT series models (default)
- `wan_animate_dit`: Wan Animate DiT models
- `qwen_image_dit`: Qwen Image DiT models
- `wan_t5`: Wan T5 text encoder
- `wan_clip`: Wan CLIP vision encoder

## Core Parameters

### Basic Parameters

- `-s, --source`: input path (file or directory)
- `-o, --output`: output directory path
- `-o_e, --output_ext`: output format, either `.pth` or `.safetensors` (default)
- `-o_n, --output_name`: output file name (default: `converted`)
- `-t, --model_type`: model type (default: `wan_dit`)

### Architecture Conversion Parameters

- `-d, --direction`: conversion direction
  - `None`: no architecture conversion (default)
  - `forward`: LightX2V → Diffusers
  - `backward`: Diffusers → LightX2V

### Quantization Parameters

- `--quantized`: enable quantization
- `--bits`: quantization bit width; currently only 8-bit is supported
- `--linear_dtype`: quantization dtype for linear layers
  - `torch.int8`: INT8 quantization
  - `torch.float8_e4m3fn`: FP8 quantization
- `--non_linear_dtype`: dtype for non-linear layers
  - `torch.bfloat16`: BF16
  - `torch.float16`: FP16
  - `torch.float32`: FP32 (default)
- `--device`: device used for quantization, `cpu` or `cuda` (default)
- `--comfyui_mode`: ComfyUI compatibility mode
- `--full_quantized`: fully quantized mode (only effective in ComfyUI mode)
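
To give an intuition for what `--quantized --linear_dtype torch.int8` does to a linear weight, here is a minimal numpy sketch of symmetric per-output-channel INT8 quantization. This is an illustration of the general technique only; the tool's actual scale layout and storage format are not specified here.

```python
import numpy as np

def quantize_int8(weight: np.ndarray):
    """Symmetric per-output-channel INT8 quantization of a linear weight.

    Returns the INT8 tensor plus one float scale per output channel,
    so that weight ≈ q.astype(float) * scale[:, None].
    """
    # The largest magnitude in each output row determines that row's scale.
    max_abs = np.abs(weight).max(axis=1, keepdims=True)
    scale = max_abs / 127.0
    q = np.clip(np.round(weight / scale), -127, 127).astype(np.int8)
    return q, scale.squeeze(1)

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 8)).astype(np.float32)
q, s = quantize_int8(w)
# Dequantize to check the reconstruction error stays small.
w_hat = q.astype(np.float32) * s[:, None]
print(np.abs(w - w_hat).max())
```

The model shrinks because each weight takes 1 byte instead of 2 or 4, at the cost of a small per-element rounding error bounded by half a scale step.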

### LoRA Parameters

- `--lora_path`: LoRA file path(s); multiple paths are separated by spaces
- `--lora_strength`: LoRA strength factor(s); multiple values allowed (default: 1.0)
- `--alpha`: LoRA alpha parameter(s); multiple values allowed
- `--lora_key_convert`: LoRA key conversion mode
  - `auto`: auto-detect (default)
  - `same`: use the original key names
  - `convert`: apply the same conversion as the model
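
Conceptually, merging a LoRA into a base weight applies the standard low-rank update, which is where `--lora_strength` and `--alpha` enter. A minimal numpy sketch (the actual internals of `converter.py` may differ):

```python
import numpy as np

def merge_lora(weight, lora_up, lora_down, strength=1.0, alpha=None):
    """Merge one LoRA pair into a base weight: W' = W + s * (alpha / rank) * up @ down."""
    rank = lora_down.shape[0]  # lora_down (a.k.a. lora_A): (rank, in_features)
    scale = (alpha / rank) if alpha is not None else 1.0
    return weight + strength * scale * (lora_up @ lora_down)

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 16)).astype(np.float32)
up = rng.standard_normal((16, 4)).astype(np.float32)    # lora_up / lora_B
down = rng.standard_normal((4, 16)).astype(np.float32)  # lora_down / lora_A
merged = merge_lora(W, up, down, strength=0.8, alpha=4.0)
```

With `strength=0` the base weight is returned unchanged, and when `alpha` equals the rank the alpha scaling is a no-op.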

### Saving Parameters

- `--single_file`: save as a single file (note: large models consume a lot of memory)
- `-b, --save_by_block`: save by block (recommended for backward conversion)
- `-c, --chunk-size`: chunk size (default: 100; 0 means no chunking)
- `--copy_no_weight_files`: copy non-weight files from the source directory
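
As a rough illustration of what `--chunk-size` controls, chunked saving amounts to splitting the state dict into fixed-size groups of tensors, one output file per group. This is a sketch of the idea only, not the tool's actual implementation:

```python
def chunk_state_dict(state_dict: dict, chunk_size: int):
    """Split a state dict into groups of at most `chunk_size` tensors.

    A chunk_size of 0 (or less) disables chunking: everything goes in one group.
    """
    if chunk_size <= 0:
        return [dict(state_dict)]
    keys = list(state_dict)
    return [
        {k: state_dict[k] for k in keys[i : i + chunk_size]}
        for i in range(0, len(keys), chunk_size)
    ]

# 250 entries with chunk size 100 -> groups of 100, 100, and 50 entries.
fake = {f"w{i}": i for i in range(250)}
print([len(c) for c in chunk_state_dict(fake, 100)])  # [100, 100, 50]
```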

### Performance Parameters

- `--parallel`: enable parallel processing (default: True)
- `--no-parallel`: disable parallel processing

## Supported LoRA Formats

The tool automatically detects and supports the following LoRA formats:

1. **Standard**: `{key}.lora_up.weight` and `{key}.lora_down.weight`
2. **Diffusers**: `{key}_lora.up.weight` and `{key}_lora.down.weight`
3. **Diffusers V2**: `{key}.lora_B.weight` and `{key}.lora_A.weight`
4. **Diffusers V3**: `{key}.lora.up.weight` and `{key}.lora.down.weight`
5. **Mochi**: `{key}.lora_B` and `{key}.lora_A` (no `.weight` suffix)
6. **Transformers**: `{key}.lora_linear_layer.up.weight` and `{key}.lora_linear_layer.down.weight`
7. **Qwen**: `{key}.lora_B.default.weight` and `{key}.lora_A.default.weight`

Diff formats are also supported:
- `.diff`: weight delta
- `.diff_b`: bias delta
- `.diff_m`: modulation delta
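
Auto-detection of these naming schemes can be done by probing the state-dict keys for each format's signature suffix, from most to least specific. The sketch below is a simplified heuristic for illustration; the tool's real detection logic is not shown here:

```python
def detect_lora_format(keys):
    """Guess the LoRA naming scheme from a list of state-dict keys."""
    joined = " ".join(keys)
    # Check more specific patterns first so e.g. ".lora_B.weight"
    # is not misread as the suffix-less Mochi ".lora_B".
    if ".lora_linear_layer.up.weight" in joined:
        return "Transformers"
    if ".lora_B.default.weight" in joined:
        return "Qwen"
    if ".lora_B.weight" in joined:
        return "Diffusers V2"
    if ".lora.up.weight" in joined:
        return "Diffusers V3"
    if "_lora.up.weight" in joined:
        return "Diffusers"
    if ".lora_up.weight" in joined:
        return "Standard"
    if ".lora_B" in joined:
        return "Mochi"
    return "unknown"

print(detect_lora_format(["blocks.0.attn.q.lora_up.weight",
                          "blocks.0.attn.q.lora_down.weight"]))  # Standard
```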

## Usage Examples

### 1. Model Quantization

#### 1.1 Quantize Wan DiT to INT8

**Multiple safetensors, stored by DiT block**
```bash
python converter.py \
    --source /path/to/Wan2.1-I2V-14B-480P/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name wan_int8 \
    --linear_dtype torch.int8 \
    --model_type wan_dit \
    --quantized \
    --save_by_block
```

**Single safetensors file**

```bash
python converter.py \
    --source /path/to/Wan2.1-I2V-14B-480P/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name wan2.1_i2v_480p_int8_lightx2v \
    --linear_dtype torch.int8 \
    --model_type wan_dit \
    --quantized \
    --single_file
```

#### 1.2 Quantize Wan DiT to FP8

**Multiple safetensors, stored by DiT block**

```bash
python converter.py \
    --source /path/to/Wan2.1-I2V-14B-480P/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name wan_fp8 \
    --linear_dtype torch.float8_e4m3fn \
    --non_linear_dtype torch.bfloat16 \
    --model_type wan_dit \
    --quantized \
    --save_by_block
```

**Single safetensors file**

```bash
python converter.py \
    --source /path/to/Wan2.1-I2V-14B-480P/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name wan2.1_i2v_480p_scaled_fp8_e4m3_lightx2v \
    --linear_dtype torch.float8_e4m3fn \
    --non_linear_dtype torch.bfloat16 \
    --model_type wan_dit \
    --quantized \
    --single_file
```

**ComfyUI scaled_fp8 format**

```bash
python converter.py \
    --source /path/to/Wan2.1-I2V-14B-480P/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name wan2.1_i2v_480p_scaled_fp8_e4m3_lightx2v_comfyui \
    --linear_dtype torch.float8_e4m3fn \
    --non_linear_dtype torch.bfloat16 \
    --model_type wan_dit \
    --quantized \
    --single_file \
    --comfyui_mode
```

**ComfyUI fully FP8 format**

```bash
python converter.py \
    --source /path/to/Wan2.1-I2V-14B-480P/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name wan2.1_i2v_480p_scaled_fp8_e4m3_lightx2v_comfyui \
    --linear_dtype torch.float8_e4m3fn \
    --non_linear_dtype torch.bfloat16 \
    --model_type wan_dit \
    --quantized \
    --single_file \
    --comfyui_mode \
    --full_quantized
```

> **Tip**: for other DiT models, simply switch the `--model_type` parameter

#### 1.3 T5 Encoder Quantization

**INT8 quantization**
```bash
python converter.py \
    --source /path/to/models_t5_umt5-xxl-enc-bf16.pth \
    --output /path/to/output \
    --output_ext .pth \
    --output_name models_t5_umt5-xxl-enc-int8 \
    --linear_dtype torch.int8 \
    --non_linear_dtype torch.bfloat16 \
    --model_type wan_t5 \
    --quantized
```

**FP8 quantization**

```bash
python converter.py \
    --source /path/to/models_t5_umt5-xxl-enc-bf16.pth \
    --output /path/to/output \
    --output_ext .pth \
    --output_name models_t5_umt5-xxl-enc-fp8 \
    --linear_dtype torch.float8_e4m3fn \
    --non_linear_dtype torch.bfloat16 \
    --model_type wan_t5 \
    --quantized
```

#### 1.4 CLIP Encoder Quantization

**INT8 quantization**
```bash
python converter.py \
    --source /path/to/models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth \
    --output /path/to/output \
    --output_ext .pth \
    --output_name models_clip_open-clip-xlm-roberta-large-vit-huge-14-int8 \
    --linear_dtype torch.int8 \
    --non_linear_dtype torch.float16 \
    --model_type wan_clip \
    --quantized
```

**FP8 quantization**
```bash
python converter.py \
    --source /path/to/models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth \
    --output /path/to/output \
    --output_ext .pth \
    --output_name models_clip_open-clip-xlm-roberta-large-vit-huge-14-fp8 \
    --linear_dtype torch.float8_e4m3fn \
    --non_linear_dtype torch.float16 \
    --model_type wan_clip \
    --quantized
```

### 2. LoRA Merging

#### 2.1 Merge a Single LoRA

```bash
python converter.py \
    --source /path/to/base_model/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name merged_model \
    --model_type wan_dit \
    --lora_path /path/to/lora.safetensors \
    --lora_strength 1.0 \
    --single_file
```

#### 2.2 Merge Multiple LoRAs

```bash
python converter.py \
    --source /path/to/base_model/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name merged_model \
    --model_type wan_dit \
    --lora_path /path/to/lora1.safetensors /path/to/lora2.safetensors \
    --lora_strength 1.0 0.8 \
    --single_file
```

#### 2.3 Quantize After LoRA Merging

**LoRA merge → FP8 quantization**
```bash
python converter.py \
    --source /path/to/base_model/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name merged_quantized \
    --model_type wan_dit \
    --lora_path /path/to/lora.safetensors \
    --lora_strength 1.0 \
    --quantized \
    --linear_dtype torch.float8_e4m3fn \
    --single_file
```

**LoRA merge → ComfyUI scaled_fp8**

```bash
python converter.py \
    --source /path/to/base_model/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name merged_quantized \
    --model_type wan_dit \
    --lora_path /path/to/lora.safetensors \
    --lora_strength 1.0 \
    --quantized \
    --linear_dtype torch.float8_e4m3fn \
    --single_file \
    --comfyui_mode
```

**LoRA merge → ComfyUI fully FP8**
```bash
python converter.py \
    --source /path/to/base_model/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name merged_quantized \
    --model_type wan_dit \
    --lora_path /path/to/lora.safetensors \
    --lora_strength 1.0 \
    --quantized \
    --linear_dtype torch.float8_e4m3fn \
    --single_file \
    --comfyui_mode \
    --full_quantized
```

#### 2.4 LoRA Key Conversion Modes

**Auto-detect mode (recommended)**
```bash
python converter.py \
    --source /path/to/model/ \
    --output /path/to/output \
    --lora_path /path/to/lora.safetensors \
    --lora_key_convert auto \
    --single_file
```

**Use original key names (the LoRA is already in the target format)**
```bash
python converter.py \
    --source /path/to/model/ \
    --output /path/to/output \
    --direction forward \
    --lora_path /path/to/lora.safetensors \
    --lora_key_convert same \
    --single_file
```

**Apply conversion (the LoRA uses the source format)**
```bash
python converter.py \
    --source /path/to/model/ \
    --output /path/to/output \
    --direction forward \
    --lora_path /path/to/lora.safetensors \
    --lora_key_convert convert \
    --single_file
```

### 3. Architecture Conversion

#### 3.1 LightX2V → Diffusers

```bash
python converter.py \
    --source /path/to/Wan2.1-I2V-14B-480P \
    --output /path/to/Wan2.1-I2V-14B-480P-Diffusers \
    --output_ext .safetensors \
    --model_type wan_dit \
    --direction forward \
    --chunk-size 100
```

#### 3.2 Diffusers → LightX2V

```bash
python converter.py \
    --source /path/to/Wan2.1-I2V-14B-480P-Diffusers \
    --output /path/to/Wan2.1-I2V-14B-480P \
    --output_ext .safetensors \
    --model_type wan_dit \
    --direction backward \
    --save_by_block
```

### 4. Format Conversion

#### 4.1 .pth → .safetensors

```bash
python converter.py \
    --source /path/to/model.pth \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name model \
    --single_file
```

#### 4.2 Multiple .safetensors → Single File

```bash
python converter.py \
    --source /path/to/model_directory/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name merged_model \
    --single_file
```