# Model Conversion Tool

A powerful model weight conversion tool that supports format conversion, quantization, LoRA fusion, and more.

## Key Features

- **Format conversion**: convert between PyTorch (`.pth`) and SafeTensors (`.safetensors`)
- **Model quantization**: INT8 and FP8 quantization to significantly reduce model size
- **Architecture conversion**: convert between LightX2V and Diffusers layouts
- **LoRA fusion**: load and merge LoRAs in multiple formats
- **Multi-model support**: Wan DiT, Qwen Image DiT, T5, CLIP, and more
- **Flexible saving**: single-file, per-block, and chunked saving modes
- **Parallel processing**: parallel acceleration for large-model conversion

## Supported Model Types

- `hunyuan_dit`: Hunyuan DiT 1.5 model
- `wan_dit`: Wan DiT series models (default)
- `wan_animate_dit`: Wan Animate DiT model
- `qwen_image_dit`: Qwen Image DiT model
- `wan_t5`: Wan T5 text encoder
- `wan_clip`: Wan CLIP vision encoder
- `qwen25vl_llm`: Qwen2.5-VL language model (see example 1.5)

## Core Parameters

### Basic Parameters

- `-s, --source`: input path (file or directory)
- `-o, --output`: output directory path
- `-o_e, --output_ext`: output format, `.pth` or `.safetensors` (default)
- `-o_n, --output_name`: output file name (default: `converted`)
- `-t, --model_type`: model type (default: `wan_dit`)

### Architecture Conversion Parameters

- `-d, --direction`: conversion direction
  - `None`: no architecture conversion (default)
  - `forward`: LightX2V → Diffusers
  - `backward`: Diffusers → LightX2V
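
The direction flag is essentially a key-renaming pass over the state dict. Below is a minimal sketch of how such a direction-aware rename works; the prefix pairs are hypothetical placeholders for illustration only (the real mapping table lives inside converter.py):

```python
# Hypothetical prefix pairs; the actual LightX2V <-> Diffusers mapping
# implemented in converter.py is far more detailed.
PREFIX_PAIRS = [
    ("blocks.", "transformer_blocks."),
    ("text_embedding.", "condition_embedder.text_embedder."),
]

def convert_key(key: str, direction: str) -> str:
    """Rename one state-dict key; 'forward' = LightX2V -> Diffusers,
    'backward' applies the inverse rename."""
    for lx2v, diffusers in PREFIX_PAIRS:
        src, dst = (lx2v, diffusers) if direction == "forward" else (diffusers, lx2v)
        if key.startswith(src):
            return dst + key[len(src):]
    return key  # keys without a rule pass through unchanged
```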

### Quantization Parameters

- `--quantized`: enable quantization
- `--bits`: quantization bit width; currently only 8-bit is supported
- `--linear_dtype`: quantization dtype for linear layers
  - `torch.int8`: INT8 quantization
  - `torch.float8_e4m3fn`: FP8 quantization
- `--non_linear_dtype`: dtype for non-linear layers
  - `torch.bfloat16`: BF16
  - `torch.float16`: FP16
  - `torch.float32`: FP32 (default)
- `--device`: device used for quantization, `cpu` or `cuda` (default)
- `--comfyui_mode`: ComfyUI compatibility mode
- `--full_quantized`: fully quantized mode (effective only in ComfyUI mode)
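
For intuition, symmetric per-channel INT8 quantization maps each output channel's weights onto [-127, 127] using one scale per channel. A minimal pure-Python sketch; the tool's actual scheme (which also covers FP8 and scale storage) may differ:

```python
def quantize_int8_per_channel(rows):
    # Symmetric per-channel INT8: one scale per output channel (row),
    # scale = max|w| / 127, q = clamp(round(w / scale), -128, 127).
    quantized, scales = [], []
    for row in rows:
        scale = max(abs(w) for w in row) / 127.0 or 1.0  # guard all-zero rows
        scales.append(scale)
        quantized.append([max(-128, min(127, round(w / scale))) for w in row])
    return quantized, scales

def dequantize(quantized, scales):
    # Recover approximate weights: w ~ q * scale
    return [[q * s for q in row] for row, s in zip(quantized, scales)]
```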

### LoRA Parameters

- `--lora_path`: LoRA file path(s); multiple paths are space-separated
- `--lora_strength`: LoRA strength factor(s), one per LoRA (default: 1.0)
- `--alpha`: LoRA alpha parameter(s), one per LoRA
- `--lora_key_convert`: LoRA key conversion mode
  - `auto`: auto-detect (default)
  - `same`: use the original key names
  - `convert`: apply the same conversion as the model

### Saving Parameters

- `--single_file`: save as a single file (note: large models can consume a lot of memory)
- `-b, --save_by_block`: save per block (recommended for backward conversion)
- `-c, --chunk-size`: chunk size (default: 100; 0 disables chunking)
- `--copy_no_weight_files`: copy non-weight files from the source directory
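
Per-block saving works by bucketing state-dict keys by their transformer-block index so each block can be written to its own file. A minimal sketch; the `blocks.N.` key convention is an assumption borrowed from Wan DiT naming:

```python
from collections import defaultdict

def group_keys_by_block(keys):
    # Bucket keys like "blocks.12.attn.q.weight" by block index;
    # everything else lands in a shared "misc" bucket.
    buckets = defaultdict(list)
    for key in keys:
        parts = key.split(".")
        if parts[0] == "blocks" and len(parts) > 1 and parts[1].isdigit():
            buckets[f"block_{int(parts[1]):04d}"].append(key)
        else:
            buckets["misc"].append(key)
    return dict(buckets)
```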

### Performance Parameters

- `--parallel`: enable parallel processing (default: True)
- `--no-parallel`: disable parallel processing

## Supported LoRA Formats

The tool automatically detects and supports the following LoRA formats:

1. **Standard**: `{key}.lora_up.weight` / `{key}.lora_down.weight`
2. **Diffusers**: `{key}_lora.up.weight` / `{key}_lora.down.weight`
3. **Diffusers V2**: `{key}.lora_B.weight` / `{key}.lora_A.weight`
4. **Diffusers V3**: `{key}.lora.up.weight` / `{key}.lora.down.weight`
5. **Mochi**: `{key}.lora_B` / `{key}.lora_A` (no `.weight` suffix)
6. **Transformers**: `{key}.lora_linear_layer.up.weight` / `{key}.lora_linear_layer.down.weight`
7. **Qwen**: `{key}.lora_B.default.weight` / `{key}.lora_A.default.weight`

Diff formats are also supported:
- `.diff`: weight delta
- `.diff_b`: bias delta
- `.diff_m`: modulation delta
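
Auto-detection can be implemented by probing the characteristic key suffixes listed above. A simplified sketch; the detector in converter.py may use additional signals, and the format labels here are illustrative:

```python
# Suffixes checked in order; more specific suffixes come before overlapping ones.
_SUFFIX_TO_FORMAT = [
    (".lora_up.weight", "standard"),
    ("_lora.up.weight", "diffusers"),
    (".lora_B.default.weight", "qwen"),
    (".lora_B.weight", "diffusers_v2"),
    (".lora.up.weight", "diffusers_v3"),
    (".lora_linear_layer.up.weight", "transformers"),
]

def detect_lora_format(keys):
    for suffix, fmt in _SUFFIX_TO_FORMAT:
        if any(k.endswith(suffix) for k in keys):
            return fmt
    if any(k.endswith(".lora_B") for k in keys):
        return "mochi"  # Mochi keys carry no ".weight" suffix
    return None
```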

## Usage Examples

### 1. Model Quantization

#### 1.1 Quantize Wan DiT to INT8

**Multiple safetensors, stored per DiT block**
```bash
python converter.py \
    --source /path/to/Wan2.1-I2V-14B-480P/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name wan_int8 \
    --linear_dtype torch.int8 \
    --model_type wan_dit \
    --quantized \
    --save_by_block
```

**Single safetensors file**

```bash
python converter.py \
    --source /path/to/Wan2.1-I2V-14B-480P/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name wan2.1_i2v_480p_int8_lightx2v \
    --linear_dtype torch.int8 \
    --model_type wan_dit \
    --quantized \
    --single_file
```

#### 1.2 Quantize Wan DiT to FP8

**Multiple safetensors, stored per DiT block**

```bash
python converter.py \
    --source /path/to/Wan2.1-I2V-14B-480P/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name wan_fp8 \
    --linear_dtype torch.float8_e4m3fn \
    --non_linear_dtype torch.bfloat16 \
    --model_type wan_dit \
    --quantized \
    --save_by_block
```

**Single safetensors file**

```bash
python converter.py \
    --source /path/to/Wan2.1-I2V-14B-480P/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name wan2.1_i2v_480p_scaled_fp8_e4m3_lightx2v \
    --linear_dtype torch.float8_e4m3fn \
    --non_linear_dtype torch.bfloat16 \
    --model_type wan_dit \
    --quantized \
    --single_file
```

**ComfyUI scaled_fp8 format**

```bash
python converter.py \
    --source /path/to/Wan2.1-I2V-14B-480P/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name wan2.1_i2v_480p_scaled_fp8_e4m3_lightx2v_comfyui \
    --linear_dtype torch.float8_e4m3fn \
    --non_linear_dtype torch.bfloat16 \
    --model_type wan_dit \
    --quantized \
    --single_file \
    --comfyui_mode
```

**ComfyUI fully-quantized FP8 format**

```bash
python converter.py \
    --source /path/to/Wan2.1-I2V-14B-480P/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name wan2.1_i2v_480p_scaled_fp8_e4m3_lightx2v_comfyui \
    --linear_dtype torch.float8_e4m3fn \
    --non_linear_dtype torch.bfloat16 \
    --model_type wan_dit \
    --quantized \
    --single_file \
    --comfyui_mode \
    --full_quantized
```

> **Tip**: For other DiT models, just switch the `--model_type` argument.

#### 1.3 T5 Encoder Quantization

**INT8 quantization**
```bash
python converter.py \
    --source /path/to/models_t5_umt5-xxl-enc-bf16.pth \
    --output /path/to/output \
    --output_ext .pth \
    --output_name models_t5_umt5-xxl-enc-int8 \
    --linear_dtype torch.int8 \
    --non_linear_dtype torch.bfloat16 \
    --model_type wan_t5 \
    --quantized
```

**FP8 quantization**

```bash
python converter.py \
    --source /path/to/models_t5_umt5-xxl-enc-bf16.pth \
    --output /path/to/output \
    --output_ext .pth \
    --output_name models_t5_umt5-xxl-enc-fp8 \
    --linear_dtype torch.float8_e4m3fn \
    --non_linear_dtype torch.bfloat16 \
    --model_type wan_t5 \
    --quantized
```

#### 1.4 CLIP Encoder Quantization

**INT8 quantization**
```bash
python converter.py \
    --source /path/to/models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth \
    --output /path/to/output \
    --output_ext .pth \
    --output_name models_clip_open-clip-xlm-roberta-large-vit-huge-14-int8 \
    --linear_dtype torch.int8 \
    --non_linear_dtype torch.float16 \
    --model_type wan_clip \
    --quantized
```

**FP8 quantization**
```bash
python converter.py \
    --source /path/to/models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth \
    --output /path/to/output \
    --output_ext .pth \
    --output_name models_clip_open-clip-xlm-roberta-large-vit-huge-14-fp8 \
    --linear_dtype torch.float8_e4m3fn \
    --non_linear_dtype torch.float16 \
    --model_type wan_clip \
    --quantized
```

#### 1.5 Qwen2.5-VL Language-Model Quantization

**INT8 quantization**
```bash
python converter.py \
    --source /path/to/hunyuanvideo-1.5/text_encoder/llm \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name qwen25vl-llm-int8 \
    --linear_dtype torch.int8 \
    --non_linear_dtype torch.float16 \
    --model_type qwen25vl_llm \
    --quantized \
    --single_file
```

**FP8 quantization**
```bash
python converter.py \
    --source /path/to/hunyuanvideo-1.5/text_encoder/llm \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name qwen25vl-llm-fp8 \
    --linear_dtype torch.float8_e4m3fn \
    --non_linear_dtype torch.float16 \
    --model_type qwen25vl_llm \
    --quantized \
    --single_file
```

### 2. LoRA Fusion

#### 2.1 Merge a Single LoRA

```bash
python converter.py \
    --source /path/to/base_model/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name merged_model \
    --model_type wan_dit \
    --lora_path /path/to/lora.safetensors \
    --lora_strength 1.0 \
    --single_file
```
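
Conceptually, merging folds each LoRA pair into the base weight as `W' = W + strength * (alpha / rank) * (up @ down)`. A minimal sketch with plain nested lists standing in for tensors; the tool operates on real tensors and handles all the formats listed earlier:

```python
def matmul(a, b):
    # Naive matrix multiply over nested lists
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def merge_lora(weight, lora_up, lora_down, strength=1.0, alpha=None):
    # W' = W + strength * (alpha / rank) * (up @ down); rank = rows of lora_down.
    # When no alpha is stored in the LoRA, the scale falls back to 1.0.
    rank = len(lora_down)
    scale = strength * ((alpha / rank) if alpha is not None else 1.0)
    delta = matmul(lora_up, lora_down)
    return [[w + scale * d for w, d in zip(wr, dr)] for wr, dr in zip(weight, delta)]
```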

#### 2.2 Merge Multiple LoRAs

```bash
python converter.py \
    --source /path/to/base_model/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name merged_model \
    --model_type wan_dit \
    --lora_path /path/to/lora1.safetensors /path/to/lora2.safetensors \
    --lora_strength 1.0 0.8 \
    --single_file
```

#### 2.3 Quantize After LoRA Fusion

**LoRA fusion → FP8 quantization**
```bash
python converter.py \
    --source /path/to/base_model/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name merged_quantized \
    --model_type wan_dit \
    --lora_path /path/to/lora.safetensors \
    --lora_strength 1.0 \
    --quantized \
    --linear_dtype torch.float8_e4m3fn \
    --single_file
```

**LoRA fusion → ComfyUI scaled_fp8**
```bash
python converter.py \
    --source /path/to/base_model/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name merged_quantized \
    --model_type wan_dit \
    --lora_path /path/to/lora.safetensors \
    --lora_strength 1.0 \
    --quantized \
    --linear_dtype torch.float8_e4m3fn \
    --single_file \
    --comfyui_mode
```

**LoRA fusion → ComfyUI fully-quantized FP8**
```bash
python converter.py \
    --source /path/to/base_model/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name merged_quantized \
    --model_type wan_dit \
    --lora_path /path/to/lora.safetensors \
    --lora_strength 1.0 \
    --quantized \
    --linear_dtype torch.float8_e4m3fn \
    --single_file \
    --comfyui_mode \
    --full_quantized
```

#### 2.4 LoRA Key Conversion Modes

**Auto-detect mode (recommended)**
```bash
python converter.py \
    --source /path/to/model/ \
    --output /path/to/output \
    --lora_path /path/to/lora.safetensors \
    --lora_key_convert auto \
    --single_file
```

**Use original key names (the LoRA is already in the target format)**
```bash
python converter.py \
    --source /path/to/model/ \
    --output /path/to/output \
    --direction forward \
    --lora_path /path/to/lora.safetensors \
    --lora_key_convert same \
    --single_file
```

**Apply conversion (the LoRA uses the source format)**
```bash
python converter.py \
    --source /path/to/model/ \
    --output /path/to/output \
    --direction forward \
    --lora_path /path/to/lora.safetensors \
    --lora_key_convert convert \
    --single_file
```

### 3. Architecture Conversion

#### 3.1 LightX2V → Diffusers

```bash
python converter.py \
    --source /path/to/Wan2.1-I2V-14B-480P \
    --output /path/to/Wan2.1-I2V-14B-480P-Diffusers \
    --output_ext .safetensors \
    --model_type wan_dit \
    --direction forward \
    --chunk-size 100
```

#### 3.2 Diffusers → LightX2V

```bash
python converter.py \
    --source /path/to/Wan2.1-I2V-14B-480P-Diffusers \
    --output /path/to/Wan2.1-I2V-14B-480P \
    --output_ext .safetensors \
    --model_type wan_dit \
    --direction backward \
    --save_by_block
```

### 4. Format Conversion

#### 4.1 .pth → .safetensors

```bash
python converter.py \
    --source /path/to/model.pth \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name model \
    --single_file
```

#### 4.2 Multiple .safetensors → Single File

```bash
python converter.py \
    --source /path/to/model_directory/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name merged_model \
    --single_file
```