# Model Conversion Tool

A powerful model weight conversion tool that supports format conversion, quantization, LoRA merging, and more.

## Main Features

- **Format Conversion**: Convert between PyTorch (.pth) and SafeTensors (.safetensors) formats
- **Model Quantization**: INT8 and FP8 quantization to significantly reduce model size
- **Architecture Conversion**: Convert between the LightX2V and Diffusers architectures
- **LoRA Merging**: Load and merge LoRAs in multiple formats
- **Multi-Model Support**: Works with Wan DiT, Qwen Image DiT, T5, CLIP, and more
- **Flexible Saving**: Single-file, block-based, and chunked saving modes
- **Parallel Processing**: Parallel acceleration for large model conversion

## Supported Model Types

- `wan_dit`: Wan DiT series models (default)
- `wan_animate_dit`: Wan Animate DiT models
- `qwen_image_dit`: Qwen Image DiT models
- `wan_t5`: Wan T5 text encoder
- `wan_clip`: Wan CLIP vision encoder

## Core Parameters

### Basic Parameters

- `-s, --source`: Input path (file or directory)
- `-o, --output`: Output directory path
- `-o_e, --output_ext`: Output format, `.pth` or `.safetensors` (default)
- `-o_n, --output_name`: Output file name (default: `converted`)
- `-t, --model_type`: Model type (default: `wan_dit`)
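
Taken together, a minimal invocation needs little more than a source and an output. A sketch relying on the defaults listed above (all paths are placeholders):

```bash
# Minimal conversion sketch: the output name defaults to `converted`
# and the model type to `wan_dit`; paths are placeholders.
python converter.py \
    --source /path/to/model.pth \
    --output /path/to/output \
    --output_ext .safetensors \
    --single_file
```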

### Architecture Conversion Parameters

- `-d, --direction`: Conversion direction
  - `None`: No architecture conversion (default)
  - `forward`: LightX2V → Diffusers
  - `backward`: Diffusers → LightX2V
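
For example, a forward conversion rewrites a LightX2V checkpoint into the Diffusers layout; a sketch with placeholder paths (complete walkthroughs are in Section 3):

```bash
# Sketch: LightX2V → Diffusers key layout; paths are placeholders.
python converter.py \
    --source /path/to/lightx2v_model/ \
    --output /path/to/diffusers_model \
    --model_type wan_dit \
    --direction forward
```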

### Quantization Parameters

- `--quantized`: Enable quantization
- `--bits`: Quantization bit width, currently only supports 8-bit
- `--linear_dtype`: Linear layer quantization type
  - `torch.int8`: INT8 quantization
  - `torch.float8_e4m3fn`: FP8 quantization
- `--non_linear_dtype`: Non-linear layer data type
  - `torch.bfloat16`: BF16
  - `torch.float16`: FP16
  - `torch.float32`: FP32 (default)
- `--device`: Device for quantization, `cpu` or `cuda` (default); see the CPU sketch after this list
- `--comfyui_mode`: ComfyUI compatible mode
- `--full_quantized`: Full quantization mode (effective in ComfyUI mode)
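
On machines without a GPU, quantization can run entirely on the CPU via `--device cpu`. A sketch (slower than the default CUDA path; paths are placeholders):

```bash
# INT8 quantization on CPU only; expect longer conversion times.
python converter.py \
    --source /path/to/Wan2.1-I2V-14B-480P/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name wan_int8_cpu \
    --model_type wan_dit \
    --quantized \
    --linear_dtype torch.int8 \
    --device cpu \
    --save_by_block
```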

### LoRA Parameters

- `--lora_path`: LoRA file path(s), supports multiple (separated by spaces)
- `--lora_strength`: LoRA strength coefficients, supports multiple (default: 1.0)
- `--alpha`: LoRA alpha parameters, supports multiple
- `--lora_key_convert`: LoRA key conversion mode
  - `auto`: Auto-detect (default)
  - `same`: Use original key names
  - `convert`: Apply same conversion as model

### Saving Parameters

- `--single_file`: Save as single file (note: large models consume significant memory)
- `-b, --save_by_block`: Save by blocks (recommended for backward conversion)
- `-c, --chunk-size`: Chunk size (default: 100, 0 means no chunking)
- `--copy_no_weight_files`: Copy non-weight files from source directory
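
For instance, when converting a whole model directory it is often useful to carry the configs and tokenizer files along. A sketch combining chunked saving with `--copy_no_weight_files` (placeholder paths):

```bash
# Chunked conversion; non-weight files (configs, tokenizers, ...)
# are copied from the source directory to the output unchanged.
python converter.py \
    --source /path/to/model_directory/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --chunk-size 100 \
    --copy_no_weight_files
```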

### Performance Parameters

- `--parallel`: Enable parallel processing (default: True)
- `--no-parallel`: Disable parallel processing
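
Parallel processing is on by default; disabling it trades speed for lower, more predictable memory use. A sketch:

```bash
# Sequential (single-process) conversion; placeholder paths.
python converter.py \
    --source /path/to/model_directory/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --no-parallel
```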

## Supported LoRA Formats

The tool automatically detects and supports the following LoRA formats:

1. **Standard**: `{key}.lora_up.weight` and `{key}.lora_down.weight`
2. **Diffusers**: `{key}_lora.up.weight` and `{key}_lora.down.weight`
3. **Diffusers V2**: `{key}.lora_B.weight` and `{key}.lora_A.weight`
4. **Diffusers V3**: `{key}.lora.up.weight` and `{key}.lora.down.weight`
5. **Mochi**: `{key}.lora_B` and `{key}.lora_A` (no .weight suffix)
6. **Transformers**: `{key}.lora_linear_layer.up.weight` and `{key}.lora_linear_layer.down.weight`
7. **Qwen**: `{key}.lora_B.default.weight` and `{key}.lora_A.default.weight`

The tool also supports the following diff formats:
- `.diff`: Weight diff
- `.diff_b`: Bias diff
- `.diff_m`: Modulation diff

## Usage Examples

### 1. Model Quantization

#### 1.1 Wan DiT Quantization to INT8

**Multiple safetensors, saved by DiT blocks**
```bash
python converter.py \
    --source /path/to/Wan2.1-I2V-14B-480P/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name wan_int8 \
    --linear_dtype torch.int8 \
    --model_type wan_dit \
    --quantized \
    --save_by_block
```

**Single safetensors file**
```bash
python converter.py \
    --source /path/to/Wan2.1-I2V-14B-480P/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name wan2.1_i2v_480p_int8_lightx2v \
    --linear_dtype torch.int8 \
    --model_type wan_dit \
    --quantized \
    --single_file
```

#### 1.2 Wan DiT Quantization to FP8

**Multiple safetensors, saved by DiT blocks**
```bash
python converter.py \
    --source /path/to/Wan2.1-I2V-14B-480P/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name wan_fp8 \
    --linear_dtype torch.float8_e4m3fn \
    --non_linear_dtype torch.bfloat16 \
    --model_type wan_dit \
    --quantized \
    --save_by_block
```

**Single safetensors file**
```bash
python converter.py \
    --source /path/to/Wan2.1-I2V-14B-480P/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name wan2.1_i2v_480p_scaled_fp8_e4m3_lightx2v \
    --linear_dtype torch.float8_e4m3fn \
    --non_linear_dtype torch.bfloat16 \
    --model_type wan_dit \
    --quantized \
    --single_file
```

**ComfyUI scaled_fp8 format**
```bash
python converter.py \
    --source /path/to/Wan2.1-I2V-14B-480P/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name wan2.1_i2v_480p_scaled_fp8_e4m3_lightx2v_comfyui \
    --linear_dtype torch.float8_e4m3fn \
    --non_linear_dtype torch.bfloat16 \
    --model_type wan_dit \
    --quantized \
    --single_file \
    --comfyui_mode
```

**ComfyUI full FP8 format**
```bash
python converter.py \
    --source /path/to/Wan2.1-I2V-14B-480P/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name wan2.1_i2v_480p_full_fp8_e4m3_lightx2v_comfyui \
    --linear_dtype torch.float8_e4m3fn \
    --non_linear_dtype torch.bfloat16 \
    --model_type wan_dit \
    --quantized \
    --single_file \
    --comfyui_mode \
    --full_quantized
```

> **Tip**: For other DiT models, simply switch the `--model_type` parameter, as in the sketch below.
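
For example, a hypothetical Qwen Image DiT FP8 conversion reuses the recipe above with only `--model_type` (and the names chosen here) changed:

```bash
# Same FP8 recipe, switched to a Qwen Image DiT checkpoint
# (source path and output name are placeholders).
python converter.py \
    --source /path/to/Qwen-Image/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name qwen_image_fp8 \
    --linear_dtype torch.float8_e4m3fn \
    --non_linear_dtype torch.bfloat16 \
    --model_type qwen_image_dit \
    --quantized \
    --save_by_block
```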

#### 1.3 T5 Encoder Quantization

**INT8 Quantization**
```bash
python converter.py \
    --source /path/to/models_t5_umt5-xxl-enc-bf16.pth \
    --output /path/to/output \
    --output_ext .pth \
    --output_name models_t5_umt5-xxl-enc-int8 \
    --linear_dtype torch.int8 \
    --non_linear_dtype torch.bfloat16 \
    --model_type wan_t5 \
    --quantized
```

**FP8 Quantization**
```bash
python converter.py \
    --source /path/to/models_t5_umt5-xxl-enc-bf16.pth \
    --output /path/to/output \
    --output_ext .pth \
    --output_name models_t5_umt5-xxl-enc-fp8 \
    --linear_dtype torch.float8_e4m3fn \
    --non_linear_dtype torch.bfloat16 \
    --model_type wan_t5 \
    --quantized
```

#### 1.4 CLIP Encoder Quantization

**INT8 Quantization**
```bash
python converter.py \
    --source /path/to/models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth \
    --output /path/to/output \
    --output_ext .pth \
    --output_name models_clip_open-clip-xlm-roberta-large-vit-huge-14-int8 \
    --linear_dtype torch.int8 \
    --non_linear_dtype torch.float16 \
    --model_type wan_clip \
    --quantized
```

**FP8 Quantization**
```bash
python converter.py \
    --source /path/to/models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth \
    --output /path/to/output \
    --output_ext .pth \
    --output_name models_clip_open-clip-xlm-roberta-large-vit-huge-14-fp8 \
    --linear_dtype torch.float8_e4m3fn \
    --non_linear_dtype torch.float16 \
    --model_type wan_clip \
    --quantized
```

### 2. LoRA Merging

#### 2.1 Merge Single LoRA

```bash
python converter.py \
    --source /path/to/base_model/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name merged_model \
    --model_type wan_dit \
    --lora_path /path/to/lora.safetensors \
    --lora_strength 1.0 \
    --single_file
```

#### 2.2 Merge Multiple LoRAs

```bash
python converter.py \
    --source /path/to/base_model/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name merged_model \
    --model_type wan_dit \
    --lora_path /path/to/lora1.safetensors /path/to/lora2.safetensors \
    --lora_strength 1.0 0.8 \
    --single_file
```
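
**Merge with explicit alpha values**

Per-LoRA `--alpha` values can be supplied alongside strengths; a sketch that assumes both LoRAs were trained with alpha 16 (adjust to your files):

```bash
python converter.py \
    --source /path/to/base_model/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name merged_model \
    --model_type wan_dit \
    --lora_path /path/to/lora1.safetensors /path/to/lora2.safetensors \
    --lora_strength 1.0 0.8 \
    --alpha 16 16 \
    --single_file
```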

#### 2.3 LoRA Merging with Quantization

**LoRA Merge → FP8 Quantization**
```bash
python converter.py \
    --source /path/to/base_model/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name merged_quantized \
    --model_type wan_dit \
    --lora_path /path/to/lora.safetensors \
    --lora_strength 1.0 \
    --quantized \
    --linear_dtype torch.float8_e4m3fn \
    --single_file
```

**LoRA Merge → ComfyUI scaled_fp8**
```bash
python converter.py \
    --source /path/to/base_model/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name merged_quantized \
    --model_type wan_dit \
    --lora_path /path/to/lora.safetensors \
    --lora_strength 1.0 \
    --quantized \
    --linear_dtype torch.float8_e4m3fn \
    --single_file \
    --comfyui_mode
```

**LoRA Merge → ComfyUI Full FP8**
```bash
python converter.py \
    --source /path/to/base_model/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name merged_quantized \
    --model_type wan_dit \
    --lora_path /path/to/lora.safetensors \
    --lora_strength 1.0 \
    --quantized \
    --linear_dtype torch.float8_e4m3fn \
    --single_file \
    --comfyui_mode \
    --full_quantized
```

#### 2.4 LoRA Key Conversion Modes

**Auto-detect mode (recommended)**
```bash
python converter.py \
    --source /path/to/model/ \
    --output /path/to/output \
    --lora_path /path/to/lora.safetensors \
    --lora_key_convert auto \
    --single_file
```

**Use original key names (LoRA already in target format)**
```bash
python converter.py \
    --source /path/to/model/ \
    --output /path/to/output \
    --direction forward \
    --lora_path /path/to/lora.safetensors \
    --lora_key_convert same \
    --single_file
```

**Apply conversion (LoRA in source format)**
```bash
python converter.py \
    --source /path/to/model/ \
    --output /path/to/output \
    --direction forward \
    --lora_path /path/to/lora.safetensors \
    --lora_key_convert convert \
    --single_file
```

### 3. Architecture Format Conversion

#### 3.1 LightX2V → Diffusers

```bash
python converter.py \
    --source /path/to/Wan2.1-I2V-14B-480P \
    --output /path/to/Wan2.1-I2V-14B-480P-Diffusers \
    --output_ext .safetensors \
    --model_type wan_dit \
    --direction forward \
    --chunk-size 100
```

#### 3.2 Diffusers → LightX2V

```bash
python converter.py \
    --source /path/to/Wan2.1-I2V-14B-480P-Diffusers \
    --output /path/to/Wan2.1-I2V-14B-480P \
    --output_ext .safetensors \
    --model_type wan_dit \
    --direction backward \
    --save_by_block
```

### 4. Format Conversion

#### 4.1 .pth → .safetensors

```bash
python converter.py \
    --source /path/to/model.pth \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name model \
    --single_file
```

#### 4.2 Multiple .safetensors → Single File

```bash
python converter.py \
    --source /path/to/model_directory/ \
    --output /path/to/output \
    --output_ext .safetensors \
    --output_name merged_model \
    --single_file
```