Commits · 070c45bbfa26d6e6c59dd24e5133082c1416d607 · fengzch-das / nunchaku

05 Sep, 2025 1 commit

docs: add the docstrings for v1.0.0 (#656) · 070c45bb

Muyang Li authored Sep 04, 2025

* add v2 flux examples

* add the docs

* add docs

* update

* finished ops

* add ops

* update

* update

* update

* update

* update

* update

* update

* update docstrings

* update

* update

* update

* update

* update

* update

* update

* finished the api docs

* update

* update

070c45bb

03 Sep, 2025 1 commit

feat: async CPU offloading for Python backend (#624) · eb901251

Muyang Li authored Sep 03, 2025

* tmp

* update

* update

* finished the offloading impl

* the offloading is buggy

* update utils

* the offloading is still buggy

* update

* correctness and speedup done; need to check the vram overhead

* done

* final debugging

* update

* update

* correct now

* fix

* update

* use per-layer offloading

* fix the offloading on 5090

* support setting the num_blocks_on_gpu

* change the import name

eb901251

28 Aug, 2025 1 commit
- feat: add support for teacache in flux kontext (#618) · f0c83919
  Boynn authored Aug 28, 2025
```
* feat: add support for teacache in flux kontext

* merge main and make linter happy
```
  f0c83919
27 Aug, 2025 2 commits

feat: Implement V2 FBCaching and Optimize Existing FBCache (#621) · 882aa077

SMG authored Aug 28, 2025

* caching_v2

* rename fb cache and write docstring

* lint

* rename utils to fbcache

* no need to maintain sana for caching

882aa077

feat: support lightning Qwen-Image models (#641) · 7b0dbce5

Muyang Li authored Aug 27, 2025

* update

* update

* update README

* update docs

* update docs

* improve the lightning script

* update the example script

* change the repo name

7b0dbce5

15 Aug, 2025 3 commits

chore: fix a typo · 17c7154a
Muyang Li authored Aug 15, 2025

17c7154a
chore: update the qwen-image example · d797a26d
Muyang Li authored Aug 15, 2025

d797a26d

feat: pythonized model and QwenImage Support (#593) · f86ad470

Muyang Li authored Aug 15, 2025

* start refactoring the codebase

* update

* update

* start to implement ops

* add gemm

* write the docstrings

* define the w4a4 svdq linear

* update

* make the linter happy

* finished the SVDQW4A4Linear

* finished the SVDQW4A4Linear

* update

* update

* add a patcher to the model

* update

* add adanormsinglezero

* update

* update

* finished the naive implementation of nunchaku flux

* add ff

* finished the naive forward

* update

* svdq linear

* start debugging

* fix some issues

* successfully built the model

* update

* successfully load the model

* update

* update

* update

* try to make it runnable

* debugging

* debugging

* debugging

* add bias to awq linear

* run through

* fix the normalization

* update

* update

* update

* fix the attention

* fix the no fuse nvfp models

* update

* finished the fused ff

* make linter happy

* make linter happy

* make linter happy

* debugging the fp16 attn

* nunchaku fp16 is buggy

* finish the fp16 attn

* fp4 done

* fix the lora scales

* add a default value for alpha; need to debug int4

* fix input4

* update

* update

* ff does not work

* specialize the processors

* qwen transformer done. start debugging

* make linter happy

* add schnell v2 for metrics eval

* chore: schnellv2 eval

* update

* ff and attention correct

* need to check what happened to module

* fp4 done

* make linter happy

* update an example script

* reformat

* add an example script

* add the announcement

* remove misleading info

* ready to release

f86ad470

14 Aug, 2025 1 commit
- fix: enable correct batch processing in teacache (#601) · 5de6d7cf
  SMG authored Aug 14, 2025
```
* fix teacache_batch

* lint
```
  5de6d7cf
13 Aug, 2025 1 commit

fix: fix LORA key mismatch between FAL.AI and Nunchaku (#557) · 89cba85e

SMG authored Aug 13, 2025

* Fix FLUX.1-Kontext LoRA support and dimension mismatch issues

- Added convert_keys_to_diffusers() for ComfyUI/PEFT format conversion
- Fixed dimension mismatch in LoRA weight concatenation
- Added preprocessing for single_blocks LoRA structure
- Added comprehensive test suite for Kontext LoRA
- Added example script for FLUX.1-Kontext with LoRA

Fixes #354

* lint

* FAL.AI and relight-kontext-lora patch

89cba85e

01 Aug, 2025 1 commit
- feat: support flux.1-krea-dev (#578) · 5225bd9a
  Muyang Li authored Aug 01, 2025
  
  5225bd9a
24 Jul, 2025 1 commit

feat: enable IP-Adapter (XLabs-AI/flux-ip-adapter-v2) support (#418) · 06b7a518

SMG authored Jul 24, 2025



* feat: support IP-adapter

* FBCache and comfyUI

* fixing conflicts

* update

* update example

* update example

* style: make linter happy

* update

* update ipa test

* add docs and rename IP to ip

* docs: add docs for ipa

* docs: add docs for ipa

* add an example for pulid

* update

* save gpu memory

* change the threshold to 0.8

---------
Co-authored-by: Muyang Li <lmxyy1999@foxmail.com>

06b7a518

23 Jul, 2025 1 commit
- feat: add the example script of colossus (#558) · 7eb31b0f
  Muyang Li authored Jul 23, 2025
```
* add colossus script

* finalize colossus
```
  7eb31b0f
30 Jun, 2025 1 commit

feat: update the kontext examples and models (#495) · 259394ae

Muyang Li authored Jun 30, 2025

* update kontext examples

* update tests

* add tests for kontext

* remove the warning of txt_ids and img_ids

* chore: add kontext to be synced from hf to ms

* add kontext demo

* make linter happy

* style: make linter happy

* update docs

259394ae

06 Jun, 2025 1 commit
- fix: segmentation fault when using FBcache with offload=True (#440) · ca4d4b46
  SMG authored Jun 06, 2025
```
* fix: cache issue if offload is set to True

* fix: lint
```
  ca4d4b46
30 May, 2025 1 commit

feat: single-file model loading (#413) · 5182f8f8

Muyang Li authored May 29, 2025

* add a script to merge models

* finished

* try to merge t5

* merge the config into meta files

* rewrite the t5 model loading

* consider the case of subfolder

* merged the qencoder files

* make the linter happy and fix the tests

* pass tests

* add deprecation messages

* add a script to merge models

* schnell script runnable

* update sana

* modify the model paths

* fix the model paths

* style: make the linter happy

* remove the debugging assertion

* chore: fix the qencoder lpips

* fix the lpips

5182f8f8

23 May, 2025 1 commit

feat: upgrade the 4-bit quantized T5 encoder (#320) · 0ade163c

ZIAN HU authored May 23, 2025



* Updating quantized t5 encoder

* Fix formatting based on pre-commit hook

* Update test cases

* Fixing linter issue

* Fix linter reformatting

* support fp4

* style: make linter happy

* update the fp4 lpips

* Prevent downloading original t5 model

* Make sure model in eval mode

---------
Co-authored-by: muyangli <lmxyy1999@foxmail.com>

0ade163c

17 May, 2025 1 commit
- fix a typo · 4f0b7ba1
  muyangli authored May 17, 2025
  
  4f0b7ba1
01 May, 2025 3 commits

style: upgrade the linter (#339) · 57e50f8d
Muyang Li authored May 01, 2025
```
* style: reformatted code

* style: reformatted code
```
57e50f8d

feat: PuLID support (#274) · b737368d

K authored May 01, 2025



* add pulid

* Add the feature that allows the mixed use of pulid and non-pulid after loading pulid to generate the pipeline.

* Added the feature to load LoRA at any time.

* Organized the directory structure.

* Organized the code.

* Removed unused related code from eva-clip.

* style: apply Ruff formatting

* Refactored code and verified pulid works.

* add pulid tests

* auto detect precision in test

* Updated requirements.txt

* update requirements

* style: reformat the example

* style: reformat the example

* style: rename cb to call_back

* style: format the codes

* style: format the codes

* reformatted the code

* fix the repo forward

* clean up some dead code

* wrap up for pulid

---------
Co-authored-by: kkkxue <kkkxue@tencent.com>
Co-authored-by: muyangli <lmxyy1999@foxmail.com>

b737368d

feat: expose norm1 layer to support TeaCache (#234) · b4d3f50b

Andrea Ferretti authored May 01, 2025



* feat: expose norm1 layer to support TeaCache

* feat: add TeaCache example

* feat: add idx as optional parameter

* chore: rename function

* refactor: move TeaCache decorator into example script

* test: add a test for the combination of Nunchaku with TeaCache

* feat: expose norm1 layer to support TeaCache

* feat: add TeaCache example

* feat: add idx as optional parameter

* chore: rename function

* refactor: move TeaCache decorator into example script

* test: add a test for the combination of Nunchaku with TeaCache

* fix: make tests run on low memory hardware

* fix: ensure that memory is correctly released between tests

* fix: avoid moving pipeline to device prematurely

* gpu memory does not release

* need to figure out a way to get compatible with offloading

* wrap up the teacache

---------
Co-authored-by: muyangli <lmxyy1999@foxmail.com>

b4d3f50b

29 Apr, 2025 1 commit
- [Auto Sync] feat: double FB cache + adaptive mechanisms (#76) · b3f12860
  Bluear7878 authored Apr 29, 2025
```
* DoubleFBCache

* rename > DoubleFBCache to use_double_fb_cache
```
  b3f12860
19 Apr, 2025 1 commit
- wrap up the tests · 5939d99f
  muyangli authored Apr 12, 2025
  
  5939d99f
05 Apr, 2025 1 commit
- Merge pull request #70 from mit-han-lab/dev/muyang · 998192ca
  Muyang Li authored Apr 04, 2025
```
Ready to release v0.2.0
```
  998192ca
04 Apr, 2025 3 commits
- pass tests now building wheels · 44ae975c
  Muyang Li authored Apr 04, 2025
  
  44ae975c
- Clean up some code and refactor the tests · 2ede5f01
  Muyang Li authored Apr 03, 2025
  
  2ede5f01
- Add controlnet · 235238bd
  Hyunsung Lee authored Apr 02, 2025
  
  235238bd
01 Apr, 2025 6 commits
- Add formatting rule and format fix · cdf5a19b
  Hyunsung Lee authored Mar 28, 2025
  
  cdf5a19b
- Multiple LoRAs · 3ef186fd
  Muyang Li authored Mar 26, 2025
  
  3ef186fd
- Add SanaModel caching · bf0813a6
  Hyunsung Lee authored Mar 18, 2025
  
  bf0813a6
- [major] Fix the tempfile bug in the comfyui · 742a8006
  Muyang Li authored Mar 11, 2025
  
  742a8006
- [feat] add first block cache · 0b1891cd
  muyangli authored Mar 10, 2025
  
  0b1891cd
- Add dynamic Caching when batch_size = 1 for flux model · 39f90121
  Hyunsung Lee authored Mar 12, 2025
  
  39f90121
08 Mar, 2025 2 commits
- update · b4c6f0f0
  muyangli authored Mar 07, 2025
  
  b4c6f0f0
- update two example scripts · a1a15d64
  muyangli authored Mar 07, 2025
  
  a1a15d64
07 Mar, 2025 1 commit

v0.1.4 ready to release · 873a35be

muyangli authored Mar 07, 2025


Co-authored-by: Zhekai Zhang <sxtyzhangzk@gmail.com>
Co-authored-by: Muyang Li <lmxyy1999@foxmail.com>
Co-authored-by: Yujun Lin <16437040+synxlin@users.noreply.github.com>

873a35be

24 Feb, 2025 1 commit
- [minor] fix sana · 3233a41d
  muyangli authored Feb 24, 2025
  
  3233a41d
20 Feb, 2025 1 commit
- [major] support NVFP4; upgrade to 0.1 · 54e6d065
  muyangli authored Feb 20, 2025
  
  54e6d065
19 Feb, 2025 1 commit
- [major] update the lora conversion instructions; add customized lora comfyui... · 7b8d221f
  muyangli authored Feb 18, 2025
```
[major] update the lora conversion instructions; add customized lora comfyui nodes; upgrade the version
```
  7b8d221f
14 Feb, 2025 1 commit
- [major] lora conversion script released; upgrade the model; release flux.1-redux model · 51650a36
  muyangli authored Feb 14, 2025
  
  51650a36