Commits · 070c45bbfa26d6e6c59dd24e5133082c1416d607 · fengzch-das / nunchaku

05 Sep, 2025 1 commit

docs: add the docstrings for v1.0.0 (#656) · 070c45bb

Muyang Li authored Sep 04, 2025

* add v2 flux examples

* add the docs

* add docs

* update

* finished ops

* add ops

* update

* update

* update

* update

* update

* update

* update

* update docstrings

* update

* update

* update

* update

* update

* update

* update

* finished the api docs

* update

* update

04 Sep, 2025 1 commit
- feat: simplify the implementation of the async offloading and support ComfyUI offloading · e0392e42
  Muyang Li authored Sep 04, 2025
  
03 Sep, 2025 1 commit

feat: async CPU offloading for Python backend (#624) · eb901251

Muyang Li authored Sep 03, 2025

* tmp

* update

* update

* finished the offloading impl

* the offloading is buggy

* update utils

* the offloading is still buggy

* update

* correctness and speedup done; need to check the vram overhead

* done

* final debugging

* update

* update

* correct now

* fix

* update

* use per-layer offloading

* fix the offloading on 5090

* support setting the num_blocks_on_gpu

* change the import name

02 Sep, 2025 1 commit

feat: add support for python 3.13 (#651) · ea99072a

Karl Zhou authored Sep 02, 2025

* feat: add support for python 3.13

* update workflow to exclude py3.13+torch2.5

* fix typo

31 Aug, 2025 3 commits
- docs: update the email address · c079cb00
  Muyang Li authored Aug 31, 2025
  
- chore: change the checkout target for the test · 16106520
  Muyang Li authored Aug 31, 2025
  
- chore: change the checkout target for the test · ae7b4636
  Muyang Li authored Aug 31, 2025
  
29 Aug, 2025 1 commit
- fix: close the NVFP4 performance gap between the Python backend and C backend · 092e01ec
  Muyang Li authored Aug 28, 2025
```
Co-authored-by: Kung Talon <31659820+kungtalon@users.noreply.github.com>
```
28 Aug, 2025 2 commits
- feat: add paddings if LoRA ranks for `q, k, v` are different (#603) · 7fcce6f3
  woctordho authored Aug 29, 2025
```
* Add paddings if LoRA ranks for q, k, v are different

* No need to create a list
```
- feat: add support for teacache in flux kontext (#618) · f0c83919
  Boynn authored Aug 28, 2025
```
* feat: add support for teacache in flux kontext

* merge main and make linter happy
```
27 Aug, 2025 3 commits
- feat: Implement V2 FBCaching and Optimize Existing FBCache (#621) · 882aa077
  SMG authored Aug 28, 2025
```
* caching_v2

* rename fb cache and write docstring

* lint

* rename utils to fbcache

* no need to maintain sana for caching
```
- docs: fix a typo in README · c547f3b9
  Muyang Li authored Aug 27, 2025
  
- feat: support lightning Qwen-Image models (#641) · 7b0dbce5
  Muyang Li authored Aug 27, 2025
```
* update

* update

* update README

* update docs

* update docs

* improve the lightning script

* update the example script

* change the repo name
```
23 Aug, 2025 1 commit

feat: support Qwen-Image in ComfyUI-nunchaku (#632) · 4132b3bf

Muyang Li authored Aug 23, 2025

* update

* add parameter of act unsigned

* upgrade the diffusers to v0.35.1

* bump the comfyui version to 0.3.51

* update version

* revert the test comfyui-version back to 0.3.44

19 Aug, 2025 2 commits

fix: fix ValueError in NunchakuQwenImagePipeline.prepare_latents unpacking (#616) · 3ec299f4

Subho Ghosh authored Aug 19, 2025

* fix: enhance latent variable preparation in NunchakuQwenImagePipeline

- Refactored latent variable preparation to utilize the parent method for generating latents.
- Added manual generation of latent_image_ids to ensure correct indexing.

* refactor: clean up whitespace in NunchakuQwenImagePipeline

- Removed unnecessary whitespace in the latent variable preparation section for improved readability.

feat: FLUX Gradio demos support FP4 (#623) · 9ed2db76

Muyang Li authored Aug 19, 2025

* update app

* depth supports fp4

* update

* fix the demo website

* style: make linter happy

15 Aug, 2025 3 commits

chore: fix a typo · 17c7154a
Muyang Li authored Aug 15, 2025

chore: update the qwen-image example · d797a26d
Muyang Li authored Aug 15, 2025

feat: pythonized model and QwenImage Support (#593) · f86ad470

Muyang Li authored Aug 15, 2025

* start refactoring the codebase

* update

* update

* start to implement ops

* add gemm

* write the docstrings

* define the w4a4 svdq linear

* update

* make the linter happy

* finished the SVDQW4A4Linear

* finished the SVDQW4A4Linear

* update

* update

* add a patcher to the model

* update

* add adanormsinglezero

* update

* update

* finished the naive implementation of nunchaku flux

* add ff

* finished the naive forward

* update

* svdq linear

* start debugging

* fix some issues

* successfully built the model

* update

* successfully load the model

* update

* update

* update

* try to make it runnable

* debugging

* debugging

* debugging

* add bias to awq linear

* run through

* fix the normalization

* update

* update

* update

* fix the attention

* fix the no fuse nvfp models

* update

* finished the fused ff

* make linter happy

* make linter happy

* make linter happy

* debugging the fp16 attn

* nunchaku fp16 is buggy

* finish the fp16 attn

* fp4 done

* fix the lora scales

* add a default value for alpha; need to debug int4

* fix input4

* update

* update

* ff does not work

* specialize the processors

* qwen transformer done. start debugging

* make linter happy

* add schnell v2 for metrics eval

* chore: schnellv2 eval

* update

* ff and attention correct

* need to check what happened to module

* fp4 done

* make linter happy

* update an example script

* reformat

* add an example script

* add the announcement

* remove a misleading info

* ready to release

14 Aug, 2025 2 commits
- chore: bump the version to v1.0.0 (#602) · 954c7af9
  Muyang Li authored Aug 14, 2025
  
- fix: enable correct batch processing in teacache (#601) · 5de6d7cf
  SMG authored Aug 14, 2025
```
* fix teacache_batch

* lint
```
13 Aug, 2025 5 commits

chore: fix the wheel hf uploading (#599) · 3bcc2d43
Muyang Li authored Aug 13, 2025
```
* fix the hf uploading

* use conda python
```
chore: support torch 2.9 uploading wheel to hf (#598) · 433f0b22
Muyang Li authored Aug 13, 2025
```
* chore: support uploading wheel to hf

* support ready_for_review
```
chore: skip draft pr for the linter · 2327c994
Muyang Li authored Aug 13, 2025

chore: bump the version to v0.3.2 · 0c7f5359
Muyang Li authored Aug 13, 2025

fix: fix LORA key mismatch between FAL.AI and Nunchaku (#557) · 89cba85e

SMG authored Aug 13, 2025

* Fix FLUX.1-Kontext LoRA support and dimension mismatch issues

- Added convert_keys_to_diffusers() for ComfyUI/PEFT format conversion
- Fixed dimension mismatch in LoRA weight concatenation
- Added preprocessing for single_blocks LoRA structure
- Added comprehensive test suite for Kontext LoRA
- Added example script for FLUX.1-Kontext with LoRA

Fixes #354

* lint

* FAL.AI and relight-kontext-lora patch

09 Aug, 2025 1 commit
- chore: fix workflow for torch 2.8 (#592) · e4fe2547
  Kung Talon authored Aug 08, 2025
```
* fix workflow for torch 2.8

* update based on main
```
07 Aug, 2025 1 commit
- chore: support torch2.8 wheels (#590) · cc9f1d6d
  Muyang Li authored Aug 06, 2025
  
03 Aug, 2025 3 commits
- chore: add test trials (#580) · be8a7ba2
  Muyang Li authored Aug 03, 2025
```
* chore: add test trials

* update the test score
```
- chore: update the test score · da1aaca0
  Muyang Li authored Aug 02, 2025
  
- chore: Update run_all_tests.py · a7b467e6
  Muyang Li authored Aug 02, 2025
  
02 Aug, 2025 7 commits
- chore: nightly build support workflow dispatch · 2c6d17b5
  Muyang Li authored Aug 02, 2025
  
- chore: fix the nightly build ci · 875eb916
  Muyang Li authored Aug 02, 2025
  
- style: make linter happy · 4177e3d6
  Muyang Li authored Aug 02, 2025
  
- chore: update nightly build ci · 56ff6d8f
  Muyang Li authored Aug 02, 2025
  
- chore: better test logs · 69eb9e6f
  Muyang Li authored Aug 02, 2025
  
- chore: use -s for pytests · 92d75723
  Muyang Li authored Aug 02, 2025
  
- chore: add tests for 4090 machines · a8b771dd
  Muyang Li authored Aug 02, 2025
  
01 Aug, 2025 2 commits

fix: re-enable transformer caching in apply_cache_on_pipe (#577) · 71328b1c

yulei authored Aug 02, 2025



* fix: Re-enable transformer caching in apply_cache_on_pipe

* to pass tests

* add the doc string back

* add the doc string back

---------
Co-authored-by: Muyang Li <lmxyy1999@foxmail.com>

feat: support flux.1-krea-dev (#578) · 5225bd9a
Muyang Li authored Aug 01, 2025
