Unverified Commit 7b0dbce5 authored by Muyang Li, committed by GitHub

feat: support lightning Qwen-Image models (#641)

* update

* update

* update README

* update docs

* update docs

* improve the lightning script

* update the example script

* change the repo name
parent 4132b3bf
@@ -15,17 +15,18 @@ Join our user groups on [**Slack**](https://join.slack.com/t/nunchaku/shared_inv
## News
- **[2025-08-27]** 🔥 Release **4-bit [4/8-step lightning Qwen-Image](https://huggingface.co/lightx2v/Qwen-Image-Lightning)** models! Download them on [Hugging Face](https://huggingface.co/nunchaku-tech/nunchaku-qwen-image) or [ModelScope](https://modelscope.cn/models/nunchaku-tech/nunchaku-qwen-image), and try them with our [example script](examples/v1/qwen-image-lightning.py).
- **[2025-08-15]** 🔥 Our **4-bit Qwen-Image** models are now live on [Hugging Face](https://huggingface.co/nunchaku-tech/nunchaku-qwen-image)! Get started with our [example script](examples/v1/qwen-image.py). *ComfyUI, LoRA, and CPU offloading support are coming soon!*
- **[2025-08-15]** 🚀 The **Python backend** is now available! Explore our Pythonic FLUX models [here](nunchaku/models/transformers/transformer_flux_v2.py) and see the modular **4-bit linear layer** [here](nunchaku/models/linear.py).
- **[2025-07-31]** 🚀 **[FLUX.1-Krea-dev](https://www.krea.ai/blog/flux-krea-open-source-release) is now supported!** Check out our new [example script](./examples/flux.1-krea-dev.py) to get started.
- **[2025-07-13]** 🚀 The official [**Nunchaku documentation**](https://nunchaku.tech/docs/nunchaku/) is now live! Explore comprehensive guides and resources to help you get started.
- **[2025-06-29]** 🔥 **FLUX.1-Kontext** is now supported! Try out our [example script](./examples/flux.1-kontext-dev.py) to see it in action! Our demo is available at this [link](https://svdquant.mit.edu/kontext/)!
- **[2025-06-01]** 🚀 **Release v0.3.0!** This update adds support for multiple-batch inference, [**ControlNet-Union-Pro 2.0**](https://huggingface.co/Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro-2.0), initial integration of [**PuLID**](https://github.com/ToTheBeginning/PuLID), and introduces [**Double FB Cache**](examples/flux.1-dev-double_cache.py). You can now load Nunchaku FLUX models as a single file, and our upgraded [**4-bit T5 encoder**](https://huggingface.co/nunchaku-tech/nunchaku-t5) now matches **FP8 T5** in quality!
<details>
<summary>More</summary>
- **[2025-04-16]** 🎥 Released tutorial videos in both [**English**](https://youtu.be/YHAVe-oM7U8?si=cM9zaby_aEHiFXk0) and [**Chinese**](https://www.bilibili.com/video/BV1BTocYjEk5/?share_source=copy_web&vd_source=8926212fef622f25cc95380515ac74ee) to assist installation and usage.
- **[2025-04-09]** 📢 Published the [April roadmap](https://github.com/nunchaku-tech/nunchaku/issues/266) and an [FAQ](https://github.com/nunchaku-tech/nunchaku/discussions/262) to help the community get started and stay up to date with Nunchaku's development.
- **[2025-04-05]** 🚀 **Nunchaku v0.2.0 released!** This release brings [**multi-LoRA**](examples/flux.1-dev-multiple-lora.py) and [**ControlNet**](examples/flux.1-dev-controlnet-union-pro.py) support with even faster performance powered by [**FP16 attention**](#fp16-attention) and [**First-Block Cache**](#first-block-cache). We've also added compatibility for [**20-series GPUs**](examples/flux.1-dev-turing.py) — Nunchaku is now more accessible than ever!
- **[2025-03-07]** 🚀 **Nunchaku v0.1.4 released!** We now support [a 4-bit text encoder and per-layer CPU offloading](#Low-Memory-Inference), reducing FLUX's minimum memory requirement to just **4 GiB** while maintaining a **2–3× speedup**. This update also fixes various issues related to resolution, LoRA, pin memory, and runtime stability. Check out the release notes for full details!
...
@@ -15,17 +15,18 @@
## News
- **[2025-08-27]** 🚀 Release **4-bit [4/8-step lightning Qwen-Image](https://huggingface.co/lightx2v/Qwen-Image-Lightning)** models! Download them on [Hugging Face](https://huggingface.co/nunchaku-tech/nunchaku-qwen-image) or [ModelScope](https://modelscope.cn/models/nunchaku-tech/nunchaku-qwen-image), and get started with our [example script](examples/v1/qwen-image-lightning.py).
- **[2025-07-31]** 🚀 **[FLUX.1-Krea-dev](https://www.krea.ai/blog/flux-krea-open-source-release) is now supported!** See our [example script](./examples/flux.1-krea-dev.py) to get started quickly.
- **[2025-07-13]** 🚀 The official [**Nunchaku documentation**](https://nunchaku.tech/docs/nunchaku/) is live! Check out the detailed getting-started guides and resources.
- **[2025-06-29]** 🔥 **FLUX.1-Kontext** is now supported! Try our [example script](./examples/flux.1-kontext-dev.py), and see the online demo [here](https://svdquant.mit.edu/kontext/).
- **[2025-06-01]** 🚀 **v0.3.0 released!** This update adds multi-batch inference, [**ControlNet-Union-Pro 2.0**](https://huggingface.co/Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro-2.0), initial [**PuLID**](https://github.com/ToTheBeginning/PuLID) integration, and [**Double FB Cache**](examples/flux.1-dev-double_cache.py). FLUX models can now be loaded from a single file, and the upgraded [**4-bit T5 encoder**](https://huggingface.co/nunchaku-tech/nunchaku-t5) matches **FP8 T5** in quality!
<details>
<summary>More</summary>
- **[2025-04-16]** 🎥 Released [**installation and usage tutorial videos**](https://youtu.be/YHAVe-oM7U8?si=cM9zaby_aEHiFXk0) in English and Chinese ([**Bilibili**](https://www.bilibili.com/video/BV1BTocYjEk5/?share_source=copy_web&vd_source=8926212fef622f25cc95380515ac74ee)).
- **[2025-04-09]** 📢 Published the [April roadmap](https://github.com/nunchaku-tech/nunchaku/issues/266) and an [FAQ](https://github.com/nunchaku-tech/nunchaku/discussions/262) to help the community get started and follow the latest progress.
- **[2025-04-05]** 🚀 **Nunchaku v0.2.0 released!** This update brings [**multi-LoRA**](examples/flux.1-dev-multiple-lora.py) and [**ControlNet**](examples/flux.1-dev-controlnet-union-pro.py) support, with faster inference powered by [**FP16 attention**](#fp16-attention) and [**First-Block Cache**](#first-block-cache). [**20-series GPUs**](examples/flux.1-dev-turing.py) are now supported — Nunchaku is easier to use than ever!
- **[2025-03-07]** 🚀 **Nunchaku v0.1.4 released!** Added support for a [4-bit text encoder and per-layer CPU offloading](#Low-Memory-Inference), reducing FLUX's minimum VRAM requirement to **4 GiB** while maintaining a **2–3× speedup**. This release also fixes issues with resolution, LoRA, pin memory, and runtime stability; see the release notes for details!
- **[2025-02-20]** 🚀 **NVFP4 precision is now supported on the RTX 5090!** NVFP4 delivers better image quality than INT4 and runs **~3×** faster than BF16 on the RTX 5090. Details in the [blog](https://hanlab.mit.edu/blog/svdquant-nvfp4), usage in [`examples`](./examples), and an online demo [here](https://svdquant.mit.edu/flux1-schnell/).
- **[2025-02-18]** 🔥 Tutorials for [**customized LoRA conversion**](#Customized-LoRA) and [**model quantization**](#Customized-Model-Quantization) are online! **[ComfyUI](./comfyui)** workflows now support **customized LoRAs** and **FLUX.1-Tools**!
...
import math

import torch
from diffusers import FlowMatchEulerDiscreteScheduler

from nunchaku.models.transformers.transformer_qwenimage import NunchakuQwenImageTransformer2DModel
from nunchaku.pipeline.pipeline_qwenimage import NunchakuQwenImagePipeline
from nunchaku.utils import get_precision

# Scheduler config for the lightning-distilled models, from
# https://github.com/ModelTC/Qwen-Image-Lightning/blob/342260e8f5468d2f24d084ce04f55e101007118b/generate_with_diffusers.py#L82C9-L97C10
scheduler_config = {
"base_image_seq_len": 256,
"base_shift": math.log(3), # We use shift=3 in distillation
"invert_sigmas": False,
"max_image_seq_len": 8192,
"max_shift": math.log(3), # We use shift=3 in distillation
"num_train_timesteps": 1000,
"shift": 1.0,
"shift_terminal": None, # set shift_terminal to None
"stochastic_sampling": False,
"time_shift_type": "exponential",
"use_beta_sigmas": False,
"use_dynamic_shifting": True,
"use_exponential_sigmas": False,
"use_karras_sigmas": False,
}
scheduler = FlowMatchEulerDiscreteScheduler.from_config(scheduler_config)
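
# With use_dynamic_shifting=True, the scheduler applies an exponential time
# shift, roughly t -> exp(mu) / (exp(mu) + 1/t - 1), where mu is interpolated
# between base_shift and max_shift based on the image sequence length. Since
# both are math.log(3) here, the effective shift is a constant 3, matching the
# shift used during distillation.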

num_inference_steps = 4  # you can also use 8 with the 8-step model to improve quality
rank = 32  # you can also use rank 128 (r128) to improve quality
model_paths = {
4: f"nunchaku-tech/nunchaku-qwen-image/svdq-{get_precision()}_r{rank}-qwen-image-lightningv1.0-4steps.safetensors",
8: f"nunchaku-tech/nunchaku-qwen-image/svdq-{get_precision()}_r{rank}-qwen-image-lightningv1.1-8steps.safetensors",
}
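
# get_precision() above resolves to "int4" on most GPUs and to "fp4" on
# Blackwell (RTX 50-series) cards, so the matching checkpoint is picked automatically.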
# Load the 4-bit transformer for the chosen step count and build the pipeline
transformer = NunchakuQwenImageTransformer2DModel.from_pretrained(model_paths[num_inference_steps])
pipe = NunchakuQwenImagePipeline.from_pretrained(
"Qwen/Qwen-Image", transformer=transformer, scheduler=scheduler, torch_dtype=torch.bfloat16
)
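
# Move the pipeline to the GPU. This assumes a CUDA device is available;
# Nunchaku's 4-bit kernels run on GPU, and the pipeline follows the standard
# diffusers device API.
pipe = pipe.to("cuda")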

prompt = """Bookstore window display. A sign displays “New Arrivals This Week”. Below, a shelf tag with the text “Best-Selling Novels Here”. To the side, a colorful poster advertises “Author Meet And Greet on Saturday” with a central portrait of the author. There are four books on the bookshelf, namely “The light between worlds”, “When stars are scattered”, “The silent patient”, and “The night circus”."""
negative_prompt = " "  # effectively unused, since true_cfg_scale=1.0 below disables classifier-free guidance
image = pipe(
prompt=prompt,
negative_prompt=negative_prompt,
width=1024,
height=1024,
num_inference_steps=num_inference_steps,
true_cfg_scale=1.0,
).images[0]
image.save(f"qwen-image-lightning_r{rank}.png")