Unverified commit db5934e7 authored by Muyang Li, committed by GitHub

feat: example scripts for Qwen-Image-Edit (#679)

* update

* update

* docs: update README

* update docs

* style: make linter happy
parent fd51fbd0
......@@ -15,17 +15,18 @@ Join our user groups on [**Discord**](https://discord.gg/Wk6PnwX9Sm) and [**WeCh
## News
- **[2025-09-09]** 🔥 Released **4-bit Qwen-Image-Edit** together with the [4/8-step Lightning](https://huggingface.co/lightx2v/Qwen-Image-Lightning) variants! Models are available on [Hugging Face](https://huggingface.co/nunchaku-tech/nunchaku-qwen-image). Try them out with our [example script](examples/v1/qwen-image-edit.py).
- **[2025-09-04]** 🚀 Official release of **Nunchaku v1.0.0**! Qwen-Image now supports **asynchronous offloading**, reducing VRAM usage to as little as **3 GiB** with no performance loss. Check out the [tutorial](https://nunchaku.tech/docs/nunchaku/usage/qwenimage.html) to get started.
- **[2025-08-27]** 🔥 Released **4-bit [4/8-step lightning Qwen-Image](https://huggingface.co/lightx2v/Qwen-Image-Lightning)**! Download on [Hugging Face](https://huggingface.co/nunchaku-tech/nunchaku-qwen-image) or [ModelScope](https://modelscope.cn/models/nunchaku-tech/nunchaku-qwen-image), and try it with our [example script](examples/v1/qwen-image-lightning.py).
- **[2025-08-15]** 🔥 Our **4-bit Qwen-Image** models are now live on [Hugging Face](https://huggingface.co/nunchaku-tech/nunchaku-qwen-image)! Get started with our [example script](examples/v1/qwen-image.py). *ComfyUI, LoRA, and CPU offloading support are coming soon!*
- **[2025-08-15]** 🚀 The **Python backend** is now available! Explore our Pythonic FLUX models [here](nunchaku/models/transformers/transformer_flux_v2.py) and see the modular **4-bit linear layer** [here](nunchaku/models/linear.py).
- **[2025-07-31]** 🚀 **[FLUX.1-Krea-dev](https://www.krea.ai/blog/flux-krea-open-source-release) is now supported!** Check out our new [example script](./examples/flux.1-krea-dev.py) to get started.
- **[2025-07-13]** 🚀 The official [**Nunchaku documentation**](https://nunchaku.tech/docs/nunchaku/) is now live! Explore comprehensive guides and resources to help you get started.
- **[2025-06-29]** 🔥 Support **FLUX.1-Kontext**! Try out our [example script](./examples/flux.1-kontext-dev.py) to see it in action! Our demo is available at this [link](https://svdquant.mit.edu/kontext/)!
<details>
<summary>More</summary>
- **[2025-06-01]** 🚀 **Release v0.3.0!** This update adds support for multiple-batch inference, [**ControlNet-Union-Pro 2.0**](https://huggingface.co/Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro-2.0), initial integration of [**PuLID**](https://github.com/ToTheBeginning/PuLID), and introduces [**Double FB Cache**](examples/flux.1-dev-double_cache.py). You can now load Nunchaku FLUX models as a single file, and our upgraded [**4-bit T5 encoder**](https://huggingface.co/nunchaku-tech/nunchaku-t5) now matches **FP8 T5** in quality!
- **[2025-04-16]** 🎥 Released tutorial videos in both [**English**](https://youtu.be/YHAVe-oM7U8?si=cM9zaby_aEHiFXk0) and [**Chinese**](https://www.bilibili.com/video/BV1BTocYjEk5/?share_source=copy_web&vd_source=8926212fef622f25cc95380515ac74ee) to assist installation and usage.
- **[2025-04-09]** 📢 Published the [April roadmap](https://github.com/nunchaku-tech/nunchaku/issues/266) and an [FAQ](https://github.com/nunchaku-tech/nunchaku/discussions/262) to help the community get started and stay up to date with Nunchaku’s development.
......
......@@ -16,7 +16,8 @@ Check out `DeepCompressor <github_deepcompressor_>`_ for the quantization librar
:caption: Usage Tutorials
usage/basic_usage.rst
usage/qwenimage.rst
usage/qwen-image.rst
usage/qwen-image-edit.rst
usage/lora.rst
usage/kontext.rst
usage/controlnet.rst
......
......@@ -10,4 +10,5 @@
.. _hf_nunchaku_wheels: https://huggingface.co/nunchaku-tech/nunchaku
.. _hf_ip-adapterv2: https://huggingface.co/XLabs-AI/flux-ip-adapter-v2
.. _hf_qwen-image: https://huggingface.co/Qwen/Qwen-Image
.. _hf_qwen-image-edit: https://huggingface.co/Qwen/Qwen-Image-Edit
.. _hf_qwen-image-lightning: https://huggingface.co/lightx2v/Qwen-Image-Lightning
Qwen-Image-Edit
===============
Original Qwen-Image-Edit
------------------------
`Qwen-Image-Edit <hf_qwen-image-edit_>`_ is the image-editing variant of Qwen-Image.
Below is a minimal example of running the 4-bit quantized `Qwen-Image-Edit <hf_qwen-image-edit_>`_ model with Nunchaku.
Nunchaku offers an API compatible with `Diffusers <github_diffusers_>`_, allowing for a familiar user experience.
.. literalinclude:: ../../../examples/v1/qwen-image-edit.py
:language: python
:caption: Running Qwen-Image-Edit (`examples/v1/qwen-image-edit.py <https://github.com/nunchaku-tech/nunchaku/blob/main/examples/v1/qwen-image-edit.py>`__)
:linenos:
When using Nunchaku, replace the standard ``QwenImageTransformer2DModel`` with :class:`~nunchaku.models.transformers.transformer_qwenimage.NunchakuQwenImageTransformer2DModel`.
The :meth:`~nunchaku.models.transformers.transformer_qwenimage.NunchakuQwenImageTransformer2DModel.from_pretrained` method loads quantized models from either Hugging Face or local file paths.
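For reference, the loading step boils down to the following (a minimal sketch abridged from the example script above; the rank-128 checkpoint path is the one used in the script, and :func:`~nunchaku.utils.get_precision` selects the file matching your GPU):

.. code-block:: python

   import torch
   from diffusers import QwenImageEditPipeline

   from nunchaku import NunchakuQwenImageTransformer2DModel
   from nunchaku.utils import get_precision

   # get_precision() returns "int4" or "fp4" depending on your GPU architecture
   transformer = NunchakuQwenImageTransformer2DModel.from_pretrained(
       f"nunchaku-tech/nunchaku-qwen-image-edit/svdq-{get_precision()}_r128-qwen-image-edit.safetensors"
   )
   pipeline = QwenImageEditPipeline.from_pretrained(
       "Qwen/Qwen-Image-Edit", transformer=transformer, torch_dtype=torch.bfloat16
   )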
.. note::
- The :func:`~nunchaku.utils.get_precision` function automatically detects whether your GPU supports INT4 or FP4 quantization.
Use FP4 models for Blackwell GPUs (RTX 50-series) and INT4 models for other architectures.
- Increasing the rank (e.g., to 128) can improve output quality.
- To reduce VRAM usage, enable asynchronous CPU offloading with :meth:`~nunchaku.models.transformers.transformer_qwenimage.NunchakuQwenImageTransformer2DModel.set_offload`. For further savings, you can also enable Diffusers' ``pipeline.enable_sequential_cpu_offload()``, but be sure to exclude ``transformer`` from that offload, since Nunchaku's offloading mechanism differs from Diffusers'. With these settings, VRAM usage drops to roughly 3 GiB, as shown in the sketch below.
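A minimal sketch of this low-VRAM configuration, abridged from ``examples/v1/qwen-image-edit.py`` (increase ``num_blocks_on_gpu`` if you have more VRAM available):

.. code-block:: python

   # Nunchaku's asynchronous per-layer offloading for the quantized transformer
   transformer.set_offload(True, use_pin_memory=False, num_blocks_on_gpu=1)

   # Let Diffusers offload the remaining modules, but keep the transformer out of
   # its sequential offload, since Nunchaku manages the transformer itself
   pipeline._exclude_from_cpu_offload.append("transformer")
   pipeline.enable_sequential_cpu_offload()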
Distilled Qwen-Image-Edit (Qwen-Image-Lightning)
------------------------------------------------
For faster inference, we provide pre-quantized 4-step and 8-step Qwen-Image-Edit models by integrating `Qwen-Image-Lightning LoRAs <hf_qwen-image-lightning>`_.
See the example script below:
.. literalinclude:: ../../../examples/v1/qwen-image-edit-lightning.py
:language: python
:caption: Running Qwen-Image-Edit-Lightning (`examples/v1/qwen-image-edit-lightning.py <https://github.com/nunchaku-tech/nunchaku/blob/main/examples/v1/qwen-image-edit-lightning.py>`__)
:linenos:
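The main difference from the non-distilled pipeline is the scheduler: the Lightning checkpoints were distilled with a fixed shift of 3, so the example script rebuilds the ``FlowMatchEulerDiscreteScheduler`` before constructing the pipeline. A sketch showing only the shift-related overrides (the example script passes the full configuration; unspecified fields fall back to the scheduler defaults):

.. code-block:: python

   import math

   from diffusers import FlowMatchEulerDiscreteScheduler

   scheduler = FlowMatchEulerDiscreteScheduler.from_config(
       {
           "base_shift": math.log(3),  # the Lightning models are distilled with shift=3
           "max_shift": math.log(3),
           "shift": 1.0,
           "shift_terminal": None,
           "use_dynamic_shifting": True,
           "time_shift_type": "exponential",
           "num_train_timesteps": 1000,
       }
   )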
Custom LoRA support is under development.
......@@ -7,6 +7,7 @@ Original Qwen-Image
.. image:: https://huggingface.co/datasets/nunchaku-tech/cdn/resolve/main/nunchaku/assets/qwen-image.jpg
:alt: Qwen-Image with Nunchaku
`Qwen-Image <hf_qwen-image_>`_ is an image generation foundation model in the Qwen series that achieves significant advances in complex text rendering.
Below is a minimal example of running the 4-bit quantized `Qwen-Image <hf_qwen-image_>`_ model with Nunchaku.
Nunchaku offers an API compatible with `Diffusers <github_diffusers_>`_, allowing for a familiar user experience.
......
import math
import torch
from diffusers import FlowMatchEulerDiscreteScheduler, QwenImageEditPipeline
from diffusers.utils import load_image
from nunchaku import NunchakuQwenImageTransformer2DModel
from nunchaku.utils import get_gpu_memory, get_precision
# From https://github.com/ModelTC/Qwen-Image-Lightning/blob/342260e8f5468d2f24d084ce04f55e101007118b/generate_with_diffusers.py#L82C9-L97C10
scheduler_config = {
"base_image_seq_len": 256,
"base_shift": math.log(3), # We use shift=3 in distillation
"invert_sigmas": False,
"max_image_seq_len": 8192,
"max_shift": math.log(3), # We use shift=3 in distillation
"num_train_timesteps": 1000,
"shift": 1.0,
"shift_terminal": None, # set shift_terminal to None
"stochastic_sampling": False,
"time_shift_type": "exponential",
"use_beta_sigmas": False,
"use_dynamic_shifting": True,
"use_exponential_sigmas": False,
"use_karras_sigmas": False,
}
scheduler = FlowMatchEulerDiscreteScheduler.from_config(scheduler_config)
num_inference_steps = 8  # the 8-step model improves quality; set to 4 for the faster 4-step model
rank = 128  # rank-128 checkpoints give better quality; rank-32 checkpoints use less memory
model_paths = {
4: f"nunchaku-tech/nunchaku-qwen-image-edit/svdq-{get_precision()}_r{rank}-qwen-image-edit-lightningv1.0-4steps.safetensors",
8: f"nunchaku-tech/nunchaku-qwen-image-edit/svdq-{get_precision()}_r{rank}-qwen-image-edit-lightningv1.0-8steps.safetensors",
}
# Load the model
transformer = NunchakuQwenImageTransformer2DModel.from_pretrained(model_paths[num_inference_steps])
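# Assemble the Diffusers pipeline around the quantized transformer and the Lightning scheduler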
pipeline = QwenImageEditPipeline.from_pretrained(
"Qwen/Qwen-Image-Edit", transformer=transformer, scheduler=scheduler, torch_dtype=torch.bfloat16
)
if get_gpu_memory() > 18:
pipeline.enable_model_cpu_offload()
else:
# use per-layer offloading for low VRAM. This only requires 3-4GB of VRAM.
transformer.set_offload(
True, use_pin_memory=False, num_blocks_on_gpu=1
) # increase num_blocks_on_gpu if you have more VRAM
pipeline._exclude_from_cpu_offload.append("transformer")
pipeline.enable_sequential_cpu_offload()
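# Example input: the neon-sign image from the Qwen-Image-Edit demo Space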
image = load_image(
"https://qwen-qwen-image-edit.hf.space/gradio_api/file=/tmp/gradio/d02be0b3422c33fc0ad3c64445959f17d3d61286c2d7dba985df3cd53d484b77/neon_sign.png"
).convert("RGB")
prompt = "change the text to read '双截棍 Qwen Image Edit is here'"
inputs = {
"image": image,
"prompt": prompt,
"true_cfg_scale": 1,
"negative_prompt": " ",
"num_inference_steps": num_inference_steps,
}
output = pipeline(**inputs)
output_image = output.images[0]
output_image.save(f"qwen-image-edit-lightning-r{rank}-{num_inference_steps}steps.png")
import torch
from diffusers import QwenImageEditPipeline
from diffusers.utils import load_image
from nunchaku import NunchakuQwenImageTransformer2DModel
from nunchaku.utils import get_gpu_memory, get_precision
rank = 128  # rank-128 checkpoints give better quality; rank-32 checkpoints use less memory
# Load the model
transformer = NunchakuQwenImageTransformer2DModel.from_pretrained(
f"nunchaku-tech/nunchaku-qwen-image-edit/svdq-{get_precision()}_r{rank}-qwen-image-edit.safetensors"
)
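# Assemble the Diffusers pipeline around the quantized Nunchaku transformer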
pipeline = QwenImageEditPipeline.from_pretrained(
"Qwen/Qwen-Image-Edit", transformer=transformer, torch_dtype=torch.bfloat16
)
if get_gpu_memory() > 18:
pipeline.enable_model_cpu_offload()
else:
# use per-layer offloading for low VRAM. This only requires 3-4GB of VRAM.
transformer.set_offload(
True, use_pin_memory=False, num_blocks_on_gpu=1
) # increase num_blocks_on_gpu if you have more VRAM
pipeline._exclude_from_cpu_offload.append("transformer")
pipeline.enable_sequential_cpu_offload()
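# Example input: the neon-sign image from the Qwen-Image-Edit demo Space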
image = load_image(
"https://qwen-qwen-image-edit.hf.space/gradio_api/file=/tmp/gradio/d02be0b3422c33fc0ad3c64445959f17d3d61286c2d7dba985df3cd53d484b77/neon_sign.png"
).convert("RGB")
prompt = "change the text to read '双截棍 Qwen Image Edit is here'"
inputs = {
"image": image,
"prompt": prompt,
"true_cfg_scale": 4.0,
"negative_prompt": " ",
"num_inference_steps": 50,
}
output = pipeline(**inputs)
output_image = output.images[0]
output_image.save(f"qwen-image-edit-r{rank}.png")
......@@ -42,7 +42,9 @@ if get_gpu_memory() > 18:
pipe.enable_model_cpu_offload()
else:
# use per-layer offloading for low VRAM. This only requires 3-4GB of VRAM.
transformer.set_offload(True)
transformer.set_offload(
True, use_pin_memory=False, num_blocks_on_gpu=1
) # increase num_blocks_on_gpu if you have more VRAM
pipe._exclude_from_cpu_offload.append("transformer")
pipe.enable_sequential_cpu_offload()
......
......@@ -19,7 +19,9 @@ if get_gpu_memory() > 18:
pipe.enable_model_cpu_offload()
else:
# use per-layer offloading for low VRAM. This only requires 3-4GB of VRAM.
transformer.set_offload(True)
transformer.set_offload(
True, use_pin_memory=False, num_blocks_on_gpu=1
) # increase num_blocks_on_gpu if you have more VRAM
pipe._exclude_from_cpu_offload.append("transformer")
pipe.enable_sequential_cpu_offload()
......