Unverified Commit 51732b7a authored by Muyang Li's avatar Muyang Li Committed by GitHub

docs: add the docs of nunchaku (#517)

* update sphinx docs

* update the doc configuration

* configure doxyfile

* start building the docs

* building docs

* building docs

* update docs

* finish the installation documents

* finish the installation documents

* finish the installation documents

* start using rst

* use rst instead of md

* need to figure out how to maintain rst

* update

* make linter happy

* update

* link management

* rst is hard to handle

* fix the title-only errors

* setup the rst linter

* add the lora files

* lora added, need to be more comprehensive

* update

* update

* finished lora docs

* finished the LoRA docs

* finished the cn docs

* finished the cn docs

* finished the qencoder docs

* finished the cpu offload

* finished the offload docs

* add the attention docs

* finished the attention docs

* finished the fbcache

* update

* finished the pulid docs

* make linter happy

* make linter happy

* add kontext

* update

* add the docs for gradio demos

* add docs for test.py

* add the docs for utils.py

* make the doc better displayed

* update

* update

* add some docs

* style: make linter happy

* add docs

* update

* add caching docs

* make linter happy

* add api docs

* fix the t5 docs

* fix the t5 docs

* fix the t5 docs

* hide the private functions

* update

* fix the docs of caching utils

* update docs

* finished the docstring of nunchaku caching

* update packer

* revert the docs

* better docs for packer.py

* better docs for packer.py

* better docs for packer.py

* better docs for packer.py

* update

* update docs

* caching done

* caching done

* lora

* lora

* lora

* update

* python docs

* reorg docs

* add the initial version of faq

* update

* make linter happy

* reorg

* reorg

* add crossref

* make linter happy

* better docs

* make linter happy

* preliminary version of the docs done

* update

* update README

* update README

* docs done

* update README

* update docs

* not using datasets 4 for now
parent 189be8bf
@@ -8,7 +8,7 @@ body:
attributes:
label: Checklist
options:
- label: 1. I have searched for related issues and FAQs (https://github.com/mit-han-lab/nunchaku/blob/main/docs/faq.md) but was unable to find a solution.
- label: 1. I have searched for related issues and FAQs (https://nunchaku.tech/docs/nunchaku/faq/faq.html) but was unable to find a solution.
- label: 2. The issue persists in the latest version.
- label: 3. Please note that without environment information and a minimal reproducible example, it will be difficult for us to reproduce and address the issue, which may delay our response.
- label: 4. If your report is a question rather than a bug, please submit it as a discussion at https://github.com/mit-han-lab/nunchaku/discussions/new/choose. Otherwise, this issue will be closed.
@@ -11,8 +11,8 @@
## Checklist
- [ ] Code is formatted using Pre-Commit hooks.
- [ ] Relevant unit tests are added in the [`tests`](../tests) directory following the guidance in [`tests/README.md`](../tests/README.md).
- [ ] [README](../README.md) and example scripts in [`examples`](../examples) are updated if necessary.
- [ ] Relevant unit tests are added in the [`tests`](../tests) directory following the guidance in [`Contribution Guide`](https://nunchaku.tech/docs/nunchaku/developer/contribution_guide.html).
- [ ] [Documentation](../docs/source) and example scripts in [`examples`](../examples) are updated if necessary.
- [ ] Throughput/latency benchmarks and quality evaluations are included where applicable.
- [ ] **For reviewers:** If you're only helping merge the main branch and haven't contributed code to this PR, please remove yourself as a co-author when merging.
- [ ] Please feel free to join our [Slack](https://join.slack.com/t/nunchaku/shared_invite/zt-3170agzoz-NgZzWaTrEj~n2KEV3Hpl5Q), [Discord](https://discord.gg/Wk6PnwX9Sm) or [WeChat](https://github.com/mit-han-lab/nunchaku/blob/main/assets/wechat.jpg) to discuss your PR.
@@ -103,6 +103,8 @@ instance/
# Sphinx documentation
docs/_build/
docs/build/
docs/doxygen/
# PyBuilder
.pybuilder/
@@ -53,8 +53,22 @@ repos:
rev: v0.17.0
hooks:
- id: yamlfmt
- repo: https://github.com/executablebooks/mdformat
- repo: https://github.com/hukkin/mdformat
rev: 0.7.22
hooks:
- id: mdformat
name: (Markdown) Format docs with mdformat
additional_dependencies:
- mdformat-gfm
- mdformat-black
- mdformat-myst
- repo: https://github.com/PyCQA/doc8
rev: v2.0.0
hooks:
- id: doc8
additional_dependencies: []
- repo: https://github.com/rstcheck/rstcheck
rev: main # should be replaced with the current version
hooks:
- id: rstcheck
additional_dependencies: ['rstcheck[sphinx,toml]']
# Nunchaku INT4 FLUX.1 Models
# Nunchaku 4-Bit FLUX.1 Models
## Text-to-Image Gradio Demo
# Minimal makefile for Sphinx documentation
#
# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = source
BUILDDIR = build
# Put it first so that "make" without argument is like "make help".
help:
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
.PHONY: help Makefile
# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
# Contribution Guide
Welcome to **Nunchaku**! We appreciate your interest in contributing. This guide outlines how to set up your environment, run tests, and submit a Pull Request (PR). Whether you're fixing a minor bug or implementing a major feature, we encourage you to follow these steps for a smooth and efficient contribution process.
## 🚀 Setting Up & Building from Source
### 1. Fork and Clone the Repository
> 📌 **Note:** As a new contributor, you won’t have write access to the official Nunchaku repository. Please fork the repository to your own GitHub account, then clone your fork locally:
```shell
git clone https://github.com/<your_username>/nunchaku.git
```
### 2. Install Dependencies & Build
To install dependencies and build the project, follow the instructions in our [README](../README.md#installation).
## 🧹 Code Formatting with Pre-Commit
We use [pre-commit](https://pre-commit.com/) hooks to ensure code style consistency. Please install and run it before submitting your changes:
```shell
pip install pre-commit
pre-commit install
pre-commit run --all-files
```
- `pre-commit run --all-files` manually triggers all checks and automatically fixes issues where possible. If it fails initially, re-run until all checks pass.
- ✅ **Ensure your code passes all checks before opening a PR.**
- 🚫 **Do not commit directly to the `main` branch.** Always create a feature branch (e.g., `feat/my-new-feature`), commit your changes there, and open a PR from that branch.
## 🧪 Running Unit Tests & Integrating with CI
Nunchaku uses `pytest` for unit testing. If you're adding a new feature, please include corresponding test cases in the [`tests`](../tests) directory.
For detailed guidance on testing, refer to the [`tests/README.md`](../tests/README.md).
## Acknowledgments
This contribution guide is adapted from [SGLang](https://docs.sglang.ai/references/contribution_guide.html). We thank them for the inspiration.
# Contribution Guide
Welcome to **Nunchaku**! We greatly appreciate your interest in contributing. This guide walks you through setting up your environment, running tests, and submitting a Pull Request (PR). Whether you're fixing a small bug or building a major feature, please follow these steps for a smooth and efficient contribution process.
## 🚀 Setting Up & Building from Source
### 1. Fork and Clone the Repository
> 📌 **Note:** As a new contributor, you won't have write access to the official repository yet. Please fork the repository to your own GitHub account first, then clone your fork locally:
```shell
git clone https://github.com/<your_username>/nunchaku.git
```
### 2. Install Dependencies & Build
For the steps to install dependencies and build the project, see the instructions in the [README](../README.md#installation).
## 🧹 Code Formatting with Pre-Commit
We use [pre-commit](https://pre-commit.com/) hooks to keep the code style consistent. Be sure to install and run it before submitting your changes:
```shell
pip install pre-commit
pre-commit install
pre-commit run --all-files
```
- `pre-commit run --all-files` manually triggers all checks and automatically fixes issues where possible. If it fails on the first run, re-run it until all checks pass.
- ✅ **Make sure your code passes all checks before opening a PR.**
- 🚫 **Do not commit directly to the `main` branch.** Always create a feature branch (e.g., `feat/my-new-feature`), commit your changes there, and then open a PR from that branch.
## 🧪 Unit Testing & CI Integration
Nunchaku uses `pytest` for unit testing. When adding a new feature, please add corresponding test cases in the [`tests`](../tests) directory.
For more testing details, see [`tests/README.md`](../tests/README.md).
## Acknowledgments
This contribution guide is adapted from [SGLang](https://docs.sglang.ai/references/contribution_guide.html); we thank them for the inspiration.
### ❗ Import Error: `ImportError: cannot import name 'to_diffusers' from 'nunchaku.lora.flux' (...)` (e.g., mit-han-lab/nunchaku#250)
This error usually indicates that the nunchaku library was not installed correctly. We’ve prepared step-by-step installation guides for Windows users:
📺 [English tutorial](https://youtu.be/YHAVe-oM7U8?si=cM9zaby_aEHiFXk0) | 📺 [Chinese tutorial](https://www.bilibili.com/video/BV1BTocYjEk5/?share_source=copy_web&vd_source=8926212fef622f25cc95380515ac74ee) | 📖 [Corresponding Text guide](https://github.com/mit-han-lab/nunchaku/blob/main/docs/setup_windows.md)
Please also check the following common causes:
- **You only installed the ComfyUI plugin (`ComfyUI-nunchaku`) but not the core `nunchaku` library.** Please follow the [installation instructions in our README](https://github.com/mit-han-lab/nunchaku?tab=readme-ov-file#installation) to install the correct version of the `nunchaku` library.
- **You installed `nunchaku` using `pip install nunchaku`, but this is the wrong package.**
The `nunchaku` name on PyPI is already taken by an unrelated project. Please uninstall the incorrect package and follow our [installation guide](https://github.com/mit-han-lab/nunchaku?tab=readme-ov-file#installation) to install the correct version.
- **(MOST LIKELY) You installed `nunchaku` correctly, but into the wrong Python environment.**
If you're using the ComfyUI portable package, its Python interpreter is very likely not the system default. To identify the correct Python path, launch ComfyUI and check the first few lines of the log. For example, you may see:
```text
** Python executable: G:\ComfyUI\python\python.exe
```
To install `nunchaku` into this environment, use the following format:
```shell
"G:\ComfyUI\python\python.exe" -m pip install <your-wheel-file>.whl
```
Example (for Python 3.11 and torch 2.6):
```shell
"G:\ComfyUI\python\python.exe" -m pip install https://github.com/mit-han-lab/nunchaku/releases/download/v0.2.0/nunchaku-0.2.0+torch2.6-cp311-cp311-linux_x86_64.whl
```
- **You have a folder named `nunchaku` in your working directory.**
Python may mistakenly load from that local folder instead of the installed library. Also, make sure your plugin folder under `custom_nodes` is named `ComfyUI-nunchaku`, not `nunchaku`.
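To see which location Python will actually import a package from, you can query the import machinery. The sketch below uses the standard-library `json` package as a stand-in so it runs anywhere; substitute `nunchaku` in your own environment:

```python
import importlib.util

# "json" is only a stand-in here; replace it with "nunchaku" when
# diagnosing your own environment.
spec = importlib.util.find_spec("json")
print(spec.origin)  # filesystem path the package would be imported from
```

If a stray local `nunchaku` folder is shadowing the installed library, the printed path will point into your working directory rather than `site-packages`.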
### ❗ Runtime Error: `Assertion failed: this->shape.dataExtent == other.shape.dataExtent, file ...Tensor.h` (e.g., mit-han-lab/nunchaku#212)
This error is typically due to using the wrong model for your GPU.
- If you're using a **Blackwell GPU (e.g., RTX 50-series)**, please use our **FP4** models.
- For all other GPUs, use our **INT4** models.
### ❗ System crash or blue screen (e.g., mit-han-lab/nunchaku#57)
We have observed some cases where memory is not properly released after image generation, especially when using ComfyUI. This may lead to system instability or crashes.
We’re actively investigating this issue. If you have experience or insights into memory management in ComfyUI, we would appreciate your help!
### ❗ Out of Memory or Slow Model Loading (e.g., mit-han-lab/nunchaku#249, mit-han-lab/nunchaku#311, mit-han-lab/nunchaku#276)
Try upgrading your CUDA driver and setting the environment variable `NUNCHAKU_LOAD_METHOD` to either `READ` or `READNOPIN`.
### ❗ Same Seeds Produce Slightly Different Images (e.g., mit-han-lab/nunchaku#229, mit-han-lab/nunchaku#294)
This behavior is due to minor precision noise introduced by the GPU’s accumulation order. Because modern GPUs execute operations out of order for better performance, small variations in output can occur, even with the same seed.
Enforcing strict accumulation order would reduce this variability but significantly hurt performance, so we do not plan to change this behavior.
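This is ordinary floating-point behavior rather than anything specific to Nunchaku; you can reproduce the effect on the CPU with plain Python floats by summing the same numbers in two different orders:

```python
import random

random.seed(0)
xs = [random.uniform(-1.0, 1.0) for _ in range(100_000)]

# Mathematically the two sums are identical; floating-point rounding makes
# them differ slightly depending on accumulation order.
forward = sum(xs)
backward = sum(reversed(xs))
print(abs(forward - backward))  # tiny, but usually nonzero
```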
### ❓ PuLID Support (e.g., mit-han-lab/nunchaku#258)
PuLID support is currently in development and will be included in the next major release.
### ~~❗ Assertion Error: `Assertion failed: a.dtype() == b.dtype(), file ...misc_kernels.cu` (e.g., mit-han-lab/nunchaku#30)~~
~~At the moment, we **only support the 16-bit version of [ControlNet-Union-Pro](https://huggingface.co/Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro)**. Support for FP8 and other ControlNets is planned for a future release.~~ ✅ This issue has now been resolved.
### ~~❗ Assertion Error: `assert image_rotary_emb.shape[2] == batch_size * (txt_tokens + img_tokens)` (e.g., [#24](https://github.com/mit-han-lab/ComfyUI-nunchaku/issues/24))~~
~~Currently, **batch sizes greater than 1 are not supported** during inference. We will support this in a future major release.~~ ✅ Multi-batch inference is now supported as of [v0.3.0dev0](https://github.com/mit-han-lab/nunchaku/releases/tag/v0.3.0dev0).
### ❗ Import Error: `ImportError: cannot import name 'to_diffusers' from 'nunchaku.lora.flux' (...)` (e.g., mit-han-lab/nunchaku#250)
This error usually means the `nunchaku` library was not installed correctly. We have prepared step-by-step installation guides for Windows users:
📺 [English tutorial](https://youtu.be/YHAVe-oM7U8?si=cM9zaby_aEHiFXk0) | 📺 [Chinese tutorial](https://www.bilibili.com/video/BV1BTocYjEk5/?share_source=copy_web&vd_source=8926212fef622f25cc95380515ac74ee) | 📖 [Corresponding text guide](https://github.com/mit-han-lab/nunchaku/blob/main/docs/setup_windows.md)
Please also check the following common causes:
- **You only installed the ComfyUI plugin (`ComfyUI-nunchaku`) without installing the core `nunchaku` library.** Please follow the [installation instructions in the README](https://github.com/mit-han-lab/nunchaku?tab=readme-ov-file#installation) to install the correct version of the `nunchaku` library.
- **You installed the wrong package via `pip install nunchaku`.**
The `nunchaku` name on PyPI is taken by an unrelated project. Please uninstall the wrong package and follow our [installation guide](https://github.com/mit-han-lab/nunchaku?tab=readme-ov-file#installation).
- **(Most common) You installed `nunchaku` correctly, but into the wrong Python environment.**
If you're using the ComfyUI portable package, its Python interpreter is most likely not the system default. After launching ComfyUI, check the Python path near the top of the log, for example:
```text
** Python executable: G:\ComfyUI\python\python.exe
```
Install into that environment with:
```shell
"G:\ComfyUI\python\python.exe" -m pip install <your-wheel-file>.whl
```
Example (Python 3.11 + Torch 2.6):
```shell
"G:\ComfyUI\python\python.exe" -m pip install https://github.com/mit-han-lab/nunchaku/releases/download/v0.2.0/nunchaku-0.2.0+torch2.6-cp311-cp311-linux_x86_64.whl
```
- **There is a folder named `nunchaku` in your working directory.**
Python may mistakenly load that local folder instead of the installed library. Also make sure the plugin folder under `custom_nodes` is named `ComfyUI-nunchaku`, not `nunchaku`.
### ❗ Runtime Error: `Assertion failed: this->shape.dataExtent == other.shape.dataExtent, file ...Tensor.h` (e.g., mit-han-lab/nunchaku#212)
This error is usually caused by using a model that does not match your GPU:
- For **Blackwell GPUs (e.g., the RTX 50-series)**, use the **FP4** models.
- For all other GPUs, use the **INT4** models.
### ❗ System Crash or Blue Screen (e.g., mit-han-lab/nunchaku#57)
We have observed cases, especially with ComfyUI, where memory is not properly released after image generation, which can make the system unstable or crash. We are actively investigating this issue; if you have experience with memory management in ComfyUI, we would welcome your help!
### ❗ Out of Memory or Slow Model Loading (e.g., mit-han-lab/nunchaku#249, mit-han-lab/nunchaku#311, mit-han-lab/nunchaku#276)
Try upgrading your CUDA driver and setting the environment variable `NUNCHAKU_LOAD_METHOD` to either `READ` or `READNOPIN`.
### ❗ Same Seeds Produce Slightly Different Images (e.g., mit-han-lab/nunchaku#229, mit-han-lab/nunchaku#294)
This is caused by minor precision noise from the GPU's computation order. Enforcing a fixed accumulation order would significantly hurt performance, so we do not plan to change this behavior.
### ❓ PuLID Support (e.g., mit-han-lab/nunchaku#258)
PuLID support is under development and will be included in the next major release.
### ~~❗ Assertion Error: `Assertion failed: a.dtype() == b.dtype(), file ...misc_kernels.cu` (e.g., mit-han-lab/nunchaku#30)~~
~~At the moment, we **only support the 16-bit version of [ControlNet-Union-Pro](https://huggingface.co/Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro)**. Support for FP8 and other ControlNets is planned for a future release.~~ ✅ This issue has been resolved.
### ~~❗ Assertion Error: `assert image_rotary_emb.shape[2] == batch_size * (txt_tokens + img_tokens)` (e.g., mit-han-lab/nunchaku#24)~~
~~Batch sizes greater than 1 are currently **not supported** during inference.~~ ✅ Multi-batch inference is supported as of [v0.3.0dev0](https://github.com/mit-han-lab/nunchaku/releases/tag/v0.3.0dev0).
# Nunchaku Setup Guide (Windows)
# Environment Setup
## 1. Install CUDA
Download and install the latest CUDA Toolkit from the official [NVIDIA CUDA Downloads](https://developer.nvidia.com/cuda-downloads?target_os=Windows&target_arch=x86_64&target_version=Server2022&target_type=exe_local). After installation, verify the installation:
```bash
nvcc --version
```
## 2. Install Visual Studio C++ Build Tools
Download from the official [Visual Studio Build Tools page](https://visualstudio.microsoft.com/visual-cpp-build-tools/). During installation, select the following workloads:
- **Desktop development with C++**
- **C++ tools for Linux development**
## 3. Install Git
Download Git from [https://git-scm.com/downloads/win](https://git-scm.com/downloads/win) and follow the installation steps.
## 4. (Optional) Install Conda
Conda helps manage Python environments. You can install either Anaconda or Miniconda from the [official site](https://www.anaconda.com/download/success).
## 5. (Optional) Installing ComfyUI
There are various ways to install ComfyUI; for example, you can use the ComfyUI CLI. Once Python is installed, install ComfyUI via the CLI:
```shell
pip install comfy-cli
comfy-cli install
```
To launch ComfyUI:
```shell
comfy-cli launch
```
# Installing Nunchaku
## Step 1: Identify Your Python Environment
To ensure correct installation, you need to find the Python interpreter used by ComfyUI. Launch ComfyUI and look for this line in the log:
```bash
** Python executable: G:\ComfyUI\python\python.exe
```
Then verify the Python version and installed PyTorch version:
```bash
"G:\ComfyUI\python\python.exe" --version
"G:\ComfyUI\python\python.exe" -m pip show torch
```
## Step 2: Install PyTorch (≥2.5) if you haven’t
Install the PyTorch build appropriate for your setup:
- **For most users**:
```bash
"G:\ComfyUI\python\python.exe" -m pip install torch==2.6 torchvision==0.21 torchaudio==2.6
```
- **For RTX 50-series GPUs** (requires PyTorch ≥2.7 with CUDA 12.8):
```bash
"G:\ComfyUI\python\python.exe" -m pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128
```
## Step 3: Install Nunchaku
### Prebuilt Wheels
You can install Nunchaku wheels from one of the following:
- [Hugging Face](https://huggingface.co/mit-han-lab/nunchaku/tree/main)
- [ModelScope](https://modelscope.cn/models/Lmxyy1999/nunchaku)
- [GitHub Releases](https://github.com/mit-han-lab/nunchaku/releases)
Example (for Python 3.10 + PyTorch 2.6):
```bash
"G:\ComfyUI\python\python.exe" -m pip install https://huggingface.co/mit-han-lab/nunchaku/resolve/main/nunchaku-0.2.0+torch2.6-cp310-cp310-win_amd64.whl
```
To verify the installation:
```bash
"G:\ComfyUI\python\python.exe" -c "import nunchaku"
```
You can also run a test (requires a Hugging Face token for downloading the models):
```bash
"G:\ComfyUI\python\python.exe" -m huggingface-cli login
"G:\ComfyUI\python\python.exe" -m nunchaku.test
```
### (Alternative) Build Nunchaku from Source
Please use CMD instead of PowerShell for building.
- Step 1: Install Build Tools
```bash
"G:\ComfyUI\python\python.exe" -m pip install ninja setuptools wheel build
```
- Step 2: Clone the Repository
```bash
git clone https://github.com/mit-han-lab/nunchaku.git
cd nunchaku
git submodule init
git submodule update
```
- Step 3: Set Up Visual Studio Environment
Locate the `VsDevCmd.bat` script on your system. Example path:
```
C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\Common7\Tools\VsDevCmd.bat
```
Then run:
```bash
"C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\Common7\Tools\VsDevCmd.bat" -startdir=none -arch=x64 -host_arch=x64
set DISTUTILS_USE_SDK=1
```
- Step 4: Build Nunchaku
```bash
"G:\ComfyUI\python\python.exe" setup.py develop
```
Verify with:
```bash
"G:\ComfyUI\python\python.exe" -c "import nunchaku"
```
You can also run a test (requires a Hugging Face token for downloading the models):
```bash
"G:\ComfyUI\python\python.exe" -m huggingface-cli login
"G:\ComfyUI\python\python.exe" -m nunchaku.test
```
- (Optional) Step 5: Build a Wheel for Portable Python
If building directly with the portable Python fails, you can first build the wheel in a working Conda environment, then install the resulting `.whl` file using your portable Python:
```shell
:: Run these in the working Conda environment
set NUNCHAKU_INSTALL_MODE=ALL
python -m build --wheel --no-isolation
:: Then install the built wheel with the portable Python
"G:\ComfyUI\python\python.exe" -m pip install dist\<your-wheel-file>.whl
```
# Use Nunchaku in ComfyUI
## 1. Install the Plugin
Clone the [ComfyUI-Nunchaku](https://github.com/mit-han-lab/ComfyUI-nunchaku) plugin into the `custom_nodes` folder:
```
cd ComfyUI/custom_nodes
git clone https://github.com/mit-han-lab/ComfyUI-nunchaku.git
```
Alternatively, install using [ComfyUI-Manager](https://github.com/Comfy-Org/ComfyUI-Manager) or `comfy-cli`.
## 2. Download Models
- **Standard FLUX.1-dev Models**
Start by downloading the standard [FLUX.1-dev text encoders](https://huggingface.co/comfyanonymous/flux_text_encoders/tree/main) and [VAE](https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/ae.safetensors). You can also optionally download the original [BF16 FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/flux1-dev.safetensors) model. An example command:
```bash
huggingface-cli download comfyanonymous/flux_text_encoders clip_l.safetensors --local-dir models/text_encoders
huggingface-cli download comfyanonymous/flux_text_encoders t5xxl_fp16.safetensors --local-dir models/text_encoders
huggingface-cli download black-forest-labs/FLUX.1-schnell ae.safetensors --local-dir models/vae
huggingface-cli download black-forest-labs/FLUX.1-dev flux1-dev.safetensors --local-dir models/diffusion_models
```
- **SVDQuant 4-bit FLUX.1-dev Models**
Next, download the SVDQuant 4-bit models:
- For **50-series GPUs**, use the [FP4 model](https://huggingface.co/mit-han-lab/svdq-fp4-flux.1-dev).
- For **other GPUs**, use the [INT4 model](https://huggingface.co/mit-han-lab/svdq-int4-flux.1-dev).
Make sure to place the **entire downloaded folder** into `models/diffusion_models`. For example:
```bash
huggingface-cli download mit-han-lab/svdq-int4-flux.1-dev --local-dir models/diffusion_models/svdq-int4-flux.1-dev
```
- **(Optional): Download Sample LoRAs**
You can test with some sample LoRAs like [FLUX.1-Turbo](https://huggingface.co/alimama-creative/FLUX.1-Turbo-Alpha/blob/main/diffusion_pytorch_model.safetensors) and [Ghibsky](https://huggingface.co/aleksa-codes/flux-ghibsky-illustration/blob/main/lora.safetensors). Place these files in the `models/loras` directory:
```bash
huggingface-cli download alimama-creative/FLUX.1-Turbo-Alpha diffusion_pytorch_model.safetensors --local-dir models/loras
huggingface-cli download aleksa-codes/flux-ghibsky-illustration lora.safetensors --local-dir models/loras
```
## 3. Set Up Workflows
To use the official workflows, download them from the [ComfyUI-nunchaku repository](https://github.com/mit-han-lab/ComfyUI-nunchaku/tree/main/workflows) and place them in your `ComfyUI/user/default/workflows` directory. For example:
```bash
# From the root of your ComfyUI folder
cp -r custom_nodes/ComfyUI-nunchaku/workflows user/default/workflows/nunchaku_examples
```
You can now launch ComfyUI and try running the example workflows.
# Troubleshooting
If you encounter issues, refer to our:
- [FAQs](https://github.com/mit-han-lab/nunchaku/discussions/262)
- [GitHub Issues](https://github.com/mit-han-lab/nunchaku/issues)
# Configuration file for the Sphinx documentation builder.
#
# For the full list of built-in configuration values, see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html
# -- Project information -----------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information
from pathlib import Path
project = "Nunchaku"
copyright = "2025, Nunchaku Team"
author = "Nunchaku Team"
version_path = Path(__file__).parent.parent.parent / "nunchaku" / "__version__.py"
version_ns = {}
exec(version_path.read_text(), {}, version_ns)
version = release = version_ns["__version__"]
# release = version
# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration
extensions = [
"sphinx.ext.autodoc",
"sphinx.ext.autosummary",
"sphinx.ext.napoleon",
"sphinx.ext.viewcode",
"sphinx.ext.autosectionlabel",
"sphinx.ext.intersphinx",
"sphinx_tabs.tabs",
"sphinx.ext.extlinks",
"myst_parser",
"sphinx_copybutton",
"sphinxcontrib.mermaid",
"nbsphinx",
"sphinx.ext.mathjax",
"breathe",
]
templates_path = ["_templates"]
exclude_patterns = []
# -- Include global link definitions -----------------------------------------
with open(Path(__file__).parent / "links.rst", encoding="utf-8") as f:
rst_epilog = f.read()
# -- Options for HTML output -------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output
html_theme = "sphinx_book_theme"
html_static_path = ["_static"]
napoleon_google_docstring = False
napoleon_numpy_docstring = True
extlinks = {
"nunchaku-issue": ("https://github.com/mit-han-lab/nunchaku/issues/%s", "nunchaku#%s"),
"comfyui-issue": ("https://github.com/mit-han-lab/ComfyUI-nunchaku/issues/%s", "ComfyUI-nunchaku#%s"),
}
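Each `extlinks` entry maps a custom role to a `(url_template, caption_template)` pair, with `%s` replaced by the role's argument. A quick sketch of the expansion:

```python
extlinks = {
    "nunchaku-issue": ("https://github.com/mit-han-lab/nunchaku/issues/%s", "nunchaku#%s"),
}

# :nunchaku-issue:`249` renders as a link labeled "nunchaku#249"
url_template, caption_template = extlinks["nunchaku-issue"]
print(caption_template % "249")  # nunchaku#249
print(url_template % "249")      # https://github.com/mit-han-lab/nunchaku/issues/249
```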
.. Adapting from https://docs.sglang.ai/references/contribution_guide.html
Contribution Guide
==================
Welcome to **Nunchaku**! We appreciate your interest in contributing.
This guide outlines how to set up your environment, run tests, and submit a Pull Request (PR).
Whether you're fixing a minor bug or implementing a major feature, we encourage you to
follow these steps for a smooth and efficient contribution process.
🚀 Setting Up & Building from Source
------------------------------------
1. Fork and Clone the Repository
.. note::
As a new contributor, you won't have write access to the `Nunchaku repository <nunchaku_repo_>`_.
Please fork the repository to your own GitHub account, then clone your fork locally:
.. code-block:: shell
git clone https://github.com/<your_username>/nunchaku.git
2. Install Dependencies & Build
To install dependencies and build the project, follow the instructions in :doc:`Installation <../installation/installation>`.
🧹 Code Formatting with Pre-Commit
----------------------------------
We use `pre-commit <https://pre-commit.com/>`_ hooks to ensure code style consistency. Please install and run it before submitting your changes:
.. code-block:: shell
pip install pre-commit
pre-commit install
pre-commit run --all-files
- ``pre-commit run --all-files`` manually triggers all checks and automatically fixes issues where possible. If it fails initially, re-run until all checks pass.
- ✅ **Ensure your code passes all checks before opening a PR.**
- 🚫 **Do not commit directly to the** ``main`` **branch.** Always create a feature branch (e.g., ``feat/my-new-feature``), commit your changes there, and open a PR from that branch.
🧪 Running Unit Tests & Integrating with CI
-------------------------------------------
Nunchaku uses ``pytest`` for unit testing. If you're adding a new feature,
please include corresponding test cases in the ``tests`` directory.
**Please avoid modifying existing tests.**
Running the Tests
~~~~~~~~~~~~~~~~~
.. code-block:: shell
HF_TOKEN=$YOUR_HF_TOKEN pytest -v tests/flux/test_flux_memory.py
HF_TOKEN=$YOUR_HF_TOKEN pytest -v tests/flux --ignore=tests/flux/test_flux_memory.py
HF_TOKEN=$YOUR_HF_TOKEN pytest -v tests/sana
.. note::
``$YOUR_HF_TOKEN`` refers to your Hugging Face access token, required to download models and datasets.
You can create one at https://huggingface.co/settings/tokens.
If you've already logged in using ``huggingface-cli login``,
you can skip setting this environment variable.
Some tests generate images using the original 16-bit models. You can cache these results to speed up future test runs by setting the environment variable ``NUNCHAKU_TEST_CACHE_ROOT``. If not set, the images will be saved in ``test_results/ref``.
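For example, to keep those reference images in a persistent location across runs (the path below is illustrative):

```shell
# Cache the 16-bit reference images between test runs
export NUNCHAKU_TEST_CACHE_ROOT="$HOME/.cache/nunchaku_test_ref"
mkdir -p "$NUNCHAKU_TEST_CACHE_ROOT"
```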
Writing Tests
~~~~~~~~~~~~~
When adding a new feature, please include corresponding test cases in the ``tests`` directory. **Please avoid modifying existing tests.**
To test visual output correctness, you can:
1. **Generate reference images:** Use the original 16-bit model to produce a small number of reference images (e.g., 4).
2. **Generate comparison images:** Run your method using the **same inputs and seeds** to ensure deterministic outputs. You can control the seed by setting the ``generator`` parameter in the diffusers pipeline.
3. **Compute similarity:** Evaluate the similarity between your outputs and the reference images using the `LPIPS <https://arxiv.org/abs/1801.03924>`_ metric. Use the ``compute_lpips`` function provided in ``tests/flux/utils.py``:
.. code-block:: python
lpips = compute_lpips(dir1, dir2)
Here, ``dir1`` should point to the directory containing the reference images, and ``dir2`` should contain the images generated by your method.
Setting the LPIPS Threshold
~~~~~~~~~~~~~~~~~~~~~~~~~~~
To pass the test, the LPIPS score must be below a predefined threshold—typically **< 0.3**.
We recommend first running the comparison locally to observe the LPIPS value,
and then setting the threshold slightly above that value to allow for minor variations.
Since the test is based on a small sample of images, slight fluctuations are expected;
a margin of **+0.04** is generally sufficient.
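The recipe above can be sketched as follows; all numbers are illustrative, and ``passes`` is a hypothetical helper rather than part of the test suite:

```python
observed_lpips = 0.21  # LPIPS measured locally against the reference images
margin = 0.04          # allowance for run-to-run fluctuation
threshold = observed_lpips + margin  # still below the typical 0.3 bound

def passes(lpips_score: float) -> bool:
    """A run passes when its LPIPS score stays under the chosen threshold."""
    return lpips_score < threshold

print(passes(0.23))  # True
print(passes(0.30))  # False
```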
Frequently Asked Questions (FAQ)
================================
.. toctree::
:maxdepth: 2
model.rst
usage.rst
Model
=====
Which model should I choose: INT4 or FP4?
-----------------------------------------
- For **Blackwell GPUs** (such as the RTX 50-series), we recommend using our **FP4** models for optimal compatibility and performance.
- For all other GPUs, please use our **INT4** models.
Usage
=====
Out of memory or slow model loading
-----------------------------------
If you encounter out-of-memory errors or notice that model loading is unusually slow, please try the following steps:
- **Update your CUDA driver** to the latest version, as this can resolve many compatibility and performance issues.
- **Set the environment variable** `NUNCHAKU_LOAD_METHOD` to either `READ` or `READNOPIN`.
.. note::
**Related issues:** :nunchaku-issue:`249`, :nunchaku-issue:`276`, :nunchaku-issue:`311`
Why do the same seeds produce slightly different images with Nunchaku?
----------------------------------------------------------------------
This behavior is due to minor precision noise introduced by the GPU’s accumulation order.
Because modern GPUs execute operations out of order for better performance, small variations in output can occur, even with the same seed.
Enforcing strict accumulation order would reduce this variability but significantly hurt performance, so we do not plan to change this behavior.
.. note::
**Related issues:** :nunchaku-issue:`229`, :nunchaku-issue:`294`
Nunchaku Documentation
======================
**Nunchaku** is a high-performance inference engine optimized for low-bit diffusion models and LLMs,
as introduced in our paper `SVDQuant <svdquant_paper_>`_.
Check out `DeepCompressor <deepcompressor_repo_>`_ for the quantization library.
.. toctree::
:maxdepth: 2
:caption: Installation
installation/installation.rst
installation/setup_windows.rst
.. toctree::
:maxdepth: 1
:caption: Usage Tutorials
usage/basic_usage.rst
usage/lora.rst
usage/kontext.rst
usage/controlnet.rst
usage/qencoder.rst
usage/offload.rst
usage/attention.rst
usage/fbcache.rst
usage/pulid.rst
.. toctree::
:maxdepth: 1
:caption: Python API Reference
python_api/nunchaku.rst
.. toctree::
:maxdepth: 1
:caption: Useful Tools
:titlesonly:
ComfyUI Support: ComfyUI-nunchaku Plugin <https://github.com/mit-han-lab/ComfyUI-nunchaku>
Customized Model Quantization: DeepCompressor <https://github.com/mit-han-lab/deepcompressor>
Gradio Demos <https://github.com/mit-han-lab/nunchaku/tree/main/app>
.. toctree::
:maxdepth: 1
:caption: Other Resources
faq/faq.rst
developer/contribution_guide.rst