Unverified Commit f082491b authored by Muyang Li's avatar Muyang Li Committed by GitHub

docs: update docs (#544)

* update docs

* add docs

* add api reference

* fixing the links

* update

* docs: update the html theme

* chore: clean a useless workflow
parent 8dc0360e
......@@ -4,4 +4,5 @@ nunchaku.models.transformers.utils
.. automodule:: nunchaku.models.transformers.utils
:members:
:undoc-members:
:private-members:
:show-inheritance:
......@@ -6,7 +6,7 @@ and 50-series GPUs compared to FlashAttention-2, without precision loss.
.. literalinclude:: ../../../examples/flux.1-dev-fp16attn.py
:language: python
:caption: Running FLUX.1-dev with FP16 Attention (`examples/flux.1-dev-fp16attn.py <https://github.com/nunchaku-tech/nunchaku/blob/main/examples/flux.1-dev-fp16attn.py>`__)
:linenos:
:emphasize-lines: 11
......
Basic Usage
===========
The following is a minimal script for running 4-bit `FLUX.1 <github_flux_>`_ using Nunchaku.
Nunchaku provides the same API as `Diffusers <github_diffusers_>`_, so you can use it in a familiar way.
.. tabs::
......@@ -10,18 +10,19 @@ Nunchaku provides the same API as `Diffusers <diffusers_repo_>`_, so you can use
.. literalinclude:: ../../../examples/flux.1-dev.py
:language: python
:caption: Running FLUX.1-dev (`examples/flux.1-dev.py <https://github.com/nunchaku-tech/nunchaku/blob/main/examples/flux.1-dev.py>`__)
:linenos:
.. tab:: Turing GPUs (e.g., RTX 20 series)
.. literalinclude:: ../../../examples/flux.1-dev-turing.py
:language: python
:caption: Running FLUX.1-dev on Turing GPUs (`examples/flux.1-dev-turing.py <https://github.com/nunchaku-tech/nunchaku/blob/main/examples/flux.1-dev-turing.py>`__)
:linenos:
The key difference when using Nunchaku is replacing the standard ``FluxTransformer2dModel``
with :class:`~nunchaku.models.transformers.transformer_flux.NunchakuFluxTransformer2dModel`.
The :meth:`~nunchaku.models.transformers.transformer_flux.NunchakuFluxTransformer2dModel.from_pretrained`
method loads quantized models and accepts either Hugging Face remote file paths or local file paths.
.. note::
......
ControlNets
===========
.. image:: https://huggingface.co/datasets/nunchaku-tech/cdn/resolve/main/nunchaku/assets/control.jpg
:alt: ControlNet Integration with Nunchaku
Nunchaku mainly supports two types of ControlNets for FLUX.1.
The first one is `FLUX.1-tools <blog_flux1-tools_>`_ from Black-Forest-Labs.
The second one is the community-contributed ControlNets, like `ControlNet-Union-Pro <hf_cn-union-pro_>`_.
FLUX.1-tools
------------
......@@ -16,8 +16,8 @@ FLUX.1-tools Base Models
Nunchaku provides quantized FLUX.1-tools base models.
The implementation follows the same pattern as described in :doc:`Basic Usage <./basic_usage>`,
utilizing an API interface compatible with `Diffusers <github_diffusers_>`_
where the ``FluxTransformer2dModel`` is replaced with :class:`~nunchaku.models.transformers.transformer_flux.NunchakuFluxTransformer2dModel`.
The primary modification involves switching to the appropriate ControlNet pipeline.
Refer to the following examples for detailed implementation guidance.
......@@ -27,28 +27,28 @@ Refer to the following examples for detailed implementation guidance.
.. literalinclude:: ../../../examples/flux.1-canny-dev.py
:language: python
:caption: Running FLUX.1-Canny-Dev (`examples/flux.1-canny-dev.py <https://github.com/nunchaku-tech/nunchaku/blob/main/examples/flux.1-canny-dev.py>`__)
:linenos:
.. tab:: FLUX.1-Depth-Dev
.. literalinclude:: ../../../examples/flux.1-depth-dev.py
:language: python
:caption: Running FLUX.1-Depth-Dev (`examples/flux.1-depth-dev.py <https://github.com/nunchaku-tech/nunchaku/blob/main/examples/flux.1-depth-dev.py>`__)
:linenos:
.. tab:: FLUX.1-Fill-Dev
.. literalinclude:: ../../../examples/flux.1-fill-dev.py
:language: python
:caption: Running FLUX.1-Fill-Dev (`examples/flux.1-fill-dev.py <https://github.com/nunchaku-tech/nunchaku/blob/main/examples/flux.1-fill-dev.py>`__)
:linenos:
.. tab:: FLUX.1-Redux-Dev
.. literalinclude:: ../../../examples/flux.1-redux-dev.py
:language: python
:caption: Running FLUX.1-Redux-Dev (`examples/flux.1-redux-dev.py <https://github.com/nunchaku-tech/nunchaku/blob/main/examples/flux.1-redux-dev.py>`__)
:linenos:
FLUX.1-tools LoRAs
......@@ -64,31 +64,31 @@ requiring only the ``FluxControlPipeline`` for the target model.
.. literalinclude:: ../../../examples/flux.1-canny-dev-lora.py
:language: python
:caption: Running FLUX.1-Canny-Dev-LoRA (`examples/flux.1-canny-dev-lora.py <https://github.com/nunchaku-tech/nunchaku/blob/main/examples/flux.1-canny-dev-lora.py>`__)
:linenos:
.. tab:: FLUX.1-Depth-Dev
.. literalinclude:: ../../../examples/flux.1-depth-dev-lora.py
:language: python
:caption: Running FLUX.1-Depth-Dev-LoRA (`examples/flux.1-depth-dev-lora.py <https://github.com/nunchaku-tech/nunchaku/blob/main/examples/flux.1-depth-dev-lora.py>`__)
:linenos:
ControlNet-Union-Pro
--------------------
`ControlNet-Union-Pro <hf_cn-union-pro_>`_ is a community-developed ControlNet implementation for FLUX.1.
Unlike FLUX.1-tools, which directly fine-tunes the model to incorporate control signals,
`ControlNet-Union-Pro <hf_cn-union-pro_>`_ uses additional control modules.
It provides native support for multiple control types including Canny edges and depth maps.
Nunchaku currently executes these control modules at their original precision levels.
The following example demonstrates running `ControlNet-Union-Pro <hf_cn-union-pro_>`_ with Nunchaku.
.. literalinclude:: ../../../examples/flux.1-dev-controlnet-union-pro.py
:language: python
:caption: Running ControlNet-Union-Pro (`examples/flux.1-dev-controlnet-union-pro.py <https://github.com/nunchaku-tech/nunchaku/blob/main/examples/flux.1-dev-controlnet-union-pro.py>`__)
:linenos:
Usage for `ControlNet-Union-Pro2 <hf_cn-union-pro2_>`_ is similar.
Quantized ControlNet support is currently in development. Stay tuned!
.. _usage-fbcache:
First-Block Cache
=================
......@@ -5,7 +7,7 @@ Nunchaku supports `First-Block Cache (FB Cache) <fbcache>`_ for faster long-step
.. literalinclude:: ../../../examples/flux.1-dev-cache.py
:language: python
:caption: Running FLUX.1-dev with FB Cache (`examples/flux.1-dev-cache.py <https://github.com/nunchaku-tech/nunchaku/blob/main/examples/flux.1-dev-cache.py>`__)
:linenos:
:emphasize-lines: 15-17
......
FLUX.1-Kontext
==============
.. image:: https://huggingface.co/datasets/nunchaku-tech/cdn/resolve/main/nunchaku/assets/kontext.png
:alt: FLUX.1-Kontext-dev integration with Nunchaku
Nunchaku supports `FLUX-Kontext-dev <flux1_kontext_dev_>`_,
......@@ -10,5 +10,5 @@ The implementation follows the same pattern as described in :doc:`Basic Usage <.
.. literalinclude:: ../../../examples/flux.1-kontext-dev.py
:language: python
:caption: Running FLUX.1-Kontext-dev (`examples/flux.1-kontext-dev.py <https://github.com/nunchaku-tech/nunchaku/blob/main/examples/flux.1-kontext-dev.py>`__)
:linenos:
Customized LoRAs
================
.. image:: https://huggingface.co/datasets/nunchaku-tech/cdn/resolve/main/nunchaku/assets/lora.jpg
:alt: LoRA integration with Nunchaku
Single LoRA
-----------
`Nunchaku <github_nunchaku_>`_ seamlessly integrates with off-the-shelf LoRAs without requiring requantization.
Instead of fusing the LoRA branch into the main branch, we directly concatenate the LoRA weights to our low-rank branch.
As Nunchaku uses fused kernels, the overhead of the separate low-rank branch is largely reduced.
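The concatenation trick above can be checked with a toy numeric sketch (plain Python, not Nunchaku code): stacking the LoRA's down-projection rows and up-projection columns onto an existing low-rank branch computes exactly the same output as running the LoRA as a separate branch and summing.

```python
# Toy sketch: concatenating LoRA factors onto an existing low-rank branch
# is equivalent to adding a separate LoRA branch. All shapes are tiny.

def matvec(m, v):
    # Multiply matrix m (list of rows) by vector v.
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def vadd(a, b):
    return [x + y for x, y in zip(a, b)]

x = [1.0, 2.0, 3.0]

# Existing rank-1 low-rank branch: up @ (down @ x)
down = [[0.5, 0.0, 0.5]]        # 1x3
up = [[1.0], [2.0], [0.0]]      # 3x1

# Rank-1 LoRA branch: lora_up @ (lora_down @ x)
lora_down = [[0.0, 1.0, 0.0]]   # 1x3
lora_up = [[0.0], [0.5], [1.0]] # 3x1

# Two separate branches, summed
separate = vadd(matvec(up, matvec(down, x)),
                matvec(lora_up, matvec(lora_down, x)))

# One concatenated rank-2 branch: stack `down` rows, widen `up` columns
cat_down = down + lora_down                    # 2x3
cat_up = [u + l for u, l in zip(up, lora_up)]  # 3x2
concatenated = matvec(cat_up, matvec(cat_down, x))

assert concatenated == separate  # identical outputs
```

Because the fused kernel already evaluates a low-rank branch, widening its rank this way adds little extra cost compared to fusing weights and requantizing.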
Below is an example of running FLUX.1-dev with `Ghibsky <hf_lora_ghibsky_>`_ LoRA.
.. literalinclude:: ../../../examples/flux.1-dev-lora.py
:language: python
:caption: Running FLUX.1-dev with `Ghibsky <hf_lora_ghibsky_>`_ LoRA (`examples/flux.1-dev-lora.py <https://github.com/nunchaku-tech/nunchaku/blob/main/examples/flux.1-dev-lora.py>`__)
:linenos:
:emphasize-lines: 16-19
......@@ -24,10 +24,13 @@ The LoRA integration in Nunchaku works through two key methods:
The :meth:`~nunchaku.models.transformers.transformer_flux.NunchakuFluxTransformer2dModel.update_lora_params` method loads LoRA weights from a safetensors file. It supports:
- **Local file path**: ``"/path/to/your/lora.safetensors"``
- **HuggingFace repository with specific file**: ``"aleksa-codes/flux-ghibsky-illustration/lora.safetensors"``.
The system automatically downloads and caches the LoRA file on first access.
**Controlling LoRA Strength** (lines 18-19):
The :meth:`~nunchaku.models.transformers.transformer_flux.NunchakuFluxTransformer2dModel.set_lora_strength` method sets the LoRA strength parameter,
which controls how much influence the LoRA has on the final output.
A value of 1.0 applies the full LoRA effect, while lower values (e.g., 0.5) apply a more subtle influence.
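The effect of the strength parameter can be illustrated with a tiny numeric sketch (plain Python, not Nunchaku code): the strength ``s`` scales only the low-rank contribution, so the effective output is ``W @ x + s * (B @ (A @ x))``.

```python
# Toy sketch of LoRA strength: scale only the low-rank delta.
base = 10.0        # stand-in for the base-branch output W @ x
lora_delta = 4.0   # stand-in for the LoRA-branch output B @ (A @ x)

def apply_lora(strength):
    return base + strength * lora_delta

assert apply_lora(1.0) == 14.0  # full LoRA effect
assert apply_lora(0.5) == 12.0  # subtler influence
assert apply_lora(0.0) == 10.0  # LoRA effectively disabled
```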
Multiple LoRAs
--------------
......@@ -40,7 +43,7 @@ The following example demonstrates how to compose and load multiple LoRAs:
.. literalinclude:: ../../../examples/flux.1-dev-multiple-lora.py
:language: python
:caption: Running FLUX.1-dev with `Ghibsky <hf_lora_ghibsky_>`_ and `FLUX-Turbo <hf_lora_flux-turbo_>`_ LoRA (`examples/flux.1-dev-multiple-lora.py <https://github.com/nunchaku-tech/nunchaku/blob/main/examples/flux.1-dev-multiple-lora.py>`__)
:linenos:
:emphasize-lines: 17-23
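Mathematically, composing several LoRAs with individual strengths amounts to summing their scaled low-rank deltas, which can again be realized as one wider concatenated branch. A toy sketch (plain Python, not the actual :func:`~nunchaku.lora.flux.compose.compose_lora` implementation; the names and values are illustrative):

```python
# Toy sketch of multi-LoRA composition:
#   delta = s_1 * (B_1 @ A_1 @ x) + s_2 * (B_2 @ A_2 @ x) + ...
deltas = {"ghibsky": 4.0, "turbo": 2.0}      # stand-ins for B_i @ (A_i @ x)
strengths = {"ghibsky": 1.0, "turbo": 0.5}   # per-LoRA strengths

composed = sum(strengths[name] * deltas[name] for name in deltas)
assert composed == 1.0 * 4.0 + 0.5 * 2.0  # i.e. 5.0
```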
......@@ -68,11 +71,13 @@ This composition method allows for precise control over individual LoRA strength
LoRA Conversion
---------------
Nunchaku utilizes the `Diffusers <github_diffusers_>`_ LoRA format as an intermediate representation for converting LoRAs to Nunchaku's native format.
Both the :meth:`~nunchaku.models.transformers.transformer_flux.NunchakuFluxTransformer2dModel.update_lora_params` method and :func:`~nunchaku.lora.flux.compose.compose_lora`
function internally invoke :func:`~nunchaku.lora.flux.diffusers_converter.to_diffusers` to convert LoRAs to the `Diffusers <github_diffusers_>`_ format.
If LoRA functionality is not working as expected, verify that the LoRA has been properly converted to the `Diffusers <github_diffusers_>`_ format.
Please check :func:`~nunchaku.lora.flux.diffusers_converter.to_diffusers` for more details.
Following the conversion to `Diffusers <github_diffusers_>`_ format,
the :meth:`~nunchaku.models.transformers.transformer_flux.NunchakuFluxTransformer2dModel.update_lora_params`
method calls the :func:`~nunchaku.lora.flux.nunchaku_converter.to_nunchaku` function
to perform the final conversion to Nunchaku's format.
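The core of such format conversion is remapping state-dict keys into the intermediate convention. A toy sketch (not Nunchaku's actual converter; the layer name is hypothetical): Kohya-style checkpoints typically name the factors ``lora_down``/``lora_up``, while the Diffusers/PEFT convention uses ``lora_A``/``lora_B``.

```python
# Toy sketch: rename Kohya-style LoRA keys to Diffusers/PEFT-style keys.
def to_diffusers_keys(state_dict):
    renamed = {}
    for key, value in state_dict.items():
        key = key.replace(".lora_down.weight", ".lora_A.weight")
        key = key.replace(".lora_up.weight", ".lora_B.weight")
        renamed[key] = value
    return renamed

kohya = {
    "layer0.lora_down.weight": [[0.1]],  # hypothetical layer name
    "layer0.lora_up.weight": [[0.2]],
}
diffusers = to_diffusers_keys(kohya)
assert "layer0.lora_A.weight" in diffusers
assert "layer0.lora_B.weight" in diffusers
```

If a LoRA fails to load, a mismatch at this key-mapping stage is a common cause, which is why checking the intermediate Diffusers-format output is the recommended first debugging step.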
......
......@@ -6,7 +6,7 @@ This feature is fully compatible with `Diffusers <diffusers_repo>`_ offload mech
.. literalinclude:: ../../../examples/flux.1-dev-offload.py
:language: python
:caption: Running FLUX.1-dev with CPU Offload (`examples/flux.1-dev-offload.py <https://github.com/nunchaku-tech/nunchaku/blob/main/examples/flux.1-dev-offload.py>`__)
:linenos:
:emphasize-lines: 9, 13, 14
......
......@@ -6,7 +6,7 @@ This feature allows you to generate images that maintain specific identity chara
.. literalinclude:: ../../../examples/flux.1-dev-pulid.py
:language: python
:caption: PuLID Example (`examples/flux.1-dev-pulid.py <https://github.com/nunchaku-tech/nunchaku/blob/main/examples/flux.1-dev-pulid.py>`__)
:linenos:
Implementation Overview
......
......@@ -5,7 +5,7 @@ Nunchaku provides a quantized T5 encoder for FLUX.1 to reduce GPU memory usage.
.. literalinclude:: ../../../examples/flux.1-dev-qencoder.py
:language: python
:caption: Running FLUX.1-dev with Quantized T5 (`examples/flux.1-dev-qencoder.py <https://github.com/nunchaku-tech/nunchaku/blob/main/examples/flux.1-dev-qencoder.py>`__)
:linenos:
:emphasize-lines: 11, 14
......
"""
Merge split safetensors model files (deprecated format) into a single safetensors file.
**Example usage**
......
......@@ -26,11 +26,6 @@ logger = logging.getLogger(__name__)
class NunchakuModelLoaderMixin:
"""
Mixin for standardized model loading in Nunchaku transformer models.
"""
@classmethod
......@@ -71,8 +66,8 @@ class NunchakuModelLoaderMixin:
Build a transformer model from a legacy folder structure.
.. warning::
This method is deprecated and will be removed in December 2025.
Please use :meth:`_build_model` instead.
Parameters
----------
......@@ -87,7 +82,7 @@ class NunchakuModelLoaderMixin:
(transformer, unquantized_part_path, transformer_block_path)
"""
logger.warning(
"Loading models from a folder will be deprecated in December 2025. "
"Please download the latest safetensors model, or use one of the following tools to "
"merge your model into a single file: the CLI utility `python -m nunchaku.merge_safetensors` "
"or the ComfyUI workflow `merge_safetensors.json`."
......