Unverified Commit f060b8da authored by Muyang Li, committed by GitHub

[Major] Release v0.1.4

Support 4-bit text encoder and per-layer CPU offloading, reducing FLUX's minimum memory requirement to just 4 GiB while maintaining a 2–3× speedup. Fix various issues related to resolution, LoRA, pin memory, and runtime stability. Check out the release notes for full details!
parents f549dfc6 873a35be
......@@ -6,6 +6,7 @@ Check [here](https://github.com/mit-han-lab/nunchaku/issues/149) to join our use
### [Paper](http://arxiv.org/abs/2411.05007) | [Project](https://hanlab.mit.edu/projects/svdquant) | [Blog](https://hanlab.mit.edu/blog/svdquant) | [Demo](https://svdquant.mit.edu)
- **[2025-03-07]** 🚀 **Nunchaku v0.1.4 Released!** We've supported [4-bit text encoder and per-layer CPU offloading](#Low-Memory-Inference), reducing FLUX's minimum memory requirement to just **4 GiB** while maintaining a **2–3× speedup**. This update also fixes various issues related to resolution, LoRA, pin memory, and runtime stability. Check out the release notes for full details!
- **[2025-02-20]** 🚀 We release [pre-built wheels](https://huggingface.co/mit-han-lab/nunchaku) to simplify installation! Check [here](#Installation) for guidance!
- **[2025-02-20]** 🚀 **Support NVFP4 precision on NVIDIA RTX 5090!** NVFP4 delivers superior image quality compared to INT4, offering **~3× speedup** on the RTX 5090 over BF16. Learn more in our [blog](https://hanlab.mit.edu/blog/svdquant-nvfp4), check out [`examples`](./examples) for usage, and try [our demo](https://svdquant.mit.edu/flux1-schnell/) online!
- **[2025-02-18]** 🔥 [**Customized LoRA conversion**](#Customized-LoRA) and [**model quantization**](#Customized-Model-Quantization) instructions are now available! **[ComfyUI](./comfyui)** workflows now support **customized LoRA**, along with **FLUX.1-Tools**!
......@@ -45,18 +46,27 @@ SVDQuant is a post-training quantization technique for 4-bit weights and activat
## Installation
### Wheels (Linux only for now)
### Wheels (for Linux and Windows WSL)
#### For Windows Users
To install and use WSL (Windows Subsystem for Linux), follow the instructions [here](https://learn.microsoft.com/en-us/windows/wsl/install). You can also install WSL directly by running the following commands in PowerShell:
```shell
wsl --install # install the latest WSL
wsl # launch WSL
```
#### Prerequisites for all users
Before installation, ensure you have [PyTorch>=2.5](https://pytorch.org/) installed. For example, you can use the following command to install PyTorch 2.6:
```shell
pip install torch==2.6 torchvision==0.21 torchaudio==2.6
```
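Before choosing a wheel in the next step, it may help to confirm which Python and PyTorch versions (and CUDA build) you actually have, since the wheel filename encodes both. A quick check using only standard PyTorch APIs:
```python
import sys
import torch

print(sys.version.split()[0])      # e.g., 3.11.x -> pick a cp311 wheel
print(torch.__version__)           # e.g., 2.6.0  -> pick a +torch2.6 wheel
print(torch.version.cuda)          # CUDA build of the installed PyTorch
print(torch.cuda.is_available())   # True if a usable GPU is visible
```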
#### Installing nunchaku
Once PyTorch is installed, you can directly install `nunchaku` from our [Hugging Face repository](https://huggingface.co/mit-han-lab/nunchaku/tree/main). Be sure to select the appropriate wheel for your Python and PyTorch version. For example, for Python 3.11 and PyTorch 2.6:
```shell
pip install https://huggingface.co/mit-han-lab/nunchaku/resolve/main/nunchaku-0.1.3+torch2.6-cp311-cp311-linux_x86_64.whl
pip install https://huggingface.co/mit-han-lab/nunchaku/resolve/main/nunchaku-0.1.4+torch2.6-cp311-cp311-linux_x86_64.whl
```
**Note**: NVFP4 wheels are not currently available because PyTorch has not yet officially supported CUDA 12.8. To use NVFP4, you will need a **Blackwell GPU (e.g., a 50-series GPU)** and must **build from source**.
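Once the wheel is installed, a minimal sanity check (not an official step, just a quick import test) is:
```python
# NunchakuFluxTransformer2dModel is the class used throughout the examples below
from nunchaku import NunchakuFluxTransformer2dModel

print("nunchaku import OK")
```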
......@@ -81,7 +91,7 @@ pip install https://huggingface.co/mit-han-lab/nunchaku/resolve/main/nunchaku-0.
pip install peft opencv-python gradio spaces GPUtil # For gradio demos
```
To enable NVFP4 on Blackwell GPUs (e.g., 50-series GPUs), please install the nightly PyTorch build with CUDA 12.8. The installation command is:
```shell
pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128
......@@ -113,7 +123,7 @@ In [examples](examples), we provide minimal scripts for running INT4 [FLUX.1](ht
import torch
from diffusers import FluxPipeline
from nunchaku.models.transformer_flux import NunchakuFluxTransformer2dModel
from nunchaku import NunchakuFluxTransformer2dModel
transformer = NunchakuFluxTransformer2dModel.from_pretrained("mit-han-lab/svdq-int4-flux.1-dev")
pipeline = FluxPipeline.from_pretrained(
......@@ -125,6 +135,28 @@ image.save("flux.1-dev.png")
Specifically, `nunchaku` shares the same APIs as [diffusers](https://github.com/huggingface/diffusers) and can be used in a similar way.
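Because the API mirrors diffusers, the usual pipeline arguments work unchanged. For instance, a seeded generation (a sketch assuming the standard diffusers `FluxPipeline` call signature and the `pipeline` built in the snippet above):
```python
import torch

# `pipeline` is assumed to be the FluxPipeline constructed in the example above
generator = torch.Generator(device="cuda").manual_seed(0)
image = pipeline(
    "A cat holding a sign that says hello world",
    num_inference_steps=50,
    guidance_scale=3.5,
    generator=generator,  # standard diffusers argument for reproducible sampling
).images[0]
image.save("flux.1-dev-seeded.png")
```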
### Low Memory Inference
To further reduce GPU memory usage, you can use our 4-bit T5 text encoder together with per-layer CPU offloading, bringing the minimum requirement down to just 4 GiB of memory. Usage follows the same diffusers-style API. For example, the [script](examples/int4-flux.1-dev-qencoder.py) for FLUX.1-dev is as follows:
```python
import torch
from diffusers import FluxPipeline
from nunchaku import NunchakuFluxTransformer2dModel, NunchakuT5EncoderModel
transformer = NunchakuFluxTransformer2dModel.from_pretrained(
"mit-han-lab/svdq-int4-flux.1-dev", offload=True
) # set offload to False if you want to disable offloading
text_encoder_2 = NunchakuT5EncoderModel.from_pretrained("mit-han-lab/svdq-flux.1-t5")
pipeline = FluxPipeline.from_pretrained(
"black-forest-labs/FLUX.1-dev", text_encoder_2=text_encoder_2, transformer=transformer, torch_dtype=torch.bfloat16
).to("cuda")
pipeline.enable_sequential_cpu_offload() # remove this line if you want to disable the CPU offloading
image = pipeline("A cat holding a sign that says hello world", num_inference_steps=50, guidance_scale=3.5).images[0]
image.save("flux.1-dev.png")
```
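If you want to verify the memory footprint on your own machine, one rough check is to read PyTorch's CUDA allocator statistics after generation. Note that this only counts PyTorch-managed allocations, so treat it as an approximation:
```python
import torch

# run this after the pipeline call above has finished
torch.cuda.synchronize()
peak_gib = torch.cuda.max_memory_allocated() / (1024 ** 3)
print(f"Peak GPU memory allocated by PyTorch: {peak_gib:.2f} GiB")
```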
## Customized LoRA
![lora](./assets/lora.jpg)
......@@ -168,7 +200,7 @@ transformer.set_lora_strength(lora_strength)
import torch
from diffusers import FluxPipeline
from nunchaku.models.transformer_flux import NunchakuFluxTransformer2dModel
from nunchaku import NunchakuFluxTransformer2dModel
transformer = NunchakuFluxTransformer2dModel.from_pretrained("mit-han-lab/svdq-int4-flux.1-dev")
pipeline = FluxPipeline.from_pretrained(
......
......@@ -12,7 +12,7 @@ from image_gen_aux import DepthPreprocessor
from PIL import Image
from nunchaku.models.safety_checker import SafetyChecker
from nunchaku.models.transformer_flux import NunchakuFluxTransformer2dModel
from nunchaku.models.transformers.transformer_flux import NunchakuFluxTransformer2dModel
from utils import get_args
from vars import (
DEFAULT_GUIDANCE_CANNY,
......@@ -57,7 +57,7 @@ else:
transformer = NunchakuFluxTransformer2dModel.from_pretrained(f"mit-han-lab/svdq-int4-flux.1-{model_name}")
pipeline_init_kwargs["transformer"] = transformer
if args.use_qencoder:
from nunchaku.models.text_encoder import NunchakuT5EncoderModel
from nunchaku.models.text_encoders.t5_encoder import NunchakuT5EncoderModel
text_encoder_2 = NunchakuT5EncoderModel.from_pretrained("mit-han-lab/svdq-flux.1-t5")
pipeline_init_kwargs["text_encoder_2"] = text_encoder_2
......
......@@ -10,7 +10,7 @@ from diffusers import FluxFillPipeline
from PIL import Image
from nunchaku.models.safety_checker import SafetyChecker
from nunchaku.models.transformer_flux import NunchakuFluxTransformer2dModel
from nunchaku.models.transformers.transformer_flux import NunchakuFluxTransformer2dModel
from utils import get_args
from vars import DEFAULT_GUIDANCE, DEFAULT_INFERENCE_STEP, DEFAULT_STYLE_NAME, EXAMPLES, MAX_SEED, STYLE_NAMES, STYLES
......@@ -29,7 +29,7 @@ else:
transformer = NunchakuFluxTransformer2dModel.from_pretrained(f"mit-han-lab/svdq-int4-flux.1-fill-dev")
pipeline_init_kwargs["transformer"] = transformer
if args.use_qencoder:
from nunchaku.models.text_encoder import NunchakuT5EncoderModel
from nunchaku.models.text_encoders.t5_encoder import NunchakuT5EncoderModel
text_encoder_2 = NunchakuT5EncoderModel.from_pretrained("mit-han-lab/svdq-flux.1-t5")
pipeline_init_kwargs["text_encoder_2"] = text_encoder_2
......
......@@ -9,7 +9,7 @@ import torch
from diffusers import FluxPipeline, FluxPriorReduxPipeline
from PIL import Image
from nunchaku.models.transformer_flux import NunchakuFluxTransformer2dModel
from nunchaku.models.transformers.transformer_flux import NunchakuFluxTransformer2dModel
from utils import get_args
from vars import DEFAULT_GUIDANCE, DEFAULT_INFERENCE_STEP, EXAMPLES, MAX_SEED
......
......@@ -12,7 +12,7 @@ from PIL import Image
from flux_pix2pix_pipeline import FluxPix2pixTurboPipeline
from nunchaku.models.safety_checker import SafetyChecker
from nunchaku.models.transformer_flux import NunchakuFluxTransformer2dModel
from nunchaku.models.transformers.transformer_flux import NunchakuFluxTransformer2dModel
from utils import get_args
from vars import DEFAULT_SKETCH_GUIDANCE, DEFAULT_STYLE_NAME, MAX_SEED, STYLE_NAMES, STYLES
......@@ -36,7 +36,7 @@ else:
transformer = NunchakuFluxTransformer2dModel.from_pretrained("mit-han-lab/svdq-int4-flux.1-schnell")
pipeline_init_kwargs["transformer"] = transformer
if args.use_qencoder:
from nunchaku.models.text_encoder import NunchakuT5EncoderModel
from nunchaku.models.text_encoders.t5_encoder import NunchakuT5EncoderModel
text_encoder_2 = NunchakuT5EncoderModel.from_pretrained("mit-han-lab/svdq-flux.1-t5")
pipeline_init_kwargs["text_encoder_2"] = text_encoder_2
......
......@@ -57,7 +57,7 @@ def main():
for dataset_name in args.datasets:
output_dirname = os.path.join(output_root, dataset_name)
os.makedirs(output_dirname, exist_ok=True)
dataset = get_dataset(name=dataset_name)
dataset = get_dataset(name=dataset_name, max_dataset_size=8)
if args.chunk_step > 1:
dataset = dataset.select(range(args.chunk_start, len(dataset), args.chunk_step))
for row in tqdm(dataset):
......
......@@ -2,7 +2,7 @@ import torch
from diffusers import FluxPipeline
from peft.tuners import lora
from nunchaku.models.transformer_flux import NunchakuFluxTransformer2dModel
from nunchaku import NunchakuFluxTransformer2dModel
from vars import LORA_PATHS, SVDQ_LORA_PATHS
......@@ -32,11 +32,11 @@ def get_pipeline(
else:
assert precision == "fp4"
transformer = NunchakuFluxTransformer2dModel.from_pretrained(
"/home/muyang/nunchaku_models/flux.1-schnell-nvfp4-svdq-gptq", precision="fp4"
"mit-han-lab/svdq-fp4-flux.1-schnell", precision="fp4"
)
pipeline_init_kwargs["transformer"] = transformer
if use_qencoder:
from nunchaku.models.text_encoder import NunchakuT5EncoderModel
from nunchaku.models.text_encoders.t5_encoder import NunchakuT5EncoderModel
text_encoder_2 = NunchakuT5EncoderModel.from_pretrained("mit-han-lab/svdq-flux.1-t5")
pipeline_init_kwargs["text_encoder_2"] = text_encoder_2
......@@ -53,7 +53,7 @@ def get_pipeline(
transformer.set_lora_strength(lora_weight)
pipeline_init_kwargs["transformer"] = transformer
if use_qencoder:
from nunchaku.models.text_encoder import NunchakuT5EncoderModel
from nunchaku.models.text_encoders.t5_encoder import NunchakuT5EncoderModel
text_encoder_2 = NunchakuT5EncoderModel.from_pretrained("mit-han-lab/svdq-flux.1-t5")
pipeline_init_kwargs["text_encoder_2"] = text_encoder_2
......
import torch
from diffusers import SanaPAGPipeline
from nunchaku.models.transformer_sana import NunchakuSanaTransformer2DModel
from nunchaku.models.transformers.transformer_sana import NunchakuSanaTransformer2DModel
def hash_str_to_int(s: str) -> int:
......@@ -30,7 +30,7 @@ def get_pipeline(
variant="bf16",
torch_dtype=torch.bfloat16,
pag_applied_layers="transformer_blocks.8",
**pipeline_init_kwargs
**pipeline_init_kwargs,
)
if precision == "int4":
pipeline._set_pag_attn_processor = lambda *args, **kwargs: None
......
......@@ -70,8 +70,6 @@ comfy node registry-install svdquant
* Install missing nodes (e.g., comfyui-inpainteasy) following [this tutorial](https://github.com/ltdrdata/ComfyUI-Manager?tab=readme-ov-file#support-of-missing-nodes-installation).
2. **Download Required Models**: Follow [this tutorial](https://comfyanonymous.github.io/ComfyUI_examples/flux/) and download the required models into the appropriate directories using the commands below:
```shell
......@@ -92,7 +90,7 @@ comfy node registry-install svdquant
* **SVDQuant Flux DiT Loader**: A node for loading the FLUX diffusion model.
* `model_path`: Specifies the model location. If set to `mit-han-lab/svdq-int4-flux.1-schnell`, `mit-han-lab/svdq-int4-flux.1-dev`, `mit-han-lab/svdq-int4-flux.1-canny-dev`, `mit-han-lab/svdq-int4-flux.1-fill-dev` or `mit-han-lab/svdq-int4-flux.1-depth-dev`, the model will be automatically downloaded from our Hugging Face repository. Alternatively, you can manually download the model directory by running the following command example:
* `model_path`: Specifies the model location. If set to a folder name starting with `mit-han-lab`, the model will be automatically downloaded from our Hugging Face repository. Alternatively, you can manually download the model directory with a command like the following:
```shell
huggingface-cli download mit-han-lab/svdq-int4-flux.1-dev --local-dir models/diffusion_models/svdq-int4-flux.1-dev
......@@ -100,6 +98,8 @@ comfy node registry-install svdquant
After downloading, specify the corresponding folder name as the `model_path`.
* `cpu_offload`: Enables CPU offloading for the transformer model. While this may reduce GPU memory usage, it can slow down inference. Memory usage will be further optimized in node v0.1.6.
* `device_id`: Indicates the GPU ID for running the model.
* **SVDQuant FLUX LoRA Loader**: A node for loading LoRA modules for SVDQuant FLUX models.
......
......@@ -4,10 +4,7 @@ import tempfile
import folder_paths
from safetensors.torch import save_file
from nunchaku.lora.flux.comfyui_converter import comfyui2diffusers
from nunchaku.lora.flux.diffusers_converter import convert_to_nunchaku_flux_lowrank_dict
from nunchaku.lora.flux.utils import detect_format
from nunchaku.lora.flux.xlab_converter import xlab2diffusers
from nunchaku.lora.flux import comfyui2diffusers, convert_to_nunchaku_flux_lowrank_dict, detect_format, xlab2diffusers
class SVDQuantFluxLoraLoader:
......@@ -25,6 +22,8 @@ class SVDQuantFluxLoraLoader:
base_model_paths = [
"mit-han-lab/svdq-int4-flux.1-dev",
"mit-han-lab/svdq-int4-flux.1-schnell",
"mit-han-lab/svdq-fp4-flux.1-dev",
"mit-han-lab/svdq-fp4-flux.1-schnell",
"mit-han-lab/svdq-int4-flux.1-canny-dev",
"mit-han-lab/svdq-int4-flux.1-depth-dev",
"mit-han-lab/svdq-int4-flux.1-fill-dev",
......
......@@ -10,7 +10,7 @@ from diffusers import FluxTransformer2DModel
from einops import rearrange, repeat
from torch import nn
from nunchaku.models.transformer_flux import NunchakuFluxTransformer2dModel
from nunchaku import NunchakuFluxTransformer2dModel
class ComfyUIFluxForwardWrapper(nn.Module):
......@@ -67,31 +67,53 @@ class ComfyUIFluxForwardWrapper(nn.Module):
class SVDQuantFluxDiTLoader:
@classmethod
def INPUT_TYPES(s):
# folder_paths.get_filename_list("loras"),
model_paths = [
"mit-han-lab/svdq-int4-flux.1-schnell",
"mit-han-lab/svdq-int4-flux.1-dev",
"mit-han-lab/svdq-fp4-flux.1-schnell",
"mit-han-lab/svdq-fp4-flux.1-dev",
"mit-han-lab/svdq-int4-flux.1-canny-dev",
"mit-han-lab/svdq-int4-flux.1-depth-dev",
"mit-han-lab/svdq-int4-flux.1-fill-dev",
]
prefix = os.path.join(folder_paths.models_dir, "diffusion_models")
local_folders = os.listdir(prefix)
local_folders = sorted(
[
folder
for folder in local_folders
if not folder.startswith(".") and os.path.isdir(os.path.join(prefix, folder))
]
)
prefixes = folder_paths.folder_names_and_paths["diffusion_models"][0]
local_folders = set()
for prefix in prefixes:
if os.path.exists(prefix) and os.path.isdir(prefix):
local_folders_ = os.listdir(prefix)
local_folders_ = [
folder
for folder in local_folders_
if not folder.startswith(".") and os.path.isdir(os.path.join(prefix, folder))
]
local_folders.update(local_folders_)
local_folders = sorted(list(local_folders))
model_paths = local_folders + model_paths
ngpus = len(GPUtil.getGPUs())
return {
"required": {
"model_path": (model_paths,),
"model_path": (
model_paths,
{"tooltip": "The SVDQuant quantized FLUX.1 models. It can be a huggingface path or a local path."},
),
"cpu_offload": (
["enable", "disable"],
{
"default": "disable",
"tooltip": "Whether to enable CPU offload for the transformer model. This may slow down the inference, but may reduce the GPU memory usage.",
},
),
"device_id": (
"INT",
{"default": 0, "min": 0, "max": ngpus, "step": 1, "display": "number", "lazy": True},
{
"default": 0,
"min": 0,
"max": ngpus,
"step": 1,
"display": "number",
"lazy": True,
"tooltip": "The GPU device ID to use for the model.",
},
),
}
}
......@@ -101,14 +123,15 @@ class SVDQuantFluxDiTLoader:
CATEGORY = "SVDQuant"
TITLE = "SVDQuant Flux DiT Loader"
def load_model(self, model_path: str, device_id: int, **kwargs) -> tuple[FluxTransformer2DModel]:
def load_model(self, model_path: str, cpu_offload: str, device_id: int, **kwargs) -> tuple[FluxTransformer2DModel]:
device = f"cuda:{device_id}"
prefix = os.path.join(folder_paths.models_dir, "diffusion_models")
if os.path.exists(os.path.join(prefix, model_path)):
model_path = os.path.join(prefix, model_path)
else:
model_path = model_path
transformer = NunchakuFluxTransformer2dModel.from_pretrained(model_path).to(device)
prefixes = folder_paths.folder_names_and_paths["diffusion_models"][0]
for prefix in prefixes:
if os.path.exists(os.path.join(prefix, model_path)):
model_path = os.path.join(prefix, model_path)
break
transformer = NunchakuFluxTransformer2dModel.from_pretrained(model_path, offload=cpu_offload == "enable")
transformer = transformer.to(device)
dit_config = {
"image_model": "flux",
"patch_size": 2,
......
......@@ -4,39 +4,60 @@ import types
import comfy.sd
import folder_paths
import torch
from torch import nn
from transformers import T5EncoderModel
from nunchaku import NunchakuT5EncoderModel
def svdquant_t5_forward(
self: T5EncoderModel,
input_ids: torch.LongTensor,
attention_mask,
embeds=None,
intermediate_output=None,
final_layer_norm_intermediate=True,
dtype: str | torch.dtype = torch.bfloat16,
**kwargs,
):
assert attention_mask is None
assert intermediate_output is None
assert final_layer_norm_intermediate
outputs = self.encoder(input_ids, attention_mask=attention_mask)
outputs = self.encoder(input_ids=input_ids, inputs_embeds=embeds, attention_mask=attention_mask)
hidden_states = outputs["last_hidden_state"]
hidden_states = hidden_states.to(dtype=dtype)
return hidden_states, None
class WrappedEmbedding(nn.Module):
def __init__(self, embedding: nn.Embedding):
super().__init__()
self.embedding = embedding
def forward(self, input: torch.Tensor, out_dtype: torch.dtype | None = None):
return self.embedding(input)
@property
def weight(self):
return self.embedding.weight
class SVDQuantTextEncoderLoader:
@classmethod
def INPUT_TYPES(s):
model_paths = ["mit-han-lab/svdq-flux.1-t5"]
prefix = os.path.join(folder_paths.models_dir, "text_encoders")
local_folders = os.listdir(prefix)
local_folders = sorted(
[
folder
for folder in local_folders
if not folder.startswith(".") and os.path.isdir(os.path.join(prefix, folder))
]
)
prefixes = folder_paths.folder_names_and_paths["text_encoders"][0]
local_folders = set()
for prefix in prefixes:
if os.path.exists(prefix) and os.path.isdir(prefix):
local_folders_ = os.listdir(prefix)
local_folders_ = [
folder
for folder in local_folders_
if not folder.startswith(".") and os.path.isdir(os.path.join(prefix, folder))
]
local_folders.update(local_folders_)
local_folders = sorted(list(local_folders))
model_paths.extend(local_folders)
return {
"required": {
......@@ -45,14 +66,7 @@ class SVDQuantTextEncoderLoader:
"text_encoder2": (folder_paths.get_filename_list("text_encoders"),),
"t5_min_length": (
"INT",
{
"default": 512,
"min": 256,
"max": 1024,
"step": 128,
"display": "number",
"lazy": True,
},
{"default": 512, "min": 256, "max": 1024, "step": 128, "display": "number", "lazy": True},
),
"t5_precision": (["BF16", "INT4"],),
"int4_model": (model_paths, {"tooltip": "The name of the INT4 model."}),
......@@ -92,20 +106,23 @@ class SVDQuantTextEncoderLoader:
clip.tokenizer.t5xxl.min_length = t5_min_length
if t5_precision == "INT4":
from nunchaku.models.text_encoder import NunchakuT5EncoderModel
transformer = clip.cond_stage_model.t5xxl.transformer
param = next(transformer.parameters())
dtype = param.dtype
device = param.device
prefix = "models/text_encoders"
if os.path.exists(os.path.join(prefix, int4_model)):
model_path = os.path.join(prefix, int4_model)
else:
prefixes = folder_paths.folder_names_and_paths["diffusion_models"][0]
model_path = None
for prefix in prefixes:
if os.path.exists(os.path.join(prefix, int4_model)):
model_path = os.path.join(prefix, int4_model)
break
if model_path is None:
model_path = int4_model
transformer = NunchakuT5EncoderModel.from_pretrained(model_path)
transformer.forward = types.MethodType(svdquant_t5_forward, transformer)
transformer.shared = WrappedEmbedding(transformer.shared)
clip.cond_stage_model.t5xxl.transformer = (
transformer.to(device=device, dtype=dtype) if device.type == "cuda" else transformer
)
......
[project]
name = "svdquant"
description = "SVDQuant ComfyUI Node. SVDQuant is a new post-training training quantization paradigm for diffusion models, which quantize both the weights and activations of FLUX.1 to 4 bits, achieving 3.5× memory and 8.7× latency reduction on a 16GB laptop 4090 GPU."
version = "0.1.3"
description = "SVDQuant ComfyUI Node. SVDQuant is a new post-training training quantization paradigm for diffusion models, which quantize both the weights and activations of FLUX.1 to 4 bits, achieving 3.5× memory and 8.7× latency reduction on a 16GB laptop 4090 GPU. GitHub: https://github.com/mit-han-lab/nunchaku"
version = "0.1.5"
license = { file = "LICENSE.txt" }
dependencies = []
requires-python = ">=3.11, <3.13"
requires-python = ">=3.10, <3.13"
[project.urls]
Repository = "https://github.com/mit-han-lab/nunchaku"
#[project.urls]
#Repository = "https://github.com/mit-han-lab/nunchaku"
# Used by Comfy Registry https://comfyregistry.org
[tool.comfy]
......
......@@ -19,45 +19,50 @@
"inputs": [
{
"name": "model",
"localized_name": "model",
"label": "model",
"type": "MODEL",
"link": 71,
"label": "model"
"link": 71
},
{
"name": "positive",
"localized_name": "positive",
"label": "positive",
"type": "CONDITIONING",
"link": 64,
"label": "positive"
"link": 64
},
{
"name": "negative",
"localized_name": "negative",
"label": "negative",
"type": "CONDITIONING",
"link": 65,
"label": "negative"
"link": 65
},
{
"name": "latent_image",
"localized_name": "latent_image",
"label": "latent_image",
"type": "LATENT",
"link": 66,
"label": "latent_image"
"link": 66
}
],
"outputs": [
{
"name": "LATENT",
"localized_name": "LATENT",
"label": "LATENT",
"type": "LATENT",
"links": [
7
],
"slot_index": 0,
"label": "LATENT"
"slot_index": 0
}
],
"properties": {
"Node name for S&R": "KSampler"
},
"widgets_values": [
896617285614695,
875054580097021,
"randomize",
20,
1,
......@@ -83,56 +88,63 @@
"inputs": [
{
"name": "positive",
"localized_name": "positive",
"label": "positive",
"type": "CONDITIONING",
"link": 67,
"label": "positive"
"link": 67
},
{
"name": "negative",
"localized_name": "negative",
"label": "negative",
"type": "CONDITIONING",
"link": 68,
"label": "negative"
"link": 68
},
{
"name": "vae",
"localized_name": "vae",
"label": "vae",
"type": "VAE",
"link": 69,
"label": "vae"
"link": 69
},
{
"name": "pixels",
"localized_name": "pixels",
"label": "pixels",
"type": "IMAGE",
"link": 70,
"label": "pixels"
"link": 70
}
],
"outputs": [
{
"name": "positive",
"localized_name": "positive",
"label": "positive",
"type": "CONDITIONING",
"links": [
64
],
"slot_index": 0,
"label": "positive"
"slot_index": 0
},
{
"name": "negative",
"localized_name": "negative",
"label": "negative",
"type": "CONDITIONING",
"links": [
65
],
"slot_index": 1,
"label": "negative"
"slot_index": 1
},
{
"name": "latent",
"localized_name": "latent",
"label": "latent",
"type": "LATENT",
"links": [
66
],
"slot_index": 2,
"label": "latent"
"slot_index": 2
}
],
"properties": {
......@@ -157,26 +169,29 @@
"inputs": [
{
"name": "samples",
"localized_name": "samples",
"label": "samples",
"type": "LATENT",
"link": 7,
"label": "samples"
"link": 7
},
{
"name": "vae",
"localized_name": "vae",
"label": "vae",
"type": "VAE",
"link": 60,
"label": "vae"
"link": 60
}
],
"outputs": [
{
"name": "IMAGE",
"localized_name": "IMAGE",
"label": "IMAGE",
"type": "IMAGE",
"links": [
9
],
"slot_index": 0,
"label": "IMAGE"
"slot_index": 0
}
],
"properties": {
......@@ -201,9 +216,10 @@
"inputs": [
{
"name": "images",
"localized_name": "images",
"label": "images",
"type": "IMAGE",
"link": 9,
"label": "images"
"link": 9
}
],
"outputs": [],
......@@ -230,13 +246,14 @@
"outputs": [
{
"name": "VAE",
"localized_name": "VAE",
"label": "VAE",
"type": "VAE",
"links": [
60,
69
],
"slot_index": 0,
"label": "VAE"
"slot_index": 0
}
],
"properties": {
......@@ -263,21 +280,23 @@
"inputs": [
{
"name": "conditioning",
"localized_name": "conditioning",
"label": "conditioning",
"type": "CONDITIONING",
"link": 41,
"label": "conditioning"
"link": 41
}
],
"outputs": [
{
"name": "CONDITIONING",
"localized_name": "CONDITIONING",
"label": "CONDITIONING",
"type": "CONDITIONING",
"shape": 3,
"links": [
67
],
"slot_index": 0,
"shape": 3,
"label": "CONDITIONING"
"slot_index": 0
}
],
"properties": {
......@@ -296,7 +315,7 @@
],
"size": [
315,
106
122
],
"flags": {},
"order": 1,
......@@ -305,12 +324,13 @@
"outputs": [
{
"name": "CLIP",
"localized_name": "CLIP",
"label": "CLIP",
"type": "CLIP",
"links": [
62,
63
],
"label": "CLIP"
]
}
],
"properties": {
......@@ -340,20 +360,22 @@
"inputs": [
{
"name": "clip",
"localized_name": "clip",
"label": "clip",
"type": "CLIP",
"link": 62,
"label": "clip"
"link": 62
}
],
"outputs": [
{
"name": "CONDITIONING",
"localized_name": "CONDITIONING",
"label": "CONDITIONING",
"type": "CONDITIONING",
"links": [
41
],
"slot_index": 0,
"label": "CONDITIONING"
"slot_index": 0
}
],
"title": "CLIP Text Encode (Positive Prompt)",
......@@ -383,9 +405,10 @@
"inputs": [
{
"name": "images",
"localized_name": "images",
"label": "images",
"type": "IMAGE",
"link": 26,
"label": "images"
"link": 26
}
],
"outputs": [],
......@@ -411,22 +434,24 @@
"inputs": [
{
"name": "image",
"localized_name": "image",
"label": "image",
"type": "IMAGE",
"link": 76,
"label": "image"
"link": 76
}
],
"outputs": [
{
"name": "IMAGE",
"localized_name": "IMAGE",
"label": "IMAGE",
"type": "IMAGE",
"shape": 3,
"links": [
26,
70
],
"slot_index": 0,
"shape": 3,
"label": "IMAGE"
"slot_index": 0
}
],
"properties": {
......@@ -454,6 +479,7 @@
"inputs": [
{
"name": "image",
"localized_name": "image",
"type": "IMAGE",
"link": 75
}
......@@ -461,6 +487,7 @@
"outputs": [
{
"name": "IMAGE",
"localized_name": "IMAGE",
"type": "IMAGE",
"links": [
76
......@@ -478,48 +505,6 @@
"center"
]
},
{
"id": 17,
"type": "LoadImage",
"pos": [
6.694743633270264,
562.3865966796875
],
"size": [
315,
314.0000305175781
],
"flags": {},
"order": 2,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
75
],
"slot_index": 0,
"shape": 3,
"label": "IMAGE"
},
{
"name": "MASK",
"type": "MASK",
"links": null,
"shape": 3,
"label": "MASK"
}
],
"properties": {
"Node name for S&R": "LoadImage"
},
"widgets_values": [
"robot.png",
"image"
]
},
{
"id": 7,
"type": "CLIPTextEncode",
......@@ -539,20 +524,22 @@
"inputs": [
{
"name": "clip",
"localized_name": "clip",
"label": "clip",
"type": "CLIP",
"link": 63,
"label": "clip"
"link": 63
}
],
"outputs": [
{
"name": "CONDITIONING",
"localized_name": "CONDITIONING",
"label": "CONDITIONING",
"type": "CONDITIONING",
"links": [
68
],
"slot_index": 0,
"label": "CONDITIONING"
"slot_index": 0
}
],
"title": "CLIP Text Encode (Negative Prompt)",
......@@ -565,6 +552,50 @@
"color": "#322",
"bgcolor": "#533"
},
{
"id": 17,
"type": "LoadImage",
"pos": [
6.694743633270264,
562.3865966796875
],
"size": [
315,
314.0000305175781
],
"flags": {},
"order": 2,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "IMAGE",
"localized_name": "IMAGE",
"label": "IMAGE",
"type": "IMAGE",
"shape": 3,
"links": [
75
],
"slot_index": 0
},
{
"name": "MASK",
"localized_name": "MASK",
"label": "MASK",
"type": "MASK",
"shape": 3,
"links": null
}
],
"properties": {
"Node name for S&R": "LoadImage"
},
"widgets_values": [
"robot.png",
"image"
]
},
{
"id": 36,
"type": "SVDQuantFluxDiTLoader",
......@@ -574,7 +605,7 @@
],
"size": [
395.6002197265625,
105.77959442138672
106
],
"flags": {},
"order": 3,
......@@ -583,6 +614,7 @@
"outputs": [
{
"name": "MODEL",
"localized_name": "MODEL",
"type": "MODEL",
"links": [
71
......@@ -595,6 +627,7 @@
},
"widgets_values": [
"mit-han-lab/svdq-int4-flux.1-canny-dev",
"disable",
0
]
}
......@@ -741,14 +774,14 @@
"config": {},
"extra": {
"ds": {
"scale": 0.895430243255241,
"scale": 1.5863092971714992,
"offset": [
838.4305404853558,
332.05158795287764
170.04223120944968,
209.5374167314878
]
},
"node_versions": {
"comfy-core": "0.3.14"
"comfy-core": "0.3.24"
}
},
"version": 0.4
......
......@@ -21,20 +21,22 @@
"inputs": [
{
"name": "clip",
"localized_name": "clip",
"label": "clip",
"type": "CLIP",
"link": 63,
"label": "clip"
"link": 63
}
],
"outputs": [
{
"name": "CONDITIONING",
"localized_name": "CONDITIONING",
"label": "CONDITIONING",
"type": "CONDITIONING",
"links": [
68
],
"slot_index": 0,
"label": "CONDITIONING"
"slot_index": 0
}
],
"title": "CLIP Text Encode (Negative Prompt)",
......@@ -56,7 +58,7 @@
],
"size": [
315,
106
122
],
"flags": {},
"order": 0,
......@@ -65,12 +67,13 @@
"outputs": [
{
"name": "CLIP",
"localized_name": "CLIP",
"label": "CLIP",
"type": "CLIP",
"links": [
62,
63
],
"label": "CLIP"
]
}
],
"properties": {
......@@ -100,21 +103,23 @@
"inputs": [
{
"name": "conditioning",
"localized_name": "conditioning",
"label": "conditioning",
"type": "CONDITIONING",
"link": 41,
"label": "conditioning"
"link": 41
}
],
"outputs": [
{
"name": "CONDITIONING",
"localized_name": "CONDITIONING",
"label": "CONDITIONING",
"type": "CONDITIONING",
"shape": 3,
"links": [
67
],
"slot_index": 0,
"shape": 3,
"label": "CONDITIONING"
"slot_index": 0
}
],
"properties": {
......@@ -141,45 +146,50 @@
"inputs": [
{
"name": "model",
"localized_name": "model",
"label": "model",
"type": "MODEL",
"link": 78,
"label": "model"
"link": 78
},
{
"name": "positive",
"localized_name": "positive",
"label": "positive",
"type": "CONDITIONING",
"link": 64,
"label": "positive"
"link": 64
},
{
"name": "negative",
"localized_name": "negative",
"label": "negative",
"type": "CONDITIONING",
"link": 65,
"label": "negative"
"link": 65
},
{
"name": "latent_image",
"localized_name": "latent_image",
"label": "latent_image",
"type": "LATENT",
"link": 73,
"label": "latent_image"
"link": 73
}
],
"outputs": [
{
"name": "LATENT",
"localized_name": "LATENT",
"label": "LATENT",
"type": "LATENT",
"links": [
7
],
"slot_index": 0,
"label": "LATENT"
"slot_index": 0
}
],
"properties": {
"Node name for S&R": "KSampler"
},
"widgets_values": [
704308966490490,
69796511068157,
"randomize",
20,
1,
......@@ -188,39 +198,6 @@
1
]
},
{
"id": 39,
"type": "SVDQuantFluxDiTLoader",
"pos": [
707.80908203125,
-172.0343017578125
],
"size": [
315,
82
],
"flags": {},
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
78
],
"slot_index": 0
}
],
"properties": {
"Node name for S&R": "SVDQuantFluxDiTLoader"
},
"widgets_values": [
"mit-han-lab/svdq-int4-flux.1-depth-dev",
0
]
},
{
"id": 43,
"type": "PreviewImage",
......@@ -238,9 +215,10 @@
"inputs": [
{
"name": "images",
"localized_name": "images",
"label": "images",
"type": "IMAGE",
"link": 87,
"label": "images"
"link": 87
}
],
"outputs": [],
......@@ -266,26 +244,29 @@
"inputs": [
{
"name": "samples",
"localized_name": "samples",
"label": "samples",
"type": "LATENT",
"link": 7,
"label": "samples"
"link": 7
},
{
"name": "vae",
"localized_name": "vae",
"label": "vae",
"type": "VAE",
"link": 60,
"label": "vae"
"link": 60
}
],
"outputs": [
{
"name": "IMAGE",
"localized_name": "IMAGE",
"label": "IMAGE",
"type": "IMAGE",
"links": [
85
],
"slot_index": 0,
"label": "IMAGE"
"slot_index": 0
}
],
"properties": {
......@@ -310,9 +291,10 @@
"inputs": [
{
"name": "images",
"localized_name": "images",
"label": "images",
"type": "IMAGE",
"link": 85,
"label": "images"
"link": 85
}
],
"outputs": [],
......@@ -338,6 +320,7 @@
"inputs": [
{
"name": "image",
"localized_name": "image",
"type": "IMAGE",
"link": 82
}
......@@ -345,6 +328,7 @@
"outputs": [
{
"name": "IMAGE",
"localized_name": "IMAGE",
"type": "IMAGE",
"links": [
86
......@@ -379,20 +363,22 @@
"inputs": [
{
"name": "clip",
"localized_name": "clip",
"label": "clip",
"type": "CLIP",
"link": 62,
"label": "clip"
"link": 62
}
],
"outputs": [
{
"name": "CONDITIONING",
"localized_name": "CONDITIONING",
"label": "CONDITIONING",
"type": "CONDITIONING",
"links": [
41
],
"slot_index": 0,
"label": "CONDITIONING"
"slot_index": 0
}
],
"title": "CLIP Text Encode (Positive Prompt)",
......@@ -417,19 +403,20 @@
58
],
"flags": {},
"order": 2,
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "VAE",
"localized_name": "VAE",
"label": "VAE",
"type": "VAE",
"links": [
60,
69
],
"slot_index": 0,
"label": "VAE"
"slot_index": 0
}
],
"properties": {
......@@ -456,56 +443,63 @@
"inputs": [
{
"name": "positive",
"localized_name": "positive",
"label": "positive",
"type": "CONDITIONING",
"link": 67,
"label": "positive"
"link": 67
},
{
"name": "negative",
"localized_name": "negative",
"label": "negative",
"type": "CONDITIONING",
"link": 68,
"label": "negative"
"link": 68
},
{
"name": "vae",
"localized_name": "vae",
"label": "vae",
"type": "VAE",
"link": 69,
"label": "vae"
"link": 69
},
{
"name": "pixels",
"localized_name": "pixels",
"label": "pixels",
"type": "IMAGE",
"link": 88,
"label": "pixels"
"link": 88
}
],
"outputs": [
{
"name": "positive",
"localized_name": "positive",
"label": "positive",
"type": "CONDITIONING",
"links": [
64
],
"slot_index": 0,
"label": "positive"
"slot_index": 0
},
{
"name": "negative",
"localized_name": "negative",
"label": "negative",
"type": "CONDITIONING",
"links": [
65
],
"slot_index": 1,
"label": "negative"
"slot_index": 1
},
{
"name": "latent",
"localized_name": "latent",
"label": "latent",
"type": "LATENT",
"links": [
73
],
"slot_index": 2,
"label": "latent"
"slot_index": 2
}
],
"properties": {
......@@ -530,6 +524,7 @@
"inputs": [
{
"name": "image",
"localized_name": "image",
"type": "IMAGE",
"link": 86
}
......@@ -537,6 +532,7 @@
"outputs": [
{
"name": "IMAGE",
"localized_name": "IMAGE",
"type": "IMAGE",
"links": [
87,
......@@ -564,26 +560,28 @@
314.0000305175781
],
"flags": {},
"order": 3,
"order": 2,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "IMAGE",
"localized_name": "IMAGE",
"label": "IMAGE",
"type": "IMAGE",
"shape": 3,
"links": [
82
],
"slot_index": 0,
"shape": 3,
"label": "IMAGE"
"slot_index": 0
},
{
"name": "MASK",
"localized_name": "MASK",
"label": "MASK",
"type": "MASK",
"links": null,
"shape": 3,
"label": "MASK"
"links": null
}
],
"properties": {
......@@ -593,6 +591,41 @@
"logo_example.png",
"image"
]
},
{
"id": 39,
"type": "SVDQuantFluxDiTLoader",
"pos": [
707.80908203125,
-172.0343017578125
],
"size": [
315,
106
],
"flags": {},
"order": 3,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"localized_name": "MODEL",
"type": "MODEL",
"links": [
78
],
"slot_index": 0
}
],
"properties": {
"Node name for S&R": "SVDQuantFluxDiTLoader"
},
"widgets_values": [
"mit-han-lab/svdq-int4-flux.1-depth-dev",
"disable",
0
]
}
],
"links": [
......@@ -739,12 +772,12 @@
"ds": {
"scale": 0.8140274938684042,
"offset": [
1060.3416359459316,
529.8567933439979
1795.999020278545,
750.1636967541119
]
},
"node_versions": {
"comfy-core": "0.3.14"
"comfy-core": "0.3.24"
}
},
"version": 0.4
......
......@@ -19,6 +19,7 @@
"inputs": [
{
"name": "conditioning",
"localized_name": "conditioning",
"type": "CONDITIONING",
"link": 41
}
......@@ -26,12 +27,13 @@
"outputs": [
{
"name": "CONDITIONING",
"localized_name": "CONDITIONING",
"type": "CONDITIONING",
"shape": 3,
"links": [
42
],
"slot_index": 0,
"shape": 3
"slot_index": 0
}
],
"properties": {
......@@ -60,6 +62,7 @@
"inputs": [
{
"name": "images",
"localized_name": "images",
"type": "IMAGE",
"link": 9
}
......@@ -82,24 +85,25 @@
82
],
"flags": {},
"order": 6,
"order": 0,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "NOISE",
"localized_name": "NOISE",
"type": "NOISE",
"shape": 3,
"links": [
37
],
"shape": 3
]
}
],
"properties": {
"Node name for S&R": "RandomNoise"
},
"widgets_values": [
148576770035090,
385675283593224,
"randomize"
],
"color": "#2a363b",
......@@ -117,17 +121,18 @@
58
],
"flags": {},
"order": 0,
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "SAMPLER",
"localized_name": "SAMPLER",
"type": "SAMPLER",
"shape": 3,
"links": [
19
],
"shape": 3
]
}
],
"properties": {
......@@ -154,6 +159,7 @@
"inputs": [
{
"name": "model",
"localized_name": "model",
"type": "MODEL",
"link": 55,
"slot_index": 0
......@@ -162,11 +168,12 @@
"outputs": [
{
"name": "SIGMAS",
"localized_name": "SIGMAS",
"type": "SIGMAS",
"shape": 3,
"links": [
20
],
"shape": 3
]
}
],
"properties": {
......@@ -187,7 +194,7 @@
],
"size": [
315,
130
170
],
"flags": {},
"order": 11,
......@@ -195,6 +202,7 @@
"inputs": [
{
"name": "model",
"localized_name": "model",
"type": "MODEL",
"link": 123,
"slot_index": 0
......@@ -202,32 +210,41 @@
{
"name": "width",
"type": "INT",
"link": 115,
"slot_index": 1,
"pos": [
10,
84
],
"widget": {
"name": "width"
}
},
"link": 115,
"slot_index": 1
},
{
"name": "height",
"type": "INT",
"link": 114,
"slot_index": 2,
"pos": [
10,
108
],
"widget": {
"name": "height"
}
},
"link": 114,
"slot_index": 2
}
],
"outputs": [
{
"name": "MODEL",
"localized_name": "MODEL",
"type": "MODEL",
"shape": 3,
"links": [
54,
55
],
"slot_index": 0,
"shape": 3
"slot_index": 0
}
],
"properties": {
......@@ -257,11 +274,13 @@
"inputs": [
{
"name": "samples",
"localized_name": "samples",
"type": "LATENT",
"link": 24
},
{
"name": "vae",
"localized_name": "vae",
"type": "VAE",
"link": 12
}
......@@ -269,6 +288,7 @@
"outputs": [
{
"name": "IMAGE",
"localized_name": "IMAGE",
"type": "IMAGE",
"links": [
9
......@@ -298,12 +318,14 @@
"inputs": [
{
"name": "model",
"localized_name": "model",
"type": "MODEL",
"link": 54,
"slot_index": 0
},
{
"name": "conditioning",
"localized_name": "conditioning",
"type": "CONDITIONING",
"link": 42,
"slot_index": 1
......@@ -312,12 +334,13 @@
"outputs": [
{
"name": "GUIDER",
"localized_name": "GUIDER",
"type": "GUIDER",
"shape": 3,
"links": [
30
],
"slot_index": 0,
"shape": 3
"slot_index": 0
}
],
"properties": {
......@@ -344,14 +367,14 @@
{
"name": "INT",
"type": "INT",
"widget": {
"name": "height"
},
"links": [
113,
114
],
"slot_index": 0,
"widget": {
"name": "height"
}
"slot_index": 0
}
],
"title": "height",
......@@ -377,21 +400,21 @@
82
],
"flags": {},
"order": 1,
"order": 3,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "INT",
"type": "INT",
"widget": {
"name": "width"
},
"links": [
112,
115
],
"slot_index": 0,
"widget": {
"name": "width"
}
"slot_index": 0
}
],
"title": "width",
......@@ -414,7 +437,7 @@
],
"size": [
315,
106
126
],
"flags": {},
"order": 7,
......@@ -423,29 +446,38 @@
{
"name": "width",
"type": "INT",
"link": 112,
"pos": [
10,
36
],
"widget": {
"name": "width"
}
},
"link": 112
},
{
"name": "height",
"type": "INT",
"link": 113,
"pos": [
10,
60
],
"widget": {
"name": "height"
}
},
"link": 113
}
],
"outputs": [
{
"name": "LATENT",
"localized_name": "LATENT",
"type": "LATENT",
"shape": 3,
"links": [
116
],
"slot_index": 0,
"shape": 3
"slot_index": 0
}
],
"properties": {
......@@ -474,6 +506,7 @@
"inputs": [
{
"name": "clip",
"localized_name": "clip",
"type": "CLIP",
"link": 118
}
......@@ -481,6 +514,7 @@
"outputs": [
{
"name": "CONDITIONING",
"localized_name": "CONDITIONING",
"type": "CONDITIONING",
"links": [
41
......@@ -515,6 +549,7 @@
"inputs": [
{
"name": "model",
"localized_name": "model",
"type": "MODEL",
"link": 122
}
......@@ -522,6 +557,7 @@
"outputs": [
{
"name": "MODEL",
"localized_name": "MODEL",
"type": "MODEL",
"links": [
123
......@@ -539,39 +575,6 @@
1
]
},
{
"id": 38,
"type": "SVDQuantFluxDiTLoader",
"pos": [
426.25274658203125,
905.1461181640625
],
"size": [
315,
82
],
"flags": {},
"order": 5,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
122
],
"slot_index": 0
}
],
"properties": {
"Node name for S&R": "SVDQuantFluxDiTLoader"
},
"widgets_values": [
"mit-han-lab/svdq-int4-flux.1-dev",
0
]
},
{
"id": 10,
"type": "VAELoader",
......@@ -590,12 +593,13 @@
"outputs": [
{
"name": "VAE",
"localized_name": "VAE",
"type": "VAE",
"shape": 3,
"links": [
12
],
"slot_index": 0,
"shape": 3
"slot_index": 0
}
],
"properties": {
......@@ -617,12 +621,13 @@
178
],
"flags": {},
"order": 3,
"order": 5,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "CLIP",
"localized_name": "CLIP",
"type": "CLIP",
"links": [
118
......@@ -658,30 +663,35 @@
"inputs": [
{
"name": "noise",
"localized_name": "noise",
"type": "NOISE",
"link": 37,
"slot_index": 0
},
{
"name": "guider",
"localized_name": "guider",
"type": "GUIDER",
"link": 30,
"slot_index": 1
},
{
"name": "sampler",
"localized_name": "sampler",
"type": "SAMPLER",
"link": 19,
"slot_index": 2
},
{
"name": "sigmas",
"localized_name": "sigmas",
"type": "SIGMAS",
"link": 20,
"slot_index": 3
},
{
"name": "latent_image",
"localized_name": "latent_image",
"type": "LATENT",
"link": 116,
"slot_index": 4
......@@ -690,24 +700,61 @@
"outputs": [
{
"name": "output",
"localized_name": "output",
"type": "LATENT",
"shape": 3,
"links": [
24
],
"slot_index": 0,
"shape": 3
"slot_index": 0
},
{
"name": "denoised_output",
"localized_name": "denoised_output",
"type": "LATENT",
"links": null,
"shape": 3
"shape": 3,
"links": null
}
],
"properties": {
"Node name for S&R": "SamplerCustomAdvanced"
},
"widgets_values": []
},
{
"id": 38,
"type": "SVDQuantFluxDiTLoader",
"pos": [
425.7825012207031,
887.9263916015625
],
"size": [
315,
106
],
"flags": {},
"order": 6,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"localized_name": "MODEL",
"type": "MODEL",
"links": [
122
],
"slot_index": 0
}
],
"properties": {
"Node name for S&R": "SVDQuantFluxDiTLoader"
},
"widgets_values": [
"mit-han-lab/svdq-int4-flux.1-dev",
"disable",
0
]
}
],
"links": [
......@@ -870,8 +917,8 @@
"ds": {
"scale": 1.0152559799477106,
"offset": [
521.7873982958799,
167.19904950835112
1093.678904911345,
404.94781362261836
]
},
"groupNodes": {
......@@ -1066,7 +1113,7 @@
}
},
"node_versions": {
"comfy-core": "0.3.14"
"comfy-core": "0.3.24"
}
},
"version": 0.4
......
......@@ -19,11 +19,13 @@
"inputs": [
{
"name": "samples",
"localized_name": "samples",
"type": "LATENT",
"link": 7
},
{
"name": "vae",
"localized_name": "vae",
"type": "VAE",
"link": 60
}
......@@ -31,6 +33,7 @@
"outputs": [
{
"name": "IMAGE",
"localized_name": "IMAGE",
"type": "IMAGE",
"links": [
95
......@@ -60,26 +63,31 @@
"inputs": [
{
"name": "positive",
"localized_name": "positive",
"type": "CONDITIONING",
"link": 80
},
{
"name": "negative",
"localized_name": "negative",
"type": "CONDITIONING",
"link": 81
},
{
"name": "vae",
"localized_name": "vae",
"type": "VAE",
"link": 82
},
{
"name": "pixels",
"localized_name": "pixels",
"type": "IMAGE",
"link": 107
},
{
"name": "mask",
"localized_name": "mask",
"type": "MASK",
"link": 108
}
......@@ -87,6 +95,7 @@
"outputs": [
{
"name": "positive",
"localized_name": "positive",
"type": "CONDITIONING",
"links": [
77
......@@ -95,6 +104,7 @@
},
{
"name": "negative",
"localized_name": "negative",
"type": "CONDITIONING",
"links": [
78
......@@ -103,6 +113,7 @@
},
{
"name": "latent",
"localized_name": "latent",
"type": "LATENT",
"links": [
88
......@@ -134,21 +145,25 @@
"inputs": [
{
"name": "model",
"localized_name": "model",
"type": "MODEL",
"link": 102
},
{
"name": "positive",
"localized_name": "positive",
"type": "CONDITIONING",
"link": 77
},
{
"name": "negative",
"localized_name": "negative",
"type": "CONDITIONING",
"link": 78
},
{
"name": "latent_image",
"localized_name": "latent_image",
"type": "LATENT",
"link": 88
}
......@@ -156,6 +171,7 @@
"outputs": [
{
"name": "LATENT",
"localized_name": "LATENT",
"type": "LATENT",
"links": [
7
......@@ -167,7 +183,7 @@
"Node name for S&R": "KSampler"
},
"widgets_values": [
54184445162233,
482487939694684,
"randomize",
20,
1,
......@@ -176,35 +192,6 @@
1
]
},
{
"id": 9,
"type": "SaveImage",
"pos": [
1879,
90
],
"size": [
828.9535522460938,
893.8475341796875
],
"flags": {},
"order": 12,
"mode": 0,
"inputs": [
{
"name": "images",
"type": "IMAGE",
"link": 95
}
],
"outputs": [],
"properties": {
"Node name for S&R": "SaveImage"
},
"widgets_values": [
"ComfyUI"
]
},
{
"id": 26,
"type": "FluxGuidance",
......@@ -222,6 +209,7 @@
"inputs": [
{
"name": "conditioning",
"localized_name": "conditioning",
"type": "CONDITIONING",
"link": 41
}
......@@ -229,12 +217,13 @@
"outputs": [
{
"name": "CONDITIONING",
"localized_name": "CONDITIONING",
"type": "CONDITIONING",
"shape": 3,
"links": [
80
],
"slot_index": 0,
"shape": 3
"slot_index": 0
}
],
"properties": {
......@@ -263,6 +252,7 @@
"inputs": [
{
"name": "clip",
"localized_name": "clip",
"type": "CLIP",
"link": 63
}
......@@ -270,6 +260,7 @@
"outputs": [
{
"name": "CONDITIONING",
"localized_name": "CONDITIONING",
"type": "CONDITIONING",
"links": [
81
......@@ -296,7 +287,7 @@
],
"size": [
315,
106
122
],
"flags": {},
"order": 0,
......@@ -305,6 +296,7 @@
"outputs": [
{
"name": "CLIP",
"localized_name": "CLIP",
"type": "CLIP",
"links": [
62,
......@@ -322,72 +314,6 @@
"default"
]
},
{
"id": 32,
"type": "VAELoader",
"pos": [
1303,
424
],
"size": [
315,
58
],
"flags": {},
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "VAE",
"type": "VAE",
"links": [
60,
82
],
"slot_index": 0
}
],
"properties": {
"Node name for S&R": "VAELoader"
},
"widgets_values": [
"ae.safetensors"
]
},
{
"id": 45,
"type": "SVDQuantFluxDiTLoader",
"pos": [
936.3029174804688,
-113.06819915771484
],
"size": [
315,
82
],
"flags": {},
"order": 2,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
102
],
"slot_index": 0
}
],
"properties": {
"Node name for S&R": "SVDQuantFluxDiTLoader"
},
"widgets_values": [
"mit-han-lab/svdq-int4-flux.1-fill-dev",
0
]
},
{
"id": 58,
"type": "ImageAndMaskResizeNode",
......@@ -405,11 +331,13 @@
"inputs": [
{
"name": "image",
"localized_name": "image",
"type": "IMAGE",
"link": 105
},
{
"name": "mask",
"localized_name": "mask",
"type": "MASK",
"link": 106
}
......@@ -417,6 +345,7 @@
"outputs": [
{
"name": "image",
"localized_name": "image",
"type": "IMAGE",
"links": [
107
......@@ -425,6 +354,7 @@
},
{
"name": "mask",
"localized_name": "mask",
"type": "MASK",
"links": [
108
......@@ -460,6 +390,7 @@
"inputs": [
{
"name": "clip",
"localized_name": "clip",
"type": "CLIP",
"link": 62
}
......@@ -467,6 +398,7 @@
"outputs": [
{
"name": "CONDITIONING",
"localized_name": "CONDITIONING",
"type": "CONDITIONING",
"links": [
41
......@@ -496,7 +428,7 @@
132.3040771484375
],
"flags": {},
"order": 3,
"order": 2,
"mode": 0,
"inputs": [],
"outputs": [],
......@@ -509,6 +441,41 @@
"color": "#432",
"bgcolor": "#653"
},
{
"id": 45,
"type": "SVDQuantFluxDiTLoader",
"pos": [
936.3029174804688,
-113.06819915771484
],
"size": [
315,
106
],
"flags": {},
"order": 3,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"localized_name": "MODEL",
"type": "MODEL",
"links": [
102
],
"slot_index": 0
}
],
"properties": {
"Node name for S&R": "SVDQuantFluxDiTLoader"
},
"widgets_values": [
"mit-han-lab/svdq-int4-flux.1-fill-dev",
"disable",
0
]
},
{
"id": 17,
"type": "LoadImage",
......@@ -529,30 +496,96 @@
"outputs": [
{
"name": "IMAGE",
"localized_name": "IMAGE",
"type": "IMAGE",
"shape": 3,
"links": [
105
],
"slot_index": 0,
"shape": 3
"slot_index": 0
},
{
"name": "MASK",
"localized_name": "MASK",
"type": "MASK",
"shape": 3,
"links": [
106
],
"slot_index": 1,
"shape": 3
"slot_index": 1
}
],
"properties": {
"Node name for S&R": "LoadImage"
},
"widgets_values": [
"clipspace/clipspace-mask-8389612.599999994.png [input]",
"clipspace/clipspace-mask-331829.799999997.png [input]",
"image"
]
},
{
"id": 32,
"type": "VAELoader",
"pos": [
953.8762817382812,
440.3467102050781
],
"size": [
315,
58
],
"flags": {},
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "VAE",
"localized_name": "VAE",
"type": "VAE",
"links": [
60,
82
],
"slot_index": 0
}
],
"properties": {
"Node name for S&R": "VAELoader"
},
"widgets_values": [
"ae.safetensors"
]
},
{
"id": 9,
"type": "SaveImage",
"pos": [
1862.43359375,
96.36107635498047
],
"size": [
828.9535522460938,
893.8475341796875
],
"flags": {},
"order": 12,
"mode": 0,
"inputs": [
{
"name": "images",
"localized_name": "images",
"type": "IMAGE",
"link": 95
}
],
"outputs": [],
"properties": {
"Node name for S&R": "SaveImage"
},
"widgets_values": [
"ComfyUI"
]
}
],
"links": [
......@@ -697,14 +730,14 @@
"config": {},
"extra": {
"ds": {
"scale": 0.8390545288824038,
"scale": 1.7985878990921451,
"offset": [
815.2093059315082,
185.9955477896796
-287.8887097712823,
208.1745856210748
]
},
"node_versions": {
"comfy-core": "0.3.14",
"comfy-core": "0.3.24",
"comfyui-inpainteasy": "1.0.2"
}
},
......
......@@ -14,16 +14,18 @@
46
],
"flags": {},
"order": 12,
"order": 10,
"mode": 0,
"inputs": [
{
"name": "samples",
"localized_name": "samples",
"type": "LATENT",
"link": 24
},
{
"name": "vae",
"localized_name": "vae",
"type": "VAE",
"link": 12
}
......@@ -31,6 +33,7 @@
"outputs": [
{
"name": "IMAGE",
"localized_name": "IMAGE",
"type": "IMAGE",
"links": [
9
......@@ -55,12 +58,13 @@
106
],
"flags": {},
"order": 2,
"order": 0,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "LATENT",
"localized_name": "LATENT",
"type": "LATENT",
"links": [
23
......@@ -91,17 +95,18 @@
58
],
"flags": {},
"order": 3,
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "SAMPLER",
"localized_name": "SAMPLER",
"type": "SAMPLER",
"shape": 3,
"links": [
19
],
"shape": 3
]
}
],
"properties": {
......@@ -123,11 +128,12 @@
106
],
"flags": {},
"order": 8,
"order": 7,
"mode": 0,
"inputs": [
{
"name": "model",
"localized_name": "model",
"type": "MODEL",
"link": 41,
"slot_index": 0
......@@ -136,11 +142,12 @@
"outputs": [
{
"name": "SIGMAS",
"localized_name": "SIGMAS",
"type": "SIGMAS",
"shape": 3,
"links": [
20
],
"shape": 3
]
}
],
"properties": {
......@@ -152,31 +159,6 @@
1
]
},
{
"id": 27,
"type": "Note",
"pos": [
480,
960
],
"size": [
311.3529052734375,
131.16229248046875
],
"flags": {},
"order": 4,
"mode": 0,
"inputs": [],
"outputs": [],
"properties": {
"text": ""
},
"widgets_values": [
"The schnell model is a distilled model that can generate a good image with only 4 steps."
],
"color": "#432",
"bgcolor": "#653"
},
{
"id": 22,
"type": "BasicGuider",
......@@ -189,17 +171,19 @@
46
],
"flags": {},
"order": 10,
"order": 8,
"mode": 0,
"inputs": [
{
"name": "model",
"localized_name": "model",
"type": "MODEL",
"link": 42,
"slot_index": 0
},
{
"name": "conditioning",
"localized_name": "conditioning",
"type": "CONDITIONING",
"link": 40,
"slot_index": 1
......@@ -208,12 +192,13 @@
"outputs": [
{
"name": "GUIDER",
"localized_name": "GUIDER",
"type": "GUIDER",
"shape": 3,
"links": [
30
],
"slot_index": 0,
"shape": 3
"slot_index": 0
}
],
"properties": {
......@@ -233,11 +218,12 @@
164.31304931640625
],
"flags": {},
"order": 9,
"order": 6,
"mode": 0,
"inputs": [
{
"name": "clip",
"localized_name": "clip",
"type": "CLIP",
"link": 43
}
......@@ -245,6 +231,7 @@
"outputs": [
{
"name": "CONDITIONING",
"localized_name": "CONDITIONING",
"type": "CONDITIONING",
"links": [
40
......@@ -273,35 +260,40 @@
106
],
"flags": {},
"order": 11,
"order": 9,
"mode": 0,
"inputs": [
{
"name": "noise",
"localized_name": "noise",
"type": "NOISE",
"link": 37,
"slot_index": 0
},
{
"name": "guider",
"localized_name": "guider",
"type": "GUIDER",
"link": 30,
"slot_index": 1
},
{
"name": "sampler",
"localized_name": "sampler",
"type": "SAMPLER",
"link": 19,
"slot_index": 2
},
{
"name": "sigmas",
"localized_name": "sigmas",
"type": "SIGMAS",
"link": 20,
"slot_index": 3
},
{
"name": "latent_image",
"localized_name": "latent_image",
"type": "LATENT",
"link": 23,
"slot_index": 4
......@@ -310,18 +302,20 @@
"outputs": [
{
"name": "output",
"localized_name": "output",
"type": "LATENT",
"shape": 3,
"links": [
24
],
"slot_index": 0,
"shape": 3
"slot_index": 0
},
{
"name": "denoised_output",
"localized_name": "denoised_output",
"type": "LATENT",
"links": null,
"shape": 3
"shape": 3,
"links": null
}
],
"properties": {
......@@ -341,11 +335,12 @@
1060.3828125
],
"flags": {},
"order": 13,
"order": 11,
"mode": 0,
"inputs": [
{
"name": "images",
"localized_name": "images",
"type": "IMAGE",
"link": 9
}
......@@ -368,17 +363,18 @@
82
],
"flags": {},
"order": 6,
"order": 2,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "NOISE",
"localized_name": "NOISE",
"type": "NOISE",
"shape": 3,
"links": [
37
],
"shape": 3
]
}
],
"properties": {
......@@ -400,10 +396,10 @@
],
"size": [
352.79998779296875,
130
178
],
"flags": {},
"order": 7,
"order": 3,
"mode": 0,
"inputs": [
{
......@@ -415,6 +411,7 @@
"outputs": [
{
"name": "CLIP",
"localized_name": "CLIP",
"type": "CLIP",
"links": [
43
......@@ -429,77 +426,57 @@
"flux",
"t5xxl_fp16.safetensors",
"clip_l.safetensors",
512
512,
"BF16",
"mit-han-lab/svdq-flux.1-t5"
]
},
{
"id": 10,
"type": "VAELoader",
"id": 28,
"type": "SVDQuantFluxDiTLoader",
"pos": [
-31.617252349853516,
377.54791259765625
-10.846628189086914,
890.9998779296875
],
"size": [
315,
58
106
],
"flags": {},
"order": 0,
"order": 4,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "VAE",
"type": "VAE",
"name": "MODEL",
"localized_name": "MODEL",
"type": "MODEL",
"links": [
12
41,
42
],
"slot_index": 0,
"shape": 3
"slot_index": 0
}
],
"properties": {
"Node name for S&R": "VAELoader"
"Node name for S&R": "SVDQuantFluxDiTLoader"
},
"widgets_values": [
"ae.safetensors"
"mit-han-lab/svdq-int4-flux.1-schnell",
"disable",
0
]
},
{
"id": 26,
"type": "Note",
"pos": [
-28.286691665649414,
511.4660339355469
],
"size": [
336,
288
],
"flags": {},
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [],
"properties": {
"text": ""
},
"widgets_values": [
"If you get an error in any of the nodes above make sure the files are in the correct directories.\n\nSee the top of the examples page for the links : https://comfyanonymous.github.io/ComfyUI_examples/flux/\n\nflux1-schnell.safetensors goes in: ComfyUI/models/unet/\n\nt5xxl_fp16.safetensors and clip_l.safetensors go in: ComfyUI/models/clip/\n\nae.safetensors goes in: ComfyUI/models/vae/\n\n\nTip: You can set the weight_dtype above to one of the fp8 types if you have memory issues."
],
"color": "#432",
"bgcolor": "#653"
},
{
"id": 28,
"type": "SVDQuantFluxDiTLoader",
"id": 10,
"type": "VAELoader",
"pos": [
-10.846628189086914,
890.9998779296875
874.65625,
480.88372802734375
],
"size": [
315,
82
58
],
"flags": {},
"order": 5,
......@@ -507,21 +484,21 @@
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"name": "VAE",
"localized_name": "VAE",
"type": "VAE",
"shape": 3,
"links": [
41,
42
12
],
"slot_index": 0
}
],
"properties": {
"Node name for S&R": "SVDQuantFluxDiTLoader"
"Node name for S&R": "VAELoader"
},
"widgets_values": [
"mit-han-lab/svdq-int4-flux.1-schnell",
0
"ae.safetensors"
]
}
],
......@@ -627,11 +604,14 @@
"config": {},
"extra": {
"ds": {
"scale": 0.6727499949325652,
"scale": 1.1167815779424761,
"offset": [
405.6825017392191,
29.738440474209906
874.5548427683093,
429.12540214017235
]
},
"node_versions": {
"comfy-core": "0.3.24"
}
},
"version": 0.4
......
import torch
from diffusers import FluxPipeline
from nunchaku.models.transformer_flux import NunchakuFluxTransformer2dModel
from nunchaku import NunchakuFluxTransformer2dModel
transformer = NunchakuFluxTransformer2dModel.from_pretrained("mit-han-lab/svdq-fp4-flux.1-dev", precision="fp4")
pipeline = FluxPipeline.from_pretrained(
......
import torch
from diffusers import FluxPipeline
from nunchaku.models.transformer_flux import NunchakuFluxTransformer2dModel
from nunchaku import NunchakuFluxTransformer2dModel
transformer = NunchakuFluxTransformer2dModel.from_pretrained("mit-han-lab/svdq-fp4-flux.1-schnell", precision="fp4")
pipeline = FluxPipeline.from_pretrained(
......