Unverified Commit f060b8da authored by Muyang Li, committed by GitHub

[Major] Release v0.1.4

Support 4-bit text encoder and per-layer CPU offloading, reducing FLUX's minimum memory requirement to just 4 GiB while maintaining a 2–3× speedup. Fix various issues related to resolution, LoRA, pin memory, and runtime stability. Check out the release notes for full details!
parents f549dfc6 873a35be
......@@ -6,6 +6,7 @@ Check [here](https://github.com/mit-han-lab/nunchaku/issues/149) to join our use
### [Paper](http://arxiv.org/abs/2411.05007) | [Project](https://hanlab.mit.edu/projects/svdquant) | [Blog](https://hanlab.mit.edu/blog/svdquant) | [Demo](https://svdquant.mit.edu)
- **[2025-03-07]** 🚀 **Nunchaku v0.1.4 Released!** We've supported [4-bit text encoder and per-layer CPU offloading](#Low-Memory-Inference), reducing FLUX's minimum memory requirement to just **4 GiB** while maintaining a **2–3× speedup**. This update also fixes various issues related to resolution, LoRA, pin memory, and runtime stability. Check out the release notes for full details!
- **[2025-02-20]** 🚀 We release [pre-built wheels](https://huggingface.co/mit-han-lab/nunchaku) to simplify installation! Check [here](#Installation) for guidance!
- **[2025-02-20]** 🚀 **Support NVFP4 precision on NVIDIA RTX 5090!** NVFP4 delivers superior image quality compared to INT4, offering **~3× speedup** on the RTX 5090 over BF16. Learn more in our [blog](https://hanlab.mit.edu/blog/svdquant-nvfp4), check out [`examples`](./examples) for usage, and try [our demo](https://svdquant.mit.edu/flux1-schnell/) online!
- **[2025-02-18]** 🔥 [**Customized LoRA conversion**](#Customized-LoRA) and [**model quantization**](#Customized-Model-Quantization) instructions are now available! **[ComfyUI](./comfyui)** workflows now support **customized LoRA**, along with **FLUX.1-Tools**!
......@@ -45,18 +46,27 @@ SVDQuant is a post-training quantization technique for 4-bit weights and activat
## Installation
### Wheels (Linux only for now)
### Wheels (for Linux and Windows WSL)
#### For Windows Users
To install and use WSL (Windows Subsystem for Linux), follow the instructions [here](https://learn.microsoft.com/en-us/windows/wsl/install). You can also install WSL directly by running the following commands in PowerShell:
```shell
wsl --install # install the latest WSL
wsl # launch WSL
```
#### Prerequisites for all users
Before installation, ensure you have [PyTorch>=2.5](https://pytorch.org/) installed. For example, you can use the following command to install PyTorch 2.6:
```shell
pip install torch==2.6 torchvision==0.21 torchaudio==2.6
```
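Before choosing a wheel in the next step, it may help to confirm which Python and PyTorch versions (and CUDA build) you actually have, since the wheel filename encodes both. A quick check using only standard PyTorch APIs:
```python
import sys
import torch

print(sys.version.split()[0])      # e.g., 3.11.x -> pick a cp311 wheel
print(torch.__version__)           # e.g., 2.6.0  -> pick a +torch2.6 wheel
print(torch.version.cuda)          # CUDA build of the installed PyTorch
print(torch.cuda.is_available())   # True if a usable GPU is visible
```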
#### Installing nunchaku
Once PyTorch is installed, you can directly install `nunchaku` from our [Hugging Face repository](https://huggingface.co/mit-han-lab/nunchaku/tree/main). Be sure to select the appropriate wheel for your Python and PyTorch version. For example, for Python 3.11 and PyTorch 2.6:
```shell
pip install https://huggingface.co/mit-han-lab/nunchaku/resolve/main/nunchaku-0.1.3+torch2.6-cp311-cp311-linux_x86_64.whl
pip install https://huggingface.co/mit-han-lab/nunchaku/resolve/main/nunchaku-0.1.4+torch2.6-cp311-cp311-linux_x86_64.whl
```
**Note**: NVFP4 wheels are not currently available because PyTorch has not yet officially supported CUDA 12.8. To use NVFP4, you will need a **Blackwell GPU (e.g., a 50-series GPU)** and must **build from source**.
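Once the wheel is installed, a minimal sanity check (not an official step, just a quick import test) is:
```python
# NunchakuFluxTransformer2dModel is the class used throughout the examples below
from nunchaku import NunchakuFluxTransformer2dModel

print("nunchaku import OK")
```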
......@@ -81,7 +91,7 @@ pip install https://huggingface.co/mit-han-lab/nunchaku/resolve/main/nunchaku-0.
pip install peft opencv-python gradio spaces GPUtil # For gradio demos
```
To enable NVFP4 on Blackwell GPUs (e.g., 50-series GPUs), please install the nightly PyTorch build with CUDA 12.8. The installation command is:
```shell
pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128
......@@ -113,7 +123,7 @@ In [examples](examples), we provide minimal scripts for running INT4 [FLUX.1](ht
import torch
from diffusers import FluxPipeline
from nunchaku.models.transformer_flux import NunchakuFluxTransformer2dModel
from nunchaku import NunchakuFluxTransformer2dModel
transformer = NunchakuFluxTransformer2dModel.from_pretrained("mit-han-lab/svdq-int4-flux.1-dev")
pipeline = FluxPipeline.from_pretrained(
......@@ -125,6 +135,28 @@ image.save("flux.1-dev.png")
Specifically, `nunchaku` shares the same APIs as [diffusers](https://github.com/huggingface/diffusers) and can be used in a similar way.
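Because the API mirrors diffusers, the usual pipeline arguments work unchanged. For instance, a seeded generation (a sketch assuming the standard diffusers `FluxPipeline` call signature and the `pipeline` built in the snippet above):
```python
import torch

# `pipeline` is assumed to be the FluxPipeline constructed in the example above
generator = torch.Generator(device="cuda").manual_seed(0)
image = pipeline(
    "A cat holding a sign that says hello world",
    num_inference_steps=50,
    guidance_scale=3.5,
    generator=generator,  # standard diffusers argument for reproducible sampling
).images[0]
image.save("flux.1-dev-seeded.png")
```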
### Low Memory Inference
To further reduce GPU memory usage, you can use our 4-bit T5 text encoder together with per-layer CPU offloading, bringing the minimum requirement down to just 4 GiB of memory. Usage follows the same diffusers-style API. For example, the [script](examples/int4-flux.1-dev-qencoder.py) for FLUX.1-dev is as follows:
```python
import torch
from diffusers import FluxPipeline
from nunchaku import NunchakuFluxTransformer2dModel, NunchakuT5EncoderModel
transformer = NunchakuFluxTransformer2dModel.from_pretrained(
"mit-han-lab/svdq-int4-flux.1-dev", offload=True
) # set offload to False if you want to disable offloading
text_encoder_2 = NunchakuT5EncoderModel.from_pretrained("mit-han-lab/svdq-flux.1-t5")
pipeline = FluxPipeline.from_pretrained(
"black-forest-labs/FLUX.1-dev", text_encoder_2=text_encoder_2, transformer=transformer, torch_dtype=torch.bfloat16
).to("cuda")
pipeline.enable_sequential_cpu_offload() # remove this line if you want to disable the CPU offloading
image = pipeline("A cat holding a sign that says hello world", num_inference_steps=50, guidance_scale=3.5).images[0]
image.save("flux.1-dev.png")
```
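If you want to verify the memory footprint on your own machine, one rough check is to read PyTorch's CUDA allocator statistics after generation. Note that this only counts PyTorch-managed allocations, so treat it as an approximation:
```python
import torch

# run this after the pipeline call above has finished
torch.cuda.synchronize()
peak_gib = torch.cuda.max_memory_allocated() / (1024 ** 3)
print(f"Peak GPU memory allocated by PyTorch: {peak_gib:.2f} GiB")
```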
## Customized LoRA
![lora](./assets/lora.jpg)
......@@ -168,7 +200,7 @@ transformer.set_lora_strength(lora_strength)
import torch
from diffusers import FluxPipeline
from nunchaku.models.transformer_flux import NunchakuFluxTransformer2dModel
from nunchaku import NunchakuFluxTransformer2dModel
transformer = NunchakuFluxTransformer2dModel.from_pretrained("mit-han-lab/svdq-int4-flux.1-dev")
pipeline = FluxPipeline.from_pretrained(
......
......@@ -12,7 +12,7 @@ from image_gen_aux import DepthPreprocessor
from PIL import Image
from nunchaku.models.safety_checker import SafetyChecker
from nunchaku.models.transformer_flux import NunchakuFluxTransformer2dModel
from nunchaku.models.transformers.transformer_flux import NunchakuFluxTransformer2dModel
from utils import get_args
from vars import (
DEFAULT_GUIDANCE_CANNY,
......@@ -57,7 +57,7 @@ else:
transformer = NunchakuFluxTransformer2dModel.from_pretrained(f"mit-han-lab/svdq-int4-flux.1-{model_name}")
pipeline_init_kwargs["transformer"] = transformer
if args.use_qencoder:
from nunchaku.models.text_encoder import NunchakuT5EncoderModel
from nunchaku.models.text_encoders.t5_encoder import NunchakuT5EncoderModel
text_encoder_2 = NunchakuT5EncoderModel.from_pretrained("mit-han-lab/svdq-flux.1-t5")
pipeline_init_kwargs["text_encoder_2"] = text_encoder_2
......
......@@ -10,7 +10,7 @@ from diffusers import FluxFillPipeline
from PIL import Image
from nunchaku.models.safety_checker import SafetyChecker
from nunchaku.models.transformer_flux import NunchakuFluxTransformer2dModel
from nunchaku.models.transformers.transformer_flux import NunchakuFluxTransformer2dModel
from utils import get_args
from vars import DEFAULT_GUIDANCE, DEFAULT_INFERENCE_STEP, DEFAULT_STYLE_NAME, EXAMPLES, MAX_SEED, STYLE_NAMES, STYLES
......@@ -29,7 +29,7 @@ else:
transformer = NunchakuFluxTransformer2dModel.from_pretrained(f"mit-han-lab/svdq-int4-flux.1-fill-dev")
pipeline_init_kwargs["transformer"] = transformer
if args.use_qencoder:
from nunchaku.models.text_encoder import NunchakuT5EncoderModel
from nunchaku.models.text_encoders.t5_encoder import NunchakuT5EncoderModel
text_encoder_2 = NunchakuT5EncoderModel.from_pretrained("mit-han-lab/svdq-flux.1-t5")
pipeline_init_kwargs["text_encoder_2"] = text_encoder_2
......
......@@ -9,7 +9,7 @@ import torch
from diffusers import FluxPipeline, FluxPriorReduxPipeline
from PIL import Image
from nunchaku.models.transformer_flux import NunchakuFluxTransformer2dModel
from nunchaku.models.transformers.transformer_flux import NunchakuFluxTransformer2dModel
from utils import get_args
from vars import DEFAULT_GUIDANCE, DEFAULT_INFERENCE_STEP, EXAMPLES, MAX_SEED
......
......@@ -12,7 +12,7 @@ from PIL import Image
from flux_pix2pix_pipeline import FluxPix2pixTurboPipeline
from nunchaku.models.safety_checker import SafetyChecker
from nunchaku.models.transformer_flux import NunchakuFluxTransformer2dModel
from nunchaku.models.transformers.transformer_flux import NunchakuFluxTransformer2dModel
from utils import get_args
from vars import DEFAULT_SKETCH_GUIDANCE, DEFAULT_STYLE_NAME, MAX_SEED, STYLE_NAMES, STYLES
......@@ -36,7 +36,7 @@ else:
transformer = NunchakuFluxTransformer2dModel.from_pretrained("mit-han-lab/svdq-int4-flux.1-schnell")
pipeline_init_kwargs["transformer"] = transformer
if args.use_qencoder:
from nunchaku.models.text_encoder import NunchakuT5EncoderModel
from nunchaku.models.text_encoders.t5_encoder import NunchakuT5EncoderModel
text_encoder_2 = NunchakuT5EncoderModel.from_pretrained("mit-han-lab/svdq-flux.1-t5")
pipeline_init_kwargs["text_encoder_2"] = text_encoder_2
......
......@@ -57,7 +57,7 @@ def main():
for dataset_name in args.datasets:
output_dirname = os.path.join(output_root, dataset_name)
os.makedirs(output_dirname, exist_ok=True)
dataset = get_dataset(name=dataset_name)
dataset = get_dataset(name=dataset_name, max_dataset_size=8)
if args.chunk_step > 1:
dataset = dataset.select(range(args.chunk_start, len(dataset), args.chunk_step))
for row in tqdm(dataset):
......
......@@ -2,7 +2,7 @@ import torch
from diffusers import FluxPipeline
from peft.tuners import lora
from nunchaku.models.transformer_flux import NunchakuFluxTransformer2dModel
from nunchaku import NunchakuFluxTransformer2dModel
from vars import LORA_PATHS, SVDQ_LORA_PATHS
......@@ -32,11 +32,11 @@ def get_pipeline(
else:
assert precision == "fp4"
transformer = NunchakuFluxTransformer2dModel.from_pretrained(
"/home/muyang/nunchaku_models/flux.1-schnell-nvfp4-svdq-gptq", precision="fp4"
"mit-han-lab/svdq-fp4-flux.1-schnell", precision="fp4"
)
pipeline_init_kwargs["transformer"] = transformer
if use_qencoder:
from nunchaku.models.text_encoder import NunchakuT5EncoderModel
from nunchaku.models.text_encoders.t5_encoder import NunchakuT5EncoderModel
text_encoder_2 = NunchakuT5EncoderModel.from_pretrained("mit-han-lab/svdq-flux.1-t5")
pipeline_init_kwargs["text_encoder_2"] = text_encoder_2
......@@ -53,7 +53,7 @@ def get_pipeline(
transformer.set_lora_strength(lora_weight)
pipeline_init_kwargs["transformer"] = transformer
if use_qencoder:
from nunchaku.models.text_encoder import NunchakuT5EncoderModel
from nunchaku.models.text_encoders.t5_encoder import NunchakuT5EncoderModel
text_encoder_2 = NunchakuT5EncoderModel.from_pretrained("mit-han-lab/svdq-flux.1-t5")
pipeline_init_kwargs["text_encoder_2"] = text_encoder_2
......
import torch
from diffusers import SanaPAGPipeline
from nunchaku.models.transformer_sana import NunchakuSanaTransformer2DModel
from nunchaku.models.transformers.transformer_sana import NunchakuSanaTransformer2DModel
def hash_str_to_int(s: str) -> int:
......@@ -30,7 +30,7 @@ def get_pipeline(
variant="bf16",
torch_dtype=torch.bfloat16,
pag_applied_layers="transformer_blocks.8",
**pipeline_init_kwargs
**pipeline_init_kwargs,
)
if precision == "int4":
pipeline._set_pag_attn_processor = lambda *args, **kwargs: None
......
......@@ -70,8 +70,6 @@ comfy node registry-install svdquant
* Install missing nodes (e.g., comfyui-inpainteasy) following [this tutorial](https://github.com/ltdrdata/ComfyUI-Manager?tab=readme-ov-file#support-of-missing-nodes-installation).
2. **Download Required Models**: Follow [this tutorial](https://comfyanonymous.github.io/ComfyUI_examples/flux/) and download the required models into the appropriate directories using the commands below:
```shell
......@@ -92,7 +90,7 @@ comfy node registry-install svdquant
* **SVDQuant Flux DiT Loader**: A node for loading the FLUX diffusion model.
* `model_path`: Specifies the model location. If set to `mit-han-lab/svdq-int4-flux.1-schnell`, `mit-han-lab/svdq-int4-flux.1-dev`, `mit-han-lab/svdq-int4-flux.1-canny-dev`, `mit-han-lab/svdq-int4-flux.1-fill-dev` or `mit-han-lab/svdq-int4-flux.1-depth-dev`, the model will be automatically downloaded from our Hugging Face repository. Alternatively, you can manually download the model directory by running the following command example:
* `model_path`: Specifies the model location. If set to a folder name starting with `mit-han-lab`, the model will be automatically downloaded from our Hugging Face repository. Alternatively, you can manually download the model directory with a command like the following:
```shell
huggingface-cli download mit-han-lab/svdq-int4-flux.1-dev --local-dir models/diffusion_models/svdq-int4-flux.1-dev
......@@ -100,6 +98,8 @@ comfy node registry-install svdquant
After downloading, specify the corresponding folder name as the `model_path`.
* `cpu_offload`: Enables CPU offloading for the transformer model. While this may reduce GPU memory usage, it can slow down inference. Memory usage will be further optimized in node v0.1.6.
* `device_id`: Indicates the GPU ID for running the model.
* **SVDQuant FLUX LoRA Loader**: A node for loading LoRA modules for SVDQuant FLUX models.
......
......@@ -4,10 +4,7 @@ import tempfile
import folder_paths
from safetensors.torch import save_file
from nunchaku.lora.flux.comfyui_converter import comfyui2diffusers
from nunchaku.lora.flux.diffusers_converter import convert_to_nunchaku_flux_lowrank_dict
from nunchaku.lora.flux.utils import detect_format
from nunchaku.lora.flux.xlab_converter import xlab2diffusers
from nunchaku.lora.flux import comfyui2diffusers, convert_to_nunchaku_flux_lowrank_dict, detect_format, xlab2diffusers
class SVDQuantFluxLoraLoader:
......@@ -25,6 +22,8 @@ class SVDQuantFluxLoraLoader:
base_model_paths = [
"mit-han-lab/svdq-int4-flux.1-dev",
"mit-han-lab/svdq-int4-flux.1-schnell",
"mit-han-lab/svdq-fp4-flux.1-dev",
"mit-han-lab/svdq-fp4-flux.1-schnell",
"mit-han-lab/svdq-int4-flux.1-canny-dev",
"mit-han-lab/svdq-int4-flux.1-depth-dev",
"mit-han-lab/svdq-int4-flux.1-fill-dev",
......
......@@ -10,7 +10,7 @@ from diffusers import FluxTransformer2DModel
from einops import rearrange, repeat
from torch import nn
from nunchaku.models.transformer_flux import NunchakuFluxTransformer2dModel
from nunchaku import NunchakuFluxTransformer2dModel
class ComfyUIFluxForwardWrapper(nn.Module):
......@@ -67,31 +67,53 @@ class ComfyUIFluxForwardWrapper(nn.Module):
class SVDQuantFluxDiTLoader:
@classmethod
def INPUT_TYPES(s):
# folder_paths.get_filename_list("loras"),
model_paths = [
"mit-han-lab/svdq-int4-flux.1-schnell",
"mit-han-lab/svdq-int4-flux.1-dev",
"mit-han-lab/svdq-fp4-flux.1-schnell",
"mit-han-lab/svdq-fp4-flux.1-dev",
"mit-han-lab/svdq-int4-flux.1-canny-dev",
"mit-han-lab/svdq-int4-flux.1-depth-dev",
"mit-han-lab/svdq-int4-flux.1-fill-dev",
]
prefix = os.path.join(folder_paths.models_dir, "diffusion_models")
local_folders = os.listdir(prefix)
local_folders = sorted(
[
folder
for folder in local_folders
if not folder.startswith(".") and os.path.isdir(os.path.join(prefix, folder))
]
)
prefixes = folder_paths.folder_names_and_paths["diffusion_models"][0]
local_folders = set()
for prefix in prefixes:
if os.path.exists(prefix) and os.path.isdir(prefix):
local_folders_ = os.listdir(prefix)
local_folders_ = [
folder
for folder in local_folders_
if not folder.startswith(".") and os.path.isdir(os.path.join(prefix, folder))
]
local_folders.update(local_folders_)
local_folders = sorted(list(local_folders))
model_paths = local_folders + model_paths
ngpus = len(GPUtil.getGPUs())
return {
"required": {
"model_path": (model_paths,),
"model_path": (
model_paths,
{"tooltip": "The SVDQuant quantized FLUX.1 models. It can be a huggingface path or a local path."},
),
"cpu_offload": (
["enable", "disable"],
{
"default": "disable",
"tooltip": "Whether to enable CPU offload for the transformer model. This may slow down the inference, but may reduce the GPU memory usage.",
},
),
"device_id": (
"INT",
{"default": 0, "min": 0, "max": ngpus, "step": 1, "display": "number", "lazy": True},
{
"default": 0,
"min": 0,
"max": ngpus,
"step": 1,
"display": "number",
"lazy": True,
"tooltip": "The GPU device ID to use for the model.",
},
),
}
}
......@@ -101,14 +123,15 @@ class SVDQuantFluxDiTLoader:
CATEGORY = "SVDQuant"
TITLE = "SVDQuant Flux DiT Loader"
def load_model(self, model_path: str, device_id: int, **kwargs) -> tuple[FluxTransformer2DModel]:
def load_model(self, model_path: str, cpu_offload: str, device_id: int, **kwargs) -> tuple[FluxTransformer2DModel]:
device = f"cuda:{device_id}"
prefix = os.path.join(folder_paths.models_dir, "diffusion_models")
if os.path.exists(os.path.join(prefix, model_path)):
model_path = os.path.join(prefix, model_path)
else:
model_path = model_path
transformer = NunchakuFluxTransformer2dModel.from_pretrained(model_path).to(device)
prefixes = folder_paths.folder_names_and_paths["diffusion_models"][0]
for prefix in prefixes:
if os.path.exists(os.path.join(prefix, model_path)):
model_path = os.path.join(prefix, model_path)
break
transformer = NunchakuFluxTransformer2dModel.from_pretrained(model_path, offload=cpu_offload == "enable")
transformer = transformer.to(device)
dit_config = {
"image_model": "flux",
"patch_size": 2,
......
......@@ -4,39 +4,60 @@ import types
import comfy.sd
import folder_paths
import torch
from torch import nn
from transformers import T5EncoderModel
from nunchaku import NunchakuT5EncoderModel
def svdquant_t5_forward(
self: T5EncoderModel,
input_ids: torch.LongTensor,
attention_mask,
embeds=None,
intermediate_output=None,
final_layer_norm_intermediate=True,
dtype: str | torch.dtype = torch.bfloat16,
**kwargs,
):
assert attention_mask is None
assert intermediate_output is None
assert final_layer_norm_intermediate
outputs = self.encoder(input_ids, attention_mask=attention_mask)
outputs = self.encoder(input_ids=input_ids, inputs_embeds=embeds, attention_mask=attention_mask)
hidden_states = outputs["last_hidden_state"]
hidden_states = hidden_states.to(dtype=dtype)
return hidden_states, None
class WrappedEmbedding(nn.Module):
def __init__(self, embedding: nn.Embedding):
super().__init__()
self.embedding = embedding
def forward(self, input: torch.Tensor, out_dtype: torch.dtype | None = None):
return self.embedding(input)
@property
def weight(self):
return self.embedding.weight
class SVDQuantTextEncoderLoader:
@classmethod
def INPUT_TYPES(s):
model_paths = ["mit-han-lab/svdq-flux.1-t5"]
prefix = os.path.join(folder_paths.models_dir, "text_encoders")
local_folders = os.listdir(prefix)
local_folders = sorted(
[
folder
for folder in local_folders
if not folder.startswith(".") and os.path.isdir(os.path.join(prefix, folder))
]
)
prefixes = folder_paths.folder_names_and_paths["text_encoders"][0]
local_folders = set()
for prefix in prefixes:
if os.path.exists(prefix) and os.path.isdir(prefix):
local_folders_ = os.listdir(prefix)
local_folders_ = [
folder
for folder in local_folders_
if not folder.startswith(".") and os.path.isdir(os.path.join(prefix, folder))
]
local_folders.update(local_folders_)
local_folders = sorted(list(local_folders))
model_paths.extend(local_folders)
return {
"required": {
......@@ -45,14 +66,7 @@ class SVDQuantTextEncoderLoader:
"text_encoder2": (folder_paths.get_filename_list("text_encoders"),),
"t5_min_length": (
"INT",
{
"default": 512,
"min": 256,
"max": 1024,
"step": 128,
"display": "number",
"lazy": True,
},
{"default": 512, "min": 256, "max": 1024, "step": 128, "display": "number", "lazy": True},
),
"t5_precision": (["BF16", "INT4"],),
"int4_model": (model_paths, {"tooltip": "The name of the INT4 model."}),
......@@ -92,20 +106,23 @@ class SVDQuantTextEncoderLoader:
clip.tokenizer.t5xxl.min_length = t5_min_length
if t5_precision == "INT4":
from nunchaku.models.text_encoder import NunchakuT5EncoderModel
transformer = clip.cond_stage_model.t5xxl.transformer
param = next(transformer.parameters())
dtype = param.dtype
device = param.device
prefix = "models/text_encoders"
if os.path.exists(os.path.join(prefix, int4_model)):
model_path = os.path.join(prefix, int4_model)
else:
prefixes = folder_paths.folder_names_and_paths["diffusion_models"][0]
model_path = None
for prefix in prefixes:
if os.path.exists(os.path.join(prefix, int4_model)):
model_path = os.path.join(prefix, int4_model)
break
if model_path is None:
model_path = int4_model
transformer = NunchakuT5EncoderModel.from_pretrained(model_path)
transformer.forward = types.MethodType(svdquant_t5_forward, transformer)
transformer.shared = WrappedEmbedding(transformer.shared)
clip.cond_stage_model.t5xxl.transformer = (
transformer.to(device=device, dtype=dtype) if device.type == "cuda" else transformer
)
......
[project]
name = "svdquant"
description = "SVDQuant ComfyUI Node. SVDQuant is a new post-training training quantization paradigm for diffusion models, which quantize both the weights and activations of FLUX.1 to 4 bits, achieving 3.5× memory and 8.7× latency reduction on a 16GB laptop 4090 GPU."
version = "0.1.3"
description = "SVDQuant ComfyUI Node. SVDQuant is a new post-training training quantization paradigm for diffusion models, which quantize both the weights and activations of FLUX.1 to 4 bits, achieving 3.5× memory and 8.7× latency reduction on a 16GB laptop 4090 GPU. GitHub: https://github.com/mit-han-lab/nunchaku"
version = "0.1.5"
license = { file = "LICENSE.txt" }
dependencies = []
requires-python = ">=3.11, <3.13"
requires-python = ">=3.10, <3.13"
[project.urls]
Repository = "https://github.com/mit-han-lab/nunchaku"
#[project.urls]
#Repository = "https://github.com/mit-han-lab/nunchaku"
# Used by Comfy Registry https://comfyregistry.org
[tool.comfy]
......
......@@ -19,45 +19,50 @@
"inputs": [
{
"name": "model",
"localized_name": "model",
"label": "model",
"type": "MODEL",
"link": 71,
"label": "model"
"link": 71
},
{
"name": "positive",
"localized_name": "positive",
"label": "positive",
"type": "CONDITIONING",
"link": 64,
"label": "positive"
"link": 64
},
{
"name": "negative",
"localized_name": "negative",
"label": "negative",
"type": "CONDITIONING",
"link": 65,
"label": "negative"
"link": 65
},
{
"name": "latent_image",
"localized_name": "latent_image",
"label": "latent_image",
"type": "LATENT",
"link": 66,
"label": "latent_image"
"link": 66
}
],
"outputs": [
{
"name": "LATENT",
"localized_name": "LATENT",
"label": "LATENT",
"type": "LATENT",
"links": [
7
],
"slot_index": 0,
"label": "LATENT"
"slot_index": 0
}
],
"properties": {
"Node name for S&R": "KSampler"
},
"widgets_values": [
896617285614695,
875054580097021,
"randomize",
20,
1,
......@@ -83,56 +88,63 @@
"inputs": [
{
"name": "positive",
"localized_name": "positive",
"label": "positive",
"type": "CONDITIONING",
"link": 67,
"label": "positive"
"link": 67
},
{
"name": "negative",
"localized_name": "negative",
"label": "negative",
"type": "CONDITIONING",
"link": 68,
"label": "negative"
"link": 68
},
{
"name": "vae",
"localized_name": "vae",
"label": "vae",
"type": "VAE",
"link": 69,
"label": "vae"
"link": 69
},
{
"name": "pixels",
"localized_name": "pixels",
"label": "pixels",
"type": "IMAGE",
"link": 70,
"label": "pixels"
"link": 70
}
],
"outputs": [
{
"name": "positive",
"localized_name": "positive",
"label": "positive",
"type": "CONDITIONING",
"links": [
64
],
"slot_index": 0,
"label": "positive"
"slot_index": 0
},
{
"name": "negative",
"localized_name": "negative",
"label": "negative",
"type": "CONDITIONING",
"links": [
65
],
"slot_index": 1,
"label": "negative"
"slot_index": 1
},
{
"name": "latent",
"localized_name": "latent",
"label": "latent",
"type": "LATENT",
"links": [
66
],
"slot_index": 2,
"label": "latent"
"slot_index": 2
}
],
"properties": {
......@@ -157,26 +169,29 @@
"inputs": [
{
"name": "samples",
"localized_name": "samples",
"label": "samples",
"type": "LATENT",
"link": 7,
"label": "samples"
"link": 7
},
{
"name": "vae",
"localized_name": "vae",
"label": "vae",
"type": "VAE",
"link": 60,
"label": "vae"
"link": 60
}
],
"outputs": [
{
"name": "IMAGE",
"localized_name": "IMAGE",
"label": "IMAGE",
"type": "IMAGE",
"links": [
9
],
"slot_index": 0,
"label": "IMAGE"
"slot_index": 0
}
],
"properties": {
......@@ -201,9 +216,10 @@
"inputs": [
{
"name": "images",
"localized_name": "images",
"label": "images",
"type": "IMAGE",
"link": 9,
"label": "images"
"link": 9
}
],
"outputs": [],
......@@ -230,13 +246,14 @@
"outputs": [
{
"name": "VAE",
"localized_name": "VAE",
"label": "VAE",
"type": "VAE",
"links": [
60,
69
],
"slot_index": 0,
"label": "VAE"
"slot_index": 0
}
],
"properties": {
......@@ -263,21 +280,23 @@
"inputs": [
{
"name": "conditioning",
"localized_name": "conditioning",
"label": "conditioning",
"type": "CONDITIONING",
"link": 41,
"label": "conditioning"
"link": 41
}
],
"outputs": [
{
"name": "CONDITIONING",
"localized_name": "CONDITIONING",
"label": "CONDITIONING",
"type": "CONDITIONING",
"shape": 3,
"links": [
67
],
"slot_index": 0,
"shape": 3,
"label": "CONDITIONING"
"slot_index": 0
}
],
"properties": {
......@@ -296,7 +315,7 @@
],
"size": [
315,
106
122
],
"flags": {},
"order": 1,
......@@ -305,12 +324,13 @@
"outputs": [
{
"name": "CLIP",
"localized_name": "CLIP",
"label": "CLIP",
"type": "CLIP",
"links": [
62,
63
],
"label": "CLIP"
]
}
],
"properties": {
......@@ -340,20 +360,22 @@
"inputs": [
{
"name": "clip",
"localized_name": "clip",
"label": "clip",
"type": "CLIP",
"link": 62,
"label": "clip"
"link": 62
}
],
"outputs": [
{
"name": "CONDITIONING",
"localized_name": "CONDITIONING",
"label": "CONDITIONING",
"type": "CONDITIONING",
"links": [
41
],
"slot_index": 0,
"label": "CONDITIONING"
"slot_index": 0
}
],
"title": "CLIP Text Encode (Positive Prompt)",
......@@ -383,9 +405,10 @@
"inputs": [
{
"name": "images",
"localized_name": "images",
"label": "images",
"type": "IMAGE",
"link": 26,
"label": "images"
"link": 26
}
],
"outputs": [],
......@@ -411,22 +434,24 @@
"inputs": [
{
"name": "image",
"localized_name": "image",
"label": "image",
"type": "IMAGE",
"link": 76,
"label": "image"
"link": 76
}
],
"outputs": [
{
"name": "IMAGE",
"localized_name": "IMAGE",
"label": "IMAGE",
"type": "IMAGE",
"shape": 3,
"links": [
26,
70
],
"slot_index": 0,
"shape": 3,
"label": "IMAGE"
"slot_index": 0
}
],
"properties": {
......@@ -454,6 +479,7 @@
"inputs": [
{
"name": "image",
"localized_name": "image",
"type": "IMAGE",
"link": 75
}
......@@ -461,6 +487,7 @@
"outputs": [
{
"name": "IMAGE",
"localized_name": "IMAGE",
"type": "IMAGE",
"links": [
76
......@@ -478,48 +505,6 @@
"center"
]
},
{
"id": 17,
"type": "LoadImage",
"pos": [
6.694743633270264,
562.3865966796875
],
"size": [
315,
314.0000305175781
],
"flags": {},
"order": 2,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "IMAGE",
"type": "IMAGE",
"links": [
75
],
"slot_index": 0,
"shape": 3,
"label": "IMAGE"
},
{
"name": "MASK",
"type": "MASK",
"links": null,
"shape": 3,
"label": "MASK"
}
],
"properties": {
"Node name for S&R": "LoadImage"
},
"widgets_values": [
"robot.png",
"image"
]
},
{
"id": 7,
"type": "CLIPTextEncode",
......@@ -539,20 +524,22 @@
"inputs": [
{
"name": "clip",
"localized_name": "clip",
"label": "clip",
"type": "CLIP",
"link": 63,
"label": "clip"
"link": 63
}
],
"outputs": [
{
"name": "CONDITIONING",
"localized_name": "CONDITIONING",
"label": "CONDITIONING",
"type": "CONDITIONING",
"links": [
68
],
"slot_index": 0,
"label": "CONDITIONING"
"slot_index": 0
}
],
"title": "CLIP Text Encode (Negative Prompt)",
......@@ -565,6 +552,50 @@
"color": "#322",
"bgcolor": "#533"
},
{
"id": 17,
"type": "LoadImage",
"pos": [
6.694743633270264,
562.3865966796875
],
"size": [
315,
314.0000305175781
],
"flags": {},
"order": 2,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "IMAGE",
"localized_name": "IMAGE",
"label": "IMAGE",
"type": "IMAGE",
"shape": 3,
"links": [
75
],
"slot_index": 0
},
{
"name": "MASK",
"localized_name": "MASK",
"label": "MASK",
"type": "MASK",
"shape": 3,
"links": null
}
],
"properties": {
"Node name for S&R": "LoadImage"
},
"widgets_values": [
"robot.png",
"image"
]
},
{
"id": 36,
"type": "SVDQuantFluxDiTLoader",
......@@ -574,7 +605,7 @@
],
"size": [
395.6002197265625,
105.77959442138672
106
],
"flags": {},
"order": 3,
......@@ -583,6 +614,7 @@
"outputs": [
{
"name": "MODEL",
"localized_name": "MODEL",
"type": "MODEL",
"links": [
71
......@@ -595,6 +627,7 @@
},
"widgets_values": [
"mit-han-lab/svdq-int4-flux.1-canny-dev",
"disable",
0
]
}
......@@ -741,14 +774,14 @@
"config": {},
"extra": {
"ds": {
"scale": 0.895430243255241,
"scale": 1.5863092971714992,
"offset": [
838.4305404853558,
332.05158795287764
170.04223120944968,
209.5374167314878
]
},
"node_versions": {
"comfy-core": "0.3.14"
"comfy-core": "0.3.24"
}
},
"version": 0.4
......
......@@ -21,20 +21,22 @@
"inputs": [
{
"name": "clip",
"localized_name": "clip",
"label": "clip",
"type": "CLIP",
"link": 63,
"label": "clip"
"link": 63
}
],
"outputs": [
{
"name": "CONDITIONING",
"localized_name": "CONDITIONING",
"label": "CONDITIONING",
"type": "CONDITIONING",
"links": [
68
],
"slot_index": 0,
"label": "CONDITIONING"
"slot_index": 0
}
],
"title": "CLIP Text Encode (Negative Prompt)",
......@@ -56,7 +58,7 @@
],
"size": [
315,
106
122
],
"flags": {},
"order": 0,
......@@ -65,12 +67,13 @@
"outputs": [
{
"name": "CLIP",
"localized_name": "CLIP",
"label": "CLIP",
"type": "CLIP",
"links": [
62,
63
],
"label": "CLIP"
]
}
],
"properties": {
......@@ -100,21 +103,23 @@
"inputs": [
{
"name": "conditioning",
"localized_name": "conditioning",
"label": "conditioning",
"type": "CONDITIONING",
"link": 41,
"label": "conditioning"
"link": 41
}
],
"outputs": [
{
"name": "CONDITIONING",
"localized_name": "CONDITIONING",
"label": "CONDITIONING",
"type": "CONDITIONING",
"shape": 3,
"links": [
67
],
"slot_index": 0,
"shape": 3,
"label": "CONDITIONING"
"slot_index": 0
}
],
"properties": {
......@@ -141,45 +146,50 @@
"inputs": [
{
"name": "model",
"localized_name": "model",
"label": "model",
"type": "MODEL",
"link": 78,
"label": "model"
"link": 78
},
{
"name": "positive",
"localized_name": "positive",
"label": "positive",
"type": "CONDITIONING",
"link": 64,
"label": "positive"
"link": 64
},
{
"name": "negative",
"localized_name": "negative",
"label": "negative",
"type": "CONDITIONING",
"link": 65,
"label": "negative"
"link": 65
},
{
"name": "latent_image",
"localized_name": "latent_image",
"label": "latent_image",
"type": "LATENT",
"link": 73,
"label": "latent_image"
"link": 73
}
],
"outputs": [
{
"name": "LATENT",
"localized_name": "LATENT",
"label": "LATENT",
"type": "LATENT",
"links": [
7
],
"slot_index": 0,
"label": "LATENT"
"slot_index": 0
}
],
"properties": {
"Node name for S&R": "KSampler"
},
"widgets_values": [
704308966490490,
69796511068157,
"randomize",
20,
1,
......@@ -188,39 +198,6 @@
1
]
},
{
"id": 39,
"type": "SVDQuantFluxDiTLoader",
"pos": [
707.80908203125,
-172.0343017578125
],
"size": [
315,
82
],
"flags": {},
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
78
],
"slot_index": 0
}
],
"properties": {
"Node name for S&R": "SVDQuantFluxDiTLoader"
},
"widgets_values": [
"mit-han-lab/svdq-int4-flux.1-depth-dev",
0
]
},
{
"id": 43,
"type": "PreviewImage",
......@@ -238,9 +215,10 @@
"inputs": [
{
"name": "images",
"localized_name": "images",
"label": "images",
"type": "IMAGE",
"link": 87,
"label": "images"
"link": 87
}
],
"outputs": [],
......@@ -266,26 +244,29 @@
"inputs": [
{
"name": "samples",
"localized_name": "samples",
"label": "samples",
"type": "LATENT",
"link": 7,
"label": "samples"
"link": 7
},
{
"name": "vae",
"localized_name": "vae",
"label": "vae",
"type": "VAE",
"link": 60,
"label": "vae"
"link": 60
}
],
"outputs": [
{
"name": "IMAGE",
"localized_name": "IMAGE",
"label": "IMAGE",
"type": "IMAGE",
"links": [
85
],
"slot_index": 0,
"label": "IMAGE"
"slot_index": 0
}
],
"properties": {
......@@ -310,9 +291,10 @@
"inputs": [
{
"name": "images",
"localized_name": "images",
"label": "images",
"type": "IMAGE",
"link": 85,
"label": "images"
"link": 85
}
],
"outputs": [],
......@@ -338,6 +320,7 @@
"inputs": [
{
"name": "image",
"localized_name": "image",
"type": "IMAGE",
"link": 82
}
......@@ -345,6 +328,7 @@
"outputs": [
{
"name": "IMAGE",
"localized_name": "IMAGE",
"type": "IMAGE",
"links": [
86
......@@ -379,20 +363,22 @@
"inputs": [
{
"name": "clip",
"localized_name": "clip",
"label": "clip",
"type": "CLIP",
"link": 62,
"label": "clip"
"link": 62
}
],
"outputs": [
{
"name": "CONDITIONING",
"localized_name": "CONDITIONING",
"label": "CONDITIONING",
"type": "CONDITIONING",
"links": [
41
],
"slot_index": 0,
"label": "CONDITIONING"
"slot_index": 0
}
],
"title": "CLIP Text Encode (Positive Prompt)",
......@@ -417,19 +403,20 @@
58
],
"flags": {},
"order": 2,
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "VAE",
"localized_name": "VAE",
"label": "VAE",
"type": "VAE",
"links": [
60,
69
],
"slot_index": 0,
"label": "VAE"
"slot_index": 0
}
],
"properties": {
......@@ -456,56 +443,63 @@
"inputs": [
{
"name": "positive",
"localized_name": "positive",
"label": "positive",
"type": "CONDITIONING",
"link": 67,
"label": "positive"
"link": 67
},
{
"name": "negative",
"localized_name": "negative",
"label": "negative",
"type": "CONDITIONING",
"link": 68,
"label": "negative"
"link": 68
},
{
"name": "vae",
"localized_name": "vae",
"label": "vae",
"type": "VAE",
"link": 69,
"label": "vae"
"link": 69
},
{
"name": "pixels",
"localized_name": "pixels",
"label": "pixels",
"type": "IMAGE",
"link": 88,
"label": "pixels"
"link": 88
}
],
"outputs": [
{
"name": "positive",
"localized_name": "positive",
"label": "positive",
"type": "CONDITIONING",
"links": [
64
],
"slot_index": 0,
"label": "positive"
"slot_index": 0
},
{
"name": "negative",
"localized_name": "negative",
"label": "negative",
"type": "CONDITIONING",
"links": [
65
],
"slot_index": 1,
"label": "negative"
"slot_index": 1
},
{
"name": "latent",
"localized_name": "latent",
"label": "latent",
"type": "LATENT",
"links": [
73
],
"slot_index": 2,
"label": "latent"
"slot_index": 2
}
],
"properties": {
......@@ -530,6 +524,7 @@
"inputs": [
{
"name": "image",
"localized_name": "image",
"type": "IMAGE",
"link": 86
}
......@@ -537,6 +532,7 @@
"outputs": [
{
"name": "IMAGE",
"localized_name": "IMAGE",
"type": "IMAGE",
"links": [
87,
......@@ -564,26 +560,28 @@
314.0000305175781
],
"flags": {},
"order": 3,
"order": 2,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "IMAGE",
"localized_name": "IMAGE",
"label": "IMAGE",
"type": "IMAGE",
"shape": 3,
"links": [
82
],
"slot_index": 0,
"shape": 3,
"label": "IMAGE"
"slot_index": 0
},
{
"name": "MASK",
"localized_name": "MASK",
"label": "MASK",
"type": "MASK",
"links": null,
"shape": 3,
"label": "MASK"
"links": null
}
],
"properties": {
......@@ -593,6 +591,41 @@
"logo_example.png",
"image"
]
},
{
"id": 39,
"type": "SVDQuantFluxDiTLoader",
"pos": [
707.80908203125,
-172.0343017578125
],
"size": [
315,
106
],
"flags": {},
"order": 3,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"localized_name": "MODEL",
"type": "MODEL",
"links": [
78
],
"slot_index": 0
}
],
"properties": {
"Node name for S&R": "SVDQuantFluxDiTLoader"
},
"widgets_values": [
"mit-han-lab/svdq-int4-flux.1-depth-dev",
"disable",
0
]
}
],
"links": [
......@@ -739,12 +772,12 @@
"ds": {
"scale": 0.8140274938684042,
"offset": [
1060.3416359459316,
529.8567933439979
1795.999020278545,
750.1636967541119
]
},
"node_versions": {
"comfy-core": "0.3.14"
"comfy-core": "0.3.24"
}
},
"version": 0.4
......
......@@ -19,6 +19,7 @@
"inputs": [
{
"name": "conditioning",
"localized_name": "conditioning",
"type": "CONDITIONING",
"link": 41
}
......@@ -26,12 +27,13 @@
"outputs": [
{
"name": "CONDITIONING",
"localized_name": "CONDITIONING",
"type": "CONDITIONING",
"shape": 3,
"links": [
42
],
"slot_index": 0,
"shape": 3
"slot_index": 0
}
],
"properties": {
......@@ -60,6 +62,7 @@
"inputs": [
{
"name": "images",
"localized_name": "images",
"type": "IMAGE",
"link": 9
}
......@@ -82,24 +85,25 @@
82
],
"flags": {},
"order": 6,
"order": 0,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "NOISE",
"localized_name": "NOISE",
"type": "NOISE",
"shape": 3,
"links": [
37
],
"shape": 3
]
}
],
"properties": {
"Node name for S&R": "RandomNoise"
},
"widgets_values": [
148576770035090,
385675283593224,
"randomize"
],
"color": "#2a363b",
......@@ -117,17 +121,18 @@
58
],
"flags": {},
"order": 0,
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "SAMPLER",
"localized_name": "SAMPLER",
"type": "SAMPLER",
"shape": 3,
"links": [
19
],
"shape": 3
]
}
],
"properties": {
......@@ -154,6 +159,7 @@
"inputs": [
{
"name": "model",
"localized_name": "model",
"type": "MODEL",
"link": 55,
"slot_index": 0
......@@ -162,11 +168,12 @@
"outputs": [
{
"name": "SIGMAS",
"localized_name": "SIGMAS",
"type": "SIGMAS",
"shape": 3,
"links": [
20
],
"shape": 3
]
}
],
"properties": {
......@@ -187,7 +194,7 @@
],
"size": [
315,
130
170
],
"flags": {},
"order": 11,
......@@ -195,6 +202,7 @@
"inputs": [
{
"name": "model",
"localized_name": "model",
"type": "MODEL",
"link": 123,
"slot_index": 0
......@@ -202,32 +210,41 @@
{
"name": "width",
"type": "INT",
"link": 115,
"slot_index": 1,
"pos": [
10,
84
],
"widget": {
"name": "width"
}
},
"link": 115,
"slot_index": 1
},
{
"name": "height",
"type": "INT",
"link": 114,
"slot_index": 2,
"pos": [
10,
108
],
"widget": {
"name": "height"
}
},
"link": 114,
"slot_index": 2
}
],
"outputs": [
{
"name": "MODEL",
"localized_name": "MODEL",
"type": "MODEL",
"shape": 3,
"links": [
54,
55
],
"slot_index": 0,
"shape": 3
"slot_index": 0
}
],
"properties": {
......@@ -257,11 +274,13 @@
"inputs": [
{
"name": "samples",
"localized_name": "samples",
"type": "LATENT",
"link": 24
},
{
"name": "vae",
"localized_name": "vae",
"type": "VAE",
"link": 12
}
......@@ -269,6 +288,7 @@
"outputs": [
{
"name": "IMAGE",
"localized_name": "IMAGE",
"type": "IMAGE",
"links": [
9
......@@ -298,12 +318,14 @@
"inputs": [
{
"name": "model",
"localized_name": "model",
"type": "MODEL",
"link": 54,
"slot_index": 0
},
{
"name": "conditioning",
"localized_name": "conditioning",
"type": "CONDITIONING",
"link": 42,
"slot_index": 1
......@@ -312,12 +334,13 @@
"outputs": [
{
"name": "GUIDER",
"localized_name": "GUIDER",
"type": "GUIDER",
"shape": 3,
"links": [
30
],
"slot_index": 0,
"shape": 3
"slot_index": 0
}
],
"properties": {
......@@ -344,14 +367,14 @@
{
"name": "INT",
"type": "INT",
"widget": {
"name": "height"
},
"links": [
113,
114
],
"slot_index": 0,
"widget": {
"name": "height"
}
"slot_index": 0
}
],
"title": "height",
......@@ -377,21 +400,21 @@
82
],
"flags": {},
"order": 1,
"order": 3,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "INT",
"type": "INT",
"widget": {
"name": "width"
},
"links": [
112,
115
],
"slot_index": 0,
"widget": {
"name": "width"
}
"slot_index": 0
}
],
"title": "width",
......@@ -414,7 +437,7 @@
],
"size": [
315,
106
126
],
"flags": {},
"order": 7,
......@@ -423,29 +446,38 @@
{
"name": "width",
"type": "INT",
"link": 112,
"pos": [
10,
36
],
"widget": {
"name": "width"
}
},
"link": 112
},
{
"name": "height",
"type": "INT",
"link": 113,
"pos": [
10,
60
],
"widget": {
"name": "height"
}
},
"link": 113
}
],
"outputs": [
{
"name": "LATENT",
"localized_name": "LATENT",
"type": "LATENT",
"shape": 3,
"links": [
116
],
"slot_index": 0,
"shape": 3
"slot_index": 0
}
],
"properties": {
......@@ -474,6 +506,7 @@
"inputs": [
{
"name": "clip",
"localized_name": "clip",
"type": "CLIP",
"link": 118
}
......@@ -481,6 +514,7 @@
"outputs": [
{
"name": "CONDITIONING",
"localized_name": "CONDITIONING",
"type": "CONDITIONING",
"links": [
41
......@@ -515,6 +549,7 @@
"inputs": [
{
"name": "model",
"localized_name": "model",
"type": "MODEL",
"link": 122
}
......@@ -522,6 +557,7 @@
"outputs": [
{
"name": "MODEL",
"localized_name": "MODEL",
"type": "MODEL",
"links": [
123
......@@ -539,39 +575,6 @@
1
]
},
{
"id": 38,
"type": "SVDQuantFluxDiTLoader",
"pos": [
426.25274658203125,
905.1461181640625
],
"size": [
315,
82
],
"flags": {},
"order": 5,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
122
],
"slot_index": 0
}
],
"properties": {
"Node name for S&R": "SVDQuantFluxDiTLoader"
},
"widgets_values": [
"mit-han-lab/svdq-int4-flux.1-dev",
0
]
},
{
"id": 10,
"type": "VAELoader",
......@@ -590,12 +593,13 @@
"outputs": [
{
"name": "VAE",
"localized_name": "VAE",
"type": "VAE",
"shape": 3,
"links": [
12
],
"slot_index": 0,
"shape": 3
"slot_index": 0
}
],
"properties": {
......@@ -617,12 +621,13 @@
178
],
"flags": {},
"order": 3,
"order": 5,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "CLIP",
"localized_name": "CLIP",
"type": "CLIP",
"links": [
118
......@@ -658,30 +663,35 @@
"inputs": [
{
"name": "noise",
"localized_name": "noise",
"type": "NOISE",
"link": 37,
"slot_index": 0
},
{
"name": "guider",
"localized_name": "guider",
"type": "GUIDER",
"link": 30,
"slot_index": 1
},
{
"name": "sampler",
"localized_name": "sampler",
"type": "SAMPLER",
"link": 19,
"slot_index": 2
},
{
"name": "sigmas",
"localized_name": "sigmas",
"type": "SIGMAS",
"link": 20,
"slot_index": 3
},
{
"name": "latent_image",
"localized_name": "latent_image",
"type": "LATENT",
"link": 116,
"slot_index": 4
......@@ -690,24 +700,61 @@
"outputs": [
{
"name": "output",
"localized_name": "output",
"type": "LATENT",
"shape": 3,
"links": [
24
],
"slot_index": 0,
"shape": 3
"slot_index": 0
},
{
"name": "denoised_output",
"localized_name": "denoised_output",
"type": "LATENT",
"links": null,
"shape": 3
"shape": 3,
"links": null
}
],
"properties": {
"Node name for S&R": "SamplerCustomAdvanced"
},
"widgets_values": []
},
{
"id": 38,
"type": "SVDQuantFluxDiTLoader",
"pos": [
425.7825012207031,
887.9263916015625
],
"size": [
315,
106
],
"flags": {},
"order": 6,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"localized_name": "MODEL",
"type": "MODEL",
"links": [
122
],
"slot_index": 0
}
],
"properties": {
"Node name for S&R": "SVDQuantFluxDiTLoader"
},
"widgets_values": [
"mit-han-lab/svdq-int4-flux.1-dev",
"disable",
0
]
}
],
"links": [
......@@ -870,8 +917,8 @@
"ds": {
"scale": 1.0152559799477106,
"offset": [
521.7873982958799,
167.19904950835112
1093.678904911345,
404.94781362261836
]
},
"groupNodes": {
......@@ -1066,7 +1113,7 @@
}
},
"node_versions": {
"comfy-core": "0.3.14"
"comfy-core": "0.3.24"
}
},
"version": 0.4
......
......@@ -19,11 +19,13 @@
"inputs": [
{
"name": "samples",
"localized_name": "samples",
"type": "LATENT",
"link": 7
},
{
"name": "vae",
"localized_name": "vae",
"type": "VAE",
"link": 60
}
......@@ -31,6 +33,7 @@
"outputs": [
{
"name": "IMAGE",
"localized_name": "IMAGE",
"type": "IMAGE",
"links": [
95
......@@ -60,26 +63,31 @@
"inputs": [
{
"name": "positive",
"localized_name": "positive",
"type": "CONDITIONING",
"link": 80
},
{
"name": "negative",
"localized_name": "negative",
"type": "CONDITIONING",
"link": 81
},
{
"name": "vae",
"localized_name": "vae",
"type": "VAE",
"link": 82
},
{
"name": "pixels",
"localized_name": "pixels",
"type": "IMAGE",
"link": 107
},
{
"name": "mask",
"localized_name": "mask",
"type": "MASK",
"link": 108
}
......@@ -87,6 +95,7 @@
"outputs": [
{
"name": "positive",
"localized_name": "positive",
"type": "CONDITIONING",
"links": [
77
......@@ -95,6 +104,7 @@
},
{
"name": "negative",
"localized_name": "negative",
"type": "CONDITIONING",
"links": [
78
......@@ -103,6 +113,7 @@
},
{
"name": "latent",
"localized_name": "latent",
"type": "LATENT",
"links": [
88
......@@ -134,21 +145,25 @@
"inputs": [
{
"name": "model",
"localized_name": "model",
"type": "MODEL",
"link": 102
},
{
"name": "positive",
"localized_name": "positive",
"type": "CONDITIONING",
"link": 77
},
{
"name": "negative",
"localized_name": "negative",
"type": "CONDITIONING",
"link": 78
},
{
"name": "latent_image",
"localized_name": "latent_image",
"type": "LATENT",
"link": 88
}
......@@ -156,6 +171,7 @@
"outputs": [
{
"name": "LATENT",
"localized_name": "LATENT",
"type": "LATENT",
"links": [
7
......@@ -167,7 +183,7 @@
"Node name for S&R": "KSampler"
},
"widgets_values": [
54184445162233,
482487939694684,
"randomize",
20,
1,
......@@ -176,35 +192,6 @@
1
]
},
{
"id": 9,
"type": "SaveImage",
"pos": [
1879,
90
],
"size": [
828.9535522460938,
893.8475341796875
],
"flags": {},
"order": 12,
"mode": 0,
"inputs": [
{
"name": "images",
"type": "IMAGE",
"link": 95
}
],
"outputs": [],
"properties": {
"Node name for S&R": "SaveImage"
},
"widgets_values": [
"ComfyUI"
]
},
{
"id": 26,
"type": "FluxGuidance",
......@@ -222,6 +209,7 @@
"inputs": [
{
"name": "conditioning",
"localized_name": "conditioning",
"type": "CONDITIONING",
"link": 41
}
......@@ -229,12 +217,13 @@
"outputs": [
{
"name": "CONDITIONING",
"localized_name": "CONDITIONING",
"type": "CONDITIONING",
"shape": 3,
"links": [
80
],
"slot_index": 0,
"shape": 3
"slot_index": 0
}
],
"properties": {
......@@ -263,6 +252,7 @@
"inputs": [
{
"name": "clip",
"localized_name": "clip",
"type": "CLIP",
"link": 63
}
......@@ -270,6 +260,7 @@
"outputs": [
{
"name": "CONDITIONING",
"localized_name": "CONDITIONING",
"type": "CONDITIONING",
"links": [
81
......@@ -296,7 +287,7 @@
],
"size": [
315,
106
122
],
"flags": {},
"order": 0,
......@@ -305,6 +296,7 @@
"outputs": [
{
"name": "CLIP",
"localized_name": "CLIP",
"type": "CLIP",
"links": [
62,
......@@ -322,72 +314,6 @@
"default"
]
},
{
"id": 32,
"type": "VAELoader",
"pos": [
1303,
424
],
"size": [
315,
58
],
"flags": {},
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "VAE",
"type": "VAE",
"links": [
60,
82
],
"slot_index": 0
}
],
"properties": {
"Node name for S&R": "VAELoader"
},
"widgets_values": [
"ae.safetensors"
]
},
{
"id": 45,
"type": "SVDQuantFluxDiTLoader",
"pos": [
936.3029174804688,
-113.06819915771484
],
"size": [
315,
82
],
"flags": {},
"order": 2,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"links": [
102
],
"slot_index": 0
}
],
"properties": {
"Node name for S&R": "SVDQuantFluxDiTLoader"
},
"widgets_values": [
"mit-han-lab/svdq-int4-flux.1-fill-dev",
0
]
},
{
"id": 58,
"type": "ImageAndMaskResizeNode",
......@@ -405,11 +331,13 @@
"inputs": [
{
"name": "image",
"localized_name": "image",
"type": "IMAGE",
"link": 105
},
{
"name": "mask",
"localized_name": "mask",
"type": "MASK",
"link": 106
}
......@@ -417,6 +345,7 @@
"outputs": [
{
"name": "image",
"localized_name": "image",
"type": "IMAGE",
"links": [
107
......@@ -425,6 +354,7 @@
},
{
"name": "mask",
"localized_name": "mask",
"type": "MASK",
"links": [
108
......@@ -460,6 +390,7 @@
"inputs": [
{
"name": "clip",
"localized_name": "clip",
"type": "CLIP",
"link": 62
}
......@@ -467,6 +398,7 @@
"outputs": [
{
"name": "CONDITIONING",
"localized_name": "CONDITIONING",
"type": "CONDITIONING",
"links": [
41
......@@ -496,7 +428,7 @@
132.3040771484375
],
"flags": {},
"order": 3,
"order": 2,
"mode": 0,
"inputs": [],
"outputs": [],
......@@ -509,6 +441,41 @@
"color": "#432",
"bgcolor": "#653"
},
{
"id": 45,
"type": "SVDQuantFluxDiTLoader",
"pos": [
936.3029174804688,
-113.06819915771484
],
"size": [
315,
106
],
"flags": {},
"order": 3,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "MODEL",
"localized_name": "MODEL",
"type": "MODEL",
"links": [
102
],
"slot_index": 0
}
],
"properties": {
"Node name for S&R": "SVDQuantFluxDiTLoader"
},
"widgets_values": [
"mit-han-lab/svdq-int4-flux.1-fill-dev",
"disable",
0
]
},
{
"id": 17,
"type": "LoadImage",
......@@ -529,30 +496,96 @@
"outputs": [
{
"name": "IMAGE",
"localized_name": "IMAGE",
"type": "IMAGE",
"shape": 3,
"links": [
105
],
"slot_index": 0,
"shape": 3
"slot_index": 0
},
{
"name": "MASK",
"localized_name": "MASK",
"type": "MASK",
"shape": 3,
"links": [
106
],
"slot_index": 1,
"shape": 3
"slot_index": 1
}
],
"properties": {
"Node name for S&R": "LoadImage"
},
"widgets_values": [
"clipspace/clipspace-mask-8389612.599999994.png [input]",
"clipspace/clipspace-mask-331829.799999997.png [input]",
"image"
]
},
{
"id": 32,
"type": "VAELoader",
"pos": [
953.8762817382812,
440.3467102050781
],
"size": [
315,
58
],
"flags": {},
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "VAE",
"localized_name": "VAE",
"type": "VAE",
"links": [
60,
82
],
"slot_index": 0
}
],
"properties": {
"Node name for S&R": "VAELoader"
},
"widgets_values": [
"ae.safetensors"
]
},
{
"id": 9,
"type": "SaveImage",
"pos": [
1862.43359375,
96.36107635498047
],
"size": [
828.9535522460938,
893.8475341796875
],
"flags": {},
"order": 12,
"mode": 0,
"inputs": [
{
"name": "images",
"localized_name": "images",
"type": "IMAGE",
"link": 95
}
],
"outputs": [],
"properties": {
"Node name for S&R": "SaveImage"
},
"widgets_values": [
"ComfyUI"
]
}
],
"links": [
......@@ -697,14 +730,14 @@
"config": {},
"extra": {
"ds": {
"scale": 0.8390545288824038,
"scale": 1.7985878990921451,
"offset": [
815.2093059315082,
185.9955477896796
-287.8887097712823,
208.1745856210748
]
},
"node_versions": {
"comfy-core": "0.3.14",
"comfy-core": "0.3.24",
"comfyui-inpainteasy": "1.0.2"
}
},
......
......@@ -14,16 +14,18 @@
46
],
"flags": {},
"order": 12,
"order": 10,
"mode": 0,
"inputs": [
{
"name": "samples",
"localized_name": "samples",
"type": "LATENT",
"link": 24
},
{
"name": "vae",
"localized_name": "vae",
"type": "VAE",
"link": 12
}
......@@ -31,6 +33,7 @@
"outputs": [
{
"name": "IMAGE",
"localized_name": "IMAGE",
"type": "IMAGE",
"links": [
9
......@@ -55,12 +58,13 @@
106
],
"flags": {},
"order": 2,
"order": 0,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "LATENT",
"localized_name": "LATENT",
"type": "LATENT",
"links": [
23
......@@ -91,17 +95,18 @@
58
],
"flags": {},
"order": 3,
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "SAMPLER",
"localized_name": "SAMPLER",
"type": "SAMPLER",
"shape": 3,
"links": [
19
],
"shape": 3
]
}
],
"properties": {
......@@ -123,11 +128,12 @@
106
],
"flags": {},
"order": 8,
"order": 7,
"mode": 0,
"inputs": [
{
"name": "model",
"localized_name": "model",
"type": "MODEL",
"link": 41,
"slot_index": 0
......@@ -136,11 +142,12 @@
"outputs": [
{
"name": "SIGMAS",
"localized_name": "SIGMAS",
"type": "SIGMAS",
"shape": 3,
"links": [
20
],
"shape": 3
]
}
],
"properties": {
......@@ -152,31 +159,6 @@
1
]
},
{
"id": 27,
"type": "Note",
"pos": [
480,
960
],
"size": [
311.3529052734375,
131.16229248046875
],
"flags": {},
"order": 4,
"mode": 0,
"inputs": [],
"outputs": [],
"properties": {
"text": ""
},
"widgets_values": [
"The schnell model is a distilled model that can generate a good image with only 4 steps."
],
"color": "#432",
"bgcolor": "#653"
},
{
"id": 22,
"type": "BasicGuider",
......@@ -189,17 +171,19 @@
46
],
"flags": {},
"order": 10,
"order": 8,
"mode": 0,
"inputs": [
{
"name": "model",
"localized_name": "model",
"type": "MODEL",
"link": 42,
"slot_index": 0
},
{
"name": "conditioning",
"localized_name": "conditioning",
"type": "CONDITIONING",
"link": 40,
"slot_index": 1
......@@ -208,12 +192,13 @@
"outputs": [
{
"name": "GUIDER",
"localized_name": "GUIDER",
"type": "GUIDER",
"shape": 3,
"links": [
30
],
"slot_index": 0,
"shape": 3
"slot_index": 0
}
],
"properties": {
......@@ -233,11 +218,12 @@
164.31304931640625
],
"flags": {},
"order": 9,
"order": 6,
"mode": 0,
"inputs": [
{
"name": "clip",
"localized_name": "clip",
"type": "CLIP",
"link": 43
}
......@@ -245,6 +231,7 @@
"outputs": [
{
"name": "CONDITIONING",
"localized_name": "CONDITIONING",
"type": "CONDITIONING",
"links": [
40
......@@ -273,35 +260,40 @@
106
],
"flags": {},
"order": 11,
"order": 9,
"mode": 0,
"inputs": [
{
"name": "noise",
"localized_name": "noise",
"type": "NOISE",
"link": 37,
"slot_index": 0
},
{
"name": "guider",
"localized_name": "guider",
"type": "GUIDER",
"link": 30,
"slot_index": 1
},
{
"name": "sampler",
"localized_name": "sampler",
"type": "SAMPLER",
"link": 19,
"slot_index": 2
},
{
"name": "sigmas",
"localized_name": "sigmas",
"type": "SIGMAS",
"link": 20,
"slot_index": 3
},
{
"name": "latent_image",
"localized_name": "latent_image",
"type": "LATENT",
"link": 23,
"slot_index": 4
......@@ -310,18 +302,20 @@
"outputs": [
{
"name": "output",
"localized_name": "output",
"type": "LATENT",
"shape": 3,
"links": [
24
],
"slot_index": 0,
"shape": 3
"slot_index": 0
},
{
"name": "denoised_output",
"localized_name": "denoised_output",
"type": "LATENT",
"links": null,
"shape": 3
"shape": 3,
"links": null
}
],
"properties": {
......@@ -341,11 +335,12 @@
1060.3828125
],
"flags": {},
"order": 13,
"order": 11,
"mode": 0,
"inputs": [
{
"name": "images",
"localized_name": "images",
"type": "IMAGE",
"link": 9
}
......@@ -368,17 +363,18 @@
82
],
"flags": {},
"order": 6,
"order": 2,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "NOISE",
"localized_name": "NOISE",
"type": "NOISE",
"shape": 3,
"links": [
37
],
"shape": 3
]
}
],
"properties": {
......@@ -400,10 +396,10 @@
],
"size": [
352.79998779296875,
130
178
],
"flags": {},
"order": 7,
"order": 3,
"mode": 0,
"inputs": [
{
......@@ -415,6 +411,7 @@
"outputs": [
{
"name": "CLIP",
"localized_name": "CLIP",
"type": "CLIP",
"links": [
43
......@@ -429,77 +426,57 @@
"flux",
"t5xxl_fp16.safetensors",
"clip_l.safetensors",
512
512,
"BF16",
"mit-han-lab/svdq-flux.1-t5"
]
},
{
"id": 10,
"type": "VAELoader",
"id": 28,
"type": "SVDQuantFluxDiTLoader",
"pos": [
-31.617252349853516,
377.54791259765625
-10.846628189086914,
890.9998779296875
],
"size": [
315,
58
106
],
"flags": {},
"order": 0,
"order": 4,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "VAE",
"type": "VAE",
"name": "MODEL",
"localized_name": "MODEL",
"type": "MODEL",
"links": [
12
41,
42
],
"slot_index": 0,
"shape": 3
"slot_index": 0
}
],
"properties": {
"Node name for S&R": "VAELoader"
"Node name for S&R": "SVDQuantFluxDiTLoader"
},
"widgets_values": [
"ae.safetensors"
"mit-han-lab/svdq-int4-flux.1-schnell",
"disable",
0
]
},
{
"id": 26,
"type": "Note",
"pos": [
-28.286691665649414,
511.4660339355469
],
"size": [
336,
288
],
"flags": {},
"order": 1,
"mode": 0,
"inputs": [],
"outputs": [],
"properties": {
"text": ""
},
"widgets_values": [
"If you get an error in any of the nodes above make sure the files are in the correct directories.\n\nSee the top of the examples page for the links : https://comfyanonymous.github.io/ComfyUI_examples/flux/\n\nflux1-schnell.safetensors goes in: ComfyUI/models/unet/\n\nt5xxl_fp16.safetensors and clip_l.safetensors go in: ComfyUI/models/clip/\n\nae.safetensors goes in: ComfyUI/models/vae/\n\n\nTip: You can set the weight_dtype above to one of the fp8 types if you have memory issues."
],
"color": "#432",
"bgcolor": "#653"
},
{
"id": 28,
"type": "SVDQuantFluxDiTLoader",
"id": 10,
"type": "VAELoader",
"pos": [
-10.846628189086914,
890.9998779296875
874.65625,
480.88372802734375
],
"size": [
315,
82
58
],
"flags": {},
"order": 5,
......@@ -507,21 +484,21 @@
"inputs": [],
"outputs": [
{
"name": "MODEL",
"type": "MODEL",
"name": "VAE",
"localized_name": "VAE",
"type": "VAE",
"shape": 3,
"links": [
41,
42
12
],
"slot_index": 0
}
],
"properties": {
"Node name for S&R": "SVDQuantFluxDiTLoader"
"Node name for S&R": "VAELoader"
},
"widgets_values": [
"mit-han-lab/svdq-int4-flux.1-schnell",
0
"ae.safetensors"
]
}
],
......@@ -627,11 +604,14 @@
"config": {},
"extra": {
"ds": {
"scale": 0.6727499949325652,
"scale": 1.1167815779424761,
"offset": [
405.6825017392191,
29.738440474209906
874.5548427683093,
429.12540214017235
]
},
"node_versions": {
"comfy-core": "0.3.24"
}
},
"version": 0.4
......
import torch
from diffusers import FluxPipeline
from nunchaku.models.transformer_flux import NunchakuFluxTransformer2dModel
from nunchaku import NunchakuFluxTransformer2dModel
transformer = NunchakuFluxTransformer2dModel.from_pretrained("mit-han-lab/svdq-fp4-flux.1-dev", precision="fp4")
pipeline = FluxPipeline.from_pretrained(
......
import torch
from diffusers import FluxPipeline
from nunchaku.models.transformer_flux import NunchakuFluxTransformer2dModel
from nunchaku import NunchakuFluxTransformer2dModel
transformer = NunchakuFluxTransformer2dModel.from_pretrained("mit-han-lab/svdq-fp4-flux.1-schnell", precision="fp4")
pipeline = FluxPipeline.from_pretrained(
......