This interactive Gradio application can generate an image based on your provided text prompt.
```
python run_gradio.py
```
* By default, the Gemma-2B model is loaded as a safety checker. To disable this feature and save GPU memory, use `--no-safety-checker`.
* By default, only the INT4 DiT is loaded. Use `-p int4 bf16` to add a BF16 DiT for side-by-side comparison, or `-p bf16` to load only the BF16 model.
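For example, to load both the INT4 and BF16 DiTs for side-by-side comparison while disabling the safety checker, the flags above can be combined (a usage sketch; omit any flag to keep its default):

```
python run_gradio.py -p int4 bf16 --no-safety-checker
```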
## Command Line Inference
...
...
We provide a script, [generate.py](generate.py), that generates an image from a text prompt:
```
python generate.py --prompt "Your Text Prompt"
```
* The generated image will be saved as `output.png` by default. You can specify a different path using the `-o` or `--output-path` options.
* By default, the script uses our INT4 model. To use the BF16 model instead, specify `-p bf16`.
* You can adjust the number of inference steps and classifier-free guidance scale with `-t` and `-g`, respectively. The defaults are 20 steps and a guidance scale of 5.
* In addition to the classifier-free guidance, you can also adjust the [PAG guidance](https://arxiv.org/abs/2403.17377) scale with `--pag-scale`. The default is 2.
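For example, the options above can be combined in a single invocation (the prompt, output path, and values here are only placeholders; any omitted flag falls back to its default):

```
python generate.py --prompt "Your Text Prompt" -o my_image.png -p bf16 -t 30 -g 4.5 --pag-scale 1.5
```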
## Latency Benchmark
...
...
To measure the latency of our INT4 models, use the following command:
```
python latency.py
```
* Adjust the number of inference steps and the guidance scale using `-t` and `-g`, respectively. The defaults are 20 steps and a guidance scale of 5.
* You can also adjust the [PAG guidance](https://arxiv.org/abs/2403.17377) scale with `--pag-scale`. The default is 2.
* By default, the script measures the end-to-end latency for generating a single image. To measure the latency of a single DiT forward step instead, use the `--mode step` flag.
* Specify the number of warmup and test runs using `--warmup-times` and `--test-times`. The defaults are 2 warmup runs and 10 test runs.
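For instance, to time a single DiT forward step with a longer measurement run (the values here are illustrative; defaults apply to any omitted flag):

```
python latency.py --mode step --warmup-times 5 --test-times 50
```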
* **You only installed the ComfyUI plugin (`ComfyUI-nunchaku`) but not the core `nunchaku` library.** Please follow the [installation instructions in our README](https://github.com/mit-han-lab/nunchaku?tab=readme-ov-file#installation) to install the correct version of the `nunchaku` library.
* **You installed `nunchaku` using `pip install nunchaku`, but this is the wrong package.**
The `nunchaku` name on PyPI is already taken by an unrelated project. Please uninstall the incorrect package and follow our [installation guide](https://github.com/mit-han-lab/nunchaku?tab=readme-ov-file#installation) to install the correct version.
* **(MOST LIKELY) You installed `nunchaku` correctly, but into the wrong Python environment.**
If you're using the ComfyUI portable package, its Python interpreter is very likely not the system default. To identify the correct Python path, launch ComfyUI and check the first few lines of the log. For example, you will see something like
```text
...
...
Please also check the following common causes:
* **You have a folder named `nunchaku` in your working directory.**
Python may mistakenly load from that local folder instead of the installed library. Also, make sure your plugin folder under `custom_nodes` is named `ComfyUI-nunchaku`, not `nunchaku`.
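To check which interpreter and which copy of `nunchaku` Python is actually picking up, you can print both paths (standard Python introspection; replace `python` with ComfyUI's own interpreter if you use the portable package):

```
python -c "import sys, nunchaku; print(sys.executable); print(nunchaku.__file__)"
```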
This error is typically due to using the wrong model for your GPU.
- If you're using a **Blackwell GPU (e.g., RTX 50-series)**, please use our **FP4** models.
- For all other GPUs, use our **INT4** models.
### ❗ System crash or blue screen (e.g., mit-han-lab/nunchaku#57)
We have observed some cases where memory is not properly released after image generation, especially when using ComfyUI. This may lead to system instability or crashes.
We’re actively investigating this issue. If you have experience or insights into memory management in ComfyUI, we would appreciate your help!
### ❗ Out of Memory or Slow Model Loading (e.g., mit-han-lab/nunchaku#249, mit-han-lab/nunchaku#311, mit-han-lab/nunchaku#276)
Try upgrading your CUDA driver and setting the environment variable `NUNCHAKU_LOAD_METHOD` to either `READ` or `READNOPIN`.
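For example, you can set the variable before launching your workflow (`READ` is just one of the two documented values; use the Windows syntax if you run the portable package there):

```
# Linux / macOS
export NUNCHAKU_LOAD_METHOD=READ
# Windows (Command Prompt)
set NUNCHAKU_LOAD_METHOD=READ
```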
### ❗ Same Seeds Produce Slightly Different Images (e.g., mit-han-lab/nunchaku#229, mit-han-lab/nunchaku#294)
This behavior is due to minor precision noise introduced by the GPU’s accumulation order. Because modern GPUs execute operations out of order for better performance, small variations in output can occur, even with the same seed.
Enforcing strict accumulation order would reduce this variability but significantly hurt performance, so we do not plan to change this behavior.
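As a minimal illustration of why accumulation order matters, floating-point addition is not associative, so summing the same numbers in a different order can change the result:

```
python -c "a, b, c = 0.1, 1e16, -1e16; print((a + b) + c, a + (b + c))"
```

The two orderings print `0.0` and `0.1`.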
### ❓ PuLID Support (e.g., mit-han-lab/nunchaku#258)
PuLID support is currently in development and will be included in the next major release.
- (Optional) Step 5: Building the wheel for portable Python
If building directly with portable Python fails, you can first build the wheel in a working Conda environment, then install the `.whl` file using your portable Python:
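A sketch of that workflow, assuming a standard Python wheel build; the wheel filename and the portable-Python path below are placeholders for your own:

```
# In a working Conda environment, from the nunchaku source directory:
pip install build
python -m build --wheel
# Then install the resulting wheel with the ComfyUI portable Python:
path\to\python_embeded\python.exe -m pip install dist\nunchaku-<version>.whl
```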
Alternatively, install using [ComfyUI-Manager](https://github.com/Comfy-Org/ComfyUI-Manager).
**Standard FLUX.1-dev Models**
Start by downloading the standard [FLUX.1-dev text encoders](https://huggingface.co/comfyanonymous/flux_text_encoders/tree/main) and [VAE](https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/ae.safetensors). You can also optionally download the original [BF16 FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/flux1-dev.safetensors) model. An example command:
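One possible form of such a command, assuming the default ComfyUI directory layout (`models/text_encoders` and `models/vae`) and the file names listed on the linked Hugging Face pages; FLUX.1-dev is gated, so you may need to run `huggingface-cli login` first:

```
huggingface-cli download comfyanonymous/flux_text_encoders clip_l.safetensors t5xxl_fp16.safetensors --local-dir models/text_encoders
huggingface-cli download black-forest-labs/FLUX.1-dev ae.safetensors --local-dir models/vae
```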
You can test with some sample LoRAs like [FLUX.1-Turbo](https://huggingface.co/alimama-creative/FLUX.1-Turbo-Alpha/blob/main/diffusion_pytorch_model.safetensors) and [Ghibsky](https://huggingface.co/aleksa-codes/flux-ghibsky-illustration/blob/main/lora.safetensors). Place these files in the `models/loras` directory:
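For instance, with `huggingface-cli` (paths again assume the default ComfyUI layout):

```
huggingface-cli download alimama-creative/FLUX.1-Turbo-Alpha diffusion_pytorch_model.safetensors --local-dir models/loras
huggingface-cli download aleksa-codes/flux-ghibsky-illustration lora.safetensors --local-dir models/loras
```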