Commit 998192ca authored by Muyang Li, committed by muyangli

Merge pull request #70 from mit-han-lab/dev/muyang

Ready to release v0.2.0
parent 44ae975c
@@ -4,13 +4,15 @@
<h3 align="center">
<a href="http://arxiv.org/abs/2411.05007"><b>Paper</b></a> | <a href="https://hanlab.mit.edu/projects/svdquant"><b>Website</b></a> | <a href="https://hanlab.mit.edu/blog/svdquant"><b>Blog</b></a> | <a href="https://svdquant.mit.edu"><b>Demo</b></a> | <a href="https://huggingface.co/collections/mit-han-lab/svdquant-67493c2c2e62a1fc6e93f45c"><b>HuggingFace</b></a> | <a href="https://modelscope.cn/collections/svdquant-468e8f780c2641"><b>ModelScope</b></a> | <a href="https://github.com/mit-han-lab/ComfyUI-nunchaku"><b>ComfyUI</b></a>
</h3>

**Nunchaku** is a high-performance inference engine optimized for 4-bit neural networks, as introduced in our paper [SVDQuant](http://arxiv.org/abs/2411.05007). For the underlying quantization library, check out [DeepCompressor](https://github.com/mit-han-lab/deepcompressor).

Join our user groups on [**Slack**](https://join.slack.com/t/nunchaku/shared_invite/zt-3170agzoz-NgZzWaTrEj~n2KEV3Hpl5Q) and [**WeChat**](./assets/wechat.jpg) to engage in discussions with the community! More details can be found [here](https://github.com/mit-han-lab/nunchaku/issues/149). If you have any questions, run into issues, or are interested in contributing, don't hesitate to reach out!

## News
- **[2025-04-05]** 🚀 **Nunchaku v0.2.0 released!** This release brings [**multi-LoRA**](examples/flux.1-dev-multiple-lora.py) and [**ControlNet**](examples/flux.1-dev-controlnet-union-pro.py) support with even faster performance powered by [**FP16 attention**](#fp16-attention) and [**First-Block Cache**](#first-block-cache). We've also added compatibility for [**20-series GPUs**](examples/flux.1-dev-turing.py) — Nunchaku is now more accessible than ever!
- **[2025-03-17]** 🚀 Released the NVFP4 4-bit [Shuttle-Jaguar](https://huggingface.co/mit-han-lab/svdq-int4-shuttle-jaguar) and FLUX.1-tools models, and upgraded the INT4 FLUX.1-tools models. Download and update your models from our [HuggingFace](https://huggingface.co/collections/mit-han-lab/svdquant-67493c2c2e62a1fc6e93f45c) or [ModelScope](https://modelscope.cn/collections/svdquant-468e8f780c2641) collections!
- **[2025-03-13]** 📦 Separated the ComfyUI node into a [standalone repository](https://github.com/mit-han-lab/ComfyUI-nunchaku) for easier installation and released node v0.1.6! Plus, [4-bit Shuttle-Jaguar](https://huggingface.co/mit-han-lab/svdq-int4-shuttle-jaguar) is now fully supported!
- **[2025-03-07]** 🚀 **Nunchaku v0.1.4 released!** We've added support for a [4-bit text encoder and per-layer CPU offloading](#low-memory-inference), reducing FLUX's minimum memory requirement to just **4 GiB** while maintaining a **2–3× speedup**. This update also fixes various issues related to resolution, LoRA, pinned memory, and runtime stability. Check out the release notes for full details!
@@ -71,7 +73,7 @@ pip install torch==2.6 torchvision==0.21 torchaudio==2.6
```
#### Install nunchaku
Once PyTorch is installed, you can directly install `nunchaku` from our wheel repositories on [Hugging Face](https://huggingface.co/mit-han-lab/nunchaku/tree/main), [ModelScope](https://modelscope.cn/models/Lmxyy1999/nunchaku), or [GitHub releases](https://github.com/mit-han-lab/nunchaku/releases). Be sure to select the appropriate wheel for your Python and PyTorch versions. For example, for Python 3.11 and PyTorch 2.6:
```shell
pip install https://huggingface.co/mit-han-lab/nunchaku/resolve/main/nunchaku-0.2.0+torch2.6-cp311-cp311-linux_x86_64.whl
@@ -83,7 +85,7 @@ pip install https://huggingface.co/mit-han-lab/nunchaku/resolve/main/nunchaku-0.
**Note**:
* Make sure your CUDA version is **at least 12.2 on Linux** and **at least 12.6 on Windows**. If you're using a Blackwell GPU (e.g., a 50-series GPU), CUDA **12.8 or higher** is required. You can verify what your environment provides with the quick check below.
* For Windows users, please refer to [this issue](https://github.com/mit-han-lab/nunchaku/issues/6) for instructions, and upgrade your MSVC compiler to the latest version.
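
The snippet below reports the PyTorch build rather than the system CUDA toolkit, which is what matters when installing prebuilt wheels:

```python
import torch

print(torch.__version__)    # PyTorch version: pick the wheel that matches
print(torch.version.cuda)   # CUDA version this PyTorch build was compiled with
print(torch.cuda.get_device_capability(0))  # e.g. (7, 5) = Turing, (8, 6) = 30-series, (12, 0) = Blackwell
```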
@@ -141,7 +143,7 @@ pip install https://huggingface.co/mit-han-lab/nunchaku/resolve/main/nunchaku-0.
## Usage Example
In [examples](examples), we provide minimal scripts for running INT4 [FLUX.1](https://github.com/black-forest-labs/flux) and [SANA](https://github.com/NVlabs/Sana) models with Nunchaku. Nunchaku shares the same APIs as [diffusers](https://github.com/huggingface/diffusers) and can be used in much the same way. For example, the [script](examples/flux.1-dev.py) for [FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev) is as follows:
```python
import torch
@@ -159,35 +161,37 @@ image = pipeline("A cat holding a sign that says hello world", num_inference_ste
image.save(f"flux.1-dev-{precision}.png")
```
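
The hunk above folds the body of this script. Pieced together from the identical setup used in the example files added later in this commit, the full script reads roughly as follows:

```python
import torch
from diffusers import FluxPipeline

from nunchaku import NunchakuFluxTransformer2dModel
from nunchaku.utils import get_precision

precision = get_precision()  # auto-detects whether your GPU uses the 'int4' or 'fp4' model
transformer = NunchakuFluxTransformer2dModel.from_pretrained(f"mit-han-lab/svdq-{precision}-flux.1-dev")
pipeline = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", transformer=transformer, torch_dtype=torch.bfloat16
).to("cuda")
image = pipeline("A cat holding a sign that says hello world", num_inference_steps=50, guidance_scale=3.5).images[0]
image.save(f"flux.1-dev-{precision}.png")
```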
**Note**: If you're using a **Turing GPU (e.g., NVIDIA 20-series)**, make sure to set `torch_dtype=torch.float16` and use our `nunchaku-fp16` attention module as below. A complete example is available in [`examples/flux.1-dev-turing.py`](examples/flux.1-dev-turing.py).
### FP16 Attention
In addition to FlashAttention-2, Nunchaku introduces a custom FP16 attention implementation that achieves up to **1.2× faster performance** on NVIDIA 30-, 40-, and even 50-series GPUs—without loss in precision. To enable it, simply use:
```python
transformer.set_attention_impl("nunchaku-fp16")
```
See [`examples/flux.1-dev-fp16attn.py`](examples/flux.1-dev-fp16attn.py) for a complete example.
### First-Block Cache
Nunchaku supports [First-Block Cache](https://github.com/chengzeyi/ParaAttention?tab=readme-ov-file#first-block-cache-our-dynamic-caching) to accelerate long-step denoising. Enable it with:
```python
apply_cache_on_pipe(pipeline, residual_diff_threshold=0.12)
```
You can tune `residual_diff_threshold` to balance speed and quality: larger values yield faster inference at the cost of some quality. A recommended value is `0.12`, which provides up to **2× speedup** for 50-step denoising and **1.4× speedup** for 30-step denoising. See the full example in [`examples/flux.1-dev-cache.py`](examples/flux.1-dev-cache.py).
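
The snippet above omits setup. A minimal self-contained sketch, assuming `apply_cache_on_pipe` is importable from `nunchaku.caching.diffusers_adapters` as in the referenced example script:

```python
import torch
from diffusers import FluxPipeline

from nunchaku import NunchakuFluxTransformer2dModel
from nunchaku.caching.diffusers_adapters import apply_cache_on_pipe  # assumed import path
from nunchaku.utils import get_precision

precision = get_precision()  # auto-detects whether your GPU uses the 'int4' or 'fp4' model
transformer = NunchakuFluxTransformer2dModel.from_pretrained(f"mit-han-lab/svdq-{precision}-flux.1-dev")
pipeline = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", transformer=transformer, torch_dtype=torch.bfloat16
).to("cuda")
# Cache on the first transformer block and skip downstream blocks when its residual changes little
apply_cache_on_pipe(pipeline, residual_diff_threshold=0.12)
image = pipeline("A cat holding a sign that says hello world", num_inference_steps=50).images[0]
image.save(f"flux.1-dev-cache-{precision}.png")
```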
### CPU Offloading

To minimize GPU memory usage, Nunchaku supports CPU offloading—requiring as little as **4 GiB** of GPU memory. You can enable it by setting `offload=True` when initializing `NunchakuFluxTransformer2dModel` and then calling:

```python
pipeline.enable_sequential_cpu_offload()
```

For a complete example, refer to [`examples/flux.1-dev-offload.py`](examples/flux.1-dev-offload.py).
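
For reference, this is essentially the inline script the previous revision of this README carried, with its comments retained:

```python
import torch
from diffusers import FluxPipeline

from nunchaku import NunchakuFluxTransformer2dModel
from nunchaku.utils import get_precision

precision = get_precision()  # auto-detects whether your GPU uses the 'int4' or 'fp4' model
transformer = NunchakuFluxTransformer2dModel.from_pretrained(
    f"mit-han-lab/svdq-{precision}-flux.1-dev", offload=True
)  # set offload=False to disable offloading
pipeline = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", transformer=transformer, torch_dtype=torch.bfloat16
)  # no need to set the device here
pipeline.enable_sequential_cpu_offload()  # diffusers' offloading
image = pipeline("A cat holding a sign that says hello world", num_inference_steps=50, guidance_scale=3.5).images[0]
image.save(f"flux.1-dev-{precision}.png")
```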
## Customized LoRA
@@ -200,7 +204,7 @@
transformer.update_lora_params(path_to_your_lora)
transformer.set_lora_strength(lora_strength)
```
`path_to_your_lora` can also be a remote HuggingFace path. In [`examples/flux.1-dev-lora.py`](examples/flux.1-dev-lora.py), we provide a minimal example script for running the [Ghibsky](https://huggingface.co/aleksa-codes/flux-ghibsky-illustration) LoRA with SVDQuant's 4-bit FLUX.1-dev:
```python
import torch
@@ -230,8 +234,29 @@ image = pipeline(
image.save(f"flux.1-dev-ghibsky-{precision}.png")
```
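
The hunk above folds most of this script. Reassembled from the LoRA API shown earlier and the LoRA path used in the multi-LoRA example later in this commit, a sketch of the single-LoRA script (the prompt is abridged here):

```python
import torch
from diffusers import FluxPipeline

from nunchaku import NunchakuFluxTransformer2dModel
from nunchaku.utils import get_precision

precision = get_precision()  # auto-detects whether your GPU uses the 'int4' or 'fp4' model
transformer = NunchakuFluxTransformer2dModel.from_pretrained(f"mit-han-lab/svdq-{precision}-flux.1-dev")
pipeline = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", transformer=transformer, torch_dtype=torch.bfloat16
).to("cuda")
# Load the Ghibsky LoRA from its HuggingFace repo and set its strength
transformer.update_lora_params("aleksa-codes/flux-ghibsky-illustration/lora.safetensors")
transformer.set_lora_strength(1)
image = pipeline(
    "GHIBSKY style, cozy mountain cabin covered in snow, with smoke curling from the chimney",
    num_inference_steps=50,
    guidance_scale=3.5,
).images[0]
image.save(f"flux.1-dev-ghibsky-{precision}.png")
```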
To compose multiple LoRAs, use `nunchaku.lora.flux.compose.compose_lora`. The usage is:
```python
composed_lora = compose_lora(
[
("PATH_OR_STATE_DICT_OF_LORA1", lora_strength1),
("PATH_OR_STATE_DICT_OF_LORA2", lora_strength2),
# Add more LoRAs as needed
]
) # set your lora strengths here when using composed lora
transformer.update_lora_params(composed_lora)
```
You can specify individual strengths for each LoRA in the list. For a complete example, refer to [`examples/flux.1-dev-multiple-lora.py`](examples/flux.1-dev-multiple-lora.py).
**For ComfyUI users, you can directly use our LoRA loader; converted LoRAs are deprecated. Please refer to [mit-han-lab/ComfyUI-nunchaku](https://github.com/mit-han-lab/ComfyUI-nunchaku) for more details.**
## ControlNets
Nunchaku supports both the [FLUX.1-tools](https://blackforestlabs.ai/flux-1-tools/) and the [FLUX.1-dev-ControlNet-Union-Pro](https://huggingface.co/Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro) models. Example scripts can be found in the [`examples`](examples) directory.
![control](./assets/control.jpg)
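
The exact settings live in the example scripts. A minimal sketch, assuming diffusers' standard `FluxControlNetPipeline` API and a control image of your own (the file path and mode index below are illustrative):

```python
import torch
from diffusers import FluxControlNetModel, FluxControlNetPipeline
from diffusers.utils import load_image

from nunchaku import NunchakuFluxTransformer2dModel
from nunchaku.utils import get_precision

precision = get_precision()  # auto-detects whether your GPU uses the 'int4' or 'fp4' model
controlnet = FluxControlNetModel.from_pretrained(
    "Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro", torch_dtype=torch.bfloat16
)
transformer = NunchakuFluxTransformer2dModel.from_pretrained(f"mit-han-lab/svdq-{precision}-flux.1-dev")
pipeline = FluxControlNetPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", controlnet=controlnet, transformer=transformer, torch_dtype=torch.bfloat16
).to("cuda")
control_image = load_image("path/to/your/control_image.png")  # hypothetical input image
image = pipeline(
    "A cat holding a sign that says hello world",
    control_image=control_image,
    control_mode=0,  # union ControlNet mode index (0 = canny)
    controlnet_conditioning_scale=0.7,
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
image.save(f"flux.1-dev-controlnet-{precision}.png")
```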
## ComfyUI

Please refer to [mit-han-lab/ComfyUI-nunchaku](https://github.com/mit-han-lab/ComfyUI-nunchaku) for usage in [ComfyUI](https://github.com/comfyanonymous/ComfyUI).
...
# Example: FLUX.1-dev with Nunchaku's FP16 attention
# (the complete script referenced above as examples/flux.1-dev-fp16attn.py)
import torch
from diffusers import FluxPipeline

from nunchaku import NunchakuFluxTransformer2dModel
from nunchaku.utils import get_precision

precision = get_precision()  # auto-detects whether your GPU uses the 'int4' or 'fp4' model
transformer = NunchakuFluxTransformer2dModel.from_pretrained(f"mit-han-lab/svdq-{precision}-flux.1-dev")
transformer.set_attention_impl("nunchaku-fp16") # set attention implementation to fp16
pipeline = FluxPipeline.from_pretrained(
"black-forest-labs/FLUX.1-dev", transformer=transformer, torch_dtype=torch.bfloat16
).to("cuda")
image = pipeline(["A cat holding a sign that says hello world"], num_inference_steps=50).images[0]
image.save(f"flux.1-dev-fp16attn-{precision}.png")

# Example: composing multiple LoRAs on 4-bit FLUX.1-dev
# (the complete script referenced above as examples/flux.1-dev-multiple-lora.py)
import torch
from diffusers import FluxPipeline

from nunchaku import NunchakuFluxTransformer2dModel
from nunchaku.lora.flux.compose import compose_lora
from nunchaku.utils import get_precision

precision = get_precision()  # auto-detects whether your GPU uses the 'int4' or 'fp4' model
transformer = NunchakuFluxTransformer2dModel.from_pretrained(f"mit-han-lab/svdq-{precision}-flux.1-dev")
pipeline = FluxPipeline.from_pretrained(
"black-forest-labs/FLUX.1-dev", transformer=transformer, torch_dtype=torch.bfloat16
).to("cuda")
### LoRA Related Code ###
composed_lora = compose_lora(
[
("aleksa-codes/flux-ghibsky-illustration/lora.safetensors", 1),
("alimama-creative/FLUX.1-Turbo-Alpha/diffusion_pytorch_model.safetensors", 1),
]
) # set your lora strengths here when using composed lora
transformer.update_lora_params(composed_lora)
### End of LoRA Related Code ###
image = pipeline(
"GHIBSKY style, cozy mountain cabin covered in snow, with smoke curling from the chimney and a warm, inviting light spilling through the windows", # noqa: E501
num_inference_steps=8,
guidance_scale=3.5,
).images[0]
image.save(f"flux.1-dev-turbo-ghibsky-{precision}.png")

# Example: FLUX.1-dev on Turing (20-series) GPUs with FP16 and CPU offloading
# (the complete script referenced above as examples/flux.1-dev-turing.py)
import torch
from diffusers import FluxPipeline

from nunchaku import NunchakuFluxTransformer2dModel
from nunchaku.utils import get_precision

precision = get_precision()  # auto-detects whether your GPU uses the 'int4' or 'fp4' model
transformer = NunchakuFluxTransformer2dModel.from_pretrained(
f"mit-han-lab/svdq-{precision}-flux.1-dev",
offload=True,
torch_dtype=torch.float16, # Turing GPUs only support fp16 precision
) # set offload to False if you want to disable offloading
transformer.set_attention_impl("nunchaku-fp16") # Turing GPUs only support fp16 attention
pipeline = FluxPipeline.from_pretrained(
"black-forest-labs/FLUX.1-dev", transformer=transformer, torch_dtype=torch.float16
) # no need to set the device here
pipeline.enable_sequential_cpu_offload() # diffusers' offloading
image = pipeline("A cat holding a sign that says hello world", num_inference_steps=50, guidance_scale=3.5).images[0]
image.save(f"flux.1-dev-{precision}.png")
@@ -18,7 +18,12 @@ for python_version in "${python_versions[@]}"; do
done
done
bash scripts/build_linux_wheel_torch2.7_cu128.sh "3.10" "2.7" "12.8"
bash scripts/build_linux_wheel_torch2.7_cu128.sh "3.11" "2.7" "12.8"
bash scripts/build_linux_wheel_torch2.7_cu128.sh "3.12" "2.7" "12.8"
bash scripts/build_linux_wheel_torch2.7_cu128.sh "3.13" "2.7" "12.8"
bash scripts/build_linux_wheel_cu128.sh "3.10" "2.8" "12.8"
bash scripts/build_linux_wheel_cu128.sh "3.11" "2.8" "12.8"
bash scripts/build_linux_wheel_cu128.sh "3.12" "2.8" "12.8"
bash scripts/build_linux_wheel_cu128.sh "3.13" "2.8" "12.8"
\ No newline at end of file
@@ -20,10 +20,15 @@ for %%P in (%python_versions%) do (
)
)
call scripts\build_windows_wheel_cu128.cmd 3.10 2.7 12.8
call scripts\build_windows_wheel_cu128.cmd 3.11 2.7 12.8
call scripts\build_windows_wheel_cu128.cmd 3.12 2.7 12.8
call scripts\build_windows_wheel_cu128.cmd 3.13 2.7 12.8
call scripts\build_windows_wheel_cu128.cmd 3.10 2.8 12.8
call scripts\build_windows_wheel_cu128.cmd 3.11 2.8 12.8
call scripts\build_windows_wheel_cu128.cmd 3.12 2.8 12.8
call scripts\build_windows_wheel_cu128.cmd 3.13 2.8 12.8
echo All builds completed successfully!
exit /b 0
@@ -22,12 +22,10 @@ PYTHON_ROOT_PATH=/opt/python/cp${PYTHON_VERSION//.}-cp${PYTHON_VERSION//.}
docker run --rm \
-v "$(pwd)":/nunchaku \
pytorch/manylinux2_28-builder:cuda${CUDA_VERSION} \
bash -c "
cd /nunchaku && \
rm -rf build && \
gcc --version && g++ --version && \
${PYTHON_ROOT_PATH}/bin/pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128 && \
${PYTHON_ROOT_PATH}/bin/pip install build ninja wheel setuptools && \
...
#!/bin/bash
# Modified from https://github.com/sgl-project/sglang/blob/main/sgl-kernel/build.sh
set -ex
PYTHON_VERSION=$1
TORCH_VERSION=$2  # currently unused; the torch build is pinned below
CUDA_VERSION=$3
MAX_JOBS=${4:-} # optional
PYTHON_ROOT_PATH=/opt/python/cp${PYTHON_VERSION//.}-cp${PYTHON_VERSION//.}
docker run --rm \
-v "$(pwd)":/nunchaku \
pytorch/manylinux2_28-builder:cuda${CUDA_VERSION} \
bash -c "
cd /nunchaku && \
rm -rf build && \
gcc --version && g++ --version && \
${PYTHON_ROOT_PATH}/bin/pip install --pre torch==2.7.0.dev20250307+cu128 torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128 && \
${PYTHON_ROOT_PATH}/bin/pip install build ninja wheel setuptools && \
export NUNCHAKU_INSTALL_MODE=ALL && \
export NUNCHAKU_BUILD_WHEELS=1 && \
export MAX_JOBS=${MAX_JOBS} && \
${PYTHON_ROOT_PATH}/bin/python -m build --wheel --no-isolation
"
\ No newline at end of file
@@ -19,7 +19,12 @@ call conda activate %ENV_NAME%
:: install dependencies
call pip install ninja setuptools wheel build
if "%TORCH_VERSION%"=="2.7" (
call pip install --pre torch==2.7.0.dev20250307+cu128 torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128
) else (
call pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128
)
:: set environment variables
set NUNCHAKU_INSTALL_MODE=ALL
...