# Installation Guide
Welcome to the installation guide for the `bitsandbytes` library! This document provides step-by-step instructions to install `bitsandbytes` across various platforms and hardware configurations. The library primarily supports CUDA-based GPUs, but the team is actively working on enabling support for additional backends like CPU, AMD ROCm, Intel XPU, and Gaudi HPU.
## Table of Contents
- [CUDA](#cuda)
- [Installation via PyPI](#cuda-pip)
- [Compile from Source](#cuda-compile)
- [Preview Wheels from `main`](#cuda-preview)
- [Multi-Backend Preview](#multi-backend)
- [Supported Backends](#multi-backend-supported-backends)
- [Pre-requisites](#multi-backend-pre-requisites)
- [Installation](#multi-backend-pip)
- [Compile from Source](#multi-backend-compile)
## CUDA[[cuda]]
`bitsandbytes` is currently supported on NVIDIA GPUs with [Compute Capability](https://developer.nvidia.com/cuda-gpus) 5.0+.
The library can be built using CUDA Toolkit versions as old as **11.6** on Windows and **11.4** on Linux.
| **Feature** | **CC Required** | **Example Hardware Requirement** |
|---------------------------------|-----------------|---------------------------------------------|
| LLM.int8() | 7.5+ | Turing (RTX 20 series, T4) or newer GPUs |
| 8-bit optimizers/quantization | 5.0+ | Maxwell (GTX 900 series, TITAN X, M40) or newer GPUs |
| NF4/FP4 quantization | 5.0+ | Maxwell (GTX 900 series, TITAN X, M40) or newer GPUs |
> [!WARNING]
> Support for Maxwell GPUs is deprecated and will be removed in a future release. For the best results, a Turing generation device or newer is recommended.
### Installation via PyPI[[cuda-pip]]
This is the most straightforward and recommended installation option.
The currently distributed `bitsandbytes` packages are built with the following configurations:
| **OS** | **CUDA Toolkit** | **Host Compiler** | **Targets**
|--------------------|------------------|----------------------|--------------
| **Linux x86-64** | 11.8 - 12.6 | GCC 11.2 | sm50, sm60, sm75, sm80, sm86, sm89, sm90
| **Linux x86-64** | 12.8 | GCC 11.2 | sm75, sm80, sm86, sm89, sm90, sm100, sm120
| **Linux aarch64** | 11.8 - 12.6 | GCC 11.2 | sm75, sm80, sm90
| **Linux aarch64** | 12.8 | GCC 11.2 | sm75, sm80, sm90, sm100
| **Windows x86-64** | 11.8 - 12.6 | MSVC 19.43+ (VS2022) | sm50, sm60, sm75, sm80, sm86, sm89, sm90
| **Windows x86-64** | 12.8 | MSVC 19.43+ (VS2022) | sm75, sm80, sm86, sm89, sm90, sm100, sm120
Use `pip` or `uv` to install:
```bash
pip install bitsandbytes
```
### Compile from source[[cuda-compile]]
> [!TIP]
> Don't hesitate to compile from source! The process is pretty straight forward and resilient. This might be needed for older CUDA Toolkit versions or Linux distributions, or other less common configurations.
For Linux and Windows systems, compiling from source allows you to customize the build configurations. See below for detailed platform-specific instructions (see the `CMakeLists.txt` if you want to check the specifics and explore some additional options):
To compile from source, you need CMake >= **3.22.1** and Python >= **3.9** installed. Make sure you have a compiler installed to compile C++ (`gcc`, `make`, headers, etc.). It is recommended to use GCC 9 or newer.
For example, to install a compiler and CMake on Ubuntu:
```bash
apt-get install -y build-essential cmake
```
You should also install CUDA Toolkit by following the [NVIDIA CUDA Installation Guide for Linux](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html) guide. The current minimum supported CUDA Toolkit version that we test with is **11.8**.
```bash
git clone https://github.com/bitsandbytes-foundation/bitsandbytes.git && cd bitsandbytes/
cmake -DCOMPUTE_BACKEND=cuda -S .
make
pip install -e . # `-e` for "editable" install, when developing BNB (otherwise leave that out)
```
> [!TIP]
> If you have multiple versions of the CUDA Toolkit installed or it is in a non-standard location, please refer to CMake CUDA documentation for how to configure the CUDA compiler.
Compilation from source on Windows systems require Visual Studio with C++ support as well as an installation of the CUDA Toolkit.
To compile from source, you need CMake >= **3.22.1** and Python >= **3.9** installed. You should also install CUDA Toolkit by following the [CUDA Installation Guide for Windows](https://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html) guide from NVIDIA. The current minimum supported CUDA Toolkit version that we test with is **11.8**.
```bash
git clone https://github.com/bitsandbytes-foundation/bitsandbytes.git && cd bitsandbytes/
cmake -DCOMPUTE_BACKEND=cuda -S .
cmake --build . --config Release
pip install -e . # `-e` for "editable" install, when developing BNB (otherwise leave that out)
```
Big thanks to [wkpark](https://github.com/wkpark), [Jamezo97](https://github.com/Jamezo97), [rickardp](https://github.com/rickardp), [akx](https://github.com/akx) for their amazing contributions to make bitsandbytes compatible with Windows.
### Preview Wheels from `main`[[cuda-preview]]
If you would like to use new features even before they are officially released and help us test them, feel free to install the wheel directly from our CI (*the wheel links will remain stable!*):
```bash
# Note: if you don't want to reinstall our dependencies, append the `--no-deps` flag!
# x86_64 (most users)
pip install --force-reinstall https://github.com/bitsandbytes-foundation/bitsandbytes/releases/download/continuous-release_main/bitsandbytes-1.33.7.preview-py3-none-manylinux_2_24_x86_64.whl
# ARM/aarch64
pip install --force-reinstall https://github.com/bitsandbytes-foundation/bitsandbytes/releases/download/continuous-release_main/bitsandbytes-1.33.7.preview-py3-none-manylinux_2_24_aarch64.whl
```
```bash
# Note: if you don't want to reinstall our dependencies, append the `--no-deps` flag!
pip install --force-reinstall https://github.com/bitsandbytes-foundation/bitsandbytes/releases/download/continuous-release_main/bitsandbytes-1.33.7.preview-py3-none-win_amd64.whl
```
## Multi-Backend Preview[[multi-backend]]
> [!WARNING]
> This functionality existed as an early technical preview and is not recommended for production use. We are in the process of upstreaming improved support for AMD and Intel hardware into the main project.
We provide an early preview of support for AMD and Intel hardware as part of a development branch.
### Supported Backends[[multi-backend-supported-backends]]
| **Backend** | **Supported Versions** | **Python versions** | **Architecture Support** | **Status** |
|-------------|------------------------|---------------------------|-------------------------|------------|
| **AMD ROCm** | 6.1+ | 3.10+ | minimum CDNA - `gfx90a`, RDNA - `gfx1100` | Alpha |
| **Intel CPU** | v2.4.0+ | 3.10+ | Intel CPU | Alpha |
| **Intel GPU** | v2.7.0+ | 3.10+ | Intel GPU | Experimental |
| **Ascend NPU** | 2.1.0+ (`torch_npu`) | 3.10+ | Ascend NPU | Experimental |
For each supported backend, follow the respective instructions below:
### Pre-requisites[[multi-backend-pre-requisites]]
To use this preview version of `bitsandbytes` with `transformers`, be sure to install:
```bash
pip install "transformers>=4.45.1"
```
> [!WARNING]
> Pre-compiled binaries are only built for ROCm versions `6.1.2`/`6.2.4`/`6.3.2` and `gfx90a`, `gfx942`, `gfx1100` GPU architectures. [Find the pip install instructions here](#multi-backend-pip).
>
> Other supported versions that don't come with pre-compiled binaries [can be compiled for with these instructions](#multi-backend-compile).
>
> **Windows is not supported for the ROCm backend**
> [!TIP]
> If you would like to install ROCm and PyTorch on bare metal, skip the Docker steps and refer to ROCm's official guides at [ROCm installation overview](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/tutorial/install-overview.html#rocm-install-overview) and [Installing PyTorch for ROCm](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/3rd-party/pytorch-install.html#using-wheels-package) (Step 3 of wheels build for quick installation). Special note: please make sure to get the respective ROCm-specific PyTorch wheel for the installed ROCm version, e.g. `https://download.pytorch.org/whl/nightly/rocm6.2/`!
```bash
# Create a docker container with the ROCm image, which includes ROCm libraries
docker pull rocm/dev-ubuntu-22.04:6.3.4-complete
docker run -it --device=/dev/kfd --device=/dev/dri --group-add video rocm/dev-ubuntu-22.04:6.3.4-complete
apt-get update && apt-get install -y git && cd home
# Install pytorch compatible with above ROCm version
pip install torch --index-url https://download.pytorch.org/whl/rocm6.3/
```
* A compatible PyTorch version with Intel XPU support is required. It is recommended to use the latest stable release. See [Getting Started on Intel GPU](https://docs.pytorch.org/docs/stable/notes/get_start_xpu.html) for guidance.
### Installation
You can install the pre-built wheels for each backend, or compile from source for custom configurations.
#### Pre-built Wheel Installation (recommended)[[multi-backend-pip]]
This wheel provides support for ROCm and Intel XPU platforms.
```
# Note, if you don't want to reinstall our dependencies, append the `--no-deps` flag!
pip install --force-reinstall 'https://github.com/bitsandbytes-foundation/bitsandbytes/releases/download/continuous-release_multi-backend-refactor/bitsandbytes-0.44.1.dev0-py3-none-manylinux_2_24_x86_64.whl'
```
This wheel provides support for the Intel XPU platform.
```bash
# Note, if you don't want to reinstall our dependencies, append the `--no-deps` flag!
pip install --force-reinstall 'https://github.com/bitsandbytes-foundation/bitsandbytes/releases/download/continuous-release_multi-backend-refactor/bitsandbytes-0.44.1.dev0-py3-none-win_amd64.whl'
```
#### Compile from Source[[multi-backend-compile]]
#### AMD GPU
bitsandbytes is supported from ROCm 6.1 - ROCm 6.4.
```bash
# Install bitsandbytes from source
# Clone bitsandbytes repo, ROCm backend is currently enabled on multi-backend-refactor branch
git clone -b multi-backend-refactor https://github.com/bitsandbytes-foundation/bitsandbytes.git && cd bitsandbytes/
# Compile & install
apt-get install -y build-essential cmake # install build tools dependencies, unless present
cmake -DCOMPUTE_BACKEND=hip -S . # Use -DBNB_ROCM_ARCH="gfx90a;gfx942" to target specific gpu arch
make
pip install -e . # `-e` for "editable" install, when developing BNB (otherwise leave that out)
```
#### Intel CPU + GPU(XPU)
CPU needs to build CPU C++ codes, while XPU needs to build sycl codes.
Run `export bnb_device=xpu` if you are using xpu, run `export bnb_device=cpu` if you are using cpu.
```
git clone https://github.com/bitsandbytes-foundation/bitsandbytes.git && cd bitsandbytes/
cmake -DCOMPUTE_BACKEND=$bnb_device -S .
make
pip install -e .
```
#### Ascend NPU
Please refer to [the official Ascend installations instructions](https://www.hiascend.com/document/detail/zh/Pytorch/60RC3/configandinstg/instg/insg_0001.html) for guidance on how to install the necessary `torch_npu` dependency.
```bash
# Install bitsandbytes from source
# Clone bitsandbytes repo, Ascend NPU backend is currently enabled on multi-backend-refactor branch
git clone -b multi-backend-refactor https://github.com/bitsandbytes-foundation/bitsandbytes.git && cd bitsandbytes/
# Compile & install
apt-get install -y build-essential cmake # install build tools dependencies, unless present
cmake -DCOMPUTE_BACKEND=npu -S .
make
pip install -e . # `-e` for "editable" install, when developing BNB (otherwise leave that out)
```