bitsandbytes is a quantization library that supports 4-bit and 8-bit quantization. Quantization reduces a model's size relative to its native full-precision version, making it easier to fit large models onto GPUs with limited memory.
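To get a feel for the savings, here is a rough back-of-the-envelope sketch of weight memory at different precisions. The 7-billion-parameter count is a hypothetical example, and `weight_memory_gib` is an illustrative helper, not part of bitsandbytes. The estimate covers weights only and ignores quantization constants, activations, and any layers kept in higher precision, so real usage will be somewhat higher.

```python
def weight_memory_gib(num_params: int, bits_per_param: int) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 1024**3


# Hypothetical model size, chosen only for illustration.
NUM_PARAMS = 7_000_000_000

for label, bits in [("float32", 32), ("float16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label:>8}: {weight_memory_gib(NUM_PARAMS, bits):6.2f} GiB")
```

Halving the bits per parameter halves the estimate, which is why a 4-bit model needs roughly a quarter of the weight memory of its float16 counterpart.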
Make sure you have bitsandbytes and 🤗 Accelerate installed: