- Double quantization routines for 4-bit quantization
- Paged optimizers for Adam and Lion.
- bfloat16 gradient / weight support for Adam and Lion with 8 or 32-bit states.
Bug fixes:
- Fixed a bug where 8-bit models consumed twice as much memory as expected after serialization
Deprecated:
- Kepler binaries (GTX 700s and Tesla K40/K80) are no longer provided via pip and need to be compiled from source. Kepler support might be fully removed in the future.
@@ -33,3 +33,8 @@ You can set `CUDA_HOME` to `/usr/local/cuda-11.7`. For example, you might be abl
...
If you have problems compiling the library from source with these instructions, please open an issue.
## Compilation with Kepler
Since 0.39.1, bitsandbytes installed via pip no longer provides Kepler binaries; these need to be compiled from source. Follow the steps above, but use `cuda11x_nomatmul_kepler` instead of `cuda11x_nomatmul` etc.
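
As a rough sketch, a Kepler build might look like the following. It assumes CUDA 11.7 is installed at `/usr/local/cuda-11.7` and that the build uses the `CUDA_VERSION=... make <target>` form; both are assumptions, so adjust them to match your toolkit and the steps above.

```shell
# Assumption: CUDA 11.7 toolkit at /usr/local/cuda-11.7; change both values for other versions.
# The Kepler-specific target replaces the regular cuda11x_nomatmul target.
CUDA_HOME=/usr/local/cuda-11.7 CUDA_VERSION=117 make cuda11x_nomatmul_kepler

# Install the freshly compiled library into the current Python environment.
python setup.py install
```

Run these commands from the root of the bitsandbytes source checkout.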