"projects/git@developer.sourcefind.cn:OpenDAS/pytorch3d.git" did not exist on "c83ec3555dc1399789f8157475cb5fef7e691575"
Commit 2e03d344 authored by Titus von Koeller

release 0.43.3

parent b64cbe32
### 0.43.3
#### Improvements:
- FSDP: Enable loading prequantized weights with bf16/fp16/fp32 quant_storage
  - Background: This update, linked to [Transformers PR #32276](https://github.com/huggingface/transformers/pull/32276), allows loading prequantized weights stored in alternative storage formats. Metadata is tracked in the same way as in `Params4bit.__new__` since PR #970. It supports models exported with a non-default `quant_storage`, such as [this NF4 model with BF16 storage](https://huggingface.co/hugging-quants/Meta-Llama-3.1-405B-BNB-NF4-BF16); see the sketch after this list.
- Special thanks to @winglian and @matthewdouglas for enabling FSDP+QLoRA finetuning of Llama 3.1 405B on a single 8xH100 or 8xA100 node with as little as 256GB system RAM.
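
A minimal sketch of the loading path this enables, assuming the Transformers `BitsAndBytesConfig` API (`bnb_4bit_quant_storage` and friends); the 8B base-model id below is a placeholder, and only the 405B repository linked above is taken from this changelog:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Producing such a checkpoint: quantize to NF4 with a non-default
# quant_storage so FSDP can later shard the packed weights as uniform
# bf16 tensors. The base-model id is a placeholder for illustration.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_storage=torch.bfloat16,  # non-default storage dtype
)
quantized = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B",  # placeholder base model
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
)

# Consuming a checkpoint that already ships prequantized NF4 weights with
# BF16 storage (the 405B model linked in the changelog): the quantization
# metadata, including quant_storage, is read from the saved config.
prequantized = AutoModelForCausalLM.from_pretrained(
    "hugging-quants/Meta-Llama-3.1-405B-BNB-NF4-BF16",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```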
### 0.43.2
This release is quite significant, as the QLoRA bug fix has big implications for higher `seqlen` and batch sizes.
@@ -21,4 +21,4 @@ __pdoc__ = {
     "optim.optimizer.MockArgs": False,
 }
-__version__ = "0.43.3.dev"
+__version__ = "0.43.3"

@@ -25,7 +25,7 @@ class BinaryDistribution(Distribution):
 setup(
     name="bitsandbytes",
-    version="0.43.3.dev",
+    version="0.43.3",
     author="Tim Dettmers",
     author_email="dettmers@cs.washington.edu",
     description="k-bit optimizers and matrix multiplication routines.",