"src/git@developer.sourcefind.cn:renzhc/diffusers_dcu.git" did not exist on "321f9791d6a491ed140fd2cd26f56f45bbaa9f4a"
Unverified Commit ffd7d0db authored by Titus, committed by GitHub

(docs) integrations: fix omission in bf16 related warning (#1183)



* (docs) integrations: fix omission in bf16 related warning

* (docs) integrations: further clarifications to prior fix

* (docs) integrations: fix punctuation
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* (docs) integrations: fix omitted code formatting

---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
parent 6cecb65a
@@ -12,7 +12,7 @@ With Transformers, it's very easy to load any model in 4 or 8-bit and quantize t
For example, to load and quantize a model to 4-bits and use the bfloat16 data type for compute:
> [!WARNING]
-> bfloat16 is the optimal compute data type if your hardware supports it. The default is float32 for backward compatibility and numerical stability, but it can often lead to numerical instabilities. bfloat16 provides the best of both worlds, numerical stability equivalent to float32, but combined with the memory footprint and significant computation speedup of a 16-bit data type. Make sure to check if your hardware supports bfloat16 and if it does, configure it using the `bnb_4bit_compute_dtype` parameter in [`~transformers.BitsAndBytesConfig`]!
+> bfloat16 is the ideal `compute_dtype` if your hardware supports it. While the default `compute_dtype`, float32, ensures backward compatibility (due to wide-ranging hardware support) and numerical stability, it is large and slows down computations. In contrast, float16 is smaller and faster but can lead to numerical instabilities. bfloat16 combines the best aspects of both; it offers the numerical stability of float32 and the reduced memory footprint and speed of a 16-bit data type. Check if your hardware supports bfloat16 and configure it using the `bnb_4bit_compute_dtype` parameter in [`~transformers.BitsAndBytesConfig`]!
```py
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
......
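
The diff truncates the snippet after the import, so here is a minimal sketch of what the completed example could look like. The Mistral-7B checkpoint is an illustrative assumption (any causal LM on the Hub works the same way), and the hardware check via `torch.cuda.is_bf16_supported()` follows the warning's advice to verify bfloat16 support before opting in:

```py
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Fall back to float16 on GPUs without bfloat16 support.
compute_dtype = torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float16

# 4-bit quantization with bfloat16 (or the float16 fallback) as the compute dtype.
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=compute_dtype,
)

# Hypothetical checkpoint, chosen only for illustration.
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=quantization_config,
    device_map="auto",
)
```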