Unverified Commit fb29132b authored by Sayak Paul, committed by GitHub

[docs] minor updates to bitsandbytes docs. (#11509)

* minor updates to bitsandbytes docs.

* Apply suggestions from code review
parent 79371661
@@ -48,7 +48,7 @@ For Ada and higher-series GPUs, we recommend changing `torch_dtype` to `torch.bf
 ```py
 from diffusers import BitsAndBytesConfig as DiffusersBitsAndBytesConfig
 from transformers import BitsAndBytesConfig as TransformersBitsAndBytesConfig
+import torch
 from diffusers import AutoModel
 from transformers import T5EncoderModel
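For readers skimming the diff: a minimal sketch of how these imports come together a few lines below this hunk, assuming the 8-bit pattern and the FLUX.1-dev `text_encoder_2`/`transformer` subfolders that the surrounding doc uses:

```py
import torch
from diffusers import AutoModel
from diffusers import BitsAndBytesConfig as DiffusersBitsAndBytesConfig
from transformers import T5EncoderModel
from transformers import BitsAndBytesConfig as TransformersBitsAndBytesConfig

# quantize the transformers-side text encoder to 8-bit
quant_config = TransformersBitsAndBytesConfig(load_in_8bit=True)
text_encoder_8bit = T5EncoderModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    subfolder="text_encoder_2",
    quantization_config=quant_config,
    torch_dtype=torch.float16,
)

# quantize the diffusers-side transformer to 8-bit
quant_config = DiffusersBitsAndBytesConfig(load_in_8bit=True)
transformer_8bit = AutoModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    subfolder="transformer",
    quantization_config=quant_config,
    torch_dtype=torch.float16,
)
```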
@@ -88,6 +88,8 @@ Setting `device_map="auto"` automatically fills all available space on the GPU(s
 CPU, and finally, the hard drive (the absolute slowest option) if there is still not enough memory.
 ```py
+from diffusers import FluxPipeline
+
 pipe = FluxPipeline.from_pretrained(
     "black-forest-labs/FLUX.1-dev",
     transformer=transformer_8bit,
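The hunk cuts off mid-call; a hedged completion, reusing the `text_encoder_8bit`/`transformer_8bit` names from the sketch above and the `device_map` value the context line discusses:

```py
from diffusers import FluxPipeline

# assumes text_encoder_8bit / transformer_8bit from the earlier sketch
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer_8bit,
    text_encoder_2=text_encoder_8bit,
    torch_dtype=torch.float16,
    device_map="auto",
)
```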
@@ -132,7 +134,7 @@ For Ada and higher-series GPUs, we recommend changing `torch_dtype` to `torch.bf
 ```py
 from diffusers import BitsAndBytesConfig as DiffusersBitsAndBytesConfig
 from transformers import BitsAndBytesConfig as TransformersBitsAndBytesConfig
+import torch
 from diffusers import AutoModel
 from transformers import T5EncoderModel
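Same imports, 4-bit variant: a sketch of what these feed into, assuming the NF4 settings the doc recommends (the `bnb_4bit_*` kwargs are real `BitsAndBytesConfig` options; the exact values are illustrative):

```py
import torch
from diffusers import AutoModel
from diffusers import BitsAndBytesConfig as DiffusersBitsAndBytesConfig
from transformers import T5EncoderModel
from transformers import BitsAndBytesConfig as TransformersBitsAndBytesConfig

# NF4 4-bit config; bnb_4bit_compute_dtype=torch.bfloat16 suits Ada and
# higher-series GPUs, per the hunk header above
quant_config = TransformersBitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
text_encoder_4bit = T5EncoderModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    subfolder="text_encoder_2",
    quantization_config=quant_config,
    torch_dtype=torch.bfloat16,
)

quant_config = DiffusersBitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
transformer_4bit = AutoModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    subfolder="transformer",
    quantization_config=quant_config,
    torch_dtype=torch.bfloat16,
)
```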
@@ -171,6 +173,8 @@ Let's generate an image using our quantized models.
 Setting `device_map="auto"` automatically fills all available space on the GPU(s) first, then the CPU, and finally, the hard drive (the absolute slowest option) if there is still not enough memory.
 ```py
+from diffusers import FluxPipeline
+
 pipe = FluxPipeline.from_pretrained(
     "black-forest-labs/FLUX.1-dev",
     transformer=transformer_4bit,
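A hedged completion of this truncated call, plus the generation step the hunk header ("Let's generate an image...") points at; the prompt and step/guidance values are illustrative:

```py
import torch
from diffusers import FluxPipeline

# assumes text_encoder_4bit / transformer_4bit from the sketch above
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer_4bit,
    text_encoder_2=text_encoder_4bit,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

image = pipe(
    "a whimsical treehouse at golden hour",  # illustrative prompt
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
image.save("flux_4bit.png")
```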
@@ -214,6 +218,8 @@ Check your memory footprint with the `get_memory_footprint` method:
 print(model.get_memory_footprint())
 ```
+
+Note that this only tells you the memory footprint of the model params and does _not_ estimate the inference memory requirements.
 Quantized models can be loaded with the [`~ModelMixin.from_pretrained`] method without needing to specify the `quantization_config` parameters:
 ```py
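A short sketch of the two steps this hunk touches: `get_memory_footprint` reports quantized parameter memory only (hence the added note), and a checkpoint saved with `save_pretrained` after quantization reloads without an explicit `quantization_config`. The hub id below is a hypothetical placeholder:

```py
import torch
from diffusers import AutoModel

# parameter memory only; inference-time activations are not included,
# per the note added in this hunk
print(transformer_4bit.get_memory_footprint())

# a checkpoint saved after quantization stores its quantization config,
# so from_pretrained needs no explicit quantization_config
# ("your-username/flux-transformer-nf4" is a hypothetical repo id)
transformer_reloaded = AutoModel.from_pretrained(
    "your-username/flux-transformer-nf4",
    torch_dtype=torch.bfloat16,
)
```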
@@ -413,4 +419,4 @@ transformer_4bit.dequantize()
 ## Resources
 * [End-to-end notebook showing Flux.1 Dev inference in a free-tier Colab](https://gist.github.com/sayakpaul/c76bd845b48759e11687ac550b99d8b4)
-* [Training](https://gist.github.com/sayakpaul/05afd428bc089b47af7c016e42004527)
+* [Training](https://github.com/huggingface/diffusers/blob/8c661ea586bf11cb2440da740dd3c4cf84679b85/examples/dreambooth/README_hidream.md#using-quantization)
\ No newline at end of file
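Finally, the last hunk's header shows `transformer_4bit.dequantize()`; a brief sketch of the effect, assuming the 4-bit transformer from earlier:

```py
# revert the 4-bit weights to their original precision; expect the
# memory footprint to grow back toward the unquantized size
transformer_4bit.dequantize()
print(transformer_4bit.get_memory_footprint())
```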