Commit de11d0bd authored by miRx923, committed by GitHub

Update quantizer_bnb_4bit.py: the ValueError string should say "...you need to set `llm_int8_enable_fp32_cpu_offload=True`..." instead of "`load_in_8bit_fp32_cpu_offload=True`". (#30013)

* Update quantizer_bnb_4bit.py

There is a mistake in the ValueError on line 86 of quantizer_bnb_4bit.py. The error string should say "...you need to set `llm_int8_enable_fp32_cpu_offload=True`..." instead of "`load_in_8bit_fp32_cpu_offload=True`". I think you updated the BitsAndBytesConfig() arguments, but forgot to change the ValueError in quantizer_bnb_4bit.py.

* Update quantizer_bnb_4bit.py

Changed ValueError string "...you need to set `load_in_8bit_fp32_cpu_offload=True`..." to "...you need to set `llm_int8_enable_fp32_cpu_offload=True`..."
parent 4207a407
@@ -87,7 +87,7 @@ class Bnb4BitHfQuantizer(HfQuantizer):
 """
 Some modules are dispatched on the CPU or the disk. Make sure you have enough GPU RAM to fit the
 quantized model. If you want to dispatch the model on the CPU or the disk while keeping these modules
-in 32-bit, you need to set `load_in_8bit_fp32_cpu_offload=True` and pass a custom `device_map` to
+in 32-bit, you need to set `llm_int8_enable_fp32_cpu_offload=True` and pass a custom `device_map` to
 `from_pretrained`. Check
 https://huggingface.co/docs/transformers/main/en/main_classes/quantization#offload-between-cpu-and-gpu
 for more details.
@@ -87,7 +87,7 @@ class Bnb8BitHfQuantizer(HfQuantizer):
 """
 Some modules are dispatched on the CPU or the disk. Make sure you have enough GPU RAM to fit the
 quantized model. If you want to dispatch the model on the CPU or the disk while keeping these modules
-in 32-bit, you need to set `load_in_8bit_fp32_cpu_offload=True` and pass a custom `device_map` to
+in 32-bit, you need to set `llm_int8_enable_fp32_cpu_offload=True` and pass a custom `device_map` to
 `from_pretrained`. Check
 https://huggingface.co/docs/transformers/main/en/main_classes/quantization#offload-between-cpu-and-gpu
 for more details.
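
For readers who want to see when this message fires, here is a minimal, standard-library-only sketch of the kind of check the error string belongs to. It is not the code from quantizer_bnb_4bit.py: the function name `check_device_map`, its parameters, and the module names in the usage lines are hypothetical, and the message is abbreviated. Only the flag name `llm_int8_enable_fp32_cpu_offload` and the "CPU or disk" condition come from the diff above.

```python
from typing import Dict, Iterable, List, Union

# Placements treated as "offloaded" in this sketch (assumption based on the
# error text, which mentions the CPU and the disk).
OFFLOAD_TARGETS = ("cpu", "disk")


def check_device_map(
    device_map: Dict[str, Union[int, str]],
    modules_to_not_convert: Iterable[str] = (),
    fp32_cpu_offload: bool = False,
) -> List[str]:
    """Hypothetical helper: return the offloaded module names, or raise if
    offloading was not explicitly enabled.

    `fp32_cpu_offload` stands in for `llm_int8_enable_fp32_cpu_offload`.
    """
    skipped = set(modules_to_not_convert)
    # Only modules that would actually be quantized are checked.
    offloaded = sorted(
        name
        for name, device in device_map.items()
        if name not in skipped and device in OFFLOAD_TARGETS
    )
    if offloaded and not fp32_cpu_offload:
        raise ValueError(
            "Some modules are dispatched on the CPU or the disk. If you want to "
            "dispatch the model on the CPU or the disk while keeping these modules "
            "in 32-bit, you need to set `llm_int8_enable_fp32_cpu_offload=True` "
            "and pass a custom `device_map` to `from_pretrained`."
        )
    return offloaded


# Usage with made-up module names: one block on GPU 0, one offloaded to CPU.
example_map = {"transformer.h": 0, "transformer.ln_f": "cpu"}
print(check_device_map(example_map, fp32_cpu_offload=True))
```

With `fp32_cpu_offload=False` the same call raises the `ValueError`, which is the situation the corrected message is meant to help users resolve.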