OpenDAS / vllm_cscc
Commit 3892e58a (Unverified)
Authored Mar 24, 2025 by Jee Jee Li; committed by GitHub on Mar 24, 2025

[Misc] Upgrade BNB version (#15183)

Parent: d20e2611

Showing 4 changed files with 10 additions and 10 deletions (+10 -10)
Dockerfile  +1 -1
docs/source/features/quantization/bnb.md  +1 -1
vllm/model_executor/layers/quantization/bitsandbytes.py  +4 -4
vllm/model_executor/model_loader/loader.py  +4 -4

Dockerfile
@@ -286,7 +286,7 @@ RUN --mount=type=cache,target=/root/.cache/uv \
     if [ "$TARGETPLATFORM" = "linux/arm64" ]; then \
         uv pip install accelerate hf_transfer 'modelscope!=1.15.0' 'bitsandbytes>=0.42.0' 'timm==0.9.10' boto3 runai-model-streamer runai-model-streamer[s3]; \
     else \
-        uv pip install accelerate hf_transfer 'modelscope!=1.15.0' 'bitsandbytes>=0.45.0' 'timm==0.9.10' boto3 runai-model-streamer runai-model-streamer[s3]; \
+        uv pip install accelerate hf_transfer 'modelscope!=1.15.0' 'bitsandbytes>=0.45.3' 'timm==0.9.10' boto3 runai-model-streamer runai-model-streamer[s3]; \
     fi
 ENV VLLM_USAGE_SOURCE production-docker-image

docs/source/features/quantization/bnb.md
@@ -9,7 +9,7 @@ Compared to other quantization methods, BitsAndBytes eliminates the need for cal
 Below are the steps to utilize BitsAndBytes with vLLM.
 
 ```console
-pip install bitsandbytes>=0.45.0
+pip install bitsandbytes>=0.45.3
 ```
 
 vLLM reads the model's config file and supports both in-flight quantization and pre-quantized checkpoint.

vllm/model_executor/layers/quantization/bitsandbytes.py
@@ -155,12 +155,12 @@ class BitsAndBytesLinearMethod(LinearMethodBase):
     def __init__(self, quant_config: BitsAndBytesConfig):
         try:
             import bitsandbytes
-            if bitsandbytes.__version__ < "0.45.0":
+            if bitsandbytes.__version__ < "0.45.3":
                 raise ImportError("bitsandbytes version is wrong. Please "
-                                  "install bitsandbytes>=0.45.0.")
+                                  "install bitsandbytes>=0.45.3.")
         except ImportError as err:
-            raise ImportError("Please install bitsandbytes>=0.45.0 via "
-                              "`pip install bitsandbytes>=0.45.0` to use "
+            raise ImportError("Please install bitsandbytes>=0.45.3 via "
+                              "`pip install bitsandbytes>=0.45.3` to use "
                               "bitsandbytes quantizer.") from err
         self.quant_config = quant_config

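A note on the guard in this hunk: `bitsandbytes.__version__ < "0.45.3"` compares strings character by character, so it can disagree with numeric version ordering once a component reaches two digits. The sketch below is not part of the commit; `release_tuple` and `meets_minimum` are hypothetical helper names introduced only to illustrate one way a numeric comparison might look, using the standard library alone.

```python
import re


def release_tuple(version: str) -> tuple[int, ...]:
    """Parse the leading numeric release segment of a version string.

    Hypothetical helper for illustration. Suffixes such as '.dev0' or
    '+cu118' are ignored, so a pre-release compares equal to its release.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed: str, minimum: str = "0.45.3") -> bool:
    """Return True if `installed` is numerically at least `minimum`."""
    return release_tuple(installed) >= release_tuple(minimum)


# Plain string comparison treats "0.45.10" as older than "0.45.3",
# because '1' sorts before '3' at the first differing character.
string_says_too_old = "0.45.10" < "0.45.3"

# Comparing integer tuples gives the numeric ordering instead.
numeric_says_ok = meets_minimum("0.45.10")
```

In a real check the installed version would come from `bitsandbytes.__version__`. A dedicated version parser (for example the third-party `packaging` library) would handle pre-release ordering more carefully than this simplified tuple comparison.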
vllm/model_executor/model_loader/loader.py
@@ -862,12 +862,12 @@ class BitsAndBytesModelLoader(BaseModelLoader):
         try:
             import bitsandbytes
-            if bitsandbytes.__version__ < "0.45.0":
+            if bitsandbytes.__version__ < "0.45.3":
                 raise ImportError("bitsandbytes version is wrong. Please "
-                                  "install bitsandbytes>=0.45.0.")
+                                  "install bitsandbytes>=0.45.3.")
         except ImportError as err:
-            raise ImportError("Please install bitsandbytes>=0.45.0 via "
-                              "`pip install bitsandbytes>=0.45.0` to use "
+            raise ImportError("Please install bitsandbytes>=0.45.3 via "
+                              "`pip install bitsandbytes>=0.45.3` to use "
                               "bitsandbytes quantizer.") from err
 
         hf_weights_files, use_safetensors = self._prepare_weights(