OpenDAS / vllm_cscc
Commit 948c8595 (unverified)
Authored Nov 22, 2024 by zixuanzhang226; committed by GitHub, Nov 22, 2024

support bitsandbytes quantization with qwen model (#10549)

Signed-off-by: Ubuntu <zixuanzhang@bytedance.com>

Parent: 97814fbf

Changes: 1 changed file with 12 additions and 0 deletions

vllm/model_executor/models/qwen.py (+12 -0)
@@ -1028,6 +1028,18 @@ class QWenLLM(QWenBaseModel):
     embedding_modules = {}
     embedding_padding_modules = []
 
+
+    default_bitsandbytes_target_modules = [
+        ".c_attn.",
+        ".c_proj.",
+        ".w1.",
+        ".w2.",
+    ]
+    bitsandbytes_stacked_params_mapping = {
+        # shard_name, weight_name, index
+        "w2": ("gate_up_proj", 0),
+        "w1": ("gate_up_proj", 1),
+    }
 
 class QWenVL(QWenBaseModel, SupportsMultiModal):
     packed_modules_mapping = {
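
The two attributes added in this commit are plain data, so it may help to see one way a weight loader could consume them. The following is a minimal sketch, not vLLM's actual bitsandbytes loader: the helpers `is_quant_target` and `resolve_stacked_param` are hypothetical, and the checkpoint names used in the usage note are illustrative assumptions about Qwen weight naming. Only the list and the dict are taken from the diff.

```python
from typing import Optional, Tuple

# Values copied from the diff above.
default_bitsandbytes_target_modules = [".c_attn.", ".c_proj.", ".w1.", ".w2."]
bitsandbytes_stacked_params_mapping = {
    # shard_name: (weight_name, index)
    "w2": ("gate_up_proj", 0),
    "w1": ("gate_up_proj", 1),
}


def is_quant_target(weight_name: str) -> bool:
    """Hypothetical helper: True if the name contains any target substring."""
    return any(t in weight_name for t in default_bitsandbytes_target_modules)


def resolve_stacked_param(weight_name: str) -> Tuple[str, Optional[int]]:
    """Hypothetical helper: map a checkpoint weight name to
    (parameter name, shard index). The index is None when the weight
    is not a shard of a stacked parameter."""
    for shard, (fused, index) in bitsandbytes_stacked_params_mapping.items():
        needle = f".{shard}."
        if needle in weight_name:
            # Rewrite ".w1." / ".w2." to the fused ".gate_up_proj." name.
            return weight_name.replace(needle, f".{fused}."), index
    return weight_name, None
```

Under these assumptions, `resolve_stacked_param("transformer.h.0.mlp.w1.weight")` gives the fused name `transformer.h.0.mlp.gate_up_proj.weight` with shard index 1, while a `c_attn` weight is returned unchanged with index `None`. The surrounding dots in the target list keep short names such as `w1` from matching inside unrelated identifiers.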