OpenDAS
vllm_cscc
Commit 18e50593 (unverified), authored Feb 25, 2025 by Michael Goin, committed by GitHub Feb 25, 2025
[Bugfix] Support MLA for CompressedTensorsWNA16 (#13725)
Signed-off-by: mgoin <mgoin64@gmail.com>
Parent: 4a8cfc75
Showing 1 changed file with 7 additions and 7 deletions
vllm/attention/backends/mla/common.py
@@ -1130,13 +1130,13 @@ class MLACommonImpl(MLAAttentionImpl[T], Generic[T]):
             )

         def get_layer_weight(layer):
-            if hasattr(layer, "weight"):
-                return layer.weight
-            elif hasattr(layer, "qweight"):
-                return layer.qweight
-            else:
-                raise AttributeError(
-                    f"Layer '{layer}' has neither weight nor qweight")
+            WEIGHT_NAMES = ("weight", "qweight", "weight_packed")
+            for attr in WEIGHT_NAMES:
+                if hasattr(layer, attr):
+                    return getattr(layer, attr)
+            raise AttributeError(
+                f"Layer '{layer}' has no recognized weight attribute:"
+                f" {WEIGHT_NAMES}.")

         def get_and_maybe_dequant_weights(layer: LinearBase):
             if not isinstance(layer.quant_method, UnquantizedLinearMethod):
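
For reviewers who want to try the new lookup order outside vLLM, here is a minimal standalone sketch of the pattern in the hunk above. The `SimpleNamespace` objects are hypothetical stand-ins for real layers, and the string values stand in for tensors. The `weight_packed` name comes from the diff; that it is the attribute exposed by CompressedTensorsWNA16 layers is inferred from the commit title, not shown in this hunk.

```python
from types import SimpleNamespace

# Attribute names tried in order, mirroring the tuple added in the diff.
WEIGHT_NAMES = ("weight", "qweight", "weight_packed")


def get_layer_weight(layer):
    """Return the first weight-like attribute found on `layer`."""
    for attr in WEIGHT_NAMES:
        if hasattr(layer, attr):
            return getattr(layer, attr)
    raise AttributeError(
        f"Layer '{layer}' has no recognized weight attribute:"
        f" {WEIGHT_NAMES}.")


# Hypothetical stand-in layers (not vLLM classes); strings replace tensors.
dense = SimpleNamespace(weight="dense-w")
packed = SimpleNamespace(weight_packed="packed-w")
both = SimpleNamespace(qweight="q", weight="w")
empty = SimpleNamespace()

print(get_layer_weight(dense))
print(get_layer_weight(packed))
print(get_layer_weight(both))  # "weight" is checked before "qweight"
```

Because the names are tried in tuple order, a layer that exposes more than one of them resolves to the earliest match, which keeps the behavior of the old `if`/`elif` chain for `weight` and `qweight`.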