OpenDAS / vllm_cscc
Commit ba09652d (unverified)
Authored Oct 21, 2025 by JartX; committed by GitHub, Oct 21, 2025

[ROCM] Enable CompressedTensorsWNA16 (#27187)

Signed-off-by: JartX <sagformas@epdcenter.es>

Parent: bd66b852

Showing 1 changed file with 4 additions and 1 deletion

vllm/model_executor/layers/quantization/compressed_tensors/compressed_tensors_moe.py (+4 -1)
...
@@ -142,7 +142,10 @@ class CompressedTensorsMoEMethod(FusedMoEMethodBase):
             # group_size=None means channelwise
             group_size = weight_quant.group_size or -1
             # Prefer to use the MarlinMoE kernel when it is supported.
-            if not check_moe_marlin_supports_layer(layer, group_size):
+            if (
+                not check_moe_marlin_supports_layer(layer, group_size)
+                or current_platform.is_rocm()
+            ):
                 if (
                     weight_quant.strategy == QuantizationStrategy.GROUP
                     and weight_quant.actorder
...
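
The effect of the widened condition can be illustrated with a small truth table. This is a minimal sketch, not code from vLLM: the function names `takes_non_marlin_branch_before` and `takes_non_marlin_branch_after` are hypothetical, and the two booleans stand in for the results of `check_moe_marlin_supports_layer(layer, group_size)` and `current_platform.is_rocm()`. What the branch does after the `strategy`/`actorder` check is elided in the hunk, so the sketch models only the outer condition.

```python
def takes_non_marlin_branch_before(marlin_supports_layer: bool) -> bool:
    """Outer condition before the patch: enter the branch only when
    the Marlin MoE kernel does not support the layer."""
    return not marlin_supports_layer


def takes_non_marlin_branch_after(marlin_supports_layer: bool, is_rocm: bool) -> bool:
    """Outer condition after the patch: also enter the branch on ROCm,
    regardless of what the Marlin support check reports."""
    return (not marlin_supports_layer) or is_rocm


if __name__ == "__main__":
    # Print every combination so the behavioral change is visible.
    for marlin_ok in (True, False):
        for rocm in (True, False):
            before = takes_non_marlin_branch_before(marlin_ok)
            after = takes_non_marlin_branch_after(marlin_ok, rocm)
            print(f"marlin_ok={marlin_ok!s:5} rocm={rocm!s:5} "
                  f"before={before!s:5} after={after!s:5}")
```

The only combination whose result changes is a layer that passes the Marlin support check while running on ROCm: before the patch it stayed on the Marlin path, and after the patch it enters the non-Marlin branch. Going by the commit title, that branch presumably leads to the CompressedTensorsWNA16 method, but the hunk does not show it.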