Commit 158efb14, authored Nov 19, 2025 by maxiao1
Fix w8a8_marlin TP/PP
Parent: eed591c9
Showing 1 changed file with 2 additions and 1 deletion
python/sglang/srt/layers/quantization/compressed_tensors/compressed_tensors_moe_marlin.py
@@ -15,6 +15,7 @@ from sglang.srt.layers.quantization.base_config import FusedMoEMethodBase
 from sglang.srt.utils import set_weight_attrs
 from sglang.srt.layers.moe import MoeRunner, MoeRunnerBackend, MoeRunnerConfig
+from sglang.srt.layers.moe.utils import get_moe_a2a_backend
 try:
     from lmslim.layers.fused_moe.fuse_moe_int8_marlin import fused_experts_impl_int8_marlin
 except Exception:
...
@@ -77,7 +78,7 @@ class CompressedTensorsW8A8Int8MarlinMoEMethod(CompressedTensorsMarlinMoEMethod)
             "weights"
         )
         self.input_quant = self.quant_config.target_scheme_map["Linear"].get("input_activations")
-        self.use_deepep = True
+        self.use_deepep = get_moe_a2a_backend().is_deepep()
         per_channel = (
             self.weight_quant.strategy == QuantizationStrategy.CHANNEL
             and self.input_quant.strategy == QuantizationStrategy.TOKEN
         )
...
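The second hunk replaces a hard-coded `use_deepep = True` with a query against the configured MoE all-to-all backend. The commit message does not explain the motivation. A plausible reading is that the flag was previously forced on even in TP/PP runs where DeepEP was not selected. The sketch below illustrates the pattern only: `A2ABackend`, `set_a2a_backend`, `get_a2a_backend`, and `MarlinMoEMethodSketch` are hypothetical stand-ins, not sglang's classes. Only the `is_deepep()` call shape mirrors the diff.

```python
from enum import Enum


class A2ABackend(Enum):
    """Hypothetical stand-in for the object returned by get_moe_a2a_backend()."""

    NONE = "none"
    DEEPEP = "deepep"

    def is_deepep(self) -> bool:
        return self is A2ABackend.DEEPEP


# Hypothetical module-level setting; the real project resolves this from its own config.
_current_backend = A2ABackend.NONE


def set_a2a_backend(name: str) -> None:
    """Select the backend by its string value (assumed values: "none", "deepep")."""
    global _current_backend
    _current_backend = A2ABackend(name)


def get_a2a_backend() -> A2ABackend:
    return _current_backend


class MarlinMoEMethodSketch:
    def __init__(self) -> None:
        # Before the commit: self.use_deepep = True (always on).
        # After the commit: the flag follows the configured backend.
        self.use_deepep = get_a2a_backend().is_deepep()
```

In this sketch the flag is read once at construction time, so the backend has to be configured before the method object is created.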