change / sglang
Commit 59cce594 (unverified)
Authored Oct 31, 2025 by Qiaolin Yu; committed by GitHub on Oct 31, 2025

Use sgl fp4 quant kernel by default (#12482)

Parent: 795e98f8
Showing 1 changed file with 5 additions and 1 deletion

python/sglang/srt/layers/quantization/modelopt_quant.py (+5 -1)
...
...
@@ -7,6 +7,7 @@ from typing import TYPE_CHECKING, Any, Dict, List, Optional
 import torch
 from torch.nn.parameter import Parameter

+from python.sglang.srt.utils.common import is_sm120_supported
 from sglang.srt.distributed import get_tp_group
 from sglang.srt.layers.dp_attention import get_dp_global_num_tokens, get_local_dp_buffer
 from sglang.srt.layers.moe import (
...
...
@@ -51,7 +52,10 @@ if TYPE_CHECKING:
     from sglang.srt.single_batch_overlap import DownGemmOverlapArgs

 try:
-    from flashinfer import fp4_quantize
+    if is_sm120_supported():
+        from flashinfer import fp4_quantize
+    else:
+        from sgl_kernel import scaled_fp4_quant as fp4_quantize
 except ImportError:
     fp4_quantize = None
...
...
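The second hunk follows a common optional-dependency pattern: pick one backend up front, try to import it, and bind the name to None if the import fails. A minimal standalone sketch of that pattern is below. The helper names `import_or_none` and `resolve_fp4_quantize` and the `prefer_flashinfer` flag are hypothetical and not part of the commit; in the commit itself the choice is made by `is_sm120_supported()` at module import time.

```python
import importlib
from typing import Any, Callable, Optional


def import_or_none(module_name: str, attr_name: str) -> Optional[Callable[..., Any]]:
    """Return module_name.attr_name, or None if the module or attribute is missing."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Mirrors the `except ImportError: fp4_quantize = None` branch in the diff.
        return None
    return getattr(module, attr_name, None)


def resolve_fp4_quantize(prefer_flashinfer: bool) -> Optional[Callable[..., Any]]:
    """Hypothetical helper: select exactly one backend, with no fall-through to the other."""
    if prefer_flashinfer:
        return import_or_none("flashinfer", "fp4_quantize")
    return import_or_none("sgl_kernel", "scaled_fp4_quant")
```

Because the resolved name may be None, callers of this sketch would need to check for None before invoking it.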