zhaoyu6 / sglang
Commit 065ce815
Authored Oct 14, 2025 by fzyzcjy; committed by GitHub Oct 13, 2025
Tiny cleanup fp4 gemm calls (#11537)
Parent: 8e51049f
Showing 1 changed file with 9 additions and 19 deletions
python/sglang/srt/layers/quantization/modelopt_quant.py (+9, -19)
@@ -852,25 +852,15 @@ class ModelOptFp4LinearMethod(LinearMethodBase):
         if enable_flashinfer_fp4_gemm:
             w = layer.weight.T
             w_scale_interleaved = layer.weight_scale_interleaved.T
-        if USE_CUTLASS_BACKEND_FOR_FP4_GEMM:
-            out = fp4_gemm(
-                x_fp4,
-                w,
-                x_scale_interleaved,
-                w_scale_interleaved,
-                layer.alpha,
-                output_dtype,
-                backend="cutlass",
-            )
-        else:
-            out = fp4_gemm(
-                x_fp4,
-                w,
-                x_scale_interleaved,
-                w_scale_interleaved,
-                layer.alpha,
-                output_dtype,
-            )
+        out = fp4_gemm(
+            x_fp4,
+            w,
+            x_scale_interleaved,
+            w_scale_interleaved,
+            layer.alpha,
+            output_dtype,
+            **(dict(backend="cutlass") if USE_CUTLASS_BACKEND_FOR_FP4_GEMM else dict()),
+        )
         if bias is not None:
             out = out + bias
         return out.view(*output_shape)
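The cleanup collapses two nearly identical `fp4_gemm` calls into one by unpacking a dict that is empty unless the flag is set. The sketch below illustrates that pattern in isolation. It uses a hypothetical stand-in, `fake_gemm`, with an assumed `backend` keyword default; the real `fp4_gemm` signature is not shown in this diff, so nothing here should be read as its actual API.

```python
def fake_gemm(a, b, *, backend="auto"):
    """Hypothetical stand-in for a GEMM wrapper with an optional backend keyword."""
    return {"result": a * b, "backend": backend}


def call_before(a, b, use_cutlass):
    # Shape of the old code: the whole call is duplicated per branch.
    if use_cutlass:
        return fake_gemm(a, b, backend="cutlass")
    else:
        return fake_gemm(a, b)


def call_after(a, b, use_cutlass):
    # Shape of the new code: one call, with the keyword passed only when
    # the flag is set. An empty dict unpacks to no keyword arguments at all.
    return fake_gemm(a, b, **(dict(backend="cutlass") if use_cutlass else dict()))


if __name__ == "__main__":
    for flag in (True, False):
        assert call_before(2, 3, flag) == call_after(2, 3, flag)
    print(call_after(2, 3, True)["backend"])
    print(call_after(2, 3, False)["backend"])
```

Because the keyword is omitted entirely when the flag is off, the callee's own default still applies, which is what keeps the single call equivalent to the old two-branch version.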