Commit 6a47b730, authored Jun 01, 2025 by Baizhou Zhang, committed by GitHub Jun 01, 2025
Remove contiguous before Flashinfer groupwise fp8 gemm (#6804)
parent c429919d
Showing 1 changed file with 6 additions and 4 deletions
python/sglang/srt/layers/quantization/fp8_utils.py (+6 -4)
...
...
@@ -166,11 +166,13 @@ def flashinfer_gemm_w8a8_block_fp8_linear(
         input_2d, block_size[1], column_major_scales=False
     )
-    x_scale_input = x_scale.T.contiguous()
-    weight_scale_input = weight_scale.T.contiguous()
     output = gemm_fp8_nt_groupwise(
-        q_input, weight, x_scale_input, weight_scale_input, out_dtype=input_2d.dtype
+        q_input,
+        weight,
+        x_scale,
+        weight_scale,
+        scale_major_mode="K",
+        out_dtype=input_2d.dtype,
     )
     if bias is not None:
...
...
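The change above drops the `.T.contiguous()` step on both scale tensors and passes them straight to the kernel with `scale_major_mode="K"`. The sketch below is a NumPy stand-in, not the sglang or FlashInfer code, and it only illustrates what that step costs: a transposed view shares memory with the original, while making it contiguous allocates and fills a new buffer. The sizes `m`, `k`, and `block` are hypothetical, and the assumption that `"K"` mode lets the kernel read scales in their original (rows, K-blocks) layout is inferred from the diff rather than confirmed here.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
m, k, block = 3, 512, 128

# Per-row, per-K-block scales in row-major order, shape (m, k // block).
x_scale = np.arange(m * (k // block), dtype=np.float32).reshape(m, k // block)

# Old path (NumPy analogue of x_scale.T.contiguous()):
# transpose to (k // block, m), then force a fresh contiguous buffer.
x_scale_t_view = x_scale.T
x_scale_t_copy = np.ascontiguousarray(x_scale_t_view)

# The transposed view alone is free: it shares memory with the original.
assert np.shares_memory(x_scale_t_view, x_scale)

# The contiguous call is what allocates and copies.
assert not np.shares_memory(x_scale_t_copy, x_scale)

# Both layouts hold the same logical values; only memory order differs.
assert np.array_equal(x_scale_t_copy.T, x_scale)
```

If the kernel can consume the original layout directly, the extra allocation and copy per call become unnecessary, which appears to be the point of the commit.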