OpenDAS / vllm_cscc
Commit a3319f4f (Unverified)
Authored Jun 12, 2025 by Michael Goin. Committed by GitHub, Jun 12, 2025.
[Bugfix] Enforce contiguous input for dynamic_per_token FP8/INT8 quant (#19452)
Signed-off-by: mgoin <mgoin64@gmail.com>
parent 9d880f59
Showing 1 changed file with 3 additions and 3 deletions
vllm/_custom_ops.py @ a3319f4f

...
@@ -1270,7 +1270,7 @@ def scaled_fp8_quant(
                                 device=input.device,
                                 dtype=torch.float32)
             torch.ops._C.dynamic_per_token_scaled_fp8_quant(
-                output, input, scale, scale_ub)
+                output, input.contiguous(), scale, scale_ub)
         else:
             scale = torch.zeros(1, device=input.device, dtype=torch.float32)
             torch.ops._C.dynamic_scaled_fp8_quant(output, input, scale)
...
@@ -1379,8 +1379,8 @@ def scaled_int8_quant(
                                dtype=torch.float32)
     input_azp = None if symmetric else torch.empty_like(input_scales,
                                                         dtype=torch.int32)
-    torch.ops._C.dynamic_scaled_int8_quant(output, input, input_scales,
-                                           input_azp)
+    torch.ops._C.dynamic_scaled_int8_quant(output, input.contiguous(),
+                                           input_scales, input_azp)
     return output, input_scales, input_azp
...
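The commit does not show the kernels themselves, so the following is only a sketch of why a per-token quantization kernel might need contiguous input. It assumes a kernel that treats its input as dense row-major memory and ignores strides; that is an assumption about the failure mode, not a description of the actual `_C` ops. `StridedView` and `per_token_scales_dense` are hypothetical names that mimic tensor storage in pure Python.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class StridedView:
    """Hypothetical 2-D view over a flat buffer (strides counted in elements)."""
    storage: List[float]
    shape: Tuple[int, int]
    strides: Tuple[int, int]
    offset: int = 0

    def is_contiguous(self) -> bool:
        # Dense row-major layout: each row starts `cols` elements after the last.
        return self.strides == (self.shape[1], 1)

    def rows(self) -> List[List[float]]:
        # Stride-aware read: this is what the view logically contains.
        r, c = self.shape
        s0, s1 = self.strides
        return [[self.storage[self.offset + i * s0 + j * s1] for j in range(c)]
                for i in range(r)]

    def contiguous(self) -> "StridedView":
        # Return self if already dense, otherwise copy into a dense buffer.
        if self.is_contiguous():
            return self
        flat = [x for row in self.rows() for x in row]
        return StridedView(flat, self.shape, (self.shape[1], 1), 0)


def per_token_scales_dense(view: StridedView, qmax: float = 127.0) -> List[float]:
    """Assumed kernel behavior: one max-abs scale per row, strides ignored."""
    r, c = view.shape
    scales = []
    for i in range(r):
        start = view.offset + i * c  # assumes dense rows, which may be wrong
        row = view.storage[start:start + c]
        scales.append(max(abs(x) for x in row) / qmax)
    return scales


# A 2x4 buffer, viewed as its first two columns (like x[:, :2] on a tensor).
buffer = [1.0, 2.0, 3.0, 100.0,
          4.0, 5.0, 6.0, 200.0]
view = StridedView(buffer, shape=(2, 2), strides=(4, 1))

wrong = per_token_scales_dense(view)               # second row reads [3, 100]
right = per_token_scales_dense(view.contiguous())  # second row reads [4, 5]
```

In this sketch the scale for the second row is computed from elements that are not part of the view unless the input is first made dense. Calling `input.contiguous()` at the call site, as the diff does, is a no-op for tensors that are already contiguous and a copy otherwise.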