OpenDAS / vllm_cscc
Commit 4a734b9d, authored Feb 20, 2025 by zhuwenwen
skip fp8 fusion
Parent: 177520a9
Showing 1 changed file with 13 additions and 13 deletions
vllm/compilation/fusion.py (+13, -13)
...
...
@@ -58,11 +58,11 @@ kFp8DynamicTensorSym = QuantKey(FP8_DTYPE, False, True, True)
 kFp8DynamicTokenSym = QuantKey(FP8_DTYPE, False, False, True)
 QUANT_OPS: Dict[QuantKey, OpOverload] = {
-    kFp8StaticTensorSym: torch.ops._C.static_scaled_fp8_quant.default,  # noqa
-    kFp8DynamicTensorSym:
-    torch.ops._C.dynamic_scaled_fp8_quant.default,  # noqa
-    kFp8DynamicTokenSym:
-    torch.ops._C.dynamic_per_token_scaled_fp8_quant.default,  # noqa
+    # kFp8StaticTensorSym: torch.ops._C.static_scaled_fp8_quant.default,  # noqa
+    # kFp8DynamicTensorSym:
+    # torch.ops._C.dynamic_scaled_fp8_quant.default,  # noqa
+    # kFp8DynamicTokenSym:
+    # torch.ops._C.dynamic_per_token_scaled_fp8_quant.default,  # noqa
 }
...
...
@@ -81,14 +81,14 @@ class FusedRMSQuantKey(NamedTuple):
 FUSED_OPS: Dict[FusedRMSQuantKey, OpOverload] = {
-    FusedRMSQuantKey(kFp8StaticTensorSym, False):
-    torch.ops._C.rms_norm_static_fp8_quant.default,  # noqa
-    FusedRMSQuantKey(kFp8StaticTensorSym, True):
-    torch.ops._C.fused_add_rms_norm_static_fp8_quant.default,  # noqa
-    FusedRMSQuantKey(kFp8DynamicTokenSym, False):
-    torch.ops._C.rms_norm_dynamic_per_token_quant.default,  # noqa
-    FusedRMSQuantKey(kFp8DynamicTokenSym, True):
-    torch.ops._C.rms_norm_dynamic_per_token_quant.default,  # noqa
+    # FusedRMSQuantKey(kFp8StaticTensorSym, False):
+    # torch.ops._C.rms_norm_static_fp8_quant.default,  # noqa
+    # FusedRMSQuantKey(kFp8StaticTensorSym, True):
+    # torch.ops._C.fused_add_rms_norm_static_fp8_quant.default,  # noqa
+    # FusedRMSQuantKey(kFp8DynamicTokenSym, False):
+    # torch.ops._C.rms_norm_dynamic_per_token_quant.default,  # noqa
+    # FusedRMSQuantKey(kFp8DynamicTokenSym, True):
+    # torch.ops._C.rms_norm_dynamic_per_token_quant.default,  # noqa
 }
...
...
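The change above leaves both QUANT_OPS and FUSED_OPS empty by commenting out every entry. The commit does not state why. One plausible reading, which is an assumption here, is that the FP8 ops are not present in the target build, so referencing them while the module loads would fail. A minimal sketch of an alternative that fills such a table only with ops that actually exist is shown below. The helper register_available, the string keys, and the fake_ops stand-in are hypothetical names for illustration and are not part of vllm or torch; the sketch also assumes that a missing op surfaces as an AttributeError on attribute lookup.

```python
from types import SimpleNamespace
from typing import Any, Dict


def register_available(namespace: Any, wanted: Dict[str, str]) -> Dict[str, Any]:
    """Return {key: op} for every op name in `wanted` that exists on `namespace`.

    Hypothetical helper: names that are missing are skipped instead of
    raising, so the resulting table may be partially or completely empty.
    """
    registry: Dict[str, Any] = {}
    for key, op_name in wanted.items():
        # getattr with a default swallows AttributeError for missing names.
        op = getattr(namespace, op_name, None)
        if op is not None:
            registry[key] = op
    return registry


# Hypothetical stand-in for an op namespace built without FP8 kernels.
fake_ops = SimpleNamespace(rms_norm=lambda x: x)

# Both requested FP8 ops are absent from fake_ops, so nothing is registered.
quant_ops_sketch = register_available(
    fake_ops,
    {
        "fp8_static": "static_scaled_fp8_quant",
        "fp8_dynamic": "dynamic_scaled_fp8_quant",
    },
)

# An op that does exist is registered under its key.
norm_ops_sketch = register_available(fake_ops, {"rms": "rms_norm"})
```

Any code that indexes such a table directly would still need to handle a missing key, which is equally true of the emptied dictionaries in this commit.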