Commit 4470ee2f authored by Alexander Matveev, committed by GitHub

[Perf] Enable separate shared_experts stream only for CUDA (#30085)


Signed-off-by: Alexander Matveev <amatveev@redhat.com>
parent 690cc3ef
@@ -863,7 +863,8 @@ class FusedMoE(CustomOp):
         use_chunked_impl: bool,
     ) -> tuple[bool, torch.Tensor | None]:
         use_shared_experts_stream = (
-            has_separate_shared_experts
+            current_platform.is_cuda()
+            and has_separate_shared_experts
             and not use_chunked_impl
             and self.shared_experts_stream is not None
             and (
...
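
The effect of the added check can be illustrated with a minimal standalone sketch. This is not the vLLM implementation: the function name and its boolean parameters are hypothetical stand-ins for the values used in the diff above, and `remaining_condition` is a placeholder for the parenthesized clause that the diff truncates. The sketch only shows that, because `and` short-circuits, a non-CUDA platform now disables the separate stream regardless of the other conditions.

```python
def should_use_shared_experts_stream(
    is_cuda: bool,
    has_separate_shared_experts: bool,
    use_chunked_impl: bool,
    stream_available: bool,
    remaining_condition: bool = True,
) -> bool:
    """Hypothetical stand-in for the gating predicate shown in the diff.

    `is_cuda` plays the role of current_platform.is_cuda(), and
    `stream_available` the role of `self.shared_experts_stream is not None`.
    `remaining_condition` is a placeholder for the clause elided in the diff;
    its real contents are not known from the commit page.
    """
    return (
        is_cuda  # new check: evaluated first, so non-CUDA platforms exit early
        and has_separate_shared_experts
        and not use_chunked_impl
        and stream_available
        and remaining_condition
    )


# Same inputs, differing only in platform.
on_cuda = should_use_shared_experts_stream(True, True, False, True)
off_cuda = should_use_shared_experts_stream(False, True, False, True)
```

Under these assumptions, `on_cuda` is true and `off_cuda` is false, which matches the commit title: the separate shared_experts stream is enabled only for CUDA.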