Commit 4470ee2f, authored by Alexander Matveev, committed by GitHub

[Perf] Enable separate shared_experts stream only for CUDA (#30085)


Signed-off-by: Alexander Matveev <amatveev@redhat.com>
parent 690cc3ef
@@ -863,7 +863,8 @@ class FusedMoE(CustomOp):
         use_chunked_impl: bool,
     ) -> tuple[bool, torch.Tensor | None]:
         use_shared_experts_stream = (
-            has_separate_shared_experts
+            current_platform.is_cuda()
+            and has_separate_shared_experts
             and not use_chunked_impl
             and self.shared_experts_stream is not None
             and (
......
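The effect of the hunk above can be illustrated with a small standalone sketch. It is not the vLLM implementation: the function name `should_use_shared_experts_stream` and all of its parameters are hypothetical stand-ins for the values used in the diff, and `remaining_checks` is a placeholder for the parenthesized condition that the diff view elides, whose contents are not shown here.

```python
from typing import Optional


def should_use_shared_experts_stream(
    is_cuda: bool,
    has_separate_shared_experts: bool,
    use_chunked_impl: bool,
    shared_experts_stream: Optional[object],
    remaining_checks: bool = True,
) -> bool:
    """Hypothetical mirror of the boolean gate in the diff.

    `is_cuda` stands in for current_platform.is_cuda(), and
    `remaining_checks` stands in for the elided trailing `and (...)`
    clause, which this sketch does not attempt to reproduce.
    """
    return bool(
        is_cuda  # new in this commit: checked first, so non-CUDA short-circuits
        and has_separate_shared_experts
        and not use_chunked_impl
        and shared_experts_stream is not None
        and remaining_checks
    )


# Same inputs, differing only in the platform flag.
stream = object()  # placeholder for a real stream handle
on_cuda = should_use_shared_experts_stream(True, True, False, stream)
off_cuda = should_use_shared_experts_stream(False, True, False, stream)
```

Putting the platform check first means that on non-CUDA platforms the rest of the condition is never evaluated, so the separate stream is skipped even when a stream object exists.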