[Bugfix] fix device_name for routing replay (#34336)

Signed-off-by: liyongwen <1310439159@qq.com>

Commit c6ca5159 authored Feb 26, 2026 by Li-Yongwen, committed by GitHub Feb 26, 2026

Showing 1 changed file with 2 additions and 1 deletion

vllm/model_executor/layers/fused_moe/routed_experts_capturer.py +2 -1

--- a/vllm/model_executor/layers/fused_moe/routed_experts_capturer.py
+++ b/vllm/model_executor/layers/fused_moe/routed_experts_capturer.py
@@ -20,6 +20,7 @@ import torch
 from vllm.config import VllmConfig
 from vllm.distributed import get_tensor_model_parallel_rank
 from vllm.forward_context import get_forward_context
+from vllm.platforms import current_platform
 logger = logging.getLogger(__name__)
@@ -132,7 +133,7 @@ class RoutedExpertsCapturer:
        self._device_buffer = torch.zeros(
            (max_num_batched_tokens, num_layers, num_experts_per_tok),
            dtype=torch.int32,
-            device="cuda",
+            device=current_platform.device_type,
        )
        self.dp_rank = vllm_config.parallel_config.data_parallel_rank
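
The change above swaps a hard-coded "cuda" device string for one read from the active platform. A minimal, dependency-free sketch of that pattern follows. `Platform` and `buffer_spec` are hypothetical names used only for illustration; `Platform` stands in for `vllm.platforms.current_platform`, and the device strings in the usage lines are assumed example values, not values taken from this commit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Platform:
    """Hypothetical stand-in for vllm.platforms.current_platform."""

    device_type: str


def buffer_spec(platform, max_num_batched_tokens, num_layers, num_experts_per_tok):
    """Collect the arguments for the capture buffer allocation.

    The device comes from the platform object instead of a literal "cuda",
    so the same code path works on whichever backend is active.
    """
    return {
        "size": (max_num_batched_tokens, num_layers, num_experts_per_tok),
        "dtype": "int32",
        "device": platform.device_type,
    }


# Usage: the same call yields a backend-appropriate device string.
cuda_spec = buffer_spec(Platform("cuda"), 8, 2, 4)
other_spec = buffer_spec(Platform("xpu"), 8, 2, 4)
print(cuda_spec["device"], other_spec["device"])
```

In the patched file the equivalent values are passed straight to `torch.zeros`, as shown in the diff; the sketch only isolates the idea that the device string is looked up rather than fixed.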