Unverified commit ed939f08, authored by Biswa Panda, committed by GitHub

docs(recipes): add experimental WIP note to Kimi-K2.5 recipe (#7381)

parent 4ef8b8e6
@@ -42,7 +42,7 @@ These recipes demonstrate aggregated or disaggregated serving:
 | **[DeepSeek-R1](deepseek-r1/sglang/disagg-16gpu/)** | SGLang | Disagg WideEP | 32x H200 | ✅ | ❌ | TP=16, multi-node. Use `model-download-sglang.yaml` | ❌ |
 | **[DeepSeek-R1](deepseek-r1/trtllm/disagg/wide_ep/gb200/)** | TensorRT-LLM | Disagg WideEP (GB200) | 36x GB200 | ✅ | ✅ | Multi-node: 8 decode + 1 prefill nodes | ❌ |
 | **[DeepSeek-R1](deepseek-r1/)** | vLLM | Disagg DEP16 | 32x H200 | ✅ | ❌ | Multi-node, data-expert parallel | ❌ |
-| **[Kimi-K2.5](kimi-k2.5/)** | TensorRT-LLM | Aggregated | 8x B200 | ✅ | ❌ | MoE model, TP8×EP8, reasoning + tool calling | ❌ |
+| **[Kimi-K2.5](kimi-k2.5/)** 🚧 | TensorRT-LLM | Aggregated | 8x B200 | ✅ | ❌ | Experimental — MoE model, TP8×EP8, reasoning + tool calling | ❌ |
 **Legend:**
 - **Deployment**: ✅ = Complete `deploy.yaml` manifest available
......
 # Kimi-K2.5 Recipes
+> 🚧 **Work-in-Progress — Experimental Recipe**
+>
+> The TensorRT-LLM Python package used for Dynamo's TRT-LLM integration does not yet include
+> native Kimi K2.5 support. This recipe is an **experimental** effort to bring
+> Kimi K2.5 to Dynamo ahead of upstream availability. It requires patching the container image on top of the released Dynamo image.
 Deployment recipe for **Kimi-K2.5** using TensorRT-LLM with Dynamo's KV-aware routing.
 ## Available Configurations
......