Unverified Commit a0c7025d authored by Alec's avatar Alec Committed by GitHub
Browse files

docs: add GLM-5 NVFP4 recipe to recipes README (#8098)


Co-authored-by: default avatarClaude Opus 4.6 (1M context) <noreply@anthropic.com>
parent 1284237e
...@@ -65,6 +65,7 @@ These recipes are under active development and may require additional setup step ...@@ -65,6 +65,7 @@ These recipes are under active development and may require additional setup step
| Model | Framework | Mode | GPUs | Deployment | Notes | | Model | Framework | Mode | GPUs | Deployment | Notes |
|-------|-----------|------|------|------------|-------| |-------|-----------|------|------|------------|-------|
| **[GLM-5-NVFP4](glm-5-nvfp4/sglang/disagg/)** | SGLang | Disagg Prefill/Decode | 20x GB200 | ✅ | NVFP4, EAGLE speculative decoding, TP16 decode + TP4 prefill. Requires [custom container build](glm-5-nvfp4/). |
| **[nvidia/Kimi-K2.5-NVFP4](kimi-k2.5/trtllm/agg/nvidia/)** | TensorRT-LLM | Aggregated | 8x B200 | ✅ | Text only — MoE model, TP8×EP8, reasoning + tool calling. Requires [container patch](kimi-k2.5/trtllm/agg/nvidia/patch/). Vision input not yet functional with the patch. | | **[nvidia/Kimi-K2.5-NVFP4](kimi-k2.5/trtllm/agg/nvidia/)** | TensorRT-LLM | Aggregated | 8x B200 | ✅ | Text only — MoE model, TP8×EP8, reasoning + tool calling. Requires [container patch](kimi-k2.5/trtllm/agg/nvidia/patch/). Vision input not yet functional with the patch. |
## Recipe Structure ## Recipe Structure
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment