Unverified commit edda76b4, authored by Alec, committed by GitHub

docs: Add results video to agg round robin vs disagg kv router ReadMe.md (#5022)


Signed-off-by: alec-flowers <aflowers@nvidia.com>
Signed-off-by: Alec <35311602+alec-flowers@users.noreply.github.com>
parent f30089aa
@@ -2,6 +2,11 @@
This recipe demonstrates the performance difference between **aggregated (round-robin)** and **disaggregated (KV-aware)** routing using a real-world conversation trace dataset from the [Mooncake FAST25 paper](https://github.com/kvcache-ai/Mooncake).
## Results
https://github.com/user-attachments/assets/c425002b-4459-47c4-bfca-fd1e2620500c
## Experiment Overview
We compare two deployment modes on **16x H200 GPUs across 2 nodes**:
......