# Disaggregated Serving

This example contains scripts that demonstrate the disaggregated serving features of vLLM.

## Files

- `disagg_proxy_demo.py` - Demonstrates an XpYd setup (X prefill instances, Y decode instances).
- `kv_events.sh` - Demonstrates KV cache event publishing.
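
In an XpYd setup, each request needs one prefill instance and one decode instance. The following is a minimal sketch of how a proxy might pair them, assuming plain round-robin selection. `XpYdRouter`, the URLs, and the ports are hypothetical names chosen for illustration; they are not taken from `disagg_proxy_demo.py` or from the vLLM API.

```python
from dataclasses import dataclass, field
from itertools import cycle
from typing import Iterator


@dataclass
class XpYdRouter:
    """Hypothetical round-robin router for illustration; not part of vLLM."""

    prefill_urls: list[str]
    decode_urls: list[str]
    _prefill: Iterator[str] = field(init=False, repr=False)
    _decode: Iterator[str] = field(init=False, repr=False)

    def __post_init__(self) -> None:
        if not self.prefill_urls or not self.decode_urls:
            raise ValueError("need at least one prefill and one decode instance")
        # Each pool is cycled independently, so X and Y may differ.
        self._prefill = cycle(self.prefill_urls)
        self._decode = cycle(self.decode_urls)

    def route(self) -> tuple[str, str]:
        """Return the (prefill, decode) pair for the next request."""
        return next(self._prefill), next(self._decode)


# Example: 2P1D (two prefill instances, one decode instance).
router = XpYdRouter(
    prefill_urls=["http://localhost:8100", "http://localhost:8101"],
    decode_urls=["http://localhost:8200"],
)
pairs = [router.route() for _ in range(3)]
for prefill, decode in pairs:
    print(f"prefill={prefill} decode={decode}")
```

Round-robin is only the simplest policy. A real proxy could instead select instances by load or by cache locality.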