Commit b9cbee0b authored by ZichengMa, committed by GitHub

docs: Add note for LMCache ARM support (#2535)


Co-authored-by: Dmitry Tokarev <dtokarev@nvidia.com>
parent e5b6a054
@@ -11,6 +11,10 @@ This document describes how LMCache is integrated into Dynamo's vLLM backend to
- **Memory Offloading**: Intelligent KV cache placement across CPU/GPU/storage tiers
- **Improved Throughput**: Reduced GPU memory pressure enables higher batch sizes
## Platform Support
**Important Note**: LMCache integration currently only supports x86 architecture. ARM64 is not supported at this time.
## Aggregated Serving
...
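
The platform note added in this commit could be checked before enabling LMCache, so that an ARM64 host fails early with a clear message. The following is only a minimal sketch: the function name `lmcache_supported` and the set of accepted machine strings are assumptions made for illustration and are not part of Dynamo or LMCache. It also assumes that "x86" in the note means 64-bit x86 as reported by Python's standard `platform.machine()`.

```python
import platform
from typing import Optional

# Assumption: machine strings commonly reported for 64-bit x86 hosts.
X86_64_NAMES = {"x86_64", "amd64"}


def lmcache_supported(machine: Optional[str] = None) -> bool:
    """Return True if the given (or current) machine string looks like x86-64.

    Hypothetical helper, not a Dynamo or LMCache API.
    """
    name = machine if machine is not None else platform.machine()
    return name.lower() in X86_64_NAMES


if __name__ == "__main__":
    if lmcache_supported():
        print("x86-64 host detected: LMCache integration may be enabled")
    else:
        print(f"{platform.machine()} host detected: LMCache integration is not supported")
```

Passing the machine string explicitly, for example `lmcache_supported("aarch64")`, makes the check easy to exercise without an ARM64 machine.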