@@ -9,8 +9,6 @@ The Dynamo KV Block Manager (KVBM) is a scalable runtime component designed to h
KVBM is modular and can be used standalone via `pip install kvbm` or as the memory management component in the full Dynamo stack. This guide covers installation, configuration, and deployment of the Dynamo KV Block Manager (KVBM) and other KV cache management systems.
## Quick Start
## Run KVBM Standalone
KVBM can be used independently without using the rest of the Dynamo stack:
...
...
@@ -32,8 +30,23 @@ To build KVBM from source, see the detailed instructions in the [KVBM bindings R
```bash
# Start up etcd for KVBM leader/worker registration and discovery
docker compose -f deploy/docker-compose.yml up -d
```
Pick one of the following to get a Dynamo vLLM container with KVBM built in. The subsequent serving commands are the same either way.
**Option A: Pre-built NGC container (recommended for quick start)**
```bash
docker run --gpus all --network host --rm-it nvcr.io/nvidia/ai-dynamo/vllm-runtime:1.0.0
```
See the [Local Installation Guide](../../getting-started/local-installation.md) for full setup instructions and [Release Artifacts](../../reference/release-artifacts.md#container-images) for available versions.
**Option B: Build from source**
```bash
# Build a dynamo vLLM container (KVBM is built in by default)
# NOTE: render.py defaults to --platform linux/amd64. On ARM64 hosts, pass --platform linux/arm64.
# Start up etcd for KVBM leader/worker registration and discovery
docker compose -f deploy/docker-compose.yml up -d
```
Pick one of the following to get a Dynamo TensorRT-LLM container with KVBM built in. The subsequent serving commands are the same either way.
**Option A: Pre-built NGC container (recommended for quick start)**
```bash
docker run --gpus all --network host --rm-it nvcr.io/nvidia/ai-dynamo/tensorrtllm-runtime:1.0.0
```
See the [Local Installation Guide](../../getting-started/local-installation.md) for full setup instructions and [Release Artifacts](../../reference/release-artifacts.md#container-images) for available versions.
**Option B: Build from source**
```bash
# Build a dynamo TRTLLM container (KVBM is built in by default)
# NOTE: render.py defaults to --platform linux/amd64. On ARM64 hosts, pass --platform linux/arm64.