Commit a289695c authored by mohammedabdulwahhab, committed by GitHub

fix: consolidate dyn_discovery_backend and dyn_kv_store (#6167)


Signed-off-by: mohammedabdulwahhab <furkhan324@berkeley.edu>
parent 0c83585a
......@@ -164,25 +164,25 @@ Dynamo provides a simple way to spin up a local set of inference components incl
Start the frontend:
> **Tip:** To run in a single terminal (useful in containers), append `> logfile.log 2>&1 &` to run processes in background. Example: `python3 -m dynamo.frontend --store-kv file > dynamo.frontend.log 2>&1 &`
> **Tip:** To run in a single terminal (useful in containers), append `> logfile.log 2>&1 &` to run processes in background. Example: `python3 -m dynamo.frontend --discovery-backend file > dynamo.frontend.log 2>&1 &`
```bash
# Start an OpenAI compatible HTTP server with prompt templating, tokenization, and routing.
# For local dev: --store-kv file avoids etcd (workers and frontend must share a disk)
python3 -m dynamo.frontend --http-port 8000 --store-kv file
# For local dev: --discovery-backend file avoids etcd (workers and frontend must share a disk)
python3 -m dynamo.frontend --http-port 8000 --discovery-backend file
```
In another terminal (or same terminal if using background mode), start a worker for your chosen backend:
```bash
# SGLang
python3 -m dynamo.sglang --model-path Qwen/Qwen3-0.6B --store-kv file
python3 -m dynamo.sglang --model-path Qwen/Qwen3-0.6B --discovery-backend file
# TensorRT-LLM
python3 -m dynamo.trtllm --model-path Qwen/Qwen3-0.6B --store-kv file
python3 -m dynamo.trtllm --model-path Qwen/Qwen3-0.6B --discovery-backend file
# vLLM (note: uses --model, not --model-path)
python3 -m dynamo.vllm --model Qwen/Qwen3-0.6B --store-kv file \
python3 -m dynamo.vllm --model Qwen/Qwen3-0.6B --discovery-backend file \
--kv-events-config '{"enable_kv_cache_events": false}'
```
......@@ -335,7 +335,7 @@ python3 -m dynamo.frontend
## 9. Configure for Local Development
- Pass `--store-kv file` to avoid external dependencies (see [Service Discovery and Messaging](#service-discovery-and-messaging))
- Pass `--discovery-backend file` to avoid external dependencies (see [Service Discovery and Messaging](#service-discovery-and-messaging))
- Set `DYN_LOG` to adjust the logging level (e.g., `export DYN_LOG=debug`). Uses the same syntax as `RUST_LOG`
> **Note:** VSCode and Cursor users can use the `.devcontainer` folder for a pre-configured dev environment. See the [devcontainer README](.devcontainer/README.md) for details.
......@@ -365,7 +365,7 @@ Dynamo uses TCP for inter-component communication. On Kubernetes, native resourc
| Deployment | etcd | NATS | Notes |
|------------|------|------|-------|
| **Local Development** | ❌ Not required | ❌ Not required | Pass `--store-kv file`; vLLM also needs `--kv-events-config '{"enable_kv_cache_events": false}'` |
| **Local Development** | ❌ Not required | ❌ Not required | Pass `--discovery-backend file`; vLLM also needs `--kv-events-config '{"enable_kv_cache_events": false}'` |
| **Kubernetes** | ❌ Not required | ❌ Not required | K8s-native discovery; TCP request plane |
> **Note:** KV-Aware Routing requires NATS for prefix caching coordination.
......
......@@ -15,7 +15,7 @@ class DynamoRuntimeConfig(ConfigBase):
"""Configuration for Dynamo runtime (common across all backends)."""
namespace: str
store_kv: str
discovery_backend: str
request_plane: str
event_plane: str
connector: list[str]
......@@ -54,11 +54,11 @@ class DynamoRuntimeArgGroup(ArgGroup):
)
add_argument(
g,
flag_name="--store-kv",
env_var="DYN_STORE_KV",
flag_name="--discovery-backend",
env_var="DYN_DISCOVERY_BACKEND",
default="etcd",
help="Which key-value backend to use: etcd, mem, file. Etcd uses the ETCD_* env vars (e.g. ETCD_ENDPOINTS) for connection details. File uses root dir from env var DYN_FILE_KV or defaults to $TMPDIR/dynamo_store_kv.",
choices=["etcd", "file", "mem"],
help="Discovery backend: kubernetes (K8s API), etcd (distributed KV), file (local filesystem), mem (in-memory). Etcd uses the ETCD_* env vars (e.g. ETCD_ENDPOINTS) for connection details. File uses root dir from env var DYN_FILE_KV or defaults to $TMPDIR/dynamo_store_kv.",
choices=["kubernetes", "etcd", "file", "mem"],
)
add_argument(
g,
......
......@@ -62,7 +62,7 @@ async def graceful_shutdown(
def create_runtime(
store_kv: str,
discovery_backend: str,
request_plane: str,
event_plane: str,
use_kv_events: bool,
......@@ -74,7 +74,7 @@ def create_runtime(
creates the runtime, and installs SIGTERM/SIGINT handlers.
Args:
store_kv: Key-value backend type (etcd, file, mem).
discovery_backend: Discovery backend type (kubernetes, etcd, file, mem).
request_plane: Request distribution method (nats, http, tcp).
event_plane: Event publishing method (nats, zmq).
use_kv_events: Whether KV events are enabled.
......@@ -89,7 +89,7 @@ def create_runtime(
enable_nats = request_plane == "nats" or (event_plane == "nats" and use_kv_events)
runtime = DistributedRuntime(loop, store_kv, request_plane, enable_nats)
runtime = DistributedRuntime(loop, discovery_backend, request_plane, enable_nats)
def signal_handler():
asyncio.create_task(graceful_shutdown(runtime, shutdown_event))
......
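The `enable_nats` expression in the `create_runtime` hunk above decides whether NATS is initialized, and it does not depend on the renamed `discovery_backend` argument. As a minimal sketch, the same boolean can be pulled into a standalone function to make its cases explicit. The name `needs_nats` is hypothetical, used only for illustration, and is not part of this commit.

```python
def needs_nats(request_plane: str, event_plane: str, use_kv_events: bool) -> bool:
    """Mirror the enable_nats expression shown in create_runtime.

    Hypothetical helper: NATS is needed when it carries requests, or when it
    carries KV events and KV events are actually enabled.
    """
    return request_plane == "nats" or (event_plane == "nats" and use_kv_events)


if __name__ == "__main__":
    # Print one line per combination of interest.
    cases = [
        ("tcp", "nats", True),
        ("tcp", "nats", False),
        ("tcp", "zmq", True),
        ("nats", "zmq", False),
    ]
    for request_plane, event_plane, use_kv_events in cases:
        result = needs_nats(request_plane, event_plane, use_kv_events)
        print(f"{request_plane=} {event_plane=} {use_kv_events=} -> {result}")
```

Under this reading, a TCP request plane with a ZMQ event plane never enables NATS, which matches the documentation hunk stating that the router can run without etcd or NATS when combined with `--discovery-backend file` or `mem`.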
......@@ -77,7 +77,7 @@ class FrontendConfig(ConfigBase):
grpc_metrics_port: int
dump_config_to: Optional[str]
store_kv: str
discovery_backend: str
request_plane: str
event_plane: str
chat_processor: str
......@@ -463,15 +463,15 @@ class FrontendArgGroup(ArgGroup):
add_argument(
g,
flag_name="--store-kv",
env_var="DYN_STORE_KV",
flag_name="--discovery-backend",
env_var="DYN_DISCOVERY_BACKEND",
default="etcd",
help=(
"Which key-value backend to use: etcd, mem, file. Etcd uses the ETCD_* env vars "
"(e.g. ETCD_ENDPOINTS) for connection details. File uses root dir from env var "
"DYN_FILE_KV or defaults to $TMPDIR/dynamo_store_kv."
"Discovery backend: kubernetes (K8s API), etcd (distributed KV), file (local filesystem), "
"mem (in-memory). Etcd uses the ETCD_* env vars (e.g. ETCD_ENDPOINTS) for connection details. "
"File uses root dir from env var DYN_FILE_KV or defaults to $TMPDIR/dynamo_store_kv."
),
choices=["etcd", "file", "mem"],
choices=["kubernetes", "etcd", "file", "mem"],
)
add_argument(
g,
......
......@@ -166,7 +166,7 @@ async def async_main():
loop = asyncio.get_running_loop()
runtime = DistributedRuntime(
loop, config.store_kv, config.request_plane, enable_nats
loop, config.discovery_backend, config.request_plane, enable_nats
)
def signal_handler():
......
......@@ -327,11 +327,11 @@ def parse_args():
),
)
parser.add_argument(
"--store-kv",
"--discovery-backend",
type=str,
choices=["etcd", "file", "mem"],
default=os.environ.get("DYN_STORE_KV", "etcd"),
help="Which key-value backend to use: etcd, mem, file. Etcd uses the ETCD_* env vars (e.g. ETCD_ENDPOINTS) for connection details. File uses root dir from env var DYN_FILE_KV or defaults to $TMPDIR/dynamo_store_kv.",
choices=["kubernetes", "etcd", "file", "mem"],
default=os.environ.get("DYN_DISCOVERY_BACKEND", "etcd"),
help="Discovery backend: kubernetes (K8s API), etcd (distributed KV), file (local filesystem), mem (in-memory). Etcd uses the ETCD_* env vars (e.g. ETCD_ENDPOINTS) for connection details. File uses root dir from env var DYN_FILE_KV or defaults to $TMPDIR/dynamo_store_kv.",
)
parser.add_argument(
"--request-plane",
......
......@@ -159,7 +159,7 @@ async def launch_workers(args, extra_engine_args_path):
logger.info(f"Creating mocker worker {worker_id + 1}/{args.num_workers}")
# Create a separate DistributedRuntime for this worker (on same event loop)
runtime = DistributedRuntime(loop, args.store_kv, args.request_plane)
runtime = DistributedRuntime(loop, args.discovery_backend, args.request_plane)
runtimes.append(runtime)
# Determine which engine args file to use
......
......@@ -98,12 +98,12 @@ DYNAMO_ARGS: Dict[str, Dict[str, Any]] = {
"default": None,
"help": "Dump debug config to the specified file path. If not specified, the config will be dumped to stdout at INFO level.",
},
"store-kv": {
"flags": ["--store-kv"],
"discovery-backend": {
"flags": ["--discovery-backend"],
"type": str,
"choices": ["etcd", "file", "mem"],
"default": os.environ.get("DYN_STORE_KV", "etcd"),
"help": "Which key-value backend to use: etcd, mem, file. Etcd uses the ETCD_* env vars (e.g. ETCD_ENDPOINTS) for connection details. File uses root dir from env var DYN_FILE_KV or defaults to $TMPDIR/dynamo_store_kv.",
"choices": ["kubernetes", "etcd", "file", "mem"],
"default": os.environ.get("DYN_DISCOVERY_BACKEND", "etcd"),
"help": "Discovery backend: kubernetes (K8s API), etcd (distributed KV), file (local filesystem), mem (in-memory). Etcd uses the ETCD_* env vars (e.g. ETCD_ENDPOINTS) for connection details. File uses root dir from env var DYN_FILE_KV or defaults to $TMPDIR/dynamo_store_kv.",
},
"request-plane": {
"flags": ["--request-plane"],
......@@ -157,7 +157,7 @@ class DynamoArgs:
namespace: str
component: str
endpoint: str
store_kv: str
discovery_backend: str
request_plane: str
event_plane: str
......@@ -595,7 +595,7 @@ async def parse_args(args: list[str]) -> Config:
namespace=parsed_namespace,
component=parsed_component_name,
endpoint=parsed_endpoint_name,
store_kv=parsed_args.store_kv,
discovery_backend=parsed_args.discovery_backend,
request_plane=parsed_args.request_plane,
event_plane=parsed_args.event_plane,
tool_call_parser=tool_call_parser,
......
......@@ -203,7 +203,7 @@ async def worker():
dynamo_args = config.dynamo_args
runtime, loop = create_runtime(
store_kv=dynamo_args.store_kv,
discovery_backend=dynamo_args.discovery_backend,
request_plane=dynamo_args.request_plane,
event_plane=dynamo_args.event_plane,
use_kv_events=dynamo_args.use_kv_events,
......
......@@ -29,7 +29,7 @@ class DiffusionConfig:
namespace: str = DYN_NAMESPACE
component: str = "diffusion"
endpoint: str = "generate"
store_kv: str = "etcd"
discovery_backend: str = "etcd"
request_plane: str = "tcp"
event_plane: str = "nats"
......
......@@ -31,7 +31,7 @@ async def worker():
shutdown_event = asyncio.Event()
runtime, _ = create_runtime(
store_kv=config.store_kv,
discovery_backend=config.discovery_backend,
request_plane=config.request_plane,
event_plane=config.event_plane,
use_kv_events=config.use_kv_events,
......
......@@ -65,7 +65,7 @@ class Config:
self.dump_config_to: Optional[str] = None
self.custom_jinja_template: Optional[str] = None
self.dyn_endpoint_types: str = "chat,completions"
self.store_kv: str = ""
self.discovery_backend: str = ""
self.request_plane: str = ""
self.event_plane: str = ""
self.enable_local_indexer: bool = True
......@@ -124,7 +124,7 @@ class Config:
f"tool_call_parser={self.tool_call_parser}, "
f"dump_config_to={self.dump_config_to}, "
f"custom_jinja_template={self.custom_jinja_template}, "
f"store_kv={self.store_kv}, "
f"discovery_backend={self.discovery_backend}, "
f"request_plane={self.request_plane}, "
f"event_plane={self.event_plane}, "
f"enable_local_indexer={self.enable_local_indexer}, "
......@@ -335,11 +335,11 @@ def cmd_line_args():
help="Comma-separated list of endpoint types to enable. Options: 'chat', 'completions'. Default: 'chat,completions'. Use 'completions' for models without chat templates.",
)
parser.add_argument(
"--store-kv",
"--discovery-backend",
type=str,
choices=["etcd", "file", "mem"],
default=os.environ.get("DYN_STORE_KV", "etcd"),
help="Which key-value backend to use: etcd, mem, file. Etcd uses the ETCD_* env vars (e.g. ETCD_ENDPOINTS) for connection details. File uses root dir from env var DYN_FILE_KV or defaults to $TMPDIR/dynamo_store_kv.",
choices=["kubernetes", "etcd", "file", "mem"],
default=os.environ.get("DYN_DISCOVERY_BACKEND", "etcd"),
help="Discovery backend: kubernetes (K8s API), etcd (distributed KV), file (local filesystem), mem (in-memory). Etcd uses the ETCD_* env vars (e.g. ETCD_ENDPOINTS) for connection details. File uses root dir from env var DYN_FILE_KV or defaults to $TMPDIR/dynamo_store_kv.",
)
parser.add_argument(
"--request-plane",
......@@ -551,7 +551,7 @@ def cmd_line_args():
config.tool_call_parser = args.dyn_tool_call_parser
config.dump_config_to = args.dump_config_to
config.dyn_endpoint_types = args.dyn_endpoint_types
config.store_kv = args.store_kv
config.discovery_backend = args.discovery_backend
config.request_plane = args.request_plane
config.event_plane = args.event_plane
config.enable_local_indexer = not args.durable_kv_events
......
......@@ -40,7 +40,7 @@ async def init_video_diffusion_worker(
namespace=config.namespace,
component=config.component,
endpoint=config.endpoint,
store_kv=config.store_kv,
discovery_backend=config.discovery_backend,
request_plane=config.request_plane,
event_plane=config.event_plane,
model_path=config.model_path,
......
......@@ -37,7 +37,7 @@ class Config(DynamoRuntimeConfig, DynamoVllmConfig):
is_prefill_worker: bool
is_decode_worker: bool
custom_jinja_template: Optional[str] = None
store_kv: str
discovery_backend: str
request_plane: str
event_plane: str
enable_local_indexer: bool = True
......
......@@ -143,7 +143,7 @@ async def worker():
shutdown_event = asyncio.Event()
runtime, _ = create_runtime(
store_kv=config.store_kv,
discovery_backend=config.discovery_backend,
request_plane=config.request_plane,
event_plane=config.event_plane,
use_kv_events=config.use_kv_events,
......
......@@ -162,7 +162,7 @@ docker compose -f deploy/docker-compose.yml up -d
```
> [!NOTE]
> - **etcd** is optional but is the default local discovery backend. You can also use `--kv_store file` to use file system based discovery.
> - **etcd** is optional but is the default local discovery backend. You can also use `--discovery-backend file` to use file system based discovery.
> - **NATS** is optional - only needed if using KV routing with events (default). You can disable it with `--no-kv-events` flag for prediction-based routing
> - **On Kubernetes**, neither is required when using the Dynamo operator, which explicitly sets `DYN_DISCOVERY_BACKEND=kubernetes` to enable native K8s service discovery (DynamoWorkerMetadata CRD)
......
......@@ -71,7 +71,7 @@ docker compose -f deploy/docker-compose.yml up -d
```
> [!NOTE]
> - **etcd** is optional but is the default local discovery backend. You can also use `--kv_store file` to use file system based discovery.
> - **etcd** is optional but is the default local discovery backend. You can also use `--discovery-backend file` to use file system based discovery.
> - **NATS** is optional - only needed if using KV routing with events (default). You can disable it with `--no-kv-events` flag for prediction-based routing
> - **On Kubernetes**, neither is required when using the Dynamo operator, which explicitly sets `DYN_DISCOVERY_BACKEND=kubernetes` to enable native K8s service discovery (DynamoWorkerMetadata CRD)
......
......@@ -66,7 +66,7 @@ docker compose -f deploy/docker-compose.yml up -d
```
> [!NOTE]
> - **etcd** is optional but is the default local discovery backend. You can also use `--kv_store file` to use file system based discovery.
> - **etcd** is optional but is the default local discovery backend. You can also use `--discovery-backend file` to use file system based discovery.
> - **NATS** is optional - only needed if using KV routing with events (default). You can disable it with `--no-kv-events` flag for prediction-based routing
> - **On Kubernetes**, neither is required when using the Dynamo operator, which explicitly sets `DYN_DISCOVERY_BACKEND=kubernetes` to enable native K8s service discovery (DynamoWorkerMetadata CRD)
......
......@@ -187,7 +187,7 @@ The main KV-aware routing arguments:
> - **No KV events** (`--no-kv-events`): State persistence is not supported.
>
> **Request plane is independent of KV event transport.**
> The request plane (`DYN_REQUEST_PLANE` / `--request-plane`) controls how requests reach workers (TCP/HTTP/NATS), while KV events travel over a separate path. KV events use NATS in JetStream or NATS Core modes, or ZMQ when `--event-plane zmq` is set. With `--event-plane zmq` and `--store-kv file` or `mem`, the router can run entirely without etcd or NATS. When using a NATS-based event plane (the default), NATS is initialized automatically; set `NATS_SERVER=nats://...` to override the default `localhost:4222`. Use `--no-kv-events` to disable KV event transport entirely.
> The request plane (`DYN_REQUEST_PLANE` / `--request-plane`) controls how requests reach workers (TCP/HTTP/NATS), while KV events travel over a separate path. KV events use NATS in JetStream or NATS Core modes, or ZMQ when `--event-plane zmq` is set. With `--event-plane zmq` and `--discovery-backend file` or `mem`, the router can run entirely without etcd or NATS. When using a NATS-based event plane (the default), NATS is initialized automatically; set `NATS_SERVER=nats://...` to override the default `localhost:4222`. Use `--no-kv-events` to disable KV event transport entirely.
>
> When `--kv-overlap-score-weight` is set to 0, no KVIndexer is created and prefix matching is disabled (pure load balancing). When `--no-kv-events` is set, a KVIndexer is still created but no event subscriber is launched to consume KV events from workers. Instead, the router predicts cache state based on its own routing decisions with TTL-based expiration and pruning.
>
......
......@@ -18,7 +18,7 @@ The discovery backend adapts to the deployment environment.
| **Kubernetes** (with Dynamo operator) | Native K8s (CRDs, EndpointSlices) | Operator sets `DYN_DISCOVERY_BACKEND=kubernetes` |
| **Bare metal / Local** (default) | etcd | `ETCD_ENDPOINTS` (defaults to `http://localhost:2379`) |
> **Note:** The runtime always defaults to etcd (`kv_store`). Kubernetes discovery must be explicitly enabled -- the Dynamo operator handles this automatically.
> **Note:** The runtime always defaults to etcd. Kubernetes discovery must be explicitly enabled -- the Dynamo operator handles this automatically.
## Kubernetes Discovery
......@@ -48,7 +48,7 @@ When running on Kubernetes with the Dynamo operator, service discovery uses nati
## etcd Discovery (Default)
When `DYN_DISCOVERY_BACKEND` is not set (or set to `kv_store`), etcd is used for service discovery.
When `DYN_DISCOVERY_BACKEND` is not set (or set to `etcd`), etcd is used for service discovery.
### Connection Configuration
......
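Several hunks in this commit define `--discovery-backend` with `argparse`, taking the default from `DYN_DISCOVERY_BACKEND` and falling back to `etcd`. The sketch below reproduces that pattern in isolation so the precedence (explicit flag, then environment variable, then `etcd`) is easy to check. The `build_parser` function and its `env` parameter are assumptions added for testability and do not appear in the commit; the help text is shortened.

```python
import argparse
import os
from typing import Mapping, Optional

# Choices as listed in the new help text of this commit.
BACKENDS = ["kubernetes", "etcd", "file", "mem"]


def build_parser(env: Optional[Mapping[str, str]] = None) -> argparse.ArgumentParser:
    """Build a parser with only the discovery flag (hypothetical helper).

    `env` defaults to os.environ; passing a dict makes the lookup testable.
    """
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--discovery-backend",
        type=str,
        choices=BACKENDS,
        # Environment variable supplies the default; "etcd" is the fallback.
        default=env.get("DYN_DISCOVERY_BACKEND", "etcd"),
        help="Discovery backend: kubernetes, etcd, file, or mem.",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.discovery_backend)
```

Because the default is read when the parser is built, the environment variable must be set before the parser is constructed; a flag passed on the command line overrides it.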