Unverified commit a289695c, authored by mohammedabdulwahhab, committed by GitHub

fix: consolidate dyn_discovery_backend and dyn_kv_store (#6167)


Signed-off-by: mohammedabdulwahhab <furkhan324@berkeley.edu>
parent 0c83585a
...@@ -164,25 +164,25 @@ Dynamo provides a simple way to spin up a local set of inference components incl ...@@ -164,25 +164,25 @@ Dynamo provides a simple way to spin up a local set of inference components incl
Start the frontend: Start the frontend:
> **Tip:** To run in a single terminal (useful in containers), append `> logfile.log 2>&1 &` to run processes in background. Example: `python3 -m dynamo.frontend --store-kv file > dynamo.frontend.log 2>&1 &` > **Tip:** To run in a single terminal (useful in containers), append `> logfile.log 2>&1 &` to run processes in background. Example: `python3 -m dynamo.frontend --discovery-backend file > dynamo.frontend.log 2>&1 &`
```bash ```bash
# Start an OpenAI compatible HTTP server with prompt templating, tokenization, and routing. # Start an OpenAI compatible HTTP server with prompt templating, tokenization, and routing.
# For local dev: --store-kv file avoids etcd (workers and frontend must share a disk) # For local dev: --discovery-backend file avoids etcd (workers and frontend must share a disk)
python3 -m dynamo.frontend --http-port 8000 --store-kv file python3 -m dynamo.frontend --http-port 8000 --discovery-backend file
``` ```
In another terminal (or same terminal if using background mode), start a worker for your chosen backend: In another terminal (or same terminal if using background mode), start a worker for your chosen backend:
```bash ```bash
# SGLang # SGLang
python3 -m dynamo.sglang --model-path Qwen/Qwen3-0.6B --store-kv file python3 -m dynamo.sglang --model-path Qwen/Qwen3-0.6B --discovery-backend file
# TensorRT-LLM # TensorRT-LLM
python3 -m dynamo.trtllm --model-path Qwen/Qwen3-0.6B --store-kv file python3 -m dynamo.trtllm --model-path Qwen/Qwen3-0.6B --discovery-backend file
# vLLM (note: uses --model, not --model-path) # vLLM (note: uses --model, not --model-path)
python3 -m dynamo.vllm --model Qwen/Qwen3-0.6B --store-kv file \ python3 -m dynamo.vllm --model Qwen/Qwen3-0.6B --discovery-backend file \
--kv-events-config '{"enable_kv_cache_events": false}' --kv-events-config '{"enable_kv_cache_events": false}'
``` ```
...@@ -335,7 +335,7 @@ python3 -m dynamo.frontend ...@@ -335,7 +335,7 @@ python3 -m dynamo.frontend
## 9. Configure for Local Development ## 9. Configure for Local Development
- Pass `--store-kv file` to avoid external dependencies (see [Service Discovery and Messaging](#service-discovery-and-messaging)) - Pass `--discovery-backend file` to avoid external dependencies (see [Service Discovery and Messaging](#service-discovery-and-messaging))
- Set `DYN_LOG` to adjust the logging level (e.g., `export DYN_LOG=debug`). Uses the same syntax as `RUST_LOG` - Set `DYN_LOG` to adjust the logging level (e.g., `export DYN_LOG=debug`). Uses the same syntax as `RUST_LOG`
> **Note:** VSCode and Cursor users can use the `.devcontainer` folder for a pre-configured dev environment. See the [devcontainer README](.devcontainer/README.md) for details. > **Note:** VSCode and Cursor users can use the `.devcontainer` folder for a pre-configured dev environment. See the [devcontainer README](.devcontainer/README.md) for details.
...@@ -365,7 +365,7 @@ Dynamo uses TCP for inter-component communication. On Kubernetes, native resourc ...@@ -365,7 +365,7 @@ Dynamo uses TCP for inter-component communication. On Kubernetes, native resourc
| Deployment | etcd | NATS | Notes | | Deployment | etcd | NATS | Notes |
|------------|------|------|-------| |------------|------|------|-------|
| **Local Development** | ❌ Not required | ❌ Not required | Pass `--store-kv file`; vLLM also needs `--kv-events-config '{"enable_kv_cache_events": false}'` | | **Local Development** | ❌ Not required | ❌ Not required | Pass `--discovery-backend file`; vLLM also needs `--kv-events-config '{"enable_kv_cache_events": false}'` |
| **Kubernetes** | ❌ Not required | ❌ Not required | K8s-native discovery; TCP request plane | | **Kubernetes** | ❌ Not required | ❌ Not required | K8s-native discovery; TCP request plane |
> **Note:** KV-Aware Routing requires NATS for prefix caching coordination. > **Note:** KV-Aware Routing requires NATS for prefix caching coordination.
......
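The README hunks above swap `--store-kv` for `--discovery-backend` on every launch command. Scripts that build these command lines programmatically could apply the same rename with a small helper along the following lines. This is only a sketch: `migrate_argv`, `LEGACY_FLAG`, and `NEW_FLAG` are hypothetical names, not part of Dynamo, and the diff does not say whether the old flag is still accepted as an alias.

```python
# Hypothetical helper (not part of Dynamo): rewrite the legacy --store-kv
# flag to --discovery-backend in an argument list.
LEGACY_FLAG = "--store-kv"
NEW_FLAG = "--discovery-backend"


def migrate_argv(argv: list[str]) -> list[str]:
    """Return a copy of argv with the legacy flag renamed; values are untouched."""
    migrated = []
    for arg in argv:
        if arg == LEGACY_FLAG:
            # Space-separated form: --store-kv file
            migrated.append(NEW_FLAG)
        elif arg.startswith(LEGACY_FLAG + "="):
            # Equals form: --store-kv=file
            migrated.append(NEW_FLAG + arg[len(LEGACY_FLAG):])
        else:
            migrated.append(arg)
    return migrated


old_cmd = ["python3", "-m", "dynamo.frontend", "--http-port", "8000", "--store-kv", "file"]
print(" ".join(migrate_argv(old_cmd)))
```

Run as written, this prints the updated frontend command from the README hunk. The helper only renames the flag; it does not validate the backend value that follows it.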
...@@ -15,7 +15,7 @@ class DynamoRuntimeConfig(ConfigBase): ...@@ -15,7 +15,7 @@ class DynamoRuntimeConfig(ConfigBase):
"""Configuration for Dynamo runtime (common across all backends).""" """Configuration for Dynamo runtime (common across all backends)."""
namespace: str namespace: str
store_kv: str discovery_backend: str
request_plane: str request_plane: str
event_plane: str event_plane: str
connector: list[str] connector: list[str]
...@@ -54,11 +54,11 @@ class DynamoRuntimeArgGroup(ArgGroup): ...@@ -54,11 +54,11 @@ class DynamoRuntimeArgGroup(ArgGroup):
) )
add_argument( add_argument(
g, g,
flag_name="--store-kv", flag_name="--discovery-backend",
env_var="DYN_STORE_KV", env_var="DYN_DISCOVERY_BACKEND",
default="etcd", default="etcd",
help="Which key-value backend to use: etcd, mem, file. Etcd uses the ETCD_* env vars (e.g. ETCD_ENDPOINTS) for connection details. File uses root dir from env var DYN_FILE_KV or defaults to $TMPDIR/dynamo_store_kv.", help="Discovery backend: kubernetes (K8s API), etcd (distributed KV), file (local filesystem), mem (in-memory). Etcd uses the ETCD_* env vars (e.g. ETCD_ENDPOINTS) for connection details. File uses root dir from env var DYN_FILE_KV or defaults to $TMPDIR/dynamo_store_kv.",
choices=["etcd", "file", "mem"], choices=["kubernetes", "etcd", "file", "mem"],
) )
add_argument( add_argument(
g, g,
......
...@@ -62,7 +62,7 @@ async def graceful_shutdown( ...@@ -62,7 +62,7 @@ async def graceful_shutdown(
def create_runtime( def create_runtime(
store_kv: str, discovery_backend: str,
request_plane: str, request_plane: str,
event_plane: str, event_plane: str,
use_kv_events: bool, use_kv_events: bool,
...@@ -74,7 +74,7 @@ def create_runtime( ...@@ -74,7 +74,7 @@ def create_runtime(
creates the runtime, and installs SIGTERM/SIGINT handlers. creates the runtime, and installs SIGTERM/SIGINT handlers.
Args: Args:
store_kv: Key-value backend type (etcd, file, mem). discovery_backend: Discovery backend type (kubernetes, etcd, file, mem).
request_plane: Request distribution method (nats, http, tcp). request_plane: Request distribution method (nats, http, tcp).
event_plane: Event publishing method (nats, zmq). event_plane: Event publishing method (nats, zmq).
use_kv_events: Whether KV events are enabled. use_kv_events: Whether KV events are enabled.
...@@ -89,7 +89,7 @@ def create_runtime( ...@@ -89,7 +89,7 @@ def create_runtime(
enable_nats = request_plane == "nats" or (event_plane == "nats" and use_kv_events) enable_nats = request_plane == "nats" or (event_plane == "nats" and use_kv_events)
runtime = DistributedRuntime(loop, store_kv, request_plane, enable_nats) runtime = DistributedRuntime(loop, discovery_backend, request_plane, enable_nats)
def signal_handler(): def signal_handler():
asyncio.create_task(graceful_shutdown(runtime, shutdown_event)) asyncio.create_task(graceful_shutdown(runtime, shutdown_event))
......
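The `enable_nats` line in the hunk above is untouched by the rename, and the discovery backend does not appear in it: whether NATS is initialized depends only on the request plane, the event plane, and whether KV events are enabled. Restated as a standalone predicate (the function name `needs_nats` is an illustrative assumption; the expression is copied from `create_runtime`):

```python
def needs_nats(request_plane: str, event_plane: str, use_kv_events: bool) -> bool:
    """Mirror of the enable_nats expression in create_runtime."""
    return request_plane == "nats" or (event_plane == "nats" and use_kv_events)


# TCP request plane, NATS event plane, KV events on: NATS is enabled.
print(needs_nats("tcp", "nats", True))   # True
# Same planes with KV events off: NATS is not enabled.
print(needs_nats("tcp", "nats", False))  # False
# ZMQ event plane with a TCP request plane: NATS is not enabled.
print(needs_nats("tcp", "zmq", True))    # False
# A NATS request plane enables NATS regardless of the event settings.
print(needs_nats("nats", "zmq", False))  # True
```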
...@@ -77,7 +77,7 @@ class FrontendConfig(ConfigBase): ...@@ -77,7 +77,7 @@ class FrontendConfig(ConfigBase):
grpc_metrics_port: int grpc_metrics_port: int
dump_config_to: Optional[str] dump_config_to: Optional[str]
store_kv: str discovery_backend: str
request_plane: str request_plane: str
event_plane: str event_plane: str
chat_processor: str chat_processor: str
...@@ -463,15 +463,15 @@ class FrontendArgGroup(ArgGroup): ...@@ -463,15 +463,15 @@ class FrontendArgGroup(ArgGroup):
add_argument( add_argument(
g, g,
flag_name="--store-kv", flag_name="--discovery-backend",
env_var="DYN_STORE_KV", env_var="DYN_DISCOVERY_BACKEND",
default="etcd", default="etcd",
help=( help=(
"Which key-value backend to use: etcd, mem, file. Etcd uses the ETCD_* env vars " "Discovery backend: kubernetes (K8s API), etcd (distributed KV), file (local filesystem), "
"(e.g. ETCD_ENDPOINTS) for connection details. File uses root dir from env var " "mem (in-memory). Etcd uses the ETCD_* env vars (e.g. ETCD_ENDPOINTS) for connection details. "
"DYN_FILE_KV or defaults to $TMPDIR/dynamo_store_kv." "File uses root dir from env var DYN_FILE_KV or defaults to $TMPDIR/dynamo_store_kv."
), ),
choices=["etcd", "file", "mem"], choices=["kubernetes", "etcd", "file", "mem"],
) )
add_argument( add_argument(
g, g,
......
...@@ -166,7 +166,7 @@ async def async_main(): ...@@ -166,7 +166,7 @@ async def async_main():
loop = asyncio.get_running_loop() loop = asyncio.get_running_loop()
runtime = DistributedRuntime( runtime = DistributedRuntime(
loop, config.store_kv, config.request_plane, enable_nats loop, config.discovery_backend, config.request_plane, enable_nats
) )
def signal_handler(): def signal_handler():
......
...@@ -327,11 +327,11 @@ def parse_args(): ...@@ -327,11 +327,11 @@ def parse_args():
), ),
) )
parser.add_argument( parser.add_argument(
"--store-kv", "--discovery-backend",
type=str, type=str,
choices=["etcd", "file", "mem"], choices=["kubernetes", "etcd", "file", "mem"],
default=os.environ.get("DYN_STORE_KV", "etcd"), default=os.environ.get("DYN_DISCOVERY_BACKEND", "etcd"),
help="Which key-value backend to use: etcd, mem, file. Etcd uses the ETCD_* env vars (e.g. ETCD_ENDPOINTS) for connection details. File uses root dir from env var DYN_FILE_KV or defaults to $TMPDIR/dynamo_store_kv.", help="Discovery backend: kubernetes (K8s API), etcd (distributed KV), file (local filesystem), mem (in-memory). Etcd uses the ETCD_* env vars (e.g. ETCD_ENDPOINTS) for connection details. File uses root dir from env var DYN_FILE_KV or defaults to $TMPDIR/dynamo_store_kv.",
) )
parser.add_argument( parser.add_argument(
"--request-plane", "--request-plane",
......
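The `parse_args` hunk above takes the backend from the command line when the flag is given, otherwise from `DYN_DISCOVERY_BACKEND`, and otherwise falls back to `etcd`. A self-contained sketch of that precedence, using only the standard-library `argparse`; the `build_parser` and `resolve_backend` wrappers and their `env` parameter are illustrative additions so the lookup can be exercised without touching the real environment, and the help text is abbreviated:

```python
import argparse
import os

# Choices as listed in the updated argument definition.
BACKENDS = ["kubernetes", "etcd", "file", "mem"]


def build_parser(env=None) -> argparse.ArgumentParser:
    """Build a parser whose default comes from DYN_DISCOVERY_BACKEND, else 'etcd'."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--discovery-backend",
        type=str,
        choices=BACKENDS,
        default=env.get("DYN_DISCOVERY_BACKEND", "etcd"),
        help="Discovery backend: kubernetes, etcd, file, or mem.",
    )
    return parser


def resolve_backend(argv, env=None) -> str:
    """Parse argv and return the selected discovery backend."""
    return build_parser(env).parse_args(argv).discovery_backend


# No flag, no env var: falls back to etcd.
print(resolve_backend([], env={}))
# Env var set, no flag: the env var supplies the default.
print(resolve_backend([], env={"DYN_DISCOVERY_BACKEND": "kubernetes"}))
# Flag given: the command line wins over the env var.
print(resolve_backend(["--discovery-backend", "file"],
                      env={"DYN_DISCOVERY_BACKEND": "kubernetes"}))
```

In this sketch the environment is read when the parser is built, which mirrors the `os.environ.get(...)` call sitting in the `default=` argument of the original.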
...@@ -159,7 +159,7 @@ async def launch_workers(args, extra_engine_args_path): ...@@ -159,7 +159,7 @@ async def launch_workers(args, extra_engine_args_path):
logger.info(f"Creating mocker worker {worker_id + 1}/{args.num_workers}") logger.info(f"Creating mocker worker {worker_id + 1}/{args.num_workers}")
# Create a separate DistributedRuntime for this worker (on same event loop) # Create a separate DistributedRuntime for this worker (on same event loop)
runtime = DistributedRuntime(loop, args.store_kv, args.request_plane) runtime = DistributedRuntime(loop, args.discovery_backend, args.request_plane)
runtimes.append(runtime) runtimes.append(runtime)
# Determine which engine args file to use # Determine which engine args file to use
......
...@@ -98,12 +98,12 @@ DYNAMO_ARGS: Dict[str, Dict[str, Any]] = { ...@@ -98,12 +98,12 @@ DYNAMO_ARGS: Dict[str, Dict[str, Any]] = {
"default": None, "default": None,
"help": "Dump debug config to the specified file path. If not specified, the config will be dumped to stdout at INFO level.", "help": "Dump debug config to the specified file path. If not specified, the config will be dumped to stdout at INFO level.",
}, },
"store-kv": { "discovery-backend": {
"flags": ["--store-kv"], "flags": ["--discovery-backend"],
"type": str, "type": str,
"choices": ["etcd", "file", "mem"], "choices": ["kubernetes", "etcd", "file", "mem"],
"default": os.environ.get("DYN_STORE_KV", "etcd"), "default": os.environ.get("DYN_DISCOVERY_BACKEND", "etcd"),
"help": "Which key-value backend to use: etcd, mem, file. Etcd uses the ETCD_* env vars (e.g. ETCD_ENDPOINTS) for connection details. File uses root dir from env var DYN_FILE_KV or defaults to $TMPDIR/dynamo_store_kv.", "help": "Discovery backend: kubernetes (K8s API), etcd (distributed KV), file (local filesystem), mem (in-memory). Etcd uses the ETCD_* env vars (e.g. ETCD_ENDPOINTS) for connection details. File uses root dir from env var DYN_FILE_KV or defaults to $TMPDIR/dynamo_store_kv.",
}, },
"request-plane": { "request-plane": {
"flags": ["--request-plane"], "flags": ["--request-plane"],
...@@ -157,7 +157,7 @@ class DynamoArgs: ...@@ -157,7 +157,7 @@ class DynamoArgs:
namespace: str namespace: str
component: str component: str
endpoint: str endpoint: str
store_kv: str discovery_backend: str
request_plane: str request_plane: str
event_plane: str event_plane: str
...@@ -595,7 +595,7 @@ async def parse_args(args: list[str]) -> Config: ...@@ -595,7 +595,7 @@ async def parse_args(args: list[str]) -> Config:
namespace=parsed_namespace, namespace=parsed_namespace,
component=parsed_component_name, component=parsed_component_name,
endpoint=parsed_endpoint_name, endpoint=parsed_endpoint_name,
store_kv=parsed_args.store_kv, discovery_backend=parsed_args.discovery_backend,
request_plane=parsed_args.request_plane, request_plane=parsed_args.request_plane,
event_plane=parsed_args.event_plane, event_plane=parsed_args.event_plane,
tool_call_parser=tool_call_parser, tool_call_parser=tool_call_parser,
......
...@@ -203,7 +203,7 @@ async def worker(): ...@@ -203,7 +203,7 @@ async def worker():
dynamo_args = config.dynamo_args dynamo_args = config.dynamo_args
runtime, loop = create_runtime( runtime, loop = create_runtime(
store_kv=dynamo_args.store_kv, discovery_backend=dynamo_args.discovery_backend,
request_plane=dynamo_args.request_plane, request_plane=dynamo_args.request_plane,
event_plane=dynamo_args.event_plane, event_plane=dynamo_args.event_plane,
use_kv_events=dynamo_args.use_kv_events, use_kv_events=dynamo_args.use_kv_events,
......
...@@ -29,7 +29,7 @@ class DiffusionConfig: ...@@ -29,7 +29,7 @@ class DiffusionConfig:
namespace: str = DYN_NAMESPACE namespace: str = DYN_NAMESPACE
component: str = "diffusion" component: str = "diffusion"
endpoint: str = "generate" endpoint: str = "generate"
store_kv: str = "etcd" discovery_backend: str = "etcd"
request_plane: str = "tcp" request_plane: str = "tcp"
event_plane: str = "nats" event_plane: str = "nats"
......
...@@ -31,7 +31,7 @@ async def worker(): ...@@ -31,7 +31,7 @@ async def worker():
shutdown_event = asyncio.Event() shutdown_event = asyncio.Event()
runtime, _ = create_runtime( runtime, _ = create_runtime(
store_kv=config.store_kv, discovery_backend=config.discovery_backend,
request_plane=config.request_plane, request_plane=config.request_plane,
event_plane=config.event_plane, event_plane=config.event_plane,
use_kv_events=config.use_kv_events, use_kv_events=config.use_kv_events,
......
...@@ -65,7 +65,7 @@ class Config: ...@@ -65,7 +65,7 @@ class Config:
self.dump_config_to: Optional[str] = None self.dump_config_to: Optional[str] = None
self.custom_jinja_template: Optional[str] = None self.custom_jinja_template: Optional[str] = None
self.dyn_endpoint_types: str = "chat,completions" self.dyn_endpoint_types: str = "chat,completions"
self.store_kv: str = "" self.discovery_backend: str = ""
self.request_plane: str = "" self.request_plane: str = ""
self.event_plane: str = "" self.event_plane: str = ""
self.enable_local_indexer: bool = True self.enable_local_indexer: bool = True
...@@ -124,7 +124,7 @@ class Config: ...@@ -124,7 +124,7 @@ class Config:
f"tool_call_parser={self.tool_call_parser}, " f"tool_call_parser={self.tool_call_parser}, "
f"dump_config_to={self.dump_config_to}, " f"dump_config_to={self.dump_config_to}, "
f"custom_jinja_template={self.custom_jinja_template}, " f"custom_jinja_template={self.custom_jinja_template}, "
f"store_kv={self.store_kv}, " f"discovery_backend={self.discovery_backend}, "
f"request_plane={self.request_plane}, " f"request_plane={self.request_plane}, "
f"event_plane={self.event_plane}, " f"event_plane={self.event_plane}, "
f"enable_local_indexer={self.enable_local_indexer}, " f"enable_local_indexer={self.enable_local_indexer}, "
...@@ -335,11 +335,11 @@ def cmd_line_args(): ...@@ -335,11 +335,11 @@ def cmd_line_args():
help="Comma-separated list of endpoint types to enable. Options: 'chat', 'completions'. Default: 'chat,completions'. Use 'completions' for models without chat templates.", help="Comma-separated list of endpoint types to enable. Options: 'chat', 'completions'. Default: 'chat,completions'. Use 'completions' for models without chat templates.",
) )
parser.add_argument( parser.add_argument(
"--store-kv", "--discovery-backend",
type=str, type=str,
choices=["etcd", "file", "mem"], choices=["kubernetes", "etcd", "file", "mem"],
default=os.environ.get("DYN_STORE_KV", "etcd"), default=os.environ.get("DYN_DISCOVERY_BACKEND", "etcd"),
help="Which key-value backend to use: etcd, mem, file. Etcd uses the ETCD_* env vars (e.g. ETCD_ENDPOINTS) for connection details. File uses root dir from env var DYN_FILE_KV or defaults to $TMPDIR/dynamo_store_kv.", help="Discovery backend: kubernetes (K8s API), etcd (distributed KV), file (local filesystem), mem (in-memory). Etcd uses the ETCD_* env vars (e.g. ETCD_ENDPOINTS) for connection details. File uses root dir from env var DYN_FILE_KV or defaults to $TMPDIR/dynamo_store_kv.",
) )
parser.add_argument( parser.add_argument(
"--request-plane", "--request-plane",
...@@ -551,7 +551,7 @@ def cmd_line_args(): ...@@ -551,7 +551,7 @@ def cmd_line_args():
config.tool_call_parser = args.dyn_tool_call_parser config.tool_call_parser = args.dyn_tool_call_parser
config.dump_config_to = args.dump_config_to config.dump_config_to = args.dump_config_to
config.dyn_endpoint_types = args.dyn_endpoint_types config.dyn_endpoint_types = args.dyn_endpoint_types
config.store_kv = args.store_kv config.discovery_backend = args.discovery_backend
config.request_plane = args.request_plane config.request_plane = args.request_plane
config.event_plane = args.event_plane config.event_plane = args.event_plane
config.enable_local_indexer = not args.durable_kv_events config.enable_local_indexer = not args.durable_kv_events
......
...@@ -40,7 +40,7 @@ async def init_video_diffusion_worker( ...@@ -40,7 +40,7 @@ async def init_video_diffusion_worker(
namespace=config.namespace, namespace=config.namespace,
component=config.component, component=config.component,
endpoint=config.endpoint, endpoint=config.endpoint,
store_kv=config.store_kv, discovery_backend=config.discovery_backend,
request_plane=config.request_plane, request_plane=config.request_plane,
event_plane=config.event_plane, event_plane=config.event_plane,
model_path=config.model_path, model_path=config.model_path,
......
...@@ -37,7 +37,7 @@ class Config(DynamoRuntimeConfig, DynamoVllmConfig): ...@@ -37,7 +37,7 @@ class Config(DynamoRuntimeConfig, DynamoVllmConfig):
is_prefill_worker: bool is_prefill_worker: bool
is_decode_worker: bool is_decode_worker: bool
custom_jinja_template: Optional[str] = None custom_jinja_template: Optional[str] = None
store_kv: str discovery_backend: str
request_plane: str request_plane: str
event_plane: str event_plane: str
enable_local_indexer: bool = True enable_local_indexer: bool = True
......
...@@ -143,7 +143,7 @@ async def worker(): ...@@ -143,7 +143,7 @@ async def worker():
shutdown_event = asyncio.Event() shutdown_event = asyncio.Event()
runtime, _ = create_runtime( runtime, _ = create_runtime(
store_kv=config.store_kv, discovery_backend=config.discovery_backend,
request_plane=config.request_plane, request_plane=config.request_plane,
event_plane=config.event_plane, event_plane=config.event_plane,
use_kv_events=config.use_kv_events, use_kv_events=config.use_kv_events,
......
...@@ -162,7 +162,7 @@ docker compose -f deploy/docker-compose.yml up -d ...@@ -162,7 +162,7 @@ docker compose -f deploy/docker-compose.yml up -d
``` ```
> [!NOTE] > [!NOTE]
> - **etcd** is optional but is the default local discovery backend. You can also use `--kv_store file` to use file system based discovery. > - **etcd** is optional but is the default local discovery backend. You can also use `--discovery-backend file` to use file system based discovery.
> - **NATS** is optional - only needed if using KV routing with events (default). You can disable it with `--no-kv-events` flag for prediction-based routing > - **NATS** is optional - only needed if using KV routing with events (default). You can disable it with `--no-kv-events` flag for prediction-based routing
> - **On Kubernetes**, neither is required when using the Dynamo operator, which explicitly sets `DYN_DISCOVERY_BACKEND=kubernetes` to enable native K8s service discovery (DynamoWorkerMetadata CRD) > - **On Kubernetes**, neither is required when using the Dynamo operator, which explicitly sets `DYN_DISCOVERY_BACKEND=kubernetes` to enable native K8s service discovery (DynamoWorkerMetadata CRD)
......
...@@ -71,7 +71,7 @@ docker compose -f deploy/docker-compose.yml up -d ...@@ -71,7 +71,7 @@ docker compose -f deploy/docker-compose.yml up -d
``` ```
> [!NOTE] > [!NOTE]
> - **etcd** is optional but is the default local discovery backend. You can also use `--kv_store file` to use file system based discovery. > - **etcd** is optional but is the default local discovery backend. You can also use `--discovery-backend file` to use file system based discovery.
> - **NATS** is optional - only needed if using KV routing with events (default). You can disable it with `--no-kv-events` flag for prediction-based routing > - **NATS** is optional - only needed if using KV routing with events (default). You can disable it with `--no-kv-events` flag for prediction-based routing
> - **On Kubernetes**, neither is required when using the Dynamo operator, which explicitly sets `DYN_DISCOVERY_BACKEND=kubernetes` to enable native K8s service discovery (DynamoWorkerMetadata CRD) > - **On Kubernetes**, neither is required when using the Dynamo operator, which explicitly sets `DYN_DISCOVERY_BACKEND=kubernetes` to enable native K8s service discovery (DynamoWorkerMetadata CRD)
......
...@@ -66,7 +66,7 @@ docker compose -f deploy/docker-compose.yml up -d ...@@ -66,7 +66,7 @@ docker compose -f deploy/docker-compose.yml up -d
``` ```
> [!NOTE] > [!NOTE]
> - **etcd** is optional but is the default local discovery backend. You can also use `--kv_store file` to use file system based discovery. > - **etcd** is optional but is the default local discovery backend. You can also use `--discovery-backend file` to use file system based discovery.
> - **NATS** is optional - only needed if using KV routing with events (default). You can disable it with `--no-kv-events` flag for prediction-based routing > - **NATS** is optional - only needed if using KV routing with events (default). You can disable it with `--no-kv-events` flag for prediction-based routing
> - **On Kubernetes**, neither is required when using the Dynamo operator, which explicitly sets `DYN_DISCOVERY_BACKEND=kubernetes` to enable native K8s service discovery (DynamoWorkerMetadata CRD) > - **On Kubernetes**, neither is required when using the Dynamo operator, which explicitly sets `DYN_DISCOVERY_BACKEND=kubernetes` to enable native K8s service discovery (DynamoWorkerMetadata CRD)
......
...@@ -187,7 +187,7 @@ The main KV-aware routing arguments: ...@@ -187,7 +187,7 @@ The main KV-aware routing arguments:
> - **No KV events** (`--no-kv-events`): State persistence is not supported. > - **No KV events** (`--no-kv-events`): State persistence is not supported.
> >
> **Request plane is independent of KV event transport.** > **Request plane is independent of KV event transport.**
> The request plane (`DYN_REQUEST_PLANE` / `--request-plane`) controls how requests reach workers (TCP/HTTP/NATS), while KV events travel over a separate path. KV events use NATS in JetStream or NATS Core modes, or ZMQ when `--event-plane zmq` is set. With `--event-plane zmq` and `--store-kv file` or `mem`, the router can run entirely without etcd or NATS. When using a NATS-based event plane (the default), NATS is initialized automatically; set `NATS_SERVER=nats://...` to override the default `localhost:4222`. Use `--no-kv-events` to disable KV event transport entirely. > The request plane (`DYN_REQUEST_PLANE` / `--request-plane`) controls how requests reach workers (TCP/HTTP/NATS), while KV events travel over a separate path. KV events use NATS in JetStream or NATS Core modes, or ZMQ when `--event-plane zmq` is set. With `--event-plane zmq` and `--discovery-backend file` or `mem`, the router can run entirely without etcd or NATS. When using a NATS-based event plane (the default), NATS is initialized automatically; set `NATS_SERVER=nats://...` to override the default `localhost:4222`. Use `--no-kv-events` to disable KV event transport entirely.
> >
> When `--kv-overlap-score-weight` is set to 0, no KVIndexer is created and prefix matching is disabled (pure load balancing). When `--no-kv-events` is set, a KVIndexer is still created but no event subscriber is launched to consume KV events from workers. Instead, the router predicts cache state based on its own routing decisions with TTL-based expiration and pruning. > When `--kv-overlap-score-weight` is set to 0, no KVIndexer is created and prefix matching is disabled (pure load balancing). When `--no-kv-events` is set, a KVIndexer is still created but no event subscriber is launched to consume KV events from workers. Instead, the router predicts cache state based on its own routing decisions with TTL-based expiration and pruning.
> >
......
...@@ -18,7 +18,7 @@ The discovery backend adapts to the deployment environment. ...@@ -18,7 +18,7 @@ The discovery backend adapts to the deployment environment.
| **Kubernetes** (with Dynamo operator) | Native K8s (CRDs, EndpointSlices) | Operator sets `DYN_DISCOVERY_BACKEND=kubernetes` | | **Kubernetes** (with Dynamo operator) | Native K8s (CRDs, EndpointSlices) | Operator sets `DYN_DISCOVERY_BACKEND=kubernetes` |
| **Bare metal / Local** (default) | etcd | `ETCD_ENDPOINTS` (defaults to `http://localhost:2379`) | | **Bare metal / Local** (default) | etcd | `ETCD_ENDPOINTS` (defaults to `http://localhost:2379`) |
> **Note:** The runtime always defaults to etcd (`kv_store`). Kubernetes discovery must be explicitly enabled -- the Dynamo operator handles this automatically. > **Note:** The runtime always defaults to etcd. Kubernetes discovery must be explicitly enabled -- the Dynamo operator handles this automatically.
## Kubernetes Discovery ## Kubernetes Discovery
...@@ -48,7 +48,7 @@ When running on Kubernetes with the Dynamo operator, service discovery uses nati ...@@ -48,7 +48,7 @@ When running on Kubernetes with the Dynamo operator, service discovery uses nati
## etcd Discovery (Default) ## etcd Discovery (Default)
When `DYN_DISCOVERY_BACKEND` is not set (or set to `kv_store`), etcd is used for service discovery. When `DYN_DISCOVERY_BACKEND` is not set (or set to `etcd`), etcd is used for service discovery.
### Connection Configuration ### Connection Configuration
......