@@ -40,7 +40,7 @@ from dynamo.runtime import DistributedRuntime
...
@@ -40,7 +40,7 @@ from dynamo.runtime import DistributedRuntime
from.import__version__
from.import__version__
DYNAMO_NAMESPACE_ENV_VAR="DYN_NAMESPACE"
DYN_NAMESPACE_ENV_VAR="DYN_NAMESPACE"
logger=logging.getLogger(__name__)
logger=logging.getLogger(__name__)
...
@@ -142,7 +142,7 @@ def parse_args():
...
@@ -142,7 +142,7 @@ def parse_args():
parser.add_argument(
parser.add_argument(
"--namespace",
"--namespace",
type=str,
type=str,
default=os.environ.get(DYNAMO_NAMESPACE_ENV_VAR),
default=os.environ.get(DYN_NAMESPACE_ENV_VAR),
help="Dynamo namespace for model discovery scoping. If specified, models will only be discovered from this namespace. If not specified, discovers models from all namespaces (global discovery).",
help="Dynamo namespace for model discovery scoping. If specified, models will only be discovered from this namespace. If not specified, discovers models from all namespaces (global discovery).",
You can configure the plugin by setting environment vars in your [values-epp-aware.yaml].
You can configure the plugin by setting environment vars in your [values-epp-aware.yaml].
- Overwrite the `DYNAMO_NAMESPACE` env var if needed to match your model's dynamo namespace.
- Overwrite the `DYN_NAMESPACE` env var if needed to match your model's dynamo namespace.
- Set `DYNAMO_BUSY_THRESHOLD` to configure the upper bound on how “full” a worker can be (often derived from kv_active_blocks or other load metrics) before the router skips it. If the selected worker exceeds this value, routing falls back to the next best candidate. By default the value is negative meaning this is not enabled.
- Set `DYNAMO_BUSY_THRESHOLD` to configure the upper bound on how “full” a worker can be (often derived from kv_active_blocks or other load metrics) before the router skips it. If the selected worker exceeds this value, routing falls back to the next best candidate. By default the value is negative meaning this is not enabled.
- Set `DYNAMO_ROUTER_REPLICA_SYNC=true` to enable a background watcher to keep multiple router instances in sync (important if you run more than one KV router per component).
- Set `DYNAMO_ROUTER_REPLICA_SYNC=true` to enable a background watcher to keep multiple router instances in sync (important if you run more than one KV router per component).
- By default the Dynamo plugin uses KV routing. You can expose `DYNAMO_USE_KV_ROUTING=false` in your [values-epp-aware.yaml] if you prefer to route in the round-robin fashion.
- By default the Dynamo plugin uses KV routing. You can expose `DYNAMO_USE_KV_ROUTING=false` in your [values-epp-aware.yaml] if you prefer to route in the round-robin fashion.
@@ -27,11 +27,11 @@ While this guide does not use Prometheus, it assumes Grafana is pre-installed wi
...
@@ -27,11 +27,11 @@ While this guide does not use Prometheus, it assumes Grafana is pre-installed wi
The following env variables are set:
The following env variables are set:
-`MONITORING_NAMESPACE`: The namespace where Loki is installed
-`MONITORING_NAMESPACE`: The namespace where Loki is installed
-`DYNAMO_NAMESPACE`: The namespace where Dynamo Cloud Operator is installed
-`DYN_NAMESPACE`: The namespace where Dynamo Cloud Operator is installed
```bash
```bash
export MONITORING_NAMESPACE=monitoring
export MONITORING_NAMESPACE=monitoring
export DYNAMO_NAMESPACE=dynamo-system
export DYN_NAMESPACE=dynamo-system
```
```
## Installation Steps
## Installation Steps
...
@@ -99,7 +99,7 @@ podLogs:
...
@@ -99,7 +99,7 @@ podLogs:
-"nvidia_com_dynamo_component_type"# extract this label from the dynamo graph deployment
-"nvidia_com_dynamo_component_type"# extract this label from the dynamo graph deployment
-"nvidia_com_dynamo_graph_deployment_name"# extract this label from the dynamo graph deployment
-"nvidia_com_dynamo_graph_deployment_name"# extract this label from the dynamo graph deployment
namespaces:
namespaces:
-$DYNAMO_NAMESPACE
-$DYN_NAMESPACE
```
```
### 3. Configure Grafana with the Loki datasource and Dynamo Logs dashboard
### 3. Configure Grafana with the Loki datasource and Dynamo Logs dashboard
...
@@ -126,7 +126,7 @@ At this point, we should have everything in place to collect and view logs in ou
...
@@ -126,7 +126,7 @@ At this point, we should have everything in place to collect and view logs in ou
To enable structured logs in a DynamoGraphDeployment, we need to set the `DYN_LOGGING_JSONL` environment variable to `1`. This is done for us in the `agg_logging.yaml` setup for the Sglang backend. We can now deploy the DynamoGraphDeployment with:
To enable structured logs in a DynamoGraphDeployment, we need to set the `DYN_LOGGING_JSONL` environment variable to `1`. This is done for us in the `agg_logging.yaml` setup for the Sglang backend. We can now deploy the DynamoGraphDeployment with:
Send a few chat completions requests to generate structured logs across the frontend and worker pods across the DynamoGraphDeployment. We are now all set to view the logs in Grafana.
Send a few chat completions requests to generate structured logs across the frontend and worker pods across the DynamoGraphDeployment. We are now all set to view the logs in Grafana.