@@ -109,6 +109,28 @@ The migration system is designed with several important architectural considerat
...
**Error Handling**: The migration system distinguishes between different types of failures and applies appropriate recovery strategies for each scenario.
## Monitoring and Metrics
The migration system exposes Prometheus metrics to monitor migration activity. These metrics are available on the frontend's `/metrics` endpoint (default port 8000):
- `dynamo_frontend_model_migration_total`: Counter tracking the total number of request migrations
  - Labels:
    - `model`: The model name being served
    - `migration_type`: Either `new_request` (initial connection failure) or `ongoing_request` (mid-stream disconnection)

These metrics can be used to:

- Alert on excessive migration rates, which may indicate infrastructure issues
- Track the effectiveness of fault tolerance mechanisms
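
As an illustration of reading this counter outside of Prometheus, the following minimal sketch parses the text exposition format returned by `/metrics` and sums the counter per `(model, migration_type)` pair. The helper names `fetch_metrics` and `migration_counts` and the `localhost` host are assumptions made for this example, not part of Dynamo; only the metric name, its labels, and the default port 8000 come from the description above.

```python
import re
import urllib.request
from collections import defaultdict

METRIC = "dynamo_frontend_model_migration_total"

# Matches a sample line such as: name{label="value",...} 12
_SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
# Matches one label pair inside the braces: key="value"
_LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def fetch_metrics(url="http://localhost:8000/metrics"):
    """Fetch the raw metrics text (host is an assumption; port 8000 is the documented default)."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


def migration_counts(metrics_text):
    """Sum the migration counter per (model, migration_type) pair."""
    totals = defaultdict(float)
    for line in metrics_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE_RE.match(line)
        if match is None or match.group("name") != METRIC:
            continue
        labels = dict(_LABEL_RE.findall(match.group("labels") or ""))
        key = (labels.get("model", ""), labels.get("migration_type", ""))
        totals[key] += float(match.group("value"))
    return dict(totals)
```

Calling `migration_counts(fetch_metrics())` against a running frontend would return a dictionary keyed by model and migration type. Comparing the `new_request` and `ongoing_request` totals over time is one way to tell connection-time failures apart from mid-stream disconnections.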
For more information on Dynamo metrics, see the [Metrics documentation](../observability/metrics.md).
## Operational Impact
Request migration fundamentally changes how the system handles failures, moving from a "fail-fast" approach to a "graceful degradation" model. This architectural shift enables higher availability and better resource utilization while maintaining the same external API contract for clients.