Unverified commit fa1ea1d5, authored by Jacky, committed by GitHub

docs: Update Request Migration test instructions (#5754)


Signed-off-by: Jacky <18255193+kthui@users.noreply.github.com>
parent 5a00a7d6
@@ -2,38 +2,56 @@
 ## Migration Tests
-The migration directory contains tests for worker fault tolerance with migration support.
+The migration directory contains tests for worker fault tolerance with migration support across multiple backends (vLLM, SGLang, TRT-LLM) in both aggregated and disaggregated modes.
+### Test Parameterization
+All migration tests are parameterized with the following dimensions:
+| Parameter IDs | Description |
+|---------------|-------------|
+| `migration_enabled`, `migration_disabled` | Controls whether migration is allowed |
+| `worker_failure` (SIGKILL), `graceful_shutdown` (SIGTERM) | Worker termination method |
+| `chat`, `completion` (skipped) | API endpoint to test |
+| `stream`, `unary` (skipped) | Streaming vs unary responses |
+| `nats`, `tcp` | Request plane transport |
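To give a feel for the size of this matrix, the sketch below enumerates the five dimensions from the table using only the standard library. The dictionary keys, the `SKIPPED` set, and the function name are assumptions made for illustration; the real tests rely on pytest parameterization, and the exact order in which they combine IDs is not shown in this change.

```python
from itertools import product

# Parameter IDs copied from the table above; the grouping keys are hypothetical.
DIMENSIONS = {
    "migration": ["migration_enabled", "migration_disabled"],
    "termination": ["worker_failure", "graceful_shutdown"],
    "endpoint": ["chat", "completion"],
    "response": ["stream", "unary"],
    "request_plane": ["nats", "tcp"],
}

# IDs that the table marks as "(skipped)".
SKIPPED = {"completion", "unary"}


def enumerate_combinations():
    """Yield (combination, is_skipped) for every point in the matrix."""
    for combo in product(*DIMENSIONS.values()):
        yield combo, any(value in SKIPPED for value in combo)


combos = list(enumerate_combinations())
active = [combo for combo, is_skipped in combos if not is_skipped]

print(len(combos))  # 32 combinations per test function
print(len(active))  # 8 once the skipped IDs are excluded
```

Under these assumptions, each test function expands to 32 parameter combinations, of which 8 are not skipped.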
 ### Test Matrix
-| Test | Shutdown Method | Migration Enabled | Expected Result | Verification |
-|------|----------------|-------------------|-----------------|--------------|
-| `test_request_migration_vllm_worker_failure` | SIGKILL (immediate) | Yes (default) | Request succeeds | "Stream disconnected... recreating stream..." in logs |
-| `test_request_migration_vllm_graceful_shutdown` | SIGTERM (10s timeout) | Yes (default) | Request succeeds | "Stream disconnected... recreating stream..." in logs |
-| `test_no_request_migration_vllm_worker_failure` | SIGKILL (immediate) | No (migration_limit=0) | Request fails (500) | "Migration limit exhausted" in logs |
-| `test_no_request_migration_vllm_graceful_shutdown` | SIGTERM (10s timeout) | No (migration_limit=0) | Request fails (500) | "Migration limit exhausted" in logs |
-### Common Test Flow
-All migration tests follow this pattern:
+Each backend (vLLM, SGLang, TRT-LLM) has the following test types:
+| Test | Mode | Setup |
+|------|------|-------|
+| `test_request_migration_{backend}_aggregated` | Aggregated | 2 workers |
+| `test_request_migration_{backend}_prefill` | Disaggregated | 1 decode + 2 prefill |
+| `test_request_migration_{backend}_kv_transfer` | Disaggregated | 1 prefill + 2 decode |
+| `test_request_migration_{backend}_decode` | Disaggregated | 1 prefill + 2 decode |
+Where `{backend}` is one of: `vllm`, `sglang`, `trtllm`
+### Common Test Flow
 1. Start a Dynamo frontend with round-robin routing
-2. Start 2 vLLM workers sequentially
-3. Send a long completion request (max_tokens=8192) in a separate daemon thread
-4. Use parallel polling to determine which worker received the request (checks for "New Request ID:" in logs)
-5. Terminate the worker processing the request (method varies by test)
-6. Validate the request outcome (success or failure based on migration setting)
-7. Verify migration behavior in frontend logs
+2. Start workers (configuration varies by mode: aggregated or disaggregated)
+3. Send a request (chat/completion, streaming/unary) in a background thread
+4. Determine which worker received the request via log polling
+5. For decode tests: wait for initial responses before termination
+6. Terminate the worker processing the request (SIGKILL or SIGTERM)
+7. Validate the request outcome based on `migration_limit`:
+   - `migration_limit > 0`: Request succeeds, verify TTFT/TPOT if streaming and migration metrics
+   - `migration_limit = 0`: Request fails with expected error
+8. Verify migration behavior in frontend logs
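The termination and validation steps of the flow above can be sketched with the standard library alone. This is a minimal, POSIX-only illustration and not the helper the test suite uses: `terminate_worker`, `expected_outcome`, and the sleeping stand-in child process are all hypothetical, and the 10-second default mirrors the "SIGTERM (10s timeout)" noted for the earlier tests.

```python
import signal
import subprocess
import sys


def terminate_worker(proc, graceful, timeout=10.0):
    """Stop a worker process and return its exit code.

    graceful=True sends SIGTERM and waits up to `timeout` seconds;
    graceful=False sends SIGKILL immediately.
    """
    if graceful:
        proc.send_signal(signal.SIGTERM)
        try:
            return proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            # The worker did not exit in time; force it down.
            proc.kill()
            return proc.wait()
    proc.send_signal(signal.SIGKILL)
    return proc.wait()


def expected_outcome(migration_limit):
    """Map migration_limit to the outcome the test should validate."""
    return "success" if migration_limit > 0 else "failure"


def spawn_stand_in_worker():
    # Hypothetical stand-in for a real backend worker: a child that just sleeps.
    return subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])


killed_code = terminate_worker(spawn_stand_in_worker(), graceful=False)
stopped_code = terminate_worker(spawn_stand_in_worker(), graceful=True)
print(killed_code, stopped_code)
print(expected_outcome(1), expected_outcome(0))
```

On POSIX systems, `Popen.wait()` reports a child killed by a signal as the negated signal number, which lets a test confirm which termination path was actually taken.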
 **Run examples:**
 ```bash
-# With migration enabled
-pytest tests/fault_tolerance/migration/test_vllm.py::test_request_migration_vllm_worker_failure -v -s
-pytest tests/fault_tolerance/migration/test_vllm.py::test_request_migration_vllm_graceful_shutdown -v -s
-# With migration disabled
-pytest tests/fault_tolerance/migration/test_vllm.py::test_no_request_migration_vllm_worker_failure -v -s
-pytest tests/fault_tolerance/migration/test_vllm.py::test_no_request_migration_vllm_graceful_shutdown -v -s
+# Run all vLLM migration tests
+pytest tests/fault_tolerance/migration -m vllm -v -s
+# Run aggregated or decode tests for SGLang
+pytest tests/fault_tolerance/migration -m sglang -k "aggregated or decode" -v -s
+# Run specific parameter combination
+pytest tests/fault_tolerance/migration -m trtllm -k "aggregated and nats and stream and chat and worker_failure and migration_enabled" -v -s
 ```
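The new commands share one shape: `-m` selects the backend marker and `-k` filters by keyword expression. A small helper along these lines could assemble them programmatically. `build_pytest_command` is a hypothetical name, not something defined in the repository, and it only builds the argument list without running anything.

```python
import shlex

# Test directory taken from the run examples above.
TEST_DIR = "tests/fault_tolerance/migration"


def build_pytest_command(backend, keywords=(), joiner="and"):
    """Return an argv list: -m selects the backend marker, -k filters by keyword."""
    argv = ["pytest", TEST_DIR, "-m", backend]
    if keywords:
        # Join keywords into a single -k expression, e.g. "aggregated or decode".
        argv += ["-k", f" {joiner} ".join(keywords)]
    return argv + ["-v", "-s"]


sglang_cmd = build_pytest_command("sglang", ["aggregated", "decode"], joiner="or")
print(shlex.join(sglang_cmd))
# pytest tests/fault_tolerance/migration -m sglang -k 'aggregated or decode' -v -s
```

The resulting list can be passed directly to `subprocess.run`, which avoids shell quoting issues with the multi-word `-k` expression.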
 ## Cancellation Tests
......