This suite validates determinism properties of the API-backed LLM under fixed sampling parameters and optionally across prefix cache resets. The tests can automatically start a local vLLM server, warm it up, and compare responses for identical prompts over multiple iterations.
This suite validates the determinism properties of the API-backed LLM under fixed sampling parameters and, optionally, across prefix cache resets. The tests can automatically start a local LLM server (either vLLM or TensorRT-LLM), warm it up, and compare responses for identical prompts over multiple iterations. The suite detects whether the vLLM or TensorRT-LLM wheel is installed and starts the corresponding server.
## Files
- `test_determinism.py`: comprehensive determinism tests with automatic vLLM server lifecycle and warmup.
- `test_determinism.py`: comprehensive determinism tests with automatic LLM server lifecycle and warmup.
- `test_determinism_with_cache_reset`: runs the test with warmup, resets the cache, then runs it again without warmup to check determinism across the cache reset boundary.
- `test_concurrent_determinism_with_ifeval`: sends a parametrized number of IFEval prompts (default: 120) with controlled concurrency after a warmup, then resets the cache and repeats the run without warmup to validate determinism across the cache reset.
...
...
@@ -19,7 +19,7 @@ This suite validates determinism properties of the API-backed LLM under fixed sa
## How It Works
- A `VLLMServerManager` fixture (`vllm_server`) launches `vllm serve` with the Dynamo connector and optional cache block overrides.
- An `LLMServerManager` fixture (`llm_server`) launches `vllm serve` or `trtllm-serve` with the Dynamo connector and optional cache block overrides.
- A `tester` fixture binds the test client to the running server's base URL.
- The test performs a comprehensive warmup across prompts, then executes repeated requests and checks that responses are identical (deterministic). An optional cache reset phase re-validates determinism across the reset boundary.
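
The warmup, repeat, and compare flow described above can be sketched roughly as follows. This is an illustrative sketch only, not the suite's actual implementation: `check_determinism`, `generate`, and `reset_cache` are hypothetical names, and `generate` stands in for whatever client call the `tester` fixture makes against the running server.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Optional


def check_determinism(
    generate: Callable[[str], str],  # hypothetical: sends one prompt, returns the response text
    prompts: Iterable[str],
    iterations: int = 3,
    reset_cache: Optional[Callable[[], None]] = None,  # hypothetical cache-reset hook
) -> Dict[str, List[str]]:
    """Return the prompts whose responses differed, mapped to the distinct responses seen."""
    prompt_list = list(prompts)
    seen: Dict[str, List[str]] = defaultdict(list)

    def run_phase() -> None:
        # Send every prompt `iterations` times and record each distinct response.
        for _ in range(iterations):
            for prompt in prompt_list:
                response = generate(prompt)
                if response not in seen[prompt]:
                    seen[prompt].append(response)

    # Warmup: send each prompt once and discard the result.
    for prompt in prompt_list:
        generate(prompt)

    run_phase()

    # Optional second phase: reset the cache, then repeat without another warmup.
    if reset_cache is not None:
        reset_cache()
        run_phase()

    # A prompt is deterministic only if exactly one distinct response was observed.
    return {p: r for p, r in seen.items() if len(r) > 1}
```

An empty result means every prompt produced a single identical response in all iterations, including those after the reset when a `reset_cache` hook is supplied.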
...
...
@@ -43,8 +43,8 @@ Environment variables control server settings and test load: