- 29 Oct, 2025 1 commit
-
-
Patrick Devine authored
-
- 16 Oct, 2025 1 commit
-
-
Daniel Hiltgen authored
-
- 15 Aug, 2025 1 commit
-
-
Daniel Hiltgen authored
* test: improve scheduler/concurrency stress tests
  The scheduler test used approximate memory figures and would often overshoot or undershoot a system's capacity, leading to flaky test results. It now uses ps output to determine exactly how many models it takes to trigger thrashing, which should make this scenario more reliable. The concurrency test is also refined to target num_parallel + 1 and to handle timeouts better. With these refinements, TestMultiModelConcurrency was redundant.
* test: add parallel generate with history
  TestGenerateWithHistory will help verify that caching and context are handled properly while making requests.
* test: focus embed tests on embedding models
  Remove non-embedding models from the embedding tests.
-
- 05 Jul, 2025 1 commit
-
-
Daniel Hiltgen authored
usage example:
  go test --tags=integration,perf -count 1 ./integration -v -timeout 1h -run TestModelsPerf 2>&1 | tee int.log
  cat int.log | grep MODEL_PERF_HEADER | cut -f2- -d: > perf.csv
  cat int.log | grep MODEL_PERF_DATA | cut -f2- -d: >> perf.csv
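For readers who would rather post-process the log in Go than with grep and cut, the sketch below does the same filtering in memory. `extractPerf` is a hypothetical helper, not part of the repository, and the sample log lines in `main` are invented for illustration; the real log format is not shown in the commit message.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// extractPerf mimics the two shell pipelines: it collects every line that
// contains MODEL_PERF_HEADER, then every line that contains MODEL_PERF_DATA,
// and drops everything up to and including the first ':' on each line
// (the effect of `cut -f2- -d:`).
func extractPerf(log string) string {
	var header, data []string
	sc := bufio.NewScanner(strings.NewReader(log))
	for sc.Scan() {
		line := sc.Text()
		rest := line // like cut, a line without ':' is kept whole
		if _, after, found := strings.Cut(line, ":"); found {
			rest = after
		}
		if strings.Contains(line, "MODEL_PERF_HEADER") {
			header = append(header, rest)
		}
		if strings.Contains(line, "MODEL_PERF_DATA") {
			data = append(data, rest)
		}
	}
	var out strings.Builder
	for _, l := range append(header, data...) {
		out.WriteString(l)
		out.WriteByte('\n')
	}
	return out.String()
}

func main() {
	// Invented sample input; the column names are assumptions.
	log := "=== RUN   TestModelsPerf\n" +
		"MODEL_PERF_HEADER:model,tokens_per_second\n" +
		"MODEL_PERF_DATA:example-model,42.0\n"
	fmt.Print(extractPerf(log))
}
```

Header lines are written before data lines regardless of their order in the log, which matches the shell version writing the header with `>` and appending the data with `>>`.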
-
- 24 May, 2025 1 commit
-
-
Daniel Hiltgen authored
-
- 16 Apr, 2025 1 commit
-
-
Daniel Hiltgen authored
Add some new test coverage for various model architectures, and switch from orca-mini to the small llama model.
-