tests: reduce stress on CPU to 2 models (#12161)
* tests: reduce stress on CPU to 2 models

  This should avoid flakes due to systems getting overloaded with 3 (or more) models running concurrently.

* tests: allow slow systems to pass on timeout

  If a slow system is still streaming a response, and the response will pass validation, don't fail just because the system is slow.

* test: unload embedding models more quickly
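
The timeout change can be illustrated with a minimal sketch. This is not the code from the commit; `collect_with_deadline`, its parameters, and the sample stream are hypothetical names chosen for illustration. The idea is that when the deadline expires mid-stream, the test checks whether the partial response already passes validation instead of failing outright.

```python
import time
from typing import Callable, Iterable, Tuple


def collect_with_deadline(
    chunks: Iterable[str],
    deadline_s: float,
    is_valid: Callable[[str], bool],
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[str, bool]:
    """Hypothetical helper: accumulate streamed chunks until the stream
    ends or the deadline passes, then validate whatever was received."""
    start = clock()
    text = ""
    for chunk in chunks:
        text += chunk
        if clock() - start > deadline_s:
            # Out of time on a slow system: stop reading, but judge the
            # partial response on its content rather than on speed.
            return text, is_valid(text)
    # Stream finished within the deadline: validate the full response.
    return text, is_valid(text)


def sample_stream():
    # Illustrative stand-in for a model's streamed output.
    yield "The sky is "
    yield "blue because of "
    yield "Rayleigh scattering"
    yield " in the atmosphere."
```

A caller would pass the validation it already uses for complete responses, for example `lambda t: "rayleigh" in t.lower()`. The `clock` parameter exists only so the deadline can be exercised deterministically in a test.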