    tests: reduce stress on CPU to 2 models (#12161) · 67451828
    Daniel Hiltgen authored
    * tests: reduce stress on CPU to 2 models
    
    This should avoid flakes caused by systems getting overloaded when
    3 (or more) models run concurrently.
    
    * tests: allow slow systems to pass on timeout
    
    If a slow system is still streaming a response, and the response
    will pass validation, don't fail just because the system is slow.
    
    * test: unload embedding models more quickly
utils_test.go 18.5 KB