"tests/kernels/test_cache.py" did not exist on "c9d5b6d4a8b3f51ff6c9eee7eb52bb5149d89b6a"
yugong333 authored
Reduce the kernel overhead when the number of active LoRAs is smaller than the maximum number of LoRAs. CUDA graphs are captured separately for each number of active LoRAs. (#32005) Signed-off-by: Yu Gong <yu3.gong@gmail.com>
ffe1fc7a