"examples/trials/mnist-nas/darts_mode/config_darts.yml" did not exist on "2a28a578fc373ecdbc08431b2d3ab4e1d6655146"
- 17 Dec, 2024 1 commit
Jesse Gross authored
Sometimes the KV cache requires defragmentation even without triggering the threshold heuristic. In this case, decoding will not be able to find a KV cache slot. This is particularly difficult for the caller to handle if it happens in between ubatches. To avoid this, we should immediately trigger a defrag.

In addition, a heavily fragmented cache can require more than max_moves to defragment. Currently, we stop when we hit the limit, but this can leave a cache that still does not have adequate space even after defragmentation is triggered. Instead, we should do multiple batches of processing until everything is complete.

Fixes #7949
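
For illustration only, the two behaviors described above can be sketched on a toy cache model. This is not the actual implementation: the cache is modeled as a flat Python list where None marks a free cell, and the names find_slot, defrag_pass, defrag_fully and find_slot_or_defrag are hypothetical. Only max_moves comes from the commit message. The sketch assumes a slot is a contiguous run of free cells and that max_moves is at least 1.

```python
from typing import List, Optional

Cell = Optional[int]  # None = free cell, int = hypothetical sequence id


def find_slot(cells: List[Cell], n: int) -> Optional[int]:
    """Return the start index of the first run of n (>= 1) free cells, or None."""
    run = 0
    for i, c in enumerate(cells):
        run = run + 1 if c is None else 0
        if run == n:
            return i - n + 1
    return None


def defrag_pass(cells: List[Cell], max_moves: int) -> int:
    """Move at most max_moves occupied cells from the tail into holes near the front."""
    moves = 0
    lo, hi = 0, len(cells) - 1
    while moves < max_moves:
        while lo < hi and cells[lo] is not None:  # advance to the first hole
            lo += 1
        while lo < hi and cells[hi] is None:  # retreat to the last occupied cell
            hi -= 1
        if lo >= hi:  # no hole left in front of an occupied cell
            break
        cells[lo], cells[hi] = cells[hi], None
        moves += 1
    return moves


def defrag_fully(cells: List[Cell], max_moves: int) -> int:
    """Run bounded passes until one makes no moves; return the number of useful passes."""
    passes = 0
    while defrag_pass(cells, max_moves) > 0:
        passes += 1
    return passes


def find_slot_or_defrag(cells: List[Cell], n: int, max_moves: int) -> Optional[int]:
    """Defragment immediately when no slot is found, then retry once."""
    slot = find_slot(cells, n)
    if slot is None:
        defrag_fully(cells, max_moves)
        slot = find_slot(cells, n)
    return slot
```

In this toy model, a cache of [1, None, 1, None, 1, None, 1, None] with max_moves=1 still has no run of four free cells after a single bounded pass, while repeating passes until no move is made frees cells 4 through 7.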