"vllm/model_executor/models/starcoder2.py" did not exist on "ba0bfd40e21cacfd5da6a1e43028a37258a29cb4"
[Bugfix] Fix use_cascade_attention handling for Alibi-based models on vllm/v1 (#15211)
Signed-off-by: h-sugi <h.sugi@ieee.org>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>