[Core] Allow full cudagraph with separate attention routines and orthogonal to compilation, add support for FA2 and FlashInfer (#20059)

Signed-off-by: fhl <2410591650@qq.com>
Signed-off-by: fhl2000 <63384265+fhl2000@users.noreply.github.com>
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: Lucas Wilkinson <lwilkins@redhat.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>