- 09 Dec, 2025 6 commits
-
-
Micah Williamson authored
Signed-off-by:Micah Williamson <micah.williamson@amd.com>
-
Lucas Wilkinson authored
Signed-off-by:
Lucas Wilkinson <lwilkins@redhat.com> Signed-off-by:
Lucas Wilkinson <LucasWilkinson@users.noreply.github.com> Co-authored-by:
Benjamin Chislett <chislett.ben@gmail.com>
-
Or Ozeri authored
Signed-off-by:Or Ozeri <oro@il.ibm.com>
-
Wentao Ye authored
Signed-off-by:yewentao256 <zhyanwentao@126.com>
-
Victor Ziliang Peng authored
Signed-off-by:Ziliang Peng <ziliang@character.ai>
-
Ming Yang authored
Signed-off-by:Ming Yang <minos.future@gmail.com>
-
- 08 Dec, 2025 1 commit
-
-
Charlie Fu authored
Signed-off-by:
charlifu <charlifu@amd.com> Signed-off-by:
Charlie Fu <Charlie.Fu@amd.com>
-
- 07 Dec, 2025 2 commits
-
-
Cyrus Leung authored
-
Cyrus Leung authored
Signed-off-by:DarkLight1337 <tlleungac@connect.ust.hk>
-
- 06 Dec, 2025 3 commits
-
-
Cyrus Leung authored
Signed-off-by:DarkLight1337 <tlleungac@connect.ust.hk>
-
rasmith authored
[CI/Build][AMD] Use ROCM_ATTN instead of FLASH_ATTN test for test_register_kv_caches for ROCm and update test for TRITON_ATTN (#29985) Signed-off-by:
Randall Smith <ransmith@amd.com> Co-authored-by:
Randall Smith <ransmith@amd.com> Co-authored-by:
TJian <tunjian.tan@embeddedllm.com>
-
Samuel Shen authored
Signed-off-by:
Samuel Shen <slshen@uchicago.edu> Co-authored-by:
Samuel Shen <slshen@uchicago.edu>
-
- 05 Dec, 2025 8 commits
-
-
Divakar Verma authored
Signed-off-by:Divakar Verma <divakar.verma@amd.com>
-
Matthew Bonanni authored
Signed-off-by:
Matthew Bonanni <mbonanni@redhat.com> Signed-off-by:
Matthew Bonanni <mbonanni001@gmail.com> Co-authored-by:
gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by:
Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by:
Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
-
Mark McLoughlin authored
Signed-off-by:Mark McLoughlin <markmc@redhat.com>
-
Nick Hill authored
Currently, when requests are cancelled while executing their final step, "completion" is handled based on normal stop processing (e.g. length or stop token), so the abort has no effect. This is typically not a problem, but when a kv connector is involved it thinks the request completed successfully rather than being aborted. This is problematic for disaggregated prefill which will free kv cache blocks if the request was aborted but not if it completed successfully—since the cancelled request will never be sent to the decode side, kv cache blocks remain pinned until the fall-back timeout expires. The problem is exacerbated when many requests are cancelled and/or there are large prefills whose forward pass takes a long time (since the window is bigger). This PR fixes the problem by processing pending aborts immediately prior to processing model output each step; we process only aborts, not new requests, since it's preferable for latency to process model outputs before new incoming requests. Fixes #26400. Signed-off-by:Nick Hill <nhill@redhat.com>
-
Nicolò Lucchesi authored
Signed-off-by:
NickLucche <nlucches@redhat.com> Co-authored-by:
Cyrus Leung <tlleungac@connect.ust.hk>
-
Mark McLoughlin authored
Signed-off-by:Mark McLoughlin <markmc@redhat.com>
-
Alec S authored
Signed-off-by:
Alec Solder <alecs@fb.com> Signed-off-by:
Alec S <10566873+alecsolder@users.noreply.github.com> Co-authored-by:
Alec Solder <alecs@fb.com> Co-authored-by:
gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by:
22quinn <33176974+22quinn@users.noreply.github.com>
-
rasmith authored
Signed-off-by:
Randall Smith <ransmith@amd.com> Co-authored-by:
Randall Smith <ransmith@amd.com>
-
- 04 Dec, 2025 6 commits
-
-
Lucas Wilkinson authored
Signed-off-by:Lucas Wilkinson <lwilkins@redhat.com>
-
Doug Smith authored
Signed-off-by:dougbtv <dosmith@redhat.com>
-
rasmith authored
[CI/Build][AMD] Skip test on test_hybrid_attention_mamba_tensor_shapes on ROCm, requires FLASHINFER (#29995) Signed-off-by:
Randall Smith <ransmith@amd.com> Co-authored-by:
Randall Smith <ransmith@amd.com>
-
Mark McLoughlin authored
Signed-off-by:Mark McLoughlin <markmc@redhat.com>
-
Charlie Fu authored
Signed-off-by:
charlifu <charlifu@amd.com> Co-authored-by:
Alexei-V-Ivanov-AMD <156011006+Alexei-V-Ivanov-AMD@users.noreply.github.com>
-
Cyrus Leung authored
Signed-off-by:DarkLight1337 <tlleungac@connect.ust.hk>
-
- 03 Dec, 2025 3 commits
-
-
Micah Williamson authored
Signed-off-by:Micah Williamson <micah.williamson@amd.com>
-
Lumis Chen authored
Signed-off-by:
LuminolT <lumischen01@gmail.com> Signed-off-by:
Lumis Chen <lumischen01@gmail.com> Co-authored-by:
Russell Bryant <rbryant@redhat.com>
-
rasmith authored
[CI/Build][AMD] Skip test_shared_storage_connector_hashes in test_shared_storage_connector.py due to hipErrorLaunchFailure when calling .cpu() (#29839) Signed-off-by:
Randall Smith <ransmith@amd.com> Co-authored-by:
Randall Smith <ransmith@amd.com>
-
- 02 Dec, 2025 11 commits
-
-
Micah Williamson authored
Signed-off-by:Micah Williamson <micah.williamson@amd.com>
-
Chauncey authored
Signed-off-by:
chaunceyjiang <chaunceyjiang@gmail.com> Signed-off-by:
Nick Hill <nhill@redhat.com> Co-authored-by:
Nick Hill <nhill@redhat.com>
-
Divakar Verma authored
Signed-off-by:Divakar Verma <divakar.verma@amd.com>
-
Harry Mellor authored
Signed-off-by:Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
杰兮 authored
Signed-off-by:
zhyajie <yajizhan@amd.com> Co-authored-by:
zhyajie <yajizhan@amd.com>
-
Cyrus Leung authored
Signed-off-by:DarkLight1337 <tlleungac@connect.ust.hk>
-
Divakar Verma authored
Signed-off-by:Divakar Verma <divakar.verma@amd.com>
-
Divakar Verma authored
Signed-off-by:
Divakar Verma <divakar.verma@amd.com> Co-authored-by:
Cyrus Leung <tlleungac@connect.ust.hk>
-
usberkeley authored
Signed-off-by:Bradley <bradley.b.pitt@gmail.com>
-
Zhuohan Li authored
Signed-off-by:
Zhuohan Li <zhuohan123@gmail.com> Signed-off-by:
Nick Hill <nhill@redhat.com> Co-authored-by:
Nick Hill <nhill@redhat.com>
-
Nick Hill authored
Signed-off-by:Nick Hill <nhill@redhat.com>
-