- 22 Feb, 2025 25 commits
-
Yan Ma authored
-
Helena Kloosterman authored
-
Cyrus Leung authored
-
Gregory Shtrasberg authored
-
Sage Moore authored
[V1][Kernel] Refactor the prefix_prefill kernel so that the caller no longer has to pass in the context lengths (#13095)
-
Kaixi Hou authored
-
Keyun Tong authored
-
youkaichao authored
Signed-off-by: youkaichao <youkaichao@gmail.com>
-
youkaichao authored
Signed-off-by: youkaichao <youkaichao@gmail.com>
-
Cyrus Leung authored
-
Jee Jee Li authored
-
Mark McLoughlin authored
-
Mark McLoughlin authored
-
Yu Chin Fabian Lim authored
-
Jennifer Zhao authored
Signed-off-by: Jennifer Zhao <7443418+JenZhao@users.noreply.github.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
-
Lu Fang authored
Signed-off-by: Lu Fang <lufang@fb.com>
-
Robin authored
[Bugfix] Fix benchmark script bug: inaccurate stats for vllm backend when max_model_len < input_len + output_len (#13691)
Signed-off-by: WangErXiao <863579016@qq.com>
-
Dipika Sikka authored
-
Shane A authored
-
Gordon Wong authored
-
Jun Duan authored
-
Robin authored
-
Keyun Tong authored
-
Yuan Tang authored
-
Isotr0py authored
Signed-off-by: Isotr0py <2037008807@qq.com>
-
- 21 Feb, 2025 15 commits
-
Lucas Wilkinson authored
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Co-authored-by: Patrick Horn <patrick.horn@gmail.com>
Co-authored-by: simon-mo <xmo@berkeley.edu>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
-
John Zheng authored
Signed-off-by: John Zheng <john.zheng@hp.com>
-
Isotr0py authored
-
leoneo authored
-
Kevin H. Luu authored
-
Gabriel Marinho authored
-
Szymon Ożóg authored
-
Nick Hill authored
-
Roger Wang authored
-
Harry Mellor authored
-
Kaixi Hou authored
-
Edwin Hernandez authored
-
Kante Yin authored
Signed-off-by: kerthcet <kerthcet@gmail.com>
-
Lingfan Yu authored
Signed-off-by: Lingfan Yu <lingfany@amazon.com>
-
Michael Goin authored
Signed-off-by: mgoin <mgoin64@gmail.com>
-