- 10 Jul, 2024 1 commit
-
-
Abhinav Goyal authored
-
- 09 Jul, 2024 1 commit
-
-
Swapnil Parekh authored
Co-authored-by: Swapnil Parekh <swapnilp@ibm.com>
Co-authored-by: Joe G <joseph.granados@h2o.ai>
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
-
- 02 Jul, 2024 5 commits
-
-
Qubitium-ModelCloud authored
Co-authored-by: Robert Shaw <rshaw@neuralmagic.com>
Co-authored-by: ZX <zx@lbx.dev>
-
Murali Andoorveedu authored
Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai>
-
Sirej Dua authored
Co-authored-by: Sirej Dua <sirej.dua@databricks.com>
Co-authored-by: Sirej Dua <Sirej Dua>
-
xwjiang2010 authored
Signed-off-by: Xiaowei Jiang <xwjiang2010@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
-
Alexander Matveev authored
-
- 01 Jul, 2024 1 commit
-
-
sroy745 authored
-
- 28 Jun, 2024 1 commit
-
-
Cody Yu authored
-
- 26 Jun, 2024 1 commit
-
-
Thomas Parnell authored
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
-
- 25 Jun, 2024 1 commit
-
-
Woo-Yeon Lee authored
[Speculative Decoding] Support draft model on different tensor-parallel size than target model (#5414)
-
- 21 Jun, 2024 1 commit
-
-
Joshua Rosenkranz authored
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
Co-authored-by: Thomas Parnell <tpa@zurich.ibm.com>
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
Co-authored-by: Davis Wertheimer <Davis.Wertheimer@ibm.com>
-
- 19 Jun, 2024 1 commit
-
-
zifeitong authored
-
- 15 Jun, 2024 1 commit
-
-
Cyrus Leung authored
-
- 11 Jun, 2024 1 commit
-
-
Nick Hill authored
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
-
- 05 Jun, 2024 1 commit
-
-
Nick Hill authored
-
- 03 Jun, 2024 1 commit
-
-
Cyrus Leung authored
-
- 25 May, 2024 1 commit
-
-
Lily Liu authored
-
- 20 May, 2024 1 commit
-
-
Alexei-V-Ivanov-AMD authored
Co-authored-by: Alexey Kondratiev <alexey.kondratiev@amd.com>
-
- 16 May, 2024 1 commit
-
-
Cody Yu authored
Co-authored-by: Cade Daniel <edacih@gmail.com>
Co-authored-by: Cade Daniel <cade@anyscale.com>
-
- 13 May, 2024 2 commits
-
-
Cody Yu authored
-
Cyrus Leung authored
Since #4335 was merged, the definition of ServerRunner in the tests has been the same as the one in the test for the OpenAI API. I have moved the class to the test utilities to avoid code duplication. (It has only been repeated twice so far, but I will add another similar test suite in #4200, which would duplicate the code a third time.) I have also moved the test utilities file (test_utils.py) into the test directory (tests/utils.py), since none of its code is used in the main package. Note that I have added __init__.py to each test subpackage and updated the ray.init() call in the test utilities file so that tests/utils.py can be imported with a relative import.
-
- 11 May, 2024 1 commit
-
-
Chang Su authored
-
- 10 May, 2024 1 commit
-
-
heeju-kim2 authored
Co-authored-by: Cade Daniel <edacih@gmail.com>
-
- 08 May, 2024 1 commit
-
-
Cody Yu authored
Co-authored-by: Cade Daniel <edacih@gmail.com>
-
- 07 May, 2024 1 commit
-
-
leiwen83 authored
Co-authored-by: Lei Wen <wenlei03@qiyi.com>
Co-authored-by: Cade Daniel <edacih@gmail.com>
Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>
-
- 04 May, 2024 1 commit
-
-
Cody Yu authored
-
- 03 May, 2024 2 commits
-
-
Cade Daniel authored
-
SangBin Cho authored
-
- 02 May, 2024 1 commit
-
-
SangBin Cho authored
[Bug fix][Core] assert num_new_tokens == 1 fails when SamplingParams.n is not 1 and max_tokens is large & Add tests for preemption (#4451)
-
- 01 May, 2024 1 commit
-
-
leiwen83 authored
Co-authored-by: Lei Wen <wenlei03@qiyi.com>
-
- 30 Apr, 2024 1 commit
-
-
leiwen83 authored
Co-authored-by: Lei Wen <wenlei03@qiyi.com>
-
- 23 Apr, 2024 1 commit
-
-
Cade Daniel authored
-
- 16 Apr, 2024 2 commits
-
-
Cade Daniel authored
-
Antoni Baum authored
-
- 09 Apr, 2024 1 commit
-
-
Cade Daniel authored
[Misc] [Core] Implement RFC "Augment BaseExecutor interfaces to enable hardware-agnostic speculative decoding" (#3837)
-
- 05 Apr, 2024 1 commit
-
-
Cade Daniel authored
-
- 03 Apr, 2024 1 commit
-
-
Cade Daniel authored
Co-authored-by: Lily Liu <lilyliupku@gmail.com>
-
- 25 Mar, 2024 2 commits
-
-
xwjiang2010 authored
-
SangBin Cho authored
-