- 27 Nov, 2024 1 commit
-
-
王敏 authored
2. Update the medusa README 3. Fix the benchmark_moe error
-
- 18 Nov, 2024 1 commit
-
-
王敏 authored
-
- 06 Nov, 2024 1 commit
-
-
王敏 authored
2. Add a medusa README to examples 3. Fix a typo that misconfigured input_positions in model_runner, resolving run failures in multiple models
-
- 24 Oct, 2024 1 commit
-
-
王敏 authored
-
- 25 Sep, 2024 1 commit
-
-
Travis Johnson authored
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
-
- 22 Sep, 2024 1 commit
-
-
Lily Liu authored
-
- 02 Sep, 2024 1 commit
-
-
Lily Liu authored
-
- 30 Aug, 2024 1 commit
-
-
afeldman-nm authored
-
- 25 Aug, 2024 1 commit
-
-
Nick Hill authored
-
- 22 Aug, 2024 1 commit
-
-
Abhinav Goyal authored
-
- 20 Aug, 2024 1 commit
-
-
Abhinav Goyal authored
-
- 09 Aug, 2024 1 commit
-
-
William Lin authored
-
- 05 Aug, 2024 1 commit
-
-
Cade Daniel authored
-
- 30 Jul, 2024 1 commit
-
-
Nick Hill authored
-
- 24 Jul, 2024 1 commit
-
-
Allen.Dou authored
-
- 21 Jul, 2024 1 commit
-
-
sroy745 authored
[Spec Decode] Disable Log Prob serialization to CPU for spec decoding for both draft and target models. (#6485)
-
- 19 Jul, 2024 2 commits
-
-
Woo-Yeon Lee authored
-
Thomas Parnell authored
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
-
- 17 Jul, 2024 1 commit
-
-
shangmingc authored
Co-authored-by: caishangming.csm <caishangming.csm@alibaba-inc.com>
-
- 10 Jul, 2024 2 commits
-
-
sroy745 authored
[Speculative Decoding] Enabling bonus token in speculative decoding for KV cache based models (#5765)
-
Abhinav Goyal authored
-
- 02 Jul, 2024 1 commit
-
-
Sirej Dua authored
Co-authored-by: Sirej Dua <sirej.dua@databricks.com>
Co-authored-by: Sirej Dua <Sirej Dua>
-
- 01 Jul, 2024 1 commit
-
-
sroy745 authored
-
- 28 Jun, 2024 1 commit
-
-
Cody Yu authored
-
- 25 Jun, 2024 1 commit
-
-
Woo-Yeon Lee authored
[Speculative Decoding] Support draft model on different tensor-parallel size than target model (#5414)
-
- 21 Jun, 2024 1 commit
-
-
Joshua Rosenkranz authored
Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com>
Co-authored-by: Thomas Parnell <tpa@zurich.ibm.com>
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
Co-authored-by: Davis Wertheimer <Davis.Wertheimer@ibm.com>
-
- 15 Jun, 2024 1 commit
-
-
Cyrus Leung authored
-
- 11 Jun, 2024 1 commit
-
-
Nick Hill authored
-
- 05 Jun, 2024 1 commit
-
-
Nick Hill authored
-
- 25 May, 2024 1 commit
-
-
Lily Liu authored
-
- 22 May, 2024 1 commit
-
-
Nick Hill authored
-
- 16 May, 2024 1 commit
-
-
Cody Yu authored
Co-authored-by: Cade Daniel <edacih@gmail.com>
Co-authored-by: Cade Daniel <cade@anyscale.com>
-
- 08 May, 2024 1 commit
-
-
Cody Yu authored
Co-authored-by: Cade Daniel <edacih@gmail.com>
-
- 07 May, 2024 1 commit
-
-
leiwen83 authored
Co-authored-by: Lei Wen <wenlei03@qiyi.com>
Co-authored-by: Cade Daniel <edacih@gmail.com>
Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>
-
- 04 May, 2024 1 commit
-
-
Cody Yu authored
-
- 03 May, 2024 1 commit
-
-
Cade Daniel authored
-
- 01 May, 2024 1 commit
-
-
leiwen83 authored
Co-authored-by: Lei Wen <wenlei03@qiyi.com>
-
- 26 Apr, 2024 1 commit
-
-
SangBin Cho authored
Co-authored-by: Danny Guinther <dguinther@neuralmagic.com>
-
- 23 Apr, 2024 1 commit
-
-
Cade Daniel authored
-
- 18 Apr, 2024 1 commit
-
-
SangBin Cho authored
Co-authored-by: SangBin Cho <sangcho@sangcho-LT93GQWG9C.local>
-