- 27 Nov, 2024 1 commit
-
-
王敏 authored
2. Update medusa readme 3. Fix benchmark_moe error
-
- 19 Nov, 2024 1 commit
-
-
王敏 authored
-
- 06 Nov, 2024 1 commit
-
-
王敏 authored
2. Add medusa readme to examples 3. Fix a typo in the input_positions configuration in model_runner that caused several models to fail to run
-
- 24 Oct, 2024 1 commit
-
-
王敏 authored
-
- 25 Sep, 2024 7 commits
-
-
Chen Zhang authored
Co-authored-by: simon-mo <xmo@berkeley.edu>
Co-authored-by: Chang Su <chang.s.su@oracle.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
-
Simon Mo authored
-
科英 authored
-
Cyrus Leung authored
-
Joe Runde authored
-
Archit Patke authored
-
Travis Johnson authored
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
-
- 24 Sep, 2024 1 commit
-
-
Simon Mo authored
-
- 23 Sep, 2024 2 commits
-
-
Alexander Matveev authored
-
Alex Brooks authored
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
-
- 21 Sep, 2024 1 commit
-
-
Cyrus Leung authored
-
- 19 Sep, 2024 1 commit
-
-
Nick Hill authored
-
- 18 Sep, 2024 3 commits
-
-
Joe Runde authored
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
-
Alexander Matveev authored
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
Co-authored-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-neuralmagic@users.noreply.github.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>
-
Aaron Pham authored
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
-
- 17 Sep, 2024 2 commits
-
-
sroy745 authored
-
Alex Brooks authored
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
-
- 16 Sep, 2024 1 commit
-
-
Nick Hill authored
-
- 13 Sep, 2024 3 commits
-
-
William Lin authored
-
Alexander Matveev authored
-
Cyrus Leung authored
-
- 12 Sep, 2024 3 commits
-
-
Roger Wang authored
[Hotfix][Core][VLM] Disable chunked prefill by default and prefix caching for multimodal models (#8425)
-
Nick Hill authored
-
youkaichao authored
-
- 11 Sep, 2024 2 commits
-
-
Aarni Koskela authored
-
zhuwenwen authored
-
- 10 Sep, 2024 1 commit
-
-
Cody Yu authored
[MISC] Keep chunked prefill enabled by default with long context when prefix caching is enabled (#8342)
-
- 09 Sep, 2024 1 commit
-
-
zhuwenwen authored
-
- 08 Sep, 2024 1 commit
-
-
Alexander Matveev authored
-
- 07 Sep, 2024 2 commits
-
-
Cyrus Leung authored
-
William Lin authored
-
- 06 Sep, 2024 1 commit
-
-
Patrick von Platen authored
Co-authored-by: Michael Goin <michael@neuralmagic.com>
-
- 04 Sep, 2024 1 commit
-
-
Harsha vardhan manoj Bikki authored
Co-authored-by: Harsha Bikki <harbikh@amazon.com>
-
- 03 Sep, 2024 3 commits
-
-
Antoni Baum authored
-
Alexander Matveev authored
-
Woosuk Kwon authored
-