- 09 Dec, 2025 12 commits
-
Tsukasa OI authored
Signed-off-by: Tsukasa OI <floss_llm@irq.a4lg.com>
-
liuquan authored
Signed-off-by: quanliu <18646313696@163.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
-
vllmellm authored
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
-
Dongjie Zou authored
Signed-off-by: baonudesifeizhai <baonudesifeizhai@gmail.com>
-
wang.yuqi authored
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
Signed-off-by: wang.yuqi <noooop@126.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
-
Tsukasa OI authored
[Model][Quantization] Restore MoE + GGUF models support (incl. Qwen3 MoE) by allowing Sideload Parameters (#30116)
Signed-off-by: Tsukasa OI <floss_llm@irq.a4lg.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
-
liangel-02 authored
Signed-off-by: Angel Li <liangel@meta.com>
-
Michael Goin authored
Signed-off-by: mgoin <mgoin64@gmail.com>
Signed-off-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
-
czhu-cohere authored
Signed-off-by: czhu-cohere <conway.zhu@cohere.com>
-
Zhewen Li authored
Signed-off-by: zhewenli <zhewenli@meta.com>
Signed-off-by: Zhewen Li <zhewenli@meta.com>
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
Ming Yang authored
Signed-off-by: Ming Yang <minos.future@gmail.com>
- 08 Dec, 2025 7 commits
-
roikoren755 authored
Signed-off-by: Roi Koren <roik@nvidia.com>
-
Vasiliy Kuznetsov authored
Signed-off-by: vasiliy <vasiliy@fb.com>
-
shaharmor98 authored
Signed-off-by: Shahar Mor <smor@nvidia.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
-
Daniel Cámpora authored
Signed-off-by: Daniel Campora <961215+dcampora@users.noreply.github.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
-
wang.yuqi authored
[Model][7/N] Improve all pooling task | Deprecation as_reward_model. Extract hidden states prefer using new multi-vector retrieval API (#26686)
Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
-
Dazhi Jiang authored
Signed-off-by: Dazhi Jiang <dazhi_jiang@163.com>
-
Zhiwei authored
Signed-off-by: ZhiweiYan-96 <zhiwei.yan@amd.com>
- 07 Dec, 2025 5 commits
-
ElizaWszola authored
Signed-off-by: ElizaWszola <ewszola@redhat.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
Co-authored-by: yewentao256 <zhyanwentao@126.com>
-
Wentao Ye authored
[Perf] Deepgemm fused layout kernel for activations, 4.3% throughput improvement, 10.7% TTFT improvement. (#29546)
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
Jinzhen Lin authored
Signed-off-by: Jinzhen Lin <jinzhen.ljz@antgroup.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
-
Cyrus Leung authored
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
- 06 Dec, 2025 6 commits
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
Cyrus Leung authored
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
-
Peter Salas authored
Signed-off-by: Peter Salas <peter@fixie.ai>
-
Dongjie Zou authored
Signed-off-by: baonudesifeizhai <baonudesifeizhai@gmail.com>
-
yuttian1 authored
Signed-off-by: yuttian1 <yuttian@amd.com>
-
rasmith authored
[CI/Build][AMD][Quantization] Fix test_int8_kernel.py by updating int8_utils to use hip.libdevice.round (#30151)
Signed-off-by: Randall Smith <ransmith@amd.com>
Co-authored-by: Randall Smith <ransmith@amd.com>
- 05 Dec, 2025 7 commits
-
Wentao Ye authored
Signed-off-by: yewentao256 <zhyanwentao@126.com>
-
Divakar Verma authored
Signed-off-by: Divakar Verma <divakar.verma@amd.com>
-
Matthew Bonanni authored
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
-
Yi Liu authored
Signed-off-by: yiliu30 <yi4.liu@intel.com>
-
Zhiwei authored
Signed-off-by: ZhiweiYan-96 <zhiwei.yan@amd.com>
-
amitz-nv authored
[Frontend][Model] Add 'float16' to possible mamba cache dtype values, override mamba SSM cache dtype value for NemotronH (#29978)
Signed-off-by: amitz-nv <203509407+amitz-nv@users.noreply.github.com>
-
Alexander Matveev authored
Signed-off-by: Alexander Matveev <amatveev@redhat.com>
- 04 Dec, 2025 3 commits
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Jee Jee Li authored
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
-
Tao Yun authored
Signed-off-by: taoyun <1069423820@qq.com>
Signed-off-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>