- 26 Mar, 2025 2 commits
- 24 Mar, 2025 2 commits
- 19 Mar, 2025 1 commit
-
-
lizhigong authored
-
- 18 Mar, 2025 1 commit
-
-
lizhigong authored
-
- 17 Mar, 2025 2 commits
- 15 Mar, 2025 2 commits
- 14 Mar, 2025 2 commits
- 12 Mar, 2025 1 commit
-
-
王敏 authored
-
- 11 Mar, 2025 1 commit
-
-
王敏 authored
-
- 28 Feb, 2025 1 commit
-
-
gaoqiong authored
-
- 20 Feb, 2025 1 commit
-
-
yangql authored
-
- 19 Feb, 2025 1 commit
-
-
zhuwenwen authored
-
- 18 Feb, 2025 2 commits
- 17 Feb, 2025 1 commit
-
-
王敏 authored
-
- 13 Feb, 2025 1 commit
-
-
王敏 authored
-
- 12 Feb, 2025 1 commit
-
-
王敏 authored
-
- 11 Feb, 2025 2 commits
- 08 Feb, 2025 3 commits
- 06 Feb, 2025 2 commits
-
-
Lu Fang authored
-
Lucas Wilkinson authored
-
- 05 Feb, 2025 5 commits
-
-
Roger Wang authored
-
Rahul Tuli authored
-
Kyle Sayers authored
Signed-off-by: mgoin <michael@neuralmagic.com>
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Co-authored-by: mgoin <michael@neuralmagic.com>
-
Harry Mellor authored
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
-
Aviv Keshet authored
Signed-off-by: Aviv Keshet <akeshet@scaledcognition.com>
-
- 04 Feb, 2025 2 commits
-
-
Hongxia Yang authored
Signed-off-by: Hongxia Yang <hongxia.yang@amd.com>
Co-authored-by: Matthew Wong <Matthew.Wong2@amd.com>
-
Kyle Sayers authored
-
- 03 Feb, 2025 4 commits
-
-
kushanam authored
-
Srikanth Srinivas authored
Fix to AWQ quant loading of the new R1 model
The new optimized MoE kernels for a large number of experts, `moe_wn16`, use AWQ quant, which requires the attention layers to be in 16-bit. The current merge has broken this, and `get_quant_method` must return None for it to work correctly again.
---------
Signed-off-by: Srikanth Srinivas <srikanth@astrum.ai>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Beim <beim2015@outlook.com>
Signed-off-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Signed-off-by: mgoin <michael@neuralmagic.com>
Signed-off-by: npanpaliya <nishidha.panpaliya@partner.ibm.com>
Signed-off-by: Aleksandr Malyshev <maleksan@amd.com>
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
Signed-off-by: simon-mo <xmo@berkeley.edu>
Signed-off-by: Cody Yu <hao.yu.cody@gmail.com>
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
Signed-off-by: Ryan N <ryan.nguyen@centml.ai>
Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Signed-off-by: Rahul Tuli <rahul@neuralmagic.com>
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Signed-off-by: simon-mo <simon.mo@hey.com>
Signed-off-by: Vicente Herrera <vicenteherrera@vicenteherrera.com>
Signed-off-by: Jinzhen Lin <linjinzhen@hotmail.com>
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Signed-off-by: Shawn Du <shawnd200@outlook.com>
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Beim <805908499@qq.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
Co-authored-by: mgoin <michael@neuralmagic.com>
Co-authored-by: simon-mo <xmo@berkeley.edu>
Co-authored-by: Nishidha <nishidha.panpaliya@partner.ibm.com>
Co-authored-by: Lucas Wilkinson <LucasWilkinson@users.noreply.github.com>
Co-authored-by: Aleksandr Malyshev <164964928+maleksan85@users.noreply.github.com>
Co-authored-by: Aleksandr Malyshev <maleksan@amd.com>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Co-authored-by: simon-mo <simon.mo@hey.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
Co-authored-by: Tyler Michael Smith <tysmith@redhat.com>
Co-authored-by: Alexander Matveev <59768536+alexm-neuralmagic@users.noreply.github.com>
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>
Co-authored-by: Chen Zhang <zhangch99@outlook.com>
Co-authored-by: Kevin H. Luu <kevin@anyscale.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
Co-authored-by: Ryan Nguyen <96593302+xpbowler@users.noreply.github.com>
Co-authored-by: Brian Dellabetta <brian-dellabetta@users.noreply.github.com>
Co-authored-by: fade_away <1028552010@qq.com>
Co-authored-by: weilong.yu <weilong.yu@shopee.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Eldar Kurtic <eldarkurtic314@gmail.com>
Co-authored-by: Rahul Tuli <rahul@neuralmagic.com>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
Co-authored-by: Vicente Herrera <vicenteherrera@vicenteherrera.com>
Co-authored-by: Jinzhen Lin <linjinzhen@hotmail.com>
Co-authored-by: Shawn Du <shawnd200@outlook.com>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
-
Eldar Kurtic authored
Thanks @kylesayrs for catching this!
-
Yang Chen authored
sgl_moe_align_block_size is based on: https://github.com/sgl-project/sglang/commit/ded9fcd09a43d5e7d5bb31a2bc3e9fc21bf65d2a
moe_align_block_size is based on: https://github.com/sgl-project/sglang/commit/ba5112ff691d791a9e38c6c71f59324a5fcb49d0
Signed-off-by: Yang Chen <yangche@fb.com>
-