- 13 Mar, 2025 2 commits
-
-
Szymon Ożóg authored
Signed-off-by:SzymonOzog <szymon.ozog@aleph-alpha.com>
-
Gregory Shtrasberg authored
Signed-off-by:Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
-
- 12 Mar, 2025 2 commits
-
-
Pavani Majety authored
Signed-off-by:Pavani Majety <pmajety@nvidia.com>
-
Szymon Ożóg authored
Signed-off-by:SzymonOzog <szymon.ozog@aleph-alpha.com>
-
- 11 Mar, 2025 1 commit
-
-
Jeff Daily authored
Signed-off-by:Jeff Daily <jeff.daily@amd.com>
-
- 10 Mar, 2025 1 commit
-
-
Szymon Ożóg authored
Signed-off-by:
SzymonOzog <szymon.ozog@aleph-alpha.com> Signed-off-by:
SzymonOzog <szymon.ozog@gmail.com>
-
- 07 Mar, 2025 2 commits
-
-
Jinzhen Lin authored
Signed-off-by:Jinzhen Lin <linjinzhen@hotmail.com>
-
Luka Govedič authored
Signed-off-by:luka <luka@neuralmagic.com>
-
- 05 Mar, 2025 1 commit
-
-
rainkert authored
Signed-off-by:
dangshunya <dangshunya@baichuan-inc.com> Co-authored-by:
dangshunya <dangshunya@baichuan-inc.com>
-
- 01 Mar, 2025 1 commit
-
-
YajieWang authored
-
- 28 Feb, 2025 2 commits
-
-
Luka Govedič authored
[torch.compile] Fix RMSNorm + quant fusion in the non-cutlass-fp8 case, rename RedundantReshapesPass to NoopEliminationPass (#10902) Signed-off-by:luka <luka@neuralmagic.com>
-
Jee Jee Li authored
-
- 27 Feb, 2025 1 commit
-
-
Szymon Ożóg authored
-
- 26 Feb, 2025 2 commits
-
-
Woosuk Kwon authored
Signed-off-by:Woosuk Kwon <woosuk.kwon@berkeley.edu>
-
Michael Goin authored
Signed-off-by:mgoin <mgoin64@gmail.com>
-
- 25 Feb, 2025 3 commits
-
-
Michael Goin authored
-
Tyler Michael Smith authored
Signed-off-by:Tyler Michael Smith <tyler@neuralmagic.com>
-
cjackal authored
Signed-off-by:cjackal <44624812+cjackal@users.noreply.github.com>
-
- 24 Feb, 2025 1 commit
-
-
Jongseok Park authored
-
- 22 Feb, 2025 1 commit
-
-
Gregory Shtrasberg authored
-
- 21 Feb, 2025 1 commit
-
-
Lucas Wilkinson authored
Signed-off-by:
Lucas Wilkinson <lwilkinson@neuralmagic.com> Signed-off-by:
Lucas Wilkinson <lwilkins@redhat.com> Co-authored-by:
Patrick Horn <patrick.horn@gmail.com> Co-authored-by:
simon-mo <xmo@berkeley.edu> Co-authored-by:
Tyler Michael Smith <tyler@neuralmagic.com>
-
- 18 Feb, 2025 1 commit
-
-
Tyler Michael Smith authored
Signed-off-by:Tyler Michael Smith <tyler@neuralmagic.com>
-
- 16 Feb, 2025 1 commit
-
-
Kyle Sayers authored
-
- 15 Feb, 2025 1 commit
-
-
rasmith authored
-
- 14 Feb, 2025 2 commits
-
-
Michael Goin authored
Signed-off-by:mgoin <mgoin64@gmail.com>
-
Tyler Michael Smith authored
Signed-off-by:Tyler Michael Smith <tyler@neuralmagic.com>
-
- 13 Feb, 2025 1 commit
-
-
Daniel Han authored
-
- 12 Feb, 2025 2 commits
-
-
Michael Goin authored
-
Qubitium-ModelCloud authored
-
- 07 Feb, 2025 1 commit
-
-
TJian authored
[ROCm] [Feature] [Doc] [Dockerfile] [BugFix] Support Per-Token-Activation Per-Channel-Weight FP8 Quantization Inferencing (#12501)
-
- 06 Feb, 2025 4 commits
-
-
Lu Fang authored
Signed-off-by:Lu Fang <lufang@fb.com>
-
Dipika Sikka authored
-
Lu Fang authored
-
Lucas Wilkinson authored
-
- 05 Feb, 2025 2 commits
-
-
Rahul Tuli authored
-
Kyle Sayers authored
Signed-off-by:
mgoin <michael@neuralmagic.com> Signed-off-by:
Kyle Sayers <kylesayrs@gmail.com> Co-authored-by:
mgoin <michael@neuralmagic.com>
-
- 04 Feb, 2025 2 commits
-
-
Hongxia Yang authored
Signed-off-by:
Hongxia Yang <hongxia.yang@amd.com> Co-authored-by:
Matthew Wong <Matthew.Wong2@amd.com>
-
Kyle Sayers authored
-
- 03 Feb, 2025 2 commits
-
-
kushanam authored
-
Srikanth Srinivas authored
Fix to AWQ quant loading of the new R1 model The new optimized MoE kernels for a large number of experts `moe_wn16` uses AWQ quant which requires the attention layers to be in 16bit The current merge has broken this, and the `get_quant_method` must return None for it to work correctly again --------- Signed-off-by:
Srikanth Srinivas <srikanth@astrum.ai> Signed-off-by:
Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by:
Beim <beim2015@outlook.com> Signed-off-by:
rshaw@neuralmagic.com <rshaw@neuralmagic.com> Signed-off-by:
mgoin <michael@neuralmagic.com> Signed-off-by:
npanpaliya <nishidha.panpaliya@partner.ibm.com> Signed-off-by:
Aleksandr Malyshev <maleksan@amd.com> Signed-off-by:
Lucas Wilkinson <lwilkinson@neuralmagic.com> Signed-off-by:
simon-mo <xmo@berkeley.edu> Signed-off-by:
Cody Yu <hao.yu.cody@gmail.com> Signed-off-by:
Chen Zhang <zhangch99@outlook.com> Signed-off-by:
Tyler Michael Smith <tyler@neuralmagic.com> Signed-off-by:
Ryan N <ryan.nguyen@centml.ai> Signed-off-by:
Brian Dellabetta <bdellabe@redhat.com> Signed-off-by:
Jee Jee Li <pandaleefree@gmail.com> Signed-off-by:
Rahul Tuli <rahul@neuralmagic.com> Signed-off-by:
Russell Bryant <rbryant@redhat.com> Signed-off-by:
simon-mo <simon.mo@hey.com> Signed-off-by:
Vicente Herrera <vicenteherrera@vicenteherrera.com> Signed-off-by:
Jinzhen Lin <linjinzhen@hotmail.com> Signed-off-by:
Woosuk Kwon <woosuk.kwon@berkeley.edu> Signed-off-by:
Shawn Du <shawnd200@outlook.com> Signed-off-by:
Kunshang Ji <kunshang.ji@intel.com> Signed-off-by:
youkaichao <youkaichao@gmail.com> Co-authored-by:
Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by:
Beim <805908499@qq.com> Co-authored-by:
Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com> Co-authored-by:
mgoin <michael@neuralmagic.com> Co-authored-by:
simon-mo <xmo@berkeley.edu> Co-authored-by:
Nishidha <nishidha.panpaliya@partner.ibm.com> Co-authored-by:
Lucas Wilkinson <LucasWilkinson@users.noreply.github.com> Co-authored-by:
Aleksandr Malyshev <164964928+maleksan85@users.noreply.github.com> Co-authored-by:
Aleksandr Malyshev <maleksan@amd.com> Co-authored-by:
Woosuk Kwon <woosuk.kwon@berkeley.edu> Co-authored-by:
simon-mo <simon.mo@hey.com> Co-authored-by:
Michael Goin <mgoin64@gmail.com> Co-authored-by:
Zhuohan Li <zhuohan123@gmail.com> Co-authored-by:
Tyler Michael Smith <tysmith@redhat.com> Co-authored-by:
Alexander Matveev <59768536+alexm-neuralmagic@users.noreply.github.com> Co-authored-by:
Roger Wang <136131678+ywang96@users.noreply.github.com> Co-authored-by:
Cody Yu <hao.yu.cody@gmail.com> Co-authored-by:
Chen Zhang <zhangch99@outlook.com> Co-authored-by:
Kevin H. Luu <kevin@anyscale.com> Co-authored-by:
Tyler Michael Smith <tyler@neuralmagic.com> Co-authored-by:
Ryan Nguyen <96593302+xpbowler@users.noreply.github.com> Co-authored-by:
Brian Dellabetta <brian-dellabetta@users.noreply.github.com> Co-authored-by:
fade_away <1028552010@qq.com> Co-authored-by:
weilong.yu <weilong.yu@shopee.com> Co-authored-by:
Jee Jee Li <pandaleefree@gmail.com> Co-authored-by:
Eldar Kurtic <eldarkurtic314@gmail.com> Co-authored-by:
Rahul Tuli <rahul@neuralmagic.com> Co-authored-by:
Russell Bryant <rbryant@redhat.com> Co-authored-by:
Vicente Herrera <vicenteherrera@vicenteherrera.com> Co-authored-by:
Jinzhen Lin <linjinzhen@hotmail.com> Co-authored-by:
Shawn Du <shawnd200@outlook.com> Co-authored-by:
Kunshang Ji <kunshang.ji@intel.com> Co-authored-by:
youkaichao <youkaichao@gmail.com>
-