- 21 Jan, 2025 1 commit
-
-
Jinzhen Lin authored
[Kernel] optimize moe_align_block_size for cuda graph and large num_experts (e.g. DeepSeek-V3) (#12222)
Signed-off-by: Jinzhen Lin <linjinzhen@hotmail.com>
Co-authored-by: Michael Goin <mgoin@redhat.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
-
- 27 Dec, 2024 1 commit
-
-
Simon Mo authored
Signed-off-by: mgoin <michael@neuralmagic.com>
Co-authored-by: mgoin <michael@neuralmagic.com>
Co-authored-by: robertgshaw2-neuralmagic <rshaw@neuralmagic.com>
-
- 24 Oct, 2024 1 commit
-
-
Charlie Fu authored
Signed-off-by: charlifu <charlifu@amd.com>
-
- 09 Jun, 2024 1 commit
-
-
bnellnm authored
-
- 22 May, 2024 1 commit
-
-
Michael Goin authored
-
- 18 Mar, 2024 1 commit
-
-
Woosuk Kwon authored
-
- 15 Mar, 2024 1 commit
-
-
akhoroshev authored
-
- 30 Jan, 2024 2 commits
-
-
Philipp Moritz authored
Co-authored-by: chen shen <scv119@gmail.com>
-
wangding zeng authored
Co-authored-by: roy <jasonailu87@gmail.com>
-