[Feat] Support non-gated MoE with Marlin, NVFP4 CUTLASS, FP8, INT8, compressed-tensors (#32257)
Signed-off-by: Tomer Natan <tbarnatan@computelab-frontend-8.nvidia.com>
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: Tomer Natan <tbarnatan@computelab-frontend-8.nvidia.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
Co-authored-by: Tomer Natan <tbarnatan@ipp1-1429.ipp1a1.colossus.nvidia.com>