Pavani Majety authored
a2691733
[Quantization/NVFP4] Speed up TRTLLM NVFP4 MOE weight loading and fix K/V scale loading for MLA Attn (#25968)
Signed-off-by: Pavani Majety <pmajety@nvidia.com>