python/sglang/srt/layers/quantization/modelopt_quant.py · 915140fd18c9ff4193e994e6d756ea762a52240a · change / sglang · GitLab

[NVIDIA] Add Low Latency NVFP4 decode kernels from Flashinfer (#8552) · 915140fd
azhurkevich authored Aug 04, 2025
```
Co-authored-by: Cheng Wan <cwan@x.ai>
```

modelopt_quant.py 45.5 KB