[Bugfix] Revert "Zero-init MLA attention output buffers to prevent NaN from CUDA graph padding" (#38359)

Signed-off-by: Elvir Crncevic <elvircrn@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>