1. 15 Oct, 2025 1 commit
    • [BugFix] Phaseout dependency of Triton in sink examples to make CI happy (#1045) · 8f001e02
      Tong WU authored
      
      
      * [BugFix] Phaseout dependency of Triton in sink examples to make CI happy
      
      - Added `benchmark_gqa_sink_fwd.py` and `benchmark_mha_sink_fwd.py` to evaluate performance of GQA and MHA attention mechanisms using Triton.
      - Refactored existing attention sink implementations to remove Triton kernel definitions from the reference programs, streamlining the code.
      - Updated input generation and benchmarking logic to enhance configurability and performance measurement.
      - Improved overall structure and organization of the examples for better clarity and usability.
      
      * [Lint]: [pre-commit.ci] auto fixes [...]
      
      ---------
      Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>