vllm_cscc/benchmarks/kernels/benchmark_trtllm_decode_attention.py
  1. 21 Aug, 2025 1 commit
    • [Core] Always use tensor cores for Flashinfer Decode Wrapper (#23214) · 1d353b63
      Pavani Majety authored Aug 21, 2025
      
      Signed-off-by: Pavani Majety <pmajety@nvidia.com>
  2. 19 Aug, 2025 1 commit
    • [NVIDIA] Support Flashinfer TRTLLM FP8-q/kv/out Attention Kernel (#21716) · 03752dba
      elvischenv authored Aug 19, 2025
      
      Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>
      Co-authored-by: Michael Goin <mgoin64@gmail.com>
      Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
  3. 05 Aug, 2025 1 commit
    • [NVIDIA] Support Flashinfer TRT-LLM Prefill Attention Kernel (#22095) · 83156c7b
      elvischenv authored Aug 05, 2025
      
      Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>
  4. 29 Jul, 2025 1 commit
    • [Bugfix] Fix workspace buffer None issue for Flashinfer TRTLLM Backend (#21525) · 58b11b24
      elvischenv authored Jul 29, 2025
      
      Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>
  5. 11 Jul, 2025 1 commit
    • [Core] Add Flashinfer TRTLLM Backend for Flashinfer decode path (SM100). (#19825) · 7bd4c37a
      Pavani Majety authored Jul 11, 2025
      
      Signed-off-by: Pavani Majety <pmajety@nvidia.com>
      Signed-off-by: mgoin <mgoin64@gmail.com>
      Co-authored-by: shuw <shuw@nvidia.com>
      Co-authored-by: mgoin <mgoin64@gmail.com>