  1. 13 May, 2026 1 commit
  2. 23 Mar, 2026 1 commit
  3. 23 Jan, 2026 1 commit
  4. 15 Jan, 2026 1 commit
  5. 11 Dec, 2025 1 commit
  6. 24 Oct, 2025 1 commit
  7. 17 Oct, 2025 1 commit
  8. 24 Sep, 2025 1 commit
  9. 10 Sep, 2025 2 commits
  10. 05 Aug, 2025 1 commit
  11. 31 Jul, 2025 1 commit
  12. 15 Jul, 2025 3 commits
  13. 12 Jul, 2025 1 commit
  14. 04 Jul, 2025 1 commit
    • Use TMA to optimize internode dispatch. (#276) · a2fa3b73
      Shangyan Zhou authored
      
      
      * Add TMA buffer allocation
      
      * Use TMA for forwarders and NVL receivers
      
      * Use lane 31 to operate TMA.
      
      * Change rdma buffer layout.
      
      * Use TMA to transfer scales also.
      
      * Increase the NVL recv buffer size.
      
      * Disable early stopping.
      
      * Apply similar optimizations on receiver warps.
      
      * Prevent warp divergence.
      
      * Disable aggressive ptx by default.
      
      * Revert using TMA to transfer scales.
      
      * Format.
      
      * Change the layout of dispatch NVL buffer.
      
      * Move topk transformation to recv warps.
      
      * Use TMA to transfer all data in forward warps
      
      * Use TMA to store scales.
      
      * Code lint
      
      ---------
      Co-authored-by: Chenggang Zhao <chenggangz@deepseek.com>
  15. 27 Jun, 2025 1 commit
  16. 11 Jun, 2025 1 commit
    • Support Ampere architecture (#204) · b8d90fb7
      Chenggang Zhao authored
      * Update README
      
      * Update `setup.py`
      
      * Fix headers
      
      * Add `DISABLE_NVSHMEM` for APIs
      
      * Fix launch
      
      * Fix TMA settings
      
      * Fix TMA usages
      
      * Fix dlink
      
      * Separate layout kernels
      
      * Update version
      
      * Add `is_sm90_compiled`
      
      * Fix tests
      
      * Add NVLink connection checks
      
      * Update README
      
      * Fix tests
      
      * Add some comments
      
      * Minor fix
      
      * Minor fix
      
      * Fix bugs
  17. 19 May, 2025 1 commit
  18. 25 Feb, 2025 1 commit