  1. 23 May, 2025 1 commit
  2. 22 May, 2025 3 commits
  3. 21 May, 2025 2 commits
  4. 20 May, 2025 2 commits
  5. 15 May, 2025 1 commit
  6. 14 May, 2025 1 commit
  7. 13 May, 2025 1 commit
  8. 11 May, 2025 1 commit
  9. 08 May, 2025 2 commits
  10. 07 May, 2025 3 commits
  11. 06 May, 2025 2 commits
  12. 05 May, 2025 1 commit
  13. 01 May, 2025 1 commit
  14. 30 Apr, 2025 2 commits
  15. 29 Apr, 2025 2 commits
  16. 28 Apr, 2025 1 commit
  17. 27 Apr, 2025 1 commit
  18. 25 Apr, 2025 3 commits
  19. 24 Apr, 2025 3 commits
    • Introduce nvte_memset to provide a fill kernel that is faster than... · 62d1b2bd
      jberchtold-nvidia authored
      
      Introduce nvte_memset to provide a fill kernel that is faster than cudaMemsetAsync for small sizes (#1716)
      
      * nvte_memset fills a single float value
      Signed-off-by: Jeremy Berchtold <jberchtold@nvidia.com>
      
      * Support larger sizes than a single value and add tests
      Signed-off-by: Jeremy Berchtold <jberchtold@nvidia.com>
      
      ---------
      Signed-off-by: Jeremy Berchtold <jberchtold@nvidia.com>
    • [DCU] Fix failed test cases · 3ce226ae
      wenjh authored
      
      
      Because the warp size differs between NVIDIA (32) and DTK (64), all of the
      OperatorTest/CTDBiasTestSuite.TestCTDBias/* test cases fail except:
      
      * OperatorTest/CTDBiasTestSuite.TestCTDBias/bfloat16Xfloat32X65536X128
      * OperatorTest/CTDBiasTestSuite.TestCTDBias/bfloat16Xfloat16X65536X128
      * OperatorTest/CTDBiasTestSuite.TestCTDBias/bfloat16Xbfloat16X65536X128
      * OperatorTest/CTDBiasTestSuite.TestCTDBias/bfloat16Xfloat8e5m2X65536X128
      * OperatorTest/CTDBiasTestSuite.TestCTDBias/bfloat16Xfloat8e4m3X65536X128
      
      This commit is intended to fix the failing test cases.
      Signed-off-by: wenjh <wenjh@sugon.com>
    • [DCU] Fix crash test cases · 46c81675
      wenjh authored
      
      
      Because the compiler generated incorrect code, the following test cases crashed:
      
      * OperatorTest/CTTestSuite.TestCastTranspose/bfloat16Xbfloat16X2048X12288
      * OperatorTest/CTTestSuite.TestCastTranspose/bfloat16Xbfloat16X65536X128
      * OperatorTest/CTTestSuite.TestCastTranspose/bfloat16Xbfloat16X256X65536
      
      This commit is intended to fix these test cases.
      Signed-off-by: wenjh <wenjh@sugon.com>
  20. 23 Apr, 2025 1 commit
  21. 22 Apr, 2025 3 commits
  22. 21 Apr, 2025 1 commit
  23. 19 Apr, 2025 1 commit
  24. 18 Apr, 2025 1 commit