- 08 May, 2025 1 commit
-
-
wenjh authored
Use hipMallocAsync instead of hipMalloc by default in rocblas_gemm, and add fp16_fp16_fp32 support to rocblas_gemm. Signed-off-by: wenjh <wenjh@sugon.com>
-
- 07 May, 2025 2 commits
- 06 May, 2025 5 commits
-
-
yuguo authored
-
wenjh authored
Fix the launch bounds of multi_tensor_apply_kernel and thd_out_correction_kernel. Signed-off-by: wenjh <wenjh@sugon.com>
-
-
yuguo authored
-
wenjh authored
Fix launch parameters exceeding the launch bounds (256) for kernels in rocm_gemm.cu. Signed-off-by: wenjh <wenjh@sugon.com>
-
- 30 Apr, 2025 1 commit
-
-
wenjh authored
[RocblasGemm] Add support for AB(bf16)D(fp32). Signed-off-by: wenjh <wenjh@sugon.com>
-
- 29 Apr, 2025 5 commits
- 28 Apr, 2025 1 commit
-
-
yuguo authored
-
- 27 Apr, 2025 2 commits
-
-
wenjh authored
Signed-off-by: wenjh <wenjh@sugon.com>
-
wenjh authored
Ref params of rmsnorm cause the program to crash with a 'nil' error. Signed-off-by: wenjh <wenjh@sugon.com>
-
- 25 Apr, 2025 5 commits
-
-
-
yuguo authored
-
panning authored
The Python API `rmsnorm_forward` returns 3 values rather than 2 as of V2.3. Signed-off-by: wenjh <wenjh@sugon.com>
-
-
yuguo authored
-
- 24 Apr, 2025 2 commits
-
-
wenjh authored
Because the warp size differs between NVIDIA (32) and DTK (64), all OperatorTest/CTDBiasTestSuite.TestCTDBias/* tests fail except:
* OperatorTest/CTDBiasTestSuite.TestCTDBias/bfloat16Xfloat32X65536X128
* OperatorTest/CTDBiasTestSuite.TestCTDBias/bfloat16Xfloat16X65536X128
* OperatorTest/CTDBiasTestSuite.TestCTDBias/bfloat16Xbfloat16X65536X128
* OperatorTest/CTDBiasTestSuite.TestCTDBias/bfloat16Xfloat8e5m2X65536X128
* OperatorTest/CTDBiasTestSuite.TestCTDBias/bfloat16Xfloat8e4m3X65536X128
This commit fixes the failing tests. Signed-off-by: wenjh <wenjh@sugon.com>
-
wenjh authored
Because the compiler generated incorrect code, the following test cases crashed:
* OperatorTest/CTTestSuite.TestCastTranspose/bfloat16Xbfloat16X2048X12288
* OperatorTest/CTTestSuite.TestCastTranspose/bfloat16Xbfloat16X65536X128
* OperatorTest/CTTestSuite.TestCastTranspose/bfloat16Xbfloat16X256X65536
This commit fixes these test cases. Signed-off-by: wenjh <wenjh@sugon.com>
-
- 23 Apr, 2025 2 commits
- 22 Apr, 2025 1 commit
-
-
yuguo authored
-
- 18 Apr, 2025 1 commit
-
-
yuguo authored
-
- 17 Apr, 2025 2 commits
- 16 Apr, 2025 1 commit
-
-
yuguo authored
-
- 14 Apr, 2025 1 commit
-
-
yuguo authored
-
- 11 Apr, 2025 2 commits
-
-
-
yuguo authored
-
- 10 Apr, 2025 2 commits
- 09 Apr, 2025 2 commits
-
-
-
yuguo authored
-
- 08 Apr, 2025 2 commits