- 06 Aug, 2025 1 commit
-
-
yuguo authored
-
- 18 Jul, 2025 1 commit
-
-
yuguo authored
-
- 16 Jul, 2025 1 commit
-
-
yuguo authored
-
- 18 Jun, 2025 1 commit
-
-
yuguo authored
-
- 26 May, 2025 1 commit
-
-
wenjh authored
Use OCP FP8. Work around test_cast_float8blockwise.cu linking the wrong std::max. Signed-off-by: wenjh <wenjh@sugon.com>
-
- 08 May, 2025 1 commit
-
-
wenjh authored
Use hipMallocAsync instead of hipMalloc by default in rocblas_gemm, and add support for fp16_fp16_fp32 in rocblas_gemm. Signed-off-by: wenjh <wenjh@sugon.com>
-
- 06 May, 2025 2 commits
-
-
yuguo authored
-
wenjh authored
Fix launch parameters exceeding the launch bounds (256) for kernels in rocm_gemm.cu. Signed-off-by: wenjh <wenjh@sugon.com>
-
- 30 Apr, 2025 1 commit
-
-
wenjh authored
[RocblasGemm] Provide support of AB(bf16)D(fp32) Signed-off-by: wenjh <wenjh@sugon.com>
-
- 25 Apr, 2025 1 commit
-
-
yuguo authored
-
- 23 Apr, 2025 1 commit
-
-
yuguo authored
-
- 20 Mar, 2025 2 commits