- 23 Apr, 2025 3 commits
-
-
Zimin Li authored
issue/194: fix meta mem size allocation and change cpu DeviceImpl::create to return INFINI_STATUS_NOT_IMPLEMENTED
-
PanZezhong1725 authored
issue/189/docs: update README.md
-
YdrMaster authored
Signed-off-by: YdrMaster <ydrml@hotmail.com>
-
- 22 Apr, 2025 3 commits
-
-
PanZezhong1725 authored
Revert "Add swiglu operator test cases"
-
PanZezhong1725 authored
-
PanZezhong1725 authored
Add swiglu operator test cases
-
- 21 Apr, 2025 2 commits
-
-
PanZezhong1725 authored
issue/8: causalsoftmax operator for Ascend
-
PanZezhong1725 authored
Issue/48 Rope CPU & CUDA
-
- 18 Apr, 2025 2 commits
-
-
PanZezhong authored
-
PanZezhong authored
-
- 17 Apr, 2025 2 commits
-
-
PanZezhong authored
-
PanZezhong authored
-
- 16 Apr, 2025 2 commits
-
-
PanZezhong authored
-
zhangyunze authored
Implement CausalSoftmaxOp using maskedInplaceOp and SoftmaxOp
-
- 15 Apr, 2025 4 commits
-
-
PanZezhong1725 authored
Issue/127/feat. General Elementwise Framework with Refactored SwiGLU (CPU & CUDA)
-
Zimin Li authored
issue/127: change meta within ElementwiseInfo to std::vector<size_t> for correct alignment and change the reference name of the Opaque struct to Opaque instead of struct Opaque
-
Zimin Li authored
-
Zimin Li authored
issue/127: Add arguments to CREATE_ELEMENTWISE_PLATFORM_DESCRIPTOR macros for indirecting variable names, change DeviceImpl to use Result for the return type of the create function, change CEIL_DIV
-
- 14 Apr, 2025 13 commits
-
-
Zimin Li authored
-
Zimin Li authored
-
Zimin Li authored
issue/127: Optimize elementwise CUDA code by removing redundancy, change/correct kernel logic when all inputs have the same dtype
-
Zimin Li authored
issue/127: Refactor ElementwiseInfo, refactor elementwise to use workspace for storing meta, fix misc. issues
-
Zimin Li authored
issue/127: fix CUDA mix-precision broadcasting input mismatch issue, adjust comment structure and template variable order
-
Zimin Li authored
enable_if, remove std::move() in elementwise_cpu.h, add <array> inclusion
-
Zimin Li authored
-
Zimin Li authored
issue/127: refactor ElementwiseInfo to use utils::Result, change elementwise calculate and calculateImpl to return infiniStatus_t, add CHECK_CUDA to cuda function calls
-
Zimin Li authored
issue/127: modify swiglu test to correctly handle broadcast scenarios, add two broadcast testcases, correct elementwise cpu mix-precision implementation
-
Zimin Li authored
issue/127: refactor elementwise framework, complete CUDA implementation, refactor swiglu using the generic elementwise framework
-
Zimin Li authored
-
PanZezhong1725 authored
issue/40: implement rms_norm operator for MetaX
-
qinyiqun authored
-
- 11 Apr, 2025 5 commits
-
-
PanZezhong1725 authored
issue/130: OpenMP (omp) refactor of the GEMM operator on the CPU platform
-
xgqdut2016 authored
-
PanZezhong1725 authored
issue/32: implement rms_norm operator for Moore Threads
-
qinyiqun authored
-
qinyiqun authored
-
- 10 Apr, 2025 4 commits
-
-
PanZezhong1725 authored
issue/111: add rmsnorm and the operator build process
-
zhangyue authored
-
zhangyue authored
-
zhangyue authored
-