- 04 Jul, 2025 1 commit
-
-
pengcheng888 authored
-
- 02 Jul, 2025 5 commits
-
-
PanZezhong1725 authored
issue/158/feat: 支持天数
-
YdrMaster authored
Signed-off-by:YdrMaster <ydrml@hotmail.com>
-
PanZezhong1725 authored
-
YdrMaster authored
Signed-off-by:YdrMaster <ydrml@hotmail.com>
-
YdrMaster authored
Signed-off-by:YdrMaster <ydrml@hotmail.com>
-
- 01 Jul, 2025 1 commit
-
-
蒋帅宏(Shuaihong_Jiang) authored
* issue/254: 添加算子在CPU和CUDA上对BF16的支持,并增加相应的测试代码 * issue/254: 将修改后的算子格式化后重新提交 * 修改与最新main的冲突 * 解决冲突后rms_norm原本的精度过不了了,现在由 {"atol": 5e-3, "rtol": 5e-3}更改为 {"atol": 8e-3, "rtol": 8e-3} * rms_norm在debug模式下FP16的测试用例失败了(本地测试能通过,github上过不了), 所以将容差增大了两倍进行测试 * 将rms_normd的测试输入缩放0.5,将容差改回原始值来进行ci测试 * issue/254: 1.使用CHECK_DTYPE宏来进行数据类型检验 2.在test的utils.py中添加了设备对BF16支持的检验 * issue/254: rms_norm测试fp16容差由 torch.float16: {"atol": 1e-3, "rtol": 1e-3}, 改为torch.float16: {"atol": 2e-3, "rtol": 2e-3}, 并删除对输入0.5的放缩 * issue/254: 在utils.py中debug方法和debug_all方法中 添加了对BF16的特判 * 修改支持BF16测试的设备类型检查方法 * 修改支持BF16测试的设备检查 * issue/254: reduce redundancy in rms_norm.py * issue/254: add back the missing comment in rms_norm.py * issue/254: add fp32 tolerance condition in causal_softmax.py --------- Co-authored-by:Zimin Li <coollizimin@gmail.com>
-
- 30 Jun, 2025 3 commits
-
-
PanZezhong1725 authored
issue/288: Improve the Compatibility of the Torch Implementations
-
Zimin Li authored
-
Zimin Li authored
-
- 27 Jun, 2025 7 commits
-
-
PanZezhong1725 authored
issue/137: 添加causal_softmax测例,更新readme(合并)
-
Catheriany authored
-
PanZezhong1725 authored
issue/205 - 添加Sub算子 resolves #205
-
Pepe authored
-
Pepe authored
issue/205 - 添加Sub算子的头文件、CPU实现、cuda实现、及Python测试
-
PanZezhong1725 authored
issue/11: add random sample ascend
-
PanZezhong1725 authored
issue/282: Maca CausalSoftamx精度bug
-
- 26 Jun, 2025 2 commits
-
-
Catheriany authored
-
Catheriany authored
-
- 25 Jun, 2025 2 commits
-
-
zhangyunze authored
-
PanZezhong1725 authored
issue/271: xmake modify in moore gpu
-
- 23 Jun, 2025 1 commit
-
-
PanZezhong1725 authored
issue/273: Fully Support `equal_nan` Option for `debug()` and `debug_all()`
-
- 20 Jun, 2025 1 commit
-
-
Zimin Li authored
-
- 19 Jun, 2025 1 commit
-
-
zhushuang authored
-
- 17 Jun, 2025 5 commits
-
-
PanZezhong1725 authored
issue/152/feat: 添加 rearrange 算子测例
-
pwhMass authored
-
PanZezhong1725 authored
issue/152/feat: 添加 rearrange 算子测例
-
YdrMaster authored
Signed-off-by:YdrMaster <ydrml@hotmail.com>
-
zhangyue authored
-
- 13 Jun, 2025 2 commits
-
-
PanZezhong1725 authored
Issue/261: Optimize Torch Implementation of Several Operators
-
Zimin Li authored
issue/261: optimize the torch implementation of add, causal softmax, gemm, random sample, rearrange, rms norm, rope
-
- 12 Jun, 2025 3 commits
-
-
PanZezhong1725 authored
issue/36 - Migrate cuda ramdom sample to metax
-
crapromer authored
-
PanZezhong1725 authored
issue/256 沐曦通信库
-
- 11 Jun, 2025 4 commits
-
-
PanZezhong authored
-
PanZezhong1725 authored
issue/238 - Migrate cuda rearrange to metax
-
PanZezhong1725 authored
issue/39 Migrate cuda causal softmax to metax
-
PanZezhong1725 authored
issue/37 - Migrate cuda rope to metax
-
- 10 Jun, 2025 2 commits
-
-
YdrMaster authored
Signed-off-by:YdrMaster <ydrml@hotmail.com>
-
PanZezhong1725 authored
issue/228: swiglu测例0步长添加
-