- 29 Sep, 2025 5 commits
-
-
spike-zhu authored
-
PanZezhong1725 authored
-
zhushuang authored
-
spike-zhu authored
-
gongchensu authored
Co-authored-by: zhuyue <zhuyue@qiyuanlab.com>
-
- 25 Sep, 2025 3 commits
-
-
zhangyue authored
* issue/472: p800 ccl
* issue/472: remove unused operations
* issue/472: fix format
* issue/472: memcpy h2h case
-
PanZezhong1725 authored
issue/477 - Cambricon MLU NeoX
-
wooway777 authored
Added NeoX support to Cambricon RoPE; added a missing argument in the profiling script
-
- 24 Sep, 2025 1 commit
-
-
PanZezhong1725 authored
issue/474: rename Dequantize to DequantizeAWQ in nvidia gpu
-
- 23 Sep, 2025 2 commits
-
-
zhushuang authored
-
PanZezhong1725 authored
issue/469: disable NVIDIA-dequantize on Iluvatar GPU via ENABLE_NVIDIA_API macro
-
- 19 Sep, 2025 1 commit
-
-
zhushuang authored
-
- 18 Sep, 2025 7 commits
-
-
spike-zhu authored
-
thatPepe authored
* issue/459 - Support more data type combinations
* issue/459 - added test cases for 9G7B and 9G70B
* issue/459 - modified rms kernel to support larger tensors
-
zhangyue authored
issue/466: implement the NEOX algorithm for rope on the Kunlun platform
-
zhangyunze authored
-
PanZezhong1725 authored
issue/434 - added bf16 support for Cambricon MLU
-
xgqdut2016 authored
-
PanZezhong1725 authored
* issue/436: support kunlun rope U32
* issue/436: support the 9g7b 4b model
---------
Co-authored-by: zhangyue <zhangyue@qiyuanlab.com>
-
- 17 Sep, 2025 2 commits
-
-
zhangyue authored
-
xgqdut2016 authored
-
- 16 Sep, 2025 12 commits
-
-
wooway777 authored
-
Jiacheng Huang authored
-
PanZezhong1725 authored
-
zhushuang authored
-
PanZezhong1725 authored
issue/434 hccl support bf16
-
Ceng2333 authored
Signed-off-by: Ceng <441651826@qq.com>
-
Ceng authored
Signed-off-by: Ceng <441651826@qq.com>
-
PanZezhong1725 authored
Issue/428: Merge `rope_v2` into `rope`
-
Ziminli authored
issue/428: update the rope implementation on Ascend, Cambricon, and Kunlun to use the refactored interface and return unimplemented error for NEOX-style algorithm
-
Ziminli authored
-
Ziminli authored
-
PanZezhong1725 authored
* issue/450: change indexToReducedOffset() to indexToOffset in elementwise framework on CPU, NVIDIA, Cambricon, Metax, Moore, and Kunlun
* issue/450: remove indexToReducedOffset() in all platforms
* issue/450: add the testcases that pinpoint the issue in infiniop-test
-
- 15 Sep, 2025 3 commits
- 10 Sep, 2025 1 commit
-
-
PanZezhong1725 authored
-
- 09 Sep, 2025 2 commits
-
-
PanZezhong1725 authored
issue/434 nccl support bf16
-
PanZezhong1725 authored
-
- 04 Sep, 2025 1 commit
-
-
PanZezhong1725 authored
issue/425: implement GEMM with MUBLAS and MUDNN backends in moore gpu
-