Commits · 18773b69ae7bd79b4e9cf9ac0a4e4c6ed1bf9bf8 · jerrrrry / infinicore

13 Mar, 2026 2 commits
- Revert "Merge pull request #1069 from InfiniTensor/issue/1031_T1_1_15" · 18773b69
  wooway777 authored Mar 13, 2026
```
This reverts commit 21c6af2d, reversing
changes made to 99a802dd.
```
  18773b69
- Revert "[Operator Competition, Autumn 2025] T1-1-9" · 908c3cc5
  thatPepe authored Mar 13, 2026
  
  908c3cc5
12 Mar, 2026 1 commit
- issue/1031 T1-1-9 · 85f8987c
  PanZezhong authored Mar 12, 2026
  
  85f8987c
11 Mar, 2026 3 commits
- issue/1031 T1-1-15 · 5f329d7a
  PanZezhong authored Mar 11, 2026
  
  5f329d7a
- issue/1031 T1-1-17 · 45a3794b
  wooway777 authored Mar 11, 2026
  
  45a3794b
- Revert "Merge pull request #1056 from InfiniTensor/issue/1031" · cb7f0b7d
  wooway777 authored Mar 11, 2026
```
This reverts commit 7f295448, reversing
changes made to e60985dc.
```
  cb7f0b7d
09 Mar, 2026 2 commits
- issue/1031 T1-1-4 · 210e31d3
  PanZezhong authored Mar 09, 2026
  
  210e31d3
- issue/1031 T1-1-17 · d6af9c90
  PanZezhong1725 authored Mar 09, 2026
  
  d6af9c90
06 Mar, 2026 1 commit
- issue/1031 T1-1-9 · f7c59399
  PanZezhong authored Mar 06, 2026
  
  f7c59399
11 Feb, 2026 2 commits

- issue/949 - feat: add silu_and_mul for moore gpu with test pass · 54635d9f
  zhushuang authored Jan 22, 2026
  
  54635d9f

- demo131 - multiple issues regarding quantization, qy, and so forth · eb89439d
  qinyiqun authored Feb 11, 2026

* issue/843: success per_channel_quant_int8

* issue/843: success qy quant

* issue/843: modified quant

* Add w8a8int8 performance tests

* add infinicore op linear_w8a8i8

* w8a8 linear module functional nn

* issue/843: QY-GPU Support Int8 scale_mm (#68)

* issue/843: success qy scaled_mm

* issue/843: modified kernel.cuh as per_channel_dequant_int8.cuh

* fix parallel slice in w8

* w8: support multiple batch size

* temp: modify quantconfig handling

* fix format and delete redundancy code

* fix format

* fix format

* fix format

* Refactor: add new API alongside legacy interfaces with deprecation warnings

* add w4 InfiniCore-related content and move Quantization config into InfiniCore

* add graph support for quantization operators

* solve cub version problem and fix code structure

* fix format

* demo131 - remove commented lines

---------
Co-authored-by: xgqdut2016 <kenan_gewei@163.com>
Co-authored-by: xgqdut2016 <140036308+xgqdut2016@users.noreply.github.com>
Co-authored-by: wooway777 <wooway777@gmail.com>

eb89439d

27 Jan, 2026 3 commits

- issue/923 - ninetoothed kv caching for nv, il, mtx · 97eced0e
  wooway777 authored Jan 26, 2026
  
  97eced0e
- issue/919 - ninetoothed flash attention · 6ac8f906
  wooway777 authored Jan 26, 2026
  
  6ac8f906

- issue/846 - Refactor embedding to support device-side input and CUDA graph recording · cc2cc3a1
  gongchensu authored Dec 26, 2025

- Ensure embedding tensors are on the same device. Change format.
- Optimize embedding kernel with vectorized memory access and __ldg
- Add vectorized memory access using float4/float2, half2, and bfloat162
- Use __ldg instruction for read-only weight and indices access
- Add memory alignment checks to enable vectorized paths
- Add __restrict__ keywords for better compiler optimization
- Implement dynamic block size selection based on embedding_dim

cc2cc3a1
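
The alignment checks and block-size selection listed in the issue/846 message can be illustrated with a small host-side sketch. This is not the code from that commit: the helper names (`is_aligned`, `pick_vector_width`, `pick_block_size`), the candidate widths, and the 32 to 1024 thread range are all assumptions chosen for illustration. It only shows the kind of decision a launcher might make before dispatching to a vectorized kernel path.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the InfiniCore source): true when `ptr`
// is aligned to a multiple of `bytes`.
inline bool is_aligned(const void *ptr, std::size_t bytes) {
    return reinterpret_cast<std::uintptr_t>(ptr) % bytes == 0;
}

// Hypothetical helper: number of elements that can be moved per memory access.
// Assumed candidates: 4 or 2 elements for 4-byte types (float4 / float2),
// 2 elements for 2-byte types (half2 / bfloat162), otherwise scalar access.
inline int pick_vector_width(const void *weight, const void *output,
                             std::size_t embedding_dim, std::size_t elem_size) {
    const int max_width = (elem_size == 4) ? 4 : (elem_size == 2) ? 2 : 1;
    for (int width = max_width; width > 1; width /= 2) {
        const std::size_t bytes = static_cast<std::size_t>(width) * elem_size;
        // Every row start stays aligned only if the base pointers are aligned
        // and the row length is a whole number of vectors.
        if (embedding_dim % static_cast<std::size_t>(width) == 0 &&
            is_aligned(weight, bytes) && is_aligned(output, bytes)) {
            return width;
        }
    }
    return 1; // scalar fallback
}

// Hypothetical helper: smallest power-of-two block size (assumed range 32..1024)
// that covers the number of vector accesses needed for one embedding row.
inline int pick_block_size(std::size_t embedding_dim, int vector_width) {
    const std::size_t work = embedding_dim / static_cast<std::size_t>(vector_width);
    int block = 32;
    while (block < 1024 && static_cast<std::size_t>(block) < work) {
        block *= 2;
    }
    return block;
}
```

Under these assumed thresholds, a 4096-element float row with 16-byte-aligned pointers selects a width of 4 and a block size of 1024, while a misaligned weight pointer falls back to scalar access.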

30 Dec, 2025 1 commit
- issue/848 - feat: add paged attention prefill for nvidia gpu with test pass · 1ba0bcfa
  zhushuang authored Dec 30, 2025
  
  1ba0bcfa
29 Dec, 2025 1 commit
- issue/834 - feat: add paged attention for nvidia gpu with test pass · 17299923
  zhushuang authored Dec 29, 2025
  
  17299923
24 Dec, 2025 1 commit
- Add the CPU add rms_norm operator with C++ and Python interfaces · 7d60e5b8
  zhuyue authored Dec 23, 2025
  
  7d60e5b8
21 Nov, 2025 1 commit

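- ISSUE/628 Adapt to the QY C610 GPU, add build options, and adapt existing operators. Add the operators required by bge-type models, (#629) · 85bc98ac
  qinyiqun authored Nov 21, 2025

* ISSUE/628 Adapt to the QY C610 GPU, add build options, and adapt existing operators. Add the operators required by bge-type models, including gelu, layer_norm, lp_norm (supporting L1 and L2 norms), relu, softmax, and tanh.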

---------
Co-authored-by: xgqdut2016 <kenan_gewei@163.com>
Co-authored-by: xgqdut2016 <140036308+xgqdut2016@users.noreply.github.com>

85bc98ac

28 Oct, 2025 1 commit
- issue/456/feat: add silu operator · e184c7e4
  tianyuxbear authored Jul 25, 2025
  
  e184c7e4
23 Oct, 2025 1 commit
- issue/473 - the ones and zeros operators · 9b8de584
  pengcheng888 authored Oct 23, 2025
```
Co-authored-by: pengcheng888 <pengcheng@example.com>
```
  9b8de584
16 Oct, 2025 1 commit

- issue/383: Add logsoftmax ops · 05a2e149
  gongchensu authored Oct 16, 2025


Co-authored-by: wawahejun <hejunlbbc@gmail.com>
Co-authored-by: zhuyue <zhuyue@qiyuanlab.com>

05a2e149

29 Sep, 2025 1 commit
- issue/427 - the sigmoid, topksoftmax, and topkrouter ops · ed530e11
  pengcheng888 authored Sep 29, 2025
  
  ed530e11
23 Sep, 2025 1 commit
- feat: rename Dequantize to DequantizeAWQ in nvidia gpu · 4217976d
  zhushuang authored Sep 23, 2025
  
  4217976d
16 Sep, 2025 1 commit
- issue/428: merge rope_v2 into rope with algorithm selection · 86515765
  Ziminli authored Sep 07, 2025
  
  86515765
10 Sep, 2025 1 commit
- issue/440 feat: add softplus operator · 1635fd92
  PanZezhong1725 authored Sep 10, 2025
  
  1635fd92
02 Sep, 2025 1 commit
- [T2-2-3] blkmjsian · 9ad23fad
  blkmjsian authored Sep 02, 2025
```
- dequantize awq
- rope v2
```
  9ad23fad
07 Jul, 2025 1 commit
- issue/307 unify test tensor creation in pytorch tests · f62e952e
  PanZezhong authored Jul 07, 2025
  
  f62e952e
27 Jun, 2025 1 commit

- issue/205 - Add the Sub operator · 2ccf1d9d
  Pepe authored Apr 27, 2025

issue/205 - Add the Sub operator's header file, CPU implementation, CUDA implementation, and Python tests

2ccf1d9d

06 May, 2025 1 commit
- issue/204: test cases for the add operator · 16506fc0
  Catheriany authored May 06, 2025
  
  16506fc0
28 Apr, 2025 1 commit
- issue/180: add the clip operator · 8a49900f
  goldenfox2025 authored Apr 28, 2025
  
  8a49900f
25 Apr, 2025 2 commits
- issue/183 Mul operator, CPU & CUDA · 03edef48
  Graylatzhou authored Apr 24, 2025
  
  03edef48
- Issue/183 Mul operator, CPU & CUDA · 975559ee
  Graylatzhou authored Apr 22, 2025
  
  975559ee
08 Apr, 2025 1 commit
- issue/161 rope and causal softmax support out-of-place operation · 0450fb1e
  PanZezhong authored Apr 08, 2025
  
  0450fb1e
21 Mar, 2025 1 commit
- issue/115 rename matmul to gemm · 054763bc
  PanZezhong authored Mar 21, 2025
  
  054763bc
05 Mar, 2025 2 commits
- issue/87/refactor: modify infiniop/handle.h; infiniop now depends on infinirt to create the handle for the specified hardware · 601defcb
  YdrMaster authored Mar 05, 2025
```
Signed-off-by: YdrMaster <ydrml@hotmail.com>
```
  601defcb
- issue/85 rename header files · 9d611676
  PanZezhong authored Mar 05, 2025
  
  9d611676
21 Feb, 2025 1 commit
- issue/69 fix: add the missing tensor and handle interfaces to the operator library headers · 484554c7
  PanZezhong authored Feb 21, 2025
  
  484554c7
11 Feb, 2025 1 commit
- feat: cpu and cuda matmul · 46da1a27
  PanZezhongQY authored Feb 11, 2025
  
  46da1a27