Commits · c3946bf0efe017cfe1b32ff816bd63dfd96d9131 · jerrrrry / infinicore

03 Mar, 2026 2 commits
- issue/1032 - fix block size on iluvatar · c3946bf0
  wooway777 authored Mar 03, 2026
  
- Merge pull request #1038 from InfiniTensor/issue/1032n · abd45713
  thatPepe authored Mar 03, 2026
```
Issue/1032n - cuda swiglu
```
02 Mar, 2026 2 commits
- issue/1032 - adjust swiglu operator.cc · 1443aa67
  wooway777 authored Mar 02, 2026
  
- Merge pull request #1037 from InfiniTensor/issue/1036 · 90cb1b54
  thatPepe authored Mar 02, 2026
```
Issue/1036 paged caching support strides
```
28 Feb, 2026 7 commits
- issue/1036 fix some warnings · 3e1ef507
  PanZezhong authored Feb 28, 2026
  
- issue/1036 paged caching support strides · 67425576
  PanZezhong authored Feb 28, 2026
  
- Merge pull request #1024 from InfiniTensor/issue/1021 · d8176086
  spike-zhu authored Feb 28, 2026
```
Issue/1021: issue/1021 - feat: support bf16 in infiniccl with mccl
```
- issue/1021 - feat: support bf16 in infiniccl on moore_gpu_arch mp_31 with mccl · 93d261ed
  zhushuang authored Feb 12, 2026
  
- issue/1032 - follow nv changes on metax (swiglu) · ae49716d
  wooway777 authored Feb 28, 2026
  
- issue/1032n: support strided last dim in cuda swiglu · db7e4076
  xgqdut2016 authored Feb 28, 2026
  
- issue/1032 - provide an alternate cuda swiglu · 362f0187
  wooway777 authored Feb 28, 2026
  
24 Feb, 2026 2 commits
- Merge pull request #1026 from InfiniTensor/issue/1025_t · 718b18cf
  thatPepe authored Feb 24, 2026
```
issue/1025 - temporarily disable paged op compilation on hygon
```
- issue/1025 - temporarily disable paged op compilation on hygon · 930af1b9
  wooway777 authored Feb 24, 2026
  
13 Feb, 2026 1 commit
- Merge pull request #990 from InfiniTensor/demo131 · 784139b9
  thatPepe authored Feb 13, 2026
```
Demo-131 Cuda graph with optimized paged attention
```
12 Feb, 2026 12 commits
- Merge pull request #962 from InfiniTensor/issue/961 · 1d6527cb
  thatPepe authored Feb 12, 2026
```
issue/961: fix metax init with preload
```
- Merge pull request #1019 from InfiniTensor/issue/1008 · 52f0dcf0
  thatPepe authored Feb 12, 2026
```
Issue/1008
```
- issue/1008: use warpBroadcast api · 68026bd1
  zhangyue authored Feb 12, 2026
  
- issue/1022 - patch metax hpcc hrc include · d0f405ce
  wooway777 authored Feb 12, 2026
  
- issue/1008: revert python_test.py · 3d54ce8c
  zhangyue authored Feb 12, 2026
  
- issue/1008: wrap iluvatar change in #ifdef ENABLE_ILUVATAR_API · 1c32d14d
  zhangyue authored Feb 12, 2026
  
- issue/1008 skip scale_mm compile in iluvatar · 034b1895
  zhangyue authored Feb 12, 2026
  
- issue/1008: adapt paged_attention_prefill · 7377e711
  zhangyue authored Feb 12, 2026
  
- issue/1008: adapt lpnorm layernorm softmax rearrange paged_attention for iluvatar · f46e9f65
  zhangyue authored Feb 12, 2026
  
- issue/1008: mv "import infinicore" ahead of "import" torch · bd0c922a
  zhangyue authored Feb 10, 2026
  
- Merge pull request #1018 from InfiniTensor/issue/972 · 5675a4af
  thatPepe authored Feb 12, 2026
```
Issue/972: w8a8 quantization for the Moore platform based on muDNN, plus improvements to the scaled_mm_int8 python test script
```
- issue/972 - feat: adjust scaled_mm_int8 python test · 6841663b
  zhushuang authored Feb 11, 2026
  
11 Feb, 2026 14 commits
- issue/972 - feat: add per_channel_quant_int8 for moore gpu referencing nvidia · e1974c6b
  zhushuang authored Feb 11, 2026
  
- issue/972 - feat: add scaled_mm with muDNN BatchMatMul for moore gpu · d4f726de
  zhushuang authored Feb 11, 2026
  
- Merge pull request #865 from gongchensu/Issue/862 · 6ec2ea40
  thatPepe authored Feb 11, 2026
```
Issue/862 - Fix compilation errors (missing headers, cub namespace) t…
```
- Merge branch 'demo131' into Issue/862 · 8d09630a
  gongchensu authored Feb 11, 2026
  
- Merge pull request #963 from InfiniTensor/issue/523-020 · 012df56c
  thatPepe authored Feb 11, 2026
```
issue/523 - switched to cambricon mlu 1.22 interface
```
- Merge pull request #879 from InfiniTensor/issue/837 · f1b8ab64
  thatPepe authored Feb 11, 2026
```
issue/837 - support int32 and int64 in cambricon add
```
- Merge pull request #1011 from InfiniTensor/issue/1001 · 84201ad0
  thatPepe authored Feb 11, 2026
```
issue/1001 - feat: add paged attention prefill and decode for moore gpu referencing nvidia
```
- Merge pull request #1013 from InfiniTensor/issue/1012 · 718eaf42
  thatPepe authored Feb 11, 2026
```
issue/1012 - feat: add paged caching for moore gpu referencing nvidia
```
- Merge pull request #839 from InfiniTensor/issue/838 · c112132e
  thatPepe authored Feb 11, 2026
```
issue/838 - Cambricon Batched RoPE
```
- demo131 - remove fp32 from paged tests · d3e27d8c
  wooway777 authored Feb 10, 2026
  
- Merge pull request #1010 from InfiniTensor/issue/899 · 513a8502
  thatPepe authored Feb 11, 2026
```
issue/899 - fix: fix causal_softmax and rearrange bug 
```
- issue/1012 - feat: add paged caching for moore gpu referencing nvidia · 8f710be1
  zhushuang authored Feb 10, 2026
  
- issue/1001 - feat: add paged attention prefill for moore gpu referencing nvidia · 6074f7b8
  zhushuang authored Feb 10, 2026
  
- issue/1001 - feat: add paged attention decode for moore gpu referencing nvidia · 3d3a277f
  zhushuang authored Feb 04, 2026
  