- 11 Feb, 2026 16 commits
-
-
thatPepe authored
Issue/862 - Fix compilation errors (missing headers, cub namespace) t…
-
gongchensu authored
-
thatPepe authored
issue/523 - switched to cambricon mlu 1.22 interface
-
thatPepe authored
issue/837 - support int32 and int64 in cambricon add
-
thatPepe authored
issue/1001 - feat: add paged attention prefill and decode for moore gpu referencing nvidia
-
thatPepe authored
issue/1012 - feat: add paged caching for moore gpu referencing nvidia
-
thatPepe authored
issue/838 - Cambricon Batched RoPE
-
wooway777 authored
-
thatPepe authored
issue/899 - fix: fix causal_softmax and rearrange bug
-
zhushuang authored
-
zhushuang authored
-
zhushuang authored
-
thatPepe authored
issue/949 - feat: add silu_and_mul for moore gpu with test pass
-
zhushuang authored
-
zhushuang authored
-
qinyiqun authored
demo131 - multiple issues regarding quantization, qy, and so forth
* issue/843: success per_channel_quant_int8
* issue/843: success qy quant
* issue/843: modified quant
* Add w8a8int8 performance tests
* add infinicore op linear_w8a8i8
* w8a8 linear module functional nn
* issue/843: QY-GPU Support Int8 scale_mm (#68)
* issue/843: success qy scaled_mm
* issue/843: modified kernel.cuh as per_channel_dequant_int8.cuh
* fix parallel slice in w8
* w8: support multiple batch size
* temp: modify quantconfig handling
* fix format and delete redundant code
* fix format
* fix format
* fix format
* Refactor: add new API alongside legacy interfaces with deprecation warnings
* add w4 InfiniCore-related content and move Quantization config into InfiniCore
* quantization operators support graph
* solve cub version problem and fix code structure
* fix format
* demo131 - remove commented lines
---------
Co-authored-by: xgqdut2016 <kenan_gewei@163.com>
Co-authored-by: xgqdut2016 <140036308+xgqdut2016@users.noreply.github.com>
Co-authored-by: wooway777 <wooway777@gmail.com>
-
- 04 Feb, 2026 4 commits
- 29 Jan, 2026 1 commit
-
-
zhangyue authored
-
- 27 Jan, 2026 19 commits
-
-
PanZezhong1725 authored
issue/811 use relax graph capture mode
-
PanZezhong authored
-
wooway777 authored
-
wooway777 authored
-
wooway777 authored
-
PanZezhong authored
-
wooway777 authored
-
wooway777 authored
-
wooway777 authored
-
wooway777 authored
-
wooway777 authored
-
Jiacheng Huang authored
issue/925 - Speed up `scripts/build_ntops.py` and `src/infiniop/ninetoothed/build.py` with `concurrent.futures`
-
Jiacheng Huang authored
Add a C++-level wrapper for `NineToothedTensor`; add a way to create a `ninetoothed::Tensor` using arrays as `shape` and `strides`; integrate the NineToothed ReLU operator using `ninetoothed::Tensor`; add an include guard to `ninetoothed/utils.h`
-
wooway777 authored
-
PanZezhong authored
-
wooway777 authored
-
wooway777 authored
-
wooway777 authored
-
PanZezhong authored
-