- 27 Jan, 2026 7 commits
-
-
PanZezhong authored
-
wooway777 authored
-
wooway777 authored
-
wooway777 authored
-
gongchensu authored
- Ensure embedding tensors are on the same device. Change format.
- Optimize embedding kernel with vectorized memory access and __ldg
- Add vectorized memory access using float4/float2, half2, and bfloat162
- Use __ldg instruction for read-only weight and indices access
- Add memory alignment checks to enable vectorized paths
- Add __restrict__ keywords for better compiler optimization
- Implement dynamic block size selection based on embedding_dim
-
wooway777 authored
-
wooway777 authored
-
- 22 Jan, 2026 3 commits
-
-
PanZezhong1725 authored
issue/811 fix tensor to blob and resume
-
PanZezhong authored
-
PanZezhong1725 authored
issue/811 support cuda graph capture
-
- 21 Jan, 2026 3 commits
-
-
PanZezhong authored
-
PanZezhong authored
-
PanZezhong1725 authored
issue/810 feat: allow graph tensor to resume to allocator's tracking
-
- 19 Jan, 2026 1 commit
-
-
PanZezhong authored
-
- 16 Jan, 2026 1 commit
-
-
Haojie Wang authored
issue/920 RoPE supports longrope
-
- 15 Jan, 2026 2 commits
-
-
PanZezhong1725 authored
issue/811 remove shortcut for cpu runtime
-
PanZezhong authored
-
- 14 Jan, 2026 1 commit
-
-
PanZezhong authored
-
- 12 Jan, 2026 4 commits
-
-
PanZezhong1725 authored
issue/867 fix cpu malloc
-
PanZezhong authored
-
PanZezhong1725 authored
issue/867 fix page caching api, paged attn support more head dims
-
PanZezhong authored
-
- 10 Jan, 2026 1 commit
-
-
Haojie Wang authored
issue/810 static compute graph infra
-
- 09 Jan, 2026 4 commits
-
-
Haojie Wang authored
Issue/867: adjust paged_attention_prefill interface naming
-
PanZezhong authored
-
PanZezhong authored
-
PanZezhong authored
-
- 08 Jan, 2026 1 commit
-
-
zhushuang authored
-
- 07 Jan, 2026 1 commit
-
-
Haojie Wang authored
Issue/791 add add_rms_norm fused operator
-
- 06 Jan, 2026 1 commit
-
-
PanZezhong authored
-
- 05 Jan, 2026 1 commit
-
-
thatPepe authored
issue/877 - return saved_file in TestManager.test method
-
- 04 Jan, 2026 1 commit
-
-
baominghelly authored
-
- 30 Dec, 2025 5 commits
-
-
PanZezhong1725 authored
Issue/847 paged attention prefill single-stage interface
-
PanZezhong authored
-
PanZezhong authored
-
PanZezhong1725 authored
issue/848: add paged attention prefill for nvidia gpu with test pass
-
zhushuang authored
-
- 29 Dec, 2025 3 commits
-
-
PanZezhong1725 authored
issue/847-add infinicore interfaces and tests for paged caching and attention
-
PanZezhong authored
-
pengcheng888 authored
-