OpenDAS
FlashMLA
Commits
60dfab334dab244c5c703140a913c7aafe50dd79
flashmla
21 Feb, 2026
1 commit
Use round_half_ulp_truncate for float-to-bf16 conversion
· 60dfab33
zhanghj2
authored
Feb 21, 2026
11 Feb, 2026
4 commits
Add architecture checks to the interfaces
· 68971b5c
zhanghj2
authored
Feb 11, 2026
Remove MHA test cases
· 68055db7
zhanghj2
authored
Feb 11, 2026
Add test cases
· 611e6922
zhanghj2
authored
Feb 11, 2026
Support software fp8 e5m2 for KV
· 892f7274
zhanghj2
authored
Feb 11, 2026
06 Feb, 2026
5 commits
Add version information
· 11e445c3
zhanghj2
authored
Feb 06, 2026
Optimize combine
· b1ba831f
zhanghj2
authored
Feb 06, 2026
Optimize combine
· 91691124
zhanghj2
authored
Feb 06, 2026
Support nmz fp8
· c4412432
zhanghj2
authored
Feb 06, 2026
Support nmz qkvfp8
· 26d2ab19
zhanghj2
authored
Feb 06, 2026
04 Feb, 2026
2 commits
Set up the framework for software fp8 e5m2
· 3eb7071c
zhanghj2
authored
Feb 04, 2026
Set up the framework for qkvfp8 support
· 4976cbaa
zhanghj2
authored
Feb 04, 2026
03 Feb, 2026
2 commits
Make testing and packaging easier
· 06612d65
zhanghj2
authored
Feb 03, 2026
Support pure bf16
· 2033d805
zhanghj2
authored
Feb 03, 2026
30 Jan, 2026
4 commits
Modify the output write-out
· 58b43d4a
zhanghj2
authored
Jan 30, 2026
Read scale using buffer load
· d6379e50
zhanghj2
authored
Jan 30, 2026
Read q using buffer load lds; reduce VGPR spilling
· bdf0140b
zhanghj2
authored
Jan 30, 2026
Save the assembly generated during compilation
· 515dbd44
zhanghj2
authored
Jan 30, 2026
29 Jan, 2026
7 commits
Support head16 in sparse decode
· 0651671f
zhanghj2
authored
Jan 29, 2026
Support head 16 in prefill
· b94fdd0f
zhanghj2
authored
Jan 29, 2026
Reduce LDS usage to increase parallelism
· 38421051
zhanghj2
authored
Jan 29, 2026
Reduce LDS usage
· 6d68e3d1
zhanghj2
authored
Jan 29, 2026
938 architecture
· 5d62c0d7
zhanghj2
authored
Jan 29, 2026
Compute addresses in 64-bit to avoid overflow with large sizes
· 7a8722d7
zhanghj2
authored
Jan 29, 2026
Set gMax_logits=-inf when topk_length=0
· d1c9d3fa
zhanghj2
authored
Jan 29, 2026
28 Jan, 2026
5 commits
Distinguish between dim576 and 512
· dd5d4bb3
zhanghj2
authored
Jan 28, 2026
Implement sparse prefill; bugs remain
· c3cf875a
zhanghj2
authored
Jan 28, 2026
Handle the case where lse is +inf
· 50e2de8d
zhanghj2
authored
Jan 28, 2026
Check whether the value is +inf
· 5d8e93f6
zhanghj2
authored
Jan 28, 2026
Handle inf values in attn_sink
· 50f07abd
zhanghj2
authored
Jan 28, 2026
27 Jan, 2026
4 commits
Support model1 decode
· 75890221
zhanghj2
authored
Jan 27, 2026
Changes to support modelv1; v32 partially passes, model1 changes not yet finished
· 620f8769
zhanghj2
authored
Jan 27, 2026
Refactor code structure using lambda functions
· 6fb681fc
zhanghj2
authored
Jan 27, 2026
Fix total_num_blocks calculation
· 75f8262c
zhanghj2
authored
Jan 27, 2026
26 Jan, 2026
5 commits
Fix error when attn sink is disabled
· 0ce8ee82
zhanghj2
authored
Jan 26, 2026
Support attn_sink
· 200f01d5
zhanghj2
authored
Jan 26, 2026
Support attn_sink
· 9b54b03c
zhanghj2
authored
Jan 26, 2026
Add softmax
· 5813dcc1
zhanghj2
authored
Jan 26, 2026
Adapt the decode kernel for v32
· 0e1300f7
zhanghj2
authored
Jan 26, 2026
25 Jan, 2026
1 commit
open check_if_all_features_are_supported_and_abort
· 7abe5160
zhanghj2
authored
Jan 25, 2026