Replace head_mapping param with num_kv_heads in attention kernel (#1997)
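
For context, a minimal sketch of why a num_kv_heads argument can stand in for a head_mapping table. It assumes query heads are grouped contiguously per KV head (as in grouped-query attention) and that num_heads is divisible by num_kv_heads. The function names build_head_mapping and kv_head_index are illustrative only and are not taken from the kernel source.

```python
def build_head_mapping(num_heads: int, num_kv_heads: int) -> list:
    """Explicit lookup table: entry i is the KV head used by query head i.

    Assumption: query heads sharing a KV head are contiguous.
    """
    assert num_heads % num_kv_heads == 0, "num_heads must be a multiple of num_kv_heads"
    num_queries_per_kv = num_heads // num_kv_heads
    return [kv for kv in range(num_kv_heads) for _ in range(num_queries_per_kv)]


def kv_head_index(head_idx: int, num_heads: int, num_kv_heads: int) -> int:
    """Same mapping computed on the fly from num_kv_heads, with no table."""
    num_queries_per_kv = num_heads // num_kv_heads
    return head_idx // num_queries_per_kv


if __name__ == "__main__":
    num_heads, num_kv_heads = 8, 2
    table = build_head_mapping(num_heads, num_kv_heads)
    computed = [kv_head_index(h, num_heads, num_kv_heads) for h in range(num_heads)]
    # Both approaches should agree for every query head.
    print(table == computed)
```

Under that assumption the table carries no information beyond the two head counts, so passing a single integer avoids allocating and reading an extra tensor.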

Co-authored-by: wangguoya <wangguoya@baidu.com>
Co-authored-by: Yang Zhao <zhaoyangstar@foxmail.com>