[Feature] Support Llama-2 with GQA (#147)
* add GQA for llama2 * fix model conversion * fix lint & remove dev log * update news * minor * fix allocation size * fix split_dim for w_qkv.bias
Showing
Please register or sign in to comment
* add GQA for llama2 * fix model conversion * fix lint & remove dev log * update news * minor * fix allocation size * fix split_dim for w_qkv.bias