- 21 Jul, 2023 1 commit
Li Zhang authored
* add GQA for llama2
* fix model conversion
* fix lint & remove dev log
* update news
* minor
* fix allocation size
* fix split_dim for w_qkv.bias
- 01 Jul, 2023 2 commits
* build turbomind
* change namespace fastertransformer to turbomind
* change logger name