- 27 Mar, 2024 1 commit
Hongxin Liu authored
* [feature] refactor colo attention (#5462)
* [extension] update api
* [feature] add colo attention
* [feature] update sdpa
* [feature] update npu attention
* [feature] update flash-attn
* [test] add flash attn test
* [test] update flash attn test
* [shardformer] update modeling to fit colo attention (#5465)
* [misc] refactor folder structure
* [shardformer] update llama flash-attn
* [shardformer] fix llama policy
* [devops] update tensornvme install
* [test] update llama test
* [shardformer] update colo attn kernel dispatch
* [shardformer] update blip2
* [shardformer] update chatglm
* [shardformer] update gpt2
* [shardformer] update gptj
* [shardformer] update opt
* [shardformer] update vit
* [shardformer] update colo attention mask prep
* [shardformer] update whisper
* [test] fix shardformer tests (#5514)
* [test] fix shardformer tests
* [test] fix shardformer tests
-
- 25 Jan, 2024 1 commit
Frank Lee authored
* [feat] refactored extension module
* polish
* polish
* polish
* polish
* polish
* polish
* polish
* polish
* polish
* polish
-
- 27 Oct, 2023 1 commit
アマデウス authored
-
- 19 Sep, 2023 1 commit
Hongxin Liu authored
* [misc] update pre-commit
* [misc] run pre-commit
* [misc] remove useless configuration files
* [misc] ignore cuda for clang-format
-
- 06 Jan, 2023 1 commit
Frank Lee authored
* [setup] support pre-build and jit-build of cuda kernels
* polish code
* polish code
* polish code
* polish code
* polish code
* polish code
-
- 04 Jan, 2023 2 commits
Jiarui Fang authored
-
Frank Lee authored
-
- 28 Dec, 2022 2 commits
Jiarui Fang authored
* [builder] polish builder
* remove print
-
Jiarui Fang authored
-
- 26 Dec, 2022 1 commit
Jiarui Fang authored
-
- 23 Dec, 2022 2 commits
Jiarui Fang authored
-
Jiarui Fang authored
-