- 21 Jul, 2023 2 commits
-
-
Hongxin Liu authored
-
Baizhou Zhang authored
* sharded optimizer checkpoint for gemini plugin
* modify test to reduce testing time
* update doc
* fix bug when keep_gathered is true under GeminiPlugin
-
- 19 Jul, 2023 1 commit
-
-
Hongxin Liu authored
* [lazy] support init on cuda
* [test] update lazy init test
* [test] fix transformer version
-
- 18 Jul, 2023 1 commit
-
-
Cuiqing Li authored
* added softmax kernel
* added qkv_kernel
* added ops
* adding tests
* upload tests
* fix tests
* debugging
* debugging tests
* debugging
* added
* fixed errors
* added softmax kernel
* clean codes
* added tests
* update tests
* update tests
* added attention
* add
* fixed pytest checking
* add cuda check
* fix cuda version
* fix typo
-
- 17 Jul, 2023 2 commits
-
-
binmakeswell authored
-
Jianghai authored
-
- 12 Jul, 2023 1 commit
-
-
github-actions[bot] authored
Co-authored-by: github-actions <github-actions@github.com>
-
- 10 Jul, 2023 1 commit
-
-
Frank Lee authored
* [docker] fixed ninja build command
* polish code
-
- 07 Jul, 2023 2 commits
-
-
Baizhou Zhang authored
* [checkpointio] unsharded optimizer checkpoint for Gemini plugin
* [checkpointio] unsharded optimizer checkpoint for Gemini using all_gather
-
Frank Lee authored
-
- 04 Jul, 2023 30 commits
-
-
Frank Lee authored
-
Frank Lee authored
-
Hongxin Liu authored
-
digger yu authored
-
github-actions[bot] authored
Co-authored-by: github-actions <github-actions@github.com>
-
Frank Lee authored
-
Frank Lee authored
-
Frank Lee authored
* [shardformer] made tensor parallelism configurable
* polish code
-
Frank Lee authored
* [shardformer] refactored some doc and api
* polish code
-
jiangmingyan authored
* [shardformer] add benchmark of shardformer
* [shardformer] add benchmark of shardformer
-
Frank Lee authored
-
Frank Lee authored
-
Frank Lee authored
-
Frank Lee authored
-
Frank Lee authored
-
Kun Lin authored
* first v of vit shardformer
* keep vit
* update
* vit shard add vitattention vitlayer
* update num head shard para
* finish test for vit
* add new_model_class & postprocess
* add vit readme
* delete old files & fix the conflict
* fix sth
-
jiangmingyan authored
* [shardformer] shardformer support opt models
* [shardformer] shardformer support opt models, fix
* [shardformer] shardformer support opt models, fix
* [shardformer] shardformer support opt models, fix
-
Frank Lee authored
-
Frank Lee authored
* [test] fixed tests failed due to dtensor change
* polish code
-
FoolPlayer authored
* add layernorm to bert
* add layernorm test
* add layernorm test with load state dict
* add use_mixedfusedLN in shard config
* refactor policy to support fused_layernorm
-
Frank Lee authored
-
FoolPlayer authored
* add linearconv1d test
* add linearconv1d test
-
Frank Lee authored
* [shardformer] support module saving and loading
* polish code
-
FoolPlayer authored
* support kit use for bert test
* support kit test for gpt2
-
Frank Lee authored
-
Frank Lee authored
* [shardformer] adapted T5 and LLaMa test to use kit
* polish code
-
FoolPlayer authored
* add gpt2 test and layer class refactor
* add dropout in gpt2 policy
-
Frank Lee authored
-
Frank Lee authored
-
FoolPlayer authored
* fix bert downstream with new api
* remove comment line
-