- 04 Jul, 2023 40 commits
-
-
Frank Lee authored
-
Frank Lee authored
-
Frank Lee authored
-
Frank Lee authored
-
Frank Lee authored
-
Kun Lin authored
* first v of vit shardformer * keep vit * update * vit shard add vitattention vitlayer * update num head shard para * finish test for vit * add new_model_class & postprocess * add vit readme * delete old files & fix the conflict * fix sth
-
jiangmingyan authored
* [shardformer] shardformer support opt models * [shardformer] shardformer support opt models, fix * [shardformer] shardformer support opt models, fix * [shardformer] shardformer support opt models, fix
-
Frank Lee authored
-
Frank Lee authored
* [test] fixed tests failed due to dtensor change * polish code
-
FoolPlayer authored
* add layernorm to bert * add layernorm test * add layernorm test with load state dict * add use_mixedfusedLN in shard config * refactor policy to support fused_layernorm
-
Frank Lee authored
-
FoolPlayer authored
* add linearconv1d test * add linearconv1d test
-
Frank Lee authored
* [shardformer] support module saving and loading * polish code
-
FoolPlayer authored
* support kit use for bert test * support kit test for gpt2
-
Frank Lee authored
-
Frank Lee authored
* [shardformer] adapted T5 and LLaMa test to use kit * polish code
-
FoolPlayer authored
* add gpt2 test and layer class refactor * add dropout in gpt2 policy
-
Frank Lee authored
-
Frank Lee authored
-
FoolPlayer authored
* fix bert downstream with new api * remove comment line
-
Frank Lee authored
-
FoolPlayer authored
-
FoolPlayer authored
-
Frank Lee authored
-
Frank Lee authored
* [shardformer] refactored embedding and dropout to parallel module * polish code
-
FoolPlayer authored
-
Frank Lee authored
* [shardformer] integrated linear 1D with dtensor * polish code
-
FoolPlayer authored
* fix an error in readme * simplify code * refactor shardformer * add todo * remove slicer * resolve code review
-
Frank Lee authored
-
FoolPlayer authored
* fix an error in readme * simplify code
-
FoolPlayer authored
* add dist dropout in model * update docstring and bert policy with dropout * refactor basepolicy and sharded, update bert * update format * update gpt2 policy * update bert policy * remove unused code * update readme for new policy usage * add downstream model of bert * remove unused code
-
wukong1992 authored
test t5
-
wukong1992 authored
adjust layer attr
-
FoolPlayer authored
* add dist dropout in model * update docstring and bert policy with dropout * refactor basepolicy and sharded, update bert * update format * update gpt2 policy * update bert policy * remove unused code * update readme for new policy usage
-
FoolPlayer authored
* fix bug in slicer, add slicer unit test * add dropout test * use pid as dropout seed * updata dropout test with local pattern * ad todo
-
FoolPlayer authored
* add bert align test, fix dist loss bug * forward and backward align * add ignore index * add shardformer CI * add gather_output optional for user in shardconfig * update readme with optional gather_ouput * add dist crossentropy loss test, remove unused files * remove unused file * remove unused file * rename the file * polish code
-
FoolPlayer authored
* add gpt2 policy and modify shard and slicer to support * remove unused code * polish code
-
FoolPlayer authored
-
FoolPlayer authored
* add dropout layer, add dropout test * modify seed manager as context manager * add a copy of col_nn.layer * add dist_crossentropy loss; separate module test * polish the code * fix dist crossentropy loss
-
FoolPlayer authored
* update readme with modules content * remove img
-