- 04 Jul, 2023 19 commits
-
-
Frank Lee authored
-
FoolPlayer authored
* add linearconv1d test
-
Frank Lee authored
* [shardformer] support module saving and loading * polish code
-
FoolPlayer authored
* support kit use for bert test * support kit test for gpt2
-
Frank Lee authored
-
Frank Lee authored
* [shardformer] adapted T5 and LLaMa test to use kit * polish code
-
FoolPlayer authored
* add gpt2 test and layer class refactor * add dropout in gpt2 policy
-
Frank Lee authored
-
Frank Lee authored
-
FoolPlayer authored
* fix bert downstream with new api * remove comment line
-
FoolPlayer authored
-
Frank Lee authored
* [shardformer] refactored embedding and dropout to parallel module * polish code
-
FoolPlayer authored
-
Frank Lee authored
* [shardformer] integrated linear 1D with dtensor * polish code
-
FoolPlayer authored
* add dist dropout in model * update docstring and bert policy with dropout * refactor basepolicy and sharded, update bert * update format * update gpt2 policy * update bert policy * remove unused code * update readme for new policy usage * add downstream model of bert * remove unused code
-
wukong1992 authored
test t5
-
wukong1992 authored
adjust layer attr
-
FoolPlayer authored
* fix bug in slicer, add slicer unit test * add dropout test * use pid as dropout seed * update dropout test with local pattern * add todo
-
FoolPlayer authored
* add bert align test, fix dist loss bug * forward and backward align * add ignore index * add shardformer CI * add gather_output optional for user in shardconfig * update readme with optional gather_output * add dist crossentropy loss test, remove unused files * remove unused file * remove unused file * rename the file * polish code
-