- 25 Jun, 2023 3 commits
-
-
Baizhou Zhang authored
[gemini] Rename arguments in chunk configuration searching
-
Wenhao Chen authored
* refactor: adapt boost API in base and naive strategies
* fix: initialize plugin after setup_distributed
* fix: fix save_pretrained fn
* refactor: adapt boost API in DDPStrategy
* to: add _post_init check
* to: fix ddp backward, modify ddp dataloader and unwrap
* feat: adapt boost API in ColossalAIStrategy
* fix: call setup_distributed before use get_current_device
* fix: fix save_model and save_optimizer
* test: remove save_sharded_optimizer test
* style: apply formatter
* fix: fix stage check and add comments
* feat: allow dict type arg in strategy.prepare
* to: temporarily remove lr_scheduler for testing
* style: simplify init of ColossalAIStrategy
* fix: fix lr_scheduler in sft and rm
* style: modify comments
* test: add train_prompts tests
* fix: fix inference only case and use in train_prompts
* test: skip failed tests in ci
* style: fix CodeFactor check
* fix: do not use model.to('cpu') with GeminiPlugin
* test: enable colossalai_gemini tests
* test: set CUDA_VISIBLE_DEVICES in ci
* docs: add note
Baizhou Zhang authored
-
- 22 Jun, 2023 1 commit
-
-
Frank Lee authored
-
- 19 Jun, 2023 5 commits
-
-
Hongxin Liu authored
* [devops] fix build on pr ci
* [devops] fix build on pr ci
-
github-actions[bot] authored
Co-authored-by: github-actions <github-actions@github.com>
-
LuGY authored
-
digger yu authored
-
Frank Lee authored
[sync] sync develop to main
-
- 16 Jun, 2023 2 commits
-
-
Frank Lee authored
-
Baizhou Zhang authored
-
- 15 Jun, 2023 3 commits
-
-
Wenhao Chen authored
* feat: make optimizer optional in Booster.boost
* test: skip unet test if diffusers version > 0.10.2
-
Baizhou Zhang authored
-
digger yu authored
-
- 14 Jun, 2023 1 commit
-
-
Baizhou Zhang authored
-
- 13 Jun, 2023 3 commits
-
-
Yuanchen authored
Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
-
Frank Lee authored
-
Wenhao Chen authored
* refactor: separate log_probs fn from Actor forward fn
* refactor: separate generate fn from Actor class
* feat: update unwrap_model and get_base_model
  * unwrap_model returns model not wrapped by Strategy
  * get_base_model returns HF model for Actor, Critic and RewardModel
* feat: simplify Strategy.prepare
* style: remove get_base_model method of Actor
* perf: tokenize text in batches
* refactor: move calc_action_log_probs to utils of model
* test: update test with new forward fn
* style: rename forward fn args
* fix: do not unwrap model in save_model fn of naive strategy
* test: add gemini test for train_prompts
* fix: fix _set_default_generate_kwargs
-
- 12 Jun, 2023 5 commits
-
-
Frank Lee authored
[sync] update develop branch with main
-
Frank Lee authored
-
Frank Lee authored
-
Baizhou Zhang authored
-
Frank Lee authored
-
- 09 Jun, 2023 7 commits
-
-
digger yu authored
-
digger yu authored
-
Frank Lee authored
-
FoolPlayer authored
Revert "[sync] sync feature/shardformer with develop"
-
Frank Lee authored
-
FoolPlayer authored
[sync] sync feature/shardformer with develop
-
Liu Ziming authored
[example] Adding an example of training dreambooth with the new booster API
-
- 08 Jun, 2023 10 commits
-
-
Yuanchen authored
Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
-
digger yu authored
-
FoolPlayer authored
* add gpt2 policy and modify shard and slicer to support
* remove unused code
* polish code
-
FoolPlayer authored
-
FoolPlayer authored
* add dropout layer, add dropout test
* modify seed manager as context manager
* add a copy of col_nn.layer
* add dist_crossentropy loss; separate module test
* polish the code
* fix dist crossentropy loss
-
FoolPlayer authored
* update readme with modules content
* remove img
-
Frank Lee authored
* [shardformer] refactored the user api
* polish code
-
Frank Lee authored
-
FoolPlayer authored
* init shardformer code structure
* add implement of sharder (inject and replace)
* add implement of replace layer to colossal layer
* separate different layer policy, add some notion
* implement 1d and 2d slicer, can tell col or row
* fix bug when slicing and inject model
* fix some bug; add inference test example
* add share weight and train example
* add train
* add docstring and readme
* add docstring for other files
* pre-commit
-
FoolPlayer authored
* init shardformer code structure
* add implement of sharder (inject and replace)
* add implement of replace layer to colossal layer
* separate different layer policy, add some notion
* implement 1d and 2d slicer, can tell col or row
* fix bug when slicing and inject model
* fix some bug; add inference test example
-