Commits · 2c8ae37f61f123a305f7fe66af29140fe0f68a34 · OpenDAS / ColossalAI

25 Jun, 2023 3 commits

- Merge pull request #4056 from Fridge003/hotfix/fix_gemini_chunk_config_searching · 2c8ae37f
  Baizhou Zhang authored Jun 25, 2023
```
[gemini] Rename arguments in chunk configuration searching
```
  2c8ae37f

[chat] refactor strategy class with booster api (#3987) · 153b957a

Wenhao Chen authored Jun 25, 2023

* refactor: adapt boost API in base and naive strategies

* fix: initialize plugin after setup_distributed

* fix: fix save_pretrained fn

* refactor: adapt boost API in DDPStrategy

* to: add _post_init check

* to: fix ddp backward, modify ddp dataloader and unwrap

* feat: adapt boost API in ColossalAIStrategy

* fix: call setup_distributed before use get_current_device

* fix: fix save_model and save_optimizer

* test: remove save_sharded_optimizer test

* style: apply formatter

* fix: fix stage check and add comments

* feat: allow dict type arg in strategy.prepare

* to: temporarily remove lr_scheduler for testing

* style: simplify init of ColossalAIStrategy

* fix: fix lr_scheduler in sft and rm

* style: modify comments

* test: add train_prompts tests

* fix: fix inference only case and use in train_prompts

* test: skip failed tests in ci

* style: fix CodeFactor check

* fix: do not use model.to('cpu') with GeminiPlugin

* test: enable colossalai_gemini tests

* test: set CUDA_VISIBLE_DEVICES in ci

* docs: add note

153b957a

- [gemini] fix argument naming during chunk configuration searching · 0bb0b481
  Baizhou Zhang authored Jun 25, 2023
  
  0bb0b481

22 Jun, 2023 1 commit
- [workflow] cover all public repositories in weekly report (#4069) · b463651f
  Frank Lee authored Jun 22, 2023
  
  b463651f
19 Jun, 2023 5 commits
- [devops] fix build on pr ci (#4043) · 4a81faa5
  Hongxin Liu authored Jun 19, 2023
```
* [devops] fix build on pr ci

* [devops] fix build on pr ci
```
  4a81faa5
- [format] applied code formatting on changed files in pull request 4021 (#4022) · a52f6208
  github-actions[bot] authored Jun 19, 2023
```
Co-authored-by: github-actions <github-actions@github.com>
```
  a52f6208
- [example] fix bucket size in example of gpt gemini (#4028) · 160c64c6
  LuGY authored Jun 19, 2023
  
  160c64c6
- [nfc] fix dim not defined and fix typo (#3991) · 727c4598
  digger yu authored Jun 19, 2023
  
  727c4598
- Merge pull request #4025 from hpcaitech/develop · ca768eb6
  Frank Lee authored Jun 19, 2023
```
[sync] sync develop to main
```
  ca768eb6
16 Jun, 2023 2 commits
- [test] fixed codefactor format report (#4026) · a5883aa7
  Frank Lee authored Jun 16, 2023
  
  a5883aa7
- [checkpointio] sharded optimizer checkpoint for DDP plugin (#4002) · 822c3d4d
  Baizhou Zhang authored Jun 16, 2023
  
  822c3d4d
15 Jun, 2023 3 commits
- [booster] make optimizer argument optional for boost (#3993) · 725af3ee
  Wenhao Chen authored Jun 15, 2023
```
* feat: make optimizer optional in Booster.boost

* test: skip unet test if diffusers version > 0.10.2
```
  725af3ee
- [checkpointio] General Checkpointing of Sharded Optimizers (#3984) · c9cff7e7
  Baizhou Zhang authored Jun 15, 2023
  
  c9cff7e7
- fix typo applications/Chat/coati/ (#3947) · d4fb7bfd
  digger yu authored Jun 15, 2023
  
  d4fb7bfd
14 Jun, 2023 1 commit
- [doc] add a note about unit-testing to CONTRIBUTING.md (#3970) · e8ad3c88
  Baizhou Zhang authored Jun 14, 2023
  
  e8ad3c88
13 Jun, 2023 3 commits

- [evaluate] support gpt evaluation with reference (#3972) · 2925f473
  Yuanchen authored Jun 13, 2023
```
Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
```
  2925f473
- [workflow] fixed the directory check in build (#3980) · 8bcad736
  Frank Lee authored Jun 13, 2023
  
  8bcad736

[chat] refactor actor class (#3968) · 9d02590c

Wenhao Chen authored Jun 13, 2023

* refactor: separate log_probs fn from Actor forward fn

* refactor: separate generate fn from Actor class

* feat: update unwrap_model and get_base_model
* unwrap_model returns model not wrapped by Strategy
* get_base_model returns HF model for Actor, Critic and RewardModel

* feat: simplify Strategy.prepare

* style: remove get_base_model method of Actor

* perf: tokenize text in batches

* refactor: move calc_action_log_probs to utils of model

* test: update test with new forward fn

* style: rename forward fn args

* fix: do not unwrap model in save_model fn of naive strategy

* test: add gemini test for train_prompts

* fix: fix _set_default_generate_kwargs

9d02590c

12 Jun, 2023 5 commits
- Merge pull request #3967 from ver217/update-develop · 2bf6547a
  Frank Lee authored Jun 12, 2023
```
[sync] update develop branch with main
```
  2bf6547a
- [workflow] cancel duplicated workflow jobs (#3960) · 6718a2f2
  Frank Lee authored Jun 12, 2023
  
  6718a2f2
- [gemini] fixed the gemini checkpoint io (#3934) · 71fe5276
  Frank Lee authored Jun 09, 2023
  
  71fe5276
- [example] update ViT example using booster api (#3940) · b3ab7fba
  Baizhou Zhang authored Jun 12, 2023
  
  b3ab7fba
- [workflow] cancel duplicated workflow jobs (#3960) · 4110d1f0
  Frank Lee authored Jun 12, 2023
  
  4110d1f0
09 Jun, 2023 7 commits
- fix typo .github/workflows/scripts/ (#3946) · 1aadeede
  digger yu authored Jun 09, 2023
  
  1aadeede
- fix typo tests/ (#3936) · e61ffc77
  digger yu authored Jun 09, 2023
  
  e61ffc77
- [gemini] fixed the gemini checkpoint io (#3934) · bd1ab981
  Frank Lee authored Jun 09, 2023
  
  bd1ab981
- Merge pull request #3942 from hpcaitech/revert-3931-sync/develop-to-shardformer · bd2c7c32
  FoolPlayer authored Jun 09, 2023
```
Revert "[sync] sync feature/shardformer with develop"
```
  bd2c7c32
- Revert "[sync] sync feature/shardformer with develop" · ddcf58ca
  Frank Lee authored Jun 09, 2023
  
  ddcf58ca
- Merge pull request #3931 from FrankLeeeee/sync/develop-to-shardformer · 24651fdd
  FoolPlayer authored Jun 09, 2023
```
[sync] sync feature/shardformer with develop
```
  24651fdd
- Merge pull request #3905 from MaruyamaAya/dreambooth · e277534a
  Liu Ziming authored Jun 09, 2023
```
[example] Adding an example of training dreambooth with the new booster API
```
  e277534a
08 Jun, 2023 10 commits

- support UniEval and add CHRF metric (#3924) · 21c4c0b1
  Yuanchen authored Jun 08, 2023
```
Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
```
  21c4c0b1
- fix typo examples and docs (#3932) · 33eef714
  digger yu authored Jun 08, 2023
  
  33eef714
- [shardformer] add gpt2 policy and modify shard and slicer to support (#3883) · ef153775
  FoolPlayer authored Jun 07, 2023
```
* add gpt2 policy and modify shard and slicer to support

* remove unused code

* polish code
```
  ef153775
- update README (#3909) · 6370a935
  FoolPlayer authored Jun 06, 2023
  
  6370a935

[shardformer] add Dropout layer support different dropout pattern (#3856) · 21a3915c

FoolPlayer authored Jun 01, 2023

* add dropout layer, add dropout test

* modify seed manager as context manager

* add a copy of col_nn.layer

* add dist_crossentropy loss; separate module test

* polish the code

* fix dist crossentropy loss

21a3915c

- [shardformer] update readme with modules implement doc (#3834) · 997544c1
  FoolPlayer authored May 24, 2023
```
* update readme with modules content

* remove img
```
  997544c1
- [shardformer] refactored the user api (#3828) · 537a52b7
  Frank Lee authored May 24, 2023
```
* [shardformer] refactored the user api

* polish code
```
  537a52b7
- [shardformer] updated readme (#3827) · bc19024b
  Frank Lee authored May 24, 2023
  
  bc19024b

[shardformer]: Feature/shardformer, add some docstring and readme (#3816) · 58f64324

FoolPlayer authored May 24, 2023

* init shardformer code structure

* add implement of sharder (inject and replace)

* add implement of replace layer to colossal layer

* separate different layer policy, add some notion

* implement 1d and 2d slicer, can tell col or row

* fix bug when slicing and inject model

* fix some bug; add inference test example

* add share weight and train example

* add train

* add docstring and readme

* add docstring for other files

* pre-commit

58f64324

[shardformer] init shardformer code structure (#3731) · 6a69b44d

FoolPlayer authored May 22, 2023

* init shardformer code structure

* add implement of sharder (inject and replace)

* add implement of replace layer to colossal layer

* separate different layer policy, add some notion

* implement 1d and 2d slicer, can tell col or row

* fix bug when slicing and inject model

* fix some bug; add inference test example

6a69b44d