Commits · e3551443751e5ff1ced4215afe76fb8e22ded06b · OpenDAS / ColossalAI

17 Apr, 2023 1 commit

[chatgpt] Detached PPO Training (#3195) · e3551443

csric authored Apr 17, 2023



* run the base

* working on dist ppo

* sync

* detached trainer

* update detached trainer. no maker update function

* facing init problem

* 1 maker 1 trainer detached run. but no model update

* facing cuda problem

* fix save functions

* verified maker update

* nothing

* add ignore

* analyze loss issue

* remove some debug codes

* facing 2m1t stuck issue

* 2m1t verified

* do not use torchrun

* working on 2m2t

* working on 2m2t

* initialize strategy in ray actor env

* facing actor's init order issue

* facing ddp model update issue (need to unwrap ddp)

* unwrap ddp actor

* checking 1m2t stuck problem

* nothing

* set timeout for trainer choosing. It solves the stuck problem!

* delete some debug output

* rename to sync with upstream

* rename to sync with upstream

* coati rename

* nothing

* I am going to detach the replay buffer from the trainer and make it a Ray Actor. Two benefits: 1. support for a TP trainer. 2. asynchronous buffer operations

* experience_maker_holder performs target-revolving _send_experience() instead of length comparison.

* move code to ray subfolder

* working on pipeline inference

* apply comments

---------
Co-authored-by: csric <richcsr256@gmail.com>
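
The "target-revolving" send mentioned in the notes above can be read as round-robin dispatch: the experience maker cycles through its trainer targets in a fixed order instead of comparing buffer lengths to pick the emptiest one. Below is a minimal Python sketch of that idea, assuming this reading is correct. The names `ListBuffer`, `RoundRobinSender`, and `send` are hypothetical and are not taken from the ColossalAI code, and plain in-process objects stand in for the Ray actors the commit describes.

```python
from itertools import cycle
from typing import Any, List


class ListBuffer:
    """Hypothetical stand-in for a trainer-side replay buffer."""

    def __init__(self) -> None:
        self.items: List[Any] = []

    def append(self, experience: Any) -> None:
        self.items.append(experience)


class RoundRobinSender:
    """Sends each experience to the next target in a fixed rotation."""

    def __init__(self, targets: List[ListBuffer]) -> None:
        if not targets:
            raise ValueError("at least one target is required")
        # cycle() repeats the target list endlessly in the same order.
        self._targets = cycle(targets)

    def send(self, experience: Any) -> None:
        # No length comparison: simply advance to the next target.
        next(self._targets).append(experience)


buffers = [ListBuffer(), ListBuffer()]
sender = RoundRobinSender(buffers)
for step in range(5):
    sender.send({"step": step})

# Number of experiences each buffer received.
counts = [len(b.items) for b in buffers]
```

Rotation needs no query of each buffer's current length before sending, which may matter when the buffers live in separate processes; the trade-off is that it ignores how quickly each trainer drains its buffer.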

13 Apr, 2023 1 commit

[chat] ChatGPT train prompts on ray example (#3309) · 1a809edd

MisterLin1995 authored Apr 13, 2023



* [feat][chatgpt]train prompts on ray example

* [fix]simplify code

* [fix]remove deprecated parameter

* [fix]add dependencies

* [fix]method calling

* [fix]experience maker

* [fix]missing loss function

* [fix]init optimizer

* [feat]add usage comment

* [fix]rename files

* [fix]add readme

* [fix]file path

* [fix]move directory

---------
Co-authored-by: jiangwen <zxl265370@antgroup.com>

10 Apr, 2023 2 commits

[chat] add zero2 cpu strategy for sft training (#3520) · 89fd10a1

ver217 authored Apr 10, 2023

[Chat Community] Update README.md (fixed#3487) (#3506) · 635d0a1b

NatalieC323 authored Apr 10, 2023

* Update README.md

* Update README.md

* Update README.md

* Update README.md

---------
Co-authored-by: Fazzie-Maqianli <55798671+Fazziekey@users.noreply.github.com>

06 Apr, 2023 5 commits

[chat] fix stage3 PPO sample sh command (#3477) · 891b8e7f

binmakeswell authored Apr 06, 2023

add community example dictionary (#3465) · 6afeb120

Fazzie-Maqianli authored Apr 06, 2023

[Chat]Add Peft support & fix the ptx bug (#3433) · 62f4e2eb

YY Lin authored Apr 06, 2023

* Update ppo.py

Fix the bug of fetching wrong batch data

* Add peft model support in SFT and Prompts training

In stage 1 and stage 3, peft model support is added, so the trained artifacts are only small LoRA additions instead of the full set of model files.

* Delete test_prompts.txt

* Delete test_pretrained.txt

* Move the peft stuff to a community folder.

* Move the demo sft to community

* delete dirty files

* Add instructions to install peft from source

* Remove Chinese comments

* remove the Chinese comments

[chat]fix readme (#3429) · 57a3c4db

kingkingofall authored Apr 06, 2023

* fix stage 2

fix stage 2

* add torch

[Chat] fix the tokenizer "int too big to convert" error in SFT training (#3453) · 72cb4dd4

Camille Zhong authored Apr 06, 2023

* Add RoBERTa for RLHF Stage 2 & 3 (test)

RoBERTa for RLHF Stage 2 & 3 (still in testing)

* Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"

This reverts commit 06741d894dcbe958acd4e10d771f22275e20e368.

* Add RoBERTa for RLHF stage 2 & 3

1. add roberta folder under model folder
2. add  roberta option in train_reward_model.py
3. add some test in testci

* Update test_ci.sh

* Revert "Update test_ci.sh"

This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.

* update roberta with coati

* chat ci update

* Revert "chat ci update"

This reverts commit 17ae7ae01fa752bd3289fc39069868fde99cf846.

* [Chat] fix the tokenizer "int too big to convert" error in SFT training

fix the tokenizer error during SFT training using Bloom and OPT

03 Apr, 2023 1 commit

[chatgpt] add pre-trained model RoBERTa for RLHF stage 2 & 3 (#3223) · 30412866

Camille Zhong authored Apr 03, 2023

* Add RoBERTa for RLHF Stage 2 & 3 (test)

RoBERTa for RLHF Stage 2 & 3 (still in testing)

* Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"

This reverts commit 06741d894dcbe958acd4e10d771f22275e20e368.

* Add RoBERTa for RLHF stage 2 & 3

1. add roberta folder under model folder
2. add  roberta option in train_reward_model.py
3. add some test in testci

* add test for reward model training

* Update test_ci.sh

* Revert "Update test_ci.sh"

This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.

* update roberta with coati

29 Mar, 2023 2 commits

[format] applied code formatting on changed files in pull request 3300 (#3302) · cb413ccf

github-actions[bot] authored Mar 29, 2023

Co-authored-by: github-actions <github-actions@github.com>

[chat]polish prompts training (#3300) · 8257e105

BlueRum authored Mar 29, 2023

* polish train_prompts

* polish readme

28 Mar, 2023 3 commits

[format] applied code formatting on changed files in pull request 3296 (#3298) · 5134ad5d

github-actions[bot] authored Mar 29, 2023

Co-authored-by: github-actions <github-actions@github.com>

[chat]Update Readme (#3296) · c8b723d6

BlueRum authored Mar 29, 2023

* Update README.md

* Update README.md

* Update README.md

* update example readme

[Coati] first commit (#3283) · b0ce5a10

Fazzie-Maqianli authored Mar 28, 2023