Commits · cd142fbefa964d62048a9bafb180322369ab89f8 · OpenDAS / ColossalAI

22 Mar, 2023 3 commits
- [chatgpt]add reward model code for deberta (#3199) · 9998d5ef
  Yuanchen authored Mar 22, 2023
```
Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
```
- [chatgpt]support llama (#3070) · 1e1b9d2f
  Fazzie-Maqianli authored Mar 22, 2023
  
- [chatgpt] add supervised learning fine-tune code (#3183) · b4295293
  pgzhang authored Mar 22, 2023
```
* [chatgpt] add supervised fine-tune code

* [chatgpt] delete unused code and modified comment code

* [chatgpt] use pytorch distributed sampler instead

---------
Co-authored-by: zhangpengpeng <zhangpengpeng@joyy.com>
```
20 Mar, 2023 1 commit

[chatgpt]Reward Model Training Process update (#3133) · 7548ca5a

BlueRum authored Mar 20, 2023

* add normalize function to value_head in bloom rm

* add normalization to value_function in gpt_rm

* add normalization to value_head of opt_rm

* add Anthropic/hh-rlhf dataset

* Update __init__.py

* Add LogExpLoss in RM training

* Update __init__.py

* update rm trainer to use acc as target

* update example/train_rm

* Update train_rm.sh

* code style

* Update README.md

* Update README.md

* add rm test to ci

* fix tokenizer

* fix typo

* change batchsize to avoid oom in ci

* Update test_ci.sh


17 Mar, 2023 3 commits
- [chatgpt] fix trainer generate kwargs (#3166) · 1e58d31b
  ver217 authored Mar 17, 2023
  
- [chatgpt] fix ppo training hanging problem with gemini (#3162) · c474fda2
  ver217 authored Mar 17, 2023
```
* [chatgpt] fix generation early stopping

* [chatgpt] fix train prompts example
```
- [doc] add community contribution guide (#3153) · 3c01280a
  binmakeswell authored Mar 17, 2023
```
* [doc] update contribution guide

* [doc] update contribution guide

* [doc] add community contribution guide
```
14 Mar, 2023 1 commit

[chatgpt]update ci (#3087) · 23cd5e2c

BlueRum authored Mar 14, 2023

* [chatgpt]update ci

* Update test_ci.sh

* Update test_ci.sh

* Update test_ci.sh

* test

* Update train_prompts.py

* Update train_dummy.py

* add save_path

* polish

* add save path

* polish

* add save path

* polish

* delete bloom-560m test

delete bloom-560m test because of oom

* add ddp test


13 Mar, 2023 2 commits
- [chatgpt]Fix examples (#3116) · 68577fbc
  BlueRum authored Mar 13, 2023
```
* fix train_dummy

* fix train-prompts
```
- [chatgpt] fix lora support for gpt (#3113) · 0672b5af
  BlueRum authored Mar 13, 2023
```
* fix gpt-actor

* fix gpt-critic

* fix opt-critic
```
12 Mar, 2023 1 commit
- [chatgpt] type miss of kwargs (#3107) · 191daf74
  hiko2MSP authored Mar 13, 2023
  
10 Mar, 2023 2 commits
- [chatgpt] fix lora save bug (#3099) · c9dd0365
  BlueRum authored Mar 10, 2023
```
* fix colo-strategy

* polish

* fix lora

* fix ddp

* polish

* polish
```
- [chatgpt]add flag of action mask in critic(#3086) · 02ae80bf
  Fazzie-Maqianli authored Mar 10, 2023
  
08 Mar, 2023 1 commit

[chatgpt] change critic input as state (#3042) · b51bfec3

wenjunyang authored Mar 08, 2023



* fix Critic

* fix Critic

* fix Critic

* fix neglect of attention mask

* fix neglect of attention mask

* fix neglect of attention mask

* add return

---------
Co-authored-by: yangwenjun <yangwenjun@soyoung.com>
Co-authored-by: yangwjd <yangwjd@chanjet.com>


07 Mar, 2023 5 commits
- change nn to models (#3032) · c21b11ed
  Fazzie-Maqianli authored Mar 07, 2023
  
- [format] applied code formatting on changed files in pull request 3025 (#3026) · e86d9bb2
  github-actions[bot] authored Mar 07, 2023
```
Co-authored-by: github-actions <github-actions@github.com>
```
- [chatgpt] fix readme (#3025) · 55dcd305
  BlueRum authored Mar 07, 2023
  
- [chatgpt] Add saving ckpt callback for PPO (#2880) · 287d6049
  LuGY authored Mar 07, 2023
```
* add checkpoint callback for chatgpt

* add save ckpt callbacks for ppo

---------
Co-authored-by: Fazzie-Maqianli <55798671+Fazziekey@users.noreply.github.com>
```
- [chatgpt]fix inference model load (#2988) · e5887034
  BlueRum authored Mar 07, 2023
```
* fix lora bug

* polish

* fix lora gemini

* fix inference load model bug
```
03 Mar, 2023 3 commits

- [chatgpt] allow shard init and display warning (#2986) · 0ff8406b
  ver217 authored Mar 03, 2023
- [chatgpt] fix lora gemini conflict in RM training (#2984) · f5ca0397
  BlueRum authored Mar 03, 2023
```
* fix lora bug

* polish

* fix lora gemini
```

[chatgpt] making experience support dp (#2971) · 19ad49fb

ver217 authored Mar 03, 2023

* [chatgpt] making experience support dp

* [chatgpt] update example test ci

* [chatgpt] update example test ci

* [chatgpt] update example test ci

* [chatgpt] update example test ci

* [chatgpt] update sampler

* [chatgpt] update example test ci

* [chatgpt] refactor sampler

* [chatgpt] update example test ci


02 Mar, 2023 4 commits
- [chatgpt]fix lora bug (#2974) · c9e27f0d
  BlueRum authored Mar 02, 2023
```
* fix lora bug

* polish
```
- [chatgpt] fix inference demo loading bug (#2969) · 82149e9d
  BlueRum authored Mar 02, 2023
```
* [chatgpt] fix inference demo loading bug

* polish
```
- [ChatGPT] fix README (#2966) · bbf9c827
  Fazzie-Maqianli authored Mar 02, 2023
```
* Update README.md

* fix README

* Update README.md

* Update README.md

---------
Co-authored-by: fastalgo <youyang@cs.berkeley.edu>
Co-authored-by: BlueRum <70618399+ht-zhou@users.noreply.github.com>
```
- [doc] fix chatgpt inference typo (#2964) · b0a87663
  binmakeswell authored Mar 02, 2023
  
01 Mar, 2023 1 commit

[chatgpt]add inference example (#2944) · 489a9566

BlueRum authored Mar 01, 2023

* [chatgpt] support inference example

* Create inference.sh

* Update README.md

* Delete inference.sh

* Update inference.py


28 Feb, 2023 1 commit
- [doc] add env scope (#2933) · 8264cd7e
  binmakeswell authored Feb 28, 2023
  
22 Feb, 2023 2 commits

- [chatgpt]support opt & gpt for rm training (#2876) · 2e16f842
  BlueRum authored Feb 22, 2023

[chatgpt] Support saving ckpt in examples (#2846) · 34ca324b

BlueRum authored Feb 22, 2023

* [chatgpt]fix train_rm bug with lora

* [chatgpt]support colossalai strategy to train rm

* fix pre-commit

* fix pre-commit 2

* [chatgpt]fix rm eval typo

* fix rm eval

* fix pre commit

* add support of saving ckpt in examples

* fix single-gpu save


21 Feb, 2023 1 commit

[chatgpt] fix rm eval (#2829) · 3eebc4df

BlueRum authored Feb 21, 2023

* [chatgpt]fix train_rm bug with lora

* [chatgpt]support colossalai strategy to train rm

* fix pre-commit

* fix pre-commit 2

* [chatgpt]fix rm eval typo

* fix rm eval

* fix pre commit


20 Feb, 2023 1 commit
- [chatgpt] add test checkpoint (#2797) · b6a108cb
  ver217 authored Feb 20, 2023
```
* [chatgpt] add test checkpoint

* [chatgpt] test checkpoint use smaller model
```
17 Feb, 2023 2 commits

[chatgpt] update readme about checkpoint (#2792) · a619a190

ver217 authored Feb 17, 2023

* [chatgpt] add save/load checkpoint sample code

* [chatgpt] add save/load checkpoint readme

* [chatgpt] refactor save/load checkpoint readme


[chatgpt] startegy add prepare method (#2766) · 4ee311c0

ver217 authored Feb 17, 2023

* [chatgpt] startegy add prepare method

* [chatgpt] refactor examples

* [chatgpt] refactor strategy.prepare

* [chatgpt] support save/load checkpoint

* [chatgpt] fix unwrap actor

* [chatgpt] fix unwrap actor


16 Feb, 2023 3 commits
- [chatgpt] disable shard init for colossalai (#2767) · a88bc828
  ver217 authored Feb 16, 2023
  
- [chatgpt] support colossalai strategy to train rm (#2742) · 613efebc
  BlueRum authored Feb 16, 2023
```
* [chatgpt]fix train_rm bug with lora

* [chatgpt]support colossalai strategy to train rm

* fix pre-commit

* fix pre-commit 2
```
- [chatgpt]fix train_rm bug with lora (#2741) · 648183a9
  BlueRum authored Feb 16, 2023
  
15 Feb, 2023 3 commits
- fix typo (#2721) · 7aacfad8
  CH.Li authored Feb 15, 2023
  
- [chatgpt] optimize generation kwargs (#2717) · 9c0943ec
  ver217 authored Feb 15, 2023
```
* [chatgpt] ppo trainer use default generate args

* [chatgpt] example remove generation preparing fn

* [chatgpt] benchmark remove generation preparing fn

* [chatgpt] fix ci
```
- [doc] add open-source contribution invitation (#2714) · d4d3387f
  binmakeswell authored Feb 15, 2023
```
* [doc] fix typo

* [doc] add invitation
```