Commits · 7548ca5a54ed117f03247dcb43ec1dd962ae04e0 · OpenDAS / ColossalAI

20 Mar, 2023 1 commit

[chatgpt]Reward Model Training Process update (#3133) · 7548ca5a

BlueRum authored Mar 20, 2023

* add normalize function to value_head in bloom rm

* add normalization to value_function in gpt_rm

* add normalization to value_head of opt_rm

* add Anthropic/hh-rlhf dataset

* Update __init__.py

* Add LogExpLoss in RM training

* Update __init__.py

* update rm trainer to use acc as target

* update example/train_rm

* Update train_rm.sh

* code style

* Update README.md

* Update README.md

* add rm test to ci

* fix tokenizer

* fix typo

* change batchsize to avoid oom in ci

* Update test_ci.sh
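
For orientation, the `LogExpLoss` and "acc as target" items above can be illustrated with a small sketch. The commit's actual implementation is not shown in this log, so this assumes the common pairwise ranking form, the mean of log(1 + exp(r_rejected - r_chosen)). The function names `log_exp_loss` and `pairwise_accuracy` are hypothetical and written in plain Python, not taken from the repository.

```python
import math
from typing import Sequence


def log_exp_loss(chosen: Sequence[float], rejected: Sequence[float]) -> float:
    """Mean of log(1 + exp(r_rejected - r_chosen)) over preference pairs.

    Assumed formulation; the repository's LogExpLoss may differ in detail.
    """
    if len(chosen) != len(rejected) or len(chosen) == 0:
        raise ValueError("need equally sized, non-empty reward lists")
    total = 0.0
    for r_c, r_r in zip(chosen, rejected):
        diff = r_r - r_c
        # Stable softplus: log(1 + e^x) = max(x, 0) + log1p(e^(-|x|))
        total += max(diff, 0.0) + math.log1p(math.exp(-abs(diff)))
    return total / len(chosen)


def pairwise_accuracy(chosen: Sequence[float], rejected: Sequence[float]) -> float:
    """Fraction of pairs where the chosen response scores strictly higher."""
    if len(chosen) != len(rejected) or len(chosen) == 0:
        raise ValueError("need equally sized, non-empty reward lists")
    hits = sum(1 for r_c, r_r in zip(chosen, rejected) if r_c > r_r)
    return hits / len(chosen)
```

Under this assumption the loss shrinks as the reward model scores chosen responses further above rejected ones, while accuracy only counts how often the ordering is right, which is one plausible reason to track it as the selection target.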


17 Mar, 2023 1 commit
- [chatgpt] fix ppo training hanging problem with gemini (#3162) · c474fda2
  ver217 authored Mar 17, 2023
```
* [chatgpt] fix generation early stopping

* [chatgpt] fix train prompts example
```
14 Mar, 2023 1 commit

[chatgpt]update ci (#3087) · 23cd5e2c

BlueRum authored Mar 14, 2023

* [chatgpt]update ci

* Update test_ci.sh

* Update test_ci.sh

* Update test_ci.sh

* test

* Update train_prompts.py

* Update train_dummy.py

* add save_path

* polish

* add save path

* polish

* add save path

* polish

* delete bloom-560m test

delete bloom-560m test because of oom

* add ddp test


13 Mar, 2023 1 commit
- [chatgpt]Fix examples (#3116) · 68577fbc
  BlueRum authored Mar 13, 2023
```
* fix train_dummy

* fix train-prompts
```
07 Mar, 2023 5 commits
- change nn to models (#3032) · c21b11ed
  Fazzie-Maqianli authored Mar 07, 2023
  
- [format] applied code formatting on changed files in pull request 3025 (#3026) · e86d9bb2
  github-actions[bot] authored Mar 07, 2023
```
Co-authored-by: github-actions <github-actions@github.com>
```
- [chatgpt] fix readme (#3025) · 55dcd305
  BlueRum authored Mar 07, 2023
  
- [chatgpt] Add saving ckpt callback for PPO (#2880) · 287d6049
  LuGY authored Mar 07, 2023
```
* add checkpoint callback for chatgpt

* add save ckpt callbacks for ppo

---------
Co-authored-by: Fazzie-Maqianli <55798671+Fazziekey@users.noreply.github.com>
```
- [chatgpt]fix inference model load (#2988) · e5887034
  BlueRum authored Mar 07, 2023
```
* fix lora bug

* polish

* fix lora gemini

* fix inference load model bug
```
03 Mar, 2023 2 commits

[chatgpt] fix lora gemini conflict in RM training (#2984) · f5ca0397
BlueRum authored Mar 03, 2023
```
* fix lora bug

* polish

* fix lora gemini
```

[chatgpt] making experience support dp (#2971) · 19ad49fb

ver217 authored Mar 03, 2023

* [chatgpt] making experience support dp

* [chatgpt] update example test ci

* [chatgpt] update example test ci

* [chatgpt] update example test ci

* [chatgpt] update example test ci

* [chatgpt] update sampler

* [chatgpt] update example test ci

* [chatgpt] refactor sampler

* [chatgpt] update example test ci


02 Mar, 2023 4 commits
- [chatgpt]fix lora bug (#2974) · c9e27f0d
  BlueRum authored Mar 02, 2023
```
* fix lora bug

* polish
```
- [chatgpt] fix inference demo loading bug (#2969) · 82149e9d
  BlueRum authored Mar 02, 2023
```
* [chatgpt] fix inference demo loading bug

* polish
```
- [ChatGPT] fix README (#2966) · bbf9c827
  Fazzie-Maqianli authored Mar 02, 2023
```
* Update README.md

* fix README

* Update README.md

* Update README.md

---------
Co-authored-by: fastalgo <youyang@cs.berkeley.edu>
Co-authored-by: BlueRum <70618399+ht-zhou@users.noreply.github.com>
```
- [doc] fix chatgpt inference typo (#2964) · b0a87663
  binmakeswell authored Mar 02, 2023
  
01 Mar, 2023 1 commit

[chatgpt]add inference example (#2944) · 489a9566

BlueRum authored Mar 01, 2023

* [chatgpt] support inference example

* Create inference.sh

* Update README.md

* Delete inference.sh

* Update inference.py


22 Feb, 2023 2 commits

[chatgpt]support opt & gpt for rm training (#2876) · 2e16f842
BlueRum authored Feb 22, 2023


[chatgpt] Support saving ckpt in examples (#2846) · 34ca324b

BlueRum authored Feb 22, 2023

* [chatgpt]fix train_rm bug with lora

* [chatgpt]support colossalai strategy to train rm

* fix pre-commit

* fix pre-commit 2

* [chatgpt]fix rm eval typo

* fix rm eval

* fix pre commit

* add support of saving ckpt in examples

* fix single-gpu save


21 Feb, 2023 1 commit

[chatgpt] fix rm eval (#2829) · 3eebc4df

BlueRum authored Feb 21, 2023

* [chatgpt]fix train_rm bug with lora

* [chatgpt]support colossalai strategy to train rm

* fix pre-commit

* fix pre-commit 2

* [chatgpt]fix rm eval typo

* fix rm eval

* fix pre commit


17 Feb, 2023 1 commit

[chatgpt] strategy add prepare method (#2766) · 4ee311c0

ver217 authored Feb 17, 2023

* [chatgpt] strategy add prepare method

* [chatgpt] refactor examples

* [chatgpt] refactor strategy.prepare

* [chatgpt] support save/load checkpoint

* [chatgpt] fix unwrap actor

* [chatgpt] fix unwrap actor


16 Feb, 2023 1 commit

[chatgpt] support colossalai strategy to train rm (#2742) · 613efebc

BlueRum authored Feb 16, 2023

* [chatgpt]fix train_rm bug with lora

* [chatgpt]support colossalai strategy to train rm

* fix pre-commit

* fix pre-commit 2


15 Feb, 2023 1 commit

[chatgpt] optimize generation kwargs (#2717) · 9c0943ec

ver217 authored Feb 15, 2023

* [chatgpt] ppo trainer use default generate args

* [chatgpt] example remove generation preparing fn

* [chatgpt] benchmark remove generation preparing fn

* [chatgpt] fix ci


14 Feb, 2023 1 commit
- [app] add chatgpt application (#2698) · 1b347010
  ver217 authored Feb 14, 2023
  