Commits · 1e1b9d2feabc6252818352fdd71772dd46fbe41d · OpenDAS / ColossalAI

22 Mar, 2023 2 commits

[chatgpt]support llama (#3070) · 1e1b9d2f
Fazzie-Maqianli authored Mar 22, 2023

[chatgpt] add supervised learning fine-tune code (#3183) · b4295293

pgzhang authored Mar 22, 2023

* [chatgpt] add supervised fine-tune code

* [chatgpt] delete unused code and modified comment code

* [chatgpt] use pytorch distributed sampler instead

---------
Co-authored-by: zhangpengpeng <zhangpengpeng@joyy.com>

20 Mar, 2023 1 commit

[chatgpt]Reward Model Training Process update (#3133) · 7548ca5a

BlueRum authored Mar 20, 2023

* add normalize function to value_head in bloom rm

* add normalization to value_function in gpt_rm

* add normalization to value_head of opt_rm

* add Anthropic/hh-rlhf dataset

* Update __init__.py

* Add LogExpLoss in RM training

* Update __init__.py

* update rm trainer to use acc as target

* update example/train_rm

* Update train_rm.sh

* code style

* Update README.md

* Update README.md

* add rm test to ci

* fix tokenier

* fix typo

* change batchsize to avoid oom in ci

* Update test_ci.sh

17 Mar, 2023 1 commit
- [chatgpt] fix ppo training hanging problem with gemini (#3162) · c474fda2
  ver217 authored Mar 17, 2023
```
* [chatgpt] fix generation early stopping

* [chatgpt] fix train prompts example
```
13 Mar, 2023 1 commit
- [chatgpt] fix lora support for gpt (#3113) · 0672b5af
  BlueRum authored Mar 13, 2023
```
* fix gpt-actor

* fix gpt-critic

* fix opt-critic
```
12 Mar, 2023 1 commit
- [chatgpt] type miss of kwargs (#3107) · 191daf74
  hiko2MSP authored Mar 13, 2023
10 Mar, 2023 2 commits
- [chatgpt] fix lora save bug (#3099) · c9dd0365
  BlueRum authored Mar 10, 2023
```
* fix colo-stratergy

* polish

* fix lora

* fix ddp

* polish

* polish
```
- [chatgpt]add flag of action mask in critic(#3086) · 02ae80bf
  Fazzie-Maqianli authored Mar 10, 2023
08 Mar, 2023 1 commit

[chatgpt] change critic input as state (#3042) · b51bfec3

wenjunyang authored Mar 08, 2023

* fix Critic

* fix Critic

* fix Critic

* fix neglect of attention mask

* fix neglect of attention mask

* fix neglect of attention mask

* add return

---------
Co-authored-by: yangwenjun <yangwenjun@soyoung.com>
Co-authored-by: yangwjd <yangwjd@chanjet.com>

07 Mar, 2023 1 commit
- change nn to models (#3032) · c21b11ed
  Fazzie-Maqianli authored Mar 07, 2023