- 26 Jul, 2023 1 commit
-
-
shenggan authored
-
- 29 Jun, 2023 1 commit
-
-
Wenhao Chen authored
* to: add SLTrainer * refactor: refactor RMTrainer and SFTTrainer * fix: fix init file * feat: remove on_learn_epoch fn as not used * fix: align with modified gemini arguments * to: add OnPolicyTrainer * revert: add _on_learn_epoch fn * refactor: refactor PPOTrainer * style: rename PPOTrainer argument * fix: align with modified PPO arguments * test: align with modified train_prompts arguments * chore: modify train_prompts * docs: align with modified arguments * fix: remove unnecessary output * fix: move dataloader to fit fn of SLTrainer * fix: move dataloader to fit fn of OnPolicyTrainer * fix: modify usage of prompt and pretrain dataloader
-
- 26 Apr, 2023 1 commit
-
-
Hongxin Liu authored
* [chat] ppo trainer remove useless args * [chat] update examples * [chat] update benchmark * [chat] update examples * [chat] fix sft training with wandb * [chat] polish docstr
-
- 18 Apr, 2023 1 commit
-
-
Yuanchen authored
Co-authored-by:Yuanchen Xu <yuanchen.xu00@gmail.com>
-
- 28 Mar, 2023 1 commit
-
-
Fazzie-Maqianli authored
-
- 03 Mar, 2023 1 commit
-
-
ver217 authored
* [chatgpt] making experience support dp * [chatgpt] update example test ci * [chatgpt] update example test ci * [chatgpt] update example test ci * [chatgpt] update example test ci * [chatgpt] update sampler * [chatgpt] update example test ci * [chatgpt] refactor sampler * [chatgpt] update example test ci
-
- 14 Feb, 2023 1 commit
-
-
ver217 authored
-