1. 14 Aug, 2023 1 commit
    • [doc] update Coati README (#4405) · 6d41c3f2
      Wenhao Chen authored
      * style: apply formatter
      
      * fix: add outdated warnings
      
      * docs: add dataset format and polish
      
      * docs: polish README
      
      * fix: fix json format
      
      * fix: fix typos
      
      * revert: revert 7b example
  2. 28 Jul, 2023 1 commit
  3. 29 Jun, 2023 2 commits
    • [chat] remove naive strategy and split colossalai strategy (#4094) · edd75a59
      Wenhao Chen authored
      * feat: remove on_learn_epoch fn as not used
      
      * revert: add _on_learn_epoch fn
      
      * to: remove the use of NaiveStrategy
      
      * test: remove NaiveStrategy tests
      
      * feat: remove NaiveStrategy
      
      * style: modify comments and params
      
      * feat: split ColossalAIStrategy into LowLevelZeroStrategy and GeminiStrategy
      
      * fix: remove naive
      
      * fix: align with modified colossal strategy
      
      * fix: fix ddp _try_init_dist arg
    • [chat] refactor trainer class (#4080) · b03d64d0
      Wenhao Chen authored
      * to: add SLTrainer
      
      * refactor: refactor RMTrainer and SFTTrainer
      
      * fix: fix init file
      
      * feat: remove on_learn_epoch fn as not used
      
      * fix: align with modified gemini arguments
      
      * to: add OnPolicyTrainer
      
      * revert: add _on_learn_epoch fn
      
      * refactor: refactor PPOTrainer
      
      * style: rename PPOTrainer argument
      
      * fix: align with modified PPO arguments
      
      * test: align with modified train_prompts arguments
      
      * chore: modify train_prompts
      
      * docs: align with modified arguments
      
      * fix: remove unnecessary output
      
      * fix: move dataloader to fit fn of SLTrainer
      
      * fix: move dataloader to fit fn of OnPolicyTrainer
      
      * fix: modify usage of prompt and pretrain dataloader
  4. 19 May, 2023 1 commit
  5. 17 May, 2023 1 commit
  6. 06 May, 2023 1 commit
  7. 05 May, 2023 2 commits
    • [chat] PPO stage3 doc enhancement (#3679) · 0f785cb1
      Camille Zhong authored
      * Add RoBERTa for RLHF Stage 2 & 3 (test)
      
      RoBERTa for RLHF Stage 2 & 3 (still in testing)
      
      Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"
      
      This reverts commit 06741d894dcbe958acd4e10d771f22275e20e368.
      
      Add RoBERTa for RLHF stage 2 & 3
      
      1. add roberta folder under model folder
      2. add roberta option in train_reward_model.py
      3. add some test in testci
      
      Update test_ci.sh
      
      Revert "Update test_ci.sh"
      
      This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.
      
      update roberta with coati
      
      chat ci update
      
      Revert "chat ci update"
      
      This reverts commit 17ae7ae01fa752bd3289fc39069868fde99cf846.
      
      * Update README.md
      
      Update README.md
      
      * update readme
      
      * Update test_ci.sh
      
      * update readme and add a script
      
      update readme and add a script
      
      modify readme
      
      Update README.md
    • [doc] fix chat spelling error (#3671) · 6650daeb
      digger-yu authored
      * Update README.md
      
      change "huggingaface" to "huggingface"
      
      * Update README.md
      
      change "Colossa-AI" to "Colossal-AI"
  8. 28 Apr, 2023 2 commits
  9. 27 Apr, 2023 2 commits
    • [chat] refactor model save/load logic (#3654) · 842768a1
      Hongxin Liu authored
      * [chat] strategy refactor unwrap model
      
      * [chat] strategy refactor save model
      
      * [chat] add docstr
      
      * [chat] refactor trainer save model
      
      * [chat] fix strategy typing
      
      * [chat] refactor trainer save model
      
      * [chat] update readme
      
      * [chat] fix unit test
    • [Doc] enhancement on README.md for chat examples (#3646) · 8bccb72c
      Camille Zhong authored
      * Add RoBERTa for RLHF Stage 2 & 3 (test)
      
      RoBERTa for RLHF Stage 2 & 3 (still in testing)
      
      Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"
      
      This reverts commit 06741d894dcbe958acd4e10d771f22275e20e368.
      
      Add RoBERTa for RLHF stage 2 & 3
      
      1. add roberta folder under model folder
      2. add roberta option in train_reward_model.py
      3. add some test in testci
      
      Update test_ci.sh
      
      Revert "Update test_ci.sh"
      
      This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.
      
      update roberta with coati
      
      chat ci update
      
      Revert "chat ci update"
      
      This reverts commit 17ae7ae01fa752bd3289fc39069868fde99cf846.
      
      * Update README.md
      
      Update README.md
      
      * update readme
      
      * Update test_ci.sh
  10. 20 Apr, 2023 1 commit
  11. 17 Apr, 2023 1 commit
  12. 06 Apr, 2023 1 commit
  13. 29 Mar, 2023 1 commit
  14. 28 Mar, 2023 3 commits
  15. 24 Mar, 2023 2 commits
  16. 20 Mar, 2023 1 commit
    • [chatgpt] Reward Model Training Process update (#3133) · 7548ca5a
      BlueRum authored
      * add normalize function to value_head in bloom rm
      
      * add normalization to value_function in gpt_rm
      
      * add normalization to value_head of opt_rm
      
      * add Anthropic/hh-rlhf dataset
      
      * Update __init__.py
      
      * Add LogExpLoss in RM training
      
      * Update __init__.py
      
      * update rm trainer to use acc as target
      
      * update example/train_rm
      
      * Update train_rm.sh
      
      * code style
      
      * Update README.md
      
      * Update README.md
      
      * add rm test to ci
      
      * fix tokenizer
      
      * fix typo
      
      * change batchsize to avoid oom in ci
      
      * Update test_ci.sh
  17. 07 Mar, 2023 3 commits
  18. 02 Mar, 2023 2 commits
  19. 01 Mar, 2023 1 commit
  20. 14 Feb, 2023 1 commit