  1. 06 May, 2023 1 commit
  2. 05 May, 2023 2 commits
    • [chat] PPO stage3 doc enhancement (#3679) · 0f785cb1
      Camille Zhong authored
      * Add RoBERTa for RLHF Stage 2 & 3 (test)
      
      RoBERTa for RLHF Stage 2 & 3 (still in testing)
      
      Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"
      
      This reverts commit 06741d894dcbe958acd4e10d771f22275e20e368.
      
      Add RoBERTa for RLHF stage 2 & 3
      
      1. add roberta folder under model folder
      2. add roberta option in train_reward_model.py
      3. add some tests in testci
      
      Update test_ci.sh
      
      Revert "Update test_ci.sh"
      
      This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.
      
      update roberta with coati
      
      chat ci update
      
      Revert "chat ci update"
      
      This reverts commit 17ae7ae01fa752bd3289fc39069868fde99cf846.
      
      * Update README.md
      
      Update README.md
      
      * update readme
      
      * Update test_ci.sh
      
      * update readme and add a script
      
      update readme and add a script
      
      modify readme
      
      Update README.md
    • [doc] fix chat spelling error (#3671) · 6650daeb
      digger-yu authored
      * Update README.md
      
      change "huggingaface" to "huggingface"
      
      * Update README.md
      
      change "Colossa-AI" to "Colossal-AI"
  3. 28 Apr, 2023 2 commits
  4. 27 Apr, 2023 2 commits
    • [chat] refactor model save/load logic (#3654) · 842768a1
      Hongxin Liu authored
      * [chat] strategy refactor unwrap model
      
      * [chat] strategy refactor save model
      
      * [chat] add docstr
      
      * [chat] refactor trainer save model
      
      * [chat] fix strategy typing
      
      * [chat] refactor trainer save model
      
      * [chat] update readme
      
      * [chat] fix unit test
    • [Doc] enhancement on README.md for chat examples (#3646) · 8bccb72c
      Camille Zhong authored
      * Add RoBERTa for RLHF Stage 2 & 3 (test)
      
      RoBERTa for RLHF Stage 2 & 3 (still in testing)
      
      Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"
      
      This reverts commit 06741d894dcbe958acd4e10d771f22275e20e368.
      
      Add RoBERTa for RLHF stage 2 & 3
      
      1. add roberta folder under model folder
      2. add roberta option in train_reward_model.py
      3. add some tests in testci
      
      Update test_ci.sh
      
      Revert "Update test_ci.sh"
      
      This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.
      
      update roberta with coati
      
      chat ci update
      
      Revert "chat ci update"
      
      This reverts commit 17ae7ae01fa752bd3289fc39069868fde99cf846.
      
      * Update README.md
      
      Update README.md
      
      * update readme
      
      * Update test_ci.sh
  5. 20 Apr, 2023 1 commit
  6. 17 Apr, 2023 1 commit
  7. 06 Apr, 2023 1 commit
  8. 29 Mar, 2023 1 commit
  9. 28 Mar, 2023 3 commits
  10. 24 Mar, 2023 2 commits
  11. 20 Mar, 2023 1 commit
    • [chatgpt] Reward Model Training Process update (#3133) · 7548ca5a
      BlueRum authored
      * add normalize function to value_head in bloom rm
      
      * add normalization to value_function in gpt_rm
      
      * add normalization to value_head of opt_rm
      
      * add Anthropic/hh-rlhf dataset
      
      * Update __init__.py
      
      * Add LogExpLoss in RM training
      
      * Update __init__.py
      
      * update rm trainer to use acc as target
      
      * update example/train_rm
      
      * Update train_rm.sh
      
      * code style
      
      * Update README.md
      
      * Update README.md
      
      * add rm test to ci
      
      * fix tokenizer
      
      * fix typo
      
      * change batchsize to avoid oom in ci
      
      * Update test_ci.sh
  12. 07 Mar, 2023 3 commits
  13. 02 Mar, 2023 2 commits
  14. 01 Mar, 2023 1 commit
  15. 14 Feb, 2023 1 commit