  1. 27 Apr, 2023 1 commit
    • [Doc] enhancement on README.md for chat examples (#3646) · 8bccb72c
      Camille Zhong authored
      * Add RoBERTa for RLHF Stage 2 & 3 (test)
      
      RoBERTa for RLHF Stage 2 & 3 (still in testing)
      
      Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"
      
      This reverts commit 06741d894dcbe958acd4e10d771f22275e20e368.
      
      Add RoBERTa for RLHF stage 2 & 3
      
      1. add roberta folder under model folder
      2. add roberta option in train_reward_model.py
      3. add some test in testci
      
      Update test_ci.sh
      
      Revert "Update test_ci.sh"
      
      This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.
      
      update roberta with coati
      
      chat ci update
      
      Revert "chat ci update"
      
      This reverts commit 17ae7ae01fa752bd3289fc39069868fde99cf846.
      
      * Update README.md
      
      Update README.md
      
      * update readme
      
      * Update test_ci.sh
  2. 18 Apr, 2023 1 commit
    • Update test_ci.sh · 36a519b4
      Camille Zhong authored
      update
      
      Update test_ci.sh
      
      Update test_ci.sh
      
      Update test_ci.sh
      
      Update test_ci.sh
      
      Update test_ci.sh
      
      Update test_ci.sh
      
      Update run_chatgpt_examples.yml
      
      Update run_chatgpt_examples.yml
      
      Update run_chatgpt_examples.yml
      
      Update run_chatgpt_examples.yml
      
      Update run_chatgpt_examples.yml
      
      Update run_chatgpt_examples.yml
      
      Update test_ci.sh
      
      Update test_ci.sh
      
      update
      
      Update run_chatgpt_examples.yml
      
      Update run_chatgpt_examples.yml
      
      update ci
      
      Update test_ci.sh
      
      Update run_chatgpt_examples.yml
      
      Update run_chatgpt_examples.yml
      
      Update run_chatgpt_examples.yml
      
      Update run_chatgpt_examples.yml
      
      Update run_chatgpt_examples.yml
      
      Update run_chatgpt_examples.yml
      
      Update run_chatgpt_examples.yml
      
      Update test_ci.sh
      
      Update test_ci.sh
      
      Update run_chatgpt_examples.yml
      
      Update test_ci.sh
      
      Update test_ci.sh
      
      Update test_ci.sh
      
      update test ci
      
      RoBERTa for RLHF Stage 2 & 3 (still in testing)
      
      Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"
      
      This reverts commit 06741d894dcbe958acd4e10d771f22275e20e368.
      
      Add RoBERTa for RLHF stage 2 & 3
      
      1. add roberta folder under model folder
      2. add roberta option in train_reward_model.py
      3. add some test in testci
      
      Update test_ci.sh
      
      Revert "Update test_ci.sh"
      
      This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.
      
      update roberta with coati
      
      chat ci update
      
      Revert "chat ci update"
      
      This reverts commit 17ae7ae01fa752bd3289fc39069868fde99cf846.
      
      [test]chat_update_ci
      
      Update test_ci.sh
      
      Update test_ci.sh
      
      test
      
      Update gpt_critic.py
      
      Update gpt_critic.py
      
      Update run_chatgpt_unit_tests.yml
      
      update test ci
      
      update
      
      update
      
      update
      
      update
      
      Update test_ci.sh
      
      update
      
      Update test_ci.sh
      
      Update test_ci.sh
      
      Update run_chatgpt_examples.yml
      
      Update run_chatgpt_examples.yml
  3. 03 Apr, 2023 1 commit
    • [chatgpt] add pre-trained model RoBERTa for RLHF stage 2 & 3 (#3223) · 30412866
      Camille Zhong authored
      * Add RoBERTa for RLHF Stage 2 & 3 (test)
      
      RoBERTa for RLHF Stage 2 & 3 (still in testing)
      
      * Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"
      
      This reverts commit 06741d894dcbe958acd4e10d771f22275e20e368.
      
      * Add RoBERTa for RLHF stage 2 & 3
      
      1. add roberta folder under model folder
      2. add roberta option in train_reward_model.py
      3. add some test in testci
      
      * add test for reward model training
      
      * Update test_ci.sh
      
      * Revert "Update test_ci.sh"
      
      This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.
      
      * update roberta with coati
  4. 28 Mar, 2023 1 commit
  5. 22 Mar, 2023 1 commit
  6. 20 Mar, 2023 1 commit
    • [chatgpt]Reward Model Training Process update (#3133) · 7548ca5a
      BlueRum authored
      * add normalize function to value_head in bloom rm
      
      * add normalization to value_function in gpt_rm
      
      * add normalization to value_head of opt_rm
      
      * add Anthropic/hh-rlhf dataset
      
      * Update __init__.py
      
      * Add LogExpLoss in RM training
      
      * Update __init__.py
      
      * update rm trainer to use acc as target
      
      * update example/train_rm
      
      * Update train_rm.sh
      
      * code style
      
      * Update README.md
      
      * Update README.md
      
      * add rm test to ci
      
      * fix tokenizer
      
      * fix typo
      
      * change batchsize to avoid oom in ci
      
      * Update test_ci.sh
  7. 14 Mar, 2023 1 commit
    • [chatgpt]update ci (#3087) · 23cd5e2c
      BlueRum authored
      * [chatgpt]update ci
      
      * Update test_ci.sh
      
      * Update test_ci.sh
      
      * Update test_ci.sh
      
      * test
      
      * Update train_prompts.py
      
      * Update train_dummy.py
      
      * add save_path
      
      * polish
      
      * add save path
      
      * polish
      
      * add save path
      
      * polish
      
      * delete bloom-560m test
      
      delete bloom-560m test because of oom
      
      * add ddp test
  8. 03 Mar, 2023 1 commit
    • [chatgpt] making experience support dp (#2971) · 19ad49fb
      ver217 authored
      * [chatgpt] making experience support dp
      
      * [chatgpt] update example test ci
      
      * [chatgpt] update example test ci
      
      * [chatgpt] update example test ci
      
      * [chatgpt] update example test ci
      
      * [chatgpt] update sampler
      
      * [chatgpt] update example test ci
      
      * [chatgpt] refactor sampler
      
      * [chatgpt] update example test ci
  9. 14 Feb, 2023 1 commit