  1. 27 Apr, 2023 1 commit
    • [chat] refactor model save/load logic (#3654) · 842768a1
      Hongxin Liu authored
      * [chat] strategy refactor unwrap model
      
      * [chat] strategy refactor save model
      
      * [chat] add docstr
      
      * [chat] refactor trainer save model
      
      * [chat] fix strategy typing
      
      * [chat] refactor trainer save model
      
      * [chat] update readme
      
      * [chat] fix unit test
  2. 18 Apr, 2023 1 commit
  3. 28 Mar, 2023 1 commit
  4. 20 Mar, 2023 1 commit
    • [chatgpt]Reward Model Training Process update (#3133) · 7548ca5a
      BlueRum authored
      * add normalize function to value_head in bloom rm
      
      * add normalization to value_function in gpt_rm
      
      * add normalization to value_head of opt_rm
      
      * add Anthropic/hh-rlhf dataset
      
      * Update __init__.py
      
      * Add LogExpLoss in RM training
      
      * Update __init__.py
      
      * update rm trainer to use acc as target
      
      * update example/train_rm
      
      * Update train_rm.sh
      
      * code style
      
      * Update README.md
      
      * Update README.md
      
      * add rm test to ci
      
      * fix tokenizer
      
      * fix typo
      
      * change batchsize to avoid oom in ci
      
      * Update test_ci.sh
  5. 07 Mar, 2023 1 commit
  6. 03 Mar, 2023 1 commit
  7. 02 Mar, 2023 1 commit
  8. 21 Feb, 2023 1 commit
    • [chatgpt] fix rm eval (#2829) · 3eebc4df
      BlueRum authored
      * [chatgpt]fix train_rm bug with lora
      
      * [chatgpt]support colossalai strategy to train rm
      
      * fix pre-commit
      
      * fix pre-commit 2
      
      * [chatgpt]fix rm eval typo
      
      * fix rm eval
      
      * fix pre commit
  9. 16 Feb, 2023 2 commits
  10. 14 Feb, 2023 1 commit