1. 27 Apr, 2023 1 commit
    • [Doc] enhancement on README.md for chat examples (#3646) · 8bccb72c
      Camille Zhong authored
      * Add RoBERTa for RLHF Stage 2 & 3 (test)
      
      RoBERTa for RLHF Stage 2 & 3 (still in testing)
      
      Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"
      
      This reverts commit 06741d894dcbe958acd4e10d771f22275e20e368.
      
      Add RoBERTa for RLHF stage 2 & 3
      
      1. add roberta folder under model folder
      2. add  roberta option in train_reward_model.py
      3. add some test in testci
      
      Update test_ci.sh
      
      Revert "Update test_ci.sh"
      
      This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.
      
      Add RoBERTa for RLHF Stage 2 & 3 (test)
      
      RoBERTa for RLHF Stage 2 & 3 (still in testing)
      
      Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"
      
      This reverts commit 06741d894dcbe958acd4e10d771f22275e20e368.
      
      Add RoBERTa for RLHF stage 2 & 3
      
      1. add roberta folder under model folder
      2. add  roberta option in train_reward_model.py
      3. add some test in testci
      
      Update test_ci.sh
      
      Revert "Update test_ci.sh"
      
      This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.
      
      update roberta with coati
      
      chat ci update
      
      Revert "chat ci update"
      
      This reverts commit 17ae7ae01fa752bd3289fc39069868fde99cf846.
      
      * Update README.md
      
      Update README.md
      
      * update readme
      
      * Update test_ci.sh
  2. 26 Apr, 2023 3 commits
    • [chat] refactor trainer (#3648) · 2a951955
      Hongxin Liu authored
      * [chat] ppo trainer remove useless args
      
      * [chat] update examples
      
      * [chat] update benchmark
      
      * [chat] update examples
      
      * [chat] fix sft training with wandb
      
      * [chat] polish docstr
    • [chat] polish performance evaluator (#3647) · f8288315
      Hongxin Liu authored
    • [gemini] accelerate inference (#3641) · 50793b35
      Hongxin Liu authored
      * [gemini] support don't scatter after inference
      
      * [chat] update colossalai strategy
      
      * [chat] fix opt benchmark
      
      * [chat] update opt benchmark
      
      * [gemini] optimize inference
      
      * [test] add gemini inference test
      
      * [chat] fix unit test ci
      
      * [chat] fix ci
      
      * [chat] fix ci
      
      * [chat] skip checkpoint test
  3. 24 Apr, 2023 1 commit
  4. 22 Apr, 2023 1 commit
  5. 20 Apr, 2023 2 commits
  6. 18 Apr, 2023 3 commits
    • [coati] fix install cmd (#3592) · 5a79cffd
      binmakeswell authored
    • Yuanchen authored
      1ec0d386
    • Update test_ci.sh · 36a519b4
      Camille Zhong authored
      update
      
      Update test_ci.sh
      
      Update test_ci.sh
      
      Update test_ci.sh
      
      Update test_ci.sh
      
      Update test_ci.sh
      
      Update test_ci.sh
      
      Update run_chatgpt_examples.yml
      
      Update run_chatgpt_examples.yml
      
      Update run_chatgpt_examples.yml
      
      Update run_chatgpt_examples.yml
      
      Update run_chatgpt_examples.yml
      
      Update run_chatgpt_examples.yml
      
      Update test_ci.sh
      
      Update test_ci.sh
      
      update
      
      Update run_chatgpt_examples.yml
      
      Update run_chatgpt_examples.yml
      
      update ci
      
      Update test_ci.sh
      
      Update run_chatgpt_examples.yml
      
      Update run_chatgpt_examples.yml
      
      Update run_chatgpt_examples.yml
      
      Update run_chatgpt_examples.yml
      
      Update run_chatgpt_examples.yml
      
      Update run_chatgpt_examples.yml
      
      Update run_chatgpt_examples.yml
      
      Update test_ci.sh
      
      Update test_ci.sh
      
      Update run_chatgpt_examples.yml
      
      Update test_ci.sh
      
      Update test_ci.sh
      
      Update test_ci.sh
      
      update test ci
      
      RoBERTa for RLHF Stage 2 & 3 (still in testing)
      
      Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"
      
      This reverts commit 06741d894dcbe958acd4e10d771f22275e20e368.
      
      Add RoBERTa for RLHF stage 2 & 3
      
      1. add roberta folder under model folder
      2. add  roberta option in train_reward_model.py
      3. add some test in testci
      
      Update test_ci.sh
      
      Revert "Update test_ci.sh"
      
      This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.
      
      Add RoBERTa for RLHF Stage 2 & 3 (test)
      
      RoBERTa for RLHF Stage 2 & 3 (still in testing)
      
      Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"
      
      This reverts commit 06741d894dcbe958acd4e10d771f22275e20e368.
      
      Add RoBERTa for RLHF stage 2 & 3
      
      1. add roberta folder under model folder
      2. add  roberta option in train_reward_model.py
      3. add some test in testci
      
      Update test_ci.sh
      
      Revert "Update test_ci.sh"
      
      This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.
      
      update roberta with coati
      
      chat ci update
      
      Revert "chat ci update"
      
      This reverts commit 17ae7ae01fa752bd3289fc39069868fde99cf846.
      
      [test]chat_update_ci
      
      Update test_ci.sh
      
      Update test_ci.sh
      
      test
      
      Update gpt_critic.py
      
      Update gpt_critic.py
      
      Update run_chatgpt_unit_tests.yml
      
      update test ci
      
      update
      
      update
      
      update
      
      update
      
      Update test_ci.sh
      
      update
      
      Update test_ci.sh
      
      Update test_ci.sh
      
      Update run_chatgpt_examples.yml
      
      Update run_chatgpt_examples.yml
  7. 17 Apr, 2023 4 commits
    • fix: fix sft (#3568) · 7788e0b0
      tingfeng cao authored
    • Fazzie-Maqianli authored
      6b1a39b1
    • [chat] update reward model sh (#3578) · cc1eec2f
      binmakeswell authored
    • [chatgpt] Detached PPO Training (#3195) · e3551443
      csric authored
      
      
      * run the base
      
      * working on dist ppo
      
      * sync
      
      * detached trainer
      
      * update detached trainer. no maker update function
      
      * facing init problem
      
      * 1 maker 1 trainer detached run. but no model update
      
      * facing cuda problem
      
      * fix save functions
      
      * verified maker update
      
      * nothing
      
      * add ignore
      
      * analyze loss issue
      
      * remove some debug codes
      
      * facing 2m1t stuck issue
      
      * 2m1t verified
      
      * do not use torchrun
      
      * working on 2m2t
      
      * working on 2m2t
      
      * initialize strategy in ray actor env
      
      * facing actor's init order issue
      
      * facing ddp model update issue (need unwrap ddp)
      
      * unwrap ddp actor
      
      * checking 1m2t stuck problem
      
      * nothing
      
      * set timeout for trainer choosing. It solves the stuck problem!
      
      * delete some debug output
      
      * rename to sync with upstream
      
      * rename to sync with upstream
      
      * coati rename
      
      * nothing
      
      * I am going to detach the replay buffer from the trainer and make it a Ray Actor. Two benefits: 1. support TP trainer. 2. asynchronous buffer operations
      
      * experience_maker_holder performs target-revolving _send_experience() instead of length comparison.
      
      * move code to ray subfolder
      
      * working on pipeline inference
      
      * apply comments
      
      ---------
      Co-authored-by: csric <richcsr256@gmail.com>
  8. 13 Apr, 2023 2 commits
    • [chat] ChatGPT train prompts on ray example (#3309) · 1a809edd
      MisterLin1995 authored
      
      
      * [feat][chatgpt]train prompts on ray example
      
      * [fix]simplify code
      
      * [fix]remove deprecated parameter
      
      * [fix]add dependencies
      
      * [fix]method calling
      
      * [fix]experience maker
      
      * [fix]missing loss function
      
      * [fix]init optimizer
      
      * [feat]add usage comment
      
      * [fix]rename files
      
      * [fix]add readme
      
      * [fix]file path
      
      * [fix]move directory
      
      ---------
      Co-authored-by: jiangwen <zxl265370@antgroup.com>
    • [chat] polish tutorial doc (#3551) · 535b8964
      binmakeswell authored
      * [chat] clean up duplicate tutorial
      
      * [chat] clean up duplicate tutorial
      
      * [chat] clean up duplicate tutorial
      
      * [chat] clean up duplicate tutorial
  9. 12 Apr, 2023 1 commit
  10. 11 Apr, 2023 1 commit
  11. 10 Apr, 2023 2 commits
  12. 07 Apr, 2023 1 commit
  13. 06 Apr, 2023 7 commits
    • binmakeswell authored
      891b8e7f
    • add community example dictionary (#3465) · 6afeb120
      Fazzie-Maqianli authored
    • [test] refactor tests with spawn (#3452) · 80eba05b
      Frank Lee authored
      * [test] added spawn decorator
      
      * polish code
      
      * polish code
      
      * polish code
      
      * polish code
      
      * polish code
      
      * polish code
    • [Chat]Add Peft support & fix the ptx bug (#3433) · 62f4e2eb
      YY Lin authored
      * Update ppo.py
      
      Fix the bug of fetching wrong batch data
      
      * Add peft model support in SFT and Prompts training
      
      Peft model support is added in stage 1 and stage 3, so the trained artifacts are only small LoRA additions instead of the full set of model files.
      
      * Delete test_prompts.txt
      
      * Delete test_pretrained.txt
      
      * Move the peft stuff to a community folder.
      
      * Move the demo sft to community
      
      * delete dirty files
      
      * Add instructions to install peft from source
      
      * Remove Chinese comments
      
      * remove the Chinese comments
    • [chat]fix save_model(#3377) · 73afb635
      Dr-Corgi authored
      The function save_model should be a part of PPOTrainer.
    • [chat]fix readme (#3429) · 57a3c4db
      kingkingofall authored
      * fix stage 2
      
      fix stage 2
      
      * add torch
    • [Chat] fix the tokenizer "int too big to convert" error in SFT training (#3453) · 72cb4dd4
      Camille Zhong authored
      * Add RoBERTa for RLHF Stage 2 & 3 (test)
      
      RoBERTa for RLHF Stage 2 & 3 (still in testing)
      
      * Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"
      
      This reverts commit 06741d894dcbe958acd4e10d771f22275e20e368.
      
      * Add RoBERTa for RLHF stage 2 & 3
      
      1. add roberta folder under model folder
      2. add  roberta option in train_reward_model.py
      3. add some test in testci
      
      * Update test_ci.sh
      
      * Revert "Update test_ci.sh"
      
      This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.
      
      * Add RoBERTa for RLHF Stage 2 & 3 (test)
      
      RoBERTa for RLHF Stage 2 & 3 (still in testing)
      
      * Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"
      
      This reverts commit 06741d894dcbe958acd4e10d771f22275e20e368.
      
      * Add RoBERTa for RLHF stage 2 & 3
      
      1. add roberta folder under model folder
      2. add  roberta option in train_reward_model.py
      3. add some test in testci
      
      * Update test_ci.sh
      
      * Revert "Update test_ci.sh"
      
      This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.
      
      * update roberta with coati
      
      * chat ci update
      
      * Revert "chat ci update"
      
      This reverts commit 17ae7ae01fa752bd3289fc39069868fde99cf846.
      
      * [Chat] fix the tokenizer "int too big to convert" error in SFT training
      
      fix the tokenizer error during SFT training using Bloom and OPT
  14. 05 Apr, 2023 1 commit
  15. 04 Apr, 2023 3 commits
  16. 03 Apr, 2023 1 commit
    • [chatgpt] add pre-trained model RoBERTa for RLHF stage 2 & 3 (#3223) · 30412866
      Camille Zhong authored
      * Add RoBERTa for RLHF Stage 2 & 3 (test)
      
      RoBERTa for RLHF Stage 2 & 3 (still in testing)
      
      * Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"
      
      This reverts commit 06741d894dcbe958acd4e10d771f22275e20e368.
      
      * Add RoBERTa for RLHF stage 2 & 3
      
      1. add roberta folder under model folder
      2. add  roberta option in train_reward_model.py
      3. add some test in testci
      
      * add test for reward model training
      
      * Update test_ci.sh
      
      * Revert "Update test_ci.sh"
      
      This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.
      
      * Add RoBERTa for RLHF Stage 2 & 3 (test)
      
      RoBERTa for RLHF Stage 2 & 3 (still in testing)
      
      * Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)"
      
      This reverts commit 06741d894dcbe958acd4e10d771f22275e20e368.
      
      * Add RoBERTa for RLHF stage 2 & 3
      
      1. add roberta folder under model folder
      2. add  roberta option in train_reward_model.py
      3. add some test in testci
      
      * Update test_ci.sh
      
      * Revert "Update test_ci.sh"
      
      This reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a.
      
      * update roberta with coati
  17. 30 Mar, 2023 1 commit
  18. 29 Mar, 2023 5 commits