1. 25 May, 2023 1 commit
    • [nfc] fix typo colossalai/ applications/ (#3831) · e2d81eba
      digger yu authored
      * fix typo colossalai/autochunk auto_parallel amp
      
      * fix typo colossalai/auto_parallel nn utils etc.
      
      * fix typo colossalai/auto_parallel autochunk fx/passes  etc.
      
      * fix typo docs/
      
      * change placememt_policy to placement_policy in docs/ and examples/
      
      * fix typo colossalai/ applications/
  2. 17 Apr, 2023 1 commit
    • [chatgpt] Detached PPO Training (#3195) · e3551443
      csric authored
      
      
      * run the base
      
      * working on dist ppo
      
      * sync
      
      * detached trainer
      
      * update detached trainer. no maker update function
      
      * facing init problem
      
      * 1 maker 1 trainer detached run. but no model update
      
      * facing cuda problem
      
      * fix save functions
      
      * verified maker update
      
      * nothing
      
      * add ignore
      
      * analyze loss issue
      
      * remove some debug codes
      
      * facing 2m1t stuck issue
      
      * 2m1t verified
      
      * do not use torchrun
      
      * working on 2m2t
      
      * working on 2m2t
      
      * initialize strategy in ray actor env
      
      * facing actor's init order issue
      
      * facing ddp model update issue (need to unwrap ddp)
      
      * unwrap ddp actor
      
      * checking 1m2t stuck problem
      
      * nothing
      
      * set timeout for trainer choosing. It solves the stuck problem!
      
      * delete some debug output
      
      * rename to sync with upstream
      
      * rename to sync with upstream
      
      * coati rename
      
      * nothing
      
      * I am going to detach the replay buffer from the trainer and make it a Ray Actor. Two benefits: 1. support TP trainer. 2. asynchronous buffer operations
      
      * experience_maker_holder performs target-revolving _send_experience() instead of length comparison.
      
      * move code to ray subfolder
      
      * working on pipeline inference
      
      * apply comments
      
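      The "target-revolving" `_send_experience()` idea from the commits above can be sketched as a round-robin dispatcher: the experience maker cycles through trainer targets in a fixed order instead of comparing queue lengths. The class and method names below are illustrative, not the actual coati API:

      ```python
      from itertools import cycle

      class ExperienceMakerHolder:
          """Illustrative sketch: dispatch experience batches to trainers in
          round-robin ("target-revolving") order, rather than picking the
          trainer with the shortest queue (length comparison)."""

          def __init__(self, trainers):
              # cycle() yields trainers[0], trainers[1], ..., then wraps around
              self._targets = cycle(trainers)

          def send_experience(self, experience):
              target = next(self._targets)
              target.receive(experience)
              return target
      ```

      Revolving through targets keeps every trainer fed at the same rate without the maker having to query remote queue sizes.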
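      The timeout fix for the stuck trainer-choosing step can be illustrated with a bounded queue wait. `choose_trainer` and the free-trainer queue are assumptions for illustration, not the project's code:

      ```python
      import queue

      def choose_trainer(free_trainers: "queue.Queue", timeout: float = 5.0):
          """Wait up to `timeout` seconds for a free trainer.

          Returning None instead of blocking forever is what breaks the
          deadlock when no trainer ever becomes free (the "stuck problem");
          the caller can then retry or fall back instead of hanging."""
          try:
              return free_trainers.get(timeout=timeout)
          except queue.Empty:
              return None
      ```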
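      The DDP-unwrapping fix relies on the fact that torch's `DistributedDataParallel` stores the user's model on its `.module` attribute. A minimal, duck-typed sketch of that unwrap (no torch dependency; `unwrap_model` is a hypothetical helper name):

      ```python
      def unwrap_model(model):
          """Return the inner model if `model` is a DDP-style wrapper.

          torch.nn.parallel.DistributedDataParallel exposes the wrapped model
          as `.module`; plain models are returned unchanged, so weight
          updates and state_dict saving operate on the raw module."""
          return getattr(model, "module", model)
      ```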
      ---------
      Co-authored-by: csric <richcsr256@gmail.com>