    [chat] refactor trainer class (#4080) · b03d64d0
    Wenhao Chen authored
    * to: add SLTrainer
    
    * refactor: refactor RMTrainer and SFTTrainer
    
    * fix: fix init file
    
    * feat: remove on_learn_epoch fn as not used
    
    * fix: align with modified gemini arguments
    
    * to: add OnPolicyTrainer
    
    * revert: add _on_learn_epoch fn
    
    * refactor: refactor PPOTrainer
    
    * style: rename PPOTrainer argument
    
    * fix: align with modified PPO arguments
    
    * test: align with modified train_prompts arguments
    
    * chore: modify train_prompts
    
    * docs: align with modified arguments
    
    * fix: remove unnecessary output
    
    * fix: move dataloader to fit fn of SLTrainer
    
    * fix: move dataloader to fit fn of OnPolicyTrainer
    
    * fix: modify usage of prompt and pretrain dataloader
colossalai.py 8.79 KB