    Wenhao Chen authored
    [shardformer, pipeline] add `gradient_checkpointing_ratio` and heterogeneous shard policy for llama (#5508)
    
    * feat: add `GradientCheckpointConfig` and `PipelineGradientCheckpointConfig`
    
    * feat: apply `GradientCheckpointConfig` to policy and llama_forward
    
    * feat: move `distribute_layer` and `get_stage_index` to PipelineStageManager
    
    * fix: add optional args for `distribute_layer` and `get_stage_index`
    
    * fix: fix changed API calls
    
    * test: update llama tests
    
    * style: polish `GradientCheckpointConfig`
    
    * fix: fix pipeline utils tests
    e614aa34
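The commit message names three ideas: distributing layers across pipeline stages, looking up the layer range a stage owns, and checkpointing only a ratio of layers. The following is a minimal, self-contained sketch of those ideas, not the code from this commit. The helper names `split_layers_evenly`, `stage_layer_range`, and `num_checkpointed_layers` are hypothetical, and both the placement of leftover layers on the last stages and the round-up of the ratio are assumptions made for illustration.

```python
import math
from typing import List, Tuple


def split_layers_evenly(num_layers: int, num_stages: int) -> List[int]:
    """Return the number of layers held by each pipeline stage (hypothetical helper)."""
    if num_stages <= 0 or num_layers < num_stages:
        raise ValueError("need at least one layer per stage")
    base, remainder = divmod(num_layers, num_stages)
    # Assumption: leftover layers go to the last `remainder` stages.
    return [base + (1 if stage >= num_stages - remainder else 0)
            for stage in range(num_stages)]


def stage_layer_range(layers_per_stage: List[int], stage: int) -> Tuple[int, int]:
    """Return the half-open [start, end) layer range owned by `stage`."""
    if not 0 <= stage < len(layers_per_stage):
        raise IndexError("stage out of range")
    start = sum(layers_per_stage[:stage])
    return start, start + layers_per_stage[stage]


def num_checkpointed_layers(num_local_layers: int, ratio: float) -> int:
    """Return how many of a stage's layers to checkpoint for a given ratio."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be in [0, 1]")
    # Assumption: round up so any non-zero ratio checkpoints at least one layer.
    return math.ceil(num_local_layers * ratio)


if __name__ == "__main__":
    layers = split_layers_evenly(10, 4)
    for stage in range(len(layers)):
        start, end = stage_layer_range(layers, stage)
        print(stage, (start, end), num_checkpointed_layers(end - start, 0.5))
```

With 10 layers over 4 stages, this sketch assigns `[2, 2, 3, 3]` layers, so stages can hold different layer counts and therefore different numbers of checkpointed layers.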
mixtral_policy.py 22.4 KB