    [Fix][FSDP]fixed padding size of input tensor for reduce scatter (#907) · fb4eca19
    tmarkstrum authored
    
    
    * fixed padding size of input tensor for reduce scatter, and fixed an error that assigned the wrong group
    
    * Update fairscale/nn/data_parallel/fully_sharded_data_parallel.py
    Co-authored-by: Min Xu <24926999+min-xu-ai@users.noreply.github.com>
    
    * added changelog
    
    * fixed some commit.
    
    * added unit test to ensure the reduce_scatter process group size is correct in default cases, and fall back to the default process group when the reduce_scatter process group has the wrong size.
    
    * throw an error instead of rolling back to use default process group for reduce_scatter_process_group
    
    * Revert "throw an error instead of rolling back to use default process group for reduce_scatter_process_group"
    
    This reverts commit eab5620da3b726ea55d3088ae4ca10d94dcdf4d9.
    
    * added check for None to avoid unit test failure
    
    * fixed an error to avoid the unit test failure
    Co-authored-by: Min Xu <24926999+min-xu-ai@users.noreply.github.com>
test_fsdp.py 36 KB