1. 11 Mar, 2022 4 commits
    • ver217's avatar
      13886716
    • ver217's avatar
      fix sharded param hook and unit test · 36f9a74a
      ver217 authored
    • ver217's avatar
    • Jiarui Fang's avatar
      Feature/zero (#279) · 5a560a06
      Jiarui Fang authored
      
      
      * add zero1 (#209)
      
      * add zero1
      
      * add test zero1
      
      * update zero stage 1 develop (#212)
      
      * Implement naive zero3 (#240)
      
      * naive zero3 works well
      
      * add zero3 param manager
      
      * add TODOs in comments
      
      * add gather full param ctx
      
      * fix sub module streams
      
      * add offload
      
      * fix bugs of hook and add unit tests
      
      * fix bugs of hook and add unit tests (#252)
      
      * add gather full param ctx
      
      * fix sub module streams
      
      * add offload
      
      * fix bugs of hook and add unit tests
      
      * polish code and add state dict hook
      
      * fix bug
      
      * update unit test
      
      * refactor reconstructed zero code
      
      * clip_grad support zero3 and add unit test
      
      * add unit test for Zero3ParameterManager
      
      * [WIP] initialize the shard param class
      
      * [WIP] Yet another sharded model implementation (#274)
      
      * [WIP] initialize the shard param class
      
      * [WIP] Yet another implementation of shardModel. Using a better hook method.
      
      * torch.concat -> torch.cat
      
      * fix test_zero_level_1.py::test_zero_level_1 unit test
      
      * remove deepspeed implementation and refactor for the reconstructed zero module
      
      * polish zero dp unittests
      Co-authored-by: ver217 <lhx0217@gmail.com>
      Co-authored-by: Frank Lee <somerlee.9@gmail.com>