  1. 11 Mar, 2022 4 commits
    • ver217
    • added unit test for sharded optimizer (#293) · 27155b85
      Frank Lee authored
      * added unit test for sharded optimizer
      
      * refactor for elegance
    • Frank Lee
      e17e54e3
    • Feature/zero (#279) · 5a560a06
      Jiarui Fang authored
      
      
      * add zero1 (#209)
      
      * add zero1
      
      * add test zero1
      
      * update zero stage 1 develop (#212)
      
      * Implement naive zero3 (#240)
      
      * naive zero3 works well
      
      * add zero3 param manager
      
      * add TODOs in comments
      
      * add gather full param ctx
      
      * fix sub module streams
      
      * add offload
      
      * fix bugs of hook and add unit tests
      
      * fix bugs of hook and add unit tests (#252)
      
      * add gather full param ctx
      
      * fix sub module streams
      
      * add offload
      
      * fix bugs of hook and add unit tests
      
      * polish code and add state dict hook
      
      * fix bug
      
      * update unit test
      
      * refactor reconstructed zero code
      
      * clip_grad support zero3 and add unit test
      
      * add unit test for Zero3ParameterManager
      
      * [WIP] initialize the shard param class
      
      * [WIP] Yet another sharded model implementation (#274)
      
      * [WIP] initialize the shard param class
      
      * [WIP] Yet another implementation of shardModel. Using a better hook method.
      
      * torch.concat -> torch.cat
      
      * fix test_zero_level_1.py::test_zero_level_1 unit test
      
      * remove deepspeed implementation and refactor for the reconstructed zero module
      
      * polish zero dp unittests
      Co-authored-by: ver217 <lhx0217@gmail.com>
      Co-authored-by: Frank Lee <somerlee.9@gmail.com>
  2. 29 Dec, 2021 1 commit
  3. 20 Dec, 2021 1 commit
  4. 16 Dec, 2021 1 commit
  5. 09 Dec, 2021 1 commit
    • Develop/experiments (#59) · da01c234
      Frank Lee authored
      
      
      * Add gradient accumulation, fix lr scheduler
      
      * fix FP16 optimizer and adapted torch amp with tensor parallel (#18)
      
      * fixed bugs in compatibility between torch amp and tensor parallel and performed some minor fixes
      
      * fixed trainer
      
      * Revert "fixed trainer"
      
      This reverts commit 2e0b0b76990e8d4e337add483d878c0f61cf5097.
      
      * improved consistency between trainer, engine and schedule (#23)
      Co-authored-by: 1SAA <c2h214748@gmail.com>
      
      * Split conv2d, class token, positional embedding in 2d, Fix random number in ddp
      Fix convergence in cifar10, Imagenet1000
      
      * Integrate 1d tensor parallel in Colossal-AI (#39)
      
      * fixed 1D and 2D convergence (#38)
      
      * optimized 2D operations
      
      * fixed 1D ViT convergence problem
      
      * Feature/ddp (#49)
      
      * remove redundancy func in setup (#19) (#20)
      
      * use env to control the language of doc (#24) (#25)
      
      * Support TP-compatible Torch AMP and Update trainer API (#27)
      
      * Add gradient accumulation, fix lr scheduler
      
      * fix FP16 optimizer and adapted torch amp with tensor parallel (#18)
      
      * fixed bugs in compatibility between torch amp and tensor parallel and performed some minor fixes
      
      * fixed trainer
      
      * Revert "fixed trainer"
      
      This reverts commit 2e0b0b76990e8d4e337add483d878c0f61cf5097.
      
      * improved consistency between trainer, engine and schedule (#23)
      Co-authored-by: 1SAA <c2h214748@gmail.com>
      Co-authored-by: ver217 <lhx0217@gmail.com>
      
      * add an example of ViT-B/16 and remove w_norm clipping in LAMB (#29)
      
      * add explanation for ViT example (#35) (#36)
      
      * support torch ddp
      
      * fix loss accumulation
      
      * add log for ddp
      
      * change seed
      
      * modify timing hook
      Co-authored-by: Frank Lee <somerlee.9@gmail.com>
      Co-authored-by: 1SAA <c2h214748@gmail.com>
      Co-authored-by: binmakeswell <binmakeswell@gmail.com>
      
      * Feature/pipeline (#40)
      
      * remove redundancy func in setup (#19) (#20)
      
      * use env to control the language of doc (#24) (#25)
      
      * Support TP-compatible Torch AMP and Update trainer API (#27)
      
      * Add gradient accumulation, fix lr scheduler
      
      * fix FP16 optimizer and adapted torch amp with tensor parallel (#18)
      
      * fixed bugs in compatibility between torch amp and tensor parallel and performed some minor fixes
      
      * fixed trainer
      
      * Revert "fixed trainer"
      
      This reverts commit 2e0b0b76990e8d4e337add483d878c0f61cf5097.
      
      * improved consistency between trainer, engine and schedule (#23)
      Co-authored-by: 1SAA <c2h214748@gmail.com>
      Co-authored-by: ver217 <lhx0217@gmail.com>
      
      * add an example of ViT-B/16 and remove w_norm clipping in LAMB (#29)
      
      * add explanation for ViT example (#35) (#36)
      
      * optimize communication of pipeline parallel
      
      * fix grad clip for pipeline
      Co-authored-by: Frank Lee <somerlee.9@gmail.com>
      Co-authored-by: 1SAA <c2h214748@gmail.com>
      Co-authored-by: binmakeswell <binmakeswell@gmail.com>
      
      * optimized 3d layer to fix slow computation; tested imagenet performance with 3d; reworked lr_scheduler config definition; fixed launch args; fixed some printing issues; simplified apis of 3d layers (#51)
      
      * Update 2.5d layer code to get a similar accuracy on imagenet-1k dataset
      
      * update api for better usability (#58)
      Co-authored-by: 1SAA <c2h214748@gmail.com>
      Co-authored-by: ver217 <lhx0217@gmail.com>
      Co-authored-by: puck_WCR <46049915+WANG-CR@users.noreply.github.com>
      Co-authored-by: binmakeswell <binmakeswell@gmail.com>
      Co-authored-by: アマデウス <kurisusnowdeng@users.noreply.github.com>
      Co-authored-by: BoxiangW <45734921+BoxiangW@users.noreply.github.com>
  6. 28 Oct, 2021 1 commit