1. 15 Feb, 2022 1 commit
  2. 13 Jan, 2022 1 commit
  3. 21 Dec, 2021 1 commit
  4. 09 Dec, 2021 1 commit
    • Frank Lee's avatar
      Develop/experiments (#59) · da01c234
      Frank Lee authored
      
      
      * Add gradient accumulation, fix lr scheduler
      
      * fix FP16 optimizer and adapted torch amp with tensor parallel (#18)
      
      * fixed bugs in compatibility between torch amp and tensor parallel and performed some minor fixes
      
      * fixed trainer
      
      * Revert "fixed trainer"
      
      This reverts commit 2e0b0b76990e8d4e337add483d878c0f61cf5097.
      
      * improved consistency between trainer, engine and schedule (#23)
      Co-authored-by: default avatar1SAA <c2h214748@gmail.com>
      
      * Split conv2d, class token, positional embedding in 2d, Fix random number in ddp
      Fix convergence in cifar10, Imagenet1000
      
      * Integrate 1d tensor parallel in Colossal-AI (#39)
      
      * fixed 1D and 2D convergence (#38)
      
      * optimized 2D operations
      
      * fixed 1D ViT convergence problem
      
      * Feature/ddp (#49)
      
      * remove redundancy func in setup (#19) (#20)
      
      * use env to control the language of doc (#24) (#25)
      
      * Support TP-compatible Torch AMP and Update trainer API (#27)
      
      * Add gradient accumulation, fix lr scheduler
      
      * fix FP16 optimizer and adapted torch amp with tensor parallel (#18)
      
      * fixed bugs in compatibility between torch amp and tensor parallel and performed some minor fixes
      
      * fixed trainer
      
      * Revert "fixed trainer"
      
      This reverts commit 2e0b0b76990e8d4e337add483d878c0f61cf5097.
      
      * improved consistency between trainer, engine and schedule (#23)
      Co-authored-by: default avatar1SAA <c2h214748@gmail.com>
      Co-authored-by: default avatar1SAA <c2h214748@gmail.com>
      Co-authored-by: default avatarver217 <lhx0217@gmail.com>
      
      * add an example of ViT-B/16 and remove w_norm clipping in LAMB (#29)
      
      * add explanation for ViT example (#35) (#36)
      
      * support torch ddp
      
      * fix loss accumulation
      
      * add log for ddp
      
      * change seed
      
      * modify timing hook
      Co-authored-by: default avatarFrank Lee <somerlee.9@gmail.com>
      Co-authored-by: default avatar1SAA <c2h214748@gmail.com>
      Co-authored-by: default avatarbinmakeswell <binmakeswell@gmail.com>
      
      * Feature/pipeline (#40)
      
      * remove redundancy func in setup (#19) (#20)
      
      * use env to control the language of doc (#24) (#25)
      
      * Support TP-compatible Torch AMP and Update trainer API (#27)
      
      * Add gradient accumulation, fix lr scheduler
      
      * fix FP16 optimizer and adapted torch amp with tensor parallel (#18)
      
      * fixed bugs in compatibility between torch amp and tensor parallel and performed some minor fixes
      
      * fixed trainer
      
      * Revert "fixed trainer"
      
      This reverts commit 2e0b0b76990e8d4e337add483d878c0f61cf5097.
      
      * improved consistency between trainer, engine and schedule (#23)
      Co-authored-by: default avatar1SAA <c2h214748@gmail.com>
      Co-authored-by: default avatar1SAA <c2h214748@gmail.com>
      Co-authored-by: default avatarver217 <lhx0217@gmail.com>
      
      * add an example of ViT-B/16 and remove w_norm clipping in LAMB (#29)
      
      * add explanation for ViT example (#35) (#36)
      
      * optimize communication of pipeline parallel
      
      * fix grad clip for pipeline
      Co-authored-by: default avatarFrank Lee <somerlee.9@gmail.com>
      Co-authored-by: default avatar1SAA <c2h214748@gmail.com>
      Co-authored-by: default avatarbinmakeswell <binmakeswell@gmail.com>
      
      * optimized 3d layer to fix slow computation ; tested imagenet performance with 3d; reworked lr_scheduler config definition; fixed launch args; fixed some printing issues; simplified apis of 3d layers (#51)
      
      * Update 2.5d layer code to get a similar accuracy on imagenet-1k dataset
      
      * update api for better usability (#58)
      
      update api for better usability
      Co-authored-by: default avatar1SAA <c2h214748@gmail.com>
      Co-authored-by: default avatarver217 <lhx0217@gmail.com>
      Co-authored-by: default avatarpuck_WCR <46049915+WANG-CR@users.noreply.github.com>
      Co-authored-by: default avatarbinmakeswell <binmakeswell@gmail.com>
      Co-authored-by: default avatarアマデウス <kurisusnowdeng@users.noreply.github.com>
      Co-authored-by: default avatarBoxiangW <45734921+BoxiangW@users.noreply.github.com>
      da01c234
  5. 18 Nov, 2021 1 commit
  6. 15 Nov, 2021 1 commit
  7. 03 Nov, 2021 1 commit
  8. 28 Oct, 2021 1 commit