1. 01 Sep, 2023 1 commit
  2. 31 Aug, 2023 1 commit
    • [shardformer] support sharded optimizer checkpointIO of HybridParallelPlugin (#4540) · c9625dbb
      Baizhou Zhang authored
      * implement sharded optimizer saving
      
      * add more param info
      
      * finish implementation of sharded optimizer saving
      
      * fix bugs in optimizer sharded saving
      
      * add pp+zero test
      
      * param group loading
      
      * greedy loading of optimizer
      
      * fix bug when loading
      
      * implement optimizer sharded saving
      
      * add optimizer test & arrange checkpointIO utils
      
      * fix gemini sharding state_dict
      
      * add verbose option
      
      * add loading of master params
      
      * fix typehint
      
      * fix master/working mapping in fp16 amp
  3. 25 Aug, 2023 1 commit
  4. 21 Jul, 2023 1 commit
  5. 07 Jul, 2023 1 commit
  6. 04 Jul, 2023 2 commits
  7. 03 Jul, 2023 2 commits
  8. 16 Jun, 2023 1 commit
  9. 15 Jun, 2023 1 commit
  10. 18 May, 2023 1 commit
    • [plugin] torch ddp plugin supports sharded model checkpoint (#3775) · 5452df63
      Hongxin Liu authored
      * [plugin] torch ddp plugin add save sharded model
      
      * [test] fix torch ddp ckpt io test

      * [test] fix low level zero plugin test

      * [test] add debug info

      * [test] fix low level zero plugin test
      
      * [test] remove debug info
  11. 15 May, 2023 1 commit
  12. 05 May, 2023 1 commit
    • [booster] gemini plugin support shard checkpoint (#3610) · 307894f7
      jiangmingyan authored
      * gemini plugin add shard checkpoint save/load

      * gemini plugin support shard checkpoint

      * [API Refactoring]gemini plugin support shard checkpoint
      
      ---------
      Co-authored-by: luchen <luchen@luchendeMBP.lan>
      Co-authored-by: luchen <luchen@luchendeMacBook-Pro.local>
  13. 12 Apr, 2023 1 commit
    • [checkpoint] Shard saved checkpoint need to be compatible with the naming format of hf checkpoint files (#3479) · 366a0355
      jiangmingyan authored
      * [checkpoint] support huggingface style sharded checkpoint, to be compatible with hf file naming format

      * [checkpoint] Shard saved checkpoint add 'variant' field to customize filename
      
      ---------
      Co-authored-by: luchen <luchen@luchendeMacBook-Pro.local>
      Co-authored-by: luchen <luchen@luchendeMBP.lan>
  14. 06 Apr, 2023 1 commit
  15. 04 Apr, 2023 1 commit