1. 05 Sep, 2023 5 commits
  2. 04 Sep, 2023 2 commits
  3. 01 Sep, 2023 3 commits
  4. 31 Aug, 2023 2 commits
  5. 30 Aug, 2023 2 commits
    • [shardformer] support pp+tp+zero1 tests (#4531) · ec18fc73
      flybird11111 authored
      * [shardformer] fix opt test hanging
      
      * fix
      
      * test
      
      * fix test
      
      * fix test
      
      * remove print
      
      * add fix
      
      * [shardformer] pp+tp+zero1
    • [shardformer] fix opt test hanging (#4521) · d367b887
      flybird11111 authored
      * [shardformer] fix opt test hanging
      
      * fix
      
      * test
      
      * fix test
      
      * fix test
      
      * remove print
      
      * add fix
  6. 29 Aug, 2023 2 commits
  7. 28 Aug, 2023 2 commits
  8. 25 Aug, 2023 3 commits
    • [shardformer] support sharded checkpoint IO for models of HybridParallelPlugin (#4506) · 44eab2b2
      Baizhou Zhang authored
      * add APIs
      
      * implement save_sharded_model
      
      * add test for hybrid checkpointio
      
      * implement naive loading for sharded model
      
      * implement efficient sharded model loading
      
      * open a new file for hybrid checkpoint_io
      
      * small fix
      
      * fix circular importing
      
      * fix docstring
      
      * arrange arguments and apis
      
      * small fix
    • [shardformer] opt fix. (#4514) · de8a65ba
      flybird11111 authored
      * [shardformer] chatglm support sequence parallel

      * fix
      
      * [shardformer] jit fused fix
      
      * activate checks
      
      * [Test] test ci
      
      * fix
    • [zero] support zero2 with gradient accumulation (#4511) · 839847b7
      LuGY authored
      * support gradient accumulation with zero2
      
      * fix type
  9. 24 Aug, 2023 2 commits
    • [shardformer] vit/llama/t5 ignore the sequence parallelism flag and some fix. (#4498) · 3353e55c
      flybird11111 authored
      * [shardformer] chatglm support sequence parallel

      * fix
      
      * [shardformer] jit fused fix
      
      * activate checks
    • [gemini] improve compatibility and add static placement policy (#4479) · 27061426
      Hongxin Liu authored
      * [gemini] remove distributed-related part from colotensor (#4379)
      
      * [gemini] remove process group dependency
      
      * [gemini] remove tp part from colo tensor
      
      * [gemini] patch inplace op
      
      * [gemini] fix param op hook and update tests
      
      * [test] remove useless tests
      
      * [test] remove useless tests
      
      * [misc] fix requirements
      
      * [test] fix model zoo
      
      * [misc] update requirements
      
      * [gemini] refactor gemini optimizer and gemini ddp (#4398)
      
      * [gemini] update optimizer interface
      
      * [gemini] renaming gemini optimizer
      
      * [gemini] refactor gemini ddp class
      
      * [example] update gemini related example
      
      * [example] update gemini related example
      
      * [plugin] fix gemini plugin args
      
      * [test] update gemini ckpt tests
      
      * [gemini] fix checkpoint io
      
      * [example] fix opt example requirements
      
      * [example] fix opt example
      
      * [gemini] add static placement policy (#4443)
      
      * [gemini] add static placement policy
      
      * [gemini] fix param offload
      
      * [test] update gemini tests
      
      * [plugin] update gemini plugin
      
      * [plugin] update gemini plugin docstr
      
      * [misc] fix flash attn requirement
      
      * [test] fix gemini checkpoint io test
      
      * [example] update resnet example result (#4457)
      
      * [example] update bert example result (#4458)
      
      * [doc] update gemini doc (#4468)
      
      * [example] update gemini related examples (#4473)
      
      * [example] update gpt example
      
      * [example] update dreambooth example
      
      * [example] update vit
      
      * [example] update opt
      
      * [example] update palm
      
      * [example] update vit and opt benchmark
      
      * [hotfix] fix bert in model zoo (#4480)
      
      * [hotfix] fix bert in model zoo
      
      * [test] remove chatglm gemini test
      
      * [test] remove sam gemini test
      
      * [test] remove vit gemini test
      
      * [hotfix] fix opt tutorial example (#4497)
      
      * [hotfix] fix opt tutorial example
      
      * [hotfix] fix opt tutorial example
  10. 23 Aug, 2023 1 commit
  11. 22 Aug, 2023 2 commits
  12. 21 Aug, 2023 1 commit
  13. 18 Aug, 2023 2 commits
  14. 16 Aug, 2023 5 commits
  15. 15 Aug, 2023 6 commits