1. 29 Dec, 2023 1 commit
  2. 07 Nov, 2023 1 commit
  3. 19 Sep, 2023 1 commit
  4. 28 Aug, 2023 1 commit
  5. 22 Aug, 2023 1 commit
      [shardformer] chatglm support sequence parallel (#4482) · 59e252ec
      flybird11111 authored
* [shardformer] chatglm support sequence parallel
      
* fix
  6. 16 Aug, 2023 1 commit
      [shardformer/sequence parallel] Cherry pick commit to new branch (#4450) · 424629fe
      Bin Jia authored
      * [shardformer/sequence parallel] Support sequence parallel for gpt2 (#4384)
      
      * [sequence parallel] add sequence parallel linear col/row support (#4336)
      
      * add sequence parallel linear col/row support
      
      * add annotation
      
      * add annotation
      
      * add support for gpt2 fused qkv linear layer
      
      * support sequence parallel in GPT2
      
      * add docstring and note
      
* add requirements
      
* remove unused flash-attn
      
      * modify flash attn test
      
      * modify flash attn setting
      
      * modify flash attn code
      
      * add assert before divide, rename forward function
      
      * [shardformer/test] fix gpt2 test with seq-parallel
      
      * [shardformer/sequence parallel] Overlap input gather and grad computation during col backward (#4401)
      
      * overlap gather input / grad computing during col backward
      
      * modify test for overlap
      
      * simplify code
      
      * fix code and modify cuda stream synchronize
      
      * [shardformer/sequence parallel] polish code
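The sequence-parallel linear col support described in the commits above can be illustrated with a minimal single-process sketch. This is not ColossalAI's implementation (which uses `torch.distributed` collectives and overlaps the backward gather with grad computation on CUDA streams); it only simulates the forward idea under assumed shapes, with the all-gather modeled as a concatenation across simulated ranks. All function and variable names here are hypothetical.

```python
import numpy as np

def seq_parallel_col_linear_forward(seq_shards, weight_cols):
    """Simulate a sequence-parallel column linear layer on one process.

    seq_shards[r]  : rank r's shard of the input, shape (seq/R, hidden)
    weight_cols[r] : rank r's column shard of W, shape (hidden, out/R)

    Forward: all-gather the sequence shards so every rank sees the full
    sequence, then each rank multiplies by its own column slice of W.
    """
    full_input = np.concatenate(seq_shards, axis=0)   # stand-in for all-gather over the seq dim
    return [full_input @ w for w in weight_cols]      # per-rank partial outputs (column slices)

# Reference: unsharded computation to check against
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4))      # (seq, hidden)
W = rng.standard_normal((4, 6))      # (hidden, out)

ranks = 2
seq_shards = np.split(x, ranks, axis=0)    # each rank keeps seq/ranks rows
weight_cols = np.split(W, ranks, axis=1)   # each rank keeps out/ranks columns

outs = seq_parallel_col_linear_forward(seq_shards, weight_cols)
# Concatenating the per-rank column outputs recovers the dense result
assert np.allclose(np.concatenate(outs, axis=1), x @ W)
```

In the real layer, the backward pass must gather the input again to compute the weight gradient; the overlap commit (#4401) hides that gather behind the grad-output matmul.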
  7. 15 Aug, 2023 4 commits
      [shardformer] rewrite tests for opt/bloom/llama/vit/chatglm (#4395) · 7711bd52
      Baizhou Zhang authored
      * rewrite opt tests
      
      * rewrite llama tests
      
      * rewrite bloom & vit tests
      
      * rewrite chatglm tests
      
* fix LinearCol for classifiers
      
      * add judge for other tp layers, fix lazy init in util
      [shardformer] support inplace sharding (#4251) · d921ce83
      Hongxin Liu authored
      * [shardformer] embedding support inplace sharding
      
      * [shardformer] linear support inplace sharding
      
      * [shardformer] layernorm support inplace sharding
      
      * [shardformer] qkv support inplace sharding
      
      * [test] update shardformer layer test
      
      * [shardformer] fix shared param sharding
      
      * [shardformer] fix bert policy
      
      * [shardformer] fix bloom policy
      
      * [shardformer] fix llama policy
      
      * [shardformer] fix opt policy
      
      * [shardformer] fix t5 policy
      
      * [shardformer] fix fused qkv linear
      
      * [shardformer] fix bugs
      
      * force sync
      
      * [test] fix bugs
      
      * [test] fix transformer version
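The inplace-sharding commits above replace a module's full parameters with local shards rather than rebuilding the model. A toy sketch of the idea, under assumed names (neither the class nor the function below comes from ColossalAI; the real code operates on `torch.nn.Parameter` tensors):

```python
import numpy as np

class Linear:
    """Toy linear module holding a dense weight of shape (out_features, in_features)."""
    def __init__(self, out_features, in_features):
        self.weight = np.zeros((out_features, in_features))

def shard_rows_inplace(module, rank, world_size):
    """Replace module.weight with this rank's row shard, in place.

    The module object itself is reused rather than reconstructed, which is
    the point of inplace sharding: references to the module stay valid and
    no second copy of the model is built.
    """
    rows = module.weight.shape[0]
    assert rows % world_size == 0, "out_features must divide evenly across ranks"
    shard = rows // world_size
    module.weight = module.weight[rank * shard:(rank + 1) * shard]
    return module

m = Linear(8, 4)
shard_rows_inplace(m, rank=1, world_size=2)
assert m.weight.shape == (4, 4)   # rank 1 keeps rows 4..7 only
```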
      [pipeline] Add Pipeline Forward for GPT2Model Shardformer (#4224) · 208ac8f2
      Baizhou Zhang authored
* fix typehint & docstring in sharder.py

* update pipeline forward for GPT2Model

* add test for pipeline forward of GPT2Model

* add cache cleaning in gpt2 test

* change assert to raise command
      [shardformer] support lazy init (#4202) · 890774b2
      Hongxin Liu authored
      * [shardformer] support lazy init
      
      * [shardformer] linear support lazy init
      
      * [shardformer] embedding support lazy init
      
      * [shardformer] norm support lazy init
      
      * [shardformer] fused linear support lazy init
      
      * [test] update shardformer test layer
      
      * [test] shardformer with lazy init fit ddp
      
      * [lazy] hotfix deepcopy of param
      
      * [shardformer] fix bert policy and update test
      
      * [shardformer] fix bloom policy and update test
      
      * [shardformer] fix opt policy and update test
      
      * [shardformer] fix t5 policy and update test
      
      * [shardformer] fix gpt2 policy and update test
      
      * [shardformer] fix llama policy and update test
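The lazy-init commits above defer allocating parameter storage until after the sharding plan is known, so the full weights are never materialized on any single device. A minimal sketch of the pattern in plain Python (hypothetical names; the real implementation hooks into PyTorch tensor construction):

```python
import numpy as np

class LazyParam:
    """Record a parameter's shape but defer allocating its storage."""
    def __init__(self, shape):
        self.shape = shape
        self._data = None   # nothing allocated yet

    def materialize(self, shard_shape=None):
        """Allocate storage on first use, optionally only for a local shard."""
        shape = self.shape if shard_shape is None else shard_shape
        self._data = np.zeros(shape)
        return self._data

# The full (4096, 4096) weight is never allocated; each of 8 ranks
# materializes only its own row shard once the sharding plan is fixed.
p = LazyParam((4096, 4096))
local = p.materialize(shard_shape=(4096 // 8, 4096))
assert local.shape == (512, 4096)
```

This is why the commits pair lazy init with per-layer support (linear, embedding, norm, fused linear): each sharded layer decides the shard shape before any real allocation happens.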
  8. 04 Jul, 2023 5 commits