1. 28 Nov, 2023 1 commit
  2. 20 Oct, 2023 1 commit
  3. 19 Sep, 2023 1 commit
  4. 11 Sep, 2023 1 commit
  5. 09 Sep, 2023 1 commit
    • [shardformer] update llama2/opt finetune example and fix llama2 policy (#4645) · 7486ed7d
      flybird11111 authored
      * [shardformer] update shardformer readme
      
      [shardformer] update shardformer readme
      
      [shardformer] update shardformer readme
      
      * [shardformer] update llama2/opt finetune example and shardformer update to llama2
      
      * [shardformer] update llama2/opt finetune example and shardformer update to llama2
      
      * [shardformer] update llama2/opt finetune example and shardformer update to llama2
      
      * [shardformer] change dataset
      
      * [shardformer] change dataset
      
      * [shardformer] fix CI
      
      * [shardformer] fix
      
      * [shardformer] fix
      
      * [shardformer] fix
      
      * [shardformer] fix
      
      * [shardformer] fix
      
      [example] update opt example
      
      [example] resolve comments
      
      fix
      
      fix
  6. 24 Aug, 2023 1 commit
    • [gemini] improve compatibility and add static placement policy (#4479) · 27061426
      Hongxin Liu authored
      * [gemini] remove distributed-related part from colotensor (#4379)
      
      * [gemini] remove process group dependency
      
      * [gemini] remove tp part from colo tensor
      
      * [gemini] patch inplace op
      
      * [gemini] fix param op hook and update tests
      
      * [test] remove useless tests
      
      * [test] remove useless tests
      
      * [misc] fix requirements
      
      * [test] fix model zoo
      
      * [test] fix model zoo
      
      * [test] fix model zoo
      
      * [test] fix model zoo
      
      * [test] fix model zoo
      
      * [misc] update requirements
      
      * [gemini] refactor gemini optimizer and gemini ddp (#4398)
      
      * [gemini] update optimizer interface
      
      * [gemini] renaming gemini optimizer
      
      * [gemini] refactor gemini ddp class
      
      * [example] update gemini related example
      
      * [example] update gemini related example
      
      * [plugin] fix gemini plugin args
      
      * [test] update gemini ckpt tests
      
      * [gemini] fix checkpoint io
      
      * [example] fix opt example requirements
      
      * [example] fix opt example
      
      * [example] fix opt example
      
      * [example] fix opt example
      
      * [gemini] add static placement policy (#4443)
      
      * [gemini] add static placement policy
      
      * [gemini] fix param offload
      
      * [test] update gemini tests
      
      * [plugin] update gemini plugin
      
      * [plugin] update gemini plugin docstr
      
      * [misc] fix flash attn requirement
      
      * [test] fix gemini checkpoint io test
      
      * [example] update resnet example result (#4457)
      
      * [example] update bert example result (#4458)
      
      * [doc] update gemini doc (#4468)
      
      * [example] update gemini related examples (#4473)
      
      * [example] update gpt example
      
      * [example] update dreambooth example
      
      * [example] update vit
      
      * [example] update opt
      
      * [example] update palm
      
      * [example] update vit and opt benchmark
      
      * [hotfix] fix bert in model zoo (#4480)
      
      * [hotfix] fix bert in model zoo
      
      * [test] remove chatglm gemini test
      
      * [test] remove sam gemini test
      
      * [test] remove vit gemini test
      
      * [hotfix] fix opt tutorial example (#4497)
      
      * [hotfix] fix opt tutorial example
      
      * [hotfix] fix opt tutorial example
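      The static placement policy added above (#4443/#4479) is exposed through the GeminiPlugin. A minimal usage sketch, assuming the post-#4479 placement_policy argument on GeminiPlugin and the standard Booster API (argument names are illustrative and not verified against this exact revision):

          import torch
          import colossalai
          from colossalai.booster import Booster
          from colossalai.booster.plugin import GeminiPlugin
          from colossalai.nn.optimizer import HybridAdam

          colossalai.launch_from_torch(config={})  # assumes torchrun has set the distributed env vars

          # "static" pins parameters to a fixed device split chosen up front,
          # instead of the dynamic ("auto") migration policy.
          plugin = GeminiPlugin(placement_policy="static", precision="fp16")
          booster = Booster(plugin=plugin)

          model = torch.nn.Linear(1024, 1024)                    # any nn.Module
          optimizer = HybridAdam(model.parameters(), lr=1e-4)
          model, optimizer, *_ = booster.boost(model, optimizer)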
  7. 15 Aug, 2023 9 commits
    • [Shardformer] Merge flash attention branch to pipeline branch (#4362) · 906426cb
      flybird1111 authored
      
      
      * [shardformer] supported flash attention test dependency (#4158)
      
      * [shardformer] fix flash attention utils test (#4180)
      
      * [shardformer] opt support flash attention (#4163)
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] move to modeling
      
      * [shardformer] move to modeling
      
      * [shardformer] add performance benchmark of shardformer (#4175)
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] benchmark fix
      
      * [shardformer] benchmark fix
      
      * [shardformer] llama support flash attention (#4185)
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] move to modeling
      
      * [shardformer] move to modeling
      
      * [shardformer] llama support flash attention
      
      * [shardformer] llama support flash attention
      
      * [shardformer] Move the import statement for xformer outside the forward function.
      
      * [shardformer] gpt2 support flash attention. (#4191)
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] move to modeling
      
      * [shardformer] move to modeling
      
      * [shardformer] gpt2 support flash attention
      
      * [shardformer] gpt2 support flash attention
      
      * [shardformer] gpt2 support flash attention
      
      * [shardformer] bloom support flash attention (#4188)
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] move to modeling
      
      * [shardformer] move to modeling
      
      * [shardformer] bloom support flash attention
      
      * [shardformer] add assert to sequence length
      
      * [shardformer] fix
      
      * [shardformer] fix
      
      * [shardformer] fix
      
      * [shardformer] bert support flash attention. (#4206)
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] move to modeling
      
      * [shardformer] move to modeling
      
      * [shardformer] bert support flash attention
      
      * [shardformer] t5 support flash attention. (#4216)
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] move to modeling
      
      * [shardformer] move to modeling
      
      * [shardformer] t5 support flash attention
      
      * [shardformer] t5 support flash attention
      
      * fix typo
      
      * fix typo
      
      * fix typo
      
      * fix typo
      
      * fix typo
      
      * fix typo
      
      * [shardformer] support 'paddedcausal' type of attention mask in ColoAttention. (#4215)
      
      * added padded causal attn mask type for ColoAttention
      
      * [shardformer]t5 flash attention fix (#4239)
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] move to modeling
      
      * [shardformer] move to modeling
      
      * [shardformer] t5 flash attention fix
      
      * [shardformer] update gpt2 to use coloattention. (#4234)
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] move to modeling
      
      * [shardformer] move to modeling
      
      * [shardformer] update gpt2 to use coloattention
      
      * [shardformer] update gpt2 to use coloattention
      
      * [shardformer] update gpt2 to use coloattention
      
      * [shardformer] update gpt2 to use coloattention
      
      * [shardformer] update gpt2
      
      * [shardformer] update opt and llama to use coloattention. (#4226)
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] move to modeling
      
      * [shardformer] move to modeling
      
      * update opt to use coloattention
      
      * [shardformer]update opt to use coloattention
      
      * [shardformer]update opt to use coloattention
      
      * [shardformer]update opt to use coloattention
      
      * [shardformer]update opt to use coloattention
      
      * [shardformer]update opt to use coloattention
      
      * [shardformer]update opt to use coloattention
      
      * [shardformer]update opt
      
      * [shardformer] shardformer support jit fused operator. (#4236)
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] move to modeling
      
      * [shardformer] move to modeling
      
      * [shardformer] bloom support jit fused operator
      
      * [shardformer] bloom support jit fused operator
      
      * [shardformer] bloom support jit fused operator
      
      * [shardformer] t5 support jit fused operator
      
      * [shardformer] t5 support jit fused operator
      
      * [shardformer] t5 support jit fused operator
      
      * [shardformer] add roadmap of flash attention
      
      * [shardformer] add roadmap of flash attention
      
      * [shardformer] add roadmap of flash attention
      
      * [shardformer] add type hint to 'self' param of forward
      
      * [shardformer] merge feature/shardformer-models branch to feature/flash-attention-shardformer branch. (#4290)
      
      * Feature/vit support (#4182)
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * fix attention dropout
      
      * [shardformer] support SAM (#4231)
      
      * 1.support sam 2.add fused qkv for nn.Linear
      
      * update utils support set element in list
      
      * overwrite SamVisionAttention forward to use DropoutForParallelInput
      
      * remove unused code
      
      * [shardformer] support whisper (#4212)
      
      * support whisper
      
      * fix bug in vocabembedding
      
      * support downstream model of whisper
      
      * update readme
      
      * Feature/chatglm (#4240)
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * [shardformer] chatglm ready
      
      * import chatglm
      
      * [shardformer] add test kit in model zoo for chatglm
      
      * [shardformer] add first version of policy of chatglm
      
      * [shardformer] polish chatglm code
      
      * [shardformer] polish code
      
      * [shardformer] support chatglm without layernorm
      
      * [shardformer] chatglm shard without mlp sharding
      
      * [shardformer] delete some file
      
      * [shardformer] ChatGLM support layernorm sharding
      
      * [shardformer] register without auto policy
      
      * [shardformer] pre-commit check files
      
      * [shardformer] fix chatglm configuration with pre-commit
      
      ---------
      Co-authored-by: Kun Lin <81014421+klhhhhh@users.noreply.github.com>
      Co-authored-by: FoolPlayer <45593998+FoolPlayer@users.noreply.github.com>
      
      * [shardformer] whisper support flash attention (#4301)
      
      * Feature/vit support (#4182)
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * fix attention dropout
      
      * [shardformer] support SAM (#4231)
      
      * 1.support sam 2.add fused qkv for nn.Linear
      
      * update utils support set element in list
      
      * overwrite SamVisionAttention forward to use DropoutForParallelInput
      
      * remove unused code
      
      * [shardformer] support whisper (#4212)
      
      * support whisper
      
      * fix bug in vocabembedding
      
      * support downstream model of whisper
      
      * update readme
      
      * Feature/chatglm (#4240)
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * [shardformer] chatglm ready
      
      * import chatglm
      
      * [shardformer] add test kit in model zoo for chatglm
      
      * [shardformer] add first version of policy of chatglm
      
      * [shardformer] polish chatglm code
      
      * [shardformer] polish code
      
      * [shardformer] support chatglm without layernorm
      
      * [shardformer] chatglm shard without mlp sharding
      
      * [shardformer] delete some file
      
      * [shardformer] ChatGLM support layernorm sharding
      
      * [shardformer] register without auto policy
      
      * [shardformer] pre-commit check files
      
      * [shardformer] fix chatglm configuration with pre-commit
      
      * [shardformer] whisper support flash attention
      
      * [shardformer] whisper support flash attention
      
      * [shardformer]whisper support jit operator
      
      ---------
      Co-authored-by: Kun Lin <81014421+klhhhhh@users.noreply.github.com>
      Co-authored-by: FoolPlayer <45593998+FoolPlayer@users.noreply.github.com>
      
      * [shardformer] sam support flash attention (#4316)
      
      * Feature/vit support (#4182)
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * fix attention dropout
      
      * [shardformer] support SAM (#4231)
      
      * 1.support sam 2.add fused qkv for nn.Linear
      
      * update utils support set element in list
      
      * overwrite SamVisionAttention forward to use DropoutForParallelInput
      
      * remove unused code
      
      * [shardformer] support whisper (#4212)
      
      * support whisper
      
      * fix bug in vocabembedding
      
      * support downstream model of whisper
      
      * update readme
      
      * Feature/chatglm (#4240)
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * [shardformer] chatglm ready
      
      * import chatglm
      
      * [shardformer] add test kit in model zoo for chatglm
      
      * [shardformer] add first version of policy of chatglm
      
      * [shardformer] polish chatglm code
      
      * [shardformer] polish code
      
      * [shardformer] support chatglm without layernorm
      
      * [shardformer] chatglm shard without mlp sharding
      
      * [shardformer] delete some file
      
      * [shardformer] ChatGLM support layernorm sharding
      
      * [shardformer] register without auto policy
      
      * [shardformer] pre-commit check files
      
      * [shardformer] fix chatglm configuration with pre-commit
      
      * [shardformer] sam support flash attention
      
      ---------
      Co-authored-by: Kun Lin <81014421+klhhhhh@users.noreply.github.com>
      Co-authored-by: FoolPlayer <45593998+FoolPlayer@users.noreply.github.com>
      
      * [shardformer] merge blip2/chatglm  (#4321)
      
      * Feature/vit support (#4182)
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * fix attention dropout
      
      * [shardformer] support SAM (#4231)
      
      * 1.support sam 2.add fused qkv for nn.Linear
      
      * update utils support set element in list
      
      * overwrite SamVisionAttention forward to use DropoutForParallelInput
      
      * remove unused code
      
      * [shardformer] support whisper (#4212)
      
      * support whisper
      
      * fix bug in vocabembedding
      
      * support downstream model of whisper
      
      * update readme
      
      * Feature/chatglm (#4240)
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * [shardformer] chatglm ready
      
      * import chatglm
      
      * [shardformer] add test kit in model zoo for chatglm
      
      * [shardformer] add first version of policy of chatglm
      
      * [shardformer] polish chatglm code
      
      * [shardformer] polish code
      
      * [shardformer] support chatglm without layernorm
      
      * [shardformer] chatglm shard without mlp sharding
      
      * [shardformer] delete some file
      
      * [shardformer] ChatGLM support layernorm sharding
      
      * [shardformer] register without auto policy
      
      * [shardformer] pre-commit check files
      
      * [shardformer] fix chatglm configuration with pre-commit
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * import chatglm
      
      * [shardformer] add test kit in model zoo for chatglm
      
      * [shardformer] add first version of policy of chatglm
      
      * [shardformer] polish chatglm code
      
      * [shardformer] polish code
      
      * [shardformer] support chatglm without layernorm
      
      * [shardformer] delete some file
      
      * [shardformer] ChatGLM support layernorm sharding
      
      * [shardformer] register without auto policy
      
      * [shardformer] pre-commit check files
      
      * [shardformer] support ChatGLMForConditionalGeneration & add fusedlayernorm for vit
      
      * [shardformer] support Blip2 (#4243)
      
      * support base blip2
      
      * add support for downstream blip2 model
      
      * update readme
      
      * add forward injection
      
      * skip not compatible models test
      
      * fix test for gemini and low_level_zero_plugin
      
      ---------
      Co-authored-by: Kun Lin <81014421+klhhhhh@users.noreply.github.com>
      Co-authored-by: FoolPlayer <45593998+FoolPlayer@users.noreply.github.com>
      Co-authored-by: klhhhhh <1412841649@qq.com>
      
      * [shardformer] blip2 support flash attention and jit operator (#4325)
      
      * Feature/vit support (#4182)
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * fix attention dropout
      
      * [shardformer] support SAM (#4231)
      
      * 1.support sam 2.add fused qkv for nn.Linear
      
      * update utils support set element in list
      
      * overwrite SamVisionAttention forward to use DropoutForParallelInput
      
      * remove unused code
      
      * [shardformer] support whisper (#4212)
      
      * support whisper
      
      * fix bug in vocabembedding
      
      * support downstream model of whisper
      
      * update readme
      
      * Feature/chatglm (#4240)
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * [shardformer] chatglm ready
      
      * import chatglm
      
      * [shardformer] add test kit in model zoo for chatglm
      
      * [shardformer] add first version of policy of chatglm
      
      * [shardformer] polish chatglm code
      
      * [shardformer] polish code
      
      * [shardformer] support chatglm without layernorm
      
      * [shardformer] chatglm shard without mlp sharding
      
      * [shardformer] delete some file
      
      * [shardformer] ChatGLM support layernorm sharding
      
      * [shardformer] register without auto policy
      
      * [shardformer] pre-commit check files
      
      * [shardformer] fix chatglm configuration with pre-commit
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * import chatglm
      
      * [shardformer] add test kit in model zoo for chatglm
      
      * [shardformer] add first version of policy of chatglm
      
      * [shardformer] polish chatglm code
      
      * [shardformer] polish code
      
      * [shardformer] support chatglm without layernorm
      
      * [shardformer] delete some file
      
      * [shardformer] ChatGLM support layernorm sharding
      
      * [shardformer] register without auto policy
      
      * [shardformer] pre-commit check files
      
      * [shardformer] support ChatGLMForConditionalGeneration & add fusedlayernorm for vit
      
      * [shardformer] support Blip2 (#4243)
      
      * support base blip2
      
      * add support for downstream blip2 model
      
      * update readme
      
      * add forward injection
      
      * skip not compatible models test
      
      * fix test for gemini and low_level_zero_plugin
      
      * [shardformer] blip2 support flash attention and jit operator
      
      * [shardformer] blip2 support flash attention and jit operator
      
      * [shardformer] blip2 support flash attention and jit operator
      
      ---------
      Co-authored-by: Kun Lin <81014421+klhhhhh@users.noreply.github.com>
      Co-authored-by: FoolPlayer <45593998+FoolPlayer@users.noreply.github.com>
      Co-authored-by: klhhhhh <1412841649@qq.com>
      
      * [shardformer] chatglm support flash attention and jit operator (#4330)
      
      * Feature/vit support (#4182)
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * fix attention dropout
      
      * [shardformer] support SAM (#4231)
      
      * 1.support sam 2.add fused qkv for nn.Linear
      
      * update utils support set element in list
      
      * overwrite SamVisionAttention forward to use DropoutForParallelInput
      
      * remove unused code
      
      * [shardformer] support whisper (#4212)
      
      * support whisper
      
      * fix bug in vocabembedding
      
      * support downstream model of whisper
      
      * update readme
      
      * Feature/chatglm (#4240)
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * [shardformer] chatglm ready
      
      * import chatglm
      
      * [shardformer] add test kit in model zoo for chatglm
      
      * [shardformer] add first version of policy of chatglm
      
      * [shardformer] polish chatglm code
      
      * [shardformer] polish code
      
      * [shardformer] support chatglm without layernorm
      
      * [shardformer] chatglm shard without mlp sharding
      
      * [shardformer] delete some file
      
      * [shardformer] ChatGLM support layernorm sharding
      
      * [shardformer] register without auto policy
      
      * [shardformer] pre-commit check files
      
      * [shardformer] fix chatglm configuration with pre-commit
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * import chatglm
      
      * [shardformer] add test kit in model zoo for chatglm
      
      * [shardformer] add first version of policy of chatglm
      
      * [shardformer] polish chatglm code
      
      * [shardformer] polish code
      
      * [shardformer] support chatglm without layernorm
      
      * [shardformer] delete some file
      
      * [shardformer] ChatGLM support layernorm sharding
      
      * [shardformer] register without auto policy
      
      * [shardformer] pre-commit check files
      
      * [shardformer] support ChatGLMForConditionalGeneration & add fusedlayernorm for vit
      
      * [shardformer] support Blip2 (#4243)
      
      * support base blip2
      
      * add support for downstream blip2 model
      
      * update readme
      
      * add forward injection
      
      * skip not compatible models test
      
      * fix test for gemini and low_level_zero_plugin
      
      * [shardformer] chatglm support flash attention and jit operator
      
      * [shardformer] chatglm support flash attention and jit operator
      
      * [shardformer] chatglm support flash attention and jit operator
      
      * [shardformer] chatglm support flash attention and jit operator
      
      ---------
      Co-authored-by: Kun Lin <81014421+klhhhhh@users.noreply.github.com>
      Co-authored-by: FoolPlayer <45593998+FoolPlayer@users.noreply.github.com>
      Co-authored-by: klhhhhh <1412841649@qq.com>
      
      * [shardformer] vit support flash attention and jit operator (#4334)
      
      * Feature/vit support (#4182)
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * fix attention dropout
      
      * [shardformer] support SAM (#4231)
      
      * 1.support sam 2.add fused qkv for nn.Linear
      
      * update utils support set element in list
      
      * overwrite SamVisionAttention forward to use DropoutForParallelInput
      
      * remove unused code
      
      * [shardformer] support whisper (#4212)
      
      * support whisper
      
      * fix bug in vocabembedding
      
      * support downstream model of whisper
      
      * update readme
      
      * Feature/chatglm (#4240)
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * [shardformer] chatglm ready
      
      * import chatglm
      
      * [shardformer] add test kit in model zoo for chatglm
      
      * [shardformer] add first version of policy of chatglm
      
      * [shardformer] polish chatglm code
      
      * [shardformer] polish code
      
      * [shardformer] support chatglm without layernorm
      
      * [shardformer] chatglm shard without mlp sharding
      
      * [shardformer] delete some file
      
      * [shardformer] ChatGLM support layernorm sharding
      
      * [shardformer] register without auto policy
      
      * [shardformer] pre-commit check files
      
      * [shardformer] fix chatglm configuration with pre-commit
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * import chatglm
      
      * [shardformer] add test kit in model zoo for chatglm
      
      * [shardformer] add first version of policy of chatglm
      
      * [shardformer] polish chatglm code
      
      * [shardformer] polish code
      
      * [shardformer] support chatglm without layernorm
      
      * [shardformer] delete some file
      
      * [shardformer] ChatGLM support layernorm sharding
      
      * [shardformer] register without auto policy
      
      * [shardformer] pre-commit check files
      
      * [shardformer] support ChatGLMForConditionalGeneration & add fusedlayernorm for vit
      
      * [shardformer] support Blip2 (#4243)
      
      * support base blip2
      
      * add support for downstream blip2 model
      
      * update readme
      
      * add forward injection
      
      * skip not compatible models test
      
      * fix test for gemini and low_level_zero_plugin
      
      * [shardformer] vit support flash attention and jit operator
      
      * [shardformer] vit support flash attention and jit operator
      
      ---------
      Co-authored-by: Kun Lin <81014421+klhhhhh@users.noreply.github.com>
      Co-authored-by: FoolPlayer <45593998+FoolPlayer@users.noreply.github.com>
      Co-authored-by: klhhhhh <1412841649@qq.com>
      
      * [pipeline] merge flash attention branch
      
      * [pipeline] merge flash attention branch
      
      * [pipeline] merge flash attention branch
      
      * [pipeline] fix conflict
      
      * [pipeline] fix conflict
      
      * Merge branch 'feature/pipeline' into feature/pipeline
      
      * Merge branch 'feature/pipeline' into feature/pipeline
      
      * Merge branch 'feature/pipeline' into feature/pipeline
      
      * activate checks
      
      * activate checks
      
      * activate checks
      
      * activate checks
      
      * activate checks
      
      * activate checks
      
      * activate checks
      
      * activate checks
      
      * fix flash attention tests
      
      * gemini ignore whisper
      
      * fix vit
      
      * fix xformers import handle
      
      ---------
      Co-authored-by: Frank Lee <somerlee.9@gmail.com>
      Co-authored-by: Kun Lin <81014421+klhhhhh@users.noreply.github.com>
      Co-authored-by: FoolPlayer <45593998+FoolPlayer@users.noreply.github.com>
      Co-authored-by: klhhhhh <1412841649@qq.com>
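      The flash-attention and JIT-fusion support merged in the commit above is switched on per model through ShardFormer's shard configuration. A minimal sketch, assuming the enable_flash_attention / enable_jit_fused flags this branch describes and the ShardFormer.optimize entry point (names may differ slightly in this revision):

          import torch.distributed as dist
          from transformers import GPT2LMHeadModel

          import colossalai
          from colossalai.shardformer import ShardConfig, ShardFormer

          colossalai.launch_from_torch(config={})  # assumes torchrun has set the distributed env vars

          shard_config = ShardConfig(
              tensor_parallel_process_group=dist.group.WORLD,
              enable_tensor_parallelism=True,
              enable_flash_attention=True,  # swap in the fused attention kernel
              enable_jit_fused=True,        # use JIT-fused bias+dropout/gelu ops where available
          )
          model = GPT2LMHeadModel.from_pretrained("gpt2")
          shard_former = ShardFormer(shard_config=shard_config)
          sharded_model, shared_params = shard_former.optimize(model)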
    • [shardformer] add util functions for shardformer tests/fix sync_shared_param (#4366) · b1feeced
      Baizhou Zhang authored
      * add util functions for shardformer tests & rewrite gpt2 test
      
      * fix shared_params & embedding/merging
      
      * fix precision
    • [test] Hotfix/fix some model test and refactor check util api (#4369) · 5c6f1831
      Bin Jia authored
      * fix llama test
      
      * fix test bug of bert, blip2, bloom, gpt2
      
      * fix llama test
      
      * fix opt test
      
      * fix sam test
      
      * fix sam test
      
      * fix t5 test
      
      * fix vit test
      
      * fix whisper test
      
      * fix whisper test
      
      * polish code
      
      * adjust allclose parameter
      
      * Add mistakenly deleted code
      
      * adjust allclose
      
      * change loss function for some base model
    • [pipeline] support fp32 for HybridPlugin/merge shardformer test and pipeline test into one file (#4354) · 0ceec8f9
      Baizhou Zhang authored
      
      * add naive optimizer for 3DPlugin/refactor gpt2 shardformer test
      
      * merge tests of PP/DP/TP combinations into one test file
      
      * fix bug when sync grad for dp in HybridPlugin
      
      * update supported precisions for 3DPlugin/fix bug when shifting tp_degree
      
      * improve the passing of lazy_init
      
      * modify lazy_init/use sync_shared_params
    • [pipeline] add pipeline support for T5Stack/T5EncoderModel (#4300) · 36e546b2
      Baizhou Zhang authored
      * modify t5 policy & add test
      
      * pipeline stage distribution for t5
      
      * complete t5 base policy
      
      * t5 stack: halfway
      
      * modify gpt2 pipeline test
      
      * complete pipeline forward for T5Stack/T5EncoderModel
      
      * fix docstring
      
      * move t5 util tests to test_pipeline
    • [pipeline] support shardformer for GPT2ForQuestionAnswering & complete pipeline support for GPT2 (#4245) · 2a2eacfa
      Baizhou Zhang authored
      
      * change for transformers loggers
      
      * add forward for GPT2ForQuestionAnswering
      
      * fix assert
      
      * fix torchrec test
    • [pipeline] add bloom model pipeline (#4210) · 37d22f68
      Jianghai authored
      * bloom policy
      
      * llama pipeline forward and tests
      
      * fix the output and attention_mask
      
      * fix name
      
      * bind argument to policy
      
      * finish bloom model
      
      * test shard gpt2
      
      * clear cache
    • [pipeline] Llama causal lm and llama for sequence classification pipeline (#4208) · 31bcf867
      Jianghai authored
      * bloom policy
      
      * llama pipeline forward and tests
      
      * fix the output and attention_mask
      
      * fix name
      
      * bind argument to policy
      
      * Revert "bloom policy"
      
      This reverts commit 8dee68a0a22568dbeed6d4563372b25e1e825fb0.
      
      This policy should be reverted and copied to feature/bloom
      
      * revert the bloom changes
      
      * cancel unneeded inputs
      
      * gpt
      
      * finish llama
      
      * causal lm and sequence classification
      
      * revision
    • [pipeline] Llama pipeline (#4205) · 16220310
      Jianghai authored
      * bloom policy
      
      * llama pipeline forward and tests
      
      * fix the output and attention_mask
      
      * fix name
      
      * bind argument to policy
      
      * Revert "bloom policy"
      
      This reverts commit 8dee68a0a22568dbeed6d4563372b25e1e825fb0.
      
      This policy should be reverted and copied to feature/bloom
      
      * revert the bloom changes
      
      * cancel unneeded inputs
      
      * gpt
  8. 04 Jul, 2023 2 commits
  9. 22 Mar, 2023 1 commit
    • [FX] refactor experimental tracer and adapt it with hf models (#3157) · f57d3495
      YuliangLiu0306 authored
      * pass gpt trace and meta_prop
      
      * pass t5 trace and meta_prop
      
      * [FX] refactor experimental tracer and adapt it with hf models
      
      * pass all mainstream model zoo
      
      * fix CI
      
      * fix CI
      
      * fix CI
      
      * fix CI
      
      * fix CI
      
      * fix CI
      
      * fix CI
      
      * fix CI
      
      * skip tests
      
      * fix CI
      
      * using packaging version
      
      * polish
  10. 17 Mar, 2023 1 commit
  11. 15 Mar, 2023 1 commit