1. 24 Aug, 2023 1 commit
    • [gemini] improve compatibility and add static placement policy (#4479) · 27061426
      Hongxin Liu authored
      * [gemini] remove distributed-related part from colotensor (#4379)
      
      * [gemini] remove process group dependency
      
      * [gemini] remove tp part from colo tensor
      
      * [gemini] patch inplace op
      
      * [gemini] fix param op hook and update tests
      
      * [test] remove useless tests
      
      * [test] remove useless tests
      
      * [misc] fix requirements
      
      * [test] fix model zoo
      
      * [test] fix model zoo
      
      * [test] fix model zoo
      
      * [test] fix model zoo
      
      * [test] fix model zoo
      
      * [misc] update requirements
      
      * [gemini] refactor gemini optimizer and gemini ddp (#4398)
      
      * [gemini] update optimizer interface
      
      * [gemini] renaming gemini optimizer
      
      * [gemini] refactor gemini ddp class
      
      * [example] update gemini related example
      
      * [example] update gemini related example
      
      * [plugin] fix gemini plugin args
      
      * [test] update gemini ckpt tests
      
      * [gemini] fix checkpoint io
      
      * [example] fix opt example requirements
      
      * [example] fix opt example
      
      * [example] fix opt example
      
      * [example] fix opt example
      
      * [gemini] add static placement policy (#4443)
      
      * [gemini] add static placement policy
      
      * [gemini] fix param offload
      
      * [test] update gemini tests
      
      * [plugin] update gemini plugin
      
      * [plugin] update gemini plugin docstr
      
      * [misc] fix flash attn requirement
      
      * [test] fix gemini checkpoint io test
      
      * [example] update resnet example result (#4457)
      
      * [example] update bert example result (#4458)
      
      * [doc] update gemini doc (#4468)
      
      * [example] update gemini related examples (#4473)
      
      * [example] update gpt example
      
      * [example] update dreambooth example
      
      * [example] update vit
      
      * [example] update opt
      
      * [example] update palm
      
      * [example] update vit and opt benchmark
      
      * [hotfix] fix bert in model zoo (#4480)
      
      * [hotfix] fix bert in model zoo
      
      * [test] remove chatglm gemini test
      
      * [test] remove sam gemini test
      
      * [test] remove vit gemini test
      
      * [hotfix] fix opt tutorial example (#4497)
      
      * [hotfix] fix opt tutorial example
      
      * [hotfix] fix opt tutorial example
  2. 15 Aug, 2023 4 commits
    • [Shardformer] Merge flash attention branch to pipeline branch (#4362) · 906426cb
      flybird1111 authored
      
      
      * [shardformer] supported flash attention test dependency (#4158)
      
      * [shardformer] fix flash attention utils test (#4180)
      
      * [shardformer] opt support flash attention (#4163)
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] move to modeling
      
      * [shardformer] move to modeling
      
      * [shardformer] add performance benchmark of shardformer (#4175)
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] benchmark fix
      
      * [shardformer] benchmark fix
      
      * [shardformer] llama support flash attention (#4185)
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] move to modeling
      
      * [shardformer] move to modeling
      
      * [shardformer] llama support flash attention
      
      * [shardformer] llama support flash attention
      
      * [shardformer] Move the import statement for xformer outside the forward function.
      
      * [shardformer] gpt2 support flash attention. (#4191)
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] move to modeling
      
      * [shardformer] move to modeling
      
      * [shardformer] gpt2 support flash attention
      
      * [shardformer] gpt2 support flash attention
      
      * [shardformer] gpt2 support flash attention
      
      * [shardformer] bloom support flash attention (#4188)
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] move to modeling
      
      * [shardformer] move to modeling
      
      * [shardformer] bloom support flash attention
      
      * [shardformer] add assert to sequence length
      
      * [shardformer] fix
      
      * [shardformer] fix
      
      * [shardformer] fix
      
      * [shardformer] bert support flash attention. (#4206)
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] move to modeling
      
      * [shardformer] move to modeling
      
      * [shardformer] bert support flash attention
      
      * [shardformer] t5 support flash attention. (#4216)
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] move to modeling
      
      * [shardformer] move to modeling
      
      * [shardformer] t5 support flash attention
      
      * [shardformer] t5 support flash attention
      
      * fix typo
      
      * fix typo
      
      * fix typo
      
      * fix typo
      
      * fix typo
      
      * fix typo
      
      * [shardformer] support 'paddedcausal'  type of attention mask in Coloattention. (#4215)
      
      * added padded causal attn mask type for ColoAttention
      
      * [shardformer]t5 flash attention fix (#4239)
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] move to modeling
      
      * [shardformer] move to modeling
      
      * [shardformer] t5 flash attention fix
      
      * [shardformer] update gpt2 to use coloattention. (#4234)
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] move to modeling
      
      * [shardformer] move to modeling
      
      * [shardformer] update gpt2 to use coloattention
      
      * [shardformer] update gpt2 to use coloattention
      
      * [shardformer] update gpt2 to use coloattention
      
      * [shardformer] update gpt2 to use coloattention
      
      * [shardformer] update gpt2
      
      * [shardformer] update opt and llama to use coloattention. (#4226)
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] move to modeling
      
      * [shardformer] move to modeling
      
      * update opt to use coloattention
      
      * [shardformer]update opt to use coloattention
      
      * [shardformer]update opt to use coloattention
      
      * [shardformer]update opt to use coloattention
      
      * [shardformer]update opt to use coloattention
      
      * [shardformer]update opt to use coloattention
      
      * [shardformer]update opt to use coloattention
      
      * [shardformer]update opt
      
      * [shardformer] shardformer support jit fused operator. (#4236)
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] opt support flash attention
      
      * [shardformer] move to modeling
      
      * [shardformer] move to modeling
      
      * [shardformer] bloom support jit fused operator
      
      * [shardformer] bloom support jit fused operator
      
      * [shardformer] bloom support jit fused operator
      
      * [shardformer] t5 support jit fused operator
      
      * [shardformer] t5 support jit fused operator
      
      * [shardformer] t5 support jit fused operator
      
      * [shardformer] add roadmap of flash attention
      
      * [shardformer] add roadmap of flash attention
      
      * [shardformer] add roadmap of flash attention
      
      * [shardformer] add type hint to 'self' param of forward
      
      * [shardformer] merge feature/shardformer-models branch to feature/flash-attention-shardformer branch. (#4290)
      
      * Feature/vit support (#4182)
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * fix attention dropout
      
      * [shardformer] support SAM (#4231)
      
      * 1.support sam 2.add fused qkv for nn.Linear
      
      * update utils support set element in list
      
      * overwrite SamVisionAttention forward to use DropoutForParallelInput
      
      * remove unused code
      
      * [shardformer] support whisper (#4212)
      
      * support whisper
      
      * fix bug in vocabembedding
      
      * support downstream model of whisper
      
      * update readme
      
      * Feature/chatglm (#4240)
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * [shardformer] chatglm ready
      
      * import chatglm
      
      * [shardformer] add test kit in model zoo for chatglm
      
      * [shardformer] add first version of policy of chatglm
      
      * [shardformer] polish chatglm code
      
      * [shardformer] polish code
      
      * [shardformer] support chatglm without layernorm
      
      * [shardformer] chatglm shard without mlp sharding
      
      * [shardformer] delete some file
      
      * [shardformer] ChatGLM support layernorm sharding
      
      * [shardformer] register without auto policy
      
      * [shardformer] pre-commit check files
      
      * [shardformer] fix chatglm configuration with pre-commit
      
      ---------
      Co-authored-by: Kun Lin <81014421+klhhhhh@users.noreply.github.com>
      Co-authored-by: FoolPlayer <45593998+FoolPlayer@users.noreply.github.com>
      
      * [shardformer] whisper support flash attention (#4301)
      
      * Feature/vit support (#4182)
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * fix attention dropout
      
      * [shardformer] support SAM (#4231)
      
      * 1.support sam 2.add fused qkv for nn.Linear
      
      * update utils support set element in list
      
      * overwrite SamVisionAttention forward to use DropoutForParallelInput
      
      * remove unused code
      
      * [shardformer] support whisper (#4212)
      
      * support whisper
      
      * fix bug in vocabembedding
      
      * support downstream model of whisper
      
      * update readme
      
      * Feature/chatglm (#4240)
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * [shardformer] chatglm ready
      
      * import chatglm
      
      * [shardformer] add test kit in model zoo for chatglm
      
      * [shardformer] add first version of policy of chatglm
      
      * [shardformer] polish chatglm code
      
      * [shardformer] polish code
      
      * [shardformer] support chatglm without layernorm
      
      * [shardformer] chatglm shard without mlp sharding
      
      * [shardformer] delete some file
      
      * [shardformer] ChatGLM support layernorm sharding
      
      * [shardformer] register without auto policy
      
      * [shardformer] pre-commit check files
      
      * [shardformer] fix chatglm configuration with pre-commit
      
      * [shardformer] whisper support flash attention
      
      * [shardformer] whisper support flash attention
      
      * [shardformer]whisper support jit operator
      
      ---------
      Co-authored-by: Kun Lin <81014421+klhhhhh@users.noreply.github.com>
      Co-authored-by: FoolPlayer <45593998+FoolPlayer@users.noreply.github.com>
      
      * [shardformer] sam support flash attention (#4316)
      
      * Feature/vit support (#4182)
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * fix attention dropout
      
      * [shardformer] support SAM (#4231)
      
      * 1.support sam 2.add fused qkv for nn.Linear
      
      * update utils support set element in list
      
      * overwrite SamVisionAttention forward to use DropoutForParallelInput
      
      * remove unused code
      
      * [shardformer] support whisper (#4212)
      
      * support whisper
      
      * fix bug in vocabembedding
      
      * support downstream model of whisper
      
      * update readme
      
      * Feature/chatglm (#4240)
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * [shardformer] chatglm ready
      
      * import chatglm
      
      * [shardformer] add test kit in model zoo for chatglm
      
      * [shardformer] add first version of policy of chatglm
      
      * [shardformer] polish chatglm code
      
      * [shardformer] polish code
      
      * [shardformer] support chatglm without layernorm
      
      * [shardformer] chatglm shard without mlp sharding
      
      * [shardformer] delete some file
      
      * [shardformer] ChatGLM support layernorm sharding
      
      * [shardformer] register without auto policy
      
      * [shardformer] pre-commit check files
      
      * [shardformer] fix chatglm configuration with pre-commit
      
      * [shardformer] sam support flash attention
      
      ---------
      Co-authored-by: Kun Lin <81014421+klhhhhh@users.noreply.github.com>
      Co-authored-by: FoolPlayer <45593998+FoolPlayer@users.noreply.github.com>
      
      * [shardformer] merge blip2/chatglm  (#4321)
      
      * Feature/vit support (#4182)
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * fix attention dropout
      
      * [shardformer] support SAM (#4231)
      
      * 1.support sam 2.add fused qkv for nn.Linear
      
      * update utils support set element in list
      
      * overwrite SamVisionAttention forward to use DropoutForParallelInput
      
      * remove unused code
      
      * [shardformer] support whisper (#4212)
      
      * support whisper
      
      * fix bug in vocabembedding
      
      * support downstream model of whisper
      
      * update readme
      
      * Feature/chatglm (#4240)
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * [shardformer] chatglm ready
      
      * import chatglm
      
      * [shardformer] add test kit in model zoo for chatglm
      
      * [shardformer] add first version of policy of chatglm
      
      * [shardformer] polish chatglm code
      
      * [shardformer] polish code
      
      * [shardformer] support chatglm without layernorm
      
      * [shardformer] chatglm shard without mlp sharding
      
      * [shardformer] delete some file
      
      * [shardformer] ChatGLM support layernorm sharding
      
      * [shardformer] register without auto policy
      
      * [shardformer] pre-commit check files
      
      * [shardformer] fix chatglm configuration with pre-commit
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * import chatglm
      
      * [shardformer] add test kit in model zoo for chatglm
      
      * [shardformer] add first version of policy of chatglm
      
      * [shardformer] polish chatglm code
      
      * [shardformer] polish code
      
      * [shardformer] support chatglm without layernorm
      
      * [shardformer] delete some file
      
      * [shardformer] ChatGLM support layernorm sharding
      
      * [shardformer] register without auto policy
      
      * [shardformer] pre-commit check files
      
      * [shardformer] support ChatGLMForConditionalGeneration & add fusedlayernorm for vit
      
      * [shardformer] support Blip2 (#4243)
      
      * support base blip2
      
      * add support for downstream blip2 model
      
      * update readme
      
      * add forward injection
      
      * skip not compatible models test
      
      * fix test for gemini and low_level_zero_pugin
      
      ---------
      Co-authored-by: Kun Lin <81014421+klhhhhh@users.noreply.github.com>
      Co-authored-by: FoolPlayer <45593998+FoolPlayer@users.noreply.github.com>
      Co-authored-by: klhhhhh <1412841649@qq.com>
      
      * [shardformer] blip2 support flash attention and jit operator (#4325)
      
      * Feature/vit support (#4182)
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * fix attention dropout
      
      * [shardformer] support SAM (#4231)
      
      * 1.support sam 2.add fused qkv for nn.Linear
      
      * update utils support set element in list
      
      * overwrite SamVisionAttention forward to use DropoutForParallelInput
      
      * remove unused code
      
      * [shardformer] support whisper (#4212)
      
      * support whisper
      
      * fix bug in vocabembedding
      
      * support downstream model of whisper
      
      * update readme
      
      * Feature/chatglm (#4240)
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * [shardformer] chatglm ready
      
      * import chatglm
      
      * [shardformer] add test kit in model zoo for chatglm
      
      * [shardformer] add first version of policy of chatglm
      
      * [shardformer] polish chatglm code
      
      * [shardformer] polish code
      
      * [shardformer] support chatglm without layernorm
      
      * [shardformer] chatglm shard without mlp sharding
      
      * [shardformer] delete some file
      
      * [shardformer] ChatGLM support layernorm sharding
      
      * [shardformer] register without auto policy
      
      * [shardformer] pre-commit check files
      
      * [shardformer] fix chatglm configuration with pre-commit
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * import chatglm
      
      * [shardformer] add test kit in model zoo for chatglm
      
      * [shardformer] add first version of policy of chatglm
      
      * [shardformer] polish chatglm code
      
      * [shardformer] polish code
      
      * [shardformer] support chatglm without layernorm
      
      * [shardformer] delete some file
      
      * [shardformer] ChatGLM support layernorm sharding
      
      * [shardformer] register without auto policy
      
      * [shardformer] pre-commit check files
      
      * [shardformer] support ChatGLMForConditionalGeneration & add fusedlayernorm for vit
      
      * [shardformer] support Blip2 (#4243)
      
      * support base blip2
      
      * add support for downstream blip2 model
      
      * update readme
      
      * add forward injection
      
      * skip not compatible models test
      
      * fix test for gemini and low_level_zero_pugin
      
      * [shardformer] blip2 support flash attention and jit operator
      
      * [shardformer] blip2 support flash attention and jit operator
      
      * [shardformer] blip2 support flash attention and jit operator
      
      ---------
      Co-authored-by: Kun Lin <81014421+klhhhhh@users.noreply.github.com>
      Co-authored-by: FoolPlayer <45593998+FoolPlayer@users.noreply.github.com>
      Co-authored-by: klhhhhh <1412841649@qq.com>
      
      * [shardformer] chatglm support flash attention and jit operator (#4330)
      
      * Feature/vit support (#4182)
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * fix attention dropout
      
      * [shardformer] support SAM (#4231)
      
      * 1.support sam 2.add fused qkv for nn.Linear
      
      * update utils support set element in list
      
      * overwrite SamVisionAttention forward to use DropoutForParallelInput
      
      * remove unused code
      
      * [shardformer] support whisper (#4212)
      
      * support whisper
      
      * fix bug in vocabembedding
      
      * support downstream model of whisper
      
      * update readme
      
      * Feature/chatglm (#4240)
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * [shardformer] chatglm ready
      
      * import chatglm
      
      * [shardformer] add test kit in model zoo for chatglm
      
      * [shardformer] add first version of policy of chatglm
      
      * [shardformer] polish chatglm code
      
      * [shardformer] polish code
      
      * [shardformer] support chatglm without layernorm
      
      * [shardformer] chatglm shard without mlp sharding
      
      * [shardformer] delete some file
      
      * [shardformer] ChatGLM support layernorm sharding
      
      * [shardformer] register without auto policy
      
      * [shardformer] pre-commit check files
      
      * [shardformer] fix chatglm configuration with pre-commit
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * import chatglm
      
      * [shardformer] add test kit in model zoo for chatglm
      
      * [shardformer] add first version of policy of chatglm
      
      * [shardformer] polish chatglm code
      
      * [shardformer] polish code
      
      * [shardformer] support chatglm without layernorm
      
      * [shardformer] delete some file
      
      * [shardformer] ChatGLM support layernorm sharding
      
      * [shardformer] register without auto policy
      
      * [shardformer] pre-commit check files
      
      * [shardformer] support ChatGLMForConditionalGeneration & add fusedlayernorm for vit
      
      * [shardformer] support Blip2 (#4243)
      
      * support base blip2
      
      * add support for downstream blip2 model
      
      * update readme
      
      * add forward injection
      
      * skip not compatible models test
      
      * fix test for gemini and low_level_zero_pugin
      
      * [shardformer] chatglm support flash attention and jit operator
      
      * [shardformer] chatglm support flash attention and jit operator
      
      * [shardformer] chatglm support flash attention and jit operator
      
      * [shardformer] chatglm support flash attention and jit operator
      
      ---------
      Co-authored-by: Kun Lin <81014421+klhhhhh@users.noreply.github.com>
      Co-authored-by: FoolPlayer <45593998+FoolPlayer@users.noreply.github.com>
      Co-authored-by: klhhhhh <1412841649@qq.com>
      
      * [shardformer] vit support flash attention and jit operator (#4334)
      
      * Feature/vit support (#4182)
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * fix attention dropout
      
      * [shardformer] support SAM (#4231)
      
      * 1.support sam 2.add fused qkv for nn.Linear
      
      * update utils support set element in list
      
      * overwrite SamVisionAttention forward to use DropoutForParallelInput
      
      * remove unused code
      
      * [shardformer] support whisper (#4212)
      
      * support whisper
      
      * fix bug in vocabembedding
      
      * support downstream model of whisper
      
      * update readme
      
      * Feature/chatglm (#4240)
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * [shardformer] chatglm ready
      
      * import chatglm
      
      * [shardformer] add test kit in model zoo for chatglm
      
      * [shardformer] add first version of policy of chatglm
      
      * [shardformer] polish chatglm code
      
      * [shardformer] polish code
      
      * [shardformer] support chatglm without layernorm
      
      * [shardformer] chatglm shard without mlp sharding
      
      * [shardformer] delete some file
      
      * [shardformer] ChatGLM support layernorm sharding
      
      * [shardformer] register without auto policy
      
      * [shardformer] pre-commit check files
      
      * [shardformer] fix chatglm configuration with pre-commit
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * import chatglm
      
      * [shardformer] add test kit in model zoo for chatglm
      
      * [shardformer] add first version of policy of chatglm
      
      * [shardformer] polish chatglm code
      
      * [shardformer] polish code
      
      * [shardformer] support chatglm without layernorm
      
      * [shardformer] delete some file
      
      * [shardformer] ChatGLM support layernorm sharding
      
      * [shardformer] register without auto policy
      
      * [shardformer] pre-commit check files
      
      * [shardformer] support ChatGLMForConditionalGeneration & add fusedlayernorm for vit
      
      * [shardformer] support Blip2 (#4243)
      
      * support base blip2
      
      * add support for downstream blip2 model
      
      * update readme
      
      * add forward injection
      
      * skip not compatible models test
      
      * fix test for gemini and low_level_zero_pugin
      
      * [shardformer] vit support flash attention and jit operator
      
      * [shardformer] vit support flash attention and jit operator
      
      ---------
      Co-authored-by: Kun Lin <81014421+klhhhhh@users.noreply.github.com>
      Co-authored-by: FoolPlayer <45593998+FoolPlayer@users.noreply.github.com>
      Co-authored-by: klhhhhh <1412841649@qq.com>
      
      * [pipeline] merge flash attention branch
      
      * [pipeline] merge flash attention branch
      
      * [pipeline] merge flash attention branch
      
      * [pipeline] fix conflict
      
      * [pipeline] fix conflict
      
      * Merge branch 'feature/pipeline' into feature/pipeline
      
      * Merge branch 'feature/pipeline' into feature/pipeline
      
      * Merge branch 'feature/pipeline' into feature/pipeline
      
      * activate checks
      
      * activate checks
      
      * activate checks
      
      * activate checks
      
      * activate checks
      
      * activate checks
      
      * activate checks
      
      * activate checks
      
      * fix flash attention tests
      
      * gemini ignore whisper
      
      * fix vit
      
      * fix xformers import handle
      
      ---------
      Co-authored-by: Frank Lee <somerlee.9@gmail.com>
      Co-authored-by: Kun Lin <81014421+klhhhhh@users.noreply.github.com>
      Co-authored-by: FoolPlayer <45593998+FoolPlayer@users.noreply.github.com>
      Co-authored-by: klhhhhh <1412841649@qq.com>
    • [test] skip some not compatible models · c3ca53cf
      FoolPlayer authored
    • [hotfix] fix gemini and zero test (#4333) · 411cf1d2
      Hongxin Liu authored
      * [hotfix] fix gemini and zero test
      
      * [hotfix] fix lazy init test
      
      * [hotfix] fix lazy init test
    • [plugin] add 3d parallel plugin (#4295) · 261eab02
      Hongxin Liu authored
      * [amp] add mixed precision optimizer
      
      * [plugin] add 3d parallel plugin
      
      * [booster] support pipeline
      
      * [plugin] 3d parallel plugin support clip grad norm
      
      * [shardformer] fix sharder and add plugin test
      
      * [plugin] rename 3d parallel plugin
      
      * [ci] support testmon core pkg change detection (#4305)
      
      * [hotfix] debug testmon
      
      * [hotfix] fix llama
      
      * [hotfix] fix p2p bugs
      
      * [hotfix] fix requirements
  3. 31 Jul, 2023 1 commit
    • [zero] refactor low level zero for shard evenly (#4030) · c6ab9698
      LuGY authored
      * refactor low level zero
      
      * fix zero2 and support cpu offload
      
      * avg gradient and modify unit test
      
      * refactor grad store, support layer drop
      
      * refactor bucket store, support grad accumulation
      
      * fix and update unit test of zero and ddp
      
      * compatible with tp, ga and unit test
      
      * fix memory leak and polish
      
      * add zero layer drop unittest
      
      * polish code
      
      * fix import err in unit test
      
      * support different comm dtype, modify docstring style
      
      * polish code
      
      * test padding and fix
      
      * fix unit test of low level zero
      
      * fix pad recording in bucket store
      
      * support some models
      
      * polish
  4. 04 Jul, 2023 1 commit
  5. 05 Jun, 2023 1 commit
    • [lazy] refactor lazy init (#3891) · dbb32692
      Hongxin Liu authored
      * [lazy] remove old lazy init
      
      * [lazy] refactor lazy init folder structure
      
      * [lazy] fix lazy tensor deepcopy
      
      * [test] update lazy init test
  6. 23 May, 2023 1 commit
  7. 18 May, 2023 1 commit
    • [plugin] torch ddp plugin supports sharded model checkpoint (#3775) · 5452df63
      Hongxin Liu authored
      * [plugin] torch ddp plugin add save sharded model
      
      * [test] fix torch ddp ckpt io test
      
      * [test] fix torch ddp ckpt io test
      
      * [test] fix low level zero plugin test
      
      * [test] fix low level zero plugin test
      
      * [test] add debug info
      
      * [test] add debug info
      
      * [test] add debug info
      
      * [test] add debug info
      
      * [test] add debug info
      
      * [test] fix low level zero plugin test
      
      * [test] fix low level zero plugin test
      
      * [test] remove debug info
  8. 15 May, 2023 3 commits
  9. 11 May, 2023 1 commit
    • [CI] fix typo with tests/ etc. (#3727) · 1f73609a
      digger-yu authored
      * fix spelling error with examples/comminity/
      
      * fix spelling error with tests/
      
      * fix some spelling error with tests/ colossalai/ etc.
      
      * fix spelling error with tests/ etc. date:2023.5.10
  10. 09 May, 2023 1 commit
    • [booster] fix no_sync method (#3709) · 6552cbf8
      Hongxin Liu authored
      * [booster] fix no_sync method
      
      * [booster] add test for ddp no_sync
      
      * [booster] fix merge
      
      * [booster] update unit test
      
      * [booster] update unit test
      
      * [booster] update unit test
  11. 08 May, 2023 1 commit
  12. 05 May, 2023 1 commit
  13. 26 Apr, 2023 1 commit
    • [booster] add low level zero plugin (#3594) · 4b3240cb
      Hongxin Liu authored
      * [booster] add low level zero plugin
      
      * [booster] fix gemini plugin test
      
      * [booster] fix precision
      
      * [booster] add low level zero plugin test
      
      * [test] fix booster plugin test oom
      
      * [test] fix booster plugin test oom
      
      * [test] fix googlenet and inception output trans
      
      * [test] fix diffuser clip vision model
      
      * [test] fix torchaudio_wav2vec2_base
      
      * [test] fix low level zero plugin test
  14. 12 Apr, 2023 1 commit
    • [gemini] gemini supports lazy init (#3379) · 152239bb
      Hongxin Liu authored
      * [gemini] fix nvme optimizer init
      
      * [gemini] gemini supports lazy init
      
      * [gemini] add init example
      
      * [gemini] add fool model
      
      * [zero] update gemini ddp
      
      * [zero] update init example
      
      * add chunk method
      
      * add chunk method
      
      * [lazyinit] fix lazy tensor tolist
      
      * [gemini] fix buffer materialization
      
      * [misc] remove useless file
      
      * [booster] update gemini plugin
      
      * [test] update gemini plugin test
      
      * [test] fix gemini plugin test
      
      * [gemini] fix import
      
      * [gemini] fix import
      
      * [lazyinit] use new metatensor
      
      * [lazyinit] use new metatensor
      
      * [lazyinit] fix __set__ method
  15. 06 Apr, 2023 1 commit
  16. 04 Apr, 2023 1 commit
  17. 03 Apr, 2023 1 commit
  18. 31 Mar, 2023 1 commit
    • [booster] implement Gemini plugin (#3352) · 5f2e34e6
      ver217 authored
      * [booster] add gemini plugin
      
      * [booster] update docstr
      
      * [booster] gemini plugin add coloparam convertor
      
      * [booster] fix coloparam convertor
      
      * [booster] fix gemini plugin device
      
      * [booster] add gemini plugin test
      
      * [booster] gemini plugin ignore sync bn
      
      * [booster] skip some model
      
      * [booster] skip some model
      
      * [booster] modify test world size
      
      * [booster] modify test world size
      
      * [booster] skip test
  19. 27 Mar, 2023 1 commit
  20. 21 Mar, 2023 1 commit
  21. 20 Mar, 2023 2 commits
  22. 17 Mar, 2023 1 commit