15 Aug, 2023 · 40 commits
    • [shardformer] polish code · cbb54d32
      klhhhhh authored
    • [shardformer] polish chatglm code · 1a29e8fc
      klhhhhh authored
    • 8620009d
    • 6ee4c9ee
    • import chatglm · 7377be7a
      klhhhhh authored
    • [shardformer] vit test finish and support · c4928698
      klhhhhh authored
    • [shardformer] added tests · f60162b2
      klhhhhh authored
    • Feature/chatglm (#4240) · ed34bb13
      Kun Lin authored
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * [shardformer] chatglm ready
      
      * import chatglm
      
      * [shardformer] add test kit in model zoo for chatglm
      
      * [shardformer] add first version of policy of chatglm
      
      * [shardformer] polish chatglm code
      
      * [shardformer] polish code
      
      * [shardformer] support chatglm without layernorm
      
      * [shardformer] chatglm shard without mlp sharding
      
      * [shardformer] delete some file
      
      * [shardformer] ChatGLM support layernorm sharding
      
      * [shardformer] register without auto policy
      
      * [shardformer] pre-commit check files
      
      * [shardformer] fix chatglm configuration with pre-commit
    • [shardformer] support whisper (#4212) · 9ee4ebea
      FoolPlayer authored
      * support whisper
      
      * fix bug in vocabembedding
      
      * support downstream model of whisper
      
      * update readme
    • [shardformer] support SAM (#4231) · dd2bf026
      FoolPlayer authored
      * 1. support SAM 2. add fused qkv for nn.Linear
      
      * update utils support set element in list
      
      * overwrite SamVisionAttention forward to use DropoutForParallelInput
      
      * remove unused code
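The "add fused qkv for nn.Linear" item above refers to stacking the Q, K and V projection weights into one matrix so a single matmul produces all three. When such a fused weight is column-sharded for tensor parallelism, each of the three blocks has to be split per rank separately and re-stacked. A minimal plain-Python sketch of that splitting logic (the function name and even-split strategy are assumptions, not ColossalAI's API):

```python
# Hypothetical sketch: a fused QKV weight stores q/k/v blocks stacked along
# the output dimension; sharding must split each block across ranks and
# re-stack so every rank keeps a contiguous [q_r; k_r; v_r] slice.

def shard_fused_qkv_rows(fused_rows, n_ranks):
    """Split each of the q/k/v row blocks of a fused weight across ranks."""
    block = len(fused_rows) // 3
    q = fused_rows[0:block]
    k = fused_rows[block:2 * block]
    v = fused_rows[2 * block:3 * block]
    per = block // n_ranks
    return [
        q[r * per:(r + 1) * per]
        + k[r * per:(r + 1) * per]
        + v[r * per:(r + 1) * per]
        for r in range(n_ranks)
    ]

# 12 weight rows = 4 per projection, sharded over 2 ranks
shards = shard_fused_qkv_rows(list(range(12)), 2)
```

Note that naively slicing the fused matrix in half would hand rank 0 all of Q plus half of K, which is why the per-block split matters.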
    • Feature/vit support (#4182) · c59d7aca
      Kun Lin authored
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * fix attention dropout
    • [pipeline] support fp32 for HybridPlugin/merge shardformer test and pipeline test into one file (#4354) · 0ceec8f9
      Baizhou Zhang authored
      
      * add naive optimizer for 3DPlugin/refactor gpt2 shardformer test
      
      * merge tests of PP/DP/TP combinations into one test file
      
      * fix bug when sync grad for dp in HybridPlugin
      
      * update supported precisions for 3DPlugin/fix bug when shifting tp_degree
      
      * improve the passing of lazy_init
      
      * modify lazy_init/use sync_shared_params
    • [pipeline] refactor test pipeline and remove useless utils in pipeline (#4324) · f13954cd
      Jianghai authored
      * refactor tests
      
      * refactor bloom model
      
      * finish policy tests
      
      * refactor tests
      
      * fix test pure pipeline
      
      * remove test pipeline and cutdown launch process
      
    • [pipeline] add unit test for 1f1b (#4303) · d3c6cd66
      LuGY authored
      * add unit test for 1f1b
      
      * polish code
      
      * polish code and update ut version
      
      * fix
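The 1F1B schedule that the unit test above exercises interleaves forward and backward passes per pipeline stage: a warm-up of forwards, a steady phase alternating one forward with one backward, then a cool-down draining the remaining backwards. A toy sketch of the step sequence (this function is an illustration, not the library's scheduler):

```python
# Hypothetical sketch of the 1F1B step sequence for a single pipeline stage.
# Earlier stages warm up with more forwards; the last stage alternates
# immediately. Total F and B steps both equal the number of microbatches.

def one_f_one_b_steps(num_microbatches, num_stages, stage_id):
    warmup = min(num_stages - stage_id - 1, num_microbatches)
    steady = num_microbatches - warmup
    steps = ['F'] * warmup          # warm-up: forwards only
    for _ in range(steady):
        steps += ['F', 'B']         # steady phase: one forward, one backward
    steps += ['B'] * warmup         # cool-down: drain pending backwards
    return steps
```

For example, the last of four stages with four microbatches alternates `F, B` from the start, while stage 0 runs three warm-up forwards first; either way each stage executes exactly one forward and one backward per microbatch.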
    • [hotfix] fix gemini and zero test (#4333) · 411cf1d2
      Hongxin Liu authored
      * [hotfix] fix gemini and zero test
      
      * [hotfix] fix lazy init test
      
      * [hotfix] fix lazy init test
    • [plugin] add 3d parallel plugin (#4295) · 261eab02
      Hongxin Liu authored
      * [amp] add mixed precision optimizer
      
      * [plugin] add 3d parallel plugin
      
      * [booster] support pipeline
      
      * [plugin] 3d parallel plugin support clip grad norm
      
      * [shardformer] fix sharder and add plugin test
      
      * [plugin] rename 3d parallel plugin
      
      * [ci] support testmon core pkg change detection (#4305)
      
      * [hotfix] debug testmon
      
      * [hotfix] fix llama
      
      * [hotfix] fix p2p bugs
      
      * [hotfix] fix requirements
    • [shardformer] support pipeline base vit model (#4284) · b3f5d7a3
      FoolPlayer authored
      * Feature/vit support (#4182)
      
      * [shardformer] added tests
      
      * [shardformer] vit test finish and support
      
      * fix attention dropout
      
      * support base vit pipeline
      
      * support vit downstream model
      
      * fix vit shard test
      
      * modify hidden states return type
      
      ---------
      Co-authored-by: Kun Lin <81014421+klhhhhh@users.noreply.github.com>
    • [pipeline] add pipeline support for all T5 models (#4310) · 083d7da3
      Baizhou Zhang authored
      * complete policy for T5Model & T5ForConditionalGeneration
      
      * modify function signature in forwards
      
      * add forward for T5model
      
      * add forward for T5ForConditionalGeneration
      
      * fix a bug
      
      * fix hidden_states transporting in decoder
      
      * fix the passing of encoder_outputs
    • [pipeline] test pure pipeline process using llama (#4218) · d0807122
      Jianghai authored
      * bloom policy
      
      * llama pipeline forward and tests
      
      * fix the output and attention_mask
      
      * fix name
      
      * bind argument to policy
      
      * Revert "bloom policy"
      
      This reverts commit 8dee68a0a22568dbeed6d4563372b25e1e825fb0.
      
      This policy should be reverted and copied to feature/bloom
      
      * revert the bloom changes
      
      * cancel unneeded inputs
      
      * gpt
      
      * finish llama
      
      * causal lm and sequence classification
      
      * revision
      
      * add pure pipeline test
      
      * fixed version
      
      * fixed version
      
      * pure pipeline
    • [pipeline] add pipeline support for T5Stack/T5EncoderModel (#4300) · 36e546b2
      Baizhou Zhang authored
      * modify t5 policy & add test
      
      * pipeline stage distribution for t5
      
      * complete t5 base policy
      
      * t5 stack: halfway
      
      * modify gpt2 pipeline test
      
      * complete pipeline forward for T5Stack/T5EncoderModel
      
      * fix docstring
      
      * move t5 util tests to test_pipeline
    • [pipeline] reformat for unified design (#4283) · 18ebcf40
      Jianghai authored
      * bert_reformat
      
      * reformat
      
      * reformat
      
      * fix a typo
      
      * format
      
      * format
      
      * fix bug
    • [hotfix] fix opt pipeline (#4293) · 0a8f3c85
      Jianghai authored
      * opt forward and test
      
      * pause
      
      * finish opt model pipeline
      
      * finish opt pipeline
      
      * fix opt
      
      * set transformers version
      
      * refactor the test pipeline
      
      * fix bug
    • [pipeline] OPT model pipeline (#4258) · d8408d18
      Jianghai authored
      * opt forward and test
      
      * pause
      
      * finish opt model pipeline
      
      * finish opt pipeline
      
      * fix opt
      
      * set transformers version
      
      * refactor the test pipeline
    • [pipeline] refactor gpt2 pipeline forwards (#4287) · b774d5ea
      Baizhou Zhang authored
      * move gpt2 pipeline forwards to modeling folder
      
      * check pipeline status when adding replacing policy
      
      * fix typehint
      
      * fix arguments processing in gpt2_model_forward
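The pipeline forwards being moved into the modeling folder above all follow one pattern: only the first stage consumes the real model inputs, later stages consume the previous stage's hidden states, and every non-final stage returns a dict of intermediate outputs for the next stage. A minimal stand-in over plain values (names and signature are illustrative assumptions, not the repository's functions):

```python
# Sketch of the stage-wise pipeline forward pattern: first stage takes real
# inputs, intermediate stages take {'hidden_states': ...} from the previous
# stage, and non-final stages emit the same dict for the next stage.

def stage_forward(layers, inputs, is_first_stage, is_last_stage):
    hidden = inputs if is_first_stage else inputs['hidden_states']
    for layer in layers:
        hidden = layer(hidden)
    return hidden if is_last_stage else {'hidden_states': hidden}

# two-stage toy pipeline over integers instead of tensors
mid = stage_forward([lambda x: x + 1, lambda x: x * 2], 3, True, False)
out = stage_forward([lambda x: x - 1], mid, False, True)
```

The dict wrapper is what lets the scheduler pass one stage's output straight in as the next stage's keyword arguments.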
    • [shardformer] support inplace sharding (#4251) · d921ce83
      Hongxin Liu authored
      * [shardformer] embedding support inplace sharding
      
      * [shardformer] linear support inplace sharding
      
      * [shardformer] layernorm support inplace sharding
      
      * [shardformer] qkv support inplace sharding
      
      * [test] update shardformer layer test
      
      * [shardformer] fix shared param sharding
      
      * [shardformer] fix bert policy
      
      * [shardformer] fix bloom policy
      
      * [shardformer] fix llama policy
      
      * [shardformer] fix opt policy
      
      * [shardformer] fix t5 policy
      
      * [shardformer] fix fused qkv linear
      
      * [shardformer] fix bugs
      
      * force sync
      
      * [test] fix bugs
      
      * [test] fix transformer version
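Why "inplace" sharding matters for the shared-param and policy fixes listed above: if a parameter's storage is replaced on the same object instead of a new module being constructed, every module that shares that parameter (for example a tied embedding and LM head) automatically sees the sharded version and stays tied. A plain-Python stand-in (the `Param` class and function name here are assumptions, not the actual implementation):

```python
# Sketch: shard a parameter along dim 0 *in place* by keeping the Param
# object and replacing only its .data, so shared references remain shared.

class Param:
    def __init__(self, data):
        self.data = data

def shard_dim0_inplace(param, rank, world_size):
    """Keep the Param object; swap .data for this rank's contiguous slice."""
    n = len(param.data) // world_size
    param.data = param.data[rank * n:(rank + 1) * n]

embedding_weight = Param(list(range(8)))
lm_head_weight = embedding_weight   # tied weights: one shared object
shard_dim0_inplace(embedding_weight, rank=0, world_size=2)
```

Building a fresh sharded module instead would leave `lm_head_weight` pointing at the old full-size parameter, which is exactly the shared-param bug class the commit addresses.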
    • [pipeline] support shardformer for GPT2ForQuestionAnswering & complete pipeline support for GPT2 (#4245) · 2a2eacfa
      Baizhou Zhang authored
      
      * change for transformers loggers
      
      * add forward for GPT2ForQuestionAnswering
      
      * fix assert
      
      * fix torchrec test
    • [bugs] hot fix some testing bugs for new models (#4268) · d9be0472
      Jianghai authored
      * hot fix
      
      * hot fx tracer
    • [pipeline] finish bloom models pipeline and tests (#4223) · 34f0e34a
      Jianghai authored
      * bloom policy
      
      * llama pipeline forward and tests
      
      * fix the output and attention_mask
      
      * fix name
      
      * bind argument to policy
      
      * finish bloom model
      
      * test shard gpt2
      
      * clear cache
      
      * support all bloom models
      
      * add bloom models policies
      
      * finish bloom pipeline and tests
      
      * add set pipeline
      
      * finish bloom
    • [pipeline] All bert models (#4233) · e7cc62d7
      Jianghai authored
      * bloom policy
      
      * llama pipeline forward and tests
      
      * fix the output and attention_mask
      
      * fix name
      
      * bind argument to policy
      
      * Revert "bloom policy"
      
      This reverts commit 8dee68a0a22568dbeed6d4563372b25e1e825fb0.
      
      This policy should be reverted and copied to feature/bloom
      
      * revert the bloom changes
      
      * cancel unneeded inputs
      
      * gpt
      
      * finish llama
      
      * causal lm and sequence classification
      
      * revision
      
      * add pure pipeline test
      
      * finish some bert models
      
      * finish all bert models
      
      * finish bert tests
      
      * fix bugs
      
      * fix bugs
      
      * fix test pipeline
      
      * fix data gen for qa
      
      * update the set pipeline forward
      
      * shared params
      
      * fix bugs
    • [pipeline] add pipeline forward for variants of gpt2 (#4238) · a14d3520
      Baizhou Zhang authored
      * add forward for GPTLMHeadModel
      
      * add test for gpt_lm
      
      * arranging get_held_layers method
      
      * arrange forward replacement
      
      * add forward for GPT2ForTokenClassification
      
      * add forward for GPT2ForSequenceClassification
      
      * fix test_shard_gpt2.py
      
      * add GPT2DoubleHeadsmodel & fix bugs
      
      * add id checking in get_shared_params
    • [shardformer] fix base policy (#4229) · 7e4de520
      Hongxin Liu authored
    • [pipeline] Add Pipeline Forward for GPT2Model Shardformer (#4224) · 208ac8f2
      Baizhou Zhang authored
      * fix typehint & docstring in sharder.py

      * update pipeline forward for GPT2Model

      * add test for pipeline forward of GPT2Model

      * add cache cleaning in gpt2 test

      * change assert to raise command
    • [pipeline] add bloom model pipeline (#4210) · 37d22f68
      Jianghai authored
      * bloom policy
      
      * llama pipeline forward and tests
      
      * fix the output and attention_mask
      
      * fix name
      
      * bind argument to policy
      
      * finish bloom model
      
      * test shard gpt2
      
      * clear cache
    • [pipeline] Llama causal lm and llama for sequence classification pipeline (#4208) · 31bcf867
      Jianghai authored
      * bloom policy
      
      * llama pipeline forward and tests
      
      * fix the output and attention_mask
      
      * fix name
      
      * bind argument to policy
      
      * Revert "bloom policy"
      
      This reverts commit 8dee68a0a22568dbeed6d4563372b25e1e825fb0.
      
      This policy should be reverted and copied to feature/bloom
      
      * revert the bloom changes
      
      * cancel unneeded inputs
      
      * gpt
      
      * finish llama
      
      * causal lm and sequence classification
      
      * revision
    • [pipeline] Llama pipeline (#4205) · 16220310
      Jianghai authored
      * bloom policy
      
      * llama pipeline forward and tests
      
      * fix the output and attention_mask
      
      * fix name
      
      * bind argument to policy
      
      * Revert "bloom policy"
      
      This reverts commit 8dee68a0a22568dbeed6d4563372b25e1e825fb0.
      
      This policy should be reverted and copied to feature/bloom
      
      * revert the bloom changes
      
      * cancel unneeded inputs
      
      * gpt
    • [pipeline] Bert pipeline for shardformer and its tests (#4197) · 1094e0f0
      Jianghai authored
      * add pipeline forward
      
      * complete pipeline forward check
      
      * fix bert forward without pipeline
      
      * fix comments
      
      * discard useless line
      
      * add todo
      
      * clean prints
      
      * fix distribute layers
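The "fix distribute layers" item above concerns how transformer layers are assigned to pipeline stages. A common even-split strategy, sketched here under the assumption it matches the policies' intent (the function name is hypothetical), gives each stage floor(L / S) layers and spreads the remainder over the first stages:

```python
# Sketch: evenly distribute num_layers transformer layers over num_stages
# pipeline stages; earlier stages absorb the remainder one layer each.

def distribute_layers(num_layers, num_stages):
    base, rem = divmod(num_layers, num_stages)
    return [base + (1 if stage < rem else 0) for stage in range(num_stages)]
```

A policy's `get_held_layers`-style helper can then take a prefix sum of these sizes to find the layer index range each stage holds.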
    • [shardformer] support lazy init (#4202) · 890774b2
      Hongxin Liu authored
      * [shardformer] support lazy init
      
      * [shardformer] linear support lazy init
      
      * [shardformer] embedding support lazy init
      
      * [shardformer] norm support lazy init
      
      * [shardformer] fused linear support lazy init
      
      * [test] update shardformer test layer
      
      * [test] shardformer with lazy init fit ddp
      
      * [lazy] hotfix deepcopy of param
      
      * [shardformer] fix bert policy and update test
      
      * [shardformer] fix bloom policy and update test
      
      * [shardformer] fix opt policy and update test
      
      * [shardformer] fix t5 policy and update test
      
      * [shardformer] fix gpt2 policy and update test
      
      * [shardformer] fix llama policy and update test
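The lazy-init support above defers parameter allocation until after the sharding plan is known, so the full unsharded model never has to fit on one device. A tiny stand-in for the idea (a hypothetical class, not the actual LazyTensor implementation): record a factory instead of the data, and materialize on first access.

```python
# Sketch of lazy initialization: store how to build the parameter, allocate
# nothing up front, and materialize only when first needed (e.g. after the
# layer has been sharded or moved to its target device).

class LazyParam:
    def __init__(self, factory):
        self._factory = factory
        self._value = None          # nothing allocated yet

    @property
    def materialized(self):
        return self._value is not None

    def materialize(self):
        """Build the value on first use; later calls return the same object."""
        if self._value is None:
            self._value = self._factory()
        return self._value

p = LazyParam(lambda: [0.0] * 4)    # no buffer exists until materialize()
```

The deepcopy hotfix in the commit list hints at the tricky part of this pattern: copying a lazy parameter must copy the recipe, not force materialization.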
    • [pipeline] move bert related pipeline components to shardformer (#4187) · f3bcc292
      Jianghai authored
      * move bert related pipeline components to shardformer
      
      * fix bugs
      
      * revision
      
      * fix bert model tests
      
      * fix bert_lm_head model tests
      
      * fix tests
      
      * fix tests
      
      * done checks
      
      * skip bloom
    • [pipeline] add bert_for_pretraining bert_lmhead forward and policy (#4172) · c5ea7280
      Jianghai authored
      * add pipeline policy and bert forward to be done
      
      * add bertmodel pipeline forward and make tests
      
      * add Bert_Policy and test for policy
      
      * update formatting
      
      * update formatting
      
      * update the code
      
      * fix bugs
      
      * fix name conflict
      
      * add bloom model and policy, revise the base class of policy
      
      * revise
      
      * revision
      
      * add bert_for_pretraining
      
      * add bert_for_pretraining forward and policy
      
      * fix typos
      
      * cancel warning
      
      * change the immediate output to default dict
      
      * change the default output of get_shared_params