Commits · dd96d402a50ab00d859fcf7e403a47a8be7fd870 · OpenDAS / Megatron-LM

09 Mar, 2022 1 commit
- bug fixes · dd96d402
  Vijay Korthikanti authored Mar 08, 2022
  
  dd96d402
07 Mar, 2022 2 commits
- fixes to main merge · 269f28f7
  Vijay Korthikanti authored Mar 07, 2022
  
  269f28f7
- refactor to help merge with main · 0d77c0e9
  Vijay Korthikanti authored Mar 07, 2022
  
  0d77c0e9
03 Mar, 2022 2 commits
- column parallel linear with sequence parallelism · 02bb1f5c
  Vijay Korthikanti authored Mar 02, 2022
  
  02bb1f5c
- get sequence parallelism to work with pipeline parallelism · 6658158b
  Vijay Korthikanti authored Mar 02, 2022
  
  6658158b
02 Mar, 2022 1 commit
- layernorm grad sync + name changes · c0f10643
  Vijay Korthikanti authored Mar 02, 2022
  
  c0f10643
19 Feb, 2022 1 commit
- tensor model parallelism memory optimization · 5d4689c4
  Vijay Korthikanti authored Feb 18, 2022
  
  5d4689c4
18 Feb, 2022 1 commit
- support fp32 training and fix embedding update · b5726555
  Sangkug Lym authored Feb 18, 2022
  
  b5726555
17 Feb, 2022 2 commits
- addressed Jared and Patrick comments. · 90ce932d
  mshoeybi authored Feb 17, 2022
  
  90ce932d
- changed all_gather to _all_gather_base in distributed checkpointing · 37181ef4
  mshoeybi authored Feb 17, 2022
  
  37181ef4
16 Feb, 2022 1 commit

Sangkug Lym authored Feb 13, 2022

remove redundant linear layer class definition

add fuse_gradient_accumulation attribute to weights for simple targeting

reflect feedback and clean up the code

arg change

83b1e42f

08 Feb, 2022 1 commit
- fixed t5 'get_num_layers()' · 2fadaa50
  Lawrence McAfee authored Feb 08, 2022
  
  2fadaa50
04 Feb, 2022 1 commit
- renamed argument; 'embed' -> 'embedding' · c04c4977
  Lawrence McAfee authored Feb 04, 2022
  
  c04c4977
01 Feb, 2022 1 commit
- comments, cleanup. · b93bef00
  Lawrence McAfee authored Feb 01, 2022
  
  b93bef00
31 Jan, 2022 1 commit
- working for t5 [ encoder embedding only ] · 3af6725d
  Lawrence McAfee authored Jan 31, 2022
  
  3af6725d
25 Jan, 2022 1 commit
- working with interleaving · 804ed2e6
  Lawrence McAfee authored Jan 24, 2022
  
  804ed2e6
24 Jan, 2022 3 commits
- added args.transformer_pipeline_model_parallel_size · a06af061
  Lawrence McAfee authored Jan 24, 2022
  
  a06af061
- fixed args.virtual_pipeline_model_parallel_size · c2b7d0b3
  Lawrence McAfee authored Jan 24, 2022
  
  c2b7d0b3
- working when no interleaving · 33dc8e9c
  Lawrence McAfee authored Jan 24, 2022
  
  33dc8e9c
12 Jan, 2022 1 commit
- Phase1 merge: vit optimizations + dataset enhancements + scaled_softmax kernel · 7a77abd9
  Vijay Korthikanti authored Jan 12, 2022
  
  7a77abd9
11 Jan, 2022 3 commits
- added comments · a1fe4805
  Lawrence McAfee authored Jan 11, 2022
  
  a1fe4805
- partially cleaned · 806422e5
  Lawrence McAfee authored Jan 11, 2022
  
  806422e5
- jan 11 alpha · 05042081
  Lawrence McAfee authored Jan 11, 2022
  
  05042081
10 Jan, 2022 1 commit
- loss matches; memory savings for multi-node (tested n3, n16) · 270d6412
  Lawrence McAfee authored Jan 10, 2022
  
  270d6412
08 Jan, 2022 1 commit
- more iterating on 'viewless tensor' methods · ed0c8714
  Lawrence McAfee authored Jan 07, 2022
  
  ed0c8714
07 Jan, 2022 1 commit
- debugging make_standalone_tensor(), safely_set_tensor_data_attr() · 5422d23a
  Lawrence McAfee authored Jan 07, 2022
  
  5422d23a
17 Dec, 2021 2 commits
- minor fixes · f2bf5a56
  Vijay Korthikanti authored Dec 17, 2021
  
  f2bf5a56
- pipeline_fixes · 17843605
  Vijay Korthikanti authored Dec 17, 2021
  
  17843605
22 Nov, 2021 1 commit
- removed unused 'get_args' import · 941a793f
  Lawrence McAfee authored Nov 22, 2021
  
  941a793f
05 Nov, 2021 1 commit
- t5_pipeline_fix · ea128da5
  Vijay Korthikanti authored Nov 05, 2021
  
  ea128da5
03 Sep, 2021 1 commit
- reflect feedback · 4df8b7a2
  slym authored Sep 02, 2021
  
  4df8b7a2
02 Sep, 2021 3 commits
- reflect feedback · 3f652469
  slym authored Sep 02, 2021
  
  3f652469
- minor changes · 16c90445
  slym authored Sep 02, 2021
  
  16c90445
- t # This is a combination of 2 commits. · cf7efd4f
  Sangkug Lym authored Aug 30, 2021
```
allreduce overlap with wgrad gemm

change custom delay to dummy add
```
  cf7efd4f
31 Aug, 2021 1 commit
- fix a typo · 30abf2c5
  vycezhong authored Aug 31, 2021
  
  30abf2c5
19 Aug, 2021 3 commits
- Checkpoint a set number of individual Transformer layers · c1e0689d
  slym authored Aug 10, 2021
```
consider the case of pipeline-model parallelism

clean up arguments

argument naming cleanup

update readme and examples
```
  c1e0689d
- only support pp=1 · 7b585440
  mshoeybi authored Aug 19, 2021
  
  7b585440
- removed contiguous buffer for checkpointed activation · e923ec52
  mshoeybi authored Aug 19, 2021
  
  e923ec52
16 Aug, 2021 1 commit

- Destroy more groups in `destroy_model_parallel` · eddf7593
  eqy authored Aug 16, 2021
```
Some tests expect a clean model parallel slate and complain if a previous test left something behind; this change clears more variables that the tests complain about.
```
  eddf7593
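
The cleanup described in this commit message can be illustrated with a minimal sketch. It assumes the parallel state is held in module-level variables that a teardown function resets; the names `_STATE`, `initialize_model_parallel`, and `leftover_state` are hypothetical stand-ins and are not taken from the Megatron-LM source, and plain `object()` placeholders are used instead of real process groups.

```python
# Hypothetical module-level registry of process-group handles.
# The keys are illustrative; they are not Megatron-LM's actual variable names.
_STATE = {
    "tensor_group": None,
    "pipeline_group": None,
    "data_group": None,
    "embedding_group": None,
}


def initialize_model_parallel():
    """Populate every handle; refuse to run on top of leftover state."""
    if any(handle is not None for handle in _STATE.values()):
        raise RuntimeError("model parallel state is already initialized")
    for name in _STATE:
        _STATE[name] = object()  # placeholder for a real process-group handle


def destroy_model_parallel():
    """Reset every handle so the next test starts from a clean slate."""
    for name in _STATE:
        _STATE[name] = None


def leftover_state():
    """Return the names of handles that are still set."""
    return [name for name, handle in _STATE.items() if handle is not None]
```

If the teardown skipped any entry, the next call to `initialize_model_parallel()` in this sketch would raise, which mirrors the kind of complaint the commit message attributes to the tests.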

30 Jul, 2021 1 commit

- Support for pipeline parallelism in T5 model · 46c74b4c
  Deepak Narayanan authored Jun 22, 2021
```
- Accumulate encoder hidden state gradient to handle skip connection
- Correctly compute the number of layers in encoder / decoder for T5 model
- Ensure e weights are initialized the same way in embeddings
- Synchronize embedding gradients across encoder and decoder for T5 model
- Support for checkpoint loading and saving
```
  46c74b4c