Commits · 0ed2f6ac943560ab0a8a58b6628a669af8c250db · OpenDAS / Megatron-LM

17 Feb, 2022 3 commits
- Merge branch 'all_gather_base' into 'main' · 0ed2f6ac
  Jared Casper authored Feb 17, 2022
```
changed all_gather to _all_gather_base in distributed checkpointing

See merge request ADLR/megatron-lm!395
```
- addressed Jared and Patrick comments. · 90ce932d
  mshoeybi authored Feb 17, 2022
  
- changed all_gather to _all_gather_base in distributed checkpointing · 37181ef4
  mshoeybi authored Feb 17, 2022
  
15 Feb, 2022 1 commit

- Merge branch 'vision-merge' into 'main' · 8c8063eb
  Jared Casper authored Feb 14, 2022

  vision third phase merge: pretraining methods + mit,swin backbones

  See merge request ADLR/megatron-lm!384

11 Feb, 2022 5 commits
- adress review comments · 4554c3fe
  Vijay Korthikanti authored Feb 11, 2022
  
- Merge branch 'readme-no-versions' into 'main' · f00d0a3f
  Mohammad Shoeybi authored Feb 10, 2022
```
Remove specific versions of pytorch, etc. from README so it doesn't go out of date.

See merge request ADLR/megatron-lm!392
```
- Remove specific versions of pytorch, etc. from README so it doesn't go out of date. · d50e89f1
  Jared Casper authored Feb 10, 2022
  
- Merge branch 'stop_tokens' into 'main' · d5fe59fe
  Mohammad Shoeybi authored Feb 10, 2022
```
Adding several things to the text_generation_server that were necessary for the demos

See merge request ADLR/megatron-lm!350
```
- Addressing comments · 1d391bba
  rprenger authored Feb 10, 2022
  
08 Feb, 2022 2 commits
- Merge branch 'lmcafee/embed-standalone' into 'main' · 10c6ad06
  Jared Casper authored Feb 08, 2022
```
Standalone embedding stage

See merge request ADLR/megatron-lm!385
```
- fixed t5 'get_num_layers()' · 2fadaa50
  Lawrence McAfee authored Feb 08, 2022
  
04 Feb, 2022 3 commits
- renamed argument; 'embed' -> 'embedding' · c04c4977
  Lawrence McAfee authored Feb 04, 2022
  
- Adding message to ValueError · b0c824d9
  rprenger authored Feb 04, 2022
  
- Adding the web interface · 42982fc3
  rprenger authored Feb 03, 2022
  
01 Feb, 2022 5 commits
- comments, cleanup. · b93bef00
  Lawrence McAfee authored Feb 01, 2022
  
- more minor fixes · 3f1a728a
  Vijay Korthikanti authored Feb 01, 2022
  
- found root source of t5 issue (fast layer norm) · bea16fa3
  Lawrence McAfee authored Feb 01, 2022
  
- minor fixes · e1f9c3a5
  Vijay Korthikanti authored Feb 01, 2022
  
- vision third phase merge: pretraining methods + mit,swin backbones · 01a82723
  Vijay Korthikanti authored Jan 31, 2022
  
31 Jan, 2022 2 commits
- working for t5 [ encoder embedding only ] · 3af6725d
  Lawrence McAfee authored Jan 31, 2022
  
- added 'no-op' layer, to replace transformer layer when num_layers == 0. · 1fa6990c
  Lawrence McAfee authored Jan 31, 2022
  
29 Jan, 2022 5 commits
- Merge branch 'vision-merge' into 'main' · e724785f
  Jared Casper authored Jan 28, 2022
```
second phase of vision code merge

See merge request ADLR/megatron-lm!381
```
- narrowed issue to pipeline rank 0, virtual pipeline rank >= 1 · 5bc9f889
  Lawrence McAfee authored Jan 28, 2022
  
- typo fix · 2b628f96
  Vijay Korthikanti authored Jan 28, 2022
  
- Merge branch 'github-pr' into 'main' · e156d2fe
  Jared Casper authored Jan 28, 2022
```
Combination of several github PRs

See merge request ADLR/megatron-lm!383
```
- Revert incorrect fix. · cd499559
  Jared Casper authored Jan 28, 2022
  
28 Jan, 2022 9 commits
- Merge branch 'patch-1' of https://github.com/vycezhong/Megatron-LM into github-pr · 2a34e0ec
  Jared Casper authored Jan 28, 2022
  
- Merge branch 'main' of https://github.com/satpalsr/Megatron-LM into github-pr · 34f55429
  Jared Casper authored Jan 28, 2022
  
- Merge branch 'patch-1' of https://github.com/jamesr66a/Megatron-LM into github-pr · adebe364
  Jared Casper authored Jan 28, 2022
  
- Merge branch 'patch-2' of https://github.com/kvtoraman/Megatron-LM into github-pr · 20f6169f
  Jared Casper authored Jan 28, 2022
  
- Merge branch 'patch-1' of https://github.com/rajeshkppt/Megatron-LM into github-pr · 0747e8e5
  Jared Casper authored Jan 28, 2022
  
- Merge branch 'fix' of https://github.com/singleheart/Megatron-LM into github-pr · 9882fb3f
  Jared Casper authored Jan 28, 2022
  
- Merge branch 'patch-1' of https://github.com/stas00/Megatron-LM into github-pr · 4a62d582
  Jared Casper authored Jan 28, 2022
  
- more naming cleanup · 641408f5
  Vijay Korthikanti authored Jan 28, 2022
  
- changing class name AnnealingLR to OptimizerParamScheduler · 04ecc834
  Vijay Korthikanti authored Jan 28, 2022
  
27 Jan, 2022 5 commits
- address review comments · 53931b8b
  Vijay Korthikanti authored Jan 27, 2022
  
- Merge branch 'main' into lmcafee/embed-standalone · f17a3933
  Lawrence McAfee authored Jan 27, 2022
  
- Merge branch 'lmcafee/distrib-chkpt-fix-v2' into 'main' · fd5469aa
  Jared Casper authored Jan 26, 2022
```
Distributed checkpointing memory fix

See merge request ADLR/megatron-lm!379
```
- add clarification about the model parallel size column · 2d767f48
  Stas Bekman authored Jan 26, 2022
  
- [README] specify explicitly which gpu and node size was used · 145d2eb9
  Stas Bekman authored Jan 26, 2022
```
The paper has this info, so proposing to copy it next to the table. 

Otherwise it's hard to guess whether you used 40GB A100s or 80GB ones (and secondary, n_gpus per node).

Thank you!
```