- 09 Mar, 2023 2 commits
-
-
Jared Casper authored
Fix GPT text generation See merge request ADLR/megatron-lm!528
-
Maanu Grover authored
-
- 08 Mar, 2023 2 commits
-
-
Jared Casper authored
Testing infrastructure for Megatron core See merge request ADLR/megatron-lm!514
-
Shanmugam Ramasamy authored
-
- 22 Feb, 2023 6 commits
-
-
Jared Casper authored
Pretraining README update See merge request ADLR/megatron-lm!512
-
Maanu Grover authored
-
Jared Casper authored
README: Paper Links See merge request ADLR/megatron-lm!521
-
Maanu Grover authored
-
Jared Casper authored
Retro See merge request ADLR/megatron-lm!489
-
Lawrence McAfee authored
-
- 17 Feb, 2023 6 commits
-
-
Jared Casper authored
Fix distributed pretraining examples See merge request ADLR/megatron-lm!523
-
Maanu Grover authored
-
Jared Casper authored
Update Distributed Pretraining README section See merge request ADLR/megatron-lm!522
-
Maanu Grover authored
-
Jared Casper authored
Fix pretraining examples See merge request ADLR/megatron-lm!513
-
Maanu Grover authored
-
- 11 Jan, 2023 2 commits
-
-
Jared Casper authored
Integrate FlashAttention into Megatron-LM See merge request ADLR/megatron-lm!488
-
Jared Casper authored
-
- 04 Jan, 2023 1 commit
-
-
Tri Dao authored
-
- 03 Jan, 2023 1 commit
-
-
Jared Casper authored
remove mpu dependency in zeroshot script See merge request ADLR/megatron-lm!493
-
- 22 Dec, 2022 2 commits
-
-
John Kamalu authored
Transformer Engine Integration Rebase See merge request ADLR/megatron-lm!487
-
John Kamalu authored
-
- 21 Dec, 2022 1 commit
-
-
Hao Wu authored
-
- 09 Dec, 2022 1 commit
-
-
Tri Dao authored
-
- 29 Nov, 2022 3 commits
-
-
Jared Casper authored
Fix DropPath for hidden shape [s, b, h] See merge request ADLR/megatron-lm!485
-
Jared Casper authored
bug fixes in partitioned data preprocessor See merge request ADLR/megatron-lm!486
-
Mike Chrzanowski authored
-
- 24 Nov, 2022 2 commits
-
-
Boxin Wang authored
Open sourcing lm detoxification code See merge request ADLR/megatron-lm!471
-
Boxin Wang authored
-
- 22 Nov, 2022 1 commit
-
-
Yu Yao authored
-
- 17 Nov, 2022 2 commits
-
-
Mike Chrzanowski authored
Data Preprocessing Optimizations See merge request ADLR/megatron-lm!483
-
Vijay Korthikanti authored
-
- 11 Nov, 2022 2 commits
-
-
Jared Casper authored
Support for variable sequence lengths across micro-batches See merge request ADLR/megatron-lm!472
-
Vijay Korthikanti authored
-
- 10 Nov, 2022 2 commits
-
-
Jared Casper authored
ViT Backbone Tensor Shape Fix See merge request ADLR/megatron-lm!479
-
Yu Yao authored
-
- 08 Nov, 2022 2 commits
-
-
Jared Casper authored
Fix merge error. See merge request ADLR/megatron-lm!478
-
Jared Casper authored
-
- 02 Nov, 2022 2 commits
-
-
Jared Casper authored
Move most of mpu functionality into a new "Megatron core" See merge request ADLR/megatron-lm!462
-
Jared Casper authored
Sending in prompts with the wrong type hangs the server. This is a check to make sure it's a list. See merge request ADLR/megatron-lm!473
-