- 22 Feb, 2023 1 commit
-
-
Lawrence McAfee authored
-
- 17 Feb, 2023 6 commits
-
-
Jared Casper authored
Fix distributed pretraining examples See merge request ADLR/megatron-lm!523
-
Maanu Grover authored
-
Jared Casper authored
Update Distributed Pretraining README section See merge request ADLR/megatron-lm!522
-
Maanu Grover authored
-
Jared Casper authored
Fix pretraining examples See merge request ADLR/megatron-lm!513
-
Maanu Grover authored
-
- 11 Jan, 2023 2 commits
-
-
Jared Casper authored
Integrate FlashAttention into Megatron-LM See merge request ADLR/megatron-lm!488
-
Jared Casper authored
-
- 04 Jan, 2023 1 commit
-
-
Tri Dao authored
-
- 03 Jan, 2023 1 commit
-
-
Jared Casper authored
remove mpu dependency in zeroshot script See merge request ADLR/megatron-lm!493
-
- 22 Dec, 2022 2 commits
-
-
John Kamalu authored
Transformer Engine Integration Rebase See merge request ADLR/megatron-lm!487
-
John Kamalu authored
-
- 21 Dec, 2022 1 commit
-
-
Hao Wu authored
-
- 09 Dec, 2022 1 commit
-
-
Tri Dao authored
-
- 29 Nov, 2022 3 commits
-
-
Jared Casper authored
Fix DropPath for hidden shape [s, b, h] See merge request ADLR/megatron-lm!485
-
Jared Casper authored
bug fixes in partitioned data preprocessor See merge request ADLR/megatron-lm!486
-
Mike Chrzanowski authored
-
- 24 Nov, 2022 2 commits
-
-
Boxin Wang authored
Open sourcing lm detoxification code See merge request ADLR/megatron-lm!471
-
Boxin Wang authored
-
- 22 Nov, 2022 1 commit
-
-
Yu Yao authored
-
- 17 Nov, 2022 2 commits
-
-
Mike Chrzanowski authored
Data Preprocessing Optimizations See merge request ADLR/megatron-lm!483
-
Vijay Korthikanti authored
-
- 11 Nov, 2022 2 commits
-
-
Jared Casper authored
Support for variable sequence lengths across micro-batches See merge request ADLR/megatron-lm!472
-
Vijay Korthikanti authored
-
- 10 Nov, 2022 2 commits
-
-
Jared Casper authored
ViT Backbone Tensor Shape Fix See merge request ADLR/megatron-lm!479
-
Yu Yao authored
-
- 08 Nov, 2022 2 commits
-
-
Jared Casper authored
Fix merge error. See merge request ADLR/megatron-lm!478
-
Jared Casper authored
-
- 02 Nov, 2022 2 commits
-
-
Jared Casper authored
Move most of mpu functionality into a new "Megatron core" See merge request ADLR/megatron-lm!462
-
Jared Casper authored
Sending in prompts with the wrong type hangs the server. This is a check to make sure it's a list. See merge request ADLR/megatron-lm!473
-
- 27 Oct, 2022 1 commit
-
-
rprenger authored
-
- 20 Oct, 2022 2 commits
-
-
Jared Casper authored
Disable newline after colon See merge request ADLR/megatron-lm!469
-
Peng Xu authored
-
- 14 Oct, 2022 5 commits
-
-
Jared Casper authored
inverse_square_root learning param schedule See merge request ADLR/megatron-lm!466
-
Jared Casper authored
Remove noop used to try to force scheduling and check for environment variable instead. See merge request ADLR/megatron-lm!463
-
Jared Casper authored
-
Vijay Korthikanti authored
-
Jared Casper authored
Core merge main See merge request ADLR/megatron-lm!464
-
- 13 Oct, 2022 1 commit
-
-
Jared Casper authored
-