- 29 Apr, 2024 2 commits
-
-
liangjj authored
- 27 Mar, 2024 1 commit
-
-
liangjing authored
-
- 26 Apr, 2023 2 commits
-
-
-
zhuwenwen authored
-
- 19 Apr, 2023 8 commits
-
-
Jared Casper authored
Add ability to enable autocast through schedules. See merge request ADLR/megatron-lm!569
-
Abhinav Khattar authored
Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
-
Abhinav Khattar authored
Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
-
Abhinav Khattar authored
Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
-
Jared Casper authored
Replace custom layer_norm_cuda with Apex layer_norm_cuda. See merge request ADLR/megatron-lm!566
-
Jon Barker authored
-
Jared Casper authored
Some quick fixes to checkpoint_util. See merge request ADLR/megatron-lm!568
-
Jared Casper authored
-
- 14 Apr, 2023 1 commit
-
-
Jared Casper authored
Update tests with new gold numbers. See merge request ADLR/megatron-lm!565
-
- 13 Apr, 2023 1 commit
-
-
Jared Casper authored
A recent commit changed the jitted bias-dropout-add function, which slightly changes the numbers; the new values were manually tested to be accurate.
-
- 09 Apr, 2023 1 commit
-
-
Jared Casper authored
Add parallel_state helper functions. See merge request ADLR/megatron-lm!564
-
- 07 Apr, 2023 4 commits
-
-
Abhinav Khattar authored
Signed-off-by: Abhinav Khattar <aklife97@gmail.com>
-
Jared Casper authored
Some quick fixes. See merge request ADLR/megatron-lm!563
-
Jared Casper authored
-
Jared Casper authored
Flash Attention inference fix. See merge request ADLR/megatron-lm!562
-
- 06 Apr, 2023 17 commits
-
-
Jimmy Zhang authored
-
Jimmy Zhang authored
-
Jimmy Zhang authored
-
Jared Casper authored
Add support for GPTSentencePieceTokenizer and related fixes. See merge request ADLR/megatron-lm!561
-
Jared Casper authored
Untie Embeddings. See merge request ADLR/megatron-lm!558
-
Jared Casper authored
-
Jared Casper authored
-
Jared Casper authored
layernorm1p added. See merge request ADLR/megatron-lm!557
-
Jared Casper authored
Merging rope to main. See merge request ADLR/megatron-lm!556
-
Jared Casper authored
-
Jared Casper authored
-
Jared Casper authored
-
Jared Casper authored
-
Jared Casper authored
-
Jared Casper authored
-
Jared Casper authored
Add swiglu and squared relu activations and ability to disable bias. See merge request ADLR/megatron-lm!553
-
Shanmugam Ramasamy authored
Update .gitlab-ci.yml. See merge request ADLR/megatron-lm!560
-
- 05 Apr, 2023 3 commits
-
-
Shanmugam Ramasamy authored
-
MaximumEntropy authored
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
-
Mostofa Patwary authored
-