- 13 Feb, 2021 1 commit
-
-
Deepak Narayanan authored
-
- 10 Feb, 2021 2 commits
-
-
Deepak Narayanan authored
-
Deepak Narayanan authored
-
- 09 Feb, 2021 4 commits
-
-
Deepak Narayanan authored
-
Deepak Narayanan authored
-
Deepak Narayanan authored
-
Deepak Narayanan authored
  - Split a model's computation into multiple virtual stages as needed, and schedule communication correctly between these virtual stages
  - Move schedule code into `schedules.py` and communication code into `p2p_communication.py`
  - Use hyphens instead of spaces in all time logging for consistency
  - Factor out code in megatron/training.py into helper functions
  - Refactor evaluate() function: make it use forward_backward_schedule functions
-
- 08 Feb, 2021 2 commits
-
-
Mohammad Shoeybi authored
Improve handling of rng states in checkpoints. See merge request ADLR/megatron-lm!231
-
Jared Casper authored
-
- 06 Feb, 2021 1 commit
-
-
Jared Casper authored
Use torch.cuda.synchronize() right after calling batch_isend_irecv() communication API See merge request ADLR/megatron-lm!230
-
- 05 Feb, 2021 3 commits
-
-
Deepak Narayanan authored
-
Jared Casper authored
conditioning fused kernels See merge request ADLR/megatron-lm!228
-
Vijay Korthikanti authored
-
- 04 Feb, 2021 1 commit
-
-
Vijay Korthikanti authored
-
- 02 Feb, 2021 2 commits
-
-
Mohammad Shoeybi authored
Fix bug in merge_mp_partitions for handling recent checkpoints. See merge request ADLR/megatron-lm!226
-
Jared Casper authored
-
- 01 Feb, 2021 2 commits
-
-
Mohammad Shoeybi authored
Handle empty documents in preprocess_data. See merge request ADLR/megatron-lm!225
-
Jared Casper authored
-
- 29 Jan, 2021 4 commits
-
-
Jared Casper authored
Init CI tests with very basic import test. See merge request ADLR/megatron-lm!224
-
Jared Casper authored
-
Jared Casper authored
added option to change tensorboard queue size See merge request ADLR/megatron-lm!223
-
mohammad authored
-
- 28 Jan, 2021 11 commits
-
-
Jared Casper authored
added options for tensorboard logging See merge request ADLR/megatron-lm!222
-
mohammad authored
-
mohammad authored
-
Mohammad Shoeybi authored
Typo fix. See merge request ADLR/megatron-lm!221
-
Jared Casper authored
-
Mohammad Shoeybi authored
Teach merge_mp_partitions how to write out a pipelined model. See merge request ADLR/megatron-lm!218
-
Jared Casper authored
-
Jared Casper authored
license text for autoaugmentation See merge request ADLR/megatron-lm!220
-
Vijay Korthikanti authored
-
Jared Casper authored
Rework handling of older checkpoint's attention weight/bias ordering. See merge request ADLR/megatron-lm!219
-
Vijay Korthikanti authored
-
- 27 Jan, 2021 7 commits
-
-
Jared Casper authored
Move rearranging query_key_value and key_value values in old checkpoints to when the checkpoint is loaded instead of at runtime.
-
Jared Casper authored
-
Jared Casper authored
added init method std to gpt3 example See merge request ADLR/megatron-lm!217
-
mshoeybi authored
-
Jared Casper authored
added grad and params norm to logging and tensorboard See merge request ADLR/megatron-lm!214
-
mohammad authored
-
mohammad authored
-