- 26 Feb, 2021 1 commit
-
-
Deepak Narayanan authored
-
- 25 Feb, 2021 3 commits
-
-
Mohammad Shoeybi authored
Don't import deprecated model from realm_model which is broken. See merge request ADLR/megatron-lm!241
-
Jared Casper authored
-
Jared Casper authored
Fix warning condition. See merge request ADLR/megatron-lm!238
-
- 23 Feb, 2021 2 commits
-
-
Vijay Korthikanti authored
-
Jared Casper authored
Storing and loading fingerprints in deduplication. See merge request ADLR/megatron-lm!236
-
- 22 Feb, 2021 3 commits
-
-
Mostofa Patwary authored
-
Jared Casper authored
Fix interleaved schedule assertion. See merge request ADLR/megatron-lm!237
-
Deepak Narayanan authored
-
- 19 Feb, 2021 4 commits
-
-
Mostofa Patwary authored
-
Mostofa Patwary authored
-
Mostofa Patwary authored
-
Mostofa Patwary authored
-
- 18 Feb, 2021 9 commits
-
-
Jared Casper authored
Bug Jaccard similarity and filtering n-grams. See merge request ADLR/megatron-lm!234
-
Jared Casper authored
ICT Retriever. See merge request ADLR/megatron-lm!235
-
Mostofa Patwary authored
-
Mostofa Patwary authored
-
Mostofa Patwary authored
-
Jared Casper authored
Merge ICT Retriever with Main branch. See merge request ADLR/megatron-lm!227
-
Mostofa Patwary authored
-
Mostofa Patwary authored
-
Jared Casper authored
Interleaved pipelined execution and refactored schedule / communication code. See merge request ADLR/megatron-lm!189
-
- 17 Feb, 2021 3 commits
-
-
Mostofa Patwary authored
-
Mostofa Patwary authored
-
Deepak Narayanan authored
-
- 15 Feb, 2021 2 commits
-
-
Mostofa Patwary authored
-
Mostofa Patwary authored
-
- 13 Feb, 2021 2 commits
-
-
Deepak Narayanan authored
-
Deepak Narayanan authored
-
- 10 Feb, 2021 3 commits
-
-
Deepak Narayanan authored
-
Mostofa Patwary authored
-
Deepak Narayanan authored
-
- 09 Feb, 2021 4 commits
-
-
Deepak Narayanan authored
-
Deepak Narayanan authored
-
Deepak Narayanan authored
-
Deepak Narayanan authored
  - Split a model's computation into multiple virtual stages as needed, and schedule communication correctly between these virtual stages
  - Move schedule code into `schedules.py` and communication code into `p2p_communication.py`
  - Use hyphens instead of spaces in all time logging for consistency
  - Factor out code in `megatron/training.py` into helper functions
  - Refactor `evaluate()` function: make it use forward_backward_schedule functions
-
- 08 Feb, 2021 2 commits
-
-
Mohammad Shoeybi authored
Improve handling of rng states in checkpoints. See merge request ADLR/megatron-lm!231
-
Jared Casper authored
-
- 06 Feb, 2021 1 commit
-
-
Jared Casper authored
Use `torch.cuda.synchronize()` right after calling the `batch_isend_irecv()` communication API. See merge request ADLR/megatron-lm!230
-
- 05 Feb, 2021 1 commit
-
-
Deepak Narayanan authored
-