- 24 May, 2022 6 commits
-
-
Lawrence McAfee authored
-
Lawrence McAfee authored
-
Lawrence McAfee authored
-
Lawrence McAfee authored
merge sequence parallelism's layernorm all-reduce into distributed optimizer.
-
Jared Casper authored
fix for sequence parallelism in bert pooling. See merge request ADLR/megatron-lm!418
-
Vijay Korthikanti authored
-
- 20 May, 2022 4 commits
-
-
Jared Casper authored
Update readme with sequence parallelism and selective activation recompute. See merge request ADLR/megatron-lm!417
-
Jared Casper authored
Sequence parallelism + attention checkpoint. See merge request ADLR/megatron-lm!413
-
Vijay Korthikanti authored
-
Vijay Korthikanti authored
-
- 19 May, 2022 4 commits
-
-
Jared Casper authored
-
Jared Casper authored
-
Jared Casper authored
-
Jared Casper authored
-
- 18 May, 2022 1 commit
-
-
Vijay Korthikanti authored
-
- 17 May, 2022 5 commits
-
-
Jared Casper authored
Data preprocessing testing changes + fixes. See merge request ADLR/megatron-lm!416
-
Vijay Korthikanti authored
-
Vijay Korthikanti authored
-
Vijay Korthikanti authored
-
Vijay Korthikanti authored
-
- 16 May, 2022 8 commits
-
-
Vijay Korthikanti authored
-
Jared Casper authored
Fourth phase vision merge: classification and segmentation tasks. See merge request ADLR/megatron-lm!400
-
Vijay Korthikanti authored
-
Vijay Korthikanti authored
-
Vijay Korthikanti authored
-
Vijay Korthikanti authored
-
Lawrence McAfee authored
-
Lawrence McAfee authored
-
- 13 May, 2022 1 commit
-
-
Vijay Korthikanti authored
-
- 12 May, 2022 1 commit
-
-
John Kamalu authored
tools/merge_datasets.py
  - tool to merge multiple dataset files into a single dataset
  - testing conducted and included in the megatron-testing repo https://gitlab-master.nvidia.com/ADLR/megatron-testing
tools/preprocess_data.py
  - magic numbers changed to required command line arguments
megatron/data/indexed_dataset.py
  - when merging, fix to properly update document index
  - testing conducted and included in the megatron-testing repo (see above)
  - fix follows this history https://github.com/bigscience-workshop/Megatron-DeepSpeed/pull/66
-
- 10 May, 2022 1 commit
-
-
Jared Casper authored
jit functions warmups to (1) match fprop and recompute results and (2) remove grad_enable. See merge request ADLR/megatron-lm!404
-
- 09 May, 2022 1 commit
-
-
Sangkug Lym authored
change dummy_handler to nullcontext
-
- 28 Apr, 2022 3 commits
-
-
Vijay Korthikanti authored
-
Vijay Korthikanti authored
-
Vijay Korthikanti authored
-
- 25 Apr, 2022 1 commit
-
-
Vijay Korthikanti authored
-
- 20 Apr, 2022 1 commit
-
-
Lawrence McAfee authored
-
- 31 Mar, 2022 1 commit
-
-
Lawrence McAfee authored
-
- 30 Mar, 2022 1 commit
-
-
Vijay Korthikanti authored
-
- 29 Mar, 2022 1 commit
-
-
Lawrence McAfee authored
-