- 14 Apr, 2020 7 commits
-
-
Jared Casper authored
GPT2 dataloader using indexed dataset. See merge request ADLR/megatron-lm!45
-
Mohammad authored
-
Mohammad authored
-
Mohammad Shoeybi authored
Added logic for initialize_megatron to work with CPU-only jobs. This is... See merge request ADLR/megatron-lm!46
-
Jared Casper authored
Refactored merge_mp_partitions.py. See merge request ADLR/megatron-lm!47
-
Mohammad Shoeybi authored
Do not cast return type of indexed dataset. See merge request ADLR/megatron-lm!48
-
Mohammad authored
-
- 13 Apr, 2020 6 commits
-
-
Jared Casper authored
Do not cast data returned from indexed_dataset to int64; rely on the caller to cast to the appropriate type.
-
Jared Casper authored
-
Mohammad authored
-
Mohammad authored
-
Mohammad authored
-
Mohammad authored
-
- 12 Apr, 2020 1 commit
-
-
Mohammad authored
-
- 11 Apr, 2020 2 commits
- 10 Apr, 2020 2 commits
- 09 Apr, 2020 2 commits
- 08 Apr, 2020 8 commits
-
-
Jared Casper authored
-
Jared Casper authored
-
Jared Casper authored
-
Jared Casper authored
-
Jared Casper authored
-
Jared Casper authored
-
Jared Casper authored
preprocess_data:
  - Adds the ability to not split sentences. This is used for GPT2 datasets.
  - Adds the ability to create multiple datasets from different JSON keys; this is currently untested.
indexed_dataset:
  - Adds a new "get" function to get a portion of an entry.
-
Mohammad authored
-
- 07 Apr, 2020 1 commit
-
-
Mohammad authored
-
- 03 Apr, 2020 11 commits
-
-
Mohammad Shoeybi authored
Added task ensembling. See merge request ADLR/megatron-lm!40
-
Raul Puri authored
-
Mohammad Shoeybi authored
Lint megatron/data/dataset_utils.py. See merge request ADLR/megatron-lm!42
-
Neel Kant authored
-
Raul Puri authored
-
Raul Puri authored
Refactoring text generation. See merge request ADLR/megatron-lm!39
-
Mohammad authored
-
Mohammad authored
-
Mohammad authored
-
Mohammad authored
-
Mohammad authored
-