- 08 Apr, 2020 3 commits
-
-
Jared Casper authored
-
Jared Casper authored
preprocess_data:
  - Adds ability to not split sentences. This is used for gpt2 datasets.
  - Adds ability to create multiple datasets from different JSON keys; this is currently untested.
indexed_dataset:
  - Adds new "get" function to get a portion of an entry.
-
Mohammad authored
-
- 07 Apr, 2020 1 commit
-
-
Mohammad authored
-
- 03 Apr, 2020 8 commits
- 02 Apr, 2020 8 commits
-
-
Raul Puri authored
Refactoring evaluate gpt2
See merge request ADLR/megatron-lm!38
-
Mohammad authored
-
Mohammad authored
-
Raul Puri authored
-
Jared Casper authored
Major refactor: args, global variables, tokenizer
See merge request ADLR/megatron-lm!36
-
Neel Kant authored
-
Mohammad authored
-
Mohammad authored
-
- 01 Apr, 2020 1 commit
-
-
Mohammad authored
-
- 31 Mar, 2020 3 commits
- 30 Mar, 2020 7 commits
- 29 Mar, 2020 5 commits
- 28 Mar, 2020 4 commits