- 07 Nov, 2024 1 commit
-
-
liangjing authored
-
- 08 Apr, 2020 3 commits
-
-
Jared Casper authored
-
Jared Casper authored
-
Jared Casper authored
preprocess_data:
  - Adds the ability to not split sentences. This is used for gpt2 datasets.
  - Adds the ability to create multiple datasets from different JSON keys; this is currently untested.
indexed_dataset:
  - Adds a new "get" function to get a portion of an entry.
-
- 02 Apr, 2020 3 commits
- 19 Nov, 2019 1 commit
-
-
Jared Casper authored
-
- 29 Oct, 2019 1 commit
-
-
Mohammad Shoeybi authored
-
- 07 Oct, 2019 1 commit
-
-
Jared Casper authored
-
- 04 Oct, 2019 1 commit
-
-
Jared Casper authored
-
- 12 Sep, 2019 1 commit
-
-
Raul Puri authored
-
- 30 Jul, 2019 1 commit
-
-
Raul Puri authored
Co-authored-by: shoeybi <shoeybim@gmail.com>
Co-authored-by: raulpuric <raulpuric@berkeley.edu>
Co-authored-by: jaredcasper <jaredcasper@gmail.com>
Co-authored-by: mpatwary <mostofa.patwary@gmail.com>
Co-authored-by: plegresl <plegresl@gmail.com>
-
- 13 May, 2019 1 commit
-
-
Raul Puri authored
-
- 11 May, 2019 1 commit
-
-
Raul Puri authored
-
- 27 Mar, 2019 1 commit
-
-
Raul Puri authored
-