1. 08 Apr, 2020 1 commit
    • Updates to preprocess_data.py and indexed_dataset. · da0562fc
      Jared Casper authored
      preprocess_data:
      - Adds ability to not split sentences. This is used for gpt2 datasets.
      
      - Adds ability to create multiple datasets from different JSON keys;
      this is currently untested.
      
      indexed_dataset:
      - Add new "get" function to get a portion of an entry.
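
A minimal sketch of what a "get" that returns a portion of an entry could look like. This is not the code from the commit: the class `TinyIndexedDataset`, its flat-list storage, and the `offset`/`length` parameter names are all assumptions made for illustration.

```python
class TinyIndexedDataset:
    """Hypothetical stand-in for an indexed dataset.

    All entries are stored back to back in one flat list, with a
    table of start positions so any entry can be located directly.
    """

    def __init__(self, entries):
        self.data = []      # every token of every entry, concatenated
        self.starts = []    # start position of each entry in self.data
        self.sizes = []     # length of each entry
        for entry in entries:
            self.starts.append(len(self.data))
            self.sizes.append(len(entry))
            self.data.extend(entry)

    def __len__(self):
        return len(self.sizes)

    def __getitem__(self, idx):
        # Whole-entry access is just get() with the default arguments.
        return self.get(idx)

    def get(self, idx, offset=0, length=None):
        """Return `length` items of entry `idx`, starting at `offset`.

        With length=None, everything from `offset` to the end of the
        entry is returned.
        """
        size = self.sizes[idx]
        if length is None:
            length = size - offset
        if offset < 0 or length < 0 or offset + length > size:
            raise IndexError("requested span falls outside the entry")
        start = self.starts[idx] + offset
        return self.data[start:start + length]


ds = TinyIndexedDataset([[1, 2, 3, 4], [5, 6], [7, 8, 9]])
middle = ds.get(0, offset=1, length=2)   # two items from the first entry
tail = ds.get(1, offset=1)               # from offset 1 to the end
```

Reading a slice this way avoids materializing a long entry when only a window of it is needed, which is the kind of access an unsplit (document-level) dataset tends to require.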
  2. 02 Apr, 2020 3 commits
  3. 19 Nov, 2019 1 commit
  4. 29 Oct, 2019 1 commit
  5. 07 Oct, 2019 1 commit
  6. 04 Oct, 2019 1 commit
  7. 12 Sep, 2019 1 commit
  8. 30 Jul, 2019 1 commit
  9. 13 May, 2019 1 commit
  10. 11 May, 2019 1 commit
  11. 27 Mar, 2019 1 commit