1. 14 Oct, 2019 2 commits
    • process the raw CNN/Daily Mail dataset · 447fffb2
      Rémi Louf authored
      The data provided by Li Dong et al. were already tokenized, which means
      that they are not compatible with all the models in the library. We
      thus process the raw data directly and tokenize them using the models'
      tokenizers.
    • load and prepare CNN/Daily Mail data · 67d10960
      Rémi Louf authored
      We write a function to load and preprocess the CNN/Daily Mail dataset as
      provided by Li Dong et al. The issue is that this dataset has already
      been tokenized by the authors, so we actually need to find the original,
      plain-text dataset if we want to apply it to all models.
  2. 11 Oct, 2019 2 commits
  3. 10 Oct, 2019 1 commit