1. 16 May, 2019 1 commit
  2. 15 May, 2019 7 commits
  3. 14 May, 2019 3 commits
    • rm default_key from MultiCorpusSampledDataset · 7432130e
      Myle Ott authored
      Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/575
      
      Differential Revision: D15318004
      
      Pulled By: myleott
      
      fbshipit-source-id: ad918d71b1bd8074decf5ec3463dd9bc9487bbe9
    • Alignment Training task using minibatch · 2c278ff0
      Nayan Singhal authored
      Summary:
      1. Define an EpochMinibatchIterator which extends the EpochBatchIterator. It has the same functionality as EpochBatchIterator except for two major changes: it uses static batching, and it uses MiniBatchIterator to get the indices.
      2. SplitSeqCollater is used instead of Seq2SeqCollater.
      3. LSTM_subsample now stores the previous states and resets them once the sample is over.
      
      Reviewed By: jay-mahadeokar
      
      Differential Revision: D15209023
      
      fbshipit-source-id: 900b8bd1f25159ffc77f8106e26729a3e7422a1f
    • Move save/load checkpoint functions to utils · cd1e5c09
      Dmytro Okhonko authored
      Summary:
      Move `load_checkpoint`, `save_checkpoint` and `reload_train` from train.py to checkpoint_utils.py.
      Move `get_perplexity` from train.py to utils.py.
      This makes train.py lighter and lets us reuse these utility functions when fairseq is used as an external library.
      
      Reviewed By: myleott
      
      Differential Revision: D15289607
      
      fbshipit-source-id: 4b7c95225ac22e402bcda3497811361809110df1
  4. 13 May, 2019 4 commits
  5. 12 May, 2019 2 commits
  6. 11 May, 2019 2 commits
  7. 10 May, 2019 2 commits
  8. 09 May, 2019 5 commits
  9. 08 May, 2019 7 commits
  10. 07 May, 2019 5 commits
  11. 06 May, 2019 2 commits
    • allowing sharded dataset (#696) · 0add50c2
      Naman Goyal authored
      
      
      Summary:
      Co-authored-by: myleott <myleott@fb.com>
      
      Change `data` to a `str` holding a colon-separated list, for loading sharded datasets. This is useful for large datasets that cannot fit into memory: the dataset can be sharded, and one shard is then loaded per epoch in round-robin order.
      
      For example, with `5` shards of data and `10` epochs, the shards are iterated in the order `[0, 1, 2, 3, 4, 0, 1, 2, 3, 4]`.
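
      The round-robin schedule in this example can be sketched as below. This is a minimal illustration, not the code from the commit: `split_data_paths` and `shard_for_epoch` are hypothetical helper names, the shard paths are placeholders, and epochs are assumed to be counted from 0.

```python
from typing import List


def split_data_paths(data: str) -> List[str]:
    """Split a colon-separated `data` string into individual shard paths."""
    return [path for path in data.split(":") if path]


def shard_for_epoch(paths: List[str], epoch: int) -> str:
    """Pick the shard for a 0-based epoch index, cycling round-robin."""
    return paths[epoch % len(paths)]


# Placeholder shard names; real values would be directories on disk.
paths = split_data_paths("shard0:shard1:shard2:shard3:shard4")

# Index of the shard loaded in each of 10 epochs.
schedule = [paths.index(shard_for_epoch(paths, epoch)) for epoch in range(10)]
print(schedule)  # [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
```

      Only one shard is selected per epoch, so under this scheme at most one shard needs to be held in memory at a time.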
      
      myleott: We need to look into `translation.py`, as it already expects a list and then concatenates the datasets.
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/696
      
      Differential Revision: D15214049
      
      fbshipit-source-id: 03e43a7b69c7aefada2ca668abf1eac1969fe013
    • Remove redundant distributed init · 57da383c
      Myle Ott authored
      Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/707
      
      Differential Revision: D15219014
      
      Pulled By: myleott
      
      fbshipit-source-id: f38f2cf817d05e0871ff9084a810d109848e827c