1. 08 Nov, 2019 1 commit
  2. 24 Oct, 2019 1 commit
  3. 12 Oct, 2019 1 commit
  4. 27 Sep, 2019 1 commit
    • Levenshtein Transformer paper code · 86857a58
      Changhan Wang authored
      Summary:
      Code for our NeurIPS paper [Levenshtein Transformer](https://arxiv.org/abs/1905.11006)
      * Added Levenshtein Transformer model, task and criterion classes
      * Added iterative NAT Transformer, Insertion Transformer and CMLM Transformer model classes as baselines
      * Added an option for prepending BOS to the dictionary class and the translation task class
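      The prepend-BOS option can be sketched as follows (a hypothetical helper for illustration only; the real dictionary/task API differs):

      ```python
      def maybe_prepend_bos(token_ids, bos_idx, prepend_bos=False):
          # Illustrative sketch of the new option: when enabled,
          # prepend the BOS index to an already-encoded sentence.
          return [bos_idx] + list(token_ids) if prepend_bos else list(token_ids)
      ```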
      
      Reviewed By: myleott
      
      Differential Revision: D17297372
      
      fbshipit-source-id: 54eca60831ae95dc721c2c34e882e1810ee575c7
  5. 20 Sep, 2019 1 commit
    • added multilingual masked LM training (#849) · 32335404
      Naman Goyal authored
      Summary:
      The multilingual RoBERTa training is working with aconneau's XLM data.
      
      Two pieces remaining:
      
      1) `XLM` limits each batch to examples from the same language. I am not 100% sure of the reason for that, but it should be easy to implement: we can add a `batch_by_size_and_language` function instead of the default `batch_by_size`. If it's not critical, I would rather leave it out, as that keeps the code clean and simple.
      
      2) `sample_ratio` in `ConcatDataset` works with `int` values by tiling the datasets according to the ratio. Currently I handle float ratios by rounding them off to the first decimal and then multiplying by `10`. We can see whether such simple heuristics are good enough; there are other options (we can talk about them offline).
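      The rounding heuristic in 2) can be sketched like this (hypothetical helper name, not the actual fairseq code):

      ```python
      def float_ratios_to_int_tiles(ratios):
          # ConcatDataset's sample_ratio expects ints (it tiles datasets),
          # so round each float ratio to the first decimal and scale by 10.
          return [int(round(r * 10)) for r in ratios]
      ```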
      Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/849
      
      Differential Revision: D17162460
      
      fbshipit-source-id: d967f3d872f7a1f0aa4ea418bd362b68af9e432f
  6. 31 Aug, 2019 1 commit
  7. 30 Aug, 2019 1 commit
  8. 29 Aug, 2019 1 commit
  9. 27 Aug, 2019 1 commit
    • wav2vec everstore support fix · 3ab8e0fd
      Alexei Baevski authored
      Summary: fixes some merge issues that prevented wav2vec from training properly
      
      Reviewed By: myleott
      
      Differential Revision: D16981120
      
      fbshipit-source-id: cad39aaf2f44daabcbafe7b4e8735d055b3842a7
  10. 21 Aug, 2019 1 commit
    • Multiset (#838) · a2f5361d
      alexeib authored
      Summary:
      Adds ability to tag individual examples with the names of their datasets, along with some minor miscellaneous fixes and improvements
      Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/838
      
      Differential Revision: D16919175
      
      Pulled By: alexeib
      
      fbshipit-source-id: 4bf493299645bae63f3ee6382e15f18a9f73666c
  11. 30 Jul, 2019 2 commits
  12. 29 Jul, 2019 1 commit
    • adding glue data preprocessing scripts (#771) · 138dc8e4
      Naman Goyal authored
      Summary:
      1) Added GLUE data pre-processing script.
      2) Updated README with usage.
      
      TODO:
      1) Release the fairseq dictionary and remove the hardcoded path.
      2) Remove the hard-coded path for BPE encoding.
      
      myleott, what do you recommend for the above TODOs?
      Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/771
      
      Reviewed By: myleott
      
      Differential Revision: D16547679
      
      Pulled By: myleott
      
      fbshipit-source-id: 6a6562d9b6215523d048fdf3daee63ffac21e231
  13. 24 Jul, 2019 1 commit
    • check save_dir before beginning training · b49ea81c
      Spencer Poff authored
      Summary: I sadly discovered that my checkpoint directory wasn't globally readable after 8 hours of training. Adding this check at the beginning of the train loop to keep that from happening again!
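      A fail-fast check along these lines can be sketched as follows (a minimal sketch, assuming a write probe is an acceptable proxy for directory accessibility; the helper name is hypothetical):

      ```python
      import os

      def assert_save_dir_usable(save_dir):
          # Probe the checkpoint directory up front so an inaccessible
          # save_dir fails before training, not 8 hours into it.
          os.makedirs(save_dir, exist_ok=True)
          probe = os.path.join(save_dir, ".write_probe")
          with open(probe, "w") as f:
              f.write("ok")
          os.remove(probe)
      ```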
      
      Reviewed By: myleott
      
      Differential Revision: D16455394
      
      fbshipit-source-id: 35959aa058150b2afb63710c468d01ebc8a12b0c
  14. 02 Jul, 2019 1 commit
    • add --max-tokens-valid option for validation · bccfddbb
      Xutai Ma authored
      Summary: Add the --max-tokens-valid option. Sometimes a separate max batch-token limit for validation can be helpful, for example when the validation set contains a sequence longer than max_tokens (rare in MT, but it can happen in ASR or AST).
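      The option's fallback behaviour, sketched (hypothetical helper, not the actual fairseq code):

      ```python
      def resolve_max_tokens_valid(max_tokens, max_tokens_valid=None):
          # --max-tokens-valid defaults to --max-tokens when not given,
          # so existing configs behave exactly as before.
          return max_tokens if max_tokens_valid is None else max_tokens_valid
      ```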
      
      Reviewed By: myleott
      
      Differential Revision: D16076951
      
      fbshipit-source-id: ae7f4218594580b9450a8196d7afa1e7e2018aee
  15. 30 Jun, 2019 1 commit
  16. 26 Jun, 2019 1 commit
    • Fix dataset loading when there are multiple valid subsets (#835) · 8b514b9f
      Liang Wang authored
      Summary:
      When we have multiple valid subsets, say `valid`, `valid1` and `valid2`, and `combine=True` holds, then loading the `valid` subset will try to locate and load `valid`, `valid1`, `valid2`, ... and combine them into one dataset. Setting `combine` to `False` solves this issue.
      
      In my experiment, I have 3 valid subsets with 3000, 5000 and 8701 examples. With the argument `--valid-subset valid,valid1,valid2`, the log is as follows:
      
      ```
      ......
      | ./mix_data/bin valid src-trg 3000 examples
      | ./mix_data/bin valid1 src-trg 5000 examples
      | ./mix_data/bin valid2 src-trg 7801 examples
      | ./mix_data/bin valid1 src-trg 5000 examples
      | ./mix_data/bin valid2 src-trg 7801 examples
      ......
      ```
      
      As shown above, `valid1` and `valid2` subsets are incorrectly loaded twice.
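      The globbing behaviour behind the bug can be sketched like this (a hypothetical helper mimicking the described `combine` semantics, not fairseq's actual loader):

      ```python
      def splits_to_load(available, split, combine):
          # With combine=True, loading "valid" also pulls in "valid1",
          # "valid2", ... until a name is missing; with combine=False,
          # only the exact split is loaded -- which is the fix here.
          if not combine:
              return [split]
          found, k = [], 0
          while True:
              name = split + (str(k) if k > 0 else "")
              if name not in available:
                  break
              found.append(name)
              k += 1
          return found
      ```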
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/835
      
      Differential Revision: D16006343
      
      Pulled By: myleott
      
      fbshipit-source-id: ece7fee3a00f97a6b3409defbf7f7ffaf0a54fdc
  17. 21 May, 2019 1 commit
  18. 20 May, 2019 2 commits
  19. 17 May, 2019 1 commit
  20. 14 May, 2019 1 commit
    • Move save/load checkpoint functions to utils · cd1e5c09
      Dmytro Okhonko authored
      Summary:
      Move `load_checkpoint`, `save_checkpoint` and `reload_train` from train.py to checkpoint_utils.py.
      Move `get_perplexity` from train.py to utils.py.
      This makes train.py lighter and lets us reuse all of this utility functionality when fairseq is used as an external library.
      
      Reviewed By: myleott
      
      Differential Revision: D15289607
      
      fbshipit-source-id: 4b7c95225ac22e402bcda3497811361809110df1
  21. 08 May, 2019 2 commits
    • Cleanup LM + Flake8 · f2563c21
      Myle Ott authored
      Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/720
      
      Differential Revision: D15259091
      
      Pulled By: myleott
      
      fbshipit-source-id: 06a35996c06ccddb49fdc9e01e348ff3c9da334e
    • bugfix data not in args · 6a7eb6ce
      Jay Mahadeokar authored
      Summary:
      D15214049 introduced a bug: if a task's args does not contain `data`, training fails with
      ```
      File "/data/users/jaym/fbsource/fbcode/buck-out/dev/gen/deeplearning/projects/fairspeq/train#link-tree/train.py", line 119, in reload_train
         if len(args.data.split(":")) == 1:
      AttributeError: 'Namespace' object has no attribute 'data'
      ```
      
      This diff checks whether `data` is in args to avoid the above error.
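      A defensive version of the failing line can be sketched as (hypothetical helper name):

      ```python
      from argparse import Namespace

      def num_data_shards(args):
          # Guard against task args that have no `data` attribute
          # before splitting on ":" (the line that raised above).
          data = getattr(args, "data", None)
          return 0 if data is None else len(data.split(":"))
      ```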
      
      Reviewed By: myleott, jmp84
      
      Differential Revision: D15253373
      
      fbshipit-source-id: 14fb9ad878ee50f1b7583349bb17e29c03c40815
  22. 06 May, 2019 1 commit
    • allowing sharded dataset (#696) · 0add50c2
      Naman Goyal authored
      
      
      Summary:
      Co-authored-by: myleott <myleott@fb.com>
      
      Changing `data` to be a `str` with a colon-separated list for loading sharded datasets. This change is useful for loading large datasets that cannot fit into memory. The large dataset can be sharded, and each shard is then loaded in one epoch in a round-robin manner.
      
      For example, if there are `5` shards of data and `10` epochs then the shards will be iterated upon `[0, 1, 2, 3, 4, 0, 1, 2, 3, 4]`.
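      The round-robin shard selection can be sketched as follows (hypothetical helper; epochs assumed 1-based):

      ```python
      def shard_path_for_epoch(data, epoch):
          # `data` is a colon-separated list of shard paths; each epoch
          # loads exactly one shard, cycling round-robin over the list.
          paths = data.split(":")
          return paths[(epoch - 1) % len(paths)]
      ```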
      
      myleott, we need to look into `translation.py`, as it currently already expects a list and then concatenates the datasets.
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/696
      
      Differential Revision: D15214049
      
      fbshipit-source-id: 03e43a7b69c7aefada2ca668abf1eac1969fe013
  23. 05 May, 2019 2 commits
  24. 04 May, 2019 1 commit
  25. 02 May, 2019 2 commits
  26. 30 Apr, 2019 1 commit
  27. 24 Apr, 2019 1 commit
  28. 15 Apr, 2019 1 commit
    • fix checkpoint timer (#634) · de8aeab5
      freewym authored
      Summary:
      If args.keep_interval_updates or args.keep_last_epochs > 0, `checkpoints` refers to a list of checkpoint files to be removed, which can be empty. So the logging code was moved to the right position.
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/634
      
      Differential Revision: D14933655
      
      Pulled By: myleott
      
      fbshipit-source-id: 68182ee99d9701e1536833d31e0a7c5d2eb2d679
  29. 09 Apr, 2019 1 commit
    • Fix save_dir creation while training on multiple nodes (#626) · 94e9d77c
      Kartikay Khandelwal authored
      Summary:
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/626
      
      While training a model on multiple GPUs, the current fairseq train workflow fails while creating the directory from which to load a checkpoint. This seems to happen because multiple nodes attempt to create the same directory, causing a weird interaction with the os.makedirs option `exist_ok=True`. Fix this by making sure only rank 0 creates the directory.
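      A sketch of the fix (assuming the distributed rank is available when the directory is created; helper name is illustrative):

      ```python
      import os

      def create_save_dir(save_dir, distributed_rank):
          # Only rank 0 creates the checkpoint directory, avoiding the racy
          # multi-node interaction with os.makedirs(..., exist_ok=True).
          if distributed_rank == 0:
              os.makedirs(save_dir, exist_ok=True)
      ```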
      
      Reviewed By: myleott
      
      Differential Revision: D14841304
      
      fbshipit-source-id: c9b73ba804de97e2cb19a616189fefce476d8c74
  30. 07 Apr, 2019 1 commit
    • move distributed_init after get_batch_iterator · 34028c63
      Haoran Li authored
      Summary: There are constant wait-timeout issues when using multiple nodes; even setting copylocallytempdir:/ doesn't help (e.g. f105637629). It seems to work after moving distributed_init after get_batch_iterator (e.g. f106520580).
      
      Reviewed By: myleott
      
      Differential Revision: D14817769
      
      fbshipit-source-id: edbb101a28d8082241c7bdd8c5500c9dad27647c
  31. 02 Apr, 2019 2 commits
  32. 12 Mar, 2019 1 commit
    • Handle 3+ dimensional input in sequence_generator + nits · 860010e9
      Dmytro Okhonko authored
      Summary: sequence_generator assumes that the model input is a 2-d tensor of longs, but it can also be something like a 3-d tensor of floats. We should be able to handle this as long as the first dimension is the batch size, followed by source lengths.
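      The relaxed shape assumption, sketched on plain shape tuples (illustrative only, not the actual sequence_generator code):

      ```python
      def batch_and_src_len(shape):
          # Only assume dim 0 = batch and dim 1 = source length; any
          # trailing feature dims (e.g. 3-d float inputs) are ignored.
          if len(shape) < 2:
              raise ValueError("expected at least (batch, src_len)")
          return shape[0], shape[1]
      ```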
      
      Reviewed By: myleott
      
      Differential Revision: D14420044
      
      fbshipit-source-id: bf8b1e42ad1873f7b803c1a377b0af21648db015
  33. 11 Mar, 2019 1 commit
  34. 04 Mar, 2019 1 commit