1. 04 Oct, 2019 2 commits
  2. 01 Oct, 2019 1 commit
  3. 30 Sep, 2019 2 commits
  4. 29 Sep, 2019 2 commits
  5. 28 Sep, 2019 1 commit
  6. 27 Sep, 2019 5 commits
  7. 26 Sep, 2019 1 commit
  8. 24 Sep, 2019 1 commit
  9. 23 Sep, 2019 3 commits
  10. 20 Sep, 2019 3 commits
    • Remove extraneous call to RNG in multi-GPU code path · 10f9349e
      Myle Ott authored
      Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/865
      
      Differential Revision: D17510276
      
      Pulled By: myleott
      
      fbshipit-source-id: 24119402ad5fe95a1312fadb77bafe49a9197c6b
    • Update README.race.md · e869c80d
      Myle Ott authored
      Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/1155
      
      Differential Revision: D17509762
      
      Pulled By: myleott
      
      fbshipit-source-id: 4de535289c1f35abff0d8142d8580f3ede039f47
    • added multilingual masked LM training (#849) · 32335404
      Naman Goyal authored
      Summary:
      The multilingual-RoBERTa training is working with aconneau XLM data.
      
      Two pieces remaining:
      
      1) `XLM` restricts each batch to a single language. I am not 100% sure about the reason for that, but it should be easy to implement: we can add `batch_by_size_and_language` in place of the default `batch_by_size` function. If it's not critical, I would prefer to leave it out, as that keeps the code very clean and simple.
      
      2) `sample_ratio` in `ConcatDataset` works with `int` by tiling the datasets based on the ratio. Currently I am handling it by rounding the ratio to the first decimal place and then multiplying by `10`. We can see whether such a simple heuristic is good enough; there are other options (we can talk about them offline).
      Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/849
      
      Differential Revision: D17162460
      
      fbshipit-source-id: d967f3d872f7a1f0aa4ea418bd362b68af9e432f
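The ratio heuristic described in item 2 of this commit can be sketched as follows. This is a minimal illustration, not the code from the pull request: the function names `ratios_to_tile_counts` and `tile_indices` are hypothetical, and keeping a dataset at least once when its ratio rounds down to zero is an added assumption. Rounding to one decimal place and multiplying by 10 is done in a single step as `round(ratio * 10)`, which matches the described heuristic except possibly at exact ties.

```python
def ratios_to_tile_counts(sample_ratios):
    """Turn float sampling ratios into integer tiling counts."""
    counts = []
    for ratio in sample_ratios:
        if ratio <= 0:
            raise ValueError("sample ratios must be positive")
        # Assumption: a dataset whose ratio rounds down to 0 is still kept once.
        counts.append(max(1, int(round(ratio * 10))))
    return counts


def tile_indices(dataset_sizes, counts):
    """List (dataset_id, item_id) pairs, repeating each dataset `count` times."""
    tiled = []
    for dataset_id, (size, count) in enumerate(zip(dataset_sizes, counts)):
        tiled.extend((dataset_id, i) for _ in range(count) for i in range(size))
    return tiled


counts = ratios_to_tile_counts([1.0, 0.34, 2.57])
tiled = tile_indices([2, 1], [1, 3])
```

One consequence of this scheme is that every dataset is repeated roughly ten times its ratio, so the concatenated index grows by about a factor of ten even when all ratios are 1.0.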
  11. 19 Sep, 2019 2 commits
    • Add dataset class for weighted sampling with replacement. (#861) · a8a85c26
      Jerry Ma authored
      Summary:
      As discussed with Naman earlier today. Weighted sampling with
      replacement can be done on a per-epoch basis using `set_epoch()`
      functionality, which generates the samples as a function of random seed
      and epoch.
      
      Additionally, `FairseqTask` needs to set the starting epoch for the
      dataset at the very beginning of iterator construction.
      
      Not yet implemented is the per-epoch iterator construction, which
      is necessary to actually regenerate the batches for each epoch.
      Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/861
      
      Differential Revision: D17460687
      
      Pulled By: jma127
      
      fbshipit-source-id: 1c2a54f04ac96b3561c100a6fd66a9fccbe3c658
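The idea in this commit, drawing the sample as a function of random seed and epoch, can be sketched with the standard library alone. This is an illustration under assumptions, not the dataset class from the pull request: the class name `WeightedResampleSketch` and its constructor arguments are hypothetical, and only the `set_epoch()` name comes from the commit message.

```python
import random


class WeightedResampleSketch:
    """Resample items with replacement, deterministically per (seed, epoch)."""

    def __init__(self, items, weights, seed=1, epoch=0):
        if len(items) != len(weights):
            raise ValueError("items and weights must have the same length")
        self.items = list(items)
        self.weights = list(weights)
        self.seed = seed
        self.set_epoch(epoch)

    def set_epoch(self, epoch):
        # Seeding from both values makes the draw reproducible for a given
        # epoch while still changing from one epoch to the next.
        rng = random.Random(f"{self.seed}:{epoch}")
        self.indices = rng.choices(
            range(len(self.items)), weights=self.weights, k=len(self.items)
        )

    def __len__(self):
        return len(self.indices)

    def __getitem__(self, i):
        return self.items[self.indices[i]]


dataset = WeightedResampleSketch(list("abcd"), [1, 1, 1, 5], seed=7, epoch=3)
epoch_3_indices = list(dataset.indices)
dataset.set_epoch(4)
dataset.set_epoch(3)  # returning to an epoch reproduces its sample
```

Because the indices depend only on the seed and the epoch, every worker that calls `set_epoch()` with the same values sees the same resampled dataset without any communication.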
    • Add cython language_level hints · 0eaaf355
      Myle Ott authored
      Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/1147
      
      Differential Revision: D17468447
      
      Pulled By: myleott
      
      fbshipit-source-id: 0dbac04b92c8df74ad991d5e92cd02036d662369
  12. 18 Sep, 2019 3 commits
  13. 17 Sep, 2019 2 commits
  14. 16 Sep, 2019 1 commit
    • added fast stats sync option (#858) · e1ba32aa
      Naman Goyal authored
      Summary:
      Added `--fast-stat-sync` option.
      This avoids pickle and achieves `~7%` more `wps` on 16 nodes.
      It is less flexible, as it aggregates only basic stats and ignores the aggregate function defined by the criterion.
      
      Let me know what you think, myleott.
      Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/858
      
      Differential Revision: D17398770
      
      fbshipit-source-id: 36261a1d970e67deeda8211af8f009ef9b4f9c14
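The trade-off described in this commit can be illustrated with a small sketch: instead of pickling arbitrary logging dictionaries, each worker packs a fixed set of numeric stats into a flat buffer that can be summed elementwise. Everything named here is an assumption made for illustration: the key list `STAT_KEYS`, the helper names, and the use of a plain Python sum as a stand-in for a distributed all-reduce. It is not the implementation behind `--fast-stat-sync`.

```python
# Hypothetical fixed set of stats; anything outside this tuple is dropped.
STAT_KEYS = ("loss", "nll_loss", "ntokens", "nsentences", "sample_size")


def pack_stats(logging_output):
    """Flatten one worker's stats into a fixed-order list of floats."""
    return [float(logging_output.get(key, 0.0)) for key in STAT_KEYS]


def sum_reduce(buffers):
    """Stand-in for an all-reduce(sum): add the buffers elementwise."""
    return [sum(column) for column in zip(*buffers)]


def unpack_stats(buffer):
    """Map the reduced buffer back to named stats."""
    return dict(zip(STAT_KEYS, buffer))


workers = [
    {"loss": 2.0, "ntokens": 10, "nsentences": 1},
    {"loss": 3.0, "ntokens": 20, "nsentences": 2, "extra": "ignored"},
]
total = unpack_stats(sum_reduce([pack_stats(w) for w in workers]))
```

The sketch also shows the loss of flexibility the commit mentions: the `extra` entry never reaches the reduced result, and the only aggregation available is a sum.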
  15. 12 Sep, 2019 1 commit
  16. 05 Sep, 2019 1 commit
    • Return predicted token for RoBERTa filling mask · 3e3fe722
      Roman Rädle authored
      Summary:
      Added the `predicted_token` to each `topk` filled output item
      
      Updated RoBERTa filling mask example in README.md
      
      Reviewed By: myleott
      
      Differential Revision: D17188810
      
      fbshipit-source-id: 5fdc57ff2c13239dabf13a8dad43ae9a55e8931c
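A toy sketch of what returning the predicted token alongside each filled sentence might look like. It does not call RoBERTa: the function `fill_mask_topk`, the dictionary layout of each item, and the candidate scores are all hypothetical. Only the idea that each `topk` item carries a `predicted_token` comes from the commit message.

```python
def fill_mask_topk(template, candidate_scores, topk=3, mask="<mask>"):
    """Rank candidate tokens and report the chosen token with each filled sentence."""
    if mask not in template:
        raise ValueError("template must contain the mask symbol")
    ranked = sorted(candidate_scores.items(), key=lambda kv: kv[1], reverse=True)
    return [
        {
            "filled": template.replace(mask, token, 1),
            "score": score,
            "predicted_token": token,
        }
        for token, score in ranked[:topk]
    ]


# Made-up scores, standing in for model probabilities.
results = fill_mask_topk(
    "The capital of France is <mask>.",
    {"Paris": 0.9, "Lyon": 0.05, "Nice": 0.03, "Rome": 0.02},
    topk=2,
)
```

Carrying the token separately saves callers from diffing the filled sentence against the template to find out which token was inserted.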
  17. 04 Sep, 2019 1 commit
    • Fix multilingual translation bug for to-many case · 1566cfb9
      Peng-Jen Chen authored
      Summary:
      The logic for adding the decoder-side language token was implemented incorrectly.
      We inject the language token by replacing the eos symbol with the language token symbol. However, the parameter for the source / target eos symbol was not set correctly.
      
      Reviewed By: tangyuq
      
      Differential Revision: D17129108
      
      fbshipit-source-id: 6fae385b787370656fd7ca7ab74e6bb91fe5463b
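The injection mechanism described in this commit, swapping the eos symbol for a language token, can be sketched on a plain list of token ids. The function name `replace_eos`, the `position` argument, and the ids used below are hypothetical. The explicit check is an added assumption showing how a wrongly configured eos id could be surfaced rather than silently ignored.

```python
def replace_eos(tokens, eos_idx, lang_tok_idx, position):
    """Return a copy of `tokens` with the eos at `position` swapped for a language token."""
    if tokens[position] != eos_idx:
        # A mismatch here means the configured eos id is not the one in the data.
        raise ValueError(
            f"expected eos id {eos_idx} at position {position}, found {tokens[position]}"
        )
    replaced = list(tokens)
    replaced[position] = lang_tok_idx
    return replaced


# Hypothetical ids: 2 is eos, 99 is a target-language token.
decoder_input = replace_eos([2, 5, 6, 7], eos_idx=2, lang_tok_idx=99, position=0)
```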
  18. 03 Sep, 2019 2 commits
  19. 01 Sep, 2019 1 commit
  20. 31 Aug, 2019 3 commits
  21. 30 Aug, 2019 2 commits
    • set numpy seed explicitly + other minor fixes (#850) · 4a7cd582
      alexeib authored
      Summary:
      Not setting the numpy seed explicitly at the beginning was an extremely annoying bug to find. It caused different GPUs to have a different view of the data if some randomization was used in the dataset (e.g. subsample dataset).
      Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/850
      
      Differential Revision: D17085006
      
      Pulled By: alexeib
      
      fbshipit-source-id: 62bb2116369fb703df878e6bc24c06f1ea4e75a0
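The failure mode in this commit can be reproduced in miniature. The sketch below uses the standard library's `random` module as a stand-in for numpy's generator, so it shows the principle rather than the actual fix. The function `subsample_indices` and the worker loop are hypothetical.

```python
import random


def subsample_indices(dataset_size, keep, seed=None):
    """Pick `keep` distinct indices out of `dataset_size`.

    With seed=None the generator is seeded from OS entropy, so each process
    would draw a different subset. With an explicit seed, every process that
    passes the same value draws the same subset.
    """
    rng = random.Random(seed)
    return sorted(rng.sample(range(dataset_size), keep))


# Each simulated worker passes the same explicit seed and sees the same data.
worker_views = [subsample_indices(1000, 5, seed=1) for _ in range(4)]
```

Leaving the seed unset is what makes this kind of bug hard to find: each worker still produces a valid subsample, so nothing fails outright.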
    • Adopt Contributor Covenant · 8777465b
      Paul O'Shannessy authored
      Summary:
      In order to foster healthy open source communities, we're adopting the
      [Contributor Covenant](https://www.contributor-covenant.org/). It has been
      built by open source community members and represents a shared understanding of
      what is expected from a healthy community.
      
      Reviewed By: josephsavona, danobi, rdzhabarov
      
      Differential Revision: D17104640
      
      fbshipit-source-id: d210000de686c5f0d97d602b50472d5869bc6a49