1. 26 Nov, 2019 1 commit
  2. 13 Nov, 2019 2 commits
  3. 05 Nov, 2019 1 commit
• Fixing key padding mask during transformer generation · 68dd3e17
      Spencer Poff authored
      Summary:
https://github.com/pytorch/fairseq/pull/1097 added key padding mask history in TransformerDecoderLayer, but in the edge case where only the current or only the previous key_padding_mask exists, the resulting key_padding_mask has the wrong size.
      
      This diff adds empty columns in such a case to ensure key_padding_mask is a usable size.
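The fix can be sketched as follows. This is a hedged illustration with list-based masks and a hypothetical function name, not fairseq's actual internals (which operate on boolean tensors):

```python
def combine_key_padding_masks(prev_mask, curr_mask, batch_size, prev_len, curr_len):
    """Concatenate cached and current key padding masks along the key axis.

    Masks are per-batch lists of booleans (True = padded position). When
    only one side exists, the missing side is filled with an all-False
    block so the combined mask always spans prev_len + curr_len keys.
    """
    if prev_mask is None and curr_mask is None:
        return None
    if prev_mask is None:  # only the current mask exists: pad the history side
        prev_mask = [[False] * prev_len for _ in range(batch_size)]
    if curr_mask is None:  # only the cached mask exists: pad the current step
        curr_mask = [[False] * curr_len for _ in range(batch_size)]
    return [p + c for p, c in zip(prev_mask, curr_mask)]
```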
      
      Reviewed By: myleott
      
      Differential Revision: D18224313
      
      fbshipit-source-id: c9fb7266baf0a2d79a66704e00a5ea8bd2987ff6
  4. 15 Oct, 2019 1 commit
• Add Unit test cases for BMUF · b5f41f82
      Nayan Singhal authored
      Summary:
      This unit test guards the bmuf code.
      
Change:
1. distributed_init assumes we are always using a CUDA device, which is not the case when using the "gloo" backend on a CPU-only machine.
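The device-selection logic behind that fix can be sketched with a hypothetical helper (illustrative only, not fairseq's actual distributed_init):

```python
def pick_init_device(backend, cuda_available, rank, num_gpus):
    """Choose the device to use during distributed init.

    NCCL needs one GPU per rank, but "gloo" on a CPU-only machine must
    stay on "cpu" rather than calling into CUDA.
    """
    if backend == "gloo" and not cuda_available:
        return "cpu"
    return "cuda:{}".format(rank % num_gpus)
```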
      
      Reviewed By: jay-mahadeokar
      
      Differential Revision: D17821391
      
      fbshipit-source-id: 28e1bb39f7a4889b1dc6bd636b7c499e55bfc69a
  5. 30 Sep, 2019 1 commit
  6. 29 Sep, 2019 1 commit
  7. 27 Sep, 2019 1 commit
• Levenshtein Transformer paper code · 86857a58
      Changhan Wang authored
      Summary:
      Code for our NeurIPS paper [Levenshtein Transformer](https://arxiv.org/abs/1905.11006)
* Added the Levenshtein Transformer model, task, and criterion classes
* Added iterative NAT Transformer, insertion Transformer, and CMLM Transformer model classes as baselines
* Added an option for prepending BOS to the dictionary class and translation task class
      
      Reviewed By: myleott
      
      Differential Revision: D17297372
      
      fbshipit-source-id: 54eca60831ae95dc721c2c34e882e1810ee575c7
  8. 19 Sep, 2019 1 commit
• Add dataset class for weighted sampling with replacement. (#861) · a8a85c26
      Jerry Ma authored
      Summary:
      As discussed with Naman earlier today. Weighted sampling with
      replacement can be done on a per-epoch basis using `set_epoch()`
functionality, which generates the samples as a function of the random
seed and the epoch.
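A minimal sketch of that determinism (the function name is hypothetical; the actual dataset class differs):

```python
import random

def epoch_samples(weights, num_samples, seed, epoch):
    """Draw dataset indices with replacement, weighted, per epoch.

    The RNG is keyed on (seed, epoch), so regenerating with the same pair
    reproduces the same sample; a set_epoch()-style hook only needs to
    change the epoch to get a fresh but reproducible draw.
    """
    rng = random.Random("{}:{}".format(seed, epoch))  # deterministic per (seed, epoch)
    return rng.choices(range(len(weights)), weights=weights, k=num_samples)
```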
      
      Additionally, `FairseqTask` needs to set the starting epoch for the
      dataset at the very beginning of iterator construction.
      
      Not yet implemented is the per-epoch iterator construction, which
      is necessary to actually regenerate the batches for each epoch.
      Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/861
      
      Differential Revision: D17460687
      
      Pulled By: jma127
      
      fbshipit-source-id: 1c2a54f04ac96b3561c100a6fd66a9fccbe3c658
  9. 19 Aug, 2019 1 commit
  10. 14 Aug, 2019 1 commit
  11. 13 Aug, 2019 1 commit
  12. 08 Aug, 2019 1 commit
  13. 01 Aug, 2019 1 commit
  14. 30 Jul, 2019 1 commit
  15. 22 Jul, 2019 2 commits
  16. 17 Jul, 2019 1 commit
• Nucleus (top-P) sampling (#710) · e46b924d
      Xing Zhou authored
      Summary:
      Implement Nucleus (top-P) sampling: sample among the smallest set of elements whose cumulative probability mass exceeds p.
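The selection rule can be sketched over a plain probability list (a hedged illustration; fairseq applies the same idea to the model's log-probabilities inside the sequence generator):

```python
def top_p_filter(probs, p):
    """Nucleus (top-p) filtering over a plain probability distribution.

    Keeps the smallest set of highest-probability outcomes whose
    cumulative mass exceeds p, then renormalizes; sampling is restricted
    to that set.
    """
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum > p:  # the nucleus now exceeds the mass threshold
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}
```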
      
      To test it:
python generate.py ~myleott/data/data-bin/wmt17_zh_en_full/ --path ~myleott/zh_en/model.pt --remove-bpe --nbest 5 --beam 5 --sampling --sampling-topp 0.3
      Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/710
      
      Test Plan:
python generate.py ~myleott/data/data-bin/wmt17_zh_en_full/ --path ~myleott/zh_en/model.pt --remove-bpe --nbest 5 --beam 5 --sampling --sampling-topp 0.3
      
      python tests/test_sequence_generator.py
      
      python tests/test_binaries.py
      
      Reviewed By: myleott
      
      Differential Revision: D16286688
      
      Pulled By: xingz9
      
      fbshipit-source-id: 1776d21e17c4532a3d24ac75bb7e75da9acad58f
  17. 23 Jun, 2019 1 commit
  18. 11 Jun, 2019 1 commit
  19. 06 Jun, 2019 1 commit
  20. 04 Jun, 2019 1 commit
• Fix loading XLM pretraining · 5408bc08
      Matt Le authored
Summary: We never actually load the model parameters from an XLM model when using `transformer_from_pretrained_xlm`. Also, change encoder_learned_pos from True to False.
      
      Reviewed By: liezl200
      
      Differential Revision: D15629061
      
      fbshipit-source-id: 759eadc88041eae94505477960de57dd78a99dcb
  21. 30 May, 2019 1 commit
  22. 24 May, 2019 1 commit
  23. 20 May, 2019 1 commit
  24. 17 May, 2019 1 commit
  25. 15 May, 2019 1 commit
• Updates to model API (#561) · dffb1674
      Myle Ott authored
      Summary:
      - `FairseqModel` -> `FairseqEncoderDecoderModel`
      - add `FairseqDecoder.extract_features` and `FairseqDecoder.output_layer`
      - `encoder_out_dict` -> `encoder_out`
      - rm unused `remove_head` functions
      - update docs
      Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/561
      
      Differential Revision: D15271142
      
      Pulled By: myleott
      
      fbshipit-source-id: 8e8864e399336020f0271c780598e968ff51a264
  26. 14 May, 2019 2 commits
  27. 09 May, 2019 1 commit
• expose arguments for bias_kv and zero_attn for masked_lm · 93ec8d0b
      Jingfei Du authored
Summary: The old no_bias_kv argument for masked_lm models is not used. Split it into two arguments and expose them.
      
      Reviewed By: myleott
      
      Differential Revision: D15266154
      
      fbshipit-source-id: 60b041f8370ca1d8869ed3402fb9a67d1cd8e0e8
  28. 07 May, 2019 2 commits
  29. 06 May, 2019 1 commit
• allowing sharded dataset (#696) · 0add50c2
      Naman Goyal authored
      
      
      Summary:
Co-authored-by: myleott <myleott@fb.com>

Changing `data` to be a `str` with a colon-separated list for loading sharded datasets. This change is useful for loading large datasets that cannot fit into memory: the large dataset can be sharded, and each shard is then loaded in one epoch in a round-robin manner.
      
      For example, if there are `5` shards of data and `10` epochs then the shards will be iterated upon `[0, 1, 2, 3, 4, 0, 1, 2, 3, 4]`.
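The round-robin rule above can be sketched as (a minimal illustration, not fairseq's actual implementation):

```python
def shard_for_epoch(data, epoch):
    """Resolve the shard path to load for a given epoch.

    `data` is the colon-separated list of shard directories described
    above; shards rotate round-robin across epochs.
    """
    paths = data.split(":")
    return paths[epoch % len(paths)]
```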
      
myleott: We need to look into `translation.py`, as it currently already expects a list and then concatenates the datasets.
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/696
      
      Differential Revision: D15214049
      
      fbshipit-source-id: 03e43a7b69c7aefada2ca668abf1eac1969fe013
  30. 04 May, 2019 1 commit
  31. 30 Apr, 2019 1 commit
  32. 25 Apr, 2019 3 commits
  33. 17 Apr, 2019 1 commit
  34. 15 Apr, 2019 1 commit