  1. 14 Nov, 2019 1 commit
  2. 30 Sep, 2019 1 commit
  3. 27 Sep, 2019 1 commit
    • Levenshtein Transformer paper code · 86857a58
      Changhan Wang authored
      Summary:
      Code for our NeurIPS paper [Levenshtein Transformer](https://arxiv.org/abs/1905.11006)
      * Added Levenshtein Transformer model, task and criterion class
      * Added iterative NAT Transformer, insertion Transformer and CMLM Transformer model classes for baselines
      * Added an option for prepending BOS to the dictionary class and translation task class
      
      Reviewed By: myleott
      
      Differential Revision: D17297372
      
      fbshipit-source-id: 54eca60831ae95dc721c2c34e882e1810ee575c7
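      A rough training invocation for the new model, as a sketch: the registered names `levenshtein_transformer`, `translation_lev` and `nat_loss` follow fairseq's non-autoregressive translation examples and may vary by version, and the dataset path is hypothetical.

      python train.py data-bin/wmt14_en_de_distill --task translation_lev --arch levenshtein_transformer --criterion nat_loss --noise random_delete --max-tokens 8000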
  4. 30 Jul, 2019 1 commit
  5. 17 Jul, 2019 1 commit
  6. 27 Jun, 2019 1 commit
  7. 30 Apr, 2019 1 commit
  8. 12 Mar, 2019 1 commit
    • Handle 3+ dimensional input in sequence_generator + nits · 860010e9
      Dmytro Okhonko authored
      Summary: sequence_generator assumes that the model input is a 2d tensor of longs, but it can be something like a 3d tensor of floats, and we should be able to handle this as long as the first dimension is the batch size, followed by source lengths.
      
      Reviewed By: myleott
      
      Differential Revision: D14420044
      
      fbshipit-source-id: bf8b1e42ad1873f7b803c1a377b0af21648db015
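      A minimal sketch of the shape handling this implies (illustrative, not the actual fairseq code):

      import torch

      def batch_and_src_sizes(src_tokens):
          # Works for 2d long input (batch x src_len) and 3d float input
          # (batch x src_len x feature_dim) alike: only the first two
          # dimensions are interpreted.
          bsz, src_len = src_tokens.size()[:2]
          return bsz, src_len

      bsz, src_len = batch_and_src_sizes(torch.zeros(8, 100, 40))  # 3d floats -> (8, 100)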
  9. 11 Mar, 2019 1 commit
    • Create fairseq_cli_lib · 7fc9a3be
      Matt Le authored
      Summary: This allows one to call fairseq_cli functions from within Python without dispatching to bash.
      
      Reviewed By: myleott
      
      Differential Revision: D14404719
      
      fbshipit-source-id: 044eb652045bb15fc40e72ecbaf6fb10df9f8c61
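      A sketch of the in-process call this enables (the `fairseq_cli` module path and `cli_main` entry point are assumptions about the package layout, and the dataset path is hypothetical):

      import sys
      from fairseq_cli import train  # assumed package layout

      # Build argv as if invoking train.py from the shell, then run in-process.
      sys.argv = ["train.py", "data-bin/iwslt14", "--arch", "fconv", "--max-tokens", "4000"]
      train.cli_main()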
  10. 28 Feb, 2019 1 commit
  11. 22 Feb, 2019 1 commit
  12. 16 Feb, 2019 1 commit
  13. 05 Feb, 2019 1 commit
  14. 30 Jan, 2019 1 commit
    • Merge internal changes (#483) · 42be3ebd
      Myle Ott authored
      Summary:
      Changelog:
      - `4889802`: sentencepiece output can now be detokenized with `--remove-bpe=sentencepiece` (fixes #331). Also added `--sacrebleu` for computing detokenized BLEU (usage sketch below).
      - `0d76427`: fix assertion error when training language model with dataset containing empty sentences
      - minor bug and style fixes
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/483
      
      Differential Revision: D13867899
      
      Pulled By: myleott
      
      fbshipit-source-id: 25c940b847fe270262ac8f5ac838407b3977fdda
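      A usage sketch for the new flags (data and checkpoint paths hypothetical):

      python generate.py data-bin/wmt17_en_de --path checkpoints/checkpoint_best.pt --remove-bpe=sentencepiece --sacrebleu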
  15. 16 Jan, 2019 1 commit
    • FIX: '--user-dir' on multi-gpu (#449) · 7853818c
      Davide Caroselli authored
      Summary:
      In a multi-GPU training scenario, the `train.py` script spawns new processes with `torch.multiprocessing.spawn`. Unfortunately, those child processes don't inherit the modules imported with `--user-dir`.
      
      This pull request fixes this problem: the custom module import is now explicit in every `main()` function.
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/449
      
      Differential Revision: D13676922
      
      Pulled By: myleott
      
      fbshipit-source-id: 520358d66155697885b878a37e7d0484bddbc1c6
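      The shape of the fix, sketched (simplified; the helper name and signature here are illustrative, not fairseq's exact API):

      import importlib.util
      import os
      import sys

      def import_user_module(user_dir):
          # Called explicitly at the top of each main(), so that workers
          # spawned by torch.multiprocessing.spawn re-import the custom
          # module instead of relying on inherited parent state.
          module_name = os.path.basename(os.path.normpath(user_dir))
          if module_name not in sys.modules:
              spec = importlib.util.spec_from_file_location(
                  module_name, os.path.join(user_dir, "__init__.py"))
              module = importlib.util.module_from_spec(spec)
              sys.modules[module_name] = module
              spec.loader.exec_module(module)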
  16. 05 Jan, 2019 1 commit
  17. 26 Dec, 2018 1 commit
    • Merge internal changes (#422) · 8ce6499d
      Myle Ott authored
      Summary:
      - 04cc608: Add `--match-source-len` option to generate.py for sequence-tagging tasks
      - 19f1a40: Add `--no-repeat-ngram-size` option to generate.py for ngram blocking (usage sketch below)
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/422
      
      Differential Revision: D13548445
      
      Pulled By: myleott
      
      fbshipit-source-id: 26d1ae83993e428fcb020dac5ae358b0e36233d9
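      A usage sketch for both options (dataset path hypothetical):

      python generate.py data-bin/tagging_data --path checkpoints/checkpoint_best.pt --match-source-len --no-repeat-ngram-size 3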
  18. 03 Sep, 2018 4 commits
  19. 25 Jul, 2018 2 commits
  20. 08 Jul, 2018 1 commit
  21. 21 Jun, 2018 1 commit
  22. 15 Jun, 2018 10 commits
    • Add FairseqTask · ff68a9ef
      Myle Ott authored
      A Task defines the data format, stores shared state (e.g., dictionaries) and provides helpers for building the model/criterion and calculating the loss.
      
      Changes:
      - Add TranslationTask and LanguageModelingTask. New tasks can be registered with the @register_task decorator (see the sketch after this list).
      - Add EpochBatchIterator to encapsulate batching and saving/restoring dataloader position
      - Remove LEFT_PAD_* constants and make them configurable per task
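      Registering a custom task then looks roughly like this (a sketch; everything beyond the import and decorator follows the FairseqTask interface of this era and is illustrative):

      from fairseq.tasks import FairseqTask, register_task

      @register_task('my_translation')
      class MyTranslationTask(FairseqTask):

          @staticmethod
          def add_args(parser):
              # Task-specific command-line arguments.
              parser.add_argument('--my-option', type=int, default=0)

          @classmethod
          def setup_task(cls, args, **kwargs):
              # Load dictionaries / shared state here.
              return cls(args)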
    • Unify various sharding into ShardedIterator · 24d7de44
      Myle Ott authored
    • 76b5ecab
      Myle Ott authored
    • Nits · cf1c64a5
      Myle Ott authored
    • Conv lm implementation · 4c2ef2de
      alexeib authored
      This implements the convolutional language model from https://arxiv.org/pdf/1612.08083.pdf
      
      There are 3 modes for constructing batches:
      
      - token block: fill each sample with a specified number of tokens without regard for sentence delimiters - this is what was used for training in the paper (see the sketch after this list)
      - complete: fill each sample with a specified number of tokens but make sure it contains only complete sentences (i.e. if the next sentence goes over the token block limit, move it to the next sample) - this was used for evaluation in the paper
      - eos: one sentence per sample (skip blank lines)
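      The token block mode amounts to something like this (an illustrative sketch, not the fairseq implementation):

      def token_blocks(token_stream, block_size):
          # Fill each sample with exactly block_size tokens, ignoring
          # sentence boundaries; the last block may be shorter.
          block = []
          for tok in token_stream:
              block.append(tok)
              if len(block) == block_size:
                  yield block
                  block = []
          if block:
              yield block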
      
      Some results (perplexity):
      
      GCNN-13 - GBW - 37.46
      GCNN-14B - GBW - 33.88
      GCNN-8 - Wiki103 - 43.76
      GCNN-14 - Wiki103 - 35.66
      
      train:
      
      python train.py /private/home/abaevski/data/wiki103 --save-dir /tmp --fp16 --max-epoch 35 --save-interval 1 --save-interval-updates 1000 --keep-interval-updates 25 --arch fconv_lm --optimizer nag --lr 1.0 --lr-scheduler reduce_lr_on_plateau --lr-shrink 0.5 --decoder-embed-dim 280 --decoder-layers '[(850, 6)] * 3 + [(850,1)] + [(850,5)] * 4 + [(850,1)] + [(850,4)] * 3 + [(1024,4)] + [(2048, 4)]' --clip-norm 0.1 --dropout 0.2 --weight-decay 5e-06 --criterion cross_entropy --max-tokens 1024 --max-target-positions 1024 --seed 1 --log-format json --log-interval 500
      
      eval:
      
      python eval_lm.py ~abaevski/data/wiki103 --path '/checkpoint02/abaevski/2018-04-27/lm_wiki.fp16.mxup300000.fconv.adam.lrs=reduce_lr_on_plateau.emb280.layers(850,6)*3+(850,1)+(850,5)*4+(850,1)+(850,4)*3+(1024,1)+(2048,4).lr0.0005.clp0.1.drp0.3.wd0.0.crt=cross_entropy.mxtk2048.smptk256.seed1.ngpu8/checkpoint_last.pt'
    • bf47b956
      Alexei Baevski authored
    • 67af40c9
      Alexei Baevski authored
    • remove completed sentences from batch · 2a84f46b
      Alexei Baevski authored
      Remove completed sentences from the batch and allow batching of uneven lengths (with fixes to make padded sequences work correctly in all models).
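      Conceptually, the pruning looks like this (an illustrative sketch, not the fairseq code):

      import torch

      def prune_finished(tokens, scores, finished_mask):
          # finished_mask: bool tensor of shape (batch,). Keep only rows
          # whose hypotheses are still unfinished, so later decoder steps
          # run on a smaller batch.
          keep = (~finished_mask).nonzero(as_tuple=False).squeeze(1)
          return tokens.index_select(0, keep), scores.index_select(0, keep)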
  23. 02 Apr, 2018 1 commit
    • Merge internal changes (#136) · d3795d6c
      Myle Ott authored
      Changes:
      - 7d19e36: Add `--sampling` flag to generate.py to sample instead of doing beam search
      - c777340: Add `scripts/average_checkpoints.py` to average multiple checkpoints into a combined model (usage sketch after this list)
      - 3ea882c: Add `--max-update` option to train.py to stop training after a given number of updates
      - small bugfixes for distributed training, LSTM, inverse square root LR scheduler
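      Usage sketches for the new options (paths hypothetical):

      python scripts/average_checkpoints.py --inputs checkpoints --num-epoch-checkpoints 5 --output checkpoints/averaged.pt
      python generate.py data-bin/iwslt14 --path checkpoints/averaged.pt --sampling --beam 1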
  24. 05 Mar, 2018 1 commit
  25. 01 Mar, 2018 1 commit
  26. 27 Feb, 2018 2 commits