1. 30 Sep, 2019 1 commit
  2. 30 Jul, 2019 1 commit
  3. 21 Jul, 2019 1 commit
  4. 19 Jul, 2019 1 commit
  5. 10 Jun, 2019 1 commit
  6. 08 May, 2019 1 commit
  7. 30 Apr, 2019 1 commit
  8. 29 Mar, 2019 1 commit
  9. 19 Mar, 2019 1 commit
  10. 28 Feb, 2019 1 commit
  11. 26 Feb, 2019 1 commit
  12. 22 Feb, 2019 1 commit
  13. 16 Feb, 2019 1 commit
  14. 05 Feb, 2019 1 commit
  15. 30 Jan, 2019 1 commit
  16. 16 Jan, 2019 1 commit
    • FIX: '--user-dir' on multi-gpu (#449) · 7853818c
      Davide Caroselli authored
      Summary:
      In a multi-GPU training scenario, the `train.py` script spawns new processes with `torch.multiprocessing.spawn`. Unfortunately, those child processes do not inherit the modules imported with `--user-dir`.
      
      This pull request fixes the problem: the custom module import is now explicit in every `main()` function (see the sketch at the end of this entry).
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/449
      
      Differential Revision: D13676922
      
      Pulled By: myleott
      
      fbshipit-source-id: 520358d66155697885b878a37e7d0484bddbc1c6
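      A minimal sketch of the pattern this fix describes, assuming a hypothetical `import_user_module` helper and plugin path (fairseq's real code differs): children created by `torch.multiprocessing.spawn` start with fresh interpreter state, so each worker's `main()` must re-import the `--user-dir` modules itself.

      ```python
      # Hedged sketch, not fairseq's actual code: the helper name, argument
      # handling, and the plugin path are illustrative assumptions.
      import importlib
      import os
      import sys

      import torch.multiprocessing as mp


      def import_user_module(user_dir):
          # Make the --user-dir package importable, then import it once.
          if user_dir is not None:
              user_dir = os.path.abspath(user_dir)
              module_name = os.path.basename(user_dir)
              if module_name not in sys.modules:
                  sys.path.insert(0, os.path.dirname(user_dir))
                  importlib.import_module(module_name)


      def main(rank, user_dir):
          # Runs in every child process; imports made in the parent are NOT
          # inherited, so the custom modules are imported explicitly here.
          import_user_module(user_dir)
          # ... per-rank training setup would follow ...


      if __name__ == "__main__":
          # "./my_plugins" is a hypothetical package directory (with __init__.py).
          mp.spawn(main, args=("./my_plugins",), nprocs=2)
      ```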
  17. 05 Jan, 2019 1 commit
  18. 26 Dec, 2018 1 commit
    • Merge internal changes (#422) · 8ce6499d
      Myle Ott authored
      Summary:
      - 04cc608: Add `--match-source-len` option to generate.py for sequence-tagging tasks
      - 19f1a40: Add `--no-repeat-ngram-size` option to generate.py for ngram blocking (see the sketch at the end of this entry)
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/422
      
      Differential Revision: D13548445
      
      Pulled By: myleott
      
      fbshipit-source-id: 26d1ae83993e428fcb020dac5ae358b0e36233d9
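      A hedged sketch of the ngram-blocking idea behind `--no-repeat-ngram-size` (illustrative only, not the actual generator code): at each decoding step, ban any token that would complete an n-gram already present in the hypothesis.

      ```python
      def banned_next_tokens(hypothesis, n):
          # Return the set of tokens that would recreate an n-gram already
          # generated in `hypothesis` (a list of token ids or strings).
          if n < 2 or len(hypothesis) < n - 1:
              return set()
          prefix = tuple(hypothesis[-(n - 1):])  # last n-1 tokens so far
          banned = set()
          for i in range(len(hypothesis) - n + 1):
              if tuple(hypothesis[i:i + n - 1]) == prefix:
                  banned.add(hypothesis[i + n - 1])
          return banned


      # With n=2: "a b" was generated once, so a second "a b" is blocked.
      assert banned_next_tokens(["a", "b", "c", "a"], 2) == {"b"}
      ```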
  19. 25 Sep, 2018 1 commit
  20. 03 Sep, 2018 3 commits
  21. 25 Jul, 2018 2 commits
  22. 19 Jul, 2018 1 commit
  23. 08 Jul, 2018 1 commit
  24. 25 Jun, 2018 1 commit
  25. 21 Jun, 2018 1 commit
  26. 15 Jun, 2018 7 commits
    • Myle Ott
    • Add FairseqTask · ff68a9ef
      Myle Ott authored
      A Task defines the data format, stores shared state (e.g., dictionaries) and provides helpers for building the model/criterion and calculating the loss.
      
      Changes:
      - Add TranslationTask and LanguageModelingTask. New tasks can be registered with the @register_task decorator (see the sketch below).
      - Add EpochBatchIterator to encapsulate batching and saving/restoring dataloader position
      - Remove LEFT_PAD_* constants and make them configurable per task
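      A hedged sketch of the registry pattern that `@register_task` implies (the decorator name matches the commit message; the registry internals and the task's constructor are assumptions):

      ```python
      TASK_REGISTRY = {}


      def register_task(name):
          # Decorator that records a task class under `name` so it can be
          # looked up later, e.g. from a --task command-line argument.
          def wrapper(cls):
              TASK_REGISTRY[name] = cls
              return cls
          return wrapper


      @register_task("toy_translation")
      class ToyTranslationTask:
          # Stores shared state (e.g. dictionaries); a real task would also
          # provide helpers to build the model/criterion and compute the loss.
          def __init__(self, src_dict, tgt_dict):
              self.src_dict = src_dict
              self.tgt_dict = tgt_dict


      task_cls = TASK_REGISTRY["toy_translation"]
      ```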
    • Myle Ott · 76b5ecab
    • Angela Fan
    • Conv lm implementation · 4c2ef2de
      alexeib authored
      This implements the convolutional language model from https://arxiv.org/pdf/1612.08083.pdf
      
      There are 3 modes for constructing batches:
      
      - token block: fill each sample with a specified number of tokens without regard for sentence delimiters; this is what was used for training in the paper (see the sketch at the end of this entry)
      - complete: fill each sample with a specified number of tokens, but make sure it contains only complete sentences (i.e. if the next sentence would go over the token block limit, move it to the next sample); this was used for evaluation in the paper
      - eos: one sentence per sample (skip blank lines)
      
      some results (perplexity):
      
      GCNN-13 - GBW - 37.46
      GCNN-14B - GBW - 33.88
      GCNN-8 - Wiki103 - 43.76
      GCNN-14 - Wiki103 - 35.66
      
      train:
      
      python train.py /private/home/abaevski/data/wiki103 --save-dir /tmp --fp16 --max-epoch 35 --save-interval 1 --save-interval-updates 1000 --keep-interval-updates 25 --arch fconv_lm --optimizer nag --lr 1.0 --lr-scheduler reduce_lr_on_plateau --lr-shrink 0.5 --decoder-embed-dim 280 --decoder-layers '[(850, 6)] * 3 + [(850,1)] + [(850,5)] * 4 + [(850,1)] + [(850,4)] * 3 + [(1024,4)] + [(2048, 4)]' --clip-norm 0.1 --dropout 0.2 --weight-decay 5e-06 --criterion cross_entropy --max-tokens 1024 --max-target-positions 1024 --seed 1 --log-format json --log-interval 500
      
      eval:
      
      python eval_lm.py ~abaevski/data/wiki103 --path '/checkpoint02/abaevski/2018-04-27/lm_wiki.fp16.mxup300000.fconv.adam.lrs=reduce_lr_on_plateau.emb280.layers(850,6)*3+(850,1)+(850,5)*4+(850,1)+(850,4)*3+(1024,1)+(2048,4).lr0.0005.clp0.1.drp0.3.wd0.0.crt=cross_entropy.mxtk2048.smptk256.seed1.ngpu8/checkpoint_last.pt'
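      A hedged sketch of the "token block" mode described above (illustrative only, not fairseq's actual dataset code): concatenate the corpus into one flat token stream and cut it into fixed-size samples, ignoring sentence boundaries.

      ```python
      def token_block_samples(stream, block_size):
          # `stream` is the whole corpus as one flat list of token ids.
          # Each sample gets exactly `block_size` tokens; a trailing partial
          # block is dropped here for simplicity.
          for start in range(0, len(stream) - block_size + 1, block_size):
              yield stream[start:start + block_size]
      ```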
    • implement batching in interactive mode · 663fd806
      Alexei Baevski authored
    • Sampling doesn't work with interactive · 4ce453b1
      Sergey Edunov authored
  27. 01 May, 2018 2 commits
  28. 02 Apr, 2018 1 commit
    • Merge internal changes (#136) · d3795d6c
      Myle Ott authored
      Changes:
      - 7d19e36: Add `--sampling` flag to generate.py to sample instead of doing beam search
      - c777340: Add `scripts/average_checkpoints.py` to average multiple checkpoints into a combined model (see the sketch below)
      - 3ea882c: Add `--max-update` option to train.py to stop training after a given number of updates
      - small bugfixes for distributed training, LSTM, inverse square root LR scheduler
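      A hedged sketch of what `scripts/average_checkpoints.py` does conceptually (the checkpoint layout with a "model" key is an assumption): average every parameter elementwise across the given checkpoints.

      ```python
      import torch


      def average_checkpoints(paths):
          avg = None
          for path in paths:
              # Assumed layout: a dict holding the state dict under "model".
              state = torch.load(path, map_location="cpu")["model"]
              if avg is None:
                  avg = {k: v.clone().float() for k, v in state.items()}
              else:
                  for k in avg:
                      avg[k] += state[k].float()
          return {k: v / len(paths) for k, v in avg.items()}
      ```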
  29. 27 Feb, 2018 2 commits
    • More unit test fixes · 0d90e35f
      Myle Ott authored
    • fairseq-py goes distributed (#106) · 66415206
      Myle Ott authored
      This PR includes breaking API changes to modularize fairseq-py and adds support for distributed training across multiple nodes.
      
      Changes:
      - c7033ef: add support for distributed training! See updated README for usage.
      - e016299: modularize fairseq-py, adding support for register_model, register_criterion, register_optimizer, etc.
      - 154e440: update LSTM implementation to use PackedSequence objects in the encoder, better following best practices and improving perf (see the sketch below)
      - 90c2973 and 1da6265: improve unit test coverage
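      A hedged sketch of the PackedSequence usage referenced in 154e440 (illustrative, not fairseq's encoder): packing a padded batch lets the LSTM skip pad positions entirely.

      ```python
      import torch
      from torch import nn
      from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

      lstm = nn.LSTM(input_size=8, hidden_size=16)

      x = torch.randn(5, 3, 8)           # (seq_len, batch, embed_dim), padded
      lengths = torch.tensor([5, 4, 2])  # true lengths, sorted descending

      packed = pack_padded_sequence(x, lengths)
      packed_out, (h, c) = lstm(packed)  # the LSTM never sees pad positions
      out, out_lens = pad_packed_sequence(packed_out)  # back to padded layout
      ```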