1. 03 Aug, 2019 1 commit
  2. 02 Aug, 2019 1 commit
  3. 01 Aug, 2019 1 commit
  4. 30 Jul, 2019 1 commit
  5. 19 Jul, 2019 1 commit
  6. 17 Jul, 2019 1 commit
    • Nucleus (top-P) sampling (#710) · e46b924d
      Xing Zhou authored
      Summary:
      Implement Nucleus (top-P) sampling: sample among the smallest set of elements whose cumulative probability mass exceeds p.
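
      A minimal illustrative sketch of the top-P filtering step (in PyTorch, not the fairseq code itself; shapes and names are simplified to a single 1-D distribution):

      import torch

      def top_p_filter(probs: torch.Tensor, p: float) -> torch.Tensor:
          # Sort probabilities, find the smallest prefix whose cumulative
          # mass exceeds p, zero out the rest, and renormalize.
          sorted_probs, sorted_idx = probs.sort(descending=True)
          cumulative = sorted_probs.cumsum(dim=-1)
          keep = int((cumulative < p).sum().item()) + 1
          filtered = torch.zeros_like(probs)
          filtered[sorted_idx[:keep]] = sorted_probs[:keep]
          return filtered / filtered.sum()

      # Sampling then draws from the truncated, renormalized distribution:
      probs = torch.softmax(torch.randn(32000), dim=-1)
      next_token = torch.multinomial(top_p_filter(probs, p=0.3), num_samples=1)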
      
      To test it:
      python generate.py   ~myleott/data/data-bin/wmt17_zh_en_full/   --path ~myleott/zh_en/model.pt   --remove-bpe   --nbest 5   --beam 5 --sampling --sampling-topp 0.3
      Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/710
      
      Test Plan:
      python generate.py   ~myleott/data/data-bin/wmt17_zh_en_full/   --path ~myleott/zh_en/model.pt   --remove-bpe   --nbest 5   --beam 5 --sampling --sampling-topp 0.3
      
      python tests/test_sequence_generator.py
      
      python tests/test_binaries.py
      
      Reviewed By: myleott
      
      Differential Revision: D16286688
      
      Pulled By: xingz9
      
      fbshipit-source-id: 1776d21e17c4532a3d24ac75bb7e75da9acad58f
  7. 11 Jun, 2019 2 commits
  8. 02 Jun, 2019 1 commit
  9. 04 May, 2019 1 commit
  10. 22 Apr, 2019 1 commit
  11. 12 Mar, 2019 1 commit
    • Handle 3+ dimensional input in sequence_generator + nits · 860010e9
      Dmytro Okhonko authored
      Summary: sequence_generator assumes that the model input is a 2D tensor of longs. But it can be something like a 3D tensor of floats, and we should be able to handle that as long as the first dimension is the batch size followed by the source lengths.
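
      An illustrative sketch of the shape handling (not the actual fairseq code; src_tokens is the usual encoder input name):

      import torch

      def input_shape(src_tokens: torch.Tensor):
          # 2D input: batch x src_len token ids (longs).
          # 3D+ input: batch x src_len x feat_dim floats; only the first two
          # dimensions matter for batching and length bookkeeping.
          if src_tokens.dim() > 2:
              bsz, src_len = src_tokens.size()[:2]
          else:
              bsz, src_len = src_tokens.size()
          return bsz, src_len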
      
      Reviewed By: myleott
      
      Differential Revision: D14420044
      
      fbshipit-source-id: bf8b1e42ad1873f7b803c1a377b0af21648db015
  12. 26 Feb, 2019 1 commit
  13. 22 Feb, 2019 2 commits
  14. 16 Feb, 2019 1 commit
  15. 05 Jan, 2019 1 commit
  16. 26 Dec, 2018 1 commit
    • Merge internal changes (#422) · 8ce6499d
      Myle Ott authored
      Summary:
      - 04cc608: Add `--match-source-len` option to generate.py for sequence-tagging tasks
      - 19f1a40: Add `--no-repeat-ngram-size` option to generate.py for ngram blocking (a sketch of the blocking logic is included below)
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/422
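
      A sketch of the n-gram blocking idea behind `--no-repeat-ngram-size` (simplified; not the fairseq implementation):

      def banned_tokens(hypothesis, no_repeat_ngram_size):
          # Return the set of tokens that, if generated next, would repeat an
          # n-gram already present in the partial hypothesis.
          n = no_repeat_ngram_size
          if n == 0 or len(hypothesis) < n - 1:
              return set()
          prefix = tuple(hypothesis[-(n - 1):]) if n > 1 else ()
          banned = set()
          for i in range(len(hypothesis) - n + 1):
              ngram = tuple(hypothesis[i:i + n])
              if ngram[:-1] == prefix:
                  banned.add(ngram[-1])
          return banned

      # Example: after generating [5, 7, 5] with no_repeat_ngram_size=2,
      # token 7 is banned because it would repeat the bigram (5, 7).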
      
      Differential Revision: D13548445
      
      Pulled By: myleott
      
      fbshipit-source-id: 26d1ae83993e428fcb020dac5ae358b0e36233d9
  17. 30 Nov, 2018 1 commit
  18. 30 Sep, 2018 1 commit
  19. 25 Sep, 2018 5 commits
  20. 03 Sep, 2018 2 commits
  21. 25 Jul, 2018 2 commits
  22. 21 Jun, 2018 1 commit
  23. 15 Jun, 2018 10 commits
    • Myle Ott
    • Fix tests · 55dc4842
      Myle Ott authored
    • Updates for latest PyTorch · e89329d6
      Myle Ott authored
    • Add FairseqTask · ff68a9ef
      Myle Ott authored
      A Task defines the data format, stores shared state (e.g., dictionaries) and provides helpers for building the model/criterion and calculating the loss.
      
      Changes:
      - Add TranslationTask and LanguageModelingTask. New tasks can be registered with the @register_task decorator (see the sketch below).
      - Add EpochBatchIterator to encapsulate batching and saving/restoring dataloader position
      - Remove LEFT_PAD_* constants and make them configurable per task
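
      A minimal sketch of registering a custom task (method names follow the FairseqTask interface only loosely and may differ from the actual base class):

      from fairseq.tasks import FairseqTask, register_task

      @register_task('my_translation')
      class MyTranslationTask(FairseqTask):

          @staticmethod
          def add_args(parser):
              # task-specific command-line arguments
              parser.add_argument('data', help='path to the data directory')

          @classmethod
          def setup_task(cls, args, **kwargs):
              # load dictionaries / other shared state, then build the task
              return cls(args)

          def load_dataset(self, split, **kwargs):
              # populate self.datasets[split] for 'train' / 'valid' / 'test'
              ...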
    • Myle Ott · fc87eea2
    • Nits · cf1c64a5
      Myle Ott authored
    • Angela Fan
    • Conv lm implementation · 4c2ef2de
      alexeib authored
      This implements the convolutional language model from https://arxiv.org/pdf/1612.08083.pdf
      
      There are 3 modes for constructing batches (a sketch of the logic follows the list):
      
      - token block: fill each sample with a specified number of tokens without regard for sentence delimiters - this is what was used for training in the paper
      - complete: fill each sample with a specified number of tokens but make sure it contains only complete sentences (i.e. if next sentence goes over token block limit, move it to the next sample) - this was used for evaluation in the paper
      - eos: one sentence per sample (skip blank lines)
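
      A simplified sketch of the three modes over a flat list of sentences (each a list of token ids); the real dataset code differs in details:

      def make_blocks(sentences, block_size, mode='token'):
          blocks, current = [], []
          for sent in sentences:
              if mode == 'eos':
                  if sent:                  # one non-empty sentence per sample
                      blocks.append(list(sent))
                  continue
              if mode == 'complete' and current and len(current) + len(sent) > block_size:
                  blocks.append(current)    # keep sentences whole; start a new sample
                  current = []
              current.extend(sent)
              while mode == 'token' and len(current) >= block_size:
                  blocks.append(current[:block_size])   # fill ignoring sentence boundaries
                  current = current[block_size:]
          if current:
              blocks.append(current)
          return blocks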
      
      Some results (perplexity):
      
      GCNN-13 - GBW - 37.46
      GCNN-14B - GBW - 33.88
      GCNN-8 - Wiki103 - 43.76
      GCNN-14 - Wiki103 - 35.66
      
      train:
      
      python train.py /private/home/abaevski/data/wiki103 --save-dir /tmp --fp16 --max-epoch 35 --save-interval 1 --save-interval-updates 1000 --keep-interval-updates 25 --arch fconv_lm --optimizer nag --lr 1.0 --lr-scheduler reduce_lr_on_plateau --lr-shrink 0.5 --decoder-embed-dim 280 --decoder-layers '[(850, 6)] * 3 + [(850,1)] + [(850,5)] * 4 + [(850,1)] + [(850,4)] * 3 + [(1024,4)] + [(2048, 4)]' --clip-norm 0.1 --dropout 0.2 --weight-decay 5e-06 --criterion cross_entropy --max-tokens 1024 --max-target-positions 1024 --seed 1 --log-format json --log-interval 500
      
      eval:
      
      python eval_lm.py ~abaevski/data/wiki103 --path '/checkpoint02/abaevski/2018-04-27/lm_wiki.fp16.mxup300000.fconv.adam.lrs=reduce_lr_on_plateau.emb280.layers(850,6)*3+(850,1)+(850,5)*4+(850,1)+(850,4)*3+(1024,1)+(2048,4).lr0.0005.clp0.1.drp0.3.wd0.0.crt=cross_entropy.mxtk2048.smptk256.seed1.ngpu8/checkpoint_last.pt'
    • Alexei Baevski · 9f1b37dd
    • Fix --prefix-size · 7f538f54
      Myle Ott authored