1. 03 Oct, 2018 1 commit
    • Liezl Puzon's avatar
      Pass in kwargs and SequenceGenerator class to init BacktranslationDataset · f766c9a0
      Liezl Puzon authored
      Summary: This generalizes BacktranslationDataset to allow us to use any SequenceGenerator class. For example, if we want to use this model in PyTorch Translate, we can pass the following to the BacktranslationDataset init: (1) a PyTorch Translate SequenceGenerator class as generator_class and (2) the appropriate args for initializing that class as kwargs.
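
      A minimal sketch of what the call site might look like after this change, assuming the generator_class/kwargs pass-through described above; MySequenceGenerator and the specific keyword args shown are hypothetical placeholders rather than the actual PyTorch Translate API:

          from fairseq.data import BacktranslationDataset  # assumed import path

          # Hypothetical generator: any class with a SequenceGenerator-like interface.
          from my_project.generators import MySequenceGenerator

          dataset = BacktranslationDataset(
              tgt_dataset=tgt_dataset,
              tgt_dict=tgt_dict,
              backtranslation_model=backtranslation_model,
              generator_class=MySequenceGenerator,  # (1) the generator class to instantiate
              # (2) kwargs forwarded to MySequenceGenerator's __init__ (illustrative names)
              beam_size=2,
              sampling=True,
              max_len_a=1.2,
              max_len_b=10,
          )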
      
      Reviewed By: xianxl
      
      Differential Revision: D10156552
      
      fbshipit-source-id: 0495d825bf4727da96d0d9a40dc434135ff3486c
      f766c9a0
  2. 02 Oct, 2018 1 commit
    • Liezl Puzon's avatar
      Explicitly list out generation args for backtranslation dataset · 86e93f2b
      Liezl Puzon authored
      Summary:
      Using an argparse Namespace hides which args are actually expected and makes the code harder to read.
      
      Note the difference in style for the args list:
      
          def __init__(
              self,
              tgt_dataset,
              tgt_dict,
              backtranslation_model,
              unkpen,
              sampling,
              beam,
              max_len_a,
              max_len_b,
          ):
      
      instead of
      
          def __init__(
              self, tgt_dataset, tgt_dict, backtranslation_model, unkpen, sampling,
              beam,  max_len_a, max_len_b,
          ):
      
      Reviewed By: dpacgopinath
      
      Differential Revision: D10152331
      
      fbshipit-source-id: 6539ccba09d48acf23759996b7e32fb329b3e3f6
      86e93f2b
  3. 30 Sep, 2018 1 commit
  4. 25 Sep, 2018 8 commits
  5. 03 Sep, 2018 9 commits
  6. 25 Jul, 2018 1 commit
  7. 25 Jun, 2018 2 commits
  8. 24 Jun, 2018 1 commit
  9. 21 Jun, 2018 2 commits
  10. 15 Jun, 2018 11 commits
    • Myle Ott's avatar
      Fix bidirectional lstm · bfcc6ec7
      Myle Ott authored
      bfcc6ec7
    • Myle Ott's avatar
      Updates for latest PyTorch · e89329d6
      Myle Ott authored
      e89329d6
    • Myle Ott's avatar
      Add FairseqTask · ff68a9ef
      Myle Ott authored
      A Task defines the data format, stores shared state (e.g., dictionaries) and provides helpers for building the model/criterion and calculating the loss.
      
      Changes:
      - Add TranslationTask and LanguageModelingTask. New tasks can be registered with the @register_task decorator (see the sketch below).
      - Add EpochBatchIterator to encapsulate batching and saving/restoring dataloader position
      - Remove LEFT_PAD_* constants and make them configurable per task
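
      A rough sketch of registering a custom task under this scheme, assuming the FairseqTask base class and @register_task decorator described above; the task name, command-line args, and dictionary handling are illustrative only:

          from fairseq.data import Dictionary
          from fairseq.tasks import FairseqTask, register_task  # assumed import path

          @register_task('toy_translation')  # hypothetical task name
          class ToyTranslationTask(FairseqTask):
              """Stores shared state (e.g. dictionaries) and builds datasets/models."""

              @classmethod
              def setup_task(cls, args, **kwargs):
                  # Illustrative: load shared state, then construct the task.
                  src_dict = Dictionary.load(args.src_dict_path)  # hypothetical arg
                  tgt_dict = Dictionary.load(args.tgt_dict_path)  # hypothetical arg
                  return cls(args, src_dict, tgt_dict)

              def __init__(self, args, src_dict, tgt_dict):
                  super().__init__(args)
                  self.src_dict = src_dict
                  self.tgt_dict = tgt_dict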
      ff68a9ef
    • Myle Ott's avatar
      16a72b4d
    • Myle Ott's avatar
      Suppress stdout in test_train · 736fbee2
      Myle Ott authored
      736fbee2
    • Myle Ott's avatar
      Nits · cf1c64a5
      Myle Ott authored
      cf1c64a5
    • alexeib's avatar
      record end_of_epoch in checkpoint · 7d560402
      alexeib authored
      7d560402
    • alexeib's avatar
    • alexeib's avatar
      Conv lm implementation · 4c2ef2de
      alexeib authored
      This implements the convolutional language model from https://arxiv.org/pdf/1612.08083.pdf
      
      There are three modes for constructing batches (a rough sketch in code follows the list):
      
      - token block: fill each sample with a specified number of tokens without regard for sentence delimiters; this is what was used for training in the paper
      - complete: fill each sample with a specified number of tokens, but make sure it contains only complete sentences (i.e., if the next sentence would go over the token block limit, move it to the next sample); this was used for evaluation in the paper
      - eos: one sentence per sample (skip blank lines)
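
      A purely illustrative Python sketch of how the three modes could split a stream of tokenized sentences into samples; this is not the actual dataset code, and the function and argument names are made up:

          def make_samples(sentences, block_size, mode):
              """sentences: list of token lists; returns a list of samples (token lists)."""
              if mode == 'eos':
                  # One sentence per sample, skipping blank lines.
                  return [s for s in sentences if s]
              samples, current = [], []
              for sent in sentences:
                  if mode == 'complete' and current and len(current) + len(sent) > block_size:
                      # The next sentence would overflow the block, so start a new sample.
                      samples.append(current)
                      current = []
                  current = current + sent
                  while mode == 'token_block' and len(current) >= block_size:
                      # Fill blocks with exactly block_size tokens, ignoring sentence boundaries.
                      samples.append(current[:block_size])
                      current = current[block_size:]
              if current:
                  samples.append(current)
              return samples

      For example, make_samples(corpus, 512, 'complete') would yield samples of at most 512 tokens containing only whole sentences (except for sentences that are themselves longer than 512 tokens).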
      
      Some results (perplexity):
      
      GCNN-13 - GBW - 37.46
      GCNN-14B - GBW - 33.88
      GCNN-8 - Wiki103 - 43.76
      GCNN-14 - Wiki103 - 35.66
      
      train:
      
      python train.py /private/home/abaevski/data/wiki103 --save-dir /tmp --fp16 --max-epoch 35 --save-interval 1 --save-interval-updates 1000 --keep-interval-updates 25 --arch fconv_lm --optimizer nag --lr 1.0 --lr-scheduler reduce_lr_on_plateau --lr-shrink 0.5 --decoder-embed-dim 280 --decoder-layers '[(850, 6)] * 3 + [(850,1)] + [(850,5)] * 4 + [(850,1)] + [(850,4)] * 3 + [(1024,4)] + [(2048, 4)]' --clip-norm 0.1 --dropout 0.2 --weight-decay 5e-06 --criterion cross_entropy --max-tokens 1024 --max-target-positions 1024 --seed 1 --log-format json --log-interval 500
      
      eval:
      
      python eval_lm.py ~abaevski/data/wiki103 --path '/checkpoint02/abaevski/2018-04-27/lm_wiki.fp16.mxup300000.fconv.adam.lrs=reduce_lr_on_plateau.emb280.layers(850,6)*3+(850,1)+(850,5)*4+(850,1)+(850,4)*3+(1024,1)+(2048,4).lr0.0005.clp0.1.drp0.3.wd0.0.crt=cross_entropy.mxtk2048.smptk256.seed1.ngpu8/checkpoint_last.pt'
      4c2ef2de
    • Myle Ott's avatar
      Fix tests · ae2585d9
      Myle Ott authored
      ae2585d9
    • Myle Ott's avatar
      Fix tests · 8afb7761
      Myle Ott authored
      8afb7761
  11. 24 May, 2018 1 commit
  12. 02 Apr, 2018 1 commit
    • Myle Ott's avatar
      Merge internal changes (#136) · d3795d6c
      Myle Ott authored
      Changes (illustrative example invocations below):
      - 7d19e36: Add `--sampling` flag to generate.py to sample instead of doing beam search
      - c777340: Add `scripts/average_checkpoints.py` to average multiple checkpoints into a combined model
      - 3ea882c: Add `--max-update` option to train.py to stop training after a given number of updates
      - small bugfixes for distributed training, LSTM, inverse square root LR scheduler
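
      Illustrative invocations of the first three items; the data directory and checkpoint names are placeholders:

          # Sample from the model instead of running beam search (new --sampling flag).
          python generate.py data-bin/example --path checkpoint_best.pt --sampling

          # Average several checkpoints into a single combined model.
          python scripts/average_checkpoints.py --inputs checkpoint1.pt checkpoint2.pt --output averaged.pt

          # Stop training after a fixed number of updates (new --max-update option).
          python train.py data-bin/example --arch fconv --max-update 100000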
      d3795d6c
  13. 05 Mar, 2018 1 commit