1. 06 Oct, 2018 2 commits
  2. 04 Oct, 2018 1 commit
    • Option to remove EOS at source in backtranslation dataset · b9e29a47
      Liezl Puzon authored
      Summary:
      If we want our parallel data to have EOS at the end of source, we keep the EOS at the end of the generated source dialect backtranslation.
      If we don't want our parallel data to have EOS at the end of source, we **remove** the EOS at the end of the generated source dialect backtranslation.
      
      Note: we always want EOS at the end of our target / reference in parallel data so our model can learn to generate sentences of arbitrary length. So we make sure that the original target has an EOS before returning a batch of {generated src, original target}. If the original targets in the tgt dataset don't have an EOS, we append EOS to each tgt sample before collating.
      We only do this for the purpose of collating a {generated src, original tgt} batch AFTER generating the backtranslations. We don't enforce any EOS before passing tgt to the tgt->src model for generating the backtranslation. Users of this dataset are expected to format tgt dataset examples in the format that the tgt->src model expects.
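
      A minimal sketch of the EOS handling described above, in plain PyTorch; the helper names and tensor layout are illustrative assumptions, not the actual BacktranslationDataset code:

          import torch

          def append_eos_if_missing(tgt_sample, eos_idx):
              # Targets must end in EOS so the model learns when to stop generating.
              if tgt_sample[-1].item() != eos_idx:
                  return torch.cat([tgt_sample, tgt_sample.new_tensor([eos_idx])])
              return tgt_sample

          def maybe_strip_source_eos(generated_src, eos_idx, keep_src_eos):
              # Drop the trailing EOS of the generated source only when the parallel
              # data we train on does not use EOS at the end of source.
              if not keep_src_eos and generated_src[-1].item() == eos_idx:
                  return generated_src[:-1]
              return generated_src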
      
      Reviewed By: jmp84
      
      Differential Revision: D10157725
      
      fbshipit-source-id: eb6a15f13c651f7c435b8db28103c9a8189845fb
  3. 03 Oct, 2018 2 commits
    • Fix proxying in DistributedFairseqModel · fc677c94
      Myle Ott authored
      Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/302
      
      Differential Revision: D10174608
      
      Pulled By: myleott
      
      fbshipit-source-id: 4e2dfc76eae97afc5488f29b47e74f9897a643ff
    • Pass in kwargs and SequenceGenerator class to init BacktranslationDataset · f766c9a0
      Liezl Puzon authored
      Summary: This generalizes BacktranslationDataset to allow us to use any SequenceGenerator class. For example, if we want to use this dataset in PyTorch Translate, we can pass the following to the BacktranslationDataset init: (1) a PyTorch Translate SequenceGenerator class as generator_class and (2) the appropriate args for initializing that class as kwargs.
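
      A hedged sketch of what such an init call might look like; the import paths, the keyword names other than generator_class, and the values below are assumptions rather than the exact constructor signature:

          from fairseq.data import BacktranslationDataset           # assumed import path
          from fairseq.sequence_generator import SequenceGenerator  # assumed import path

          def build_backtranslation_dataset(tgt_dataset, tgt_dict, tgt_to_src_model):
              # Pass any SequenceGenerator-like class plus the kwargs needed to
              # construct it; the values here are placeholders for illustration.
              return BacktranslationDataset(
                  tgt_dataset=tgt_dataset,
                  tgt_dict=tgt_dict,
                  backtranslation_model=tgt_to_src_model,
                  generator_class=SequenceGenerator,
                  beam_size=2,
                  max_len_a=1.1,
                  max_len_b=10.0,
              )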
      
      Reviewed By: xianxl
      
      Differential Revision: D10156552
      
      fbshipit-source-id: 0495d825bf4727da96d0d9a40dc434135ff3486c
  4. 02 Oct, 2018 1 commit
    • Explicitly list out generation args for backtranslation dataset · 86e93f2b
      Liezl Puzon authored
      Summary:
      Using an argparse Namespace hides the actual args that are expected and makes the code harder to read.

      Note the difference in style for the args list:
      
          def __init__(
              self,
              tgt_dataset,
              tgt_dict,
              backtranslation_model,
              unkpen,
              sampling,
              beam,
              max_len_a,
              max_len_b,
          ):
      
      instead of
      
          def __init__(
              self, tgt_dataset, tgt_dict, backtranslation_model, unkpen, sampling,
              beam,  max_len_a, max_len_b,
          ):
      
      Reviewed By: dpacgopinath
      
      Differential Revision: D10152331
      
      fbshipit-source-id: 6539ccba09d48acf23759996b7e32fb329b3e3f6
  5. 30 Sep, 2018 1 commit
  6. 25 Sep, 2018 8 commits
  7. 03 Sep, 2018 9 commits
  8. 25 Jul, 2018 1 commit
  9. 25 Jun, 2018 2 commits
  10. 24 Jun, 2018 1 commit
  11. 21 Jun, 2018 2 commits
  12. 15 Jun, 2018 10 commits
    • Fix bidirectional lstm · bfcc6ec7
      Myle Ott authored
    • Updates for latest PyTorch · e89329d6
      Myle Ott authored
    • Add FairseqTask · ff68a9ef
      Myle Ott authored
      A Task defines the data format, stores shared state (e.g., dictionaries) and provides helpers for building the model/criterion and calculating the loss.
      
      Changes:
      - Add TranslationTask and LanguageModelingTask. New tasks can be registered with the @register_task decorator (see the sketch after this list).
      - Add EpochBatchIterator to encapsulate batching and saving/restoring dataloader position
      - Remove LEFT_PAD_* constants and make them configurable per task
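
      A minimal sketch of registering a new task under this design; the import path and the class below are illustrative, and a real task would also implement dataset loading:

          from fairseq.tasks import FairseqTask, register_task  # assumed import path

          @register_task('toy_translation')  # then selectable on the CLI via --task toy_translation
          class ToyTranslationTask(FairseqTask):
              """Illustrative skeleton only; real tasks also build datasets and dictionaries."""

              @staticmethod
              def add_args(parser):
                  # Task-specific command-line arguments go here.
                  parser.add_argument('--toy-option', type=int, default=1)

              @classmethod
              def setup_task(cls, args, **kwargs):
                  # Shared state (e.g., dictionaries) would be loaded/created here.
                  return cls(args)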
    • Myle Ott's avatar
      16a72b4d
    • Suppress stdout in test_train · 736fbee2
      Myle Ott authored
    • Nits · cf1c64a5
      Myle Ott authored
    • record end_of_epoch in checkpoint · 7d560402
      alexeib authored
    • Conv lm implementation · 4c2ef2de
      alexeib authored
      This implements the convolutional language model from https://arxiv.org/pdf/1612.08083.pdf
      
      There are 3 modes for constructing batches (a short sketch follows the list):
      
      - token block: fill each sample with a specified number of tokens without regard for sentence delimiters - this is what was used for training in the paper
      - complete: fill each sample with a specified number of tokens but make sure it contains only complete sentences (i.e. if next sentence goes over token block limit, move it to the next sample) - this was used for evaluation in the paper
      - eos: one sentence per sample (skip blank lines)
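
      A rough sketch of the three modes, assuming each sentence is already a list of token ids; this is illustrative only, not the actual dataset implementation:

          def make_samples(sentences, mode, block_size):
              """Chunk a corpus into samples according to the three modes above."""
              if mode == 'eos':
                  # One sentence per sample, skipping blank lines.
                  return [sent for sent in sentences if sent]
              samples, current = [], []
              for sent in sentences:
                  if mode == 'complete' and current and len(current) + len(sent) > block_size:
                      # The next sentence would overflow the block: start a new sample.
                      samples.append(current)
                      current = []
                  current = current + sent
                  while mode == 'token_block' and len(current) >= block_size:
                      # Fill exactly block_size tokens, ignoring sentence boundaries.
                      samples.append(current[:block_size])
                      current = current[block_size:]
              if current:
                  samples.append(current)
              return samples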
      
      Some results (perplexity):
      
      GCNN-13 - GBW - 37.46
      GCNN-14B - GBW - 33.88
      GCNN-8 - Wiki103 - 43.76
      GCNN-14 - Wiki103 - 35.66
      
      train:
      
      python train.py /private/home/abaevski/data/wiki103 --save-dir /tmp --fp16 --max-epoch 35 --save-interval 1 --save-interval-updates 1000 --keep-interval-updates 25 --arch fconv_lm --optimizer nag --lr 1.0 --lr-scheduler reduce_lr_on_plateau --lr-shrink 0.5 --decoder-embed-dim 280 --decoder-layers '[(850, 6)] * 3 + [(850,1)] + [(850,5)] * 4 + [(850,1)] + [(850,4)] * 3 + [(1024,4)] + [(2048, 4)]' --clip-norm 0.1 --dropout 0.2 --weight-decay 5e-06 --criterion cross_entropy --max-tokens 1024 --max-target-positions 1024 --seed 1 --log-format json --log-interval 500
      
      eval:
      
      python eval_lm.py ~abaevski/data/wiki103 --path '/checkpoint02/abaevski/2018-04-27/lm_wiki.fp16.mxup300000.fconv.adam.lrs=reduce_lr_on_plateau.emb280.layers(850,6)*3+(850,1)+(850,5)*4+(850,1)+(850,4)*3+(1024,1)+(2048,4).lr0.0005.clp0.1.drp0.3.wd0.0.crt=cross_entropy.mxtk2048.smptk256.seed1.ngpu8/checkpoint_last.pt'
    • Fix tests · ae2585d9
      Myle Ott authored