Commits · 12258e5798a7b89d46443c1c80dc6f281637807e · OpenDAS / Fairseq

03 Aug, 2019 1 commit

Fix generating with a fixed prefix · 12258e57

Myle Ott authored Aug 03, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/801

Differential Revision: D16628318

Pulled By: myleott

fbshipit-source-id: 50e93bb9108afd2ba90f1edd4f34306a7c9964a4

12258e57

02 Aug, 2019 1 commit

Update beam search code to support torch.bool change · 5f342527

Myle Ott authored Aug 02, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/797

Differential Revision: D16617067

Pulled By: myleott

fbshipit-source-id: 52e3aeb98d6e3b55ff9154b784028bf13eabfe38

5f342527

01 Aug, 2019 1 commit

Fix sampling with beam>1 · 4abadbdf

Myle Ott authored Aug 01, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/792

Differential Revision: D16591987

Pulled By: myleott

fbshipit-source-id: d27c490ae75f80ded19226b8384f4776485dd694

4abadbdf

30 Jul, 2019 1 commit

Relicense fairseq under MIT license (#786) · e75cff5f

Myle Ott authored Jul 30, 2019

Summary:
The previous BSD+PATENTS license was controversial. We have been
approved to relicense fairseq under the MIT license.
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/786

Differential Revision: D16560654

Pulled By: myleott

fbshipit-source-id: f78b1beb4f2895dd7b9bfc79f5f952a2bfb94034

e75cff5f

19 Jul, 2019 1 commit

Improve interactive generation (support --tokenizer and --bpe) · 8af55542

Myle Ott authored Jul 19, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/734

Differential Revision: D16377044

Pulled By: myleott

fbshipit-source-id: 37d5553d76aa7c653113fec089f59710281c31d7

8af55542

17 Jul, 2019 1 commit

Nucleus (top-P) sampling (#710) · e46b924d

Xing Zhou authored Jul 17, 2019

Summary:
Implement Nucleus (top-P) sampling: sample among the smallest set of elements whose cumulative probability mass exceeds p.
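
The selection rule described above can be sketched in plain Python. This is a minimal illustration, not the fairseq implementation: the helper names `top_p_filter` and `sample_top_p` are hypothetical, and the input is assumed to be a list of probabilities that already sums to 1.

```python
import random


def top_p_filter(probs, p):
    """Return (indices, renormalized probs) for the smallest set of
    elements whose cumulative probability mass exceeds p."""
    # Visit candidates from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass > p:
            break
    total = sum(probs[i] for i in kept)
    return kept, [probs[i] / total for i in kept]


def sample_top_p(probs, p, rng=random):
    """Draw one index from the truncated, renormalized distribution."""
    kept, renorm = top_p_filter(probs, p)
    return rng.choices(kept, weights=renorm, k=1)[0]
```

With `probs = [0.5, 0.3, 0.15, 0.05]` and `p = 0.7`, only the first two elements are kept, because 0.5 alone does not exceed 0.7 but 0.5 + 0.3 does.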

To test it:
python generate.py   ~myleott/data/data-bin/wmt17_zh_en_full/   --path ~myleott/zh_en/model.pt   --remove-bpe   --nbest 5   --beam 5 --sampling --sampling-topp 0.3
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/710

Test Plan:
python generate.py   ~myleott/data/data-bin/wmt17_zh_en_full/   --path ~myleott/zh_en/model.pt   --remove-bpe   --nbest 5   --beam 5 --sampling --sampling-topp 0.3

python tests/test_sequence_generator.py

python tests/test_binaries.py

Reviewed By: myleott

Differential Revision: D16286688

Pulled By: xingz9

fbshipit-source-id: 1776d21e17c4532a3d24ac75bb7e75da9acad58f

e46b924d

11 Jun, 2019 2 commits

Add exception for bsz=1 with prefix generation (#796) · 1b937bb2

Myle Ott authored Jun 11, 2019

Summary:
This is a temporary workaround to support sampling after https://github.com/pytorch/fairseq/issues/713. We'll need to revisit this to support sampling and beam more generally.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/796

Differential Revision: D15760808

Pulled By: myleott

fbshipit-source-id: ecaf4f161b0c30de037f32007e4610a559a49230

1b937bb2

when given prefix_tokens, sequence generator would generate (exactly) same... · 9dc9a486

yilinyang7 authored Jun 11, 2019

when given prefix_tokens, sequence generator would generate (exactly) same finished candidates (#713)

Summary:
https://github.com/pytorch/fairseq/issues/712
Pull Request resolved: https://github.com/pytorch/fairseq/pull/713

Differential Revision: D15242432

Pulled By: myleott

fbshipit-source-id: a230ee48f4bf891c805609c428d7233a0ad21179

9dc9a486

02 Jun, 2019 1 commit

Fix rearranging of encoder_out in SequenceGenerator · b35d9bca

Myle Ott authored Jun 02, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/625

Differential Revision: D15595787

Pulled By: myleott

fbshipit-source-id: ba6edf305ed41be392194f492e034dd66d1743fe

b35d9bca

04 May, 2019 1 commit

Fix and generalize --temperature option (#508) · 96ac28d3

Myle Ott authored May 04, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/translate/pull/508

The previous version applied the temperature after the softmax. Fix that, and
also generalize so it works with other search approaches.
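
As a rough illustration of the corrected ordering, the sketch below divides the logits by the temperature before the softmax. The function names are hypothetical and the code uses plain Python lists rather than tensors; it does not reproduce the previous behavior or the actual fairseq code.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]


def probs_with_temperature(logits, temperature=1.0):
    """Scale the logits by 1/temperature, then normalize."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    return softmax([x / temperature for x in logits])
```

Temperatures below 1 sharpen the distribution and temperatures above 1 flatten it; a temperature of 1 leaves the softmax unchanged.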
Pull Request resolved: https://github.com/pytorch/fairseq/pull/694

Differential Revision: D15175160

Pulled By: myleott

fbshipit-source-id: cc87ff0e97a8a1dd37f9983163f58a8641155ab0

96ac28d3

22 Apr, 2019 1 commit

Fix generation with --no-early-stop (#627) · fa52d202

Max Ryabinin authored Apr 22, 2019

Summary:
Because the size of `unfinalized_scores` is equal to current `bsz` and not initial batch size, we need to index it by `unfin_idx` instead of `sent` in `is_finished`.
Fixes #588.
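
The indexing problem can be shown with a toy example. The names `unfinalized_scores`, `unfin_idx` and `sent` come from the description above; the helper functions and the data are hypothetical, and the real generator tracks this state differently.

```python
def remaining_sentences(finished):
    """Original sentence indices that are still being decoded."""
    return [sent for sent, done in enumerate(finished) if not done]


def scores_by_sentence(finished, unfinalized_scores):
    """Pair each remaining sentence with its score.

    unfinalized_scores has one entry per remaining sentence (the current
    bsz), so it is indexed by unfin_idx, the position in the shrunken
    batch, and not by sent, the position in the initial batch.
    """
    result = {}
    for unfin_idx, sent in enumerate(remaining_sentences(finished)):
        result[sent] = unfinalized_scores[unfin_idx]
    return result


# Initial batch of 5; sentences 1 and 3 are already finalized.
finished = [False, True, False, True, False]
unfinalized_scores = [-1.2, -0.7, -2.4]
print(scores_by_sentence(finished, unfinalized_scores))
# {0: -1.2, 2: -0.7, 4: -2.4}
```

Indexing `unfinalized_scores[sent]` in this example would raise an `IndexError` for `sent = 4`, since only three scores remain.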
Pull Request resolved: https://github.com/pytorch/fairseq/pull/627

Differential Revision: D15034641

Pulled By: myleott

fbshipit-source-id: 2638e68e877ae01256cac7d8e69b5b7fec8f7017

fa52d202

12 Mar, 2019 1 commit

Handle 3+ dimensional input in sequence_generator + nits · 860010e9

Dmytro Okhonko authored Mar 12, 2019

Summary: sequence_generator assumes that the model input is a 2D tensor of longs. However, the input can be something like a 3D tensor of floats, and we should be able to handle this as long as the first dimension is the batch size, followed by the source lengths.
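
One way to picture the requirement is to read only the first two dimensions of the input and ignore the rest. The sketch below uses nested Python lists in place of tensors; `shape_of` and `batch_and_source_len` are hypothetical helpers, and the input is assumed to be rectangular and non-empty.

```python
def shape_of(nested):
    """Shape of a rectangular nested list, e.g. [[1, 2], [3, 4]] -> (2, 2)."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)


def batch_and_source_len(model_input):
    """Read batch size and source length from the first two dimensions,
    ignoring any trailing (e.g. feature) dimensions."""
    shape = shape_of(model_input)
    return shape[0], shape[1]


tokens_2d = [[4, 5, 6], [7, 8, 9]]            # 2 sentences, 3 tokens each
features_3d = [[[0.1, 0.2]] * 4] * 2          # 2 inputs, 4 frames, 2 floats
print(batch_and_source_len(tokens_2d))        # (2, 3)
print(batch_and_source_len(features_3d))      # (2, 4)
```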

Reviewed By: myleott

Differential Revision: D14420044

fbshipit-source-id: bf8b1e42ad1873f7b803c1a377b0af21648db015

860010e9

26 Feb, 2019 1 commit

Support LM generation from interactive.py (fixes #526) · 98daf039

Myle Ott authored Feb 25, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/528

Differential Revision: D14218377

Pulled By: myleott

fbshipit-source-id: facb0a32f6aebf56a4fea7259080394ad2d2d846

98daf039

22 Feb, 2019 2 commits

Add code for mixture of experts (#521) · 4294c4f6

Myle Ott authored Feb 22, 2019

Summary:
Code for the paper: [Mixture Models for Diverse Machine Translation: Tricks of the Trade (Shen et al., 2019)](https://arxiv.org/abs/1902.07816).
Pull Request resolved: https://github.com/pytorch/fairseq/pull/521

Differential Revision: D14188021

Pulled By: myleott

fbshipit-source-id: ed5b1ed5ad9a582359bd5215fa2ea26dc76c673e

4294c4f6

Modularize generate.py (#351) · b65c579b

Myle Ott authored Feb 22, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/translate/pull/351

This makes it easier for tasks to plug into generate.py/interactive.py
Pull Request resolved: https://github.com/pytorch/fairseq/pull/520

Differential Revision: D14183881

Pulled By: myleott

fbshipit-source-id: ede5e53ddc1215ed3b12b8f1eba048c946913c33

b65c579b

16 Feb, 2019 1 commit

Merge internal changes · 9998bbfa

Myle Ott authored Feb 15, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/505

Differential Revision: D14110201

Pulled By: myleott

fbshipit-source-id: 099ce61fa386c016f3a1d1815c6fe1a9a6c9005d

9998bbfa

05 Jan, 2019 1 commit

Merge internal changes (#283) · 7633129b

Myle Ott authored Jan 04, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/translate/pull/283

Pull Request resolved: https://github.com/pytorch/fairseq/pull/428

Differential Revision: D13564190

Pulled By: myleott

fbshipit-source-id: 3b62282d7069c288f5bdd1dd2c120788cee4abb5

7633129b

26 Dec, 2018 1 commit

Merge internal changes (#422) · 8ce6499d

Myle Ott authored Dec 26, 2018

Summary:
- 04cc608: Add `--match-source-len` option to generate.py for sequence-tagging tasks
- 19f1a40: Add `--no-repeat-ngram-size` option to generate.py for ngram blocking
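
To illustrate what n-gram blocking means, the sketch below computes which next tokens would repeat an n-gram already present in a hypothesis. It is a simplified illustration under the assumption that blocking is applied per hypothesis at each step; `banned_next_tokens` is a hypothetical helper, not the function used in generate.py.

```python
def banned_next_tokens(tokens, n):
    """Tokens that would complete an n-gram already present in `tokens`."""
    if n <= 0 or len(tokens) < n:
        return set()
    # The last n-1 generated tokens form the prefix of the next n-gram.
    prefix = tuple(tokens[len(tokens) - n + 1:])
    banned = set()
    for i in range(len(tokens) - n + 1):
        # If an earlier n-gram starts with the same prefix, its final
        # token must not be generated next.
        if tuple(tokens[i:i + n - 1]) == prefix:
            banned.add(tokens[i + n - 1])
    return banned


print(banned_next_tokens([5, 6, 7, 5, 6], 3))  # {7}
```

A generator could then assign these tokens a score of negative infinity before choosing the next step.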
Pull Request resolved: https://github.com/pytorch/fairseq/pull/422

Differential Revision: D13548445

Pulled By: myleott

fbshipit-source-id: 26d1ae83993e428fcb020dac5ae358b0e36233d9

8ce6499d

30 Nov, 2018 1 commit

fixed torch 0.4.0 , "RuntimeError: Expected object of type torch.cuda… (#393) · 9dd87245

linkerr authored Nov 30, 2018

Summary:
….LongTensor but found type torch.cuda.FloatTensor for argument #3 'index' " error

With torch.__version__ == 0.4.0,
new_order = torch.arange(bsz).view(-1, 1).repeat(1, beam_size).view(-1)
returns a float dtype Tensor, so executing line 321 of fairseq/fairseq/models/fconv.py throws a RuntimeError.
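
For readers unfamiliar with the expression, the sketch below builds the same index pattern with plain Python integers. It is only an analogue for illustration: `beam_new_order` and `reorder_rows` are hypothetical names, and it does not show the change made in this commit.

```python
def beam_new_order(bsz, beam_size):
    """Integer analogue of
    arange(bsz).view(-1, 1).repeat(1, beam_size).view(-1):
    each sentence index is repeated beam_size times."""
    return [sent for sent in range(bsz) for _ in range(beam_size)]


def reorder_rows(rows, new_order):
    """Select rows by index; the indices must be integers."""
    return [rows[i] for i in new_order]


order = beam_new_order(2, 3)
print(order)                            # [0, 0, 0, 1, 1, 1]
print(reorder_rows(["a", "b"], order))  # ['a', 'a', 'a', 'b', 'b', 'b']
```

Plain Python lists reject float indices with a `TypeError`, which loosely mirrors the index dtype requirement behind the reported error.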
Pull Request resolved: https://github.com/pytorch/fairseq/pull/393

Differential Revision: D13276496

Pulled By: myleott

fbshipit-source-id: e7986246fbe2c79fff61bcab0e5bec9dd63e0afd

9dd87245

30 Sep, 2018 1 commit
- fbshipit-source-id: 6a835d32f9dc5e0de118f1b46d365d0e0cc85e11 · f8377a70
  myleott authored Sep 30, 2018
  
  f8377a70
25 Sep, 2018 5 commits
- core changes to support latte collab · cfd2a3a0
  Alexei Baevski authored Sep 20, 2018
  
  cfd2a3a0
- Pass encoder_input to generator, rather than src_tokens/src_lengths. · bfeb7732
  Stephen Roller authored Sep 08, 2018
  
  bfeb7732
- Revert sequence generator changes · 311d2c6c
  Myle Ott authored Sep 06, 2018
  
  311d2c6c
- Sequence generator bug fix. · 0714080b
  Stephen Roller authored Sep 05, 2018
  
  0714080b
- Generator: net_input instead of manual src_tokens. · e6d45d5c
  Stephen Roller authored Sep 05, 2018
  
  e6d45d5c
03 Sep, 2018 2 commits
- Diverse Beam Search · 8c0ca1a0
  Myle Ott authored Aug 10, 2018
  
  8c0ca1a0
- Factor out search logic in SequenceGenerator · ef43da72
  Myle Ott authored Aug 09, 2018
  
  ef43da72
25 Jul, 2018 2 commits
- Iterate on need_attn and fix tests · bb5f15d1
  Myle Ott authored Jul 12, 2018
  
  bb5f15d1
- disable printing alignment by default (for perf) and add a flag to enable it · 89e19d42
  Alexei Baevski authored Jul 10, 2018
  
  89e19d42
21 Jun, 2018 1 commit
- Move reorder_encoder_out to FairseqEncoder and fix non-incremental decoding · 6ec5022e
  Myle Ott authored Jun 21, 2018
  
  6ec5022e
15 Jun, 2018 10 commits

Faster generation when using a single model (rather than ensemble) · ef179415

Myle Ott authored Jun 14, 2018

ef179415

Fix tests · 55dc4842

Myle Ott authored Jun 12, 2018

55dc4842

Updates for latest PyTorch · e89329d6

Myle Ott authored Jun 12, 2018

e89329d6

Add FairseqTask · ff68a9ef

Myle Ott authored Jun 12, 2018

A Task defines the data format, stores shared state (e.g., dictionaries) and provides helpers for building the model/criterion and calculating the loss.

Changes:
- Add TranslationTask and LanguageModelingTask. New tasks can be registered with @register_task decorator.
- Add EpochBatchIterator to encapsulate batching and saving/restoring dataloader position
- Remove LEFT_PAD_* constants and make them configurable per task
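
The decorator-based registration mentioned above follows a common registry pattern, sketched here in a generic form. Apart from the `@register_task` name, everything in this block (`TASK_REGISTRY`, `BaseTask`, `ToyTranslationTask`, `setup_task`) is a hypothetical stand-in and should not be read as the code added in this commit.

```python
TASK_REGISTRY = {}  # hypothetical: maps a task name to its class


def register_task(name):
    """Class decorator that records a task class under `name`."""
    def decorator(cls):
        if name in TASK_REGISTRY:
            raise ValueError(f"Cannot register duplicate task ({name})")
        TASK_REGISTRY[name] = cls
        return cls
    return decorator


class BaseTask:
    """Hypothetical base class holding shared state such as a dictionary."""

    def __init__(self, dictionary):
        self.dictionary = dictionary


@register_task("toy_translation")
class ToyTranslationTask(BaseTask):
    pass


def setup_task(name, dictionary):
    """Look up a registered task by name and instantiate it."""
    return TASK_REGISTRY[name](dictionary)


task = setup_task("toy_translation", {"hello": 0})
print(type(task).__name__)  # ToyTranslationTask
```

The benefit of this pattern is that new tasks can be selected by name without editing the code that looks them up.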

ff68a9ef

Fix length penalty when combined with --no-early-stop · fc87eea2

Myle Ott authored Jun 11, 2018

Co-authored-by: pmichel31415 <pmichel@fb.com>

fc87eea2

Nits · cf1c64a5

Myle Ott authored May 30, 2018

cf1c64a5

added multiscale gated self attention layer with multiple heads, and pretrained fusion models · b59815bc

Angela Fan authored May 09, 2018

b59815bc

Conv lm implementation · 4c2ef2de

alexeib authored May 25, 2018

This implements the convolutional language model from https://arxiv.org/pdf/1612.08083.pdf

There are 3 modes for constructing batches:

- token block: fill each sample with a specified number of tokens without regard for sentence delimiters - this is what was used for training in the paper
- complete: fill each sample with a specified number of tokens but make sure it contains only complete sentences (i.e. if next sentence goes over token block limit, move it to the next sample) - this was used for evaluation in the paper
- eos: one sentence per sample (skip blank lines)
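
The three modes above can be sketched on a toy corpus. This is an illustrative approximation only: the function name `build_samples`, the mode strings, and the handling of edge cases (a short final block, or a single sentence longer than the block size) are assumptions, not the behavior of the committed code.

```python
def build_samples(sentences, block_size, mode):
    """Group tokenized sentences (lists of tokens) into samples."""
    if mode == "eos":
        # One sentence per sample, skipping blank lines.
        return [list(s) for s in sentences if s]
    if mode == "token_block":
        # Ignore sentence boundaries: cut the token stream into blocks.
        stream = [tok for s in sentences for tok in s]
        return [stream[i:i + block_size]
                for i in range(0, len(stream), block_size)]
    if mode == "complete":
        # Fill up to block_size tokens, but only with whole sentences.
        samples, current = [], []
        for s in sentences:
            if not s:
                continue
            if current and len(current) + len(s) > block_size:
                samples.append(current)
                current = []
            current = current + list(s)
        if current:
            samples.append(current)
        return samples
    raise ValueError(f"unknown mode: {mode}")


corpus = [[1, 2, 3], [4, 5], [], [6, 7, 8, 9]]
print(build_samples(corpus, 5, "token_block"))  # [[1, 2, 3, 4, 5], [6, 7, 8, 9]]
print(build_samples(corpus, 4, "token_block"))  # [[1, 2, 3, 4], [5, 6, 7, 8], [9]]
print(build_samples(corpus, 5, "complete"))     # [[1, 2, 3, 4, 5], [6, 7, 8, 9]]
print(build_samples(corpus, 5, "eos"))          # [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
```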

some results:

GCNN-13 - GBW - 37.46
GCNN-14B - GBW - 33.88
GCNN-8 - Wiki103 - 43.76
GCNN-14 - Wiki103 - 35.66

train:

python train.py /private/home/abaevski/data/wiki103 --save-dir /tmp --fp16 --max-epoch 35 --save-interval 1 --save-interval-updates 1000 --keep-interval-updates 25 --arch fconv_lm --optimizer nag --lr 1.0 --lr-scheduler reduce_lr_on_plateau --lr-shrink 0.5 --decoder-embed-dim 280 --decoder-layers '[(850, 6)] * 3 + [(850,1)] + [(850,5)] * 4 + [(850,1)] + [(850,4)] * 3 + [(1024,4)] + [(2048, 4)]' --clip-norm 0.1 --dropout 0.2 --weight-decay 5e-06 --criterion cross_entropy --max-tokens 1024 --max-target-positions 1024 --seed 1 --log-format json --log-interval 500

eval:

python eval_lm.py ~abaevski/data/wiki103 --path '/checkpoint02/abaevski/2018-04-27/lm_wiki.fp16.mxup300000.fconv.adam.lrs=reduce_lr_on_plateau.emb280.layers(850,6)*3+(850,1)+(850,5)*4+(850,1)+(850,4)*3+(1024,1)+(2048,4).lr0.0005.clp0.1.drp0.3.wd0.0.crt=cross_entropy.mxtk2048.smptk256.seed1.ngpu8/checkpoint_last.pt'

4c2ef2de

fix alignment when using uneven batches and left pad · 9f1b37dd

Alexei Baevski authored May 11, 2018

9f1b37dd

Fix --prefix-size · 7f538f54

Myle Ott authored Apr 30, 2018

7f538f54