  1. 14 Nov, 2019 1 commit
  2. 30 Sep, 2019 1 commit
  3. 27 Sep, 2019 1 commit
    • Levenshtein Transformer paper code · 86857a58
      Changhan Wang authored
      Summary:
      Code for our NeurIPS paper [Levenshtein Transformer](https://arxiv.org/abs/1905.11006)
      * Added Levenshtein Transformer model, task and criterion class
      * Added iterative NAT Transformer, insertion Transformer and CMLM Transformer model classes for baselines
      * Added an option for prepending BOS to the dictionary class and translation task class
      
      Reviewed By: myleott
      
      Differential Revision: D17297372
      
      fbshipit-source-id: 54eca60831ae95dc721c2c34e882e1810ee575c7
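      A rough training invocation for the new model, as a sketch: the registered names `levenshtein_transformer`, `translation_lev` and `nat_loss` follow fairseq's non-autoregressive translation examples and may vary by version, and the dataset path is hypothetical.

      python train.py data-bin/wmt14_en_de_distill --task translation_lev --arch levenshtein_transformer --criterion nat_loss --noise random_delete --max-tokens 8000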
  4. 30 Jul, 2019 1 commit
  5. 17 Jul, 2019 1 commit
  6. 27 Jun, 2019 1 commit
  7. 30 Apr, 2019 1 commit
  8. 12 Mar, 2019 1 commit
    • Handle 3+ dimensional input in sequence_generator + nits · 860010e9
      Dmytro Okhonko authored
      Summary: sequence_generator assumes that the model input is a 2d tensor of longs, but it can be something like a 3d tensor of floats, and we should be able to handle this as long as the first dimension is the batch size, followed by source lengths.
      
      Reviewed By: myleott
      
      Differential Revision: D14420044
      
      fbshipit-source-id: bf8b1e42ad1873f7b803c1a377b0af21648db015
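      A minimal sketch of the shape handling this implies (illustrative, not the actual fairseq code):

      import torch

      def batch_and_src_sizes(src_tokens):
          # Works for 2d long input (batch x src_len) and 3d float input
          # (batch x src_len x feature_dim) alike: only the first two
          # dimensions are interpreted.
          bsz, src_len = src_tokens.size()[:2]
          return bsz, src_len

      bsz, src_len = batch_and_src_sizes(torch.zeros(8, 100, 40))  # 3d floats -> (8, 100)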
  9. 11 Mar, 2019 1 commit
    • Create fairseq_cli_lib · 7fc9a3be
      Matt Le authored
      Summary: This allows one to call fairseq_cli functions from within Python without dispatching to bash.
      
      Reviewed By: myleott
      
      Differential Revision: D14404719
      
      fbshipit-source-id: 044eb652045bb15fc40e72ecbaf6fb10df9f8c61
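      A sketch of the in-process call this enables (the `fairseq_cli` module path and `cli_main` entry point are assumptions about the package layout, and the dataset path is hypothetical):

      import sys
      from fairseq_cli import train  # assumed package layout

      # Build argv as if invoking train.py from the shell, then run in-process.
      sys.argv = ["train.py", "data-bin/iwslt14", "--arch", "fconv", "--max-tokens", "4000"]
      train.cli_main()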
  10. 28 Feb, 2019 1 commit
  11. 22 Feb, 2019 1 commit
  12. 16 Feb, 2019 1 commit
  13. 05 Feb, 2019 1 commit
  14. 30 Jan, 2019 1 commit
    • Merge internal changes (#483) · 42be3ebd
      Myle Ott authored
      Summary:
      Changelog:
      - `4889802`: sentencepiece output can now be detokenized with `--remove-bpe=sentencepiece` (fixes #331). Also added `--sacrebleu` for computing detokenized BLEU (usage sketch below).
      - `0d76427`: fix assertion error when training language model with dataset containing empty sentences
      - minor bug and style fixes
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/483
      
      Differential Revision: D13867899
      
      Pulled By: myleott
      
      fbshipit-source-id: 25c940b847fe270262ac8f5ac838407b3977fdda
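      A usage sketch for the new flags (data and checkpoint paths hypothetical):

      python generate.py data-bin/wmt17_en_de --path checkpoints/checkpoint_best.pt --remove-bpe=sentencepiece --sacrebleu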
  15. 16 Jan, 2019 1 commit
    • FIX: '--user-dir' on multi-gpu (#449) · 7853818c
      Davide Caroselli authored
      Summary:
      In a multi-GPU training scenario, the `train.py` script spawns new processes with `torch.multiprocessing.spawn`. Unfortunately, those child processes don't inherit the modules imported with `--user-dir`.
      
      This pull request fixes this problem: the custom module import is now explicit in every `main()` function.
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/449
      
      Differential Revision: D13676922
      
      Pulled By: myleott
      
      fbshipit-source-id: 520358d66155697885b878a37e7d0484bddbc1c6
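      The shape of the fix, sketched (simplified; the helper name and signature here are illustrative, not fairseq's exact API):

      import importlib.util
      import os
      import sys

      def import_user_module(user_dir):
          # Called explicitly at the top of each main(), so that workers
          # spawned by torch.multiprocessing.spawn re-import the custom
          # module instead of relying on inherited parent state.
          module_name = os.path.basename(os.path.normpath(user_dir))
          if module_name not in sys.modules:
              spec = importlib.util.spec_from_file_location(
                  module_name, os.path.join(user_dir, "__init__.py"))
              module = importlib.util.module_from_spec(spec)
              sys.modules[module_name] = module
              spec.loader.exec_module(module)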
  16. 05 Jan, 2019 1 commit
  17. 26 Dec, 2018 1 commit
    • Merge internal changes (#422) · 8ce6499d
      Myle Ott authored
      Summary:
      - 04cc608: Add `--match-source-len` option to generate.py for sequence-tagging tasks
      - 19f1a40: Add `--no-repeat-ngram-size` option to generate.py for ngram blocking (usage sketch below)
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/422
      
      Differential Revision: D13548445
      
      Pulled By: myleott
      
      fbshipit-source-id: 26d1ae83993e428fcb020dac5ae358b0e36233d9
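      A usage sketch for both options (dataset path hypothetical):

      python generate.py data-bin/tagging_data --path checkpoints/checkpoint_best.pt --match-source-len --no-repeat-ngram-size 3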
  18. 03 Sep, 2018 4 commits
  19. 25 Jul, 2018 2 commits
  20. 08 Jul, 2018 1 commit
  21. 21 Jun, 2018 1 commit
  22. 15 Jun, 2018 10 commits
    • Add FairseqTask · ff68a9ef
      Myle Ott authored
      A Task defines the data format, stores shared state (e.g., dictionaries) and provides helpers for building the model/criterion and calculating the loss.
      
      Changes:
      - Add TranslationTask and LanguageModelingTask. New tasks can be registered with the @register_task decorator (see the sketch after this list).
      - Add EpochBatchIterator to encapsulate batching and saving/restoring dataloader position
      - Remove LEFT_PAD_* constants and make them configurable per task
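      Registering a custom task then looks roughly like this (a sketch; everything beyond the import and decorator follows the FairseqTask interface of this era and is illustrative):

      from fairseq.tasks import FairseqTask, register_task

      @register_task('my_translation')
      class MyTranslationTask(FairseqTask):

          @staticmethod
          def add_args(parser):
              # Task-specific command-line arguments.
              parser.add_argument('--my-option', type=int, default=0)

          @classmethod
          def setup_task(cls, args, **kwargs):
              # Load dictionaries / shared state here.
              return cls(args)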
    • Unify various sharding into ShardedIterator · 24d7de44
      Myle Ott authored
    • 76b5ecab
      Myle Ott authored
    • Nits · cf1c64a5
      Myle Ott authored
    • Conv lm implementation · 4c2ef2de
      alexeib authored
      This implements the convolutional language model from https://arxiv.org/pdf/1612.08083.pdf
      
      There are 3 modes for constructing batches:
      
      - token block: fill each sample with a specified number of tokens without regard for sentence delimiters - this is what was used for training in the paper (see the sketch after this list)
      - complete: fill each sample with a specified number of tokens but make sure it contains only complete sentences (i.e. if the next sentence goes over the token block limit, move it to the next sample) - this was used for evaluation in the paper
      - eos: one sentence per sample (skip blank lines)
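      The token block mode amounts to something like this (an illustrative sketch, not the fairseq implementation):

      def token_blocks(token_stream, block_size):
          # Fill each sample with exactly block_size tokens, ignoring
          # sentence boundaries; the last block may be shorter.
          block = []
          for tok in token_stream:
              block.append(tok)
              if len(block) == block_size:
                  yield block
                  block = []
          if block:
              yield block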
      
      Some results (perplexity):
      
      GCNN-13 - GBW - 37.46
      GCNN-14B - GBW - 33.88
      GCNN-8 - Wiki103 - 43.76
      GCNN-14 - Wiki103 - 35.66
      
      train:
      
      python train.py /private/home/abaevski/data/wiki103 --save-dir /tmp --fp16 --max-epoch 35 --save-interval 1 --save-interval-updates 1000 --keep-interval-updates 25 --arch fconv_lm --optimizer nag --lr 1.0 --lr-scheduler reduce_lr_on_plateau --lr-shrink 0.5 --decoder-embed-dim 280 --decoder-layers '[(850, 6)] * 3 + [(850,1)] + [(850,5)] * 4 + [(850,1)] + [(850,4)] * 3 + [(1024,4)] + [(2048, 4)]' --clip-norm 0.1 --dropout 0.2 --weight-decay 5e-06 --criterion cross_entropy --max-tokens 1024 --max-target-positions 1024 --seed 1 --log-format json --log-interval 500
      
      eval:
      
      python eval_lm.py ~abaevski/data/wiki103 --path '/checkpoint02/abaevski/2018-04-27/lm_wiki.fp16.mxup300000.fconv.adam.lrs=reduce_lr_on_plateau.emb280.layers(850,6)*3+(850,1)+(850,5)*4+(850,1)+(850,4)*3+(1024,1)+(2048,4).lr0.0005.clp0.1.drp0.3.wd0.0.crt=cross_entropy.mxtk2048.smptk256.seed1.ngpu8/checkpoint_last.pt'
    • bf47b956
      Alexei Baevski authored
    • 67af40c9
      Alexei Baevski authored
    • remove completed sentences from batch · 2a84f46b
      Alexei Baevski authored
      Remove completed sentences from the batch and allow batching of uneven lengths (with fixes to make padded sequences work correctly in all models).
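      Conceptually, the pruning looks like this (an illustrative sketch, not the fairseq code):

      import torch

      def prune_finished(tokens, scores, finished_mask):
          # finished_mask: bool tensor of shape (batch,). Keep only rows
          # whose hypotheses are still unfinished, so later decoder steps
          # run on a smaller batch.
          keep = (~finished_mask).nonzero(as_tuple=False).squeeze(1)
          return tokens.index_select(0, keep), scores.index_select(0, keep)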
  23. 02 Apr, 2018 1 commit
    • Merge internal changes (#136) · d3795d6c
      Myle Ott authored
      Changes:
      - 7d19e36: Add `--sampling` flag to generate.py to sample instead of doing beam search
      - c777340: Add `scripts/average_checkpoints.py` to average multiple checkpoints into a combined model (usage sketch after this list)
      - 3ea882c: Add `--max-update` option to train.py to stop training after a given number of updates
      - small bugfixes for distributed training, LSTM, inverse square root LR scheduler
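      Usage sketches for the new options (paths hypothetical):

      python scripts/average_checkpoints.py --inputs checkpoints --num-epoch-checkpoints 5 --output checkpoints/averaged.pt
      python generate.py data-bin/iwslt14 --path checkpoints/averaged.pt --sampling --beam 1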
  24. 05 Mar, 2018 1 commit
  25. 01 Mar, 2018 1 commit
  26. 27 Feb, 2018 2 commits