Commits · 1c6679294848f303a361cba7b306b760e299bd9c · OpenDAS / Fairseq

30 Sep, 2019 1 commit

Implementation of the paper "Jointly Learning to Align and Translate with Transformer Models" (#877) · 1c667929

Sarthak Garg authored Sep 30, 2019

Implementation of the paper "Jointly Learning to Align and Translate with Transformer Models" (#877)

Summary:
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/877

This PR implements guided alignment training described in "Jointly Learning to Align and Translate with Transformer Models (https://arxiv.org/abs/1909.02074)".

In summary, it allows selected attention heads of the Transformer model to be trained with external alignments computed by statistical alignment toolkits. During inference, attention probabilities from the trained heads can be used to extract reliable alignments. In our work, guided alignment training did not cause any regressions in translation performance.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/1095

Differential Revision: D17170337

Pulled By: myleott

fbshipit-source-id: daa418bef70324d7088dbb30aa2adf9f95774859

1c667929
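
The commit message above says that attention probabilities from the trained heads can be used to extract alignments, but it does not say which extraction rule the PR uses. A minimal sketch, assuming the simplest rule (for each target token, pick the source position with the highest attention probability), could look like this. The function name `extract_alignment` and the `attn[target][source]` layout are hypothetical and are not taken from fairseq.

```python
from typing import List, Tuple


def extract_alignment(attn: List[List[float]]) -> List[Tuple[int, int]]:
    """Return (source_index, target_index) pairs from an attention matrix.

    attn[t][s] is the probability that target position t attends to source
    position s. This layout is an assumption made for the sketch.
    """
    alignment = []
    for tgt_idx, row in enumerate(attn):
        # Hard alignment: the source position with the largest probability.
        src_idx = max(range(len(row)), key=row.__getitem__)
        alignment.append((src_idx, tgt_idx))
    return alignment


# Toy attention matrix: 3 target tokens over 3 source tokens.
attn = [
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
    [0.2, 0.6, 0.2],
]
print(extract_alignment(attn))  # [(0, 0), (2, 1), (1, 2)]
```

Argmax is only one possible choice; thresholding the probabilities or combining several heads would be alternatives that this sketch does not cover.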

30 Jul, 2019 1 commit

Relicense fairseq under MIT license (#786) · e75cff5f

Myle Ott authored Jul 30, 2019

Summary:
The previous BSD+PATENTS license was controversial. We have been
approved to relicense fairseq under the MIT license.
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/786

Differential Revision: D16560654

Pulled By: myleott

fbshipit-source-id: f78b1beb4f2895dd7b9bfc79f5f952a2bfb94034

e75cff5f

21 Jul, 2019 1 commit

Rename data.transforms -> data.encoders · f812e529

Myle Ott authored Jul 21, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/747

Differential Revision: D16403464

Pulled By: myleott

fbshipit-source-id: ee3b4184f129a02be833c7bdc00685978b4de883

f812e529

19 Jul, 2019 1 commit

Improve interactive generation (support --tokenizer and --bpe) · 8af55542

Myle Ott authored Jul 19, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/734

Differential Revision: D16377044

Pulled By: myleott

fbshipit-source-id: 37d5553d76aa7c653113fec089f59710281c31d7

8af55542

10 Jun, 2019 1 commit

More generator features for demo (#791) · 4868c182

Myle Ott authored Jun 10, 2019

Summary:
- make it possible to load file_utils.py without the dependencies
- add some more demo features
Pull Request resolved: https://github.com/pytorch/fairseq/pull/791

Differential Revision: D15739950

Pulled By: myleott

fbshipit-source-id: 38df5209973a6fe2e3651575b97134e096aaf5bf

4868c182

08 May, 2019 1 commit

Cleanup LM + Flake8 · f2563c21

Myle Ott authored May 08, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/720

Differential Revision: D15259091

Pulled By: myleott

fbshipit-source-id: 06a35996c06ccddb49fdc9e01e348ff3c9da334e

f2563c21

30 Apr, 2019 1 commit

Merge internal changes (#654) · d45db804

Myle Ott authored Apr 29, 2019

Summary:
- Add --add-bos-token option to LM task
- Cleanup utils.py and options.py
Pull Request resolved: https://github.com/pytorch/fairseq/pull/654

Differential Revision: D15041794

Pulled By: myleott

fbshipit-source-id: 3ad00007769d5f48308052cfd40de39c5ffa1a6e

d45db804

29 Mar, 2019 1 commit

Output original IDs in interactive.py (in case some rows are filtered; fixes #591) · 2340832f

Myle Ott authored Mar 28, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/606

Differential Revision: D14680968

Pulled By: myleott

fbshipit-source-id: 8044d828a8167199c10f2aee24f7e611feb91802

2340832f

19 Mar, 2019 1 commit

Remove unused import · b26d6b58

Myle Ott authored Mar 19, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/587

Differential Revision: D14517597

Pulled By: myleott

fbshipit-source-id: 4831ea5a9da1c2e207529a4ab3c4d0b070f5f34e

b26d6b58

28 Feb, 2019 1 commit

Move string line encoding logic from tokenizer to Dictionary (unified diff). (#541) · f296824f

Vladimir Karpukhin authored Feb 28, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/541

Just a combination of the stacked pair D14057943 & D14176011.
Made this a separate diff because there seems to be an issue with porting a stacked change into the GitHub repo.

Differential Revision: D14251048

fbshipit-source-id: 0a47f534a69d6ab2ebe035fba40fd51748cccfb8

f296824f

26 Feb, 2019 1 commit

Support LM generation from interactive.py (fixes #526) · 98daf039

Myle Ott authored Feb 25, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/528

Differential Revision: D14218377

Pulled By: myleott

fbshipit-source-id: facb0a32f6aebf56a4fea7259080394ad2d2d846

98daf039

22 Feb, 2019 1 commit

Modularize generate.py (#351) · b65c579b

Myle Ott authored Feb 22, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/translate/pull/351

This makes it easier for tasks to plug into generate.py/interactive.py
Pull Request resolved: https://github.com/pytorch/fairseq/pull/520

Differential Revision: D14183881

Pulled By: myleott

fbshipit-source-id: ede5e53ddc1215ed3b12b8f1eba048c946913c33

b65c579b

16 Feb, 2019 1 commit

Merge internal changes · 9998bbfa

Myle Ott authored Feb 15, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/505

Differential Revision: D14110201

Pulled By: myleott

fbshipit-source-id: 099ce61fa386c016f3a1d1815c6fe1a9a6c9005d

9998bbfa

05 Feb, 2019 1 commit

Add standalone binaries · 829bd8ce

Myle Ott authored Feb 05, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/489

Differential Revision: D13956810

Pulled By: myleott

fbshipit-source-id: 61ace179d1d3790226c38b3f3e47f5452b5ec514

829bd8ce

30 Jan, 2019 1 commit

Add --input option to interactive.py to support reading from file · 3dce7c9f

Myle Ott authored Jan 30, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/484

Differential Revision: D13880636

Pulled By: myleott

fbshipit-source-id: 984b2e1c3b281c28243102eb971ea45ec891d94e

3dce7c9f

16 Jan, 2019 1 commit

FIX: '--user-dir' on multi-gpu (#449) · 7853818c

Davide Caroselli authored Jan 16, 2019

Summary:
In a multi-GPU training scenario, the `train.py` script spawns new processes with `torch.multiprocessing.spawn`. Unfortunately, those child processes don't inherit the modules imported with `--user-dir`.

This pull request fixes the problem: the custom module import is now explicit in every `main()` function.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/449

Differential Revision: D13676922

Pulled By: myleott

fbshipit-source-id: 520358d66155697885b878a37e7d0484bddbc1c6

7853818c
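
The fix described in the commit above (7853818c) amounts to having each entry point import the user module itself instead of relying on the parent process. The following is a rough sketch of that idea using only the standard library. The helper `import_user_module`, the `main` signature, and the package name `demo_user_plugins` are hypothetical and are not copied from fairseq.

```python
import importlib
import os
import sys
import tempfile


def import_user_module(user_dir: str) -> None:
    """Import the package located at `user_dir` (hypothetical helper).

    Safe to call repeatedly: the import only happens once per process.
    """
    user_dir = os.path.abspath(user_dir)
    parent, name = os.path.split(user_dir)
    if name not in sys.modules:
        # Make the package's parent directory importable, then import it so
        # that any module-level side effects (e.g. registrations) run here.
        sys.path.insert(0, parent)
        importlib.import_module(name)


def main(user_dir: str) -> None:
    # Every entry point performs the import explicitly, so a freshly spawned
    # child process that calls main() sees the user code as well.
    import_user_module(user_dir)
    # The real work of the entry point would follow here.


# Demo: create a throwaway package and load it through main().
with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = os.path.join(tmp, "demo_user_plugins")
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        f.write("LOADED = True\n")
    main(pkg_dir)
    print(sys.modules["demo_user_plugins"].LOADED)
```

The demo runs in a single process for simplicity; in the multi-GPU case each spawned worker would call `main()` and therefore repeat the import on its own.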

05 Jan, 2019 1 commit

Merge internal changes (#283) · 7633129b

Myle Ott authored Jan 04, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/translate/pull/283

Pull Request resolved: https://github.com/pytorch/fairseq/pull/428

Differential Revision: D13564190

Pulled By: myleott

fbshipit-source-id: 3b62282d7069c288f5bdd1dd2c120788cee4abb5

7633129b

26 Dec, 2018 1 commit

Merge internal changes (#422) · 8ce6499d

Myle Ott authored Dec 26, 2018

Summary:
- 04cc608: Add `--match-source-len` option to generate.py for sequence-tagging tasks
- 19f1a40: Add `--no-repeat-ngram-size` option to generate.py for ngram blocking
Pull Request resolved: https://github.com/pytorch/fairseq/pull/422

Differential Revision: D13548445

Pulled By: myleott

fbshipit-source-id: 26d1ae83993e428fcb020dac5ae358b0e36233d9

8ce6499d
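
The n-gram blocking mentioned for `--no-repeat-ngram-size` in the commit above can be illustrated with a small sketch: before choosing the next token, find every token that would complete an n-gram already present in the generated prefix. The function `banned_next_tokens` is a hypothetical name, and this is an illustration of the general technique, not fairseq's implementation.

```python
from typing import List, Set


def banned_next_tokens(prefix: List[int], n: int) -> Set[int]:
    """Tokens that would repeat an n-gram already contained in `prefix`."""
    if n <= 0 or len(prefix) < n:
        return set()
    # The last n-1 generated tokens are the context of the next n-gram.
    context = tuple(prefix[len(prefix) - (n - 1):])
    banned = set()
    for i in range(len(prefix) - n + 1):
        ngram = prefix[i:i + n]
        # If an earlier n-gram starts with the same context, its final token
        # must not be generated again in this position.
        if tuple(ngram[:-1]) == context:
            banned.add(ngram[-1])
    return banned


# The prefix ends in (5, 6), and (5, 6, 7) already occurred, so 7 is banned.
print(banned_next_tokens([5, 6, 7, 5, 6], n=3))  # {7}
```

A decoder would typically apply the result by setting the scores of the banned tokens to negative infinity before picking the next token.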

25 Sep, 2018 1 commit
- Pass encoder_input to generator, rather than src_tokens/src_lengths. · bfeb7732
  Stephen Roller authored Sep 08, 2018
  
  bfeb7732
03 Sep, 2018 3 commits
- Add documentation · 6381cc97
  Myle Ott authored Sep 03, 2018
  
  6381cc97
- Clean up FairseqTask so that it's easier to extend/add new tasks · 2e507d3c
  Myle Ott authored Aug 30, 2018
  
  2e507d3c
- Diverse Beam Search · 8c0ca1a0
  Myle Ott authored Aug 10, 2018
  
  8c0ca1a0
25 Jul, 2018 2 commits
- Output positional scores in interactive.py · c37fc8fd
  Myle Ott authored Jul 12, 2018
  
  c37fc8fd
- Iterate on need_attn and fix tests · bb5f15d1
  Myle Ott authored Jul 12, 2018
  
  bb5f15d1
19 Jul, 2018 1 commit
- Pass sampling-temperature through to the generator in interactive.py · eaa576b0
  Sergey Edunov authored Jul 19, 2018
  
  eaa576b0
08 Jul, 2018 1 commit
- adding model arg override at generation time for interactive.py · 1d79ed9b
  Angela Fan authored Jul 08, 2018
  
  1d79ed9b
25 Jun, 2018 1 commit
- Remove more Variable() calls (#198) · 6edf81dd
  Myle Ott authored Jun 25, 2018
  
  6edf81dd
21 Jun, 2018 1 commit
- Support FP16 during inference · 930c9580
  Myle Ott authored Jun 19, 2018
  
  930c9580
15 Jun, 2018 7 commits

Change --path to be colon-separated instead of comma-separated · 16caed31

Myle Ott authored Jun 14, 2018

16caed31

Add FairseqTask · ff68a9ef

Myle Ott authored Jun 12, 2018

A Task defines the data format, stores shared state (e.g., dictionaries) and provides helpers for building the model/criterion and calculating the loss.

Changes:
- Add TranslationTask and LanguageModelingTask. New tasks can be registered with @register_task decorator.
- Add EpochBatchIterator to encapsulate batching and saving/restoring dataloader position
- Remove LEFT_PAD_* constants and make them configurable per task

ff68a9ef
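
The `@register_task` decorator mentioned in the commit above follows a common registry pattern: a decorator stores each task class in a dictionary under a name, so that a task can later be looked up by that name. The sketch below shows the pattern in isolation. `Task`, `TASK_REGISTRY`, and `ToyTranslationTask` are hypothetical stand-ins and do not reproduce fairseq's classes or signatures.

```python
from typing import Callable, Dict, Type


class Task:
    """Minimal stand-in for a task base class (not fairseq's FairseqTask)."""

    def __init__(self, left_pad_source: bool = True):
        # Padding direction is a per-task setting rather than a global constant.
        self.left_pad_source = left_pad_source


TASK_REGISTRY: Dict[str, Type[Task]] = {}


def register_task(name: str) -> Callable[[Type[Task]], Type[Task]]:
    """Class decorator that records a Task subclass under `name`."""

    def wrapper(cls: Type[Task]) -> Type[Task]:
        if name in TASK_REGISTRY:
            raise ValueError(f"Cannot register duplicate task: {name}")
        if not issubclass(cls, Task):
            raise ValueError(f"{cls.__name__} must extend Task")
        TASK_REGISTRY[name] = cls
        return cls

    return wrapper


@register_task("toy_translation")
class ToyTranslationTask(Task):
    pass


# Look the task up by name, as a command-line option might do.
task = TASK_REGISTRY["toy_translation"](left_pad_source=False)
print(type(task).__name__, task.left_pad_source)  # ToyTranslationTask False
```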

Migrate all binaries to use options.parse_args_and_arch · 76b5ecab

Myle Ott authored May 30, 2018

76b5ecab

added multiscale gated self attention layer with multiple heads, and pretrained fusion models · b59815bc

Angela Fan authored May 09, 2018

b59815bc

Conv lm implementation · 4c2ef2de

alexeib authored May 25, 2018

This implements the convolutional language model from https://arxiv.org/pdf/1612.08083.pdf

There are 3 modes for constructing batches:

- token block: fill each sample with a specified number of tokens without regard for sentence delimiters - this is what was used for training in the paper
- complete: fill each sample with a specified number of tokens but make sure it contains only complete sentences (i.e. if next sentence goes over token block limit, move it to the next sample) - this was used for evaluation in the paper
- eos: one sentence per sample (skip blank lines)
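
The three batch construction modes listed above can be sketched on toy data as follows. This is an illustrative reading of the descriptions, not the code from the commit. The function `build_blocks` and the mode strings `"token_block"`, `"complete"`, and `"eos"` are hypothetical names chosen for this example.

```python
from typing import List


def build_blocks(sentences: List[List[str]], block_size: int, mode: str) -> List[List[str]]:
    """Group tokenized sentences into samples under one of three modes."""
    if mode == "eos":
        # One sentence per sample; blank lines (empty sentences) are skipped.
        return [list(s) for s in sentences if s]
    if mode == "token_block":
        # Ignore sentence boundaries: cut the flat token stream into blocks.
        stream = [tok for s in sentences for tok in s]
        return [stream[i:i + block_size] for i in range(0, len(stream), block_size)]
    if mode == "complete":
        # Fill blocks with whole sentences only. A sentence that would cross
        # the limit starts the next sample. Assumption of this sketch: a
        # sentence longer than block_size becomes its own oversized sample.
        blocks: List[List[str]] = []
        current: List[str] = []
        for s in sentences:
            if not s:
                continue
            if current and len(current) + len(s) > block_size:
                blocks.append(current)
                current = []
            current = current + list(s)
        if current:
            blocks.append(current)
        return blocks
    raise ValueError(f"Unknown mode: {mode}")


sentences = [["a", "b", "c"], [], ["d", "e", "f"], ["g", "h"]]
for mode in ("token_block", "complete", "eos"):
    print(mode, build_blocks(sentences, block_size=5, mode=mode))
```

With a block size of 5, the token block mode splits the second sentence across two samples, the complete mode moves it to the next sample instead, and the eos mode yields three samples of one sentence each.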

some results:

GCNN-13 - GBW - 37.46
GCNN-14B - GBW - 33.88
GCNN-8 - Wiki103 - 43.76
GCNN-14 - Wiki103 - 35.66

train:

python train.py /private/home/abaevski/data/wiki103 --save-dir /tmp --fp16 --max-epoch 35 --save-interval 1 --save-interval-updates 1000 --keep-interval-updates 25 --arch fconv_lm --optimizer nag --lr 1.0 --lr-scheduler reduce_lr_on_plateau --lr-shrink 0.5 --decoder-embed-dim 280 --decoder-layers '[(850, 6)] * 3 + [(850,1)] + [(850,5)] * 4 + [(850,1)] + [(850,4)] * 3 + [(1024,4)] + [(2048, 4)]' --clip-norm 0.1 --dropout 0.2 --weight-decay 5e-06 --criterion cross_entropy --max-tokens 1024 --max-target-positions 1024 --seed 1 --log-format json --log-interval 500

eval:

python eval_lm.py ~abaevski/data/wiki103 --path '/checkpoint02/abaevski/2018-04-27/lm_wiki.fp16.mxup300000.fconv.adam.lrs=reduce_lr_on_plateau.emb280.layers(850,6)*3+(850,1)+(850,5)*4+(850,1)+(850,4)*3+(1024,1)+(2048,4).lr0.0005.clp0.1.drp0.3.wd0.0.crt=cross_entropy.mxtk2048.smptk256.seed1.ngpu8/checkpoint_last.pt'

4c2ef2de

implement batching in interactive mode · 663fd806

Alexei Baevski authored May 11, 2018

663fd806

Sampling doesn't work with interactive · 4ce453b1

Sergey Edunov authored May 10, 2018

4ce453b1

01 May, 2018 2 commits
- Disallow --batch-size in interactive.py · 56099c74
  Myle Ott authored May 01, 2018
  
  56099c74
- make interactive mode print out alignment nicely · 6532e32b
  alexeib authored Apr 11, 2018
  
  6532e32b
02 Apr, 2018 1 commit

Merge internal changes (#136) · d3795d6c

Myle Ott authored Apr 02, 2018

Changes:
- 7d19e36: Add `--sampling` flag to generate.py to sample instead of doing beam search
- c777340: Add `scripts/average_checkpoints.py` to average multiple checkpoints into a combined model
- 3ea882c: Add `--max-update` option to train.py to stop training after a given number of updates
- small bugfixes for distributed training, LSTM, inverse square root LR scheduler

d3795d6c
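
Averaging multiple checkpoints into a combined model, as mentioned for `scripts/average_checkpoints.py` in the commit above, means taking the element-wise mean of each parameter across checkpoints. The sketch below shows that idea with plain Python lists standing in for tensors. It is not the script's actual code, and the function signature and parameter names are hypothetical.

```python
from typing import Dict, List


def average_checkpoints(states: List[Dict[str, List[float]]]) -> Dict[str, List[float]]:
    """Element-wise mean of parameters across several checkpoints."""
    if not states:
        raise ValueError("Need at least one checkpoint")
    keys = states[0].keys()
    for state in states:
        # All checkpoints must describe the same set of parameters.
        if state.keys() != keys:
            raise KeyError("Checkpoints have different parameter names")
    n = len(states)
    return {
        key: [sum(values) / n for values in zip(*(state[key] for state in states))]
        for key in keys
    }


ckpt_a = {"w": [1.0, 2.0], "b": [0.0]}
ckpt_b = {"w": [3.0, 4.0], "b": [1.0]}
print(average_checkpoints([ckpt_a, ckpt_b]))  # {'w': [2.0, 3.0], 'b': [0.5]}
```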

27 Feb, 2018 2 commits

More unit test fixes · 0d90e35f

Myle Ott authored Feb 15, 2018

0d90e35f

fairseq-py goes distributed (#106) · 66415206

Myle Ott authored Feb 27, 2018

This PR includes breaking API changes to modularize fairseq-py and adds support for distributed training across multiple nodes.

Changes:
- c7033ef: add support for distributed training! See updated README for usage.
- e016299: modularize fairseq-py, adding support for register_model, register_criterion, register_optimizer, etc.
- 154e440: update LSTM implementation to use PackedSequence objects in the encoder, better following best practices and improving perf
- 90c2973 and 1da6265: improve unit test coverage

66415206