Commits · d8d03745b343e455fe288debc69092be2496e47a · OpenDAS / Fairseq

25 Apr, 2019 1 commit

Added link to blog post (#662) · d8d03745

ankur6ue authored Apr 24, 2019

Summary:
Added link to blog post about incremental decoder in the FairseqIncrementalDecoder class description.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/662

Differential Revision: D15077845

Pulled By: myleott

fbshipit-source-id: f23294721739600e14feb2cca4ece95f2b968f44

22 Apr, 2019 1 commit

Fix generation with --no-early-stop (#627) · fa52d202

Max Ryabinin authored Apr 22, 2019

Summary:
Because the size of `unfinalized_scores` is equal to current `bsz` and not initial batch size, we need to index it by `unfin_idx` instead of `sent` in `is_finished`.
Fixes #588.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/627
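
A minimal sketch of the indexing problem, using plain lists instead of tensors. The state below is hypothetical and the helper name `best_unfinalized_score` is an assumption for illustration, not the code in the generator:

```python
def best_unfinalized_score(unfinalized_scores, unfin_idx):
    """Return the best score among the beams of one still-active sentence."""
    return max(unfinalized_scores[unfin_idx])


# Hypothetical state: the batch started with 4 sentences; sentences 0 and 2
# are finished and have been dropped, so only 2 rows of scores remain.
active_sents = [1, 3]  # original sentence ids still being decoded
unfinalized_scores = [[-1.2, -0.7], [-0.3, -2.0]]  # one row per active sentence

best = {}
for unfin_idx, sent in enumerate(active_sents):
    # Index by position in the current batch (unfin_idx), not by the
    # original sentence id (sent), which can exceed the current batch size.
    best[sent] = best_unfinalized_score(unfinalized_scores, unfin_idx)
```

Here indexing with `sent = 3` would run past the two remaining rows, which is the kind of mismatch the fix avoids.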

Differential Revision: D15034641

Pulled By: myleott

fbshipit-source-id: 2638e68e877ae01256cac7d8e69b5b7fec8f7017

17 Apr, 2019 3 commits

Open BlockPairDataset for MaskedLMData to work (#641) · d2f3007c

Kartikay Khandelwal authored Apr 17, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/641

Fix breaking import

Reviewed By: pipibjc

Differential Revision: D14978454

fbshipit-source-id: 7b43152cb30100881e9991ead871531ee3f60e07

Enable custom sampling strategy in MultiCorpusSampledDataset (#639) · 90d6eac2

Ning Dong authored Apr 16, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/639

Add argument sampling_func in the constructor to enable custom sampling over a list of dataset keys. The default strategy is to sample uniformly as it did previously.
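
A rough sketch of what a custom sampling function could look like. It assumes the function receives the list of dataset keys and returns one of them; the helper `make_weighted_sampler` and the example keys are hypothetical and not part of fairseq:

```python
import random


def make_weighted_sampler(weights, rng=None):
    """Build a sampling function that picks a dataset key by weight.

    `weights` maps dataset key -> non-negative weight (hypothetical helper).
    """
    rng = rng or random.Random()

    def sampling_func(keys):
        # Keys without an explicit weight are never sampled.
        key_weights = [weights.get(k, 0.0) for k in keys]
        return rng.choices(keys, weights=key_weights, k=1)[0]

    return sampling_func


# Favor one corpus over the other instead of sampling uniformly.
sampler = make_weighted_sampler({"en-de": 0.9, "en-fr": 0.1}, random.Random(0))
picked = sampler(["en-de", "en-fr"])
```

Such a function would then be passed as the `sampling_func` argument; leaving it unset keeps the uniform default described above.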

Reviewed By: liezl200

Differential Revision: D14965774

fbshipit-source-id: f3285688a9ae3729c0ba12c22254c1144d0eea9e

Black formatting for multi_corpus_sampled_dataset.py (#638) · 17cef3f6

Ning Dong authored Apr 16, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/638

RT

Reviewed By: liezl200

Differential Revision: D14967268

fbshipit-source-id: 2da361497743d90a841fdbf2a50085136c70b468

16 Apr, 2019 1 commit

Open Source MLM Implementation in Fairseq (#635) · 8776928c

Kartikay Khandelwal authored Apr 16, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/635

Adding a task and relevant models, datasets and criteria needed for training Cross-lingual Language Models similar to Masked Language Model used in XLM (Lample and Conneau, 2019 - https://arxiv.org/abs/1901.07291).

Reviewed By: liezl200

Differential Revision: D14943776

fbshipit-source-id: 3e416a730303d1dd4f5b92550c78db989be27073

15 Apr, 2019 2 commits

Better distributed init · 303b95ce

Myle Ott authored Apr 15, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/615

Differential Revision: D14933742

Pulled By: myleott

fbshipit-source-id: c2c20425875743c89bbc2ac564a2fbb6ff4958b2

Simplify and generalize utils.make_positions · e12e1d25

Myle Ott authored Apr 15, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/625

Differential Revision: D14822123

Pulled By: myleott

fbshipit-source-id: 8a263d30020588577ee02fb8c6959ff918705103

12 Apr, 2019 1 commit

Fix hybrid transformer state dict update after encoder layernorm rename (#633) · a47630e1

Liezl Puzon authored Apr 12, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/633

Pull Request resolved: https://github.com/pytorch/translate/pull/456

This diff makes it easier to upgrade the state dict for components that use TransformerEncoderLayer

Reviewed By: jhcross

Differential Revision: D14916941

fbshipit-source-id: 6d0258c8a9492a720684dadce59c90fc87cbf5cf

10 Apr, 2019 4 commits

Fix sacrebleu (#630) · 58b912f6

Xian Li authored Apr 10, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/630

The sacrebleu scorer has stopped working in pytorch_translate (and maybe in fairseq too), probably due to a recent API change.

Reviewed By: jmp84

Differential Revision: D14792797

fbshipit-source-id: c2a00246e08bc913c41e60c5fbf8ab4ab5e80d18

Make TransformerEncoderLayer layer norm names more descriptive · e5ba94ab

Liezl Puzon authored Apr 10, 2019

Summary:
I added an upgrade_state_dict function so that loading old models will still work

layer_norms[0] --> self_attn_layer_norm
layer_norms[1] --> final_layer_norm
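
The rename can be sketched on a plain dict standing in for a checkpoint. The function name `upgrade_layer_norm_names` and the key layout (`<prefix>.layer_norms.<i>.<param>`) are assumptions for illustration, not the actual upgrade_state_dict implementation:

```python
def upgrade_layer_norm_names(state_dict, prefix):
    """Rename old layer norm keys under `prefix` to the descriptive names."""
    renames = {"0": "self_attn_layer_norm", "1": "final_layer_norm"}
    for old, new in renames.items():
        for param in ("weight", "bias"):
            old_key = f"{prefix}.layer_norms.{old}.{param}"
            if old_key in state_dict:
                # Move the value to the new key so old checkpoints still load.
                state_dict[f"{prefix}.{new}.{param}"] = state_dict.pop(old_key)
    return state_dict


old_checkpoint = {
    "encoder.layers.0.layer_norms.0.weight": 1.0,
    "encoder.layers.0.layer_norms.1.bias": 2.0,
    "encoder.embed_tokens.weight": 3.0,
}
upgraded = upgrade_layer_norm_names(old_checkpoint, "encoder.layers.0")
```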

Reviewed By: pipibjc

Differential Revision: D14689849

fbshipit-source-id: b2809262c11fe9d083e571fa31044798aefd48ce

Add anneal-eps argument · 309f2511

Kritika Singh authored Apr 10, 2019

Summary: Used in fairspeq/train.py

Reviewed By: myleott, yqwangustc

Differential Revision: D14841512

fbshipit-source-id: 02fd7b58841c32e2797e3159e65f2bef36f02da1

Back translation + denoising in MultilingualTranslation task (#620) · d7e19573

Peng-Jen Chen authored Apr 10, 2019

Summary:
- Add language token to MultilingualTranslation task
- Add back translation and denoising loss to MultilingualTranslation task
Pull Request resolved: https://github.com/pytorch/fairseq/pull/620

Reviewed By: liezl200

Differential Revision: D14756873

Pulled By: pipibjc

fbshipit-source-id: 89d668db26848fd95f446edf5923bab2113636f7

09 Apr, 2019 1 commit

Rename embedding layers to be the same as NMT (#628) · c2820af0

Kartikay Khandelwal authored Apr 09, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/628

Updating embedding layers in TransformerSentenceEncoder to be compatible with the transformer model.

Reviewed By: liezl200

Differential Revision: D14836883

fbshipit-source-id: 2240f61bf40b191d01b4efdaac4dd7562b4166c6

05 Apr, 2019 3 commits

Eval and log on a subset of directions for multimodel training (#605) · 40ac340b

Liezl Puzon authored Apr 05, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/605

Eval and log on a subset of directions for multimodel training

This reduces code duplication in PyTorch Translate's semi_supervised task and will enable clean multitask setups in the future.

Reviewed By: pipibjc, dpacgopinath

Differential Revision: D14672779

fbshipit-source-id: 1342c71781f0824cc56a38ad1c1822e34eaef337

Refactor Fairseq models for BERT and XLM to use TransformerSentenceEncoder (#622) · f492db25

Kartikay Khandelwal authored Apr 04, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/622

Updating some defaults to more meaningful values

Reviewed By: rutyrinott

Differential Revision: D14761263

fbshipit-source-id: 7ac670aa370f315ddfb511c63273583a6062c569

Add Transformer Sentence Encoder for BERT and XLM Pre-training in PyText (#621) · f040158a

Kartikay Khandelwal authored Apr 04, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/621

In this commit, I add some modules to Fairseq that are needed to set up BERT/XLM-style pretraining.

Reviewed By: borguz

Differential Revision: D14719663

fbshipit-source-id: 1c5c36b6b2cde1c9bcd3c9e9ac853d2b7ae64102

04 Apr, 2019 1 commit

aligned training task and CE related changes · 3658fa32

Jay Mahadeokar authored Apr 03, 2019

Summary:
This diff adds:

1. Aligned training task specifically for cross-entropy criterion training using prod data and prod-like models.
2. A few changes to correctly register the task and criteria.
3. Changes to the trainer code for propagating the accuracy metrics that we care about for training.

A couple of things are hacky right now:
- The reporting is not modular (this needs to be thought about in general for fairseq).
- The get dummy batch logic could be specific to the task instead of the dataset.

Reviewed By: myleott

Differential Revision: D14670482

fbshipit-source-id: dc077247b2ae9d26a8e842a386ec5faa5771e836

03 Apr, 2019 2 commits

work around lack of optional output for forks (#429) · 3a64aced

James Cross authored Apr 03, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/translate/pull/429

Pull Request resolved: https://github.com/pytorch/fairseq/pull/618

PyTorch export for transformer models was broken because, as written, they used a placeholder `None` value during inference for the variable `key_padding_mask` to indicate no padding, but PyTorch is unable to trace such values. This diff adds a minor hack to allow the use of an empty tensor for the same purpose.

Reviewed By: jmp84

Differential Revision: D14581730

fbshipit-source-id: 2ea4664c20ecab8478c578b2182a85319140036c

sort dictionary items lexicographically for consistency · 10ad7495

Paco Guzman authored Apr 03, 2019

Summary: Sorts dictionaries lexicographically before creating counter. This makes distributed preprocessing deterministic
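
A small sketch of why sorting helps, under the assumption that each worker returns a word-count mapping and that ties between equally frequent words are broken by insertion order. The function `merge_counts` is hypothetical and not the preprocessing code itself:

```python
from collections import Counter


def merge_counts(worker_counts):
    """Merge per-worker word counts into one Counter with a stable key order."""
    totals = {}
    for counts in worker_counts:
        for word, n in counts.items():
            totals[word] = totals.get(word, 0) + n
    # Insert keys in lexicographic order so that ties between equally
    # frequent words do not depend on which worker reported first.
    return Counter(dict(sorted(totals.items())))


run_a = merge_counts([{"b": 2, "a": 2}, {"c": 1}])
run_b = merge_counts([{"c": 1}, {"a": 2, "b": 2}])
```

Both runs produce the same ordering from `most_common()`, even though the workers reported in a different order.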

Reviewed By: myleott

Differential Revision: D14678214

fbshipit-source-id: 7a9e2f0cb367e8fb76da01e108dda4c6c5aab505

02 Apr, 2019 1 commit

Update data_utils.py for (#598) · 3efc39ee

Yash Kumar Atri authored Apr 01, 2019

Summary:
Corrects a syntax error in the assert statement caused by a stray character before the error message.

The assertion and the code now work fine; tested with the wmt-ende task.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/598

Differential Revision: D14712846

Pulled By: myleott

fbshipit-source-id: 3f708aa2362ceecba19174750f9ffc9238537512

29 Mar, 2019 2 commits

Add utils.deprecation_warning · a78ad1ac

Myle Ott authored Mar 28, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/607

Differential Revision: D14681031

Pulled By: myleott

fbshipit-source-id: 466ee526a30543218e2b7138fb651db866ae5ab3

Fixing a bug of DynamicConv in the unfolding mode (#593) · 34c9ebf0

Felix Wu authored Mar 28, 2019

Summary:
The file `unfold1d.py` has the same name as the `unfold1d` function, which causes an error when using DynamicConv1dTBC with `unfold=True`.
This doesn't affect the NMT models, which don't use the unfolding mode.

I renamed `unfold1d.py` to `unfold.py` to fix this bug.

Originally we would get `TypeError` when running this code:
```
import torch
from fairseq.modules import LightweightConv1dTBC, DynamicConv1dTBC

x = torch.rand(4, 10, 8)
m = LightweightConv1dTBC(8, 4, 3)
o = m(x, unfold=True)

m = DynamicConv1dTBC(8, 4, 3)
o = m(x, unfold=True)
```
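
The name clash can be reproduced without fairseq. The sketch below uses a stand-in module object built with `types.ModuleType`; the toy `unfold1d` body is hypothetical and only illustrates how a module that shadows a function of the same name fails when called:

```python
import types

# Stand-in for a submodule that shares its name with the function inside it.
unfold1d = types.ModuleType("unfold1d")


def _unfold1d(x, kernel_size):
    """Toy sliding-window function, only here to have something to call."""
    return [x[i:i + kernel_size] for i in range(len(x) - kernel_size + 1)]


unfold1d.unfold1d = _unfold1d

try:
    # The name refers to the module, not the function, so the call fails.
    unfold1d([1, 2, 3], 2)
except TypeError as err:
    message = str(err)

# Reaching through the module to the function works as intended.
windows = unfold1d.unfold1d([1, 2, 3], 2)
```

Giving the file a different name than the function removes the ambiguity about which object the name resolves to.
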
Pull Request resolved: https://github.com/pytorch/fairseq/pull/593

Differential Revision: D14597117

Pulled By: myleott

fbshipit-source-id: 59752fd7ff62c53a4aba8b56b83155291e5f5792

26 Mar, 2019 1 commit

fixes for exporter issue of bi-transformer model (#597) · 8e66a12f

Haoran Li authored Mar 26, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/597

Pull Request resolved: https://github.com/facebookresearch/pytext/pull/424

Fixes two issues:
1. the new Layernorm has issues in exporting
2. fix tensorboard writing by using the "RAW" operator_export_type

Differential Revision: D14610694

fbshipit-source-id: 1b859f54c571a90766128ab28539a9901375c3e6

15 Mar, 2019 1 commit

0.6.1 -> 0.6.2 (#577) · e6422528

Myle Ott authored Mar 15, 2019

Summary:
Changelog:
- 998ba4f: Add language models from Baevski & Auli (2018)
- 4294c4f6: Add mixture of experts code from Shen et al. (2019)
- 00493490: Add example for multilingual training
- 48d9afbe: Speed improvements, including fused operators from apex
- 44d27e64: Add Tensorboard support
- d17fa851: Add Adadelta optimizer
- 9e1c880f: Add `FairseqEncoderModel`
- b65c579b: Add `FairseqTask.inference_step` to modularize generate.py
- 2ad1178e: Add back `--curriculum`
- Misc bug fixes and other features

Pull Request resolved: https://github.com/pytorch/fairseq/pull/577

Differential Revision: D14481233

Pulled By: myleott

fbshipit-source-id: 4ff8625ef1c0b24273fc65df7c5658e3c932e8b7

14 Mar, 2019 1 commit

Speed improvements (#531) · 48d9afbe

Myle Ott authored Mar 14, 2019

Summary:
* Add FusedLayerNorm and FusedAdam
* Softmax and zero grad optimizations
Pull Request resolved: https://github.com/pytorch/fairseq/pull/531

Differential Revision: D14218457

Pulled By: myleott

fbshipit-source-id: 5656b2d0152cd85f77dc21ec0e1439ec04b9fa89

13 Mar, 2019 1 commit

Enable sampling (#571) · 4d3401b0

Qing Sun authored Mar 12, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/571

Enable sampling from Fairseq

Reviewed By: akinh

Differential Revision: D13981666

fbshipit-source-id: 2af1bd67701a73a2c76a9255bd8381d6a7518876

12 Mar, 2019 3 commits

Handle 3+ dimensional input in sequence_generator + nits · 860010e9

Dmytro Okhonko authored Mar 12, 2019

Summary: sequence_generator assumes that the model input is a 2D tensor of longs. However, it can also be something like a 3D tensor of floats, and we should be able to handle this as long as the first dimension is the batch size, followed by the source lengths.

Reviewed By: myleott

Differential Revision: D14420044

fbshipit-source-id: bf8b1e42ad1873f7b803c1a377b0af21648db015

Adadelta optimizer · d17fa851

Dmytro Okhonko authored Mar 12, 2019

Summary: Adding Adadelta optimizer to fairseq as wrapper around torch.optim.Adadelta

Reviewed By: myleott

Differential Revision: D14418635

fbshipit-source-id: 6bf5ec008e905a4a2cbf7415e9492f5eea3ff07f

FairseqEncoderModel · 9e1c880f

Dmytro Okhonko authored Mar 12, 2019

Summary: Base class for encoder-only models. Some models don't have a decoder part.

Reviewed By: myleott

Differential Revision: D14413406

fbshipit-source-id: f36473b91dcf3c835fd6d50e2eb6002afa75f11a

04 Mar, 2019 2 commits

Try to access sys.stdin.fileno() only at runtime and not during import (#553) · 5869385c

Louis MARTIN authored Mar 04, 2019

Summary:
Accessing sys.stdin.fileno() raises an error in multiple contexts
(pytest, joblib, jupyter...).
Thus accessing it at the top level of the file can cause other scripts
to crash when they import fairseq.
This is why it is moved inside the method of MultiprocessingPdb to only
be accessed at runtime if needed.
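
A minimal sketch of the deferred-access pattern. The class `LazyStdinDebugger` is hypothetical and much simpler than MultiprocessingPdb; it only shows where the `sys.stdin.fileno()` call moves:

```python
import sys


class LazyStdinDebugger:
    """Hypothetical helper that needs the stdin file descriptor."""

    def __init__(self):
        # Nothing touches sys.stdin at import or construction time.
        self._stdin_fd = None

    def stdin_fd(self):
        # Resolve the descriptor only when it is actually needed; this call
        # may raise in environments that replace or close stdin.
        if self._stdin_fd is None:
            self._stdin_fd = sys.stdin.fileno()
        return self._stdin_fd


# Safe to create even when stdin has no usable file descriptor.
debugger = LazyStdinDebugger()
```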

See Issue #517
Pull Request resolved: https://github.com/pytorch/fairseq/pull/553

Differential Revision: D14309284

Pulled By: myleott

fbshipit-source-id: 6ca36f2053a86ebc02e2d6f025459c6a78c592e7

Add --curriculum (fixes #533) · 2ad1178e

Myle Ott authored Mar 04, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/554

Differential Revision: D14300596

Pulled By: myleott

fbshipit-source-id: f38c8e58daef99d5e4b97dd423e4142e4294a4f0

02 Mar, 2019 1 commit

Fix Pdb · 1fd0a6f6

Myle Ott authored Mar 02, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/551

Differential Revision: D14295227

Pulled By: myleott

fbshipit-source-id: 404f2a2697a62ce0dbf22e5ab2e1cf932acc83ac

01 Mar, 2019 2 commits

Fixed the issue that no space in string converted from tensor · 88bf8b56

James King authored Mar 01, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/548

Differential Revision: D14286021

Pulled By: myleott

fbshipit-source-id: 7c725304185e63787220371a812ec860e178872c

Refactor BERTDataset to the more general MaskedLMDataset · 92a6c548

Kartikay Khandelwal authored Feb 28, 2019

Summary: The current BERTDataset has a lot of the components needed for generic MaskedLM training but is too restrictive in the assumptions it makes: two blocks being masked, the special tokens used for the sentence embedding and the separator, etc. In this diff I refactor this dataset and at the same time make some of the parameters, including the probabilities associated with masking, configurable.
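
To illustrate what configurable masking probabilities can look like, here is a rough sketch on plain token lists. The function `mask_tokens`, its parameter names, and the default values (the common 15% / 80% / 10% scheme) are assumptions for illustration, not the dataset's actual interface:

```python
import random


def mask_tokens(tokens, mask_id, vocab_size, masking_ratio=0.15,
                masking_prob=0.8, random_token_prob=0.1, rng=None):
    """Return (masked_tokens, targets); targets are None where nothing is predicted."""
    rng = rng or random.Random()
    masked = list(tokens)
    targets = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() >= masking_ratio:
            continue  # position not selected for prediction
        targets[i] = tok
        roll = rng.random()
        if roll < masking_prob:
            masked[i] = mask_id  # replace with the mask token
        elif roll < masking_prob + random_token_prob:
            masked[i] = rng.randrange(vocab_size)  # replace with a random token
        # otherwise keep the original token but still predict it
    return masked, targets


tokens = [11, 12, 13, 14]
all_masked, all_targets = mask_tokens(
    tokens, mask_id=0, vocab_size=100,
    masking_ratio=1.0, masking_prob=1.0, random_token_prob=0.0,
)
```

With the ratio and mask probability both set to 1.0, every position is replaced by the mask token and every original token becomes a target.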

Reviewed By: rutyrinott

Differential Revision: D14222467

fbshipit-source-id: e9f78788dfe7f56646ba09c62967c4c0bd30aed8

28 Feb, 2019 2 commits

Deprecate _aggregate_logging_outputs · 8a8df81d

Myle Ott authored Feb 28, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/498

Differential Revision: D14024524

Pulled By: myleott

fbshipit-source-id: 1b0be4bb212dbab41ea0959ac34020832ff00645

Move string line encoding logic from tokenizer to Dictionary (unified diff). (#541) · f296824f

Vladimir Karpukhin authored Feb 28, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/541

Just a combination of the stacked pair D14057943 & D14176011.
Made this a separate diff because there seems to be some issue with porting a stacked change into the GitHub repo.

Differential Revision: D14251048

fbshipit-source-id: 0a47f534a69d6ab2ebe035fba40fd51748cccfb8

26 Feb, 2019 3 commits

Support LM generation from interactive.py (fixes #526) · 98daf039

Myle Ott authored Feb 25, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/528

Differential Revision: D14218377

Pulled By: myleott

fbshipit-source-id: facb0a32f6aebf56a4fea7259080394ad2d2d846

Multilingual training example (#527) · 00493490

Myle Ott authored Feb 25, 2019

Summary:
* Add example for multilingual translation on IWSLT'17
* Match dataset ordering for multilingual_translation and translation
* Fix bug with LegacyDistributedDataParallel when calling forward of sub-modules
Pull Request resolved: https://github.com/pytorch/fairseq/pull/527

Differential Revision: D14218372

Pulled By: myleott

fbshipit-source-id: 2e3fe24aa39476bcc5c9af68ef9a40192db34a3b

Add Tensorboard support (#530) · 44d27e64

Myle Ott authored Feb 25, 2019

Summary:
Enable with the `--tensorboard-logdir` option.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/530

Differential Revision: D14218430

Pulled By: myleott

fbshipit-source-id: e7a54f66f928e3bb02ae03fda09b22fa4fa7d053
