- 25 Apr, 2019 1 commit
-
ankur6ue authored
Summary: Added a link to a blog post about incremental decoding to the FairseqIncrementalDecoder class description. Pull Request resolved: https://github.com/pytorch/fairseq/pull/662 Differential Revision: D15077845 Pulled By: myleott fbshipit-source-id: f23294721739600e14feb2cca4ece95f2b968f44
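For readers following the link, here is a minimal sketch of the caching pattern incremental decoding relies on, using a toy GRU decoder (the model itself is an illustrative assumption, not fairseq code):

```python
import torch.nn as nn
from fairseq import utils
from fairseq.models import FairseqIncrementalDecoder

class ToyIncrementalDecoder(FairseqIncrementalDecoder):
    """Caches its recurrent state so each generation step is O(1), not O(n)."""

    def __init__(self, dictionary, embed_dim=32):
        super().__init__(dictionary)
        self.embed = nn.Embedding(len(dictionary), embed_dim)
        self.rnn = nn.GRU(embed_dim, embed_dim, batch_first=True)
        self.out = nn.Linear(embed_dim, len(dictionary))

    def forward(self, prev_output_tokens, encoder_out=None, incremental_state=None):
        if incremental_state is not None:
            # Incremental mode: only the newest token needs to be processed.
            prev_output_tokens = prev_output_tokens[:, -1:]
        x = self.embed(prev_output_tokens)
        hidden = utils.get_incremental_state(self, incremental_state, 'hidden')
        x, hidden = self.rnn(x, hidden)
        if incremental_state is not None:
            utils.set_incremental_state(self, incremental_state, 'hidden', hidden)
        return self.out(x), None
```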
-
- 16 Apr, 2019 1 commit
-
Kartikay Khandelwal authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/635 Adds a task and the relevant models, datasets, and criteria needed for training cross-lingual language models, similar to the masked language model used in XLM (Lample and Conneau, 2019 - https://arxiv.org/abs/1901.07291). Reviewed By: liezl200 Differential Revision: D14943776 fbshipit-source-id: 3e416a730303d1dd4f5b92550c78db989be27073
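For orientation, a hedged sketch of the masked-LM objective at the heart of this task: roughly 15% of tokens are replaced with a mask symbol and become the prediction targets (names like `ignore_idx` are illustrative, not the PR's API):

```python
import torch

def mask_tokens(tokens, mask_idx, mask_prob=0.15, ignore_idx=-100):
    # tokens: LongTensor of token ids; mask_idx: id of the <mask> symbol
    mask = torch.rand(tokens.shape) < mask_prob
    targets = tokens.masked_fill(~mask, ignore_idx)  # compute loss only on masked slots
    inputs = tokens.masked_fill(mask, mask_idx)      # hide the chosen tokens
    return inputs, targets
```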
-
- 15 Apr, 2019 1 commit
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/625 Differential Revision: D14822123 Pulled By: myleott fbshipit-source-id: 8a263d30020588577ee02fb8c6959ff918705103
-
- 12 Apr, 2019 1 commit
-
Liezl Puzon authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/633 Pull Request resolved: https://github.com/pytorch/translate/pull/456 This diff makes it easier to upgrade the state dict for components that use TransformerEncoderLayer Reviewed By: jhcross Differential Revision: D14916941 fbshipit-source-id: 6d0258c8a9492a720684dadce59c90fc87cbf5cf
-
- 10 Apr, 2019 2 commits
-
Liezl Puzon authored
Summary: Added an `upgrade_state_dict` function so that loading old models still works; it renames `layer_norms[0]` --> `self_attn_layer_norm` and `layer_norms[1]` --> `final_layer_norm`. Reviewed By: pipibjc Differential Revision: D14689849 fbshipit-source-id: b2809262c11fe9d083e571fa31044798aefd48ce
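A minimal sketch of what such an upgrade amounts to, assuming old checkpoints stored both layer norms in a `layer_norms` list (the helper below is illustrative, not the exact diff):

```python
def upgrade_state_dict(state_dict):
    mapping = {
        'layer_norms.0': 'self_attn_layer_norm',
        'layer_norms.1': 'final_layer_norm',
    }
    for key in list(state_dict.keys()):
        for old, new in mapping.items():
            if old in key:
                # Move the tensor to its new name so old checkpoints still load.
                state_dict[key.replace(old, new)] = state_dict.pop(key)
                break
    return state_dict
```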
-
Peng-Jen Chen authored
Summary:
- Add language token to MultilingualTranslation task (see the sketch below)
- Add back translation and denoising loss to MultilingualTranslation task

Pull Request resolved: https://github.com/pytorch/fairseq/pull/620 Reviewed By: liezl200 Differential Revision: D14756873 Pulled By: pipibjc fbshipit-source-id: 89d668db26848fd95f446edf5923bab2113636f7
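As referenced in the first bullet, a small sketch of what adding a language token means in practice (function and token names are illustrative assumptions):

```python
import torch

def prepend_lang_token(tokens, lang_token_idx):
    # tokens: 1-D LongTensor for one sentence; lang_token_idx: id of e.g. '__en__'
    return torch.cat([tokens.new_tensor([lang_token_idx]), tokens])
```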
-
- 15 Mar, 2019 1 commit
-
Myle Ott authored
Summary: Changelog:
- `998ba4f`: Add language models from Baevski & Auli (2018)
- `4294c4f6`: Add mixture of experts code from Shen et al. (2019)
- `00493490`: Add example for multilingual training
- `48d9afbe`: Speed improvements, including fused operators from apex
- `44d27e64`: Add Tensorboard support
- `d17fa851`: Add Adadelta optimizer
- `9e1c880f`: Add `FairseqEncoderModel`
- `b65c579b`: Add `FairseqTask.inference_step` to modularize generate.py (see the sketch below)
- `2ad1178e`: Add back `--curriculum`
- Misc bug fixes and other features

Pull Request resolved: https://github.com/pytorch/fairseq/pull/577 Differential Revision: D14481233 Pulled By: myleott fbshipit-source-id: 4ff8625ef1c0b24273fc65df7c5658e3c932e8b7
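As flagged in the `b65c579b` bullet, generation now routes through a task hook, so subclasses can override it. A hedged sketch (the body mirrors the likely default behavior, but is an assumption):

```python
import torch
from fairseq.tasks import FairseqTask

class MyTask(FairseqTask):
    def inference_step(self, generator, models, sample, prefix_tokens=None):
        # Constrain or post-process hypotheses here before returning them.
        with torch.no_grad():
            return generator.generate(models, sample, prefix_tokens=prefix_tokens)
```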
-
- 14 Mar, 2019 1 commit
-
Myle Ott authored
Summary:
* Add FusedLayerNorm and FusedAdam
* Softmax and zero grad optimizations

Pull Request resolved: https://github.com/pytorch/fairseq/pull/531 Differential Revision: D14218457 Pulled By: myleott fbshipit-source-id: 5656b2d0152cd85f77dc21ec0e1439ec04b9fa89
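A common guarded-import pattern for the apex fused ops named above; a sketch of the idea, not necessarily fairseq's exact wiring:

```python
try:
    # apex provides a fused CUDA kernel with the same interface as nn.LayerNorm
    from apex.normalization import FusedLayerNorm as LayerNorm
except ImportError:
    from torch.nn import LayerNorm  # portable fallback when apex is absent
```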
-
- 12 Mar, 2019 1 commit
-
Dmytro Okhonko authored
Summary: Base class for encoder-only models, since some models don't have a decoder. Reviewed By: myleott Differential Revision: D14413406 fbshipit-source-id: f36473b91dcf3c835fd6d50e2eb6002afa75f11a
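A minimal sketch of an encoder-only model on top of the new base class (the toy encoder is an illustrative assumption):

```python
import torch.nn as nn
from fairseq.models import FairseqEncoder, FairseqEncoderModel

class ToyEncoder(FairseqEncoder):
    def __init__(self, dictionary, embed_dim=32):
        super().__init__(dictionary)
        self.embed = nn.Embedding(len(dictionary), embed_dim)

    def forward(self, src_tokens, src_lengths=None):
        return {'encoder_out': self.embed(src_tokens)}

class ToyEncoderModel(FairseqEncoderModel):
    # The base class delegates forward(src_tokens, src_lengths) to the
    # encoder, so an encoder-only model needs nothing beyond its encoder.
    pass
```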
-
- 16 Feb, 2019 1 commit
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/505 Differential Revision: D14110201 Pulled By: myleott fbshipit-source-id: 099ce61fa386c016f3a1d1815c6fe1a9a6c9005d
-
- 30 Jan, 2019 1 commit
-
Myle Ott authored
Summary: Changelog:
- `4889802`: can now detokenize sentencepiece output with `--remove-bpe=sentencepiece` (fixes #331). Also added `--sacrebleu` for computing detokenized BLEU.
- `0d76427`: fix assertion error when training a language model on a dataset containing empty sentences
- minor bug and style fixes

Pull Request resolved: https://github.com/pytorch/fairseq/pull/483 Differential Revision: D13867899 Pulled By: myleott fbshipit-source-id: 25c940b847fe270262ac8f5ac838407b3977fdda
-
- 25 Jan, 2019 2 commits
-
Myle Ott authored
Summary: Changelog:
- `e330f56`: Add code for the "Pay Less Attention with Lightweight and Dynamic Convolutions" paper
- `5e3b98c`: Add scripts for computing tokenized BLEU with compound splitting and sacrebleu
- update READMEs
- misc fixes

Pull Request resolved: https://github.com/pytorch/fairseq/pull/473 Differential Revision: D13819717 Pulled By: myleott fbshipit-source-id: f2dc12ea89a436b950cafec3593ed1b04af808e9
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/471 Differential Revision: D13818918 Pulled By: myleott fbshipit-source-id: d3b8dc50e81ee1d2dcc5efc5815998be8461085f
-
- 24 Jan, 2019 1 commit
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/470 Differential Revision: D13803964 Pulled By: myleott fbshipit-source-id: 91b66599e9a539833fcedea07c608b349ba3b449
-
- 17 Jan, 2019 1 commit
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/454 Differential Revision: D13708565 Pulled By: myleott fbshipit-source-id: 5cd0e07e3e1885eef14e3a5e8074f24cf4bde632
-
- 14 Jan, 2019 1 commit
-
Huihui Fan authored
Summary: Minor fixes:
1. Add fairseq logo
2. Encoder padding for fconv self-attention
3. Legacy DDP change

Pull Request resolved: https://github.com/pytorch/fairseq/pull/442 Differential Revision: D13651715 Pulled By: myleott fbshipit-source-id: ac93c80f1dbffdfe03fbd4b8a8ea527aecb576a7
-
- 10 Jan, 2019 1 commit
-
Wei Ho authored
Summary: https://github.com/pytorch/fairseq/blob/master/fairseq/trainer.py#L164 calls `train()` without any argument Reviewed By: myleott Differential Revision: D13599203 fbshipit-source-id: 3a096a6dd35a7a3f8309fbda3b54a36f606475e3
-
- 05 Jan, 2019 1 commit
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/translate/pull/283 Pull Request resolved: https://github.com/pytorch/fairseq/pull/428 Differential Revision: D13564190 Pulled By: myleott fbshipit-source-id: 3b62282d7069c288f5bdd1dd2c120788cee4abb5
-
- 26 Dec, 2018 1 commit
-
Emanuele Bugliarello authored
Summary: Add argument `--no-token-positional-embeddings` to TransformerModel (currently only available in TransformerLanguageModel) to disable positional embeddings. Pull Request resolved: https://github.com/pytorch/fairseq/pull/421 Differential Revision: D13548450 Pulled By: myleott fbshipit-source-id: b352c702ed1609e3b84d9a8404941d3274a7f883
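Roughly how such a flag gets registered on an argument parser (help text paraphrased; a sketch, not the PR's exact code):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--no-token-positional-embeddings', action='store_true',
                    help='if set, disables positional embeddings')
args = parser.parse_args(['--no-token-positional-embeddings'])
assert args.no_token_positional_embeddings
```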
-
- 06 Dec, 2018 3 commits
-
Myle Ott authored
Summary: Not switching to Black formatting just yet, but adding fmt: off directives in case we decide to later. Pull Request resolved: https://github.com/pytorch/fairseq/pull/399 Differential Revision: D13364674 Pulled By: myleott fbshipit-source-id: a20a11a18be3d583ee30eff770278fb4bd05b93c
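For reference, Black honors these directives, so manually aligned regions survive a later switch to auto-formatting:

```python
# fmt: off
LAYER_SIZES = [
    512,  1024,
    2048, 4096,
]
# fmt: on
```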
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/398 Differential Revision: D13358876 Pulled By: myleott fbshipit-source-id: 57673f2643aac01492cb8f5728bb9f1a34ba6aa7
-
Teng Li authored
Summary: As the title says, it's better to enable this for certain use cases, to make sure things are right. Reviewed By: myleott, pietern Differential Revision: D13351753 fbshipit-source-id: cf495960fda71ebd679c23212e19703c93a9dbdc
-
- 27 Nov, 2018 1 commit
-
Liezl Puzon authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/386 Pull Request resolved: https://github.com/pytorch/translate/pull/266 This allows decoder embedding sharing for denoising autoencoder modules with different decoders (one for src decoding and one for tgt decoding) Reviewed By: dpacgopinath Differential Revision: D13133015 fbshipit-source-id: 3c98be639d705744ccf5ba3a8fd7d10ddc7aef4a
-
- 26 Nov, 2018 1 commit
-
Myle Ott authored
Fix some recursive functions (e.g., reorder_incremental_state) to only touch each module once (#379) Summary: This can happen if a module is registered in more than one place in the network. Pull Request resolved: https://github.com/pytorch/fairseq/pull/379 Differential Revision: D13154498 Pulled By: myleott fbshipit-source-id: a35575d1956a46cd35ac8b16a719ad20ac3e380a
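A sketch of the de-duplication this describes, assuming submodules are walked with `nn.Module.apply` (names are illustrative):

```python
import torch.nn as nn

def reorder_incremental_state(decoder: nn.Module, incremental_state, new_order):
    seen = set()

    def apply_reorder(module):
        # Guard with a seen-set so a module registered in two places in the
        # network is only reordered once.
        if module is not decoder and hasattr(module, 'reorder_incremental_state') \
                and id(module) not in seen:
            seen.add(id(module))
            module.reorder_incremental_state(incremental_state, new_order)

    decoder.apply(apply_reorder)
```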
-
- 17 Nov, 2018 1 commit
-
Myle Ott authored
Summary: This should bring back the speedup with --update-freq that we reported in the Scaling Neural Machine Translation paper. Pull Request resolved: https://github.com/pytorch/fairseq/pull/370 Differential Revision: D13100281 Pulled By: myleott fbshipit-source-id: 4a81b51bb7390a197add314a4be5512bbf68c085
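For intuition, `--update-freq` amounts to gradient accumulation; a generic sketch (assuming standard `model`, `criterion`, `optimizer`, and `loader` objects), not fairseq's trainer code:

```python
def train_epoch(model, criterion, optimizer, loader, update_freq=4):
    optimizer.zero_grad()
    for i, (x, y) in enumerate(loader):
        loss = criterion(model(x), y) / update_freq  # scale so the step averages
        loss.backward()                              # gradients accumulate in .grad
        if (i + 1) % update_freq == 0:
            optimizer.step()                         # one update per update_freq batches
            optimizer.zero_grad()
```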
-
- 07 Nov, 2018 1 commit
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/352 Differential Revision: D12956930 Pulled By: myleott fbshipit-source-id: 39334a79544bac570feb04be9103269d7c1563f9
-
- 21 Oct, 2018 1 commit
-
Peng-Jen Chen authored
Summary: Manually ported fairinternal fairseq-py pull request #385 [1] to fbcode. Resolved the merge conflict from removing fp16_trainer, per offline discussion with Myle. Also updated the code to make generate.py work. [1] https://github.com/fairinternal/fairseq-py/pull/385/commits/18fa6e154781cf0c4b1596429dba7e753a545069 Reviewed By: liezl200 Differential Revision: D10052908 fbshipit-source-id: c3c378d78dc1e9ac087c815f359e78c0048ff2f5
-
- 19 Oct, 2018 1 commit
-
Peng-Jen Chen authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/317 The `upgrade_state_dict` function in TransformerEncoder/TransformerDecoder doesn't handle multiple encoders/decoders when upgrading the `state_dict` variable, which will be the case after D10052908. Before this change, we hit the error message [1] when loading a checkpoint for the multilingual_transformer model in D10052908; this diff fixes it. Reviewed By: myleott, liezl200 Differential Revision: D10375418 fbshipit-source-id: 7104c1a463e78f3fa33d8479a37c51608be50610
-
- 03 Oct, 2018 1 commit
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/302 Differential Revision: D10174608 Pulled By: myleott fbshipit-source-id: 4e2dfc76eae97afc5488f29b47e74f9897a643ff
-
- 25 Sep, 2018 4 commits
-
Myle Ott authored
-
Alexei Baevski authored
-
Myle Ott authored
-
Sergey Edunov authored
- No more FP16Trainer; we just have an FP16Optimizer wrapper.
- Most of the distributed code is moved to a new wrapper class called DistributedFairseqModel, which behaves like DistributedDataParallel and a FairseqModel at the same time.
- Trainer now requires an extra dummy_batch argument at initialization, which we do fwd/bwd on when there's an uneven number of batches per worker. We hide the gradients from these dummy batches by multiplying the loss by 0 (see the sketch below).
- Trainer.train_step now takes a list of samples, which will allow a cleaner --update-freq.
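As referenced in the third bullet, a hedged sketch of the dummy-batch trick (names like `is_dummy_batch` and the sample layout are assumptions):

```python
def backward_step(model, criterion, sample, is_dummy_batch):
    loss = criterion(model(**sample['net_input']), sample['target'])
    if is_dummy_batch:
        # Zero out the contribution: gradients become exactly zero, but the
        # distributed allreduce still fires, keeping workers in lockstep.
        loss = loss * 0
    loss.backward()
```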
-
- 03 Sep, 2018 7 commits