Commits · e265c2396f6865e2aa888cecc4e69c5068237550 · OpenDAS / Fairseq


19 May, 2019 1 commit

Make Fairseq compatible with pre-computed position tensors (#570) · e265c239

Kartikay Khandelwal authored May 18, 2019

Summary:
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/570

Pull Request resolved: https://github.com/pytorch/fairseq/pull/731

Currently, the LearnedPositionalEmbedding module computes the position tensor from the input data. However, this doesn't work for XLM, where the behavior differs between the Masked LM and the Translation LM. In this diff I keep the default behavior of LearnedPositionalEmbedding unchanged but add the ability for these models to work with pre-computed position tensors.

Reviewed By: myleott

Differential Revision: D15305474

fbshipit-source-id: de7d908245a2a620b58d36055211600a08f2d1dc
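
As a rough illustration of the behavior described in this commit, the sketch below derives positions from the input tokens by default but accepts a pre-computed position list instead. It uses plain Python lists rather than tensors. The names `make_positions` and `resolve_positions`, the padding index of 1, and the numbering convention (non-padding tokens counted from `padding_idx + 1`) are assumptions made for illustration, not the actual Fairseq code.

```python
from typing import List, Optional


def make_positions(tokens: List[int], padding_idx: int) -> List[int]:
    """Number non-padding tokens from padding_idx + 1; padding keeps padding_idx."""
    positions = []
    count = 0
    for tok in tokens:
        if tok == padding_idx:
            positions.append(padding_idx)
        else:
            count += 1
            positions.append(padding_idx + count)
    return positions


def resolve_positions(
    tokens: List[int],
    padding_idx: int,
    positions: Optional[List[int]] = None,
) -> List[int]:
    """Use caller-supplied positions when given, else compute them from tokens."""
    if positions is not None:
        if len(positions) != len(tokens):
            raise ValueError("positions must have one entry per token")
        return list(positions)
    return make_positions(tokens, padding_idx)


tokens = [5, 6, 7, 1, 1]  # 1 is the (assumed) padding index
default_positions = resolve_positions(tokens, padding_idx=1)
# A caller that needs different numbering passes its own positions.
custom_positions = resolve_positions(tokens, 1, positions=[2, 3, 2, 1, 1])
```

Keeping the pre-computed argument optional means existing callers see no change in behavior, which matches the intent stated in the summary.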


17 May, 2019 2 commits

Small features + lint · ba989ed1

Myle Ott authored May 17, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/588

Differential Revision: D15389638

Pulled By: myleott

fbshipit-source-id: 4632ce22d51dc2c74d250bae999630095d849701


Clean up sharded train iterator · 3bfbb49b

Myle Ott authored May 16, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/586

Differential Revision: D15372949

Pulled By: myleott

fbshipit-source-id: c1cf1c645e8d55fc8568f23a47c45677ac9ab1da


16 May, 2019 5 commits

fixed bugs of masked_lm for fine-tuning (#744) · fca32e05

Jingfei Du authored May 16, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/744

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/587

After we added additional prediction layers for language model predictions, fine-tuning broke for two reasons:
1. The checkpoint cannot be loaded because we didn't update the state_dict names.
2. lm_output_learned_bias is not initialized if load_softmax is false.

Reviewed By: myleott

Differential Revision: D15377380

fbshipit-source-id: d58544b1d2c549586abef42fec19ec8bf27a994a


Back out "reduce memory footprint for average_checkpoints" (#743) · e2a0b87d

Myle Ott authored May 16, 2019

Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/743

Original commit changeset: 0afe37c9a031

According to edunov: "We need to be careful here with shared parameters, I believe right now it is broken if you have shared encoder/decoder input embeddings (encoder.embed_tokens.weight and decoder.embed_tokens.weight) as they get updated several times"

We also have OSS issues that look related, e.g., https://github.com/pytorch/fairseq/issues/732.

Backing this out until we can confirm the correct behavior for shared params.

Differential Revision: D15372673

fbshipit-source-id: 8683c0f2514e21fa1e9d2fe6dfc48d98957a2831
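
The concern quoted above can be illustrated with a toy example. This is a hypothetical sketch, not the code that was backed out: it assumes an averaging routine that accumulates in place into the first checkpoint, and it models tensors as Python lists so that two parameter names can refer to the same underlying buffer. The function names are invented for the illustration.

```python
def naive_average(checkpoints):
    """Accumulate into the first checkpoint in place, visiting every name."""
    n = len(checkpoints)
    acc = checkpoints[0]
    for ckpt in checkpoints[1:]:
        for name, values in ckpt.items():
            for i, v in enumerate(values):
                acc[name][i] += v  # an aliased buffer is added to once per name
    for values in acc.values():
        for i in range(len(values)):
            values[i] /= n  # ...and divided once per name as well
    return acc


def alias_aware_average(checkpoints):
    """Same accumulation, but each underlying buffer is updated only once."""
    n = len(checkpoints)
    acc = checkpoints[0]
    for ckpt in checkpoints[1:]:
        seen = set()
        for name, values in ckpt.items():
            if id(acc[name]) in seen:
                continue  # this buffer was already updated via another name
            seen.add(id(acc[name]))
            for i, v in enumerate(values):
                acc[name][i] += v
    seen = set()
    for values in acc.values():
        if id(values) in seen:
            continue
        seen.add(id(values))
        for i in range(len(values)):
            values[i] /= n
    return acc


def make_checkpoints():
    """Two toy checkpoints whose encoder and decoder embeddings share one list."""
    checkpoints = []
    for value in (2.0, 4.0):
        shared = [value]
        checkpoints.append({
            "encoder.embed_tokens.weight": shared,
            "decoder.embed_tokens.weight": shared,
        })
    return checkpoints


bad = naive_average(make_checkpoints())["encoder.embed_tokens.weight"]
good = alias_aware_average(make_checkpoints())["encoder.embed_tokens.weight"]
```

In this toy setup the true mean of the shared weight is 3.0. The alias-aware version recovers it, while the naive version does not, because the shared buffer is modified once for each name that points to it.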


Cleanup rm_pt.py script · e797f633

Myle Ott authored May 16, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/585

Differential Revision: D15372416

fbshipit-source-id: add226a4558ae4d84dd261e9317b80c43970f771


Add multi-dataset loading to multilingual_translation · 0863ea68

Peng-Jen Chen authored May 15, 2019

Summary: Similar to TranslationTask, we want the multilingual translation task to be able to load 'train{k}' datasets from the data-bin folder.

Reviewed By: lematt1991

Differential Revision: D15363481

fbshipit-source-id: 5fed7be19383023b792ed2fd38e655cbcecc8b90
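
A minimal sketch of the 'train{k}' naming scheme mentioned in the summary, assuming shards are named `train`, `train1`, `train2`, and so on, and that loading stops at the first missing shard. The helper `find_split_shards` and the `exists` callback are hypothetical. A real loader would check files on disk rather than a set of names.

```python
import itertools
from typing import Callable, List


def find_split_shards(split: str, exists: Callable[[str], bool]) -> List[str]:
    """Return split, split1, split2, ... for as long as each shard exists."""
    found = []
    for k in itertools.count():
        # The first shard has no numeric suffix; later shards append k.
        name = split + (str(k) if k > 0 else "")
        if not exists(name):
            break
        found.append(name)
    return found


available = {"train", "train1", "train2", "valid"}
shards = find_split_shards("train", available.__contains__)
```

The helper returns an empty list when even the first shard is missing, which leaves the decision to raise an error to the caller.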


fixed cmd arg for shuffle dataset masked lm task · 861dd2b7

Naman Goyal authored May 15, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/584

Reviewed By: myleott

Differential Revision: D15360774

Pulled By: myleott

fbshipit-source-id: b18efbb6ff5a8832c61b689f3d87c958cbd908e9


15 May, 2019 7 commits

Fix biTransformer export (#583) · 2a3adcdc

Ruty Rinott authored May 15, 2019

Summary:
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/583

D14610694 fixed issues in layerNorm exporting by making it conditional. D15260838 changed the implementation of TransformerDecoderLayer to the one under transformer, thus losing the fix. Bringing it back here.

Reviewed By: myleott, geof90, liaimi

Differential Revision: D15357119

fbshipit-source-id: e29e053ca5beca0008d7a8dad9880a483a14c7b9


added shuffle as arg for masked_lm for experimenting with pad effecie… (#582) · 74c936dc

Naman Goyal authored May 15, 2019

Summary:
added shuffle as arg for masked_lm for experimenting with pad efficient batching
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/582

Reviewed By: jingfeidu

Differential Revision: D15355105

Pulled By: jingfeidu

fbshipit-source-id: 9925271a0bc2f9d283f354d158bd4b5ec8788b39


added missing dense layers in masked lm model (#581) · d1d3a581

Naman Goyal authored May 15, 2019

Summary:
1) Added pooled_output for sentence classification as `Tanh(Linear())`.
2) Added lm_head_transform as `LayerNorm(GeLU(Linear(x)))`
3) `act_dropout = 0.0`
4) added `lm_output_learned_bias`
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/581

Reviewed By: borguz

Differential Revision: D15353575

Pulled By: borguz

fbshipit-source-id: 4ff64c6ceed23f3e99348f73d189546f1d84452e


Updates to model API (#561) · dffb1674

Myle Ott authored May 15, 2019

Summary:
- `FairseqModel` -> `FairseqEncoderDecoderModel`
- add `FairseqDecoder.extract_features` and `FairseqDecoder.output_layer`
- `encoder_out_dict` -> `encoder_out`
- rm unused `remove_head` functions
- update docs
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/561

Differential Revision: D15271142

Pulled By: myleott

fbshipit-source-id: 8e8864e399336020f0271c780598e968ff51a264


Allow TransformerSentenceEncoder to return only last state · a0c5f9b8

Myle Ott authored May 15, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/578

Differential Revision: D15352060

Pulled By: myleott

fbshipit-source-id: 7dc2fceca37ec96c89356662831b0d82f28bef6f


Add missing imports · 52778827

Myle Ott authored May 15, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/579

Differential Revision: D15352058

Pulled By: myleott

fbshipit-source-id: cebef02edcfcb203ef2e32c64f7f28e08c4e46b0


Various fixes for Masked LM (#573) · bf106796

Myle Ott authored May 14, 2019

Summary:
Various fixes for Masked LM

- use --activation-fn instead of --gelu
- use --dataset-impl instead of --lazy-load
- add embed_scale option to TransformerSentenceEncoder
- fix encoder_normalize_before to include a final layer norm
- delete BertLayerNorm
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/573

Reviewed By: borguz

Differential Revision: D15317933

Pulled By: myleott

fbshipit-source-id: 8ecb46556ad43e76e92d41ed8f5a62e8516fd375


14 May, 2019 3 commits

rm default_key from MultiCorpusSampledDataset · 7432130e

Myle Ott authored May 14, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/575

Differential Revision: D15318004

Pulled By: myleott

fbshipit-source-id: ad918d71b1bd8074decf5ec3463dd9bc9487bbe9


Alignment Training task using minibatch · 2c278ff0

Nayan Singhal authored May 14, 2019

Summary:
1. Define an EpochMinibatchIterator which extends EpochBatchIterator. It has the same functionality as EpochBatchIterator except for two major changes: it uses static batching and it uses MiniBatchIterator to get the indices.
2. SplitSeqCollater is used instead of Seq2SeqCollater.
3. LSTM_subsample now stores the previous states and resets them once the sample is over.

Reviewed By: jay-mahadeokar

Differential Revision: D15209023

fbshipit-source-id: 900b8bd1f25159ffc77f8106e26729a3e7422a1f


Move save/load checkpoint functions to utils · cd1e5c09

Dmytro Okhonko authored May 14, 2019

Summary:
Move `load_checkpoint`, `save_checkpoint` and `reload_train` from train.py to checkpoint_utils.py
Move `get_perplexity` from train.py to utils.py.
This will make train.py lighter and allow us to reuse these utility functions when fairseq is used as an external library.

Reviewed By: myleott

Differential Revision: D15289607

fbshipit-source-id: 4b7c95225ac22e402bcda3497811361809110df1


13 May, 2019 4 commits

Transition smoothly after warmup in polynomial LR decay schedule · c124d272

Myle Ott authored May 13, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/576

Differential Revision: D15318086

Pulled By: myleott

fbshipit-source-id: c6587737ca7b97edc97ad4aef5c5c9ac7e92b2f2


gelu_fast -> gelu_accurate (#571) · 939ab6ae

Myle Ott authored May 13, 2019

Summary:
This was named gelu_fast after the original implementation: https://github.com/hendrycks/GELUs/blob/master/mnist_ae.py#L62-L63

But in practice it's actually slower and uses more memory. Rename to gelu_accurate.
Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/571

Differential Revision: D15317874

Pulled By: myleott

fbshipit-source-id: c96fbc89bf91b27ced1ab8d5b25a8f23f922ec24
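
For context on the rename, the sketch below compares two common GELU formulas using only the standard library: the exact form based on the error function and the tanh-based approximation. It assumes that the renamed function corresponds to the tanh-based formula from the linked reference implementation. The names `gelu_erf` and `gelu_tanh` are chosen for this illustration and are not Fairseq identifiers.

```python
import math


def gelu_erf(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Tanh-based approximation of GELU."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


# Largest absolute difference between the two on a grid over [-5, 5].
max_gap = max(abs(gelu_erf(x / 10) - gelu_tanh(x / 10)) for x in range(-50, 51))
```

The two formulas agree closely on this grid, so any speed or memory difference between them depends on how each is implemented for tensors, which this scalar sketch does not measure.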


Lint · 72291287

Myle Ott authored May 13, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/574

Differential Revision: D15317984

Pulled By: myleott

fbshipit-source-id: 09a66229cc6b4c95678ca1ca13c9e0da25b203de


Add LAMB optimizer · b95f1b5d

Myle Ott authored May 13, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/572

Differential Revision: D15317928

Pulled By: myleott

fbshipit-source-id: b3f0e9229737a63b49937e7c5b918470f18ddc45


12 May, 2019 2 commits

Fix option in docs (#735) · d0577ba7

zhiqiang authored May 12, 2019

Summary:
`--output-format` -> `--dataset-impl` in Tutorial: Classifying Names with a Character-Level RNN
Pull Request resolved: https://github.com/pytorch/fairseq/pull/735

Differential Revision: D15314625

Pulled By: myleott

fbshipit-source-id: 65b8efd1a367ca754e5b9dca088aefbc648864dd


Add scripts for working with txt files containing document boundaries · 287d31e2

Myle Ott authored May 12, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/736

Differential Revision: D15314626

Pulled By: myleott

fbshipit-source-id: 1e0c32529afee57e43fe5d6c7991cd13eb8a52c4


11 May, 2019 2 commits

convert logits to fp32 for calculating loss in masked_lm_loss criterion · 43722c5e

Naman Goyal authored May 11, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/568

Differential Revision: D15308483

Pulled By: myleott

fbshipit-source-id: 9d898ce523e46e6b6fb444274f478da0b577b603


Add missing options to TransformerDecoderLayer · 5dcc855a

Myle Ott authored May 11, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/560

Differential Revision: D15260838

Pulled By: myleott

fbshipit-source-id: 5f80dd82775c10ce46a3e1c451ccaf0ef55bfa31


10 May, 2019 2 commits

add option to specify lr-threshold while using lr-on-plateau strategy · 8a2e6e81

Jay Mahadeokar authored May 10, 2019

Summary: As in title.

Reviewed By: skritika

Differential Revision: D15299135

fbshipit-source-id: 2fd513b32c0ab41911cdf0b0186f6c3bb5256285

fbshipit-source-id: 682b375c6e7535f12faaf9ca32811051f9e874da · 47fbc491

myleott authored May 10, 2019

09 May, 2019 5 commits

Merge pull request #727 from pytorch/fix_lr_scheduler · cfeb2163

Myle Ott authored May 09, 2019

Set initial learning rate in LR schedulers by calling step_update(0) at init

Set initial learning rate in LR schedulers by calling step_update(0) at init · 219cbf6e

Myle Ott authored May 09, 2019

Revert "Add sweep scripts" · 2af922f1

Myle Ott authored May 09, 2019

This reverts commit 8e8e1afc.

Add sweep scripts · 8e8e1afc

Myle Ott authored May 09, 2019

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/564

Differential Revision: D15278017

Pulled By: myleott

fbshipit-source-id: b6fba1b62145ea533b40f5eb9b134e6aa122e546


expose arguments for bias_kv and zero_attn for masked_lm · 93ec8d0b

Jingfei Du authored May 08, 2019

Summary: The old no_bias_kv argument for masked_lm models is not used. Split it into two arguments and expose them.

Reviewed By: myleott

Differential Revision: D15266154

fbshipit-source-id: 60b041f8370ca1d8869ed3402fb9a67d1cd8e0e8


08 May, 2019 7 commits

Don't allow abbreviated argument options · acb9ab32

Myle Ott authored May 08, 2019

Reviewed By: jmp84

Differential Revision: D15264847

fbshipit-source-id: 4ba9224d1b35c3de0d26c9b4c1ee6d641d3d8535


Better error message for incorrect --dataset-impl · 61f29f7f

Myle Ott authored May 08, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/723

Differential Revision: D15260870

Pulled By: myleott

fbshipit-source-id: 73d9b138b9ab44f96824076258f1a6319193d0f7


bug_fixes and small changes to masked lm (#721) · bd6e5c4f

Naman Goyal authored May 08, 2019

Summary:
1) Made the model compatible with using either `masked_lm_dataset` or `monolingual_dataset`.
2) Fixed default args setting task (`bert` vs `masked_lm`). myleott should we keep both?
3) Fixed a bug in setting the default value of `sentence_class_num`.
4) Fixed a bug in the padding mask in `fp16`.
Pull Request resolved: https://github.com/pytorch/fairseq/pull/721

Differential Revision: D15259885

fbshipit-source-id: 9dbf7fb8192992c1251670287bed719e41c08fcc


Cleanup LM + Flake8 · f2563c21

Myle Ott authored May 08, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/720

Differential Revision: D15259091

Pulled By: myleott

fbshipit-source-id: 06a35996c06ccddb49fdc9e01e348ff3c9da334e


Fix indexing in TokenBlockDataset · eddcdf08

Myle Ott authored May 08, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/719

Differential Revision: D15258483

Pulled By: myleott

fbshipit-source-id: dd00daa6f1c87264c1196a77dfffc8c876ebde7f


Bugfix · 0cb45bcb

Myle Ott authored May 08, 2019

Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/717

Differential Revision: D15254560

Pulled By: myleott

fbshipit-source-id: 2a07614e8d294636f706939e60f0091c73115494


bugfix data not in args · 6a7eb6ce

Jay Mahadeokar authored May 07, 2019

Summary:
D15214049 introduced a bug: if a task's args do not contain data, training fails with the following error:
```
File "/data/users/jaym/fbsource/fbcode/buck-out/dev/gen/deeplearning/projects/fairspeq/train#link-tree/train.py", line 119, in reload_train
   if len(args.data.split(":")) == 1:
AttributeError: 'Namespace' object has no attribute 'data'
```

This diff checks whether data is in args to avoid the above error.

Reviewed By: myleott, jmp84

Differential Revision: D15253373

fbshipit-source-id: 14fb9ad878ee50f1b7583349bb17e29c03c40815
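
One way to guard against the `AttributeError` described in this commit is sketched below. It is not the actual patch: the helper name `has_single_data_path` is hypothetical, and the only detail taken from the traceback is the `len(args.data.split(":")) == 1` check.

```python
from argparse import Namespace


def has_single_data_path(args: Namespace) -> bool:
    """True when args.data exists and names a single path (no ':' separator)."""
    data = getattr(args, "data", None)
    if data is None:
        return False  # the task does not define a data argument at all
    return len(data.split(":")) == 1


without_data = has_single_data_path(Namespace(lr=0.1))
single_path = has_single_data_path(Namespace(data="data-bin"))
several_paths = has_single_data_path(Namespace(data="shard0:shard1"))
```

Using `getattr` with a default keeps the check in one expression and avoids raising when a task's argument namespace has no `data` attribute.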
