- 12 Jun, 2019 3 commits
-
-
Nayan Singhal authored
Summary: Implemented model averaging for fairseq. Removed the DDP wrapper if a global optimizer is provided. All models are synced based on the iteration provided in the input. TODO: 1) Fix the throughput and wps meters; other meters need checking too. 2) Replace the model averaging code with the BMUF algorithm implementation. Reviewed By: myleott Differential Revision: D15711044 fbshipit-source-id: 58a4af74db2a61d06762597b95836cbeb1ed82cc
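A minimal sketch of the periodic model averaging described above, assuming a torch.distributed process group is already initialized (an illustration of the idea only, not fairseq's actual averaging/BMUF code; names are hypothetical):

```python
import torch.distributed as dist

def maybe_average_model(model, num_updates, sync_every):
    """Every `sync_every` updates, replace each parameter with its average across workers."""
    if num_updates % sync_every != 0:
        return
    world_size = float(dist.get_world_size())
    for p in model.parameters():
        dist.all_reduce(p.data, op=dist.ReduceOp.SUM)  # sum this parameter across all workers
        p.data.div_(world_size)                        # divide by world size to get the average
```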
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/801 Differential Revision: D15781975 Pulled By: myleott fbshipit-source-id: b86276cd3a40138c09494637c43ce52a56c4aced
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/799 Differential Revision: D15773932 Pulled By: myleott fbshipit-source-id: 650c0621bedb3b7ecebc0654d8e10d7692c50994
-
- 11 Jun, 2019 7 commits
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/793 Differential Revision: D15758755 Pulled By: myleott fbshipit-source-id: b93e4ac11bde36a0b59b4d6d1c84d31c3124d767
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/797 Differential Revision: D15761071 Pulled By: myleott fbshipit-source-id: 257d4a2297e83da7e59baed154dbafd6bfe614bf
-
Myle Ott authored
Summary: This is a temporary workaround to support sampling after https://github.com/pytorch/fairseq/issues/713. We'll need to revisit this to support sampling and beam more generally. Pull Request resolved: https://github.com/pytorch/fairseq/pull/796 Differential Revision: D15760808 Pulled By: myleott fbshipit-source-id: ecaf4f161b0c30de037f32007e4610a559a49230
-
Bairen Yi authored
Summary: See #467. Ping myleott to review. This is a work-related contribution. Ping lark to review. Pull Request resolved: https://github.com/pytorch/fairseq/pull/794 Differential Revision: D15756816 Pulled By: myleott fbshipit-source-id: 6dce3ff3a713bf5f60e5782bc260b2ca9d2c0a9b
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/792 Differential Revision: D15741781 Pulled By: myleott fbshipit-source-id: c256c7900c307d485904e69b1526b9acbe08fec9
-
yilinyang7 authored
When given prefix_tokens, the sequence generator would generate (exactly) the same finished candidates (#713) Summary: https://github.com/pytorch/fairseq/issues/712 Pull Request resolved: https://github.com/pytorch/fairseq/pull/713 Differential Revision: D15242432 Pulled By: myleott fbshipit-source-id: a230ee48f4bf891c805609c428d7233a0ad21179
-
Sergey Edunov authored
Summary: Multi-head attention is currently not TPU-friendly; specifically, .data_ptr() is not supported and should not be used. There are also potential correctness issues with the existing code (e.g. data_ptr() can point to the same storage for different tensors). Rather than rely on data_ptr(), we should explicitly set the self_attention or encoder_decoder_attention flags. Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/636 Reviewed By: myleott Differential Revision: D15709898 Pulled By: edunov fbshipit-source-id: f931713193c51be848a5de20da730ac3a3ce0187
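A hedged sketch of the approach this change describes: the attention variant is chosen from explicit flags passed at construction time rather than by comparing .data_ptr() values at runtime (a simplified stand-in, not fairseq's actual MultiheadAttention):

```python
import torch.nn as nn

class MultiheadAttentionStub(nn.Module):
    def __init__(self, embed_dim, num_heads, self_attention=False, encoder_decoder_attention=False):
        super().__init__()
        assert not (self_attention and encoder_decoder_attention)
        self.self_attention = self_attention
        self.encoder_decoder_attention = encoder_decoder_attention
        self.attn = nn.MultiheadAttention(embed_dim, num_heads)

    def forward(self, query, key=None, value=None):
        if self.self_attention:
            # query attends to itself; no data_ptr() comparison needed
            key = value = query
        elif self.encoder_decoder_attention:
            # keys and values both come from the encoder output
            value = key
        return self.attn(query, key, value)
```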
-
- 10 Jun, 2019 2 commits
-
-
Myle Ott authored
Summary: - make it possible to load file_utils.py without the dependencies - add some more demo features Pull Request resolved: https://github.com/pytorch/fairseq/pull/791 Differential Revision: D15739950 Pulled By: myleott fbshipit-source-id: 38df5209973a6fe2e3651575b97134e096aaf5bf
-
freewym authored
Summary: In the current progress bar, the counter for log_interval will always start from 0, which is not correct if reloading from a checkpoint in the middle of an epoch. This fix obtains the offset from the iterator to set the counter correctly. Pull Request resolved: https://github.com/pytorch/fairseq/pull/778 Differential Revision: D15739953 Pulled By: myleott fbshipit-source-id: a1d13403ec5783b22e01d7cb63874fd8dea7f8b0
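An illustrative sketch of the fix (class and field names here are hypothetical, not fairseq's actual progress_bar module): the running counter is initialized from the iterator's offset so logging continues from the correct update number after a mid-epoch restart.

```python
class ProgressBar:
    def __init__(self, iterable, log_interval=100, offset=0):
        self.iterable = iterable
        self.log_interval = log_interval
        self.i = offset  # resume the counter at the iterator's offset instead of 0

    def __iter__(self):
        for item in self.iterable:
            self.i += 1
            if self.i % self.log_interval == 0:
                print(f"processed {self.i} updates")
            yield item
```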
-
- 07 Jun, 2019 1 commit
-
-
Ning Dong authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/770 Without this change, the comment here https://fburl.com/w1cejgw9 is inconsistent with the implementation. Reviewed By: xianxl Differential Revision: D15582826 fbshipit-source-id: 16d8368560153b251beed8b290f51fcdd8a8faee
-
- 06 Jun, 2019 1 commit
-
-
Matt Le authored
Reviewed By: pipibjc Differential Revision: D15635402 fbshipit-source-id: e92fab914de40775d7bad851420355240d822bde
-
- 04 Jun, 2019 4 commits
-
-
Matt Le authored
Summary: We never actually load the model parameters from an XLM model when using transformer_from_pretrained_xlm. Also changes encoder_learned_pos from True to False. Reviewed By: liezl200 Differential Revision: D15629061 fbshipit-source-id: 759eadc88041eae94505477960de57dd78a99dcb
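A small sketch of the intended behavior, under the assumption that the pretrained weights sit under a "model" key in the checkpoint (the function name and checkpoint layout are illustrative): the XLM parameters are actually copied into the new model rather than only used to configure its architecture.

```python
import torch

def load_pretrained_weights(module, checkpoint_path):
    state = torch.load(checkpoint_path, map_location="cpu")
    # fairseq-style checkpoints usually nest weights under "model" (an assumption here)
    weights = state.get("model", state)
    # strict=False leaves parameters that have no pretrained counterpart untouched
    return module.load_state_dict(weights, strict=False)
```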
-
lematt1991 authored
Summary: Resolves #762 Pull Request resolved: https://github.com/pytorch/fairseq/pull/776 Differential Revision: D15631503 Pulled By: lematt1991 fbshipit-source-id: 103f77d553476917b8b0f8001767217fb311d920
-
lematt1991 authored
Summary: Resolves #768 Pull Request resolved: https://github.com/pytorch/fairseq/pull/769 Differential Revision: D15621841 Pulled By: lematt1991 fbshipit-source-id: 694effe3788ff7d04864217d673608ec31da589e
-
Biao Lu authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/630 Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/629 Pull Request resolved: https://github.com/pytorch/translate/pull/562 Pull Request resolved: https://github.com/pytorch/fairseq/pull/774 Forked masked_lm_dictionary from fairseq, changed imports in pytorch_translate to use the new masked_lm_dictionary, and registered the corresponding tasks. Reviewed By: liezl200 Differential Revision: D15410352 fbshipit-source-id: 06516caabdd4dc5cdee9ad1d8025978f4eea6c4b
-
- 03 Jun, 2019 2 commits
-
-
Haoran Li authored
Summary: lm_output_learned_bias doesn't exist when loading the model for fine-tuning Reviewed By: jingfeidu Differential Revision: D15579190 fbshipit-source-id: 45e8e193399943c89b77cc553d3d6d49b056e55a
-
Nathan Ng authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/621 Differential Revision: D15571435 Pulled By: myleott fbshipit-source-id: 67d25b00c8c1bc69dbffd8521da56f7cc14eb75e
-
- 02 Jun, 2019 2 commits
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/625 Differential Revision: D15595787 Pulled By: myleott fbshipit-source-id: ba6edf305ed41be392194f492e034dd66d1743fe
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/624 Differential Revision: D15595746 Pulled By: myleott fbshipit-source-id: b79e489de9ff37ee7cbf939092a6e5ec0dbebbf5
-
- 01 Jun, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/622 Differential Revision: D15572555 Pulled By: myleott fbshipit-source-id: 2b81f22207b4c894ffe645af0b45c70ac0a80612
-
- 31 May, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/620 Differential Revision: D15569440 Pulled By: myleott fbshipit-source-id: c4681f1c72467c04cd2654e87bc724c94b76e3fb
-
- 30 May, 2019 7 commits
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/617 Differential Revision: D15555328 Pulled By: myleott fbshipit-source-id: 35d1f329f887cb0b867c7a22f17a16f3c9c66815
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/619 Differential Revision: D15562983 Pulled By: myleott fbshipit-source-id: 9240f56f18c87120b7d38e0db374d24a55999395
-
Khoa Ho authored
Summary: Change the wording to avoid confusion. Mixed precision ensures both higher arithmetic throughput and numerical stability, so it is not exactly synonymous with pure half-precision/FP16 training. Also mention tensor cores, since older-generation GPUs without tensor cores don't support true mixed-precision training. Pull Request resolved: https://github.com/pytorch/fairseq/pull/766 Differential Revision: D15559565 Pulled By: myleott fbshipit-source-id: c71e720772657bb3e8ad330b58bf69e23beb614e
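A brief illustration of the distinction the new wording draws, using PyTorch's AMP utilities purely as an example (an assumption for illustration, not fairseq's --fp16 code path): most math runs in FP16 for throughput, while FP32 weights and loss scaling preserve numerical stability.

```python
import torch

scaler = torch.cuda.amp.GradScaler()

def train_step(model, optimizer, criterion, batch, target):
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():       # mixed precision: FP16 where safe, FP32 where needed
        loss = criterion(model(batch), target)
    scaler.scale(loss).backward()         # loss scaling keeps small FP16 gradients from underflowing
    scaler.step(optimizer)                # unscales gradients, skips the step on inf/nan, updates FP32 weights
    scaler.update()
```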
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/613 Differential Revision: D15541384 Pulled By: myleott fbshipit-source-id: ef2c0b0a51cdf37af2ccff0546f524d49f87e65d
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/618 Differential Revision: D15552599 Pulled By: myleott fbshipit-source-id: 2192a30a9c5af31b954a3a1716166dd6ba27b23a
-
Sujit Verma authored
Summary: Changes for supporting tensorboard scalar plotting. Reviewed By: myleott Differential Revision: D15456534 Pulled By: myleott fbshipit-source-id: a012a4eea028aae764ce11786570b7d96841c4a5
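A small usage sketch of the scalar plotting this enables (the directory and tag names are illustrative):

```python
from tensorboardX import SummaryWriter

writer = SummaryWriter("runs/demo")
for step, loss in enumerate([4.2, 3.9, 3.7]):
    writer.add_scalar("train/loss", loss, step)  # one point per update, plotted against the step axis
writer.close()
```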
-
lukovnikov authored
Summary: Not sure if I'm doing something wrong elsewhere, but I had a device error in `SinusoidalPositionalEmbedding` when running on GPU > 0 because the weights were on a different device than the input. Pull Request resolved: https://github.com/pytorch/fairseq/pull/746 Differential Revision: D15547217 Pulled By: myleott fbshipit-source-id: 37849d895ce483c14615fdb4ace8a8c4fb05b568
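A simplified sketch of the kind of device mismatch described here and the usual fix (a stand-in module, not fairseq's SinusoidalPositionalEmbedding): the cached table is a plain tensor rather than a registered buffer, so it must be moved to the input's device explicitly before indexing.

```python
import torch
import torch.nn as nn

class CachedSinusoidalEmbedding(nn.Module):
    def __init__(self, num_positions, dim):
        super().__init__()
        pos = torch.arange(num_positions, dtype=torch.float).unsqueeze(1)
        inv_freq = torch.exp(torch.arange(0, dim, 2, dtype=torch.float) * (-torch.log(torch.tensor(10000.0)) / dim))
        table = torch.zeros(num_positions, dim)
        table[:, 0::2] = torch.sin(pos * inv_freq)
        table[:, 1::2] = torch.cos(pos * inv_freq)
        # plain attribute, not a registered buffer, so .cuda()/.to() will not move it
        self.weights = table

    def forward(self, positions):
        # the fix: make sure the cached table lives on the same device as the input
        self.weights = self.weights.to(positions.device)
        return self.weights[positions]
```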
-
- 29 May, 2019 7 commits
-
-
Zhanghao Wu authored
Summary: Fix the mismatch between the parameter fed into `SummaryWriter` and the API of the latest [tensorboardX](https://github.com/lanpa/tensorboardX/blob/3e35c9b5f85e8ceb0294532d9eb772341a04c097/tensorboardX/writer.py#L192), i.e. "log_dir" -> "logdir". Pull Request resolved: https://github.com/pytorch/fairseq/pull/763 Differential Revision: D15547192 Pulled By: myleott fbshipit-source-id: c51b88da5ec589fb8ca5b4876bc229efeb7bf494
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/612 Differential Revision: D15541377 Pulled By: myleott fbshipit-source-id: 4762516a3b545d03bc81d3660f47827e15466dce
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/611 Differential Revision: D15541303 Pulled By: myleott fbshipit-source-id: 279ca813437c834fca49576a48b75cbf1fdf0e76
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/610 Differential Revision: D15541261 Pulled By: myleott fbshipit-source-id: f0b823cf4f04c5ef3205f6d259c6dcad4cc329b1
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/608 Differential Revision: D15541220 Pulled By: myleott fbshipit-source-id: 52a8e4da72cc6e3e25cf98c989d34a269d614c9d
-
Spencer Poff authored
Summary: There were two non-obvious errors I ran into while creating a new language modeling task: - `transformer_lm` implicitly required the `tokens_per_sample` arg - `transformer_lm` assumed the task had a `dictionary` and `output_dictionary` property, neither of which is specified in the FairseqTask interface. Reviewed By: myleott Differential Revision: D15532345 fbshipit-source-id: 200d7d3b542c35f17cc2d6bca4219c4a4d17cb6b
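A hedged sketch of a minimal custom LM task that satisfies both implicit requirements (the task name, constructor, and the omitted data-loading code are hypothetical):

```python
from fairseq.tasks import FairseqTask, register_task

@register_task("my_language_modeling")
class MyLanguageModelingTask(FairseqTask):
    @staticmethod
    def add_args(parser):
        # transformer_lm reads args.tokens_per_sample, so the task must define it
        parser.add_argument("--tokens-per-sample", type=int, default=512)

    def __init__(self, args, dictionary):
        super().__init__(args)
        # transformer_lm also expects these two attributes on the task
        self.dictionary = dictionary
        self.output_dictionary = dictionary

    @property
    def source_dictionary(self):
        return self.dictionary

    @property
    def target_dictionary(self):
        return self.output_dictionary
```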
-
Kartikay Khandelwal authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/765 Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/614 This diff has changes needed to make XLM torchscript exportable. Reviewed By: bethebunny Differential Revision: D15497208 fbshipit-source-id: fd9645119e154e3c397f147acf9144d661d9a5c8
-
- 28 May, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/605 Differential Revision: D15518167 Pulled By: myleott fbshipit-source-id: 8b0e6b32adff018136d0d251b7fde3818e373d6f
-
- 24 May, 2019 1 commit
-
-
Yongqiang Wang authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/747 In https://github.com/pytorch/fairseq/pull/647, checkpoint averaging is not implemented correctly when it comes to shared parameters. This diff has the right implementation and a test case to guard against future changes. Reviewed By: myleott Differential Revision: D15402943 fbshipit-source-id: 8004836d5c2571814ea54844650618008a9ee522
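A hedged sketch of the underlying idea (not the actual fairseq averaging script or its test, and the "model" checkpoint key is an assumption): accumulate per-key sums in freshly cloned tensors so that state-dict keys aliasing a single shared parameter are still each averaged exactly once per checkpoint.

```python
import torch

def average_checkpoints(paths):
    avg = None
    for path in paths:
        state = torch.load(path, map_location="cpu")["model"]
        if avg is None:
            # clone into fresh tensors so shared/aliased parameters don't get double-counted
            avg = {k: v.clone().float() for k, v in state.items()}
        else:
            for k, v in state.items():
                avg[k] += v.float()
    return {k: v / len(paths) for k, v in avg.items()}
```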
-