- 03 Dec, 2019 3 commits
-
Naman Goyal authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/933 Differential Revision: D18783780 fbshipit-source-id: fa0a27fab886a5fa5be8d5f49151d1d9dd9775f1
-
Myle Ott authored
Summary: Fixes https://github.com/fairinternal/fairseq-py/issues/536 Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/932 Differential Revision: D18783032 Pulled By: myleott fbshipit-source-id: a520faccc20be78296a228214923ee1495fb536f
-
Myle Ott authored
Summary: See: https://twitter.com/nymwa/status/1200684169115734016 Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/931 Differential Revision: D18782345 Pulled By: myleott fbshipit-source-id: 9e74287a8ce677237042c5fdbe0bdbd4774b5ce6
-
- 02 Dec, 2019 2 commits
-
Wei Ho authored
Reviewed By: sujitoc Differential Revision: D18738392 fbshipit-source-id: b7b7b75ef97946786c463c1887ef9a8676f030e6
-
Naman Goyal authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/930 Differential Revision: D18763707 fbshipit-source-id: 453a877f5bb39c5afcf7f9bc101019b1be4a0a60
-
- 26 Nov, 2019 6 commits
-
Xilun Chen authored
Summary: This diff adds a new option to the LSTMDecoder to obtain unprojected decoder outputs (before the final output projection layer). The original forward() method remains unchanged, but is divided into two parts: extract_features() and output_layer(). extract_features() outputs the hidden states of the decoder, which offers more flexibility to the model. For instance, the unprojected decoder outputs are needed to implement a copy pointer attention that uses the decoder output to determine whether to copy certain tokens from the source sequence. Reviewed By: myleott Differential Revision: D18650255 fbshipit-source-id: 321c3085676d98b8b4f4ad6102917c94800643a5
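A minimal sketch of the described split, with a toy decoder standing in for fairseq's LSTMDecoder (the class name, dimensions, and single-layer LSTM are illustrative assumptions, not the real implementation):

```python
import torch.nn as nn

class TinyDecoder(nn.Module):
    """Toy decoder illustrating the extract_features/output_layer split."""

    def __init__(self, hidden_dim=32, vocab_size=100):
        super().__init__()
        self.lstm = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        self.output_projection = nn.Linear(hidden_dim, vocab_size)

    def extract_features(self, x):
        # unprojected hidden states, e.g. for a copy-pointer attention
        features, _ = self.lstm(x)
        return features

    def output_layer(self, features):
        # final projection onto the vocabulary
        return self.output_projection(features)

    def forward(self, x):
        # forward() is unchanged: it simply composes the two parts
        return self.output_layer(self.extract_features(x))
```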
-
Wei Ho authored
Reviewed By: donhusa Differential Revision: D18703314 fbshipit-source-id: 93a6b25a7de5e8a29879302ba23b9d6f60660b40
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/927 Differential Revision: D18691521 Pulled By: myleott fbshipit-source-id: a79cb0a7614a30be765e741761819263d9fb5047
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/928 Differential Revision: D18691525 Pulled By: myleott fbshipit-source-id: e787c17434d4cb0c4621e9858e0ebec4f9951630
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/926 Differential Revision: D18685772 Pulled By: myleott fbshipit-source-id: 0f99d79ed6ee72e9d3ced786d75ab9504d0dfcf0
-
Changhan Wang authored
Summary: Update LevT ensemble class with the recent API changes in LevT and iterative decoder classes. Reviewed By: jhcross Differential Revision: D18689292 fbshipit-source-id: 64d4cdb6513a32a32d49e0ebf57886ae576722d4
-
- 22 Nov, 2019 1 commit
-
mingruimingrui authored
Summary: Here's a quick fix for https://github.com/pytorch/fairseq/issues/1403. In short, this fix allows the user to properly checkpoint a translation model after applying layer pruning to a restored transformer file. Pull Request resolved: https://github.com/pytorch/fairseq/pull/1406 Differential Revision: D18637540 Pulled By: myleott fbshipit-source-id: 0f5e91e05e6579f0f459bc5293e9b14cb267322d
-
- 21 Nov, 2019 3 commits
-
Tatiana Likhomanenko authored
Summary: I faced this error while using warmup with the fixed lr schedule:

```
Traceback (most recent call last):
  File "/private/home/antares/.conda/envs/fairseq-20190809/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
    fn(i, *args)
  File "/private/home/antares/work/unsupervised/blank_test/fairseq-py/train.py", line 291, in distributed_main
    main(args, init_distributed=True)
  File "/private/home/antares/work/unsupervised/blank_test/fairseq-py/train.py", line 81, in main
    train(args, trainer, task, epoch_itr)
  File "/private/home/antares/work/unsupervised/blank_test/fairseq-py/train.py", line 122, in train
    log_output = trainer.train_step(samples)
  File "/private/home/antares/work/unsupervised/blank_test/fairseq-py/fairseq/trainer.py", line 409, in train_step
    self.optimizer.step()
  File "/private/home/antares/work/unsupervised/blank_test/fairseq-py/fairseq/optim/fp16_optimizer.py", line 153, in step
    self.fp32_optimizer.step(closure)
  File "/private/home/antares/work/unsupervised/blank_test/fairseq-py/fairseq/optim/fairseq_optimizer.py", line 98, in step
    self.optimizer.step(closure)
  File "/private/home/antares/work/unsupervised/blank_test/fairseq-py/fairseq/optim/nag.py", line 68, in step
    lr_correct = lr / lr_old
ZeroDivisionError: float division by zero
```

This happens because `num_updates=0` on the first iteration, so the `lr` we set on the optimizer is zero. Pull Request resolved: https://github.com/pytorch/fairseq/pull/1408 Differential Revision: D18637526 Pulled By: myleott fbshipit-source-id: fdd81dd69b1b38bc21a4fa315b4e25cee03af6bf
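A sketch of the failure mode only, not the actual fix in the PR (function names here are illustrative): with linear warmup the scheduler sets lr proportional to num_updates, so the first update writes lr = 0 into the optimizer, and NAG's momentum correction then divides by that stale value.

```python
def warmup_lr(base_lr, num_updates, warmup_updates):
    # hypothetical linear warmup; returns 0.0 when num_updates == 0
    return base_lr * num_updates / warmup_updates

def nag_lr_correct(lr, lr_old):
    # guarding the divisor (one possible mitigation) avoids the
    # ZeroDivisionError shown in the traceback above
    return lr / max(lr_old, 1e-12)
```
-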
ngoyal2707 authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/922 Differential Revision: D18617322 fbshipit-source-id: 50645197cb7f075b5f878818a97358653077c3e0
-
Alex Xiao authored
Summary: Modifying the number of shards internally to disable data sharding for batch iteration is dangerous, because the caller of these tasks is not limited to fairspeq/train. We should therefore put the onus of sharding the data properly on the caller rather than on the task itself. Reviewed By: myleott Differential Revision: D18456424 fbshipit-source-id: d46be16c441c50082f9a768d0b259e6c28a4b67b
-
- 20 Nov, 2019 1 commit
-
Jiatao Gu authored
Summary: Clean up the original NAT loss and make it more general, so it can accommodate new losses used in NAT models. Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/921 Differential Revision: D18610145 Pulled By: MultiPath fbshipit-source-id: d04dd0fc4047b5f8e332cfe66b1e28cbf39494af
-
- 19 Nov, 2019 3 commits
-
Naman Goyal authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/920 Differential Revision: D18593088 fbshipit-source-id: d4479ee8dae2ca623e62e12bd145165a116fb70a
-
freewym authored
Summary: …r to correctly recover the training from a "non-shuffle" checkpoint Pull Request resolved: https://github.com/pytorch/fairseq/pull/1375 Differential Revision: D18566535 Pulled By: myleott fbshipit-source-id: ff7b1a6ead708801f537ec7885e30e37168cd34b
-
ngoyal2707 authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/915 Differential Revision: D18580996 fbshipit-source-id: 9505a81892ba8ad997c03465d6a2d488c379c762
-
- 18 Nov, 2019 3 commits
-
alexeib authored
Summary: Recent layerdrop-related changes break existing models because they assume the presence of certain args. Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/918 Reviewed By: huihuifan Differential Revision: D18578572 Pulled By: alexeib fbshipit-source-id: 368c2d5b3add55864bf59516820807303aac6001
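One common pattern for this kind of backward compatibility (an illustration of the general idea, not necessarily the exact change in this commit): read possibly-missing args with a default instead of assuming old checkpoints carry the new fields.

```python
from argparse import Namespace

# args restored from a checkpoint saved before layerdrop existed,
# so the new fields are absent
args = Namespace(encoder_layers=6)

# fall back to a default when the attribute is missing
encoder_layerdrop = getattr(args, 'encoder_layerdrop', 0.0)
```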
-
Myle Ott authored
Summary: Fixes https://github.com/pytorch/fairseq/issues/1376 Pull Request resolved: https://github.com/pytorch/fairseq/pull/1386 Differential Revision: D18566839 Pulled By: myleott fbshipit-source-id: 71805f58fab90f53f757bf4ef69eb914195af38a
-
Louis Martin authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/913 Differential Revision: D18565866 Pulled By: myleott fbshipit-source-id: e845759dafe915805c2e38f53c6835cbcef5db2f
-
- 17 Nov, 2019 1 commit
-
Angela Fan authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/1385 Differential Revision: D18565188 Pulled By: huihuifan fbshipit-source-id: 9580663b208f286a249bbfa2bacd71f34a01ca9f
-
- 15 Nov, 2019 1 commit
-
Ilia Cherniavskii authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/887 Pull Request resolved: https://github.com/facebookresearch/pytext/pull/1052 Pull Request resolved: https://github.com/pytorch/fairseq/pull/1250 Adds a config parameter "use_torchscript" that enables the use of TorchScript for BERT training. Reviewed By: chenyangyu1988 Differential Revision: D17872083 fbshipit-source-id: 00ac4b04e7f26aa56fe84fe9feaded676d6deb71
-
- 14 Nov, 2019 4 commits
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/911 Differential Revision: D18511627 Pulled By: myleott fbshipit-source-id: 37d7606ae629f9acf84715dbc9045fb683075db4
-
freewym authored
Summary: If training stopped in the middle of the last epoch and was then resumed from a checkpoint, it would not continue, because `epoch_itr.epoch < max_epoch` is no longer satisfied. This PR fixes the issue. Pull Request resolved: https://github.com/pytorch/fairseq/pull/1275 Differential Revision: D18483945 Pulled By: myleott fbshipit-source-id: 80df6f73fa17606a79a28e8328bb4c577f504683
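A hedged sketch of the condition at issue, not the verbatim fix (it assumes `end_of_epoch()` reports whether the current epoch's iterator is exhausted): a strict "<" check skips the remainder of a partially finished last epoch after resuming, so training should also continue while the final epoch is unfinished.

```python
def should_continue_training(epoch_itr, max_epoch):
    # keep training below max_epoch, or while the final epoch
    # still has batches left after resuming mid-epoch
    if epoch_itr.epoch < max_epoch:
        return True
    return epoch_itr.epoch == max_epoch and not epoch_itr.end_of_epoch()
```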
-
Abhimanyu Sharma authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/910 Pull Request resolved: https://github.com/facebookresearch/pytext/pull/1124 Pull Request resolved: https://github.com/pytorch/fairseq/pull/1362 Split the Fairseq MemoryEfficientFP16Optimizer class into two classes so that it can be easily imported in pytext through a wrapper class.
Iter 2 - fixed some issues to ensure the code runs correctly on fblearner.
Iter 3 - fixed review comments, an incorrect import, and lints.
Iter 4 - fixed pytext test breaks.
Iter 5 - fixed pytext test breaks.
Iter 6 - fixed comments and refactored based on a conversation with chenyang.
Reviewed By: chenyangyu1988 Differential Revision: D18410916 fbshipit-source-id: 5238ee553cd2811ed0573825e1c29000980cc489
-
Jiatao Gu authored
Summary: (1) Enable printing the iterative refinement history for all NAT models by setting --retain-iter-history during decoding; (2) fix a small bug in the decoding process of the Levenshtein Transformer. Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/908 Differential Revision: D18493234 Pulled By: MultiPath fbshipit-source-id: 9e7702adcea49f39d3c10b5349b5a9ae66399a24
-
- 13 Nov, 2019 5 commits
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/896 Differential Revision: D18250948 Pulled By: myleott fbshipit-source-id: 7a515311e18795670b29f5e24eeba7619a625da7
-
zheng authored
Summary: As their names suggest, the parameters `embedding_dim`, `ffn_embedding_dim`, and `num_attention_heads` should have type `int`, not `float`. Also validated by https://github.com/pytorch/fairseq/blob/b5f41f828b0ec9b67fa60aceb0778073d1b368b2/fairseq/modules/sparse_transformer_sentence_encoder.py#L22#L24. Pull Request resolved: https://github.com/pytorch/fairseq/pull/1268 Differential Revision: D18372518 Pulled By: myleott fbshipit-source-id: 666739b6270a975536785886068a975e07312bb0
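A sketch of the corrected hints; the class name and default values below are illustrative (see the linked file for the real signature):

```python
class SentenceEncoderLayer:
    def __init__(
        self,
        embedding_dim: int = 768,        # was annotated as float
        ffn_embedding_dim: int = 3072,   # was annotated as float
        num_attention_heads: int = 8,    # was annotated as float
    ) -> None:
        # dimensions and head counts are discrete quantities, hence int
        self.embedding_dim = embedding_dim
        self.ffn_embedding_dim = ffn_embedding_dim
        self.num_attention_heads = num_attention_heads
```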
-
Zhanghao Wu authored
Summary: Originally, 'ppl' was calculated but returned as a string, which would not be printed to tensorboard. Pull Request resolved: https://github.com/pytorch/fairseq/pull/1212 Differential Revision: D18339553 Pulled By: myleott fbshipit-source-id: 52e64d5d173bfd79836a72ee103cb25c8bb2a4c2
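A minimal sketch of the described fix, with illustrative names: report 'ppl' as a number so that numeric logging backends such as tensorboard can plot it, instead of pre-formatting it as a string.

```python
def get_perplexity(loss):
    # return a float (fairseq reports base-2 perplexity, i.e. 2**loss);
    # a pre-formatted string like '{:.2f}'.format(2 ** loss) cannot be plotted
    try:
        return round(2 ** loss, 2)
    except OverflowError:
        return float('inf')
```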
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/907 Differential Revision: D18480215 Pulled By: myleott fbshipit-source-id: b02002f631f6d47380f309d4f464bd135d623280
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/899 Differential Revision: D18373060 Pulled By: myleott fbshipit-source-id: bb5510ec15799a0a10a7c0669e76d8200e1ba479
-
- 12 Nov, 2019 1 commit
-
Spencer Poff authored
Summary: Use PyTorch's IterableDataset for streaming iterators, so that there is a clean differentiation in interface between datasets that stream data and those that support indexed access. Reviewed By: myleott Differential Revision: D18438694 fbshipit-source-id: 482857d8357091ea2a6bf819535b09ba7f1a5b7d
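A minimal sketch of the distinction, using a toy line-by-line text source (the class and file handling are illustrative, not fairseq's implementation): streaming datasets subclass `torch.utils.data.IterableDataset` and only support iteration, while map-style datasets expose `__getitem__`/`__len__` for indexed access.

```python
from torch.utils.data import IterableDataset

class StreamingTextDataset(IterableDataset):
    """Streams examples; no random access, so no __getitem__/__len__."""

    def __init__(self, path):
        self.path = path

    def __iter__(self):
        # yields one example at a time straight from disk
        with open(self.path, encoding='utf-8') as f:
            for line in f:
                yield line.rstrip('\n')
```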
-
- 10 Nov, 2019 1 commit
-
Louis Martin authored
Summary: Checked locally that everything works fine. The model is uploaded to fbaipublicfiles. I fixed a few inconsistencies in the bpe encoding along the way, e.g. related to https://github.com/pytorch/fairseq/issues/1306. Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/904 Reviewed By: ngoyal2707 Differential Revision: D18418345 Pulled By: louismartin fbshipit-source-id: 53acb4d021581968d70430ee9babee07d6573c17
-
- 09 Nov, 2019 1 commit
-
Naman Goyal authored
Summary: This is the first version of the BART code / model release. It still requires a lot of clean-up, instructions, and making sure the results are reproducible before we can release it. Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/902 Differential Revision: D18389535 fbshipit-source-id: 77f16800307ce831bd29538fdd34800793210f46
-
- 08 Nov, 2019 2 commits
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/903 Reviewed By: sujitoc Differential Revision: D18327653 fbshipit-source-id: 739ddbaf54862acdf7b4f1bc3ad538bde5ae00fd
-
Xian Li authored
Summary: Avoid the case where can_ins_mask is all False, so that max_lengths has size [0, 1], which fails the expand_as operator. Move it back into the skipping branch in the script; the same for deletion and ins_word. Reviewed By: kahne Differential Revision: D18365340 fbshipit-source-id: 509ac21d7d6fd9083d0710697288203977314c52
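An illustration of the failure mode described (not the fix itself): when the mask is all False, the selected rows form a tensor with a zero-sized dimension, and expand_as cannot broadcast size 0 up to a larger size.

```python
import torch

can_ins_mask = torch.zeros(4, dtype=torch.bool)  # all False
max_lengths = torch.ones(4, 1)[can_ins_mask]     # shape [0, 1]
try:
    max_lengths.expand_as(torch.ones(4, 5))
except RuntimeError as err:
    # "The expanded size of the tensor (4) must match the existing
    # size (0) at non-singleton dimension 0"
    print(err)
```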
-
- 07 Nov, 2019 2 commits
-
Kevin authored
Summary: Solves https://github.com/pytorch/fairseq/issues/1218. Pull Request resolved: https://github.com/pytorch/fairseq/pull/1219 Differential Revision: D18339541 Pulled By: myleott fbshipit-source-id: 6d5bd7b60fa7fd30c038fdad54591343a01f228b
-
Louis MARTIN authored
Summary: Models seem to train fine with this modification. I checked that the mask for beginnings of words is correct, but didn't verify that the actual masking works correctly. Pull Request resolved: https://github.com/pytorch/fairseq/pull/1292 Differential Revision: D18338307 Pulled By: myleott fbshipit-source-id: eae9e29d6ab648e768d70921694a898554496704
-