- 03 Dec, 2019 3 commits
-
Naman Goyal authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/933 Differential Revision: D18783780 fbshipit-source-id: fa0a27fab886a5fa5be8d5f49151d1d9dd9775f1
-
Myle Ott authored
Summary: Fixes https://github.com/fairinternal/fairseq-py/issues/536 Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/932 Differential Revision: D18783032 Pulled By: myleott fbshipit-source-id: a520faccc20be78296a228214923ee1495fb536f
-
Myle Ott authored
Summary: See: https://twitter.com/nymwa/status/1200684169115734016 Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/931 Differential Revision: D18782345 Pulled By: myleott fbshipit-source-id: 9e74287a8ce677237042c5fdbe0bdbd4774b5ce6
-
- 02 Dec, 2019 2 commits
-
Wei Ho authored
Reviewed By: sujitoc Differential Revision: D18738392 fbshipit-source-id: b7b7b75ef97946786c463c1887ef9a8676f030e6
-
Naman Goyal authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/930 Differential Revision: D18763707 fbshipit-source-id: 453a877f5bb39c5afcf7f9bc101019b1be4a0a60
-
- 26 Nov, 2019 6 commits
-
Xilun Chen authored
Summary: This diff adds a new option to the LSTMDecoder to obtain unprojected decoder outputs (before the final output projection layer). The original forward() method remains unchanged, but is divided into two parts: extract_features() and output_layer(). extract_features() outputs the hidden states of the decoder, which offers more flexibility to the model. For instance, the unprojected decoder outputs are needed to implement a copy pointer attention that uses the decoder output to determine whether to copy certain tokens from the source sequence. Reviewed By: myleott Differential Revision: D18650255 fbshipit-source-id: 321c3085676d98b8b4f4ad6102917c94800643a5
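A minimal sketch of the described split, with a toy decoder standing in for fairseq's LSTMDecoder (the class name, dimensions, and single-layer LSTM are illustrative assumptions, not the real implementation):

```python
import torch.nn as nn

class TinyDecoder(nn.Module):
    """Toy decoder illustrating the extract_features/output_layer split."""

    def __init__(self, hidden_dim=32, vocab_size=100):
        super().__init__()
        self.lstm = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        self.output_projection = nn.Linear(hidden_dim, vocab_size)

    def extract_features(self, x):
        # unprojected hidden states, e.g. for a copy-pointer attention
        features, _ = self.lstm(x)
        return features

    def output_layer(self, features):
        # final projection onto the vocabulary
        return self.output_projection(features)

    def forward(self, x):
        # forward() is unchanged: it simply composes the two parts
        return self.output_layer(self.extract_features(x))
```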
-
Wei Ho authored
Reviewed By: donhusa Differential Revision: D18703314 fbshipit-source-id: 93a6b25a7de5e8a29879302ba23b9d6f60660b40
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/927 Differential Revision: D18691521 Pulled By: myleott fbshipit-source-id: a79cb0a7614a30be765e741761819263d9fb5047
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/928 Differential Revision: D18691525 Pulled By: myleott fbshipit-source-id: e787c17434d4cb0c4621e9858e0ebec4f9951630
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/926 Differential Revision: D18685772 Pulled By: myleott fbshipit-source-id: 0f99d79ed6ee72e9d3ced786d75ab9504d0dfcf0
-
Changhan Wang authored
Summary: Update LevT ensemble class with the recent API changes in LevT and iterative decoder classes. Reviewed By: jhcross Differential Revision: D18689292 fbshipit-source-id: 64d4cdb6513a32a32d49e0ebf57886ae576722d4
-
- 22 Nov, 2019 1 commit
-
mingruimingrui authored
Summary: Here's a quick fix for https://github.com/pytorch/fairseq/issues/1403. In short, this fix allows the user to properly checkpoint a translation model after applying layer pruning to a restored transformer file. Pull Request resolved: https://github.com/pytorch/fairseq/pull/1406 Differential Revision: D18637540 Pulled By: myleott fbshipit-source-id: 0f5e91e05e6579f0f459bc5293e9b14cb267322d
-
- 21 Nov, 2019 3 commits
-
Tatiana Likhomanenko authored
Summary: I faced this error while using warmup with the fixed lr schedule:

```
Traceback (most recent call last):
  File "/private/home/antares/.conda/envs/fairseq-20190809/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
    fn(i, *args)
  File "/private/home/antares/work/unsupervised/blank_test/fairseq-py/train.py", line 291, in distributed_main
    main(args, init_distributed=True)
  File "/private/home/antares/work/unsupervised/blank_test/fairseq-py/train.py", line 81, in main
    train(args, trainer, task, epoch_itr)
  File "/private/home/antares/work/unsupervised/blank_test/fairseq-py/train.py", line 122, in train
    log_output = trainer.train_step(samples)
  File "/private/home/antares/work/unsupervised/blank_test/fairseq-py/fairseq/trainer.py", line 409, in train_step
    self.optimizer.step()
  File "/private/home/antares/work/unsupervised/blank_test/fairseq-py/fairseq/optim/fp16_optimizer.py", line 153, in step
    self.fp32_optimizer.step(closure)
  File "/private/home/antares/work/unsupervised/blank_test/fairseq-py/fairseq/optim/fairseq_optimizer.py", line 98, in step
    self.optimizer.step(closure)
  File "/private/home/antares/work/unsupervised/blank_test/fairseq-py/fairseq/optim/nag.py", line 68, in step
    lr_correct = lr / lr_old
ZeroDivisionError: float division by zero
```

This happens because `num_updates=0` on the first iteration, so the `lr` we set on the optimizer is zero. Pull Request resolved: https://github.com/pytorch/fairseq/pull/1408 Differential Revision: D18637526 Pulled By: myleott fbshipit-source-id: fdd81dd69b1b38bc21a4fa315b4e25cee03af6bf
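A sketch of the failure mode only, not the actual fix in the PR (function names here are illustrative): with linear warmup the scheduler sets lr proportional to num_updates, so the first update writes lr = 0 into the optimizer, and NAG's momentum correction then divides by that stale value.

```python
def warmup_lr(base_lr, num_updates, warmup_updates):
    # hypothetical linear warmup; returns 0.0 when num_updates == 0
    return base_lr * num_updates / warmup_updates

def nag_lr_correct(lr, lr_old):
    # guarding the divisor (one possible mitigation) avoids the
    # ZeroDivisionError shown in the traceback above
    return lr / max(lr_old, 1e-12)
```
-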
ngoyal2707 authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/922 Differential Revision: D18617322 fbshipit-source-id: 50645197cb7f075b5f878818a97358653077c3e0
-
Alex Xiao authored
Summary: Modifying the number of shards internally to disable data sharding for batch iteration is dangerous, because the caller of these tasks is not limited to fairspeq/train. We should therefore put the onus of sharding the data properly on the caller rather than on the task itself. Reviewed By: myleott Differential Revision: D18456424 fbshipit-source-id: d46be16c441c50082f9a768d0b259e6c28a4b67b
-
- 20 Nov, 2019 1 commit
-
Jiatao Gu authored
Summary: Clean up the original NAT loss and make it more general, so it can accommodate new losses used in NAT models. Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/921 Differential Revision: D18610145 Pulled By: MultiPath fbshipit-source-id: d04dd0fc4047b5f8e332cfe66b1e28cbf39494af
-
- 19 Nov, 2019 3 commits
-
Naman Goyal authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/920 Differential Revision: D18593088 fbshipit-source-id: d4479ee8dae2ca623e62e12bd145165a116fb70a
-
freewym authored
Summary: …r to correctly recover the training from a "non-shuffle" checkpoint Pull Request resolved: https://github.com/pytorch/fairseq/pull/1375 Differential Revision: D18566535 Pulled By: myleott fbshipit-source-id: ff7b1a6ead708801f537ec7885e30e37168cd34b
-
ngoyal2707 authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/915 Differential Revision: D18580996 fbshipit-source-id: 9505a81892ba8ad997c03465d6a2d488c379c762
-
- 18 Nov, 2019 3 commits
-
alexeib authored
Summary: Recent layerdrop-related changes break existing models because they assume the presence of certain args. Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/918 Reviewed By: huihuifan Differential Revision: D18578572 Pulled By: alexeib fbshipit-source-id: 368c2d5b3add55864bf59516820807303aac6001
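One common pattern for this kind of backward compatibility (an illustration of the general idea, not necessarily the exact change in this commit): read possibly-missing args with a default instead of assuming old checkpoints carry the new fields.

```python
from argparse import Namespace

# args restored from a checkpoint saved before layerdrop existed,
# so the new fields are absent
args = Namespace(encoder_layers=6)

# fall back to a default when the attribute is missing
encoder_layerdrop = getattr(args, 'encoder_layerdrop', 0.0)
```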
-
Myle Ott authored
Summary: Fixes https://github.com/pytorch/fairseq/issues/1376 Pull Request resolved: https://github.com/pytorch/fairseq/pull/1386 Differential Revision: D18566839 Pulled By: myleott fbshipit-source-id: 71805f58fab90f53f757bf4ef69eb914195af38a
-
Louis Martin authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/913 Differential Revision: D18565866 Pulled By: myleott fbshipit-source-id: e845759dafe915805c2e38f53c6835cbcef5db2f
-
- 17 Nov, 2019 1 commit
-
Angela Fan authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/1385 Differential Revision: D18565188 Pulled By: huihuifan fbshipit-source-id: 9580663b208f286a249bbfa2bacd71f34a01ca9f
-
- 15 Nov, 2019 1 commit
-
Ilia Cherniavskii authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/887 Pull Request resolved: https://github.com/facebookresearch/pytext/pull/1052 Pull Request resolved: https://github.com/pytorch/fairseq/pull/1250 Adds a config parameter "use_torchscript" that enables the use of TorchScript for BERT training. Reviewed By: chenyangyu1988 Differential Revision: D17872083 fbshipit-source-id: 00ac4b04e7f26aa56fe84fe9feaded676d6deb71
-
- 14 Nov, 2019 4 commits
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/911 Differential Revision: D18511627 Pulled By: myleott fbshipit-source-id: 37d7606ae629f9acf84715dbc9045fb683075db4
-
freewym authored
Summary: If training stopped in the middle of the last epoch and was then resumed from a checkpoint, it would not continue, because `epoch_itr.epoch < max_epoch` is no longer satisfied. This PR fixes the issue. Pull Request resolved: https://github.com/pytorch/fairseq/pull/1275 Differential Revision: D18483945 Pulled By: myleott fbshipit-source-id: 80df6f73fa17606a79a28e8328bb4c577f504683
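A hedged sketch of the condition at issue, not the verbatim fix (it assumes `end_of_epoch()` reports whether the current epoch's iterator is exhausted): a strict "<" check skips the remainder of a partially finished last epoch after resuming, so training should also continue while the final epoch is unfinished.

```python
def should_continue_training(epoch_itr, max_epoch):
    # keep training below max_epoch, or while the final epoch
    # still has batches left after resuming mid-epoch
    if epoch_itr.epoch < max_epoch:
        return True
    return epoch_itr.epoch == max_epoch and not epoch_itr.end_of_epoch()
```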
-
Abhimanyu Sharma authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/910 Pull Request resolved: https://github.com/facebookresearch/pytext/pull/1124 Pull Request resolved: https://github.com/pytorch/fairseq/pull/1362 Split the Fairseq MemoryEfficientFP16Optimizer class into two classes so that it can be easily imported in pytext through a wrapper class.
Iter 2 - fixed some issues to ensure the code runs correctly on fblearner.
Iter 3 - fixed review comments, an incorrect import, and lints.
Iter 4 - fixed pytext test breaks.
Iter 5 - fixed pytext test breaks.
Iter 6 - fixed comments and refactored based on a conversation with chenyang.
Reviewed By: chenyangyu1988 Differential Revision: D18410916 fbshipit-source-id: 5238ee553cd2811ed0573825e1c29000980cc489
-
Jiatao Gu authored
Summary: (1) Enable printing the iterative refinement history for all NAT models by setting --retain-iter-history during decoding; (2) fix a small bug in the decoding process of the Levenshtein Transformer. Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/908 Differential Revision: D18493234 Pulled By: MultiPath fbshipit-source-id: 9e7702adcea49f39d3c10b5349b5a9ae66399a24
-
- 13 Nov, 2019 5 commits
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/896 Differential Revision: D18250948 Pulled By: myleott fbshipit-source-id: 7a515311e18795670b29f5e24eeba7619a625da7
-
zheng authored
Summary: As their names suggest, the parameters `embedding_dim`, `ffn_embedding_dim`, and `num_attention_heads` should have type `int`, not `float`. Also validated by https://github.com/pytorch/fairseq/blob/b5f41f828b0ec9b67fa60aceb0778073d1b368b2/fairseq/modules/sparse_transformer_sentence_encoder.py#L22#L24. Pull Request resolved: https://github.com/pytorch/fairseq/pull/1268 Differential Revision: D18372518 Pulled By: myleott fbshipit-source-id: 666739b6270a975536785886068a975e07312bb0
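A sketch of the corrected hints; the class name and default values below are illustrative (see the linked file for the real signature):

```python
class SentenceEncoderLayer:
    def __init__(
        self,
        embedding_dim: int = 768,        # was annotated as float
        ffn_embedding_dim: int = 3072,   # was annotated as float
        num_attention_heads: int = 8,    # was annotated as float
    ) -> None:
        # dimensions and head counts are discrete quantities, hence int
        self.embedding_dim = embedding_dim
        self.ffn_embedding_dim = ffn_embedding_dim
        self.num_attention_heads = num_attention_heads
```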
-
Zhanghao Wu authored
Summary: Originally, 'ppl' was calculated but returned as a string, which would not be printed to tensorboard. Pull Request resolved: https://github.com/pytorch/fairseq/pull/1212 Differential Revision: D18339553 Pulled By: myleott fbshipit-source-id: 52e64d5d173bfd79836a72ee103cb25c8bb2a4c2
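A minimal sketch of the described fix, with illustrative names: report 'ppl' as a number so that numeric logging backends such as tensorboard can plot it, instead of pre-formatting it as a string.

```python
def get_perplexity(loss):
    # return a float (fairseq reports base-2 perplexity, i.e. 2**loss);
    # a pre-formatted string like '{:.2f}'.format(2 ** loss) cannot be plotted
    try:
        return round(2 ** loss, 2)
    except OverflowError:
        return float('inf')
```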
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/907 Differential Revision: D18480215 Pulled By: myleott fbshipit-source-id: b02002f631f6d47380f309d4f464bd135d623280
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/899 Differential Revision: D18373060 Pulled By: myleott fbshipit-source-id: bb5510ec15799a0a10a7c0669e76d8200e1ba479
-
- 12 Nov, 2019 1 commit
-
Spencer Poff authored
Summary: Use PyTorch's IterableDataset for streaming iterators, so that there is a clean differentiation in interface between datasets that stream data and those that support indexed access. Reviewed By: myleott Differential Revision: D18438694 fbshipit-source-id: 482857d8357091ea2a6bf819535b09ba7f1a5b7d
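A minimal sketch of the distinction, using a toy line-by-line text source (the class and file handling are illustrative, not fairseq's implementation): streaming datasets subclass `torch.utils.data.IterableDataset` and only support iteration, while map-style datasets expose `__getitem__`/`__len__` for indexed access.

```python
from torch.utils.data import IterableDataset

class StreamingTextDataset(IterableDataset):
    """Streams examples; no random access, so no __getitem__/__len__."""

    def __init__(self, path):
        self.path = path

    def __iter__(self):
        # yields one example at a time straight from disk
        with open(self.path, encoding='utf-8') as f:
            for line in f:
                yield line.rstrip('\n')
```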
-
- 10 Nov, 2019 1 commit
-
Louis Martin authored
Summary: Checked locally that everything works fine. The model is uploaded to fbaipublicfiles. I fixed a few inconsistencies in the bpe encoding along the way, e.g. related to https://github.com/pytorch/fairseq/issues/1306. Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/904 Reviewed By: ngoyal2707 Differential Revision: D18418345 Pulled By: louismartin fbshipit-source-id: 53acb4d021581968d70430ee9babee07d6573c17
-
- 09 Nov, 2019 1 commit
-
Naman Goyal authored
Summary: This is the first version of the BART code / model release. It still requires a lot of clean-up, instructions, and making sure the results are reproducible before we can release it. Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/902 Differential Revision: D18389535 fbshipit-source-id: 77f16800307ce831bd29538fdd34800793210f46
-
- 08 Nov, 2019 2 commits
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/903 Reviewed By: sujitoc Differential Revision: D18327653 fbshipit-source-id: 739ddbaf54862acdf7b4f1bc3ad538bde5ae00fd
-
Xian Li authored
Summary: Avoid the case where can_ins_mask is all False, so that max_lengths has size [0, 1], which fails the expand_as operator. Move it back into the skipping branch in the script; the same for deletion and ins_word. Reviewed By: kahne Differential Revision: D18365340 fbshipit-source-id: 509ac21d7d6fd9083d0710697288203977314c52
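An illustration of the failure mode described (not the fix itself): when the mask is all False, the selected rows form a tensor with a zero-sized dimension, and expand_as cannot broadcast size 0 up to a larger size.

```python
import torch

can_ins_mask = torch.zeros(4, dtype=torch.bool)  # all False
max_lengths = torch.ones(4, 1)[can_ins_mask]     # shape [0, 1]
try:
    max_lengths.expand_as(torch.ones(4, 5))
except RuntimeError as err:
    # "The expanded size of the tensor (4) must match the existing
    # size (0) at non-singleton dimension 0"
    print(err)
```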
-
- 07 Nov, 2019 2 commits
-
Kevin authored
Summary: Solves https://github.com/pytorch/fairseq/issues/1218. Pull Request resolved: https://github.com/pytorch/fairseq/pull/1219 Differential Revision: D18339541 Pulled By: myleott fbshipit-source-id: 6d5bd7b60fa7fd30c038fdad54591343a01f228b
-
Louis MARTIN authored
Summary: Models seem to train fine with this modification. I checked that the mask for beginnings of words is correct, but didn't verify that the actual masking works correctly. Pull Request resolved: https://github.com/pytorch/fairseq/pull/1292 Differential Revision: D18338307 Pulled By: myleott fbshipit-source-id: eae9e29d6ab648e768d70921694a898554496704
-