- 03 Jun, 2019 2 commits
-
Haoran Li authored
Summary: lm_output_learned_bias doesn't exist when loading the model for fine-tuning Reviewed By: jingfeidu Differential Revision: D15579190 fbshipit-source-id: 45e8e193399943c89b77cc553d3d6d49b056e55a
-
Nathan Ng authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/621 Differential Revision: D15571435 Pulled By: myleott fbshipit-source-id: 67d25b00c8c1bc69dbffd8521da56f7cc14eb75e
-
- 02 Jun, 2019 2 commits
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/625 Differential Revision: D15595787 Pulled By: myleott fbshipit-source-id: ba6edf305ed41be392194f492e034dd66d1743fe
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/624 Differential Revision: D15595746 Pulled By: myleott fbshipit-source-id: b79e489de9ff37ee7cbf939092a6e5ec0dbebbf5
-
- 01 Jun, 2019 1 commit
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/622 Differential Revision: D15572555 Pulled By: myleott fbshipit-source-id: 2b81f22207b4c894ffe645af0b45c70ac0a80612
-
- 31 May, 2019 1 commit
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/620 Differential Revision: D15569440 Pulled By: myleott fbshipit-source-id: c4681f1c72467c04cd2654e87bc724c94b76e3fb
-
- 30 May, 2019 7 commits
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/617 Differential Revision: D15555328 Pulled By: myleott fbshipit-source-id: 35d1f329f887cb0b867c7a22f17a16f3c9c66815
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/619 Differential Revision: D15562983 Pulled By: myleott fbshipit-source-id: 9240f56f18c87120b7d38e0db374d24a55999395
-
Khoa Ho authored
Summary: Change the wording to avoid confusion. Mixed precision ensures both higher arithmetic throughput and numerical stability; it is not exactly synonymous with pure half-precision/FP16 training. Also mention tensor cores, since older-generation GPUs without tensor cores don't support true mixed precision training. Pull Request resolved: https://github.com/pytorch/fairseq/pull/766 Differential Revision: D15559565 Pulled By: myleott fbshipit-source-id: c71e720772657bb3e8ad330b58bf69e23beb614e
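To make the distinction concrete, a toy sketch (not fairseq's FP16 optimizer; loss scaling omitted): the forward/backward runs in FP16 for throughput, while a master copy of the weights is kept and updated in FP32 for stability.
```python
import torch

device = "cuda"  # true mixed precision needs a GPU (tensor cores on Volta+)
master_w = torch.randn(1024, 1024, device=device)  # FP32 master weights
lr = 0.1

for step in range(3):
    w16 = master_w.half().requires_grad_()  # FP16 compute copy of the weights
    x = torch.randn(32, 1024, device=device, dtype=torch.float16)
    loss = (x @ w16).float().pow(2).mean()  # matmul in FP16, reduce in FP32
    loss.backward()
    # the update happens in FP32, so small gradient values are not
    # flushed to zero the way they would be in a pure-FP16 update
    with torch.no_grad():
        master_w -= lr * w16.grad.float()
```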
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/613 Differential Revision: D15541384 Pulled By: myleott fbshipit-source-id: ef2c0b0a51cdf37af2ccff0546f524d49f87e65d
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/618 Differential Revision: D15552599 Pulled By: myleott fbshipit-source-id: 2192a30a9c5af31b954a3a1716166dd6ba27b23a
-
Sujit Verma authored
Summary: Changes for supporting tensorboard scalar plotting. Reviewed By: myleott Differential Revision: D15456534 Pulled By: myleott fbshipit-source-id: a012a4eea028aae764ce11786570b7d96841c4a5
-
lukovnikov authored
Summary: Not sure if I'm doing something wrong elsewhere, but I had a device error in `SinusoidalPositionalEmbedding` when running on GPU > 0 because the weights were on a different device than the input. Pull Request resolved: https://github.com/pytorch/fairseq/pull/746 Differential Revision: D15547217 Pulled By: myleott fbshipit-source-id: 37849d895ce483c14615fdb4ace8a8c4fb05b568
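A self-contained sketch of the failure mode and the fix (simplified, not the exact diff): the sinusoidal table is cached as a plain tensor rather than a parameter or buffer, so `.to()`/`.cuda()` on the module never moves it, and it must be moved to the input's device explicitly.
```python
import torch
import torch.nn as nn

class SinusoidalPositions(nn.Module):
    """Minimal stand-in for SinusoidalPositionalEmbedding: the table is a
    plain tensor (not a Parameter/buffer), hence the device mismatch when
    the module runs on a GPU other than cuda:0."""

    def __init__(self, embedding_dim):
        super().__init__()
        self.embedding_dim = embedding_dim
        self.weights = None  # built lazily on first forward

    def forward(self, tokens):
        bsz, seq_len = tokens.shape
        if self.weights is None or self.weights.size(0) < seq_len:
            half = self.embedding_dim // 2
            freqs = torch.exp(
                torch.arange(half, dtype=torch.float32)
                * -(torch.log(torch.tensor(10000.0)) / half)
            )
            pos = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1) * freqs
            self.weights = torch.cat([pos.sin(), pos.cos()], dim=1)
        # the fix: keep the cached table on the input's device, so indexing
        # works on cuda:1, cuda:2, ... and not just the default device
        self.weights = self.weights.to(tokens.device)
        positions = torch.arange(seq_len, device=tokens.device)
        return self.weights.index_select(0, positions).unsqueeze(0).expand(bsz, -1, -1)
```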
-
- 29 May, 2019 7 commits
-
Zhanghao Wu authored
Summary: Fix the mismatching between the parameter fed into `SummaryWriter` and the API of the latest [tensorboardX](https://github.com/lanpa/tensorboardX/blob/3e35c9b5f85e8ceb0294532d9eb772341a04c097/tensorboardX/writer.py#L192), i.e. "log_dir" -> "logdir". Pull Request resolved: https://github.com/pytorch/fairseq/pull/763 Differential Revision: D15547192 Pulled By: myleott fbshipit-source-id: c51b88da5ec589fb8ca5b4876bc229efeb7bf494
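The call-site change this implies, roughly (the log directory path is illustrative):
```python
from tensorboardX import SummaryWriter

# older tensorboardX accepted log_dir=...; the version referenced above
# renamed the keyword argument to logdir=
writer = SummaryWriter(logdir="runs/my_experiment")
writer.add_scalar("train/loss", 0.5, global_step=1)
writer.close()
```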
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/612 Differential Revision: D15541377 Pulled By: myleott fbshipit-source-id: 4762516a3b545d03bc81d3660f47827e15466dce
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/611 Differential Revision: D15541303 Pulled By: myleott fbshipit-source-id: 279ca813437c834fca49576a48b75cbf1fdf0e76
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/610 Differential Revision: D15541261 Pulled By: myleott fbshipit-source-id: f0b823cf4f04c5ef3205f6d259c6dcad4cc329b1
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/608 Differential Revision: D15541220 Pulled By: myleott fbshipit-source-id: 52a8e4da72cc6e3e25cf98c989d34a269d614c9d
-
Spencer Poff authored
Summary: There were two non-obvious errors I ran into while creating a new language modeling task:
- `transformer_lm` implicitly required the `tokens_per_sample` arg
- `transformer_lm` assumed the task had a `dictionary` and `output_dictionary` property, neither of which is specified in the FairseqTask interface

Reviewed By: myleott Differential Revision: D15532345 fbshipit-source-id: 200d7d3b542c35f17cc2d6bca4219c4a4d17cb6b
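For context, a minimal sketch of a task that satisfies both implicit requirements (the class name and defaults are illustrative, assuming the 2019-era fairseq task API):
```python
from fairseq.tasks import FairseqTask, register_task

@register_task("my_language_modeling")
class MyLanguageModelingTask(FairseqTask):
    @staticmethod
    def add_args(parser):
        # transformer_lm implicitly reads this arg (e.g. to size its
        # positional embeddings), so the task must register it
        parser.add_argument("--tokens-per-sample", default=1024, type=int)

    def __init__(self, args, dictionary):
        super().__init__(args)
        self._dictionary = dictionary

    # transformer_lm assumes both properties exist on the task, even
    # though the base FairseqTask interface does not declare them
    @property
    def dictionary(self):
        return self._dictionary

    @property
    def output_dictionary(self):
        return self._dictionary
```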
-
Kartikay Khandelwal authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/765 Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/614 This diff has the changes needed to make XLM exportable with TorchScript. Reviewed By: bethebunny Differential Revision: D15497208 fbshipit-source-id: fd9645119e154e3c397f147acf9144d661d9a5c8
-
- 28 May, 2019 1 commit
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/605 Differential Revision: D15518167 Pulled By: myleott fbshipit-source-id: 8b0e6b32adff018136d0d251b7fde3818e373d6f
-
- 24 May, 2019 2 commits
-
Yongqiang Wang authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/747 In https://github.com/pytorch/fairseq/pull/647, checkpoint averaging was not implemented correctly when it comes to shared parameters. This diff has the right implementation and a test case to guard against future changes. Reviewed By: myleott Differential Revision: D15402943 fbshipit-source-id: 8004836d5c2571814ea54844650618008a9ee522
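To illustrate the pitfall: with shared parameters, two state_dict keys can alias the same tensor, so any in-place averaging over live parameter objects touches that tensor twice. A minimal key-wise average over loaded state_dicts, which sidesteps the aliasing (a sketch, not the actual fairseq script):
```python
import collections
import torch

def average_checkpoints(paths):
    """Average model weights across checkpoint files, key by key.

    The accumulator holds a fresh tensor per key, so a tensor shared
    between e.g. encoder.embed_tokens.weight and decoder.embed_tokens.weight
    in the live model is still averaged exactly once per key.
    """
    avg = collections.OrderedDict()
    for path in paths:
        state = torch.load(path, map_location="cpu")["model"]
        for key, value in state.items():
            value = value.to(torch.float64)  # accumulate in fp64 for stability
            avg[key] = avg.get(key, 0) + value
    n = len(paths)
    return collections.OrderedDict(
        (key, (value / n).to(torch.float32)) for key, value in avg.items()
    )
```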
-
Jingfei Du authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/758 Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/603 Fixed a typo in `_mask_block` of the masked LM. The typo meant we never replaced a masked token with a random token, which should happen for 10% of the masked tokens. Reviewed By: akinh Differential Revision: D15492315 fbshipit-source-id: 1e03dc862e23a6543e51d7401c74608d366ba62d
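For reference, the standard BERT-style corruption the fix restores (a self-contained sketch, not fairseq's `_mask_block`): of the positions selected for masking, 80% become the mask token, 10% become a random token, and 10% keep the original token.
```python
import numpy as np

def corrupt_masked_positions(tokens, mask, mask_idx, vocab_size, rng):
    """Apply 80/10/10 corruption at the masked positions.

    tokens: 1-D int array; mask: boolean array of the same shape marking
    the positions chosen for prediction.
    """
    tokens = tokens.copy()
    for pos in np.flatnonzero(mask):
        draw = rng.random()
        if draw < 0.8:
            tokens[pos] = mask_idx                  # 80%: replace with <mask>
        elif draw < 0.9:
            tokens[pos] = rng.integers(vocab_size)  # 10%: random token
        # remaining 10%: leave the original token in place
    return tokens

# usage: corrupt_masked_positions(toks, msk, 4, 30000, np.random.default_rng(0))
```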
-
- 23 May, 2019 3 commits
-
Jason Fried authored
Summary: In Python 3.7, importing abc classes from `collections` raises a warning; in 3.8 this will not work at all. This changes all code using abc's from `collections` to import from `collections.abc` instead. I am not fixing existing lints; where `arc lint` auto-fixed, I accepted the fix, except for spelling in code. Reviewed By: lisroach Differential Revision: D15461049 fbshipit-source-id: ac2bf2ec8cffacd8ba5572882b0832bbf99a1646
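The pattern the sweep applies, shown on a typical import (a generic example, not a specific fairseq file):
```python
# Before (warns on Python 3.7, slated to stop working):
#   from collections import Iterable
# After:
from collections.abc import Iterable

def flatten(nested):
    """Yield leaves from an arbitrarily nested iterable."""
    for item in nested:
        if isinstance(item, Iterable) and not isinstance(item, (str, bytes)):
            yield from flatten(item)
        else:
            yield item
```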
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/600 Differential Revision: D15469322 Pulled By: myleott fbshipit-source-id: fdefa8efbb10e48b2a04a6bc10404fd2f3f21ecf
-
Kritika Singh authored
Summary: Context from https://fb.workplace.com/groups/1405155842844877/permalink/2785095451517569/: I am adding a model to pyspeech (formerly fairspeq) with the following `forward`:
```
def forward(self, src_tokens, src_lengths, prev_output_tokens, name):
    encoder_out = self.encoder(src_tokens, src_lengths)
    if name == Dataset.d1:
        decoder_out = self.decoder1(prev_output_tokens, encoder_out)
    elif name == Dataset.d2:
        decoder_out = self.decoder2(encoder_out)
    return decoder_out
```
When I run distributed training on this model, I get the following error:
```
RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss. You can enable unused parameter detection by (1) passing the keyword argument `find_unused_parameters=True` to `torch.nn.parallel.DistributedDataParallel`; (2) making sure all `forward` function outputs participate in calculating loss. If you already have done the above two steps, then the distributed data parallel module wasn't able to locate the output tensors in the return value of your module's `forward` function. Please include the loss function and the structure of the return value of `forward` of your module when reporting this issue (e.g. list, dict, iterable). (prepare_for_backward at caffe2/torch/csrc/distributed/c10d/reducer.cpp:410)
```
The recommended fix is to pass `find_unused_parameters=True` to `DistributedDataParallel`'s initialization. Reviewed By: myleott Differential Revision: D15439726 fbshipit-source-id: 7fd80d4a3f49ac90182dec723b49b14e6689406a
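A minimal sketch of that fix at the wrapping site (process-group setup is assumed to have happened already; `model` stands for the multi-decoder module above):
```python
import torch
from torch.nn.parallel import DistributedDataParallel

# assumes torch.distributed.init_process_group(...) has already run and
# `model` is the module described above
model = model.cuda()
ddp_model = DistributedDataParallel(
    model,
    device_ids=[torch.cuda.current_device()],
    # let the reducer detect parameters that took no part in producing the
    # loss this step (the decoder that wasn't selected), instead of waiting
    # forever for their gradients
    find_unused_parameters=True,
)
```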
-
- 22 May, 2019 2 commits
-
Matt Le authored
Summary: Fixes the semisupervised translation task to deal with the change in the order of data loading and model creation (D15428242). When we build the model, we create the backtranslation function, which we can then pass to the constructor of BacktranslationDataset. Reviewed By: myleott Differential Revision: D15455420 fbshipit-source-id: 95101ca92f8af33702be3416147edd98da135a20
-
zhiqiang authored
Summary: Remove duplicate definition of PositionalEmbedding in `lightconv.py` Pull Request resolved: https://github.com/pytorch/fairseq/pull/754 Differential Revision: D15451443 Pulled By: myleott fbshipit-source-id: a3d82ab2c1335d66be3c5d67a07893162d138c7a
-
- 21 May, 2019 3 commits
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/595 Differential Revision: D15428242 Pulled By: myleott fbshipit-source-id: 3cec83a2353498a4802398eba8bcb1aefaf6d5c4
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/596 Differential Revision: D15432359 Pulled By: myleott fbshipit-source-id: ebfdf0031864c3c88357543c0202ba0bd65a7b90
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/597 Differential Revision: D15432965 Pulled By: myleott fbshipit-source-id: 4471a2a8bb468bb639a80f977ab4c20480acb461
-
- 20 May, 2019 4 commits
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/592 Differential Revision: D15415499 Pulled By: myleott fbshipit-source-id: 87ba09b9b38501daebd95bbf28815e048c78f9a3
-
Jingfei Du authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/752 Previously we sampled masked tokens with `replace=True` (the default). Because of this, we could mask the same token multiple times, which meant we ultimately masked fewer tokens than intended. Reviewed By: liaimi Differential Revision: D15403556 fbshipit-source-id: cf12eeb13f9610431136a345de9199ad0292984b
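The gist of the fix, sketched with numpy (assuming mask positions are drawn with `numpy.random`):
```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, num_to_mask = 100, 15

# replace=True (the old default) can draw the same position twice, so
# fewer than num_to_mask distinct positions may end up masked
with_replacement = rng.choice(seq_len, num_to_mask, replace=True)

# replace=False guarantees exactly num_to_mask distinct positions
without_replacement = rng.choice(seq_len, num_to_mask, replace=False)

assert len(set(without_replacement)) == num_to_mask
print(len(set(with_replacement)), "<=", num_to_mask)
```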
-
Ning Dong authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/730 Pull Request resolved: https://github.com/pytorch/translate/pull/528 Add/modify the necessary functions for ConcatDataset to work in PytorchTranslateTask, and replace MultiCorpusSampledDataset, which doesn't support mixed batches. Any ideas on how to implement the collater here for mixed batches? For now I'm just using the collater of the first dataset. Reviewed By: liezl200 Differential Revision: D15260872 fbshipit-source-id: 14b148c506e9f8ebf4fe60a49f95444d4123d76f
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/591 Differential Revision: D15415490 Pulled By: myleott fbshipit-source-id: c45df5f3b5327911e2c9b11642e7da2e8bb835dc
-
- 19 May, 2019 1 commit
-
Kartikay Khandelwal authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/570 Pull Request resolved: https://github.com/pytorch/fairseq/pull/731 Currently the LearnedPositionalEmbedding module computes the position tensor based on the input data. However, this doesn't work for XLM, where we need different behavior for the Masked LM and Translation LM objectives. In this diff I keep the same default behavior for LearnedPositionalEmbedding as before, but add the ability for these models to work with pre-computed position tensors. Reviewed By: myleott Differential Revision: D15305474 fbshipit-source-id: de7d908245a2a620b58d36055211600a08f2d1dc
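A sketch of the shape of this change (simplified; the optional argument name is illustrative): by default positions are still derived from the input, but callers may pass a precomputed tensor.
```python
import torch
import torch.nn as nn

class LearnedPositionalEmbedding(nn.Embedding):
    """Sketch: positions are computed from the input unless given."""

    def forward(self, input, positions=None):
        if positions is None:
            # default (pre-diff) behavior: count non-pad tokens left to
            # right; pad positions stay at padding_idx
            mask = input.ne(self.padding_idx).long()
            positions = torch.cumsum(mask, dim=1) * mask + self.padding_idx
        # XLM-style callers can instead pass positions computed elsewhere,
        # e.g. different schemes for Masked LM vs. Translation LM inputs
        return super().forward(positions)
```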
-
- 17 May, 2019 2 commits
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/588 Differential Revision: D15389638 Pulled By: myleott fbshipit-source-id: 4632ce22d51dc2c74d250bae999630095d849701
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/586 Differential Revision: D15372949 Pulled By: myleott fbshipit-source-id: c1cf1c645e8d55fc8568f23a47c45677ac9ab1da
-
- 16 May, 2019 2 commits
-
Jingfei Du authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/744 Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/587 After we added additional prediction layers for language model predictions, fine-tuning broke for two reasons: 1. the checkpoint cannot be loaded, since we didn't update the state_dict names; 2. lm_output_learned_bias is not initialized if load_softmax is false. Reviewed By: myleott Differential Revision: D15377380 fbshipit-source-id: d58544b1d2c549586abef42fec19ec8bf27a994a
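A common pattern for fixing reason 1, sketched with hypothetical key names (not the exact ones in this diff): rename stale keys while loading an old checkpoint.
```python
def upgrade_state_dict(state_dict):
    """Rename keys saved before the extra prediction layers were added,
    so older checkpoints still load for fine-tuning. The concrete key
    names below are hypothetical placeholders."""
    renames = {
        "decoder.embed_out": "decoder.lm_head.weight",  # hypothetical
    }
    for old, new in renames.items():
        if old in state_dict:
            state_dict[new] = state_dict.pop(old)
    return state_dict
```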
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/743 Original commit changeset: 0afe37c9a031. According to edunov: "We need to be careful here with shared parameters, I believe right now it is broken if you have shared encoder/decoder input embeddings (encoder.embed_tokens.weight and decoder.embed_tokens.weight) as they get updated several times." We also have OSS issues that look related, e.g., https://github.com/pytorch/fairseq/issues/732. Backing this out until we can confirm the correct behavior for shared params. Differential Revision: D15372673 fbshipit-source-id: 8683c0f2514e21fa1e9d2fe6dfc48d98957a2831
-