- 11 Jun, 2019 6 commits
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/797 Differential Revision: D15761071 Pulled By: myleott fbshipit-source-id: 257d4a2297e83da7e59baed154dbafd6bfe614bf
Myle Ott authored
Summary: This is a temporary workaround to support sampling after https://github.com/pytorch/fairseq/issues/713. We'll need to revisit this to support sampling and beam more generally. Pull Request resolved: https://github.com/pytorch/fairseq/pull/796 Differential Revision: D15760808 Pulled By: myleott fbshipit-source-id: ecaf4f161b0c30de037f32007e4610a559a49230
Bairen Yi authored
Summary: See #467. Ping myleott to review. This is a work-related contribution. Ping lark to review. Pull Request resolved: https://github.com/pytorch/fairseq/pull/794 Differential Revision: D15756816 Pulled By: myleott fbshipit-source-id: 6dce3ff3a713bf5f60e5782bc260b2ca9d2c0a9b
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/792 Differential Revision: D15741781 Pulled By: myleott fbshipit-source-id: c256c7900c307d485904e69b1526b9acbe08fec9
yilinyang7 authored
When given prefix_tokens, the sequence generator would generate (exactly) the same finished candidates (#713) Summary: https://github.com/pytorch/fairseq/issues/712 Pull Request resolved: https://github.com/pytorch/fairseq/pull/713 Differential Revision: D15242432 Pulled By: myleott fbshipit-source-id: a230ee48f4bf891c805609c428d7233a0ad21179
Sergey Edunov authored
Summary: Multi-head attention is currently not TPU-friendly: specifically, .data_ptr() is not supported and should not be used. There are also potential correctness issues with the existing code (e.g. data_ptr() can point to the same storage for different tensors). Rather than rely on data_ptr(), we should explicitly set the self_attention or encoder_decoder_attention flags. Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/636 Reviewed By: myleott Differential Revision: D15709898 Pulled By: edunov fbshipit-source-id: f931713193c51be848a5de20da730ac3a3ce0187
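The flag-based approach can be sketched without torch (hypothetical class and method names, not fairseq's actual `MultiheadAttention` API):

```python
class AttentionVariant:
    """Declare the attention variant explicitly at construction time,
    instead of inferring it at forward time by comparing storage
    pointers (e.g. query.data_ptr() == key.data_ptr()), which is
    unsupported on TPU and can alias across distinct tensors."""

    def __init__(self, self_attention=False, encoder_decoder_attention=False):
        if self_attention and encoder_decoder_attention:
            raise ValueError("at most one attention flag may be set")
        self.self_attention = self_attention
        self.encoder_decoder_attention = encoder_decoder_attention

    def projection_mode(self):
        # Replaces the old pointer-equality heuristics with explicit flags.
        if self.self_attention:
            return "same_qkv"   # q, k, v all derive from the same input
        if self.encoder_decoder_attention:
            return "same_kv"    # k and v come from the encoder output
        return "separate"
```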
- 10 Jun, 2019 2 commits
Myle Ott authored
Summary: - make it possible to load file_utils.py without the dependencies - add some more demo features Pull Request resolved: https://github.com/pytorch/fairseq/pull/791 Differential Revision: D15739950 Pulled By: myleott fbshipit-source-id: 38df5209973a6fe2e3651575b97134e096aaf5bf
freewym authored
Summary: In the current progress bar, the counter for log_interval will always start from 0, which is not correct if reloading from a checkpoint in the middle of an epoch. This fix obtains the offset from the iterator to set the counter correctly. Pull Request resolved: https://github.com/pytorch/fairseq/pull/778 Differential Revision: D15739953 Pulled By: myleott fbshipit-source-id: a1d13403ec5783b22e01d7cb63874fd8dea7f8b0
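The idea behind the fix can be sketched as follows (hypothetical `ProgressCounter`, not the actual progress_bar code):

```python
class ProgressCounter:
    """Seed the log_interval counter with the iterator's offset rather
    than always starting at 0, so step numbers line up after resuming
    mid-epoch from a checkpoint."""

    def __init__(self, iterable, offset=0):
        self.iterable = iterable
        self.n = offset  # previously hard-coded to start at 0

    def __iter__(self):
        for item in self.iterable:
            self.n += 1
            yield item
```

For example, resuming an epoch at batch 500 yields step numbers 501, 502, ... instead of restarting at 1.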
- 07 Jun, 2019 1 commit
Ning Dong authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/770 Without this change comment here https://fburl.com/w1cejgw9 is inconsistent with the implementation. Reviewed By: xianxl Differential Revision: D15582826 fbshipit-source-id: 16d8368560153b251beed8b290f51fcdd8a8faee
- 06 Jun, 2019 1 commit
Matt Le authored
Reviewed By: pipibjc Differential Revision: D15635402 fbshipit-source-id: e92fab914de40775d7bad851420355240d822bde
- 04 Jun, 2019 4 commits
Matt Le authored
Summary: We never actually load the model parameters from an XLM model when using transformer_from_pretrained_xlm. Also, change encoder_learned_pos from True -> False. Reviewed By: liezl200 Differential Revision: D15629061 fbshipit-source-id: 759eadc88041eae94505477960de57dd78a99dcb
lematt1991 authored
Summary: Resolves #762 Pull Request resolved: https://github.com/pytorch/fairseq/pull/776 Differential Revision: D15631503 Pulled By: lematt1991 fbshipit-source-id: 103f77d553476917b8b0f8001767217fb311d920
lematt1991 authored
Summary: Resolves #768 Pull Request resolved: https://github.com/pytorch/fairseq/pull/769 Differential Revision: D15621841 Pulled By: lematt1991 fbshipit-source-id: 694effe3788ff7d04864217d673608ec31da589e
Biao Lu authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/630 Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/629 Pull Request resolved: https://github.com/pytorch/translate/pull/562 Pull Request resolved: https://github.com/pytorch/fairseq/pull/774 Forked masked_lm_dictionary from fairseq, changed the imports in pytorch_translate to use the new masked_lm_dictionary, and registered the corresponding tasks. Reviewed By: liezl200 Differential Revision: D15410352 fbshipit-source-id: 06516caabdd4dc5cdee9ad1d8025978f4eea6c4b
- 03 Jun, 2019 2 commits
Haoran Li authored
Summary: lm_output_learned_bias doesn't exist when loading the model for fine-tuning Reviewed By: jingfeidu Differential Revision: D15579190 fbshipit-source-id: 45e8e193399943c89b77cc553d3d6d49b056e55a
Nathan Ng authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/621 Differential Revision: D15571435 Pulled By: myleott fbshipit-source-id: 67d25b00c8c1bc69dbffd8521da56f7cc14eb75e
- 02 Jun, 2019 2 commits
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/625 Differential Revision: D15595787 Pulled By: myleott fbshipit-source-id: ba6edf305ed41be392194f492e034dd66d1743fe
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/624 Differential Revision: D15595746 Pulled By: myleott fbshipit-source-id: b79e489de9ff37ee7cbf939092a6e5ec0dbebbf5
- 01 Jun, 2019 1 commit
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/622 Differential Revision: D15572555 Pulled By: myleott fbshipit-source-id: 2b81f22207b4c894ffe645af0b45c70ac0a80612
- 31 May, 2019 1 commit
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/620 Differential Revision: D15569440 Pulled By: myleott fbshipit-source-id: c4681f1c72467c04cd2654e87bc724c94b76e3fb
- 30 May, 2019 7 commits
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/617 Differential Revision: D15555328 Pulled By: myleott fbshipit-source-id: 35d1f329f887cb0b867c7a22f17a16f3c9c66815
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/619 Differential Revision: D15562983 Pulled By: myleott fbshipit-source-id: 9240f56f18c87120b7d38e0db374d24a55999395
Khoa Ho authored
Summary: Change the wording to avoid confusion. Mixed precision ensures both higher arithmetic throughput and numerical stability; it is not exactly synonymous with pure half-precision/FP16 training. Also mention tensor cores, since older-generation GPUs without tensor cores don't support true mixed precision training. Pull Request resolved: https://github.com/pytorch/fairseq/pull/766 Differential Revision: D15559565 Pulled By: myleott fbshipit-source-id: c71e720772657bb3e8ad330b58bf69e23beb614e
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/613 Differential Revision: D15541384 Pulled By: myleott fbshipit-source-id: ef2c0b0a51cdf37af2ccff0546f524d49f87e65d
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/618 Differential Revision: D15552599 Pulled By: myleott fbshipit-source-id: 2192a30a9c5af31b954a3a1716166dd6ba27b23a
Sujit Verma authored
Summary: Changes for supporting tensorboard scalar plotting. Reviewed By: myleott Differential Revision: D15456534 Pulled By: myleott fbshipit-source-id: a012a4eea028aae764ce11786570b7d96841c4a5
lukovnikov authored
Summary: Not sure if I'm doing something wrong elsewhere, but I had a device error in `SinusoidalPositionalEmbedding` when running on GPU > 0 because the weights were on a different device than the input. Pull Request resolved: https://github.com/pytorch/fairseq/pull/746 Differential Revision: D15547217 Pulled By: myleott fbshipit-source-id: 37849d895ce483c14615fdb4ace8a8c4fb05b568
- 29 May, 2019 7 commits
Zhanghao Wu authored
Summary: Fix the mismatch between the parameter fed into `SummaryWriter` and the API of the latest [tensorboardX](https://github.com/lanpa/tensorboardX/blob/3e35c9b5f85e8ceb0294532d9eb772341a04c097/tensorboardX/writer.py#L192), i.e. "log_dir" -> "logdir". Pull Request resolved: https://github.com/pytorch/fairseq/pull/763 Differential Revision: D15547192 Pulled By: myleott fbshipit-source-id: c51b88da5ec589fb8ca5b4876bc229efeb7bf494
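A version-tolerant call can be sketched as follows (hypothetical helper; the actual commit simply renames the kwarg):

```python
def make_summary_writer(writer_cls, path):
    """Handle both spellings of the tensorboardX SummaryWriter kwarg:
    newer releases renamed the constructor argument log_dir -> logdir."""
    try:
        return writer_cls(logdir=path)
    except TypeError:
        # older tensorboardX only accepts log_dir
        return writer_cls(log_dir=path)
```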
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/612 Differential Revision: D15541377 Pulled By: myleott fbshipit-source-id: 4762516a3b545d03bc81d3660f47827e15466dce
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/611 Differential Revision: D15541303 Pulled By: myleott fbshipit-source-id: 279ca813437c834fca49576a48b75cbf1fdf0e76
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/610 Differential Revision: D15541261 Pulled By: myleott fbshipit-source-id: f0b823cf4f04c5ef3205f6d259c6dcad4cc329b1
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/608 Differential Revision: D15541220 Pulled By: myleott fbshipit-source-id: 52a8e4da72cc6e3e25cf98c989d34a269d614c9d
Spencer Poff authored
Summary: There were two non-obvious errors I ran into while creating a new language modeling task:
- `transformer_lm` implicitly required the `tokens_per_sample` arg
- `transformer_lm` assumed the task had a `dictionary` and `output_dictionary` property, neither of which is specified in the FairseqTask interface
Reviewed By: myleott Differential Revision: D15532345 fbshipit-source-id: 200d7d3b542c35f17cc2d6bca4219c4a4d17cb6b
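The missing pieces of the interface can be sketched as follows (a hypothetical task class, not the actual fairseq LanguageModelingTask):

```python
class LanguageModelingTaskSketch:
    """Make the attributes transformer_lm depends on explicit: a
    tokens_per_sample argument plus dictionary/output_dictionary
    properties (output_dictionary defaults to the input dictionary)."""

    def __init__(self, dictionary, tokens_per_sample=1024, output_dictionary=None):
        self._dictionary = dictionary
        self._output_dictionary = output_dictionary
        self.tokens_per_sample = tokens_per_sample

    @property
    def dictionary(self):
        return self._dictionary

    @property
    def output_dictionary(self):
        return self._output_dictionary or self._dictionary
```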
Kartikay Khandelwal authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/765 Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/614 This diff has changes needed to make XLM torchscript exportable. Reviewed By: bethebunny Differential Revision: D15497208 fbshipit-source-id: fd9645119e154e3c397f147acf9144d661d9a5c8
- 28 May, 2019 1 commit
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/605 Differential Revision: D15518167 Pulled By: myleott fbshipit-source-id: 8b0e6b32adff018136d0d251b7fde3818e373d6f
- 24 May, 2019 2 commits
Yongqiang Wang authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/747 In https://github.com/pytorch/fairseq/pull/647, checkpoint averaging was not implemented correctly when it comes to shared parameters. This diff has the right implementation and a test case to guard against future regressions. Reviewed By: myleott Differential Revision: D15402943 fbshipit-source-id: 8004836d5c2571814ea54844650618008a9ee522
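The shared-parameter pitfall can be illustrated with scalar "parameters" (a minimal sketch, not fairseq's actual average_checkpoints script):

```python
def average_checkpoints(state_dicts):
    """Average each entry across checkpoints by key, into a fresh dict.
    With tied parameters the same tensor can appear under several keys;
    accumulating in place into an already-loaded state dict would then
    add the shared storage more than once, which is the kind of bug
    this commit guards against."""
    avg = {}
    for key in state_dicts[0]:
        avg[key] = sum(sd[key] for sd in state_dicts) / len(state_dicts)
    return avg
```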
Jingfei Du authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/758 Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/603 Fixed a typo in `_mask_block` of the masked LM. The typo meant we never replaced a masked token with a random token, which should happen for 10% of the masked tokens. Reviewed By: akinh Differential Revision: D15492315 fbshipit-source-id: 1e03dc862e23a6543e51d7401c74608d366ba62d
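The intended 80/10/10 split can be sketched as follows (a hypothetical helper, not the actual `_mask_block`):

```python
import random

def apply_block_mask(tokens, positions, mask_idx, vocab, rng):
    """BERT-style masking over the selected positions: 80% become the
    mask token, 10% become a random vocabulary token (the branch the
    typo had effectively disabled), and 10% are left unchanged."""
    out = list(tokens)
    for pos in positions:
        r = rng.random()
        if r < 0.8:
            out[pos] = mask_idx
        elif r < 0.9:
            out[pos] = rng.choice(vocab)
        # else: keep the original token
    return out
```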
- 23 May, 2019 3 commits
Jason Fried authored
Summary: In Python 3.7, importing abc classes from `collections` (rather than `collections.abc`) emits a deprecation warning, and in a future release it will not work at all. This changes all code importing abcs from `collections` to import from `collections.abc` instead. I am not fixing existing lint errors; where `arc lint` auto-fixed, I accepted the fix, except for spelling in code. Reviewed By: lisroach Differential Revision: D15461049 fbshipit-source-id: ac2bf2ec8cffacd8ba5572882b0832bbf99a1646
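The pattern applied throughout is the following (the `describe` consumer is a hypothetical example):

```python
try:
    from collections.abc import Mapping, Sequence  # Python 3.3+
except ImportError:
    # fallback for very old interpreters; the bare `collections`
    # aliases are deprecated and slated for removal
    from collections import Mapping, Sequence

def describe(obj):
    # Example consumer: isinstance checks behave the same either way.
    if isinstance(obj, Mapping):
        return "mapping"
    if isinstance(obj, Sequence) and not isinstance(obj, str):
        return "sequence"
    return "scalar"
```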
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/600 Differential Revision: D15469322 Pulled By: myleott fbshipit-source-id: fdefa8efbb10e48b2a04a6bc10404fd2f3f21ecf
Kritika Singh authored
Summary: Context from https://fb.workplace.com/groups/1405155842844877/permalink/2785095451517569/: I am adding a model to pyspeech (formerly fairspeq) with the following `forward`:

```
def forward(self, src_tokens, src_lengths, prev_output_tokens, name):
    encoder_out = self.encoder(src_tokens, src_lengths)
    if name == Dataset.d1:
        decoder_out = self.decoder1(prev_output_tokens, encoder_out)
    elif name == Dataset.d2:
        decoder_out = self.decoder2(encoder_out)
    return decoder_out
```

When I run distributed training on this model, I get the following error:

```
RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss. You can enable unused parameter detection by (1) passing the keyword argument `find_unused_parameters=True` to `torch.nn.parallel.DistributedDataParallel`; (2) making sure all `forward` function outputs participate in calculating loss. If you already have done the above two steps, then the distributed data parallel module wasn't able to locate the output tensors in the return value of your module's `forward` function. Please include the loss function and the structure of the return value of `forward` of your module when reporting this issue (e.g. list, dict, iterable). (prepare_for_backward at caffe2/torch/csrc/distributed/c10d/reducer.cpp:410)
```

The recommended fix is to pass `find_unused_parameters=True` to DistributedDataParallel's initialization. Reviewed By: myleott Differential Revision: D15439726 fbshipit-source-id: 7fd80d4a3f49ac90182dec723b49b14e6689406a
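Why the reducer stalls can be shown without torch (a hypothetical helper; in the actual fix the trainer passes `find_unused_parameters=True` when wrapping the model in `DistributedDataParallel`):

```python
def unused_parameter_names(batch_decoder, params_by_decoder):
    """With a branching forward pass, the decoder not selected for this
    batch produces no gradients, so its parameters go unused; DDP's
    reducer must be told to expect this (find_unused_parameters=True)
    instead of waiting for gradients that will never arrive."""
    used = set(params_by_decoder[batch_decoder])
    everything = set().union(*params_by_decoder.values())
    return everything - used
```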