- 12 Jun, 2019 3 commits
-
-
Nayan Singhal authored
Summary: Implemented model averaging for fairseq. Removed the DDP wrapper if a global optimizer is provided. All models are synced based on the iteration provided in the input. TODO: 1) Fix the throughput and wps meters; other meters need checking too. 2) Replace the model averaging code with the BMUF algorithm implementation. Reviewed By: myleott Differential Revision: D15711044 fbshipit-source-id: 58a4af74db2a61d06762597b95836cbeb1ed82cc
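A minimal sketch of the periodic model averaging described above, assuming a torch.distributed process group is already initialized (an illustration of the idea only, not fairseq's actual averaging/BMUF code; names are hypothetical):

```python
import torch.distributed as dist

def maybe_average_model(model, num_updates, sync_every):
    """Every `sync_every` updates, replace each parameter with its average across workers."""
    if num_updates % sync_every != 0:
        return
    world_size = float(dist.get_world_size())
    for p in model.parameters():
        dist.all_reduce(p.data, op=dist.ReduceOp.SUM)  # sum this parameter across all workers
        p.data.div_(world_size)                        # divide by world size to get the average
```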
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/801 Differential Revision: D15781975 Pulled By: myleott fbshipit-source-id: b86276cd3a40138c09494637c43ce52a56c4aced
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/799 Differential Revision: D15773932 Pulled By: myleott fbshipit-source-id: 650c0621bedb3b7ecebc0654d8e10d7692c50994
-
- 11 Jun, 2019 7 commits
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/793 Differential Revision: D15758755 Pulled By: myleott fbshipit-source-id: b93e4ac11bde36a0b59b4d6d1c84d31c3124d767
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/797 Differential Revision: D15761071 Pulled By: myleott fbshipit-source-id: 257d4a2297e83da7e59baed154dbafd6bfe614bf
-
Myle Ott authored
Summary: This is a temporary workaround to support sampling after https://github.com/pytorch/fairseq/issues/713. We'll need to revisit this to support sampling and beam more generally. Pull Request resolved: https://github.com/pytorch/fairseq/pull/796 Differential Revision: D15760808 Pulled By: myleott fbshipit-source-id: ecaf4f161b0c30de037f32007e4610a559a49230
-
Bairen Yi authored
Summary: See #467. Ping myleott to review. This is a work-related contribution. Ping lark to review. Pull Request resolved: https://github.com/pytorch/fairseq/pull/794 Differential Revision: D15756816 Pulled By: myleott fbshipit-source-id: 6dce3ff3a713bf5f60e5782bc260b2ca9d2c0a9b
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/792 Differential Revision: D15741781 Pulled By: myleott fbshipit-source-id: c256c7900c307d485904e69b1526b9acbe08fec9
-
yilinyang7 authored
When given prefix_tokens, the sequence generator would generate (exactly) the same finished candidates (#713) Summary: https://github.com/pytorch/fairseq/issues/712 Pull Request resolved: https://github.com/pytorch/fairseq/pull/713 Differential Revision: D15242432 Pulled By: myleott fbshipit-source-id: a230ee48f4bf891c805609c428d7233a0ad21179
-
Sergey Edunov authored
Summary: Multi-head attention is currently not TPU-friendly; specifically, .data_ptr() is not supported and should not be used. There are also potential correctness issues with the existing code (e.g. data_ptr() can point to the same storage for different tensors). Rather than rely on data_ptr(), we should explicitly set the self_attention or encoder_decoder_attention flags. Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/636 Reviewed By: myleott Differential Revision: D15709898 Pulled By: edunov fbshipit-source-id: f931713193c51be848a5de20da730ac3a3ce0187
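A hedged sketch of the approach this change describes: the attention variant is chosen from explicit flags passed at construction time rather than by comparing .data_ptr() values at runtime (a simplified stand-in, not fairseq's actual MultiheadAttention):

```python
import torch.nn as nn

class MultiheadAttentionStub(nn.Module):
    def __init__(self, embed_dim, num_heads, self_attention=False, encoder_decoder_attention=False):
        super().__init__()
        assert not (self_attention and encoder_decoder_attention)
        self.self_attention = self_attention
        self.encoder_decoder_attention = encoder_decoder_attention
        self.attn = nn.MultiheadAttention(embed_dim, num_heads)

    def forward(self, query, key=None, value=None):
        if self.self_attention:
            # query attends to itself; no data_ptr() comparison needed
            key = value = query
        elif self.encoder_decoder_attention:
            # keys and values both come from the encoder output
            value = key
        return self.attn(query, key, value)
```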
-
- 10 Jun, 2019 2 commits
-
-
Myle Ott authored
Summary: - make it possible to load file_utils.py without the dependencies - add some more demo features Pull Request resolved: https://github.com/pytorch/fairseq/pull/791 Differential Revision: D15739950 Pulled By: myleott fbshipit-source-id: 38df5209973a6fe2e3651575b97134e096aaf5bf
-
freewym authored
Summary: In the current progress bar, the counter for log_interval will always start from 0, which is not correct if reloading from a checkpoint in the middle of an epoch. This fix obtains the offset from the iterator to set the counter correctly. Pull Request resolved: https://github.com/pytorch/fairseq/pull/778 Differential Revision: D15739953 Pulled By: myleott fbshipit-source-id: a1d13403ec5783b22e01d7cb63874fd8dea7f8b0
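An illustrative sketch of the fix (class and field names here are hypothetical, not fairseq's actual progress_bar module): the running counter is initialized from the iterator's offset so logging continues from the correct update number after a mid-epoch restart.

```python
class ProgressBar:
    def __init__(self, iterable, log_interval=100, offset=0):
        self.iterable = iterable
        self.log_interval = log_interval
        self.i = offset  # resume the counter at the iterator's offset instead of 0

    def __iter__(self):
        for item in self.iterable:
            self.i += 1
            if self.i % self.log_interval == 0:
                print(f"processed {self.i} updates")
            yield item
```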
-
- 07 Jun, 2019 1 commit
-
-
Ning Dong authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/770 Without this change, the comment here https://fburl.com/w1cejgw9 is inconsistent with the implementation. Reviewed By: xianxl Differential Revision: D15582826 fbshipit-source-id: 16d8368560153b251beed8b290f51fcdd8a8faee
-
- 06 Jun, 2019 1 commit
-
-
Matt Le authored
Reviewed By: pipibjc Differential Revision: D15635402 fbshipit-source-id: e92fab914de40775d7bad851420355240d822bde
-
- 04 Jun, 2019 4 commits
-
-
Matt Le authored
Summary: We never actually load the model parameters from an XLM model when using transformer_from_pretrained_xlm. Also changes encoder_learned_pos from True to False. Reviewed By: liezl200 Differential Revision: D15629061 fbshipit-source-id: 759eadc88041eae94505477960de57dd78a99dcb
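A small sketch of the intended behavior, under the assumption that the pretrained weights sit under a "model" key in the checkpoint (the function name and checkpoint layout are illustrative): the XLM parameters are actually copied into the new model rather than only used to configure its architecture.

```python
import torch

def load_pretrained_weights(module, checkpoint_path):
    state = torch.load(checkpoint_path, map_location="cpu")
    # fairseq-style checkpoints usually nest weights under "model" (an assumption here)
    weights = state.get("model", state)
    # strict=False leaves parameters that have no pretrained counterpart untouched
    return module.load_state_dict(weights, strict=False)
```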
-
lematt1991 authored
Summary: Resolves #762 Pull Request resolved: https://github.com/pytorch/fairseq/pull/776 Differential Revision: D15631503 Pulled By: lematt1991 fbshipit-source-id: 103f77d553476917b8b0f8001767217fb311d920
-
lematt1991 authored
Summary: Resolves #768 Pull Request resolved: https://github.com/pytorch/fairseq/pull/769 Differential Revision: D15621841 Pulled By: lematt1991 fbshipit-source-id: 694effe3788ff7d04864217d673608ec31da589e
-
Biao Lu authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/630 Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/629 Pull Request resolved: https://github.com/pytorch/translate/pull/562 Pull Request resolved: https://github.com/pytorch/fairseq/pull/774 Forked masked_lm_dictionary from fairseq, changed imports in pytorch_translate to use the new masked_lm_dictionary, and registered the corresponding tasks. Reviewed By: liezl200 Differential Revision: D15410352 fbshipit-source-id: 06516caabdd4dc5cdee9ad1d8025978f4eea6c4b
-
- 03 Jun, 2019 2 commits
-
-
Haoran Li authored
Summary: lm_output_learned_bias doesn't exist when loading the model for fine-tuning Reviewed By: jingfeidu Differential Revision: D15579190 fbshipit-source-id: 45e8e193399943c89b77cc553d3d6d49b056e55a
-
Nathan Ng authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/621 Differential Revision: D15571435 Pulled By: myleott fbshipit-source-id: 67d25b00c8c1bc69dbffd8521da56f7cc14eb75e
-
- 02 Jun, 2019 2 commits
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/625 Differential Revision: D15595787 Pulled By: myleott fbshipit-source-id: ba6edf305ed41be392194f492e034dd66d1743fe
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/624 Differential Revision: D15595746 Pulled By: myleott fbshipit-source-id: b79e489de9ff37ee7cbf939092a6e5ec0dbebbf5
-
- 01 Jun, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/622 Differential Revision: D15572555 Pulled By: myleott fbshipit-source-id: 2b81f22207b4c894ffe645af0b45c70ac0a80612
-
- 31 May, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/620 Differential Revision: D15569440 Pulled By: myleott fbshipit-source-id: c4681f1c72467c04cd2654e87bc724c94b76e3fb
-
- 30 May, 2019 7 commits
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/617 Differential Revision: D15555328 Pulled By: myleott fbshipit-source-id: 35d1f329f887cb0b867c7a22f17a16f3c9c66815
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/619 Differential Revision: D15562983 Pulled By: myleott fbshipit-source-id: 9240f56f18c87120b7d38e0db374d24a55999395
-
Khoa Ho authored
Summary: Change the wording to avoid confusion. Mixed precision ensures both higher arithmetic throughput and numerical stability, so it is not exactly synonymous with pure half-precision/FP16 training. Also mention tensor cores, since older-generation GPUs without tensor cores don't support true mixed-precision training. Pull Request resolved: https://github.com/pytorch/fairseq/pull/766 Differential Revision: D15559565 Pulled By: myleott fbshipit-source-id: c71e720772657bb3e8ad330b58bf69e23beb614e
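A brief illustration of the distinction the new wording draws, using PyTorch's AMP utilities purely as an example (an assumption for illustration, not fairseq's --fp16 code path): most math runs in FP16 for throughput, while FP32 weights and loss scaling preserve numerical stability.

```python
import torch

scaler = torch.cuda.amp.GradScaler()

def train_step(model, optimizer, criterion, batch, target):
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():       # mixed precision: FP16 where safe, FP32 where needed
        loss = criterion(model(batch), target)
    scaler.scale(loss).backward()         # loss scaling keeps small FP16 gradients from underflowing
    scaler.step(optimizer)                # unscales gradients, skips the step on inf/nan, updates FP32 weights
    scaler.update()
```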
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/613 Differential Revision: D15541384 Pulled By: myleott fbshipit-source-id: ef2c0b0a51cdf37af2ccff0546f524d49f87e65d
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/618 Differential Revision: D15552599 Pulled By: myleott fbshipit-source-id: 2192a30a9c5af31b954a3a1716166dd6ba27b23a
-
Sujit Verma authored
Summary: Changes for supporting tensorboard scalar plotting. Reviewed By: myleott Differential Revision: D15456534 Pulled By: myleott fbshipit-source-id: a012a4eea028aae764ce11786570b7d96841c4a5
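A small usage sketch of the scalar plotting this enables (the directory and tag names are illustrative):

```python
from tensorboardX import SummaryWriter

writer = SummaryWriter("runs/demo")
for step, loss in enumerate([4.2, 3.9, 3.7]):
    writer.add_scalar("train/loss", loss, step)  # one point per update, plotted against the step axis
writer.close()
```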
-
lukovnikov authored
Summary: Not sure if I'm doing something wrong elsewhere, but I had a device error in `SinusoidalPositionalEmbedding` when running on GPU > 0 because the weights were on a different device than the input. Pull Request resolved: https://github.com/pytorch/fairseq/pull/746 Differential Revision: D15547217 Pulled By: myleott fbshipit-source-id: 37849d895ce483c14615fdb4ace8a8c4fb05b568
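A simplified sketch of the kind of device mismatch described here and the usual fix (a stand-in module, not fairseq's SinusoidalPositionalEmbedding): the cached table is a plain tensor rather than a registered buffer, so it must be moved to the input's device explicitly before indexing.

```python
import torch
import torch.nn as nn

class CachedSinusoidalEmbedding(nn.Module):
    def __init__(self, num_positions, dim):
        super().__init__()
        pos = torch.arange(num_positions, dtype=torch.float).unsqueeze(1)
        inv_freq = torch.exp(torch.arange(0, dim, 2, dtype=torch.float) * (-torch.log(torch.tensor(10000.0)) / dim))
        table = torch.zeros(num_positions, dim)
        table[:, 0::2] = torch.sin(pos * inv_freq)
        table[:, 1::2] = torch.cos(pos * inv_freq)
        # plain attribute, not a registered buffer, so .cuda()/.to() will not move it
        self.weights = table

    def forward(self, positions):
        # the fix: make sure the cached table lives on the same device as the input
        self.weights = self.weights.to(positions.device)
        return self.weights[positions]
```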
-
- 29 May, 2019 7 commits
-
-
Zhanghao Wu authored
Summary: Fix the mismatch between the parameter fed into `SummaryWriter` and the API of the latest [tensorboardX](https://github.com/lanpa/tensorboardX/blob/3e35c9b5f85e8ceb0294532d9eb772341a04c097/tensorboardX/writer.py#L192), i.e. "log_dir" -> "logdir". Pull Request resolved: https://github.com/pytorch/fairseq/pull/763 Differential Revision: D15547192 Pulled By: myleott fbshipit-source-id: c51b88da5ec589fb8ca5b4876bc229efeb7bf494
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/612 Differential Revision: D15541377 Pulled By: myleott fbshipit-source-id: 4762516a3b545d03bc81d3660f47827e15466dce
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/611 Differential Revision: D15541303 Pulled By: myleott fbshipit-source-id: 279ca813437c834fca49576a48b75cbf1fdf0e76
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/610 Differential Revision: D15541261 Pulled By: myleott fbshipit-source-id: f0b823cf4f04c5ef3205f6d259c6dcad4cc329b1
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/608 Differential Revision: D15541220 Pulled By: myleott fbshipit-source-id: 52a8e4da72cc6e3e25cf98c989d34a269d614c9d
-
Spencer Poff authored
Summary: There were two non-obvious errors I ran into while creating a new language modeling task: - `transformer_lm` implicitly required the `tokens_per_sample` arg - `transformer_lm` assumed the task had a `dictionary` and `output_dictionary` property, neither of which is specified in the FairseqTask interface. Reviewed By: myleott Differential Revision: D15532345 fbshipit-source-id: 200d7d3b542c35f17cc2d6bca4219c4a4d17cb6b
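A hedged sketch of a minimal custom LM task that satisfies both implicit requirements (the task name, constructor, and the omitted data-loading code are hypothetical):

```python
from fairseq.tasks import FairseqTask, register_task

@register_task("my_language_modeling")
class MyLanguageModelingTask(FairseqTask):
    @staticmethod
    def add_args(parser):
        # transformer_lm reads args.tokens_per_sample, so the task must define it
        parser.add_argument("--tokens-per-sample", type=int, default=512)

    def __init__(self, args, dictionary):
        super().__init__(args)
        # transformer_lm also expects these two attributes on the task
        self.dictionary = dictionary
        self.output_dictionary = dictionary

    @property
    def source_dictionary(self):
        return self.dictionary

    @property
    def target_dictionary(self):
        return self.output_dictionary
```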
-
Kartikay Khandelwal authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/765 Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/614 This diff has changes needed to make XLM torchscript exportable. Reviewed By: bethebunny Differential Revision: D15497208 fbshipit-source-id: fd9645119e154e3c397f147acf9144d661d9a5c8
-
- 28 May, 2019 1 commit
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/605 Differential Revision: D15518167 Pulled By: myleott fbshipit-source-id: 8b0e6b32adff018136d0d251b7fde3818e373d6f
-
- 24 May, 2019 1 commit
-
-
Yongqiang Wang authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/747 In https://github.com/pytorch/fairseq/pull/647, checkpoint averaging is not implemented correctly when it comes to shared parameters. This diff has the right implementation and a test case to guard against future changes. Reviewed By: myleott Differential Revision: D15402943 fbshipit-source-id: 8004836d5c2571814ea54844650618008a9ee522
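A hedged sketch of the underlying idea (not the actual fairseq averaging script or its test, and the "model" checkpoint key is an assumption): accumulate per-key sums in freshly cloned tensors so that state-dict keys aliasing a single shared parameter are still each averaged exactly once per checkpoint.

```python
import torch

def average_checkpoints(paths):
    avg = None
    for path in paths:
        state = torch.load(path, map_location="cpu")["model"]
        if avg is None:
            # clone into fresh tensors so shared/aliased parameters don't get double-counted
            avg = {k: v.clone().float() for k, v in state.items()}
        else:
            for k, v in state.items():
                avg[k] += v.float()
    return {k: v / len(paths) for k, v in avg.items()}
```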
-