- 29 Jul, 2019 3 commits
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/921 Differential Revision: D16541025 Pulled By: myleott fbshipit-source-id: bb78d30fe285da2adfc7c4e5897ee01fa413b2e4
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/916 Differential Revision: D16537774 Pulled By: myleott fbshipit-source-id: 86bb7b1913a428ee4a21674cc3fc7b39264067ec
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/780 Differential Revision: D16537567 Pulled By: myleott fbshipit-source-id: 4e18c529959935e82ea122c3a2ee477308ffcbe3
-
- 28 Jul, 2019 6 commits
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/779 Differential Revision: D16536673 Pulled By: myleott fbshipit-source-id: bf56e9a81d3086f3d95a3273391dc5e04ed2dbc4
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/914 Differential Revision: D16536670 Pulled By: myleott fbshipit-source-id: 8a41c98f0fb87af6c384cdade756e3eae2978a88
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/912 Differential Revision: D16536561 Pulled By: myleott fbshipit-source-id: 54c5c20a826a14f4e690770e027bcb282acdf911
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/911 Differential Revision: D16536559 Pulled By: myleott fbshipit-source-id: 7fe495054ce5b7658b1d3a43eca38c5858360236
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/913 Differential Revision: D16536562 Pulled By: myleott fbshipit-source-id: ce28642da6868ec884e3e416388a652977a062df
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/910 Differential Revision: D16536532 Pulled By: myleott fbshipit-source-id: 56bb5570e70b5670ad87c64d9dd20c64c1fa9f5c
-
- 27 Jul, 2019 2 commits
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/909 Differential Revision: D16532919 Pulled By: myleott fbshipit-source-id: 16ce884cf3d84579026e4406a75ba3c01a128dbd
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/778 Differential Revision: D16525447 Pulled By: myleott fbshipit-source-id: e721e3a10e243a2408a04f89f06b5adbbe2fdff2
-
- 25 Jul, 2019 2 commits
-
-
Myle Ott authored
Summary: Input feeding generally refers to a slightly different concept Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/769 Differential Revision: D16491898 Pulled By: myleott fbshipit-source-id: 68573584e820f11f199db4e7e37e9ee7a69a3287
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/770 Differential Revision: D16491911 Pulled By: myleott fbshipit-source-id: 8dd2b76f8fa24183640ae9d1129ea47ded77d43d
-
- 24 Jul, 2019 1 commit
-
-
Spencer Poff authored
Summary: I sadly discovery that my checkpoint directory wasn't globally readable after 8 hours of training. Adding this check at the beginning of train loop to keep that from happening again! Reviewed By: myleott Differential Revision: D16455394 fbshipit-source-id: 35959aa058150b2afb63710c468d01ebc8a12b0c
-
- 23 Jul, 2019 3 commits
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/899 Differential Revision: D16448602 Pulled By: myleott fbshipit-source-id: afd1a1b713274b6328150cd85d7f8a81833597aa
-
Taylan Bilal authored
Summary: Since mask really is a tensor of ints, this change should be mathematically equivalent to the base. On the other hand, this has performance implications for xla, hence the pull request. Pull Request resolved: https://github.com/pytorch/fairseq/pull/875 Differential Revision: D16232877 Pulled By: myleott fbshipit-source-id: e63175ee0016dcf0dfe10e2fd22570b8bbfbde84
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/762 Differential Revision: D16427266 Pulled By: myleott fbshipit-source-id: 9bd9b8c6b4994ae98a62a37b34d03265bd365453
-
- 22 Jul, 2019 8 commits
-
-
Sara Hanson authored
Summary: Pull Request resolved: https://github.com/facebookresearch/pytext/pull/804 Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/746 Pull Request resolved: https://github.com/pytorch/fairseq/pull/894 Adding an implementation of the sparse transformer to multi-head attention using the fixed attention pattern specified https://arxiv.org/pdf/1904.10509.pdf. The sparse_mask masks out words using -inf; after softmax, -inf becomes 0. Thus, a mask does not need to be re-calculated and re-applied when multiplying attn_weights and values. Four inputs are added to the config: sparse, is_bidirectional, stride, expressivity. If we are using the sparse transformer, is_bidirectional, stride, and expressivity must be specified (there are defaults). If is_bidirectional is False, the mask values using the fixed attention pattern described in the paper. If is_bidirectional is True, subset one includes all values in the current stride window and a summary from every stride window--all other values are masked. Stride (L in the paper) controls the window size and expressivity (c in the paper) controls the size of the summary. Reviewed By: borguz Differential Revision: D16042988 fbshipit-source-id: c59166dc7cfe89187a256e4076000c2458842fd5
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/761 Differential Revision: D16421335 Pulled By: myleott fbshipit-source-id: 257d92c2b90361147642e2baa38486b4d18f6297
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/757 Differential Revision: D16418305 Pulled By: myleott fbshipit-source-id: 25f293a2792509f7a75c688e4bf8cff02e6bba2e
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/758 Differential Revision: D16418932 Pulled By: myleott fbshipit-source-id: 59f005164b61b9fa712922eeb23525f7eec38f38
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/756 Differential Revision: D16418302 Pulled By: myleott fbshipit-source-id: 62495a0bff41d1741e2b09807a3b43ff2c66c8fb
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/752 Differential Revision: D16417582 Pulled By: myleott fbshipit-source-id: 6b4289febcf9290452bb91f1f2181a02c09c82a7
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/740 Differential Revision: D16377797 Pulled By: myleott fbshipit-source-id: f7d6c8b00a77e279ea94376b1f0fcd15087eaf5f
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/750 Differential Revision: D16410986 Pulled By: myleott fbshipit-source-id: 8ee6b4371d6ae5b041b00a54a6039a422345795e
-
- 21 Jul, 2019 4 commits
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/749 Differential Revision: D16410984 Pulled By: myleott fbshipit-source-id: 7698df46b8a179afccb287990f9705358690454a
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/751 Differential Revision: D16410989 Pulled By: myleott fbshipit-source-id: ddbbee49756f9ff6c4487977a3f5d2259b7abafe
-
Liang Wang authored
Summary: Two issues here: 1. `last_included` should be the last included index `cumsum_mask[:, :, -1:]` instead of `cumsum_mask[:, :, :1]` (which is either 0 or 1); 2. If `--no-repeat-ngram-size` is set, the sum of `probs` may less than 1, we need to re-normalize to make it a valid probability distribution The following code can reproduce this issues: ``` import torch import numpy as np def _sample_topp(probs): # ===== Code from fairseq/search.py _sample_topp ====== # sort the last dimension (vocab dimension) in descending order sorted_probs, sorted_indices = probs.sort(descending=True) # compute a mask to indicate the words to be included in the top-P set. cumsum_probs = sorted_probs.cumsum(dim=2) mask = cumsum_probs.lt(sampling_topp) # note that mask was computed by 'lt'. One more word needs to be included # so that the cumulative probability mass can exceed p. cumsum_mask = mask.cumsum(dim=2) last_included = cumsum_mask[:, :, :1] mask = mask.scatter_(2, last_included, 1) # truncate unnecessary dims. max_dim = last_included.max() truncated_mask = mask[:, :, :max_dim + 1] truncated_probs = sorted_probs[:, :, :max_dim + 1] truncated_indices = sorted_indices[:, :, :max_dim + 1] # trim the words that are not in top-P by setting their probabilities # to 0, so that they would not be sampled later. trim_mask = 1 - truncated_mask trimed_probs = truncated_probs.masked_fill_(trim_mask, 0) return trimed_probs, truncated_indices # ======================================================== if __name__ == '__main__': np.random.seed(1234) torch.manual_seed(1234) sampling_topp = 0.9 probs = torch.softmax(torch.randn(1, 1, 10), dim=-1) # probs = tensor([0.0545, 0.0779, 0.0189, 0.0647, 0.0282, 0.0862, 0.0656, 0.1041, 0.0399, 0.4600]) print('probs =', probs[0][0]) trimed_probs, truncated_indices = _sample_topp(probs) cum_probs = trimed_probs.cumsum(dim=-1)[0][0] # cumsum = tensor([0.4600, 0.5641]) print('cumsum =', cum_probs) # Will throw AssertionError assert float(cum_probs[-1]) >= sampling_topp ``` Pull Request resolved: https://github.com/pytorch/fairseq/pull/882 Differential Revision: D16409269 Pulled By: xingz9 fbshipit-source-id: 94b1122eed50c656057b64e22af6f4a6ea7a68af -
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/747 Differential Revision: D16403464 Pulled By: myleott fbshipit-source-id: ee3b4184f129a02be833c7bdc00685978b4de883
-
- 19 Jul, 2019 7 commits
-
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/738 Differential Revision: D16377803 Pulled By: myleott fbshipit-source-id: 6beb2f78e7464b70ff65a965d2b747cdca0ca951
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/736 Differential Revision: D16378001 Pulled By: myleott fbshipit-source-id: 2907f63bcbf7068ceaa48b00096040fa2639e569
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/739 Differential Revision: D16377798 Pulled By: myleott fbshipit-source-id: 20047c80de2e6f108269ace4ae3eec906a5920dd
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/737 Differential Revision: D16377805 Pulled By: myleott fbshipit-source-id: 1e090a02ff4fbba8695173f57d3cc5b88ae98bbf
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/734 Differential Revision: D16377044 Pulled By: myleott fbshipit-source-id: 37d5553d76aa7c653113fec089f59710281c31d7
-
Myle Ott authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/735 Differential Revision: D16377046 Pulled By: myleott fbshipit-source-id: 9725d4a3ce6b2fc8cee0b1d1cb8921f9d59c551a
-
Myle Ott authored
Summary: No major API changes since the last release. Cutting a new release since we'll be merging significant (possibly breaking) changes to logging, data loading and the masked LM implementation soon. Pull Request resolved: https://github.com/pytorch/fairseq/pull/891 Differential Revision: D16377132 Pulled By: myleott fbshipit-source-id: f1cb88e671ccd510e53334d0f449fe18585268c7
-
- 17 Jul, 2019 4 commits
-
-
Ning Dong authored
Summary: Pull Request resolved: https://github.com/pytorch/fairseq/pull/879 Pull Request resolved: https://github.com/pytorch/translate/pull/598 Details in https://fb.workplace.com/notes/ning-dong/closing-research-to-production-gap-a-story-of-latent-variable-model-migration/443418839813586/ Reviewed By: xianxl Differential Revision: D15742439 fbshipit-source-id: 168c84bd30a5da3c2fb404fcca74266deef1f964
-
Xing Zhou authored
Summary: Implement Nucleus (top-P) sampling: sample among the smallest set of elements whose cumulative probability mass exceeds p. To test it: python generate.py ~myleott/data/data-bin/wmt17_zh_en_full/ --path ~myleott/zh_en/model.pt --remove-bpe --nbest 5 --beam 5 --sampling --sampling-topp 0.3 Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/710 Test Plan: python generate.py ~myleott/data/data-bin/wmt17_zh_en_full/ --path ~myleott/zh_en/model.pt --remove-bpe --nbest 5 --beam 5 --sampling --sampling-topp 0.3 python tests/test_sequence_generator.py python tests/test_binaries.py Reviewed By: myleott Differential Revision: D16286688 Pulled By: xingz9 fbshipit-source-id: 1776d21e17c4532a3d24ac75bb7e75da9acad58f
-
Jiajun Shen authored
Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/727 Differential Revision: D16332742 Pulled By: myleott fbshipit-source-id: becedd573c2c071fd21fcb5e55fead554c9bd9d1
-
Myle Ott authored
Summary: This is useful for standalone scripts that want to load a model and inherit most of the args from the model (e.g., eval_lm.py). Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/723 Differential Revision: D16255751 Pulled By: myleott fbshipit-source-id: 562b61511d5d7113e805c9644c877ebb8a3a1889
-