1. 29 Jul, 2019 4 commits
  2. 28 Jul, 2019 6 commits
  3. 27 Jul, 2019 2 commits
  4. 25 Jul, 2019 2 commits
  5. 24 Jul, 2019 1 commit
    • check save_dir before beginning training · b49ea81c
      Spencer Poff authored
      Summary: I sadly discovered that my checkpoint directory wasn't globally readable after 8 hours of training. Adding this check at the beginning of the train loop to keep that from happening again!
      
      Reviewed By: myleott
      
      Differential Revision: D16455394
      
      fbshipit-source-id: 35959aa058150b2afb63710c468d01ebc8a12b0c
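The check described above can be sketched roughly as follows. This is an illustrative version, not fairseq's actual code: the function name `verify_save_dir` is hypothetical, and the real implementation may differ. The idea is to attempt a real write into `save_dir` before the first epoch, so a permissions problem surfaces immediately instead of at checkpoint time.

```python
import os
import tempfile


def verify_save_dir(save_dir):
    """Fail fast if checkpoints cannot be written to save_dir."""
    os.makedirs(save_dir, exist_ok=True)
    # Attempt a real write: os.access() can be misleading on network
    # filesystems, so create (and auto-remove) a small probe file.
    with tempfile.NamedTemporaryFile(dir=save_dir) as probe:
        probe.write(b"ok")
```

Called once at the top of the train loop, this turns an 8-hour surprise into an immediate OSError.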
  6. 23 Jul, 2019 3 commits
  7. 22 Jul, 2019 8 commits
  8. 21 Jul, 2019 4 commits
    • Update GPT-2 BPE · 62b5498b
      Myle Ott authored
      Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/749
      
      Differential Revision: D16410984
      
      Pulled By: myleott
      
      fbshipit-source-id: 7698df46b8a179afccb287990f9705358690454a
    • Default to mmap and infer dataset implementations automatically · 5f78106a
      Myle Ott authored
      Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/751
      
      Differential Revision: D16410989
      
      Pulled By: myleott
      
      fbshipit-source-id: ddbbee49756f9ff6c4487977a3f5d2259b7abafe
    • Fix topp sampling issues (#882) · 1f96d284
      Liang Wang authored
      Summary:
      Two issues here:
      
      1. `last_included` should be the last included index `cumsum_mask[:, :, -1:]` instead of `cumsum_mask[:, :, :1]`  (which is either 0 or 1);
      
       2. If `--no-repeat-ngram-size` is set, the sum of `probs` may be less than 1, so we need to re-normalize it to make it a valid probability distribution.
       
       The following code reproduces these issues:
      
      ```
      import torch
      import numpy as np
      
      def _sample_topp(probs):
      
          # =====  Code from  fairseq/search.py _sample_topp ======
      
          # sort the last dimension (vocab dimension) in descending order
          sorted_probs, sorted_indices = probs.sort(descending=True)
      
          # compute a mask to indicate the words to be included in the top-P set.
          cumsum_probs = sorted_probs.cumsum(dim=2)
          mask = cumsum_probs.lt(sampling_topp)
      
          # note that mask was computed by 'lt'. One more word needs to be included
          # so that the cumulative probability mass can exceed p.
          cumsum_mask = mask.cumsum(dim=2)
           last_included = cumsum_mask[:, :, :1]  # BUG: this is 0 or 1, not the last included index
          mask = mask.scatter_(2, last_included, 1)
      
          # truncate unnecessary dims.
          max_dim = last_included.max()
          truncated_mask = mask[:, :, :max_dim + 1]
          truncated_probs = sorted_probs[:, :, :max_dim + 1]
          truncated_indices = sorted_indices[:, :, :max_dim + 1]
      
          # trim the words that are not in top-P by setting their probabilities
          # to 0, so that they would not be sampled later.
          trim_mask = 1 - truncated_mask
          trimed_probs = truncated_probs.masked_fill_(trim_mask, 0)
          return trimed_probs, truncated_indices
      
          # ========================================================
      
      if __name__ == '__main__':
          np.random.seed(1234)
          torch.manual_seed(1234)
      
          sampling_topp = 0.9
          probs = torch.softmax(torch.randn(1, 1, 10), dim=-1)
          # probs = tensor([0.0545, 0.0779, 0.0189, 0.0647, 0.0282, 0.0862, 0.0656, 0.1041, 0.0399, 0.4600])
          print('probs =', probs[0][0])
      
          trimed_probs, truncated_indices = _sample_topp(probs)
      
          cum_probs = trimed_probs.cumsum(dim=-1)[0][0]
          # cumsum = tensor([0.4600, 0.5641])
          print('cumsum =', cum_probs)
          # Will throw AssertionError
          assert float(cum_probs[-1]) >= sampling_topp
      
      ```
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/882
      
      Differential Revision: D16409269
      
      Pulled By: xingz9
      
      fbshipit-source-id: 94b1122eed50c656057b64e22af6f4a6ea7a68af
    • Rename data.transforms -> data.encoders · f812e529
      Myle Ott authored
      Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/747
      
      Differential Revision: D16403464
      
      Pulled By: myleott
      
      fbshipit-source-id: ee3b4184f129a02be833c7bdc00685978b4de883
  9. 19 Jul, 2019 7 commits
  10. 17 Jul, 2019 3 commits