  1. 23 Aug, 2019 2 commits
    • Cythonize token block dataset (#834) · 4fc39538
      Naman Goyal authored
      Summary:
      Cythonized the token block dataset code; it is now `> 100x` faster. Building token blocks for the entire `bookwiki+CC+stories+openweb` dataset takes just ~`39.9` seconds.
      
      TODO:
      1) I think I can make it another 2x faster.
      2) Clean up.
      
      EDIT History:
      ~~First pass at parallelizing `token_block_dataset`. The code feels somewhat complicated and cluttered.
      It is 2-3x faster, though, in my tests on the `bookwiki` dataset with both the `complete` and `complete_doc` modes.
      myleott, can you take a look at correctness? I am still not 100% sure that I am not missing corner cases.~~
      Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/834
      
      Test Plan:
      Imported from GitHub, without a `Test Plan:` line.
      
      Test workflow: f133816198
      
      Reviewed By: myleott
      
      Differential Revision: D16970257
      
      Pulled By: myleott
      
      fbshipit-source-id: ec45a308193c9e9f3e7075336c15df4723228d6f
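To make the idea of token blocks concrete, here is a minimal pure-Python sketch of `complete`-style grouping, where consecutive sentences are packed into a block until the next sentence would push it past the block size. This is an illustration under stated assumptions, not the Cythonized fairseq code from this commit: the function name `complete_blocks` and its arguments are hypothetical.

```python
def complete_blocks(sizes, block_size):
    """Group consecutive sentences into blocks of at most `block_size` tokens.

    `sizes` holds the token count of each sentence. Returns a list of
    (start_token, end_token) offsets with the end exclusive. A sentence
    longer than `block_size` is kept whole in its own oversized block.
    (Hypothetical helper for illustration only.)
    """
    blocks = []
    start = 0    # token offset where the current block begins
    current = 0  # tokens accumulated in the current block so far
    for size in sizes:
        # Close the current block if this sentence would not fit in it.
        if current > 0 and current + size > block_size:
            blocks.append((start, start + current))
            start += current
            current = 0
        current += size
    # Flush whatever is left as the final block.
    if current > 0:
        blocks.append((start, start + current))
    return blocks


if __name__ == "__main__":
    print(complete_blocks([3, 4, 2, 6, 1], block_size=7))
```

A loop like this runs once per sentence in the corpus, which is the kind of tight per-element Python loop that typically benefits from being compiled with Cython.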
    • wav2vec everstore support · 6e2bd794
      Alexei Baevski authored
      Summary: changes for internal support
      
      Differential Revision: D16646887
      
      fbshipit-source-id: ac5bf6c32901819726249422324eae32a0a6e148
  2. 22 Aug, 2019 3 commits
  3. 21 Aug, 2019 4 commits
    • fix string format to work in python 3.5 (#1050) · 93057cc0
      Trinkle23897 authored
      Summary:
      Change the string format in fairseq/data/subsample_dataset.py#20.
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/1050
      
      Differential Revision: D16946060
      
      Pulled By: okhonko
      
      fbshipit-source-id: 0eabf22e7ffd4f658b6d18c87dc6e59c81a355c7
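The summary does not show the changed line. One common reason a format string fails on Python 3.5 is an f-string, which requires Python 3.6 or later. Assuming that was the issue here, the following sketch shows the portable `str.format` form; the function name and message text are hypothetical, not copied from `subsample_dataset.py`.

```python
def describe_subsample(actual_size, total_size, ratio):
    """Build a log message using str.format, which works on Python 3.5.

    The f-string equivalent, f"... {total_size} ...", is a SyntaxError
    before Python 3.6. (Hypothetical helper and message for illustration.)
    """
    return "subsampled dataset from {} to {} (ratio={})".format(
        total_size, actual_size, ratio
    )


if __name__ == "__main__":
    print(describe_subsample(50, 100, 0.5))
```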
    • Parameterized criterions (#808) · ba5f829f
      Jeff Cai authored
      Summary:
      Support criterions with parameters, such as the AutoSegmentationCriterion (ASG) used in wav2letter, which has a transition matrix parameter. This is needed to integrate wav2letter's ASG into PySpeech.
      
      With this diff, parameters in criterions will be:
      (1) updated by optimizers, with a configurable learning rate
      (2) saved and loaded from checkpoints, preserving backward compatibility for criterions without parameters
      (3) synchronized across nodes in distributed training.
      Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/808
      
      Reviewed By: jcai1
      
      Differential Revision: D16934097
      
      Pulled By: okhonko
      
      fbshipit-source-id: 121ec9382459385c6f9cbef3a8274bec1a434038
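Point (2) above, backward-compatible checkpointing, can be sketched in plain Python. The idea illustrated here is to store criterion state under an optional key, so that checkpoints written for parameter-free criterions still load. The function names and the `"model"` / `"criterion"` keys are assumptions made for this sketch, not the layout fairseq actually uses.

```python
def build_checkpoint(model_state, criterion_state):
    """Assemble a checkpoint dict; criterion state is stored only if present.

    (Hypothetical helper; the key names are assumptions for illustration.)
    """
    checkpoint = {"model": model_state}
    if criterion_state:
        # Only criterions that own parameters add this entry.
        checkpoint["criterion"] = criterion_state
    return checkpoint


def load_criterion_state(checkpoint):
    """Return the saved criterion state, or an empty dict for old checkpoints."""
    return checkpoint.get("criterion", {})


if __name__ == "__main__":
    old = build_checkpoint({"w": [0.1, 0.2]}, {})
    new = build_checkpoint({"w": [0.1, 0.2]}, {"transitions": [[0.0, 1.0]]})
    print(load_criterion_state(old))
    print(load_criterion_state(new))
```

Because the loader falls back to an empty state, a checkpoint saved before criterions had parameters can be read without a special case at the call site.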
    • Multiset (#838) · a2f5361d
      alexeib authored
      Summary:
      Adds the ability to tag individual examples with the names of their datasets, along with some minor miscellaneous fixes and improvements.
      Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/838
      
      Differential Revision: D16919175
      
      Pulled By: alexeib
      
      fbshipit-source-id: 4bf493299645bae63f3ee6382e15f18a9f73666c
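The summary gives no detail on how examples are tagged. As a rough illustration of the general idea only, the sketch below attaches a dataset name to every example drawn from several named datasets. The function name and the `"dataset_name"` / `"example"` field names are hypothetical and are not taken from the fairseq change.

```python
def tag_examples(datasets):
    """Flatten named datasets into one list, tagging each example with its source.

    `datasets` maps a dataset name to a list of examples.
    (Hypothetical helper and field names, for illustration only.)
    """
    tagged = []
    for name, examples in datasets.items():
        for example in examples:
            tagged.append({"dataset_name": name, "example": example})
    return tagged


if __name__ == "__main__":
    for item in tag_examples({"bookwiki": ["a", "b"], "stories": ["c"]}):
        print(item)
```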
    • vggblock support without pooling and pooling_kernel_size missing self (#839) · 7a31fe06
      Siddharth Dalmia authored
      Summary:
      1) VggBlock was not supported if the pooling kernel size was None.
      2) Since we modify the pooling kernel size using `_pair`, we should use `self.pooling_kernel_size`. But I agree it doesn't matter, as PyTorch is robust to this.
      Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/839
      
      Differential Revision: D16934112
      
      Pulled By: okhonko
      
      fbshipit-source-id: b6b95163b0e7f7203d76d535f01a41912382bdc3
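The two points in this summary can be illustrated without PyTorch. The sketch below normalizes the kernel size once, stores the result on `self`, and skips the pooling layer when the size is `None`. The class `BlockConfig` and the local `pair` helper are hypothetical stand-ins written for this example; they are not the fairseq `VggBlock` or PyTorch's `_pair`.

```python
def pair(value):
    """Local stand-in for a `_pair`-style helper: expand a scalar to a 2-tuple."""
    if isinstance(value, (tuple, list)):
        return tuple(value)
    return (value, value)


class BlockConfig:
    """Hypothetical, simplified stand-in for a VGG-style block's layer setup."""

    def __init__(self, pooling_kernel_size=None):
        # Normalize once and keep the result on self, so later code reads
        # the normalized value rather than the raw constructor argument.
        self.pooling_kernel_size = (
            None if pooling_kernel_size is None else pair(pooling_kernel_size)
        )

    def layer_names(self):
        layers = ["conv", "relu"]
        # Pooling is optional: add it only when a kernel size was given.
        if self.pooling_kernel_size is not None:
            layers.append("max_pool{}".format(self.pooling_kernel_size))
        return layers


if __name__ == "__main__":
    print(BlockConfig().layer_names())
    print(BlockConfig(2).layer_names())
```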
  4. 20 Aug, 2019 2 commits
  5. 19 Aug, 2019 6 commits
  6. 17 Aug, 2019 1 commit
  7. 16 Aug, 2019 2 commits
  8. 15 Aug, 2019 5 commits
  9. 14 Aug, 2019 5 commits
  10. 13 Aug, 2019 4 commits
  11. 12 Aug, 2019 5 commits
  12. 10 Aug, 2019 1 commit