1. 03 Feb, 2022 3 commits
  2. 02 Feb, 2022 5 commits
  3. 01 Feb, 2022 6 commits
  4. 31 Jan, 2022 1 commit
  5. 27 Jan, 2022 4 commits
    • Remove invalid token blanking logic from RNN-T decoder (#2180) · ed6256a2
      hwangjeff authored
      Summary:
      This PR removes logic in `RNNTBeamSearch` that blanks out joiner output values corresponding to special tokens, e.g. \<unk\>, \<eos\>, for the following reasons:
      - Provided that the model was configured and trained properly, it shouldn't be necessary, e.g. the model would naturally produce low probabilities for special tokens if they don't exist in the training set.
      - For our pre-trained LibriSpeech training pipeline, the removal of the logic doesn't affect evaluation WER on any of the dev/test splits.
      - The existing logic doesn't generalize to arbitrary token vocabularies.
      - Internally, it seems to have been acknowledged that this logic was introduced to compensate for quirks in other parts of the modeling infra.
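To make the removed behavior concrete, here is a minimal sketch of what blanking out special-token scores looks like. It is an illustration only, not the code that was removed from `RNNTBeamSearch`: the helper names, the token indices, and the use of plain Python lists in place of joiner output tensors are all assumptions.

```python
import math
from typing import List, Sequence


def blank_special_tokens(log_probs: Sequence[float], special_ids: Sequence[int]) -> List[float]:
    """Return a copy of log_probs with the special-token entries set to -inf.

    Hypothetical helper for illustration; not part of torchaudio.
    """
    blanked = list(log_probs)
    for idx in special_ids:
        blanked[idx] = -math.inf  # this token can never be selected
    return blanked


def best_token(log_probs: Sequence[float]) -> int:
    """Index of the highest-scoring token."""
    return max(range(len(log_probs)), key=lambda i: log_probs[i])


# Assumed toy vocabulary: index 0 = <unk>, index 1 = <eos>, 2.. = regular tokens.
scores = [-0.1, -3.0, -0.5, -2.0]
with_blanking = best_token(blank_special_tokens(scores, special_ids=[0, 1]))
without_blanking = best_token(scores)
```

The hard-coded `special_ids` list shows how this kind of logic ties the decoder to one vocabulary layout. With a properly trained model the special tokens would already score low, so both calls would be expected to pick the same token.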
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2180
      
      Reviewed By: carolineechen, mthrok
      
      Differential Revision: D33822683
      
      Pulled By: hwangjeff
      
      fbshipit-source-id: e7047e294f71c732c77ae0c20fec60412f26f05a

    • Add no lm support for CTC decoder (#2174) · 4c3fa875
      Caroline Chen authored
      Summary:
      Add support for running the CTC lexicon decoder without a language model by adding a non-language model, `ZeroLM`, that returns a score of 0 for everything. Generalize the decoder class/API slightly to support this, adding it as an option for the KenLM decoder for now (it will likely be separated from KenLM when support for other kinds of LMs is added).
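A minimal sketch of the idea, assuming a language-model interface with `start`, `score`, and `finish` methods. The interface, the class and function names, and the state representation below are illustrative assumptions, not the actual decoder API.

```python
from typing import Sequence, Tuple

State = Tuple[int, ...]  # assumed LM state: the token history so far


class ZeroLMSketch:
    """Stand-in language model that contributes nothing to the decoding score."""

    def start(self) -> State:
        return ()

    def score(self, state: State, token: int) -> Tuple[State, float]:
        # Extend the history, but always report a score of 0.
        return state + (token,), 0.0

    def finish(self, state: State) -> Tuple[State, float]:
        return state, 0.0


def total_lm_score(lm: ZeroLMSketch, tokens: Sequence[int]) -> float:
    """Accumulate the LM score over a token sequence (hypothetical helper)."""
    state = lm.start()
    total = 0.0
    for token in tokens:
        state, step_score = lm.score(state, token)
        total += step_score
    _, final_score = lm.finish(state)
    return total + final_score


lm_score = total_lm_score(ZeroLMSketch(), [3, 1, 4])
```

Because every call returns 0, a decoder that adds the LM score to each hypothesis ranks hypotheses by the acoustic (emission) scores alone, which is one way to read "no LM support" behind an unchanged decoder interface.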
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2174
      
      Reviewed By: hwangjeff, nateanl
      
      Differential Revision: D33798674
      
      Pulled By: carolineechen
      
      fbshipit-source-id: ef8265f1d046011b143597b3b7c691566b08dcde
    • Refactor RNNT factory function to support num_symbols argument (#2178) · 2cb87c6b
      Zhaoheng Ni authored
      Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2178
      
      Reviewed By: mthrok
      
      Differential Revision: D33797649
      
      Pulled By: nateanl
      
      fbshipit-source-id: 7a8f54294e7b5bd4d343c8e361e747bfd8b5b603
    • Add `is_ffmpeg_available` in test (#2170) · 39fe9df6
      moto authored
      Summary:
      Part of https://github.com/pytorch/audio/issues/2164.
      To make the tests introduced in https://github.com/pytorch/audio/issues/2164 skippable if ffmpeg features are not available,
      this commit adds `is_ffmpeg_available`.
      
      The availability of the features depends on two factors:
      1. Whether they were enabled at build time.
      2. Whether the ffmpeg libraries are found at runtime.
      
      A simple way (for the OSS workflow) to detect both is to check whether
      `libtorchaudio_ffmpeg` is present and can be loaded without failure.
      
      To facilitate this, this commit changes
      `torchaudio._extension._load_lib` to return a boolean result.
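The detection approach described above can be sketched with the standard library alone. This is an assumption-laden illustration, not the torchaudio implementation: `try_load_lib`, `ffmpeg_features_available`, and `skip_if_no_ffmpeg` are hypothetical names, and locating the library through `ctypes.util.find_library` is a stand-in for however the extension is actually located.

```python
import ctypes
import ctypes.util
import unittest


def try_load_lib(name: str) -> bool:
    """Try to load a shared library and report success instead of raising."""
    path = ctypes.util.find_library(name)
    if path is None:
        return False  # library not found at runtime
    try:
        ctypes.CDLL(path)
    except OSError:
        return False  # found, but loading failed (e.g. missing dependencies)
    return True


def ffmpeg_features_available() -> bool:
    """Hypothetical availability check based on a successful library load."""
    return try_load_lib("torchaudio_ffmpeg")


# Decorator that skips a test case when the ffmpeg features are unavailable.
skip_if_no_ffmpeg = unittest.skipUnless(
    ffmpeg_features_available(), "ffmpeg features are not available"
)


@skip_if_no_ffmpeg
class FFmpegFeatureTest(unittest.TestCase):
    def test_placeholder(self):
        self.assertTrue(ffmpeg_features_available())
```

Returning a boolean from the loader lets one check cover both failure modes (not built, or not loadable at runtime), so tests can be skipped rather than fail on machines without ffmpeg.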
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2170
      
      Reviewed By: carolineechen
      
      Differential Revision: D33797695
      
      Pulled By: mthrok
      
      fbshipit-source-id: 85e767fc06350b8f99de255bc965b8c92b8cfe97
  6. 26 Jan, 2022 6 commits
  7. 24 Jan, 2022 1 commit
    • allow Tacotron2 decoding batch_size 1 examples (#2156) · cea1dc66
      popcornell authored
      Summary:
      It seems to me that the current Tacotron2 model does not allow decoding examples with a batch size of 1;
      for example, the following code fails. I may have a fix for that.
      
      ```python
      import torch
      # Assumed import path for the private decoder class used below.
      from torchaudio.models.tacotron2 import _Decoder

      if __name__ == "__main__":
          max_length = 400
          n_batch = 1  # the batch-size-1 case this PR addresses
          hdim = 32
          dec = _Decoder(
              encoder_embedding_dim=hdim,
              n_mels=hdim,
              n_frames_per_step=1,
              decoder_rnn_dim=1024,
              decoder_max_step=2000,
              decoder_dropout=0.1,
              decoder_early_stopping=True,
              attention_rnn_dim=1024,
              attention_hidden_dim=128,
              attention_location_n_filter=32,
              attention_location_kernel_size=31,
              attention_dropout=0.1,
              prenet_dim=256,
              gate_threshold=0.5,
          )

          # Random encoder output of shape (batch, time, hidden).
          inp = torch.rand((n_batch, max_length, hdim))
          lengths = torch.tensor([max_length]).expand(n_batch).to(inp.device, inp.dtype)
          # Forward pass with a random ground-truth mel spectrogram.
          dec(inp, torch.rand((n_batch, hdim, max_length)), lengths)[0]
          # Inference without a ground-truth spectrogram.
          dec.infer(inp, lengths)[0]
      ```
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2156
      
      Reviewed By: carolineechen
      
      Differential Revision: D33744006
      
      Pulled By: nateanl
      
      fbshipit-source-id: 7d04726dfe7e45951ab0007f22f10f90f26379a7
  8. 22 Jan, 2022 1 commit
  9. 21 Jan, 2022 3 commits
  10. 20 Jan, 2022 2 commits
  11. 19 Jan, 2022 2 commits
  12. 18 Jan, 2022 1 commit
  13. 14 Jan, 2022 2 commits
  14. 08 Jan, 2022 1 commit
    • [PyTorchLightning/pytorch-lightning] Add deprecation path for renamed training... · 7b6b2d00
      Binh Tang authored
      [PyTorchLightning/pytorch-lightning] Add deprecation path for renamed training type plugins (#11227)
      
      Summary:
      ### New commit log messages
        4eede7c30 Add deprecation path for renamed training type plugins (#11227)
      
      Reviewed By: edward-io, daniellepintz
      
      Differential Revision: D33409991
      
      fbshipit-source-id: 373e48767e992d67db3c85e436648481ad16c9d0
  15. 07 Jan, 2022 2 commits
    • Add parameter usage to CTC inference tutorial (#2141) · ffbfe74a
      Caroline Chen authored
      Summary:
      Add an explanation and demonstration of the different beam search decoder parameters.
      Additionally, use a better sample audio file and load the tokens from a token list instead of a tokens file.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2141
      
      Reviewed By: mthrok
      
      Differential Revision: D33463230
      
      Pulled By: carolineechen
      
      fbshipit-source-id: d3dd6452b03d4fc2e095d778189c66f7161e4c68
    • Enable build ffmpeg-features in all related jobs (#2140) · 565f8d41
      moto authored
      Summary:
      This commit enables the ffmpeg-feature build in tests and
      binary builds on all platforms
      (Linux/macOS/Windows x conda/wheel).
      
      It also moves the definition of BUILD_FFMPEG env vars to the
      top level `config.yml`.
      
       ---
      Manually checked that each build log contains `libtorchaudio_ffmpeg`.
      ### binary build
      - [x] `binary_linux_conda_py3.7_cpu`
      - [x] `binary_linux_conda_py3.7_cu102`
      - [x] `binary_linux_wheel_py3.7_cpu`
      - [x] `binary_linux_wheel_py3.7_cu102`
      - [x] `binary_macos_conda_py3.7_cpu`
      - [x] `binary_macos_wheel_py3.7_cpu`
      - [x] `binary_windows_conda_py3.7_cpu`
      - [x] `binary_windows_conda_py3.7_cu113`
      - [x] `binary_windows_wheel_py3.7_cpu`
      - [x] `binary_windows_wheel_py3.7_cu113`
      
      ### test
      - [x] `unittest_linux_cpu_py3.7`
      - [x] `unittest_linux_gpu_py3.7`
      - [x] `unittest_macos_cpu_py3.7`
      - [x] `unittest_windows_cpu_py3.7`
      - [x] `unittest_windows_gpu_py3.7`
      - [x] `integration test`
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2140
      
      Reviewed By: hwangjeff
      
      Differential Revision: D33464430
      
      Pulled By: mthrok
      
      fbshipit-source-id: 2c5b72be75d49019bf1599036180d4e56074e46b