1. 01 Feb, 2022 5 commits
  2. 31 Jan, 2022 1 commit
  3. 27 Jan, 2022 4 commits
    • Remove invalid token blanking logic from RNN-T decoder (#2180) · ed6256a2
      hwangjeff authored
      Summary:
      This PR removes logic in `RNNTBeamSearch` that blanks out joiner output values corresponding to special tokens, e.g. `<unk>` and `<eos>`, for the following reasons:
      - Provided that the model was configured and trained properly, it shouldn't be necessary; for example, the model would naturally produce low probabilities for special tokens if they don't exist in the training set.
      - For our pre-trained LibriSpeech training pipeline, the removal of the logic doesn't affect evaluation WER on any of the dev/test splits.
      - The existing logic doesn't generalize to arbitrary token vocabularies.
      - Internally, it seems to have been acknowledged that this logic was introduced to compensate for quirks in other parts of the modeling infra.
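
      To make concrete what "blanking out" means here, the following is a minimal sketch, not the code removed from `RNNTBeamSearch`. It assumes the joiner output is a list of per-token log-probabilities and that the special tokens sit at known indices; the helper names and the five-token vocabulary are hypothetical.

      ```python
      import math


      def blank_special_tokens(log_probs, special_ids):
          """Return a copy of log_probs with special-token entries set to -inf.

          Hypothetical helper illustrating the kind of masking discussed above.
          """
          blocked = set(special_ids)
          return [
              -math.inf if idx in blocked else score
              for idx, score in enumerate(log_probs)
          ]


      def top_token(log_probs):
          """Index of the highest-scoring token."""
          return max(range(len(log_probs)), key=lambda idx: log_probs[idx])


      # Hypothetical vocabulary: indices 0 and 1 stand for <unk> and <eos>.
      vocab = ["<unk>", "<eos>", "a", "b", "<blank>"]
      joiner_out = [math.log(p) for p in (0.40, 0.05, 0.30, 0.20, 0.05)]

      # With masking, the search can never pick a special token.
      masked = blank_special_tokens(joiner_out, special_ids=[0, 1])
      ```

      In this sketch the mask relies on hard-coded indices, which illustrates why such logic is awkward to generalize to arbitrary vocabularies.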
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2180
      
      Reviewed By: carolineechen, mthrok
      
      Differential Revision: D33822683
      
      Pulled By: hwangjeff
      
      fbshipit-source-id: e7047e294f71c732c77ae0c20fec60412f26f05a
    • Add no lm support for CTC decoder (#2174) · 4c3fa875
      Caroline Chen authored
      Summary:
      Add support for the CTC lexicon decoder without a language model by adding a non-language model, `ZeroLM`, that returns a score of 0 for everything. Generalize the decoder class/API a bit to support this, adding it as an option for the kenlm decoder at the moment (it will likely be separated out from kenlm when support for other kinds of LMs is added in the future).
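
      As a rough illustration of the idea, a "zero" language model can be sketched as an object that exposes the same scoring hooks as a real LM but always contributes 0 to the hypothesis score. The interface below (`start`, `score`, `finish`) and the `rescore` helper are assumptions made for this sketch, not the decoder's actual API.

      ```python
      from typing import Iterable, Tuple

      State = Tuple[int, ...]


      class ZeroLM:
          """Language model stand-in that assigns score 0 to every token.

          The method names are hypothetical; a real decoder defines its own LM interface.
          """

          def start(self) -> State:
              # The LM state is just the token history; it never influences the score.
              return ()

          def score(self, state: State, token: int) -> Tuple[State, float]:
              return state + (token,), 0.0

          def finish(self, state: State) -> Tuple[State, float]:
              return state, 0.0


      def rescore(acoustic_score: float, tokens: Iterable[int], lm, lm_weight: float = 1.0) -> float:
          """Combine an acoustic score with the LM score accumulated over tokens."""
          state = lm.start()
          lm_total = 0.0
          for token in tokens:
              state, step_score = lm.score(state, token)
              lm_total += step_score
          _, final_score = lm.finish(state)
          return acoustic_score + lm_weight * (lm_total + final_score)
      ```

      With such an object plugged in, the combined score equals the acoustic score regardless of the LM weight, so the decoder code path stays the same with or without a real LM.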
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2174
      
      Reviewed By: hwangjeff, nateanl
      
      Differential Revision: D33798674
      
      Pulled By: carolineechen
      
      fbshipit-source-id: ef8265f1d046011b143597b3b7c691566b08dcde
    • Refactor RNNT factory function to support num_symbols argument (#2178) · 2cb87c6b
      Zhaoheng Ni authored
      Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2178
      
      Reviewed By: mthrok
      
      Differential Revision: D33797649
      
      Pulled By: nateanl
      
      fbshipit-source-id: 7a8f54294e7b5bd4d343c8e361e747bfd8b5b603
    • Add `is_ffmpeg_available` in test (#2170) · 39fe9df6
      moto authored
      Summary:
      Part of https://github.com/pytorch/audio/issues/2164.
      To make the tests introduced in https://github.com/pytorch/audio/issues/2164 skippable if ffmpeg features are not available,
      this commit adds `is_ffmpeg_available`.
      
      The availability of the features depends on two factors:
      1. Whether they were enabled at build time.
      2. Whether the ffmpeg libraries are found at runtime.
      
      A simple way (for the OSS workflow) to detect both is to check whether
      `libtorchaudio_ffmpeg` is present and can be loaded without failure.

      To facilitate this, this commit changes
      `torchaudio._extension._load_lib` to return a boolean result.
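
      The availability check described above can be sketched as follows. This is an illustration only: it uses `ctypes` rather than torchaudio's own loading mechanism, and the helper name `load_lib` and the library name in the test are hypothetical.

      ```python
      import ctypes
      import ctypes.util
      import unittest


      def load_lib(name: str) -> bool:
          """Try to load a shared library and report success instead of raising."""
          path = ctypes.util.find_library(name)
          if path is None:
              # Not built, or not discoverable at runtime.
              return False
          try:
              ctypes.CDLL(path)
          except OSError:
              # Found, but loading failed (for example, a missing dependency).
              return False
          return True


      class OptionalFeatureTest(unittest.TestCase):
          # "example_optional_lib" is a placeholder name for this sketch.
          @unittest.skipUnless(load_lib("example_optional_lib"), "optional library not available")
          def test_feature(self):
              self.assertTrue(True)
      ```

      Returning a boolean lets a single call cover both factors: a library that was never built and one that cannot be loaded at runtime both yield `False`.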
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2170
      
      Reviewed By: carolineechen
      
      Differential Revision: D33797695
      
      Pulled By: mthrok
      
      fbshipit-source-id: 85e767fc06350b8f99de255bc965b8c92b8cfe97
  4. 26 Jan, 2022 6 commits
  5. 24 Jan, 2022 1 commit
    • allow Tacotron2 decoding batch_size 1 examples (#2156) · cea1dc66
      popcornell authored
      Summary:
      It seems to me that the current Tacotron2 model does not allow decoding batches of size 1;
      for example, the following code fails. I may have a fix for that.
      
      ```python
      import torch
      from torchaudio.models.tacotron2 import _Decoder

      if __name__ == "__main__":
          max_length = 400
          n_batch = 1  # a single-example batch triggers the failure
          hdim = 32
          dec = _Decoder(
              encoder_embedding_dim=hdim,
              n_mels=hdim,
              n_frames_per_step=1,
              decoder_rnn_dim=1024,
              decoder_max_step=2000,
              decoder_dropout=0.1,
              decoder_early_stopping=True,
              attention_rnn_dim=1024,
              attention_hidden_dim=128,
              attention_location_n_filter=32,
              attention_location_kernel_size=31,
              attention_dropout=0.1,
              prenet_dim=256,
              gate_threshold=0.5,
          )

          inp = torch.rand((n_batch, max_length, hdim))
          lengths = torch.tensor([max_length]).expand(n_batch).to(inp.device, inp.dtype)
          # Forward pass with a ground-truth mel spectrogram, then free-running inference.
          dec(inp, torch.rand((n_batch, hdim, max_length)), lengths)[0]
          dec.infer(inp, lengths)[0]
      ```
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2156
      
      Reviewed By: carolineechen
      
      Differential Revision: D33744006
      
      Pulled By: nateanl
      
      fbshipit-source-id: 7d04726dfe7e45951ab0007f22f10f90f26379a7
  6. 22 Jan, 2022 1 commit
  7. 21 Jan, 2022 3 commits
  8. 20 Jan, 2022 2 commits
  9. 19 Jan, 2022 2 commits
  10. 18 Jan, 2022 1 commit
  11. 14 Jan, 2022 2 commits
  12. 08 Jan, 2022 1 commit
    • [PyTorchLightning/pytorch-lightning] Add deprecation path for renamed training... · 7b6b2d00
      Binh Tang authored
      [PyTorchLightning/pytorch-lightning] Add deprecation path for renamed training type plugins (#11227)
      
      Summary:
      ### New commit log messages
        4eede7c30 Add deprecation path for renamed training type plugins (#11227)
      
      Reviewed By: edward-io, daniellepintz
      
      Differential Revision: D33409991
      
      fbshipit-source-id: 373e48767e992d67db3c85e436648481ad16c9d0
  13. 07 Jan, 2022 2 commits
    • Add parameter usage to CTC inference tutorial (#2141) · ffbfe74a
      Caroline Chen authored
      Summary:
      Add an explanation and demonstration of the different beam search decoder parameters.
      Additionally, use a better sample audio file and load the tokens from a token list instead of a tokens file.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2141
      
      Reviewed By: mthrok
      
      Differential Revision: D33463230
      
      Pulled By: carolineechen
      
      fbshipit-source-id: d3dd6452b03d4fc2e095d778189c66f7161e4c68
    • Enable build ffmpeg-features in all related jobs (#2140) · 565f8d41
      moto authored
      Summary:
      This commit enables ffmpeg-feature build in tests and
      binary builds of all platforms.
      (Linux/macOS/Windows x conda/wheel)
      
      It also moves the definition of BUILD_FFMPEG env vars to the
      top level `config.yml`.
      
      ---
      Manual check that every build log contains `libtorchaudio_ffmpeg`:
      ### binary build
      - [x] `binary_linux_conda_py3.7_cpu`
      - [x] `binary_linux_conda_py3.7_cu102`
      - [x] `binary_linux_wheel_py3.7_cpu`
      - [x] `binary_linux_wheel_py3.7_cu102`
      - [x] `binary_macos_conda_py3.7_cpu`
      - [x] `binary_macos_wheel_py3.7_cpu`
      - [x] `binary_windows_conda_py3.7_cpu`
      - [x] `binary_windows_conda_py3.7_cu113`
      - [x] `binary_windows_wheel_py3.7_cpu`
      - [x] `binary_windows_wheel_py3.7_cu113`
      
      ### test
      - [x] `unittest_linux_cpu_py3.7`
      - [x] `unittest_linux_gpu_py3.7`
      - [x] `unittest_macos_cpu_py3.7`
      - [x] `unittest_windows_cpu_py3.7`
      - [x] `unittest_windows_gpu_py3.7`
      - [x] `integration test`
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2140
      
      Reviewed By: hwangjeff
      
      Differential Revision: D33464430
      
      Pulled By: mthrok
      
      fbshipit-source-id: 2c5b72be75d49019bf1599036180d4e56074e46b
  14. 06 Jan, 2022 5 commits
  15. 05 Jan, 2022 4 commits
    • Do not auto-skip tests on CI (#2127) · 4f487c4a
      moto authored
      Summary:
      Update the internals of the `skipIfXXX` decorators so that tests in CI are not skipped automatically.
      
      Currently we automatically skip some tests based on the availability of related features/test tools.
      This causes us to miss signals on certain important features, such as CUDA on Windows (https://github.com/pytorch/audio/issues/1565).
      
      The new `skipIf` decorator fails the test when running in CI unless skipping is explicitly allowed.
      It does so by checking the `CI` and `TORCHAUDIO_TEST_ALLOW_SKIP_IF_XXX` environment variables.

      For non-CI environments, the behavior is the same as before, but users can now set `TORCHAUDIO_TEST_ALLOW_SKIP_IF_XXX=false` to disallow the automatic skip.
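
      The decision logic described above could look roughly like the sketch below. It is not the decorator from this commit: the function names, the `key` parameter that fills in the `XXX` suffix, the accepted "true" spellings, and the injectable `environ` argument are all assumptions made for illustration.

      ```python
      import os
      import unittest

      _TRUE_VALUES = {"1", "true", "on", "yes"}  # assumed spellings of "true"


      def _env_flag(name, default, environ=None):
          """Read a boolean environment variable, falling back to default if unset."""
          environ = os.environ if environ is None else environ
          if name not in environ:
              return default
          return environ[name].strip().lower() in _TRUE_VALUES


      def skip_if(condition, reason, key, environ=None):
          """Skip a test when condition holds, unless skipping is disallowed.

          In CI the default is to fail rather than skip; elsewhere the default is to skip.
          """
          if not condition:
              return lambda test_item: test_item  # nothing to skip

          in_ci = _env_flag("CI", default=False, environ=environ)
          var = f"TORCHAUDIO_TEST_ALLOW_SKIP_IF_{key}"
          if _env_flag(var, default=not in_ci, environ=environ):
              return unittest.skip(reason)

          def fail_decorator(test_item):
              def failing_test(*args, **kwargs):
                  raise RuntimeError(f"{reason} (skipping is not allowed; see {var})")

              return failing_test

          return fail_decorator
      ```

      A test would then be decorated with something like `@skip_if(not torch.cuda.is_available(), "CUDA not available", key="NO_CUDA")`, where the `NO_CUDA` key is again a hypothetical name.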
      
      Results without `TORCHAUDIO_TEST_ALLOW_SKIP_IF_XXX` https://app.circleci.com/pipelines/github/pytorch/audio/9112/workflows/4e6db046-a1a2-4965-b0fe-d5baf4a1efac
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2127
      
      Reviewed By: hwangjeff
      
      Differential Revision: D33430711
      
      Pulled By: mthrok
      
      fbshipit-source-id: d8954dd720469c5ab0f34ea062fd8cf04a8afa3e
    • Remove RNNTL unused vars (#2142) · 80ea3419
      Caroline Chen authored
      Summary:
      Remove unnecessary RNNT Loss variables and comment, as indicated in the https://github.com/pytorch/audio/issues/1479 review comments
      (will follow up on the `workspace` comments separately, depending on complexity).
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2142
      
      Reviewed By: mthrok
      
      Differential Revision: D33433764
      
      Pulled By: carolineechen
      
      fbshipit-source-id: be0ecb77dabd63d733f0d33ff258eae32305eeaf
    • Add minimal ffmpeg build to linux wheel job envs (#2137) · 0a072f9a
      moto authored
      Summary:
      This change adds a minimal ffmpeg installation step to the build wheel job so that later, we can use the resulting ffmpeg libraries for building torchaudio's ffmpeg-features.
      
      The Linux wheel build jobs run in a CentOS 8-based environment, which does not provide an easy way to install ffmpeg without conda.

      Once https://github.com/pytorch/audio/pull/2124 is merged, we can enable the ffmpeg-feature build for the Linux wheel.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2137
      
      Reviewed By: carolineechen
      
      Differential Revision: D33430032
      
      Pulled By: mthrok
      
      fbshipit-source-id: bf946d394c0718ddbdc679d7970befc3221982b9
    • Update ffmpeg discovery logic (#2124) · d8a65450
      moto authored
      Summary:
      Update ffmpeg discovery logic
      
      Previously the build process used pkg-config to locate
      an installation of ffmpeg, which does not work well on Windows/CentOS.

      This commit updates the discovery process to use a custom
      FindFFMPEG.cmake adopted from the Kitware/VTK repository, with added
      handling for conda environments.

      The custom discovery logic can support Windows and CentOS.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2124
      
      Reviewed By: carolineechen
      
      Differential Revision: D33429564
      
      Pulled By: mthrok
      
      fbshipit-source-id: 6cb50c1d8c58f51e0f3f3af5c5b541aa3a699bba