1. 14 Aug, 2023 1 commit
  2. 11 Aug, 2023 1 commit
      Expose AudioMetadata (#3556) · 9467fc44
      moto authored
      Summary:
      `torchaudio.info` returns `AudioMetaData`. It should be exposed as a public API, without referring to the `backend` submodule.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3556
      
      Reviewed By: huangruizhe
      
      Differential Revision: D48267349
      
      Pulled By: mthrok
      
      fbshipit-source-id: 6ccc0c32bf62fbdcb71495fc7d8d4cc29891538a
  3. 10 Aug, 2023 1 commit
  4. 07 Aug, 2023 2 commits
      Add MMS FA Bundle (#3521) · 5e211d66
      moto authored
      Summary:
      Port the MMS FA model from the tutorial to the library, along with a post-processing module.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3521
      
      Reviewed By: huangruizhe
      
      Differential Revision: D48038285
      
      Pulled By: mthrok
      
      fbshipit-source-id: 571cf0fceaaab4790983be2719f1a85805b814f5
      Add merge_tokens / TokenSpan (#3535) · 30668afb
      moto authored
      Summary:
      This commit adds the `merge_tokens` function, which removes repeated tokens from CTC token sequences returned by `forced_align`.
      
      Resolving repeated tokens is a necessary and almost universal step, so it makes sense to have such a helper function in torchaudio.
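
As an illustration of what resolving repeated tokens involves, here is a minimal pure-Python sketch. It is not the torchaudio implementation: the names `Span` and `merge_repeated_tokens` are hypothetical, the blank token is assumed to have index 0, and scores are left out for brevity.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Span:
    """Hypothetical container for one merged token (not torchaudio's TokenSpan)."""

    token: int
    start: int  # index of the first frame of the run (inclusive)
    end: int  # index one past the last frame of the run (exclusive)


def merge_repeated_tokens(tokens: Sequence[int], blank: int = 0) -> List[Span]:
    """Collapse runs of identical frame-level tokens and drop blank runs."""
    spans: List[Span] = []
    start = 0
    # Walk one step past the end so that the final run is flushed too.
    for i in range(1, len(tokens) + 1):
        if i == len(tokens) or tokens[i] != tokens[start]:
            if tokens[start] != blank:
                spans.append(Span(tokens[start], start, i))
            start = i
    return spans
```

A blank between two identical tokens keeps them as separate spans, which is how CTC alignments represent a genuinely repeated symbol.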
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3535
      
      Reviewed By: huangruizhe
      
      Differential Revision: D48111202
      
      Pulled By: mthrok
      
      fbshipit-source-id: 25354bfa210aa5c03f8c1d3e201f253ca3761b24
  5. 03 Aug, 2023 1 commit
  6. 01 Aug, 2023 2 commits
  7. 31 Jul, 2023 1 commit
  8. 28 Jul, 2023 3 commits
  9. 27 Jul, 2023 1 commit
      Replace libsox with stub library (#3497) · 8588fba1
      moto authored
      Summary:
      This commit updates the way libsox is integrated into torchaudio:
      
      1. We stop statically linking libsox, so torchaudio will not ship libsox.
      2. We link libsox dynamically. Users are expected to install libsox by themselves.
      3. We use a stub library to build torchaudio.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3497
      
      Differential Revision: D47803706
      
      Pulled By: mthrok
      
      fbshipit-source-id: 31b05495d81069186fa52d67beea360cc7e817a8
  10. 25 Jul, 2023 2 commits
  11. 18 Jul, 2023 1 commit
  12. 15 Jul, 2023 1 commit
  13. 12 Jul, 2023 1 commit
      Support multiple FFmpeg versions (#3464) · 786066b4
      moto authored
      Summary:
      This commit introduces support for multiple FFmpeg versions for OSS binary distributions.
      
      Currently torchaudio only works with FFmpeg 4. This is inconvenient at every stage, from installation to runtime linking.
      This commit allows picking FFmpeg 4, 5 or 6 at runtime, instead of just looking for v4.
      
      The way it works is that we compile the FFmpeg extension three times, each against a different FFmpeg version, and ship all three.
      At runtime, we look for libavutil of specific version and when one is found, load the corresponding FFmpeg extension.
      The order of preference is 6, 5, then 4.
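
The runtime selection described above can be sketched roughly as follows. This is not the code from the commit: `pick_ffmpeg_version` and `_can_load` are hypothetical names, and the Linux-style file name `libavutil.so.N` is an assumption (other platforms name the library differently). The mapping relies on FFmpeg 4, 5 and 6 shipping libavutil major versions 56, 57 and 58 respectively.

```python
import ctypes
from typing import Callable, Optional

# libavutil major version shipped with each FFmpeg major version.
LIBAVUTIL_MAJOR = {6: 58, 5: 57, 4: 56}


def _can_load(libname: str) -> bool:
    """Return True if the dynamic loader can find and open the library."""
    try:
        ctypes.CDLL(libname)
        return True
    except OSError:
        return False


def pick_ffmpeg_version(
    is_available: Callable[[str], bool] = _can_load,
) -> Optional[int]:
    """Return the preferred FFmpeg major version whose libavutil is present."""
    for ffmpeg_version in (6, 5, 4):  # order of preference: 6, 5, then 4
        # Assumption: Linux-style shared library naming.
        libname = f"libavutil.so.{LIBAVUTIL_MAJOR[ffmpeg_version]}"
        if is_available(libname):
            return ffmpeg_version
    return None
```

Passing the probe as a parameter keeps the preference logic testable without any FFmpeg installation; the returned version would then decide which of the three compiled extensions to load.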
      
      To make the build process simple and reproducible, we use pre-built binaries of FFmpeg during the build.
      They are LGPL builds and are downloaded from S3 at build time, instead of being built every time.
      
      The use of pre-built binaries as scaffolding limits the systems that can build torchaudio, so this commit also introduces
      a single FFmpeg version support mode. Setting FFMPEG_ROOT during the build changes the way binaries are built
      so that they support only one specific version of FFmpeg.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3464
      
      Differential Revision: D47300223
      
      Pulled By: mthrok
      
      fbshipit-source-id: 560c7968315e4c8922afa11a4693f648c0356d04
  14. 11 Jul, 2023 1 commit
  15. 28 Jun, 2023 1 commit
  16. 21 Jun, 2023 2 commits
  17. 08 Jun, 2023 1 commit
  18. 07 Jun, 2023 1 commit
  19. 05 Jun, 2023 1 commit
  20. 26 May, 2023 1 commit
  21. 24 May, 2023 2 commits
  22. 23 May, 2023 1 commit
  23. 22 May, 2023 1 commit
  24. 19 May, 2023 1 commit
  25. 17 May, 2023 1 commit
      Fix for breadcrumbs displaying "Old version (stable)" on Nightly build (#3333) · 3ffd76c8
      Carl Parker authored
      Summary:
      Previously, `breadcrumbs.html` identified a nightly build version by the prefix "Nightly" which would normally be prepended to the version in `conf.py`. However, the version string is coming through without the "Nightly" prefix, so this change causes `breadcrumbs.html` to key on the substring "dev" instead.
      
      The reason we aren't getting "Nightly" is apparently because the environment variable BUILD_VERSION is available, so `conf.py` is using the value of that env var instead of the version string imported from the `torchaudio` module itself, which actually appears to be incorrect; see below.
      
      If I install torchaudio using
      
          conda install torchaudio -c pytorch-nightly
      
      then `torchaudio.__version__` returns the incorrect version string:
      
          2.0.0.dev20230309
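
The changed check amounts to a substring test on the version string. The actual change lives in the `breadcrumbs.html` template; the Python function below, with the hypothetical name `is_nightly`, only illustrates the logic.

```python
def is_nightly(version: str) -> bool:
    """Treat any version string containing "dev" as a nightly build."""
    return "dev" in version
```

With this rule, `2.0.0.dev20230309` is recognized as a nightly build even though it carries no "Nightly" prefix.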
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3333
      
      Reviewed By: mthrok
      
      Differential Revision: D45926466
      
      Pulled By: carljparker
      
      fbshipit-source-id: d5516f2d9f1716c2400d3e9b285bd5d32b4b3a77
  26. 16 May, 2023 2 commits
  27. 11 May, 2023 1 commit
  28. 10 May, 2023 2 commits
  29. 29 Apr, 2023 1 commit
  30. 28 Apr, 2023 1 commit
      Add cuctc decoder (#3096) · 0a1801ed
      Yuekai Zhang authored
      Summary:
      This PR implements a CUDA-based CTC prefix beam search decoder.
      
      Several benchmark results obtained on a V100 are attached below:
      | decoder type | model                 | dataset                    | decoding time (secs) | beam size | batch size | model unit | subsampling times | vocab size |
      |--------------|-----------------------|----------------------------|----------------------|-----------|------------|------------|-------------------|------------|
      | cuctc        | conformer nemo        | dev clean                  | 7.68                 | 8         | 32         | bpe        | 4                 | 1000       |
      | cuctc        | conformer nemo        | dev clean (sort by length) | 1.6                  | 8         | 32         | bpe        | 4                 | 1000       |
      | cuctc        | wav2vec2.0 torchaudio | dev clean                  | 22                   | 10        | 1          | char       | 2                 | 29         |
      | cuctc        | conformer espnet      | aishell1 test              | 5                    | 10        | 24         | char       | 4                 | 4233       |
      
      Note:
      1. The design parallelizes computation along the batch and vocab axes and loops sequentially over the frame axis. Compared with CPU implementations, it is therefore better suited to shorter sequences and larger vocabularies.
      2. WER is the same as with CPU implementations. However, decoding with an LM is not supported yet.
      
      Resolves: https://github.com/pytorch/audio/issues/2957.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3096
      
      Reviewed By: nateanl
      
      Differential Revision: D44709397
      
      Pulled By: mthrok
      
      fbshipit-source-id: 3078c54a2b44dc00eb4a81b4c657487eeff8c155
  31. 11 Apr, 2023 1 commit