1. 20 May, 2023 1 commit
  2. 28 Apr, 2023 1 commit
    • Add cuctc decoder (#3096) · 0a1801ed
      Yuekai Zhang authored
      Summary:
      This PR implements a CUDA-based CTC prefix beam search decoder.
      
      Benchmark results on a V100 are attached below:
      | Decoder type | Model | Dataset | Decoding time (s) | Beam size | Batch size | Model unit | Subsampling factor | Vocab size |
      |--------------|-------|---------|-------------------|-----------|------------|------------|--------------------|------------|
      | cuctc | conformer nemo | dev clean | 7.68 | 8 | 32 | bpe | 4 | 1000 |
      | cuctc | conformer nemo | dev clean (sort by length) | 1.6 | 8 | 32 | bpe | 4 | 1000 |
      | cuctc | wav2vec2.0 torchaudio | dev clean | 22 | 10 | 1 | char | 2 | 29 |
      | cuctc | conformer espnet | aishell1 test | 5 | 10 | 24 | char | 4 | 4233 |
      
      Note:
      1. The design parallelizes computation across the batch and vocab axes and loops sequentially over the frame axis. Compared with CPU implementations, it therefore favors shorter sequences and larger vocab sizes.
      2. WER is the same as with CPU implementations. However, decoding with an LM is not supported yet.
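
The design in note 1 can be illustrated with a CPU-only Python sketch of CTC prefix beam search without an LM. This is not the CUDA code from this PR: the function name `ctc_prefix_beam_search`, the `blank=0` default, and the toy input are assumptions made for illustration. The outer loop walks the frame axis sequentially, and the inner loop over tokens is the per-frame work that, per the note, the CUDA decoder spreads across the batch and vocab axes.

```python
import math
from collections import defaultdict

NEG_INF = float("-inf")


def log_add(a, b):
    """Return log(exp(a) + exp(b)) without overflow."""
    if a == NEG_INF:
        return b
    if b == NEG_INF:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def ctc_prefix_beam_search(log_probs, beam_size, blank=0):
    """Decode one utterance given per-frame log-probabilities (frames x vocab)."""
    # Each prefix maps to (log p of paths ending in blank, log p ending in non-blank).
    beams = {(): (0.0, NEG_INF)}
    for frame in log_probs:  # sequential over the frame axis
        next_beams = defaultdict(lambda: (NEG_INF, NEG_INF))
        for prefix, (p_b, p_nb) in beams.items():
            for token, lp in enumerate(frame):  # independent per token
                if token == blank:
                    # A blank keeps the prefix unchanged.
                    b, nb = next_beams[prefix]
                    next_beams[prefix] = (log_add(b, log_add(p_b, p_nb) + lp), nb)
                elif prefix and token == prefix[-1]:
                    # A repeated token without a blank in between collapses.
                    b, nb = next_beams[prefix]
                    next_beams[prefix] = (b, log_add(nb, p_nb + lp))
                    # After a blank, the repeated token extends the prefix.
                    ext = prefix + (token,)
                    b, nb = next_beams[ext]
                    next_beams[ext] = (b, log_add(nb, p_b + lp))
                else:
                    ext = prefix + (token,)
                    b, nb = next_beams[ext]
                    next_beams[ext] = (b, log_add(nb, log_add(p_b, p_nb) + lp))
        # Keep only the beam_size most probable prefixes.
        ranked = sorted(next_beams.items(), key=lambda kv: log_add(*kv[1]), reverse=True)
        beams = dict(ranked[:beam_size])
    return [(list(prefix), log_add(p_b, p_nb)) for prefix, (p_b, p_nb) in beams.items()]


# Toy input: 2 frames, vocab = {0: blank, 1: "a"}, P(blank) = 0.6 and P(a) = 0.4 per frame.
frames = [[math.log(0.6), math.log(0.4)]] * 2
hypotheses = ctc_prefix_beam_search(frames, beam_size=2)
# The best hypothesis is [1] with probability 0.64 (0.4*0.6 + 0.6*0.4 + 0.4*0.4).
```

Greedy decoding would pick blank in both frames (probability 0.36), so the toy input also shows why summing over alignments per prefix matters.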
      
      Resolves: https://github.com/pytorch/audio/issues/2957.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3096
      
      Reviewed By: nateanl
      
      Differential Revision: D44709397
      
      Pulled By: mthrok
      
      fbshipit-source-id: 3078c54a2b44dc00eb4a81b4c657487eeff8c155
  3. 05 Apr, 2023 1 commit
  4. 14 Feb, 2023 1 commit
  5. 09 Feb, 2023 1 commit
    • Follow-up fix policy set (#3046) · 70acff7a
      moto authored
      Summary:
      Commit b4c66d1f broke all the CIs.
      The new policy changes the timestamp of configuration files of third party libraries,
      which triggers re-configuration which requires extra tools.
      
      This commit fixes it by reverting to the old behavior.
      It also adds a guard for older CMake versions.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3046
      
      Reviewed By: atalman
      
      Differential Revision: D43133536
      
      Pulled By: mthrok
      
      fbshipit-source-id: 357055c8c1b53e593b8b7880f2045e13512c7a8f
  6. 08 Feb, 2023 1 commit
    • Suppres warning about archive timestamp (#3044) · b4c66d1f
      moto authored
      Summary:
      Currently, for each third party library checked out with ExternalProject_Add, the following warning is shown.
      
      This commit sets the policy so that the warning is not shown.
      
      ```
      CMake Warning (dev) at ci_env/lib/python3.10/site-packages/cmake/data/share/cmake-3.25/Modules/ExternalProject.cmake:3075 (message):
        The DOWNLOAD_EXTRACT_TIMESTAMP option was not given and policy CMP0135 is
        not set.  The policy's OLD behavior will be used.  When using a URL
        download, the timestamps of extracted files should preferably be that of
        the time of extraction, otherwise code that depends on the extracted
        contents might not be rebuilt if the URL changes.  The OLD behavior
        preserves the timestamps from the archive instead, but this is usually not
        what you want.  Update your project to the NEW behavior or specify the
        DOWNLOAD_EXTRACT_TIMESTAMP option with a value of true to avoid this
        robustness issue.
      ```
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3044
      
      Reviewed By: xiaohui-zhang
      
      Differential Revision: D43110818
      
      Pulled By: mthrok
      
      fbshipit-source-id: d2e20c9fdbbeeedb5ad546fe32dbda28c5bdd431
  7. 12 Jan, 2023 1 commit
  8. 04 Jan, 2023 1 commit
  9. 29 Dec, 2022 1 commit
  10. 29 Jul, 2022 1 commit
  11. 28 Jul, 2022 1 commit
  12. 02 Jun, 2022 1 commit
    • Remove mad (#2428) · d2ecba98
      moto authored
      Summary:
      Remove the code related to libmad, which had been disabled in https://github.com/pytorch/audio/issues/2354
      
      In https://github.com/pytorch/audio/issues/2419, we moved MP3 decoding to ffmpeg. But CI tests were still using libmad.
      This commit completely removes libmad from torchaudio.
      
      This is a BC-breaking change: the `apply_sox_effects_file` function can no longer handle MP3, and it cannot fall back to ffmpeg.
      The workaround is to use `torchaudio.load` and then `apply_sox_effects_tensor`.
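
A minimal sketch of the workaround, assuming the function the message calls `apply_sox_effects_tensor` is the public `torchaudio.sox_effects.apply_effects_tensor` available in torchaudio releases from around this commit. The helper name `apply_effects_to_mp3` is hypothetical.

```python
def apply_effects_to_mp3(path, effects):
    """Decode an MP3 via torchaudio.load, then apply sox effects to the tensor.

    `effects` uses the sox_effects format, e.g. [["rate", "8000"]].
    Returns the (waveform, sample_rate) pair produced by apply_effects_tensor.
    """
    # Imported lazily so the sketch can be read and imported without torchaudio.
    import torchaudio

    # Decoding no longer goes through libsox/libmad at this step.
    waveform, sample_rate = torchaudio.load(path)
    # Assumption: this is the function the commit message refers to.
    return torchaudio.sox_effects.apply_effects_tensor(waveform, sample_rate, effects)
```

Usage would look like `apply_effects_to_mp3("speech.mp3", [["rate", "8000"]])`, where the file name is a placeholder.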
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2428
      
      Reviewed By: carolineechen
      
      Differential Revision: D36851805
      
      Pulled By: mthrok
      
      fbshipit-source-id: f98795c59a1ac61cef511f2bbeac37f7c3c69d55
  13. 13 May, 2022 1 commit
    • Move Streamer API out of prototype (#2378) · 72b712a1
      moto authored
      Summary:
      This commit moves the Streaming API out of prototype module.
      
      * The related classes are renamed as follows:

        - `Streamer` -> `StreamReader`
        - `SourceStream` -> `StreamReaderSourceStream`
        - `SourceAudioStream` -> `StreamReaderSourceAudioStream`
        - `SourceVideoStream` -> `StreamReaderSourceVideoStream`
        - `OutputStream` -> `StreamReaderOutputStream`
      
      This change is a preemptive measure in case a `StreamWriter` API
      is added later.
      
      * Replace the `BUILD_FFMPEG` build arg with `USE_FFMPEG`

      We are not building FFmpeg, so `USE_FFMPEG` is the more appropriate name.
      
       ---
      
      After https://github.com/pytorch/audio/issues/2377
      
      Remaining TODOs: (different PRs)
      - [ ] Introduce `is_ffmpeg_binding_available` function.
      - [ ] Refactor C++ code:
         - Rename `Streamer` to `StreamReader`.
         - Rename `streamer.[h|cpp]` to `stream_reader.[h|cpp]`.
         - Rename `prototype.cpp` to `stream_reader_binding.cpp`.
         - Introduce `stream_reader` directory.
      - [x] Enable FFmpeg in smoke test (https://github.com/pytorch/audio/issues/2381)
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2378
      
      Reviewed By: carolineechen
      
      Differential Revision: D36359299
      
      Pulled By: mthrok
      
      fbshipit-source-id: 6a57b702996af871e577fb7addbf3522081c1328
  14. 28 Apr, 2022 1 commit
  15. 05 Jan, 2022 1 commit
    • Update ffmpeg discovery logic (#2124) · d8a65450
      moto authored
      Summary:
      Update ffmpeg discovery logic
      
      Previously, the build process used pkg-config to locate
      an installation of ffmpeg, which does not work well on Windows or CentOS.

      This commit updates the discovery process to use a custom
      FindFFMPEG.cmake adopted from the Kitware/VTK repository, with the addition of
      conda environment support.

      The custom discovery logic can support Windows and CentOS.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2124
      
      Reviewed By: carolineechen
      
      Differential Revision: D33429564
      
      Pulled By: mthrok
      
      fbshipit-source-id: 6cb50c1d8c58f51e0f3f3af5c5b541aa3a699bba
  16. 30 Dec, 2021 1 commit
    • Add a switch to build ffmpeg binding (#2048) · ece03edc
      moto authored
      Summary:
      This PR adds `BUILD_FFMPEG` switch to torchaudio build process so that features related to ffmpeg are built.
      The flag is false by default, so no CI jobs or development flow are affected.
      
      This is because handling the dependencies around ffmpeg is a bit tricky.
      Currently, the CMake file uses `pkg-config` to find an ffmpeg installation in the system.
      This works fine for both conda-based installation and system-managed installation (like `apt`).
      
      In subsequent PRs, I will find a solution that works for local development and binary distributions.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2048
      
      Reviewed By: hwangjeff, nateanl
      
      Differential Revision: D33367260
      
      Pulled By: mthrok
      
      fbshipit-source-id: 94517acecb62bd6d4e96d4b7cbc3ab3c2a25706c
  17. 18 Dec, 2021 1 commit
  18. 17 Dec, 2021 1 commit
    • Add static build of KenLM (#2076) · adc559a8
      moto authored
      Summary:
      Add KenLM and its dependencies required for static build (`zlib`, `bzip2`, `lzma` and `boost-thread`).
      
      KenLM and its dependencies are built, but since no corresponding code on the torchaudio side is changed, the resulting torchaudio extension module is unchanged. Therefore, as long as the build process passes on CI, this PR should be good to go.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2076
      
      Reviewed By: carolineechen
      
      Differential Revision: D33189980
      
      Pulled By: mthrok
      
      fbshipit-source-id: 6096113128b939f3cf70990c99aacc4aaa954584
  19. 06 Oct, 2021 1 commit
  20. 29 Sep, 2021 1 commit
  21. 24 Sep, 2021 1 commit
  22. 16 Sep, 2021 1 commit
    • Split extension into custom impl and Python wrapper libraries (#1752) · 0f822179
      moto authored
      * Split `libtorchaudio` and `_torchaudio`
      
      This change extracts the core implementation from `_torchaudio` into `libtorchaudio`,
      so that `libtorchaudio` is reusable in TorchScript-based apps.
      
      `_torchaudio` is a wrapper around `libtorchaudio` and only provides PyBind11-based
      features. (currently file-like object support in I/O)
      
      * Removed `BUILD_LIBTORCHAUDIO` option
      
      When invoking `cmake`, `libtorchaudio` is always built, so this option is removed.
      
      The new assumptions around library discoverability are:

      - In the regular OSS workflow (`pip`/`conda`-based binary installation), both `libtorchaudio` and `_torchaudio` are present.
        In this case, `libtorchaudio` has to be loaded manually with `torch.ops.load_library` and/or `torch.classes.load_library`; otherwise, importing `_torchaudio` cannot resolve the symbols defined in `libtorchaudio`.
      - When `torchaudio` is deployed in PEX format (a single zip file):
        - We expect that `libtorchaudio.so` exists as a file in some search path configured by client code.
        - `_torchaudio` is still importable. Because we do not know where `libtorchaudio` will be located, we let the dynamic loader resolve the dependency from `_torchaudio` to `libtorchaudio`, which should work as long as `libtorchaudio` is in a library search path (the search path cannot be modified from an already-running Python process).
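
The two cases above can be sketched roughly as follows. This is an illustration, not the loader shipped in torchaudio: the helper names `find_shared_library` and `load_extension`, the list of file extensions, and the module path `torchaudio._torchaudio` are assumptions. Only `torch.ops.load_library` and `torch.classes.load_library` come from the commit message.

```python
import importlib
from pathlib import Path


def find_shared_library(directory, stem="libtorchaudio"):
    """Return the path of `<stem>.<ext>` inside `directory`, or None if absent."""
    # Assumed set of shared-library extensions across platforms.
    for suffix in (".so", ".pyd", ".dylib"):
        candidate = Path(directory) / f"{stem}{suffix}"
        if candidate.exists():
            return candidate
    return None


def load_extension(package_dir):
    """Load libtorchaudio explicitly when it sits next to the package."""
    import torch  # imported lazily; only needed when actually loading

    path = find_shared_library(package_dir)
    if path is not None:
        # Regular pip/conda install: register ops and classes manually so
        # that `_torchaudio` can resolve the symbols it depends on.
        torch.ops.load_library(str(path))
        torch.classes.load_library(str(path))
    # PEX-style deployment: no file found here, so the import relies on the
    # dynamic loader finding libtorchaudio in a library search path.
    return importlib.import_module("torchaudio._torchaudio")
```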
  23. 13 Sep, 2021 1 commit
  24. 30 Aug, 2021 1 commit
  25. 26 Aug, 2021 1 commit
    • Default to BUILD_SOX=1 in non-Windows systems (#1725) · 89ea6955
      moto authored
      * Default to BUILD_SOX=1 in non-Windows systems
      
      Since the adoption of CMake and the restriction to static linking of libsox,
      the build process has become much more robust with libsox integration enabled.

      This commit makes building the libsox integration the default behavior on non-Windows systems.
      The build process still checks the BUILD_SOX env var, so setting `BUILD_SOX=0` disables it.
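
The default-plus-override behavior described above could look roughly like this. It is a sketch under assumptions, not the code in torchaudio's build scripts: the helper names and the accepted truthy/falsy spellings are made up for illustration. Only the `BUILD_SOX` variable and the non-Windows default come from the commit message.

```python
import os
import platform

# Assumed spellings; the real build scripts may accept a different set.
_TRUTHY = {"1", "ON", "YES", "TRUE", "Y"}
_FALSY = {"0", "OFF", "NO", "FALSE", "N"}


def default_build_sox(system=None):
    """libsox integration defaults to enabled everywhere except Windows."""
    return (system or platform.system()) != "Windows"


def get_build_flag(name, default, environ=None):
    """Read a boolean build switch from the environment, else use `default`."""
    env = os.environ if environ is None else environ
    if name not in env:
        return default
    value = env[name].upper()
    if value in _TRUTHY:
        return True
    if value in _FALSY:
        return False
    raise ValueError(f"Unexpected value for {name}: {env[name]!r}")


# The env var still wins over the platform default.
build_sox = get_build_flag("BUILD_SOX", default_build_sox())
```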
  26. 19 Aug, 2021 1 commit
  27. 28 Jun, 2021 1 commit
  28. 06 May, 2021 1 commit
  29. 02 Apr, 2021 1 commit
  30. 09 Feb, 2021 1 commit
  31. 08 Feb, 2021 2 commits
  32. 04 Feb, 2021 1 commit