1. 09 Oct, 2023 1 commit
  2. 19 Sep, 2023 1 commit
  3. 30 Aug, 2023 1 commit
  4. 19 Aug, 2023 1 commit
  5. 12 Jul, 2023 1 commit
    • Support multiple FFmpeg versions (#3464) · 786066b4
      moto authored
      Summary:
      This commit introduces support for multiple FFmpeg versions for OSS binary distributions.
      
      Currently, torchaudio only works with FFmpeg 4, which is inconvenient at every stage from installation to runtime linking.
      This commit allows torchaudio to pick FFmpeg 4, 5, or 6 at runtime, instead of looking only for v4.
      
      To do this, we compile the FFmpeg extension three times, once against each FFmpeg version, and ship all three builds.
      At runtime, we look for a specific version of libavutil and, when one is found, load the corresponding FFmpeg extension.
      The order of preference is 6, 5, then 4.
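The runtime lookup described above can be sketched roughly as follows. This is an illustrative sketch, not the code from the commit: the function names and the `can_load` hook are hypothetical, and it assumes the usual pairing of FFmpeg 4, 5, and 6 with libavutil major versions 56, 57, and 58 and the usual per-platform shared-library file names.

```python
import ctypes
from typing import Callable, Optional

# Assumption: libavutil major version shipped with each FFmpeg major release.
LIBAVUTIL_MAJOR = {6: 58, 5: 57, 4: 56}


def libavutil_filename(avutil_major: int, system: str) -> str:
    """Build the platform-specific file name of a versioned libavutil."""
    if system == "Darwin":
        return f"libavutil.{avutil_major}.dylib"
    if system == "Windows":
        return f"avutil-{avutil_major}.dll"
    return f"libavutil.so.{avutil_major}"


def _can_load(filename: str) -> bool:
    """Default probe: ask the dynamic loader whether the library resolves."""
    try:
        ctypes.CDLL(filename)
    except OSError:
        return False
    return True


def pick_ffmpeg_version(
    system: str, can_load: Callable[[str], bool] = _can_load
) -> Optional[int]:
    """Return the first FFmpeg major version (6, 5, then 4) whose libavutil is found."""
    for ffmpeg_major in (6, 5, 4):
        if can_load(libavutil_filename(LIBAVUTIL_MAJOR[ffmpeg_major], system)):
            return ffmpeg_major
    return None
```

A caller would then import whichever extension build matches the returned version, or disable FFmpeg features when the result is `None`.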
      
      To make the build process simple and reproducible, we use pre-built FFmpeg binaries during the build.
      They are LGPL builds, downloaded from S3 at build time instead of being compiled on every build.
      
      Using pre-built binaries as scaffolding limits the systems that can build torchaudio, so this commit also introduces
      a single-FFmpeg-version support mode. Setting FFMPEG_ROOT during the build changes the way the binaries are built
      so that they support only one specific version of FFmpeg.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3464
      
      Differential Revision: D47300223
      
      Pulled By: mthrok
      
      fbshipit-source-id: 560c7968315e4c8922afa11a4693f648c0356d04
  6. 07 Jul, 2023 1 commit
    • Use pre-built binaries for ffmpeg extension (#3460) · f77c3e5b
      moto authored
      Summary:
      This commit changes the way the FFmpeg extension is built.
      
      Originally, the build process expected the FFmpeg binaries to be somehow available in the build environment.
      This made the build process unpredictable and prevented enabling the FFmpeg extension by default.
      
      The proposed change uses pre-built FFmpeg binaries as a build-time-only scaffold. The binaries are built in our CI job https://github.com/pytorch/audio/actions/workflows/ffmpeg.yml.
      
      This makes the build process more predictable and removes the need to build FFmpeg in our CI.
      Currently, it supports macOS (arm64, x86_64), unix (x86_64, aarch64) and windows (amd64).
      The downside is that the build no longer works on architectures not listed above.
      We could potentially work around this by searching for FFmpeg binaries available in the system (the old way) on
      these systems, but since they are not supported by PyTorch, the priority is low.
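As a rough illustration of the platform restriction above, a build script could check whether a pre-built FFmpeg scaffold exists for the current machine before attempting the build. This is a hypothetical sketch: the table and function name are not from the commit, and mapping "unix" to the `Linux` value reported by `platform.system()` is an assumption.

```python
import platform

# Platforms listed in the commit message. Treating "unix" as Linux is an assumption.
PREBUILT_TARGETS = {
    ("darwin", "arm64"),
    ("darwin", "x86_64"),
    ("linux", "x86_64"),
    ("linux", "aarch64"),
    ("windows", "amd64"),
}


def has_prebuilt_ffmpeg(system=None, machine=None):
    """Return True if a pre-built FFmpeg scaffold is listed for this platform."""
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    # Normalize case, since Windows reports "AMD64" while the list uses "amd64".
    return (system.lower(), machine.lower()) in PREBUILT_TARGETS
```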
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3460
      
      Differential Revision: D47261885
      
      Pulled By: mthrok
      
      fbshipit-source-id: 223a15e95c9140c95688af968beb35ff40354476
  7. 05 Jul, 2023 1 commit
  8. 02 Jun, 2023 1 commit
    • [BC-Breaking] Remove compute_kaldi_pitch (#3368) · 5bbbb1d5
      moto authored
      Summary:
      This commit removes the compute_kaldi_pitch function and the underlying Kaldi integration from torchaudio.
      
      The Kaldi pitch function was added in a short period of time by integrating the original Kaldi implementation, instead of reimplementing it in PyTorch.
      
      The Kaldi integration employed a hack that replaces Kaldi's base vector/matrix implementation with PyTorch Tensor, so that there is only one BLAS library within torchaudio.
      
      Recently, we have been making torchaudio leaner, and we have not seen wide adoption of the kaldi_pitch feature, so we decided to remove both.
      
      See some of the discussion at https://github.com/pytorch/audio/issues/1269
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3368
      
      Differential Revision: D46406176
      
      Pulled By: mthrok
      
      fbshipit-source-id: ee5e24d825188f379979ddccd680c7323b119b1e
  9. 20 May, 2023 1 commit
  10. 28 Apr, 2023 1 commit
    • Add cuctc decoder (#3096) · 0a1801ed
      Yuekai Zhang authored
      Summary:
      This PR implements a CUDA-based CTC prefix beam search decoder.
      
      Several benchmark results obtained on a V100 are attached below:
      | decoder type | model | dataset | decoding time (secs) | beam size | batch size | model unit | subsampling times | vocab size |
      |--------------|-------|---------|----------------------|-----------|------------|------------|-------------------|------------|
      | cuctc | conformer nemo | dev clean | 7.68 | 8 | 32 | bpe | 4 | 1000 |
      | cuctc | conformer nemo | dev clean (sort by length) | 1.6 | 8 | 32 | bpe | 4 | 1000 |
      | cuctc | wav2vec2.0 torchaudio | dev clean | 22 | 10 | 1 | char | 2 | 29 |
      | cuctc | conformer espnet | aishell1 test | 5 | 10 | 24 | char | 4 | 4233 |
      
      Note:
      1. The design parallelizes computation across the batch and vocab axes and loops sequentially over the frames axis. Compared with CPU implementations, it is therefore better suited to shorter sequence lengths and larger vocab sizes.
      2. WER is the same as with CPU implementations. However, it cannot decode with an LM yet.
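For readers unfamiliar with the algorithm, the following is a minimal CPU reference sketch of standard CTC prefix beam search without a language model. It is not the CUDA implementation from this PR, and the function names are illustrative. The outer loop over frames is the sequential part mentioned in note 1, while the inner work over prefixes and tokens is the kind of work a GPU decoder can spread across the batch and vocab axes.

```python
import math
from collections import defaultdict

NEG_INF = -math.inf


def logsumexp(*values):
    """Numerically stable log(sum(exp(v))) that tolerates -inf inputs."""
    top = max(values)
    if top == NEG_INF:
        return NEG_INF
    return top + math.log(sum(math.exp(v - top) for v in values))


def ctc_prefix_beam_search(log_probs, beam_size, blank=0):
    """log_probs: list of frames, each a list of per-token log-probabilities.

    Returns (prefix, log-probability) pairs, best first.
    """
    # Each prefix keeps two scores: paths ending in blank, paths ending in non-blank.
    beams = {(): (0.0, NEG_INF)}
    for frame in log_probs:  # sequential over frames
        next_beams = defaultdict(lambda: (NEG_INF, NEG_INF))
        for prefix, (p_blank, p_non_blank) in beams.items():
            for token, lp in enumerate(frame):
                if token == blank:
                    # Blank keeps the prefix unchanged.
                    b, nb = next_beams[prefix]
                    next_beams[prefix] = (
                        logsumexp(b, p_blank + lp, p_non_blank + lp),
                        nb,
                    )
                    continue
                extended = prefix + (token,)
                b, nb = next_beams[extended]
                if prefix and prefix[-1] == token:
                    # A repeated token extends the prefix only after a blank ...
                    next_beams[extended] = (b, logsumexp(nb, p_blank + lp))
                    # ... otherwise it collapses into the same prefix.
                    same_b, same_nb = next_beams[prefix]
                    next_beams[prefix] = (
                        same_b,
                        logsumexp(same_nb, p_non_blank + lp),
                    )
                else:
                    next_beams[extended] = (
                        b,
                        logsumexp(nb, p_blank + lp, p_non_blank + lp),
                    )
        # Keep only the best `beam_size` prefixes.
        ranked = sorted(
            next_beams.items(), key=lambda kv: logsumexp(*kv[1]), reverse=True
        )
        beams = dict(ranked[:beam_size])
    return [(list(prefix), logsumexp(*score)) for prefix, score in beams.items()]
```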
      
      Resolves: https://github.com/pytorch/audio/issues/2957.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3096
      
      Reviewed By: nateanl
      
      Differential Revision: D44709397
      
      Pulled By: mthrok
      
      fbshipit-source-id: 3078c54a2b44dc00eb4a81b4c657487eeff8c155
  11. 05 Apr, 2023 1 commit
  12. 14 Feb, 2023 1 commit
  13. 09 Feb, 2023 1 commit
    • Follow-up fix policy set (#3046) · 70acff7a
      moto authored
      Summary:
      Commit b4c66d1f broke all the CI jobs.
      The new policy changes the timestamps of the configuration files of third-party libraries,
      which triggers a re-configuration that requires extra tools.
      
      This commit fixes it by reverting to the old behavior.
      It also adds a guard for older CMake versions.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3046
      
      Reviewed By: atalman
      
      Differential Revision: D43133536
      
      Pulled By: mthrok
      
      fbshipit-source-id: 357055c8c1b53e593b8b7880f2045e13512c7a8f
  14. 08 Feb, 2023 1 commit
    • Suppress warning about archive timestamp (#3044) · b4c66d1f
      moto authored
      Summary:
      Currently, the warning below is shown for each third-party library checked out with ExternalProject_Add.
      
      This commit sets the policy so that the warning is no longer shown.
      
      ```
      CMake Warning (dev) at ci_env/lib/python3.10/site-packages/cmake/data/share/cmake-3.25/Modules/ExternalProject.cmake:3075 (message):
        The DOWNLOAD_EXTRACT_TIMESTAMP option was not given and policy CMP0135 is
        not set.  The policy's OLD behavior will be used.  When using a URL
        download, the timestamps of extracted files should preferably be that of
        the time of extraction, otherwise code that depends on the extracted
        contents might not be rebuilt if the URL changes.  The OLD behavior
        preserves the timestamps from the archive instead, but this is usually not
        what you want.  Update your project to the NEW behavior or specify the
        DOWNLOAD_EXTRACT_TIMESTAMP option with a value of true to avoid this
        robustness issue.
      ```
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3044
      
      Reviewed By: xiaohui-zhang
      
      Differential Revision: D43110818
      
      Pulled By: mthrok
      
      fbshipit-source-id: d2e20c9fdbbeeedb5ad546fe32dbda28c5bdd431
  15. 12 Jan, 2023 1 commit
  16. 04 Jan, 2023 1 commit
  17. 29 Dec, 2022 1 commit
  18. 29 Jul, 2022 1 commit
  19. 28 Jul, 2022 1 commit
  20. 02 Jun, 2022 1 commit
    • Remove mad (#2428) · d2ecba98
      moto authored
      Summary:
      Remove the code related to libmad, which had been disabled in https://github.com/pytorch/audio/issues/2354
      
      In https://github.com/pytorch/audio/issues/2419, we moved MP3 decoding to ffmpeg, but CI tests were still using libmad.
      This commit completely removes libmad from torchaudio.
      
      This is a BC-breaking change, as the `apply_sox_effects_file` function can no longer handle MP3 and cannot fall back to ffmpeg.
      The workaround is to use `torchaudio.load` and then `apply_sox_effects_tensor`.
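A minimal sketch of that workaround is shown below. It assumes a torchaudio version that still provides `torchaudio.load` and `torchaudio.sox_effects.apply_effects_tensor` (the public name of the tensor-based effects function); the wrapper name `apply_effects_to_mp3` is hypothetical.

```python
def apply_effects_to_mp3(path, effects):
    """Decode an MP3 with torchaudio.load, then apply sox effects to the tensor.

    `effects` uses the sox_effects format, e.g. [["rate", "16000"]].
    """
    # Imported lazily so that defining this helper does not require torchaudio.
    import torchaudio

    # Decoding goes through torchaudio.load instead of the sox file-based path.
    waveform, sample_rate = torchaudio.load(path)
    # Returns a (waveform, sample_rate) pair after the effects are applied.
    return torchaudio.sox_effects.apply_effects_tensor(waveform, sample_rate, effects)
```

For example, `apply_effects_to_mp3("speech.mp3", [["rate", "16000"]])` would resample a file after decoding it, provided libsox support is available in the installed build.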
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2428
      
      Reviewed By: carolineechen
      
      Differential Revision: D36851805
      
      Pulled By: mthrok
      
      fbshipit-source-id: f98795c59a1ac61cef511f2bbeac37f7c3c69d55
  21. 13 May, 2022 1 commit
    • Move Streamer API out of prototype (#2378) · 72b712a1
      moto authored
      Summary:
      This commit moves the Streaming API out of the prototype module.
      
      * The related classes are renamed as follows:
      
        - `Streamer` -> `StreamReader`.
        - `SourceStream` -> `StreamReaderSourceStream`
        - `SourceAudioStream` -> `StreamReaderSourceAudioStream`
        - `SourceVideoStream` -> `StreamReaderSourceVideoStream`
        - `OutputStream` -> `StreamReaderOutputStream`
      
      This change is a preemptive measure in case a `StreamWriter` API
      is added later.
      
      * Replace the BUILD_FFMPEG build arg with USE_FFMPEG
      
      We are not building FFmpeg, so USE_FFMPEG is the more appropriate name.
      
       ---
      
      After https://github.com/pytorch/audio/issues/2377
      
      Remaining TODOs: (different PRs)
      - [ ] Introduce `is_ffmpeg_binding_available` function.
      - [ ] Refactor C++ code:
         - Rename `Streamer` to `StreamReader`.
         - Rename `streamer.[h|cpp]` to `stream_reader.[h|cpp]`.
         - Rename `prototype.cpp` to `stream_reader_binding.cpp`.
         - Introduce `stream_reader` directory.
      - [x] Enable FFmpeg in smoke test (https://github.com/pytorch/audio/issues/2381)
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2378
      
      Reviewed By: carolineechen
      
      Differential Revision: D36359299
      
      Pulled By: mthrok
      
      fbshipit-source-id: 6a57b702996af871e577fb7addbf3522081c1328
  22. 28 Apr, 2022 1 commit
  23. 05 Jan, 2022 1 commit
    • Update ffmpeg discovery logic (#2124) · d8a65450
      moto authored
      Summary:
      Update ffmpeg discovery logic
      
      Previously, the build process used pkg-config to locate
      an installation of ffmpeg, which does not work well on Windows/CentOS.
      
      This commit updates the discovery process to use a custom
      FindFFMPEG.cmake adopted from the Kitware/VTK repository, with the addition of
      conda environment support.
      
      The custom discovery logic can support Windows and CentOS.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2124
      
      Reviewed By: carolineechen
      
      Differential Revision: D33429564
      
      Pulled By: mthrok
      
      fbshipit-source-id: 6cb50c1d8c58f51e0f3f3af5c5b541aa3a699bba
  24. 30 Dec, 2021 1 commit
    • Add a switch to build ffmpeg binding (#2048) · ece03edc
      moto authored
      Summary:
      This PR adds a `BUILD_FFMPEG` switch to the torchaudio build process so that features related to ffmpeg can be built.
      The flag is false by default, so no CI jobs or development flows are affected.
      
      It is off by default because handling the dependencies around ffmpeg is a bit tricky.
      Currently, the CMake file uses `pkg-config` to find an ffmpeg installation in the system.
      This works fine for both conda-based installation and system-managed installation (like `apt`).
      
      In subsequent PRs, I will find a solution that works for local development and binary distributions.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2048
      
      Reviewed By: hwangjeff, nateanl
      
      Differential Revision: D33367260
      
      Pulled By: mthrok
      
      fbshipit-source-id: 94517acecb62bd6d4e96d4b7cbc3ab3c2a25706c
  25. 18 Dec, 2021 1 commit
  26. 17 Dec, 2021 1 commit
    • Add static build of KenLM (#2076) · adc559a8
      moto authored
      Summary:
      Add KenLM and the dependencies required for its static build (`zlib`, `bzip2`, `lzma` and `boost-thread`).
      
      KenLM and its dependencies are built, but since no corresponding code on the torchaudio side is changed, the resulting torchaudio extension module is unchanged. (Therefore, as long as the build process passes on CI, this PR should be good to go.)
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2076
      
      Reviewed By: carolineechen
      
      Differential Revision: D33189980
      
      Pulled By: mthrok
      
      fbshipit-source-id: 6096113128b939f3cf70990c99aacc4aaa954584
  27. 06 Oct, 2021 1 commit
  28. 29 Sep, 2021 1 commit
  29. 24 Sep, 2021 1 commit
  30. 16 Sep, 2021 1 commit
    • Split extension into custom impl and Python wrapper libraries (#1752) · 0f822179
      moto authored
      * Split `libtorchaudio` and `_torchaudio`
      
      This change extracts the core implementation from `_torchaudio` into `libtorchaudio`,
      so that `libtorchaudio` is reusable in TorchScript-based apps.
      
      `_torchaudio` is a wrapper around `libtorchaudio` and only provides PyBind11-based
      features (currently, file-like object support in I/O).
      
      * Removed `BUILD_LIBTORCHAUDIO` option
      
      When invoking `cmake`, `libtorchaudio` is always built, so this option is removed.
      
      The new assumptions around library discoverability are as follows:
      
      - In the regular OSS workflow (`pip`/`conda`-based binary installation), both `libtorchaudio` and `_torchaudio` are present.
          In this case, `libtorchaudio` has to be loaded manually with `torch.ops.load_library` and/or `torch.classes.load_library`; otherwise, importing `_torchaudio` would fail to resolve the symbols defined in `libtorchaudio`.
      - When `torchaudio` is deployed in PEX format (single zip file):
        - We expect that `libtorchaudio.so` exists as a file in some search path configured by client code.
        - `_torchaudio` is still importable. Because we do not know where `libtorchaudio` will exist, we let the dynamic loader resolve the dependency from `_torchaudio` to `libtorchaudio`, which should work as long as `libtorchaudio` is in a library search path (the search path cannot be modified from an already-running Python process).
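The two cases above could be handled along the following lines. This is a hedged sketch rather than torchaudio's actual loader: the function name, the `lib/` subdirectory layout, and the `load_fns` hook are assumptions made for illustration. Only `torch.ops.load_library` and `torch.classes.load_library` come from the commit message.

```python
from pathlib import Path


def load_core_library(package_dir, load_fns=None, lib_name="libtorchaudio.so"):
    """Load libtorchaudio explicitly when it ships next to the Python package.

    Returns True if the library file was found and loaded, and False if the
    caller should rely on the dynamic loader instead (the PEX case).
    """
    lib_path = Path(package_dir) / "lib" / lib_name  # hypothetical layout
    if not lib_path.exists():
        # PEX-style deployment: leave symbol resolution to the dynamic loader.
        return False
    if load_fns is None:
        # Imported lazily so that the sketch can be exercised without torch.
        import torch

        load_fns = (torch.ops.load_library, torch.classes.load_library)
    for load in load_fns:
        load(str(lib_path))
    return True
```

In the regular OSS case this would run before `_torchaudio` is imported, so that its symbols are already resolvable.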
  31. 13 Sep, 2021 1 commit
  32. 30 Aug, 2021 1 commit
  33. 26 Aug, 2021 1 commit
    • Default to BUILD_SOX=1 in non-Windows systems (#1725) · 89ea6955
      moto authored
      * Default to BUILD_SOX=1 in non-Windows systems
      
      Since the adoption of CMake and the restriction to static linking of libsox,
      the build process has become much more robust with libsox integration enabled.
      
      This commit makes building the libsox integration the default behavior on non-Windows systems.
      The build process still checks the BUILD_SOX env var, so setting `BUILD_SOX=0` disables it.
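A platform-dependent default with an environment override might look like the sketch below. The helper names and the set of accepted truthy and falsy strings are assumptions for illustration, not the code from this commit.

```python
import os
import platform

_TRUE = {"1", "true", "on", "yes"}
_FALSE = {"0", "false", "off", "no"}


def get_build_flag(name, default, environ=None):
    """Read a boolean build switch from the environment, falling back to `default`."""
    environ = os.environ if environ is None else environ
    if name not in environ:
        return default
    value = environ[name].strip().lower()
    if value in _TRUE:
        return True
    if value in _FALSE:
        return False
    raise ValueError(f"Unexpected value for {name}: {environ[name]!r}")


def build_sox_enabled(system=None, environ=None):
    """BUILD_SOX defaults to on everywhere except Windows, per the commit message."""
    system = platform.system() if system is None else system
    return get_build_flag("BUILD_SOX", default=(system != "Windows"), environ=environ)
```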
  34. 19 Aug, 2021 1 commit
  35. 28 Jun, 2021 1 commit
  36. 06 May, 2021 1 commit
  37. 02 Apr, 2021 1 commit
  38. 09 Feb, 2021 1 commit
  39. 08 Feb, 2021 2 commits