Commits · 8f2f74c38f10c148ad8edaafa08a4aecf7be472e · OpenDAS / Torchaudio

10 Jun, 2025 1 commit
- add --gpu-max-threads-per-block · 8f2f74c3
  zhanggzh authored Jun 10, 2025
  
12 Oct, 2023 1 commit

Move libtorchaudio to dedicated directory · e65e4726

moto-meta authored Oct 11, 2023

Differential Revision: D50086556

Pull Request resolved: https://github.com/pytorch/audio/pull/3648


11 Oct, 2023 1 commit

Move libtorchaudio_ffmpeg to dedicated directory · 2836a23d

moto-meta authored Oct 11, 2023

Differential Revision: D50082877

Pull Request resolved: https://github.com/pytorch/audio/pull/3646


09 Oct, 2023 1 commit

Migrate to src-layout · ec13a815

moto-meta authored Oct 09, 2023

Differential Revision: D49965263

Pull Request resolved: https://github.com/pytorch/audio/pull/3639


19 Sep, 2023 1 commit
- Add wall implementation for RIR ray tracing (#3612) · 94aafd83
  moto authored Sep 19, 2023
```
Extracted from #3604

Add Wall helper class and C++ unit test
```
30 Aug, 2023 1 commit

Revert "Enable ROCm RNN-T Loss (#2485)" (#3586) · 5cf7d2db

atalman authored Aug 30, 2023

Summary:
This reverts commit c5939616.

Unblock 2.1.0 rc

Pull Request resolved: https://github.com/pytorch/audio/pull/3586

Reviewed By: osalpekar

Differential Revision: D48842032

Pulled By: atalman

fbshipit-source-id: bbdf9e45c9aa5fde00f315a2ff491ed050bc1707


19 Aug, 2023 1 commit

Enable ROCm RNN-T Loss (#2485) · c5939616

Juan Villamizar authored Aug 18, 2023

Summary:
Added HIPIFY code and small changes for ROCm. Targeting RNN-T loss.

Pull Request resolved: https://github.com/pytorch/audio/pull/2485

Reviewed By: huangruizhe

Differential Revision: D43537864

Pulled By: mthrok

fbshipit-source-id: 4bdb1f291dc51a12232ccd072b97ae94ae20cc0c


12 Jul, 2023 1 commit

Support multiple FFmpeg versions (#3464) · 786066b4

moto authored Jul 11, 2023

Summary:
This commit introduces support for multiple FFmpeg versions for OSS binary distributions.

Currently torchaudio only works with FFmpeg 4, which is inconvenient at every stage from installation to runtime linking.
This commit allows picking FFmpeg 4, 5 or 6 at runtime, instead of only looking for v4.

It works by compiling the FFmpeg extension three times, once against each FFmpeg version, and shipping all three.
At runtime, we look for a specific version of libavutil and, when one is found, load the corresponding FFmpeg extension.
The order of preference is 6, 5, then 4.

To keep the build process simple and reproducible, we use pre-built FFmpeg binaries during the build.
They are LGPL builds, downloaded from S3 at build time instead of being compiled every time.

Using pre-built binaries as scaffolding limits which systems can build torchaudio, so this commit also introduces
a single-FFmpeg-version support mode. Setting FFMPEG_ROOT during the build changes the way the binaries are built
so that only one specific version of FFmpeg is supported.
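
The runtime selection described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the PR: `pick_ffmpeg_version` and the `has_libavutil` probe are hypothetical names, and the table uses the libavutil major versions that ship with FFmpeg 4, 5 and 6 (56, 57 and 58).

```python
# libavutil major version shipped with each FFmpeg major release.
LIBAVUTIL_MAJOR = {6: 58, 5: 57, 4: 56}


def pick_ffmpeg_version(has_libavutil, preference=(6, 5, 4)):
    """Return the first FFmpeg major version whose libavutil is found.

    `has_libavutil` is a hypothetical probe: it takes a libavutil major
    version and returns True when that library can be loaded.
    """
    for ffmpeg_major in preference:
        if has_libavutil(LIBAVUTIL_MAJOR[ffmpeg_major]):
            return ffmpeg_major
    return None  # no supported FFmpeg found


# Example: a system where only FFmpeg 4 and 5 libraries are installed.
installed = {56, 57}
selected = pick_ffmpeg_version(lambda major: major in installed)
```

A real probe might try `ctypes.CDLL(f"libavutil.so.{major}")` on Linux and catch `OSError`; library file names differ on macOS and Windows.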

Pull Request resolved: https://github.com/pytorch/audio/pull/3464

Differential Revision: D47300223

Pulled By: mthrok

fbshipit-source-id: 560c7968315e4c8922afa11a4693f648c0356d04


07 Jul, 2023 1 commit

Use pre-built binaries for ffmpeg extension (#3460) · f77c3e5b

moto authored Jul 07, 2023

Summary:
This commit changes the way the FFmpeg extension is built.

Originally, the build process expected the FFmpeg binaries to be somehow available in the build environment.
This made the build process unpredictable and prevented enabling the FFmpeg extension by default.

The proposed change uses pre-built FFmpeg binaries as a build-time-only scaffold. They are built in our CI job https://github.com/pytorch/audio/actions/workflows/ffmpeg.yml.

This makes the build process more predictable and removes the need to build FFmpeg in our CI.
Currently, it supports macOS (arm64, x86_64), unix (x86_64, aarch64) and windows (amd64).
The downside is that it no longer works on architectures not listed above.
We could potentially work around this by searching for FFmpeg binaries available on the system (the old way) on
these systems, but since they are not supported by PyTorch, the priority is low.

Pull Request resolved: https://github.com/pytorch/audio/pull/3460

Differential Revision: D47261885

Pulled By: mthrok

fbshipit-source-id: 223a15e95c9140c95688af968beb35ff40354476


05 Jul, 2023 1 commit

Untangle third party inclusion in CMake (#3457) · c34a1d6d

moto authored Jul 05, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3457

Differential Revision: D47241343

Pulled By: mthrok

fbshipit-source-id: fd1bfd1531397cb59e9cf11de9dede6949f8517e


02 Jun, 2023 1 commit

[BC-Breaking] Remove compute_kaldi_pitch (#3368) · 5bbbb1d5

moto authored Jun 02, 2023

Summary:
This commit removes the `compute_kaldi_pitch` function and the underlying Kaldi integration from torchaudio.

The Kaldi pitch function was added quickly by integrating the original Kaldi implementation, instead of reimplementing it in PyTorch.

The Kaldi integration employed a hack that replaces Kaldi's base vector/matrix implementation with PyTorch Tensor, so that there is only one BLAS library within torchaudio.

We have recently been making torchaudio leaner, and we have not seen wide adoption of the kaldi_pitch feature, so we decided to remove it.

See some of the discussion at https://github.com/pytorch/audio/issues/1269

Pull Request resolved: https://github.com/pytorch/audio/pull/3368

Differential Revision: D46406176

Pulled By: mthrok

fbshipit-source-id: ee5e24d825188f379979ddccd680c7323b119b1e


20 May, 2023 1 commit

[audio][PR] Add forced_align function to torchaudio (#3348) · e7935cff

Zhaoheng Ni authored May 19, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3348

The pull request adds a CTC-based forced alignment function that supports both CPU and CUDA devices. The function takes the CTC emissions and target labels as inputs and generates the corresponding label for each frame.
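
To illustrate the idea, here is a minimal pure-Python sketch of CTC forced alignment using the standard Viterbi recursion over the blank-extended target sequence. It is not the CPU/CUDA implementation added in this PR; the function name `ctc_forced_align` and its list-based inputs are assumptions made for the example.

```python
import math


def ctc_forced_align(log_probs, targets, blank=0):
    """Return one label per frame for the best CTC path that emits `targets`.

    log_probs: list of T frames, each a list of per-class log-probabilities.
    targets:   list of target label indices (must not contain `blank`).
    """
    num_frames = len(log_probs)
    # Extended sequence: blank, t1, blank, t2, ..., blank
    ext = [blank]
    for label in targets:
        ext += [label, blank]
    num_states = len(ext)

    neg_inf = -math.inf
    score = [[neg_inf] * num_states for _ in range(num_frames)]
    back = [[0] * num_states for _ in range(num_frames)]

    # A path may start in the leading blank or in the first label.
    score[0][0] = log_probs[0][ext[0]]
    if num_states > 1:
        score[0][1] = log_probs[0][ext[1]]

    for t in range(1, num_frames):
        for s in range(num_states):
            candidates = [(score[t - 1][s], s)]  # stay in the same state
            if s >= 1:
                candidates.append((score[t - 1][s - 1], s - 1))  # advance
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                candidates.append((score[t - 1][s - 2], s - 2))
            best, prev = max(candidates)
            score[t][s] = best + log_probs[t][ext[s]]
            back[t][s] = prev

    # A path must end in the final blank or the final label.
    s = num_states - 1
    if num_states > 1 and score[-1][num_states - 2] > score[-1][s]:
        s = num_states - 2
    if score[-1][s] == neg_inf:
        raise ValueError("not enough frames to align the targets")

    path = [blank] * num_frames
    for t in range(num_frames - 1, -1, -1):
        path[t] = ext[s]
        s = back[t][s]
    return path


# Three frames, classes: 0 = blank, 1 and 2 = labels.
emission = [
    [math.log(0.1), math.log(0.8), math.log(0.1)],
    [math.log(0.8), math.log(0.1), math.log(0.1)],
    [math.log(0.1), math.log(0.1), math.log(0.8)],
]
alignment = ctc_forced_align(emission, targets=[1, 2])
```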

Reviewed By: vineelpratap, xiaohui-zhang

Differential Revision: D45867265

fbshipit-source-id: 3e25b06bf9bc8bb1bdcdc08de7f4434d912154cb


28 Apr, 2023 1 commit

Add cuctc decoder (#3096) · 0a1801ed

Yuekai Zhang authored Apr 28, 2023

Summary:
This PR implements a CUDA-based CTC prefix beam search decoder.

Several benchmark results on a V100 are attached below:

| decoder type | model | dataset | decoding time (s) | beam size | batch size | model unit | subsampling factor | vocab size |
|---|---|---|---|---|---|---|---|---|
| cuctc | conformer nemo | dev clean | 7.68 | 8 | 32 | bpe | 4 | 1000 |
| cuctc | conformer nemo | dev clean (sort by length) | 1.6 | 8 | 32 | bpe | 4 | 1000 |
| cuctc | wav2vec2.0 torchaudio | dev clean | 22 | 10 | 1 | char | 2 | 29 |
| cuctc | conformer espnet | aishell1 test | 5 | 10 | 24 | char | 4 | 4233 |

Note:
1. The design parallelizes computation across the batch and vocab axes and loops over the frames axis, so it is better suited to shorter sequences and larger vocabularies than the CPU implementations.
2. WER is the same as the CPU implementations. However, it cannot yet decode with an LM.

Resolves: https://github.com/pytorch/audio/issues/2957.

Pull Request resolved: https://github.com/pytorch/audio/pull/3096

Reviewed By: nateanl

Differential Revision: D44709397

Pulled By: mthrok

fbshipit-source-id: 3078c54a2b44dc00eb4a81b4c657487eeff8c155


05 Apr, 2023 1 commit

Remove source for flashlight-text bundle (#3236) · 5053aa7f

moto authored Apr 05, 2023

Summary:
Following https://github.com/pytorch/audio/pull/3232, the static build of flashlight-text has been disabled and removed from the nightly build.

This commit removes the related source/build from the torchaudio code base.

Pull Request resolved: https://github.com/pytorch/audio/pull/3236

Reviewed By: jacobkahn

Differential Revision: D44712539

Pulled By: mthrok

fbshipit-source-id: a201c89b5046f224526309cd4e17a5105e58a949


14 Feb, 2023 1 commit

Add simulate_rir_ism method for room impulse response simulation (#2880) · 8c5c9a9b

Zhaoheng Ni authored Feb 14, 2023

Summary:
Replicate of https://github.com/pytorch/audio/issues/2644

Pull Request resolved: https://github.com/pytorch/audio/pull/2880

Reviewed By: mthrok

Differential Revision: D41633911

Pulled By: nateanl

fbshipit-source-id: 73cf145d75c389e996aafe96571ab86dc21f86e5


09 Feb, 2023 1 commit

Follow-up fix policy set (#3046) · 70acff7a

moto authored Feb 09, 2023

Summary:
Commit b4c66d1f broke all the CIs.
The new policy changes the timestamps of the configuration files of third-party libraries,
which triggers a re-configuration that requires extra tools.

This commit fixes it by reverting to the old behavior.
It also adds a guard for older CMake versions.

Pull Request resolved: https://github.com/pytorch/audio/pull/3046

Reviewed By: atalman

Differential Revision: D43133536

Pulled By: mthrok

fbshipit-source-id: 357055c8c1b53e593b8b7880f2045e13512c7a8f


08 Feb, 2023 1 commit

Suppress warning about archive timestamp (#3044) · b4c66d1f

moto authored Feb 08, 2023

Summary:
Currently, for each third-party library checked out with ExternalProject_Add, the warning below is shown.

This commit sets the policy so that the warning is no longer shown.

```
CMake Warning (dev) at ci_env/lib/python3.10/site-packages/cmake/data/share/cmake-3.25/Modules/ExternalProject.cmake:3075 (message):
  The DOWNLOAD_EXTRACT_TIMESTAMP option was not given and policy CMP0135 is
  not set.  The policy's OLD behavior will be used.  When using a URL
  download, the timestamps of extracted files should preferably be that of
  the time of extraction, otherwise code that depends on the extracted
  contents might not be rebuilt if the URL changes.  The OLD behavior
  preserves the timestamps from the archive instead, but this is usually not
  what you want.  Update your project to the NEW behavior or specify the
  DOWNLOAD_EXTRACT_TIMESTAMP option with a value of true to avoid this
  robustness issue.
```

Pull Request resolved: https://github.com/pytorch/audio/pull/3044

Reviewed By: xiaohui-zhang

Differential Revision: D43110818

Pulled By: mthrok

fbshipit-source-id: d2e20c9fdbbeeedb5ad546fe32dbda28c5bdd431


12 Jan, 2023 1 commit

Update C++ standard to 17 (#2973) · d1cc1da6

moto authored Jan 11, 2023

Summary:
Following the change in PyTorch core.

https://github.com/pytorch/pytorch/commit/87e4a087784c805312a2b48bb063d2400df26c5e

Pull Request resolved: https://github.com/pytorch/audio/pull/2973

Reviewed By: xiaohui-zhang

Differential Revision: D42462709

Pulled By: mthrok

fbshipit-source-id: 60c2aa3d63fe25d8e0b7aa476404e7a55d6eb87f


04 Jan, 2023 1 commit

Use CCache if available (#2866) · d5b5aba6

moto authored Jan 04, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2866

Reviewed By: carolineechen

Differential Revision: D42349474

Pulled By: mthrok

fbshipit-source-id: 31455184031fff52719ef829e40bb1e09e11b0e7


29 Dec, 2022 1 commit

Refactor CMake modules (#2930) · 7b5317b3

moto authored Dec 29, 2022

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2930

Reviewed By: carolineechen, nateanl

Differential Revision: D42280966

Pulled By: mthrok

fbshipit-source-id: f9d5f1dc7c1a62d932fb2020aafb63734f2bf405


29 Jul, 2022 1 commit

Enable CTC decoder in Windows (#2587) · 67cb420d

moto authored Jul 29, 2022

Summary:
This commit enables CTC decoder on Windows.

The functionality seems to work fine:
the tests pass and the decoding tutorial runs.

The only difference from the Linux/macOS version is that
loading a model in XZ compression format is not supported.

![289961785_399620772041679_7768117002438616376_n](https://user-images.githubusercontent.com/855818/181420923-cfbd8402-20de-4e63-b9e4-e39f9aa9fc50.png)

Pull Request resolved: https://github.com/pytorch/audio/pull/2587

Reviewed By: carolineechen, nateanl

Differential Revision: D38276490

Pulled By: mthrok

fbshipit-source-id: f2203b2235c5bbb0220fe560aaaf0e1d5530347a


28 Jul, 2022 1 commit

Refactor cmake (#2585) · d84ce3b2

moto authored Jul 28, 2022

Summary:
Extract the helper functions for defining libraries and extensions so that they can be reused for building the flashlight library and binding in https://github.com/pytorch/audio/issues/2580.

Pull Request resolved: https://github.com/pytorch/audio/pull/2585

Reviewed By: carolineechen

Differential Revision: D38233407

Pulled By: mthrok

fbshipit-source-id: 96f7c62a8b70bb3ff5caede9730165d54a55272f


02 Jun, 2022 1 commit

Remove mad (#2428) · d2ecba98

moto authored Jun 02, 2022

Summary:
Remove the code related to libmad, which had been disabled in https://github.com/pytorch/audio/issues/2354.

In https://github.com/pytorch/audio/issues/2419, we moved MP3 decoding to ffmpeg, but CI tests were still using libmad.
This commit completely removes libmad from torchaudio.

This is a BC-breaking change: the `apply_sox_effects_file` function can no longer handle MP3, and it cannot fall back to ffmpeg.
The workaround is to use `torchaudio.load` and then `apply_sox_effects_tensor`.
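
A minimal sketch of that workaround is shown below. It assumes that the shorthand `apply_sox_effects_tensor` refers to `torchaudio.sox_effects.apply_effects_tensor`, and that the installed torchaudio is from around this release, with libsox support and a `torchaudio.load` that can decode MP3. The helper name `load_and_apply_effects` and the effect chain are made up for the example.

```python
# Example sox effect chain: each effect is a list of string arguments.
EFFECTS = [
    ["gain", "-n"],    # normalize the volume
    ["rate", "16000"], # resample to 16 kHz
]


def load_and_apply_effects(path, effects=EFFECTS):
    """Decode an audio file first, then apply sox effects to the tensor.

    Replaces a direct file-based effects call for formats (such as MP3)
    that the sox-based file path can no longer decode.
    """
    # Imported lazily so this sketch can be read without torchaudio installed.
    import torchaudio

    waveform, sample_rate = torchaudio.load(path)
    return torchaudio.sox_effects.apply_effects_tensor(
        waveform, sample_rate, effects
    )
```

Calling `load_and_apply_effects("speech.mp3")` (a hypothetical file) would return a `(waveform, sample_rate)` pair after the effects have been applied.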

Pull Request resolved: https://github.com/pytorch/audio/pull/2428

Reviewed By: carolineechen

Differential Revision: D36851805

Pulled By: mthrok

fbshipit-source-id: f98795c59a1ac61cef511f2bbeac37f7c3c69d55


13 May, 2022 1 commit

Move Streamer API out of prototype (#2378) · 72b712a1

moto authored May 13, 2022

Summary:
This commit moves the Streaming API out of the prototype module.

* The related classes are renamed as follows:

  - `Streamer` -> `StreamReader`
  - `SourceStream` -> `StreamReaderSourceStream`
  - `SourceAudioStream` -> `StreamReaderSourceAudioStream`
  - `SourceVideoStream` -> `StreamReaderSourceVideoStream`
  - `OutputStream` -> `StreamReaderOutputStream`

This change is a preemptive measure in case a
`StreamWriter` API is added later.

* Replace BUILD_FFMPEG build arg with USE_FFMPEG

We are not building FFmpeg, so USE_FFMPEG is more appropriate.
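
For code that still uses the prototype class names, the renames listed above could be applied mechanically. The following is a rough sketch only: `migrate_names` is a hypothetical helper, not part of torchaudio, and it rewrites whole identifiers in a source string without touching imports or module paths.

```python
import re

# Old prototype names mapped to the names introduced by this commit.
RENAMES = {
    "Streamer": "StreamReader",
    "SourceStream": "StreamReaderSourceStream",
    "SourceAudioStream": "StreamReaderSourceAudioStream",
    "SourceVideoStream": "StreamReaderSourceVideoStream",
    "OutputStream": "StreamReaderOutputStream",
}

# Match only whole identifiers, so already-migrated names are left alone.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RENAMES)) + r")\b")


def migrate_names(source):
    """Replace each old class name in `source` with its new name."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(1)], source)


migrated = migrate_names("reader = Streamer(src)")
```

Because the pattern requires word boundaries on both sides, running the helper twice gives the same result as running it once.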

 ---

After https://github.com/pytorch/audio/issues/2377

Remaining TODOs: (different PRs)
- [ ] Introduce `is_ffmpeg_binding_available` function.
- [ ] Refactor C++ code:
   - Rename `Streamer` to `StreamReader`.
   - Rename `streamer.[h|cpp]` to `stream_reader.[h|cpp]`.
   - Rename `prototype.cpp` to `stream_reader_binding.cpp`.
   - Introduce `stream_reader` directory.
- [x] Enable FFmpeg in smoke test (https://github.com/pytorch/audio/issues/2381)

Pull Request resolved: https://github.com/pytorch/audio/pull/2378

Reviewed By: carolineechen

Differential Revision: D36359299

Pulled By: mthrok

fbshipit-source-id: 6a57b702996af871e577fb7addbf3522081c1328


28 Apr, 2022 1 commit

Add BUILD_MAD option and default to OFF (#2354) · a71e3a40

moto authored Apr 28, 2022

Summary:
libmad integration should be enabled only in source builds.

Pull Request resolved: https://github.com/pytorch/audio/pull/2354

Reviewed By: nateanl

Differential Revision: D36012035

Pulled By: mthrok

fbshipit-source-id: adeda8cbfd418f96245909cae6862b648a6915a7


05 Jan, 2022 1 commit

Update ffmpeg discovery logic (#2124) · d8a65450

moto authored Jan 05, 2022

Summary:
Update ffmpeg discovery logic

Previously the build process used pkg-config to locate
an installation of ffmpeg, which does not work well on Windows/CentOS.

This commit updates the discovery process to use a custom
FindFFMPEG.cmake adopted from the Kitware/VTK repository, with the addition of
conda environment support.

The custom discovery logic can support Windows and CentOS.

Pull Request resolved: https://github.com/pytorch/audio/pull/2124

Reviewed By: carolineechen

Differential Revision: D33429564

Pulled By: mthrok

fbshipit-source-id: 6cb50c1d8c58f51e0f3f3af5c5b541aa3a699bba


30 Dec, 2021 1 commit

Add a switch to build ffmpeg binding (#2048) · ece03edc

moto authored Dec 30, 2021

Summary:
This PR adds `BUILD_FFMPEG` switch to torchaudio build process so that features related to ffmpeg are built.
The flag is false by default, so no CI jobs or development flow are affected.

This is because handling the dependencies around ffmpeg is a bit tricky.
Currently, the CMake file uses `pkg-config` to find an ffmpeg installation in the system.
This works fine for both conda-based installation and system-managed installation (like `apt`).

In subsequent PRs, I will find a solution that works for local development and binary distributions.

Pull Request resolved: https://github.com/pytorch/audio/pull/2048

Reviewed By: hwangjeff, nateanl

Differential Revision: D33367260

Pulled By: mthrok

fbshipit-source-id: 94517acecb62bd6d4e96d4b7cbc3ab3c2a25706c


18 Dec, 2021 1 commit

Add FL Decoder / KenLM integration to build process (#2078) · 246dd52a

moto authored Dec 18, 2021

Summary:
After all the C++ code from https://github.com/pytorch/audio/issues/2072 is added, this commit will enable the decoder/KenLM integration in the build process.

Pull Request resolved: https://github.com/pytorch/audio/pull/2078

Reviewed By: carolineechen

Differential Revision: D33198183

Pulled By: mthrok

fbshipit-source-id: 9d7fa76151d06fbbac3785183c7c2ff9862d3128


17 Dec, 2021 1 commit

Add static build of KenLM (#2076) · adc559a8

moto authored Dec 17, 2021

Summary:
Add KenLM and its dependencies required for static build (`zlib`, `bzip2`, `lzma` and `boost-thread`).

KenLM and its dependencies are built, but since no corresponding code on the torchaudio side is changed, the resulting torchaudio extension module is unchanged. (Therefore, as long as the build process passes on CI, this PR should be good to go.)

Pull Request resolved: https://github.com/pytorch/audio/pull/2076

Reviewed By: carolineechen

Differential Revision: D33189980

Pulled By: mthrok

fbshipit-source-id: 6096113128b939f3cf70990c99aacc4aaa954584


06 Oct, 2021 1 commit
- Add OpenMP support (#1761) · e3734fef
  moto authored Oct 06, 2021
  
29 Sep, 2021 1 commit
- [fbsync] Remove trailing whitespace (#1803) · b75e3bb9
  Caroline Chen authored Sep 29, 2021
  
24 Sep, 2021 1 commit
- Fix build on Windows with CUDA (#1787) · cf0adb28
  Yi Zhang authored Sep 24, 2021
```
This commit fixes the local build on Windows with CUDA.
```
16 Sep, 2021 1 commit

Split extension into custom impl and Python wrapper libraries (#1752) · 0f822179

moto authored Sep 16, 2021

* Split `libtorchaudio` and `_torchaudio`

This change extracts the core implementation from `_torchaudio` into `libtorchaudio`,
so that `libtorchaudio` is reusable in TorchScript-based apps.

`_torchaudio` is a wrapper around `libtorchaudio` and only provides PyBind11-based
features (currently file-like object support in I/O).

* Removed `BUILD_LIBTORCHAUDIO` option

When invoking `cmake`, `libtorchaudio` is always built, so this option is removed.

The new assumptions around library discoverability:

- In the regular OSS workflow (`pip`/`conda`-based binary installation), both `libtorchaudio` and `_torchaudio` are present.
    In this case, `libtorchaudio` has to be loaded manually with `torch.ops.load_library` and/or `torch.classes.load_library`; otherwise importing `_torchaudio` would not be able to resolve the symbols defined in `libtorchaudio`.
- When `torchaudio` is deployed in PEX format (single zip file):
  - We expect that `libtorchaudio.so` exists as a file in some search path configured by client code.
  - `_torchaudio` is still importable. Because we do not know where `libtorchaudio` will exist, we let the dynamic loader resolve the dependency from `_torchaudio` to `libtorchaudio`, which should work as long as `libtorchaudio` is in a library search path (the search path is not modifiable from an already-running Python process).
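
The two cases above can be sketched as follows. This is an illustration under assumptions rather than torchaudio's actual loader: `find_extension` and `load_libtorchaudio` are hypothetical helpers, and the list of file suffixes is a guess at what the library file might be called on each platform. Only `torch.ops.load_library` and `torch.classes.load_library` come from the commit message.

```python
from pathlib import Path

# Assumed file suffixes for the shared library on different platforms.
_SUFFIXES = (".so", ".pyd", ".dylib")


def find_extension(lib_dir, name="libtorchaudio"):
    """Return the path of the library inside `lib_dir`, or None if absent."""
    for suffix in _SUFFIXES:
        candidate = Path(lib_dir) / f"{name}{suffix}"
        if candidate.exists():
            return candidate
    return None


def load_libtorchaudio(lib_dir):
    """Load the library explicitly when it is found next to the package.

    Returns False when the file is not there (for example in a PEX
    deployment), leaving resolution to the dynamic loader.
    """
    path = find_extension(lib_dir)
    if path is None:
        return False
    # Imported lazily so the lookup above works without PyTorch installed.
    import torch

    torch.ops.load_library(str(path))
    torch.classes.load_library(str(path))
    return True
```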


13 Sep, 2021 1 commit

[ROCM] fix build error (#1729) · ddb04e7d

Michael Melesse authored Sep 13, 2021



* fix build error on ROCM

* Update CMakeLists.txt
Co-authored-by: Nikita Shulga <nikita.shulga@gmail.com>

* address comments and fix cuda detection on rocm
Co-authored-by: Nikita Shulga <nikita.shulga@gmail.com>


30 Aug, 2021 1 commit

setup.py should parse TORCH_CUDA_ARCH_LIST (#1733) · 8cbd56c2

Nikita Shulga authored Aug 29, 2021

Needed to support CUDA builds on a CPU machine.

Parse `TORCH_CUDA_ARCH_LIST` into the CMake 3.18+ (new CUDA language support) style [CMAKE_CUDA_ARCHITECTURES](https://cmake.org/cmake/help/latest/prop_tgt/CUDA_ARCHITECTURES.html#prop_tgt:CUDA_ARCHITECTURES)
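
One plausible way to do this conversion is sketched below. The exact mapping used in `setup.py` is not shown in this commit message, so the function `to_cmake_cuda_architectures` and its rules are assumptions: a plain entry such as `7.0` becomes `70-real` (device code only), and an entry with `+PTX` such as `7.5+PTX` becomes `75`, which in CMake requests both device code and PTX.

```python
import re


def to_cmake_cuda_architectures(arch_list):
    """Convert a TORCH_CUDA_ARCH_LIST string to a CMAKE_CUDA_ARCHITECTURES value.

    Entries may be separated by semicolons or whitespace. Only numeric
    "major.minor" entries with an optional "+PTX" suffix are handled here.
    """
    entries = [e for e in re.split(r"[;\s]+", arch_list.strip()) if e]
    converted = []
    for entry in entries:
        with_ptx = entry.endswith("+PTX")
        version = entry[: -len("+PTX")] if with_ptx else entry
        if not re.fullmatch(r"\d+\.\d+", version):
            raise ValueError(f"unsupported architecture entry: {entry!r}")
        number = version.replace(".", "")
        converted.append(number if with_ptx else f"{number}-real")
    return ";".join(converted)


cmake_value = to_cmake_cuda_architectures("7.0;7.5+PTX")
```

The result could then be passed to CMake as `-DCMAKE_CUDA_ARCHITECTURES=<value>`. Named architectures (for example `Volta`) are deliberately left out of this sketch.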


26 Aug, 2021 1 commit

Default to BUILD_SOX=1 in non-Windows systems (#1725) · 89ea6955

moto authored Aug 26, 2021

* Default to BUILD_SOX=1 in non-Windows systems

Since the adoption of CMake and the restriction to static linking of libsox,
the build process has become much more robust with libsox integration enabled.

This commit makes building the libsox integration the default behavior on non-Windows systems.
The build process still checks the BUILD_SOX env var, so setting `BUILD_SOX=0` disables it.
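
The default-plus-override behavior described here might look roughly like the sketch below. The helper names `get_build_flag` and `default_build_sox`, and the set of accepted values, are assumptions for illustration; only the `BUILD_SOX` variable and the non-Windows default come from the commit message.

```python
import os
import platform

_TRUE_VALUES = {"1", "ON", "YES", "TRUE", "Y"}
_FALSE_VALUES = {"0", "OFF", "NO", "FALSE", "N"}


def default_build_sox(system=None):
    """libsox integration is built by default everywhere except Windows."""
    system = platform.system() if system is None else system
    return system != "Windows"


def get_build_flag(name, default, environ=None):
    """Read a boolean build flag from the environment, falling back to `default`."""
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if value is None:
        return default
    value = value.strip().upper()
    if value in _TRUE_VALUES:
        return True
    if value in _FALSE_VALUES:
        return False
    raise ValueError(f"unexpected value for {name}: {value!r}")


# On Linux with BUILD_SOX=0, the explicit setting overrides the default.
build_sox = get_build_flag(
    "BUILD_SOX", default_build_sox("Linux"), environ={"BUILD_SOX": "0"}
)
```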


19 Aug, 2021 1 commit
- Move RNNT Loss out of prototype (#1711) · 2c115821
  Caroline Chen authored Aug 19, 2021
  
28 Jun, 2021 1 commit
- Rename transducer to RNNT (#1603) · a9623854
  Caroline Chen authored Jun 28, 2021
  
06 May, 2021 1 commit
- Add GPU RNNT Loss (#1483) · 5417e4fb
  Caroline Chen authored May 06, 2021
  
02 Apr, 2021 1 commit
- [ROCM] Add ROCm support to source build (#1411) · a6cdd6c7
  Michael Melesse authored Apr 02, 2021
  