Commits · e908357118c38227ecfb8784d173162b832fc23c · OpenDAS / Torchaudio

04 Jun, 2023 1 commit

Update HuBERT/SSL training recipes to support Lightning 2.x (#3396) · e9083571

Zhaoheng Ni authored Jun 04, 2023

Summary:
There are some BC-breaking changes between the pytorch_lightning library and the lightning library. This PR adapts the recipes to those changes so that they support the latest lightning library.
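
A common way to cope with this kind of package rename is to check for the new package first and fall back to the old one. The sketch below only illustrates that pattern; the helper name `resolve_lightning_module` is hypothetical and is not taken from the PR.

```python
import importlib.util


def resolve_lightning_module():
    """Return the import path of the first Lightning package found, or None."""
    # Lightning 2.x is distributed as `lightning` (trainer code lives in
    # `lightning.pytorch`); older releases are distributed as `pytorch_lightning`.
    candidates = (
        ("lightning", "lightning.pytorch"),
        ("pytorch_lightning", "pytorch_lightning"),
    )
    for top_level, module_name in candidates:
        # find_spec on a top-level name checks availability without importing it.
        if importlib.util.find_spec(top_level) is not None:
            return module_name
    return None


print(resolve_lightning_module())
```

The returned string can then be passed to `importlib.import_module`, so the rest of a script does not need to know which package is installed.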

Pull Request resolved: https://github.com/pytorch/audio/pull/3396

Reviewed By: mthrok

Differential Revision: D46345206

Pulled By: nateanl

fbshipit-source-id: 59469c15dc5fe5466a99a5b5380eb4f98c2c633f

e9083571

03 Jun, 2023 1 commit

[audio][PR] Add option to dlopen FFmpeg libraries (#3402) · b7d3e89a

Moto Hira authored Jun 02, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3402

This is a second attempt of https://github.com/pytorch/audio/pull/3353.

The basic logic to enable dlopen for FFmpeg libraries is the same.
It uses `at::DynamicLibrary`, which allows torchaudio to be compiled without
linking FFmpeg libraries.

This time, a DLOPEN_FFMPEG option has been added to enable this feature,
so that users have a way to disable it and keep using build-time
linking.

Please refer to stub.h for more technical detail.

Differential Revision: D46403783

fbshipit-source-id: ca3db57ff6bdc50c8c225d22f12f3e76c6dc3f16

b7d3e89a

02 Jun, 2023 3 commits

[BC-Breaking] Remove compute_kaldi_pitch (#3368) · 5bbbb1d5

moto authored Jun 02, 2023

Summary:
This commit removes the compute_kaldi_pitch function and the underlying Kaldi integration from torchaudio.

The Kaldi pitch function was added in a short period of time by integrating the original Kaldi implementation instead of reimplementing it in PyTorch.

The Kaldi integration employed a hack that replaces the base vector/matrix implementation of Kaldi with PyTorch Tensor so that there is only one BLAS library within torchaudio.

Recently, we have been making torchaudio leaner, and we have not seen wide adoption of the kaldi_pitch feature, so we decided to remove it.

See some of the discussion in https://github.com/pytorch/audio/issues/1269

Pull Request resolved: https://github.com/pytorch/audio/pull/3368

Differential Revision: D46406176

Pulled By: mthrok

fbshipit-source-id: ee5e24d825188f379979ddccd680c7323b119b1e

5bbbb1d5

Update data augmentation tutorial (#3375) · 2ba36b47

moto authored Jun 02, 2023

Summary:
Replace sox_effects with `torchaudio.io.AudioEffector`

1. To showcase the new and better feature
2. To prepare for the upcoming removal of file-like object support

Pull Request resolved: https://github.com/pytorch/audio/pull/3375

Reviewed By: nateanl

Differential Revision: D46379016

Pulled By: mthrok

fbshipit-source-id: 70f24b62494204949f327f6ac6c49f315c9ee315

2ba36b47

Revert D46059199: [audio][PR] Use dlopen for FFmpeg · ab7a39f7

Moto Hira authored Jun 02, 2023

Differential Revision:
D46059199

Original commit changeset: 4493a5fd8a4c

Original Phabricator Diff: D46059199

fbshipit-source-id: 71cde3f8cd870d1ad9114e3e87cdd1ba564441c0

ab7a39f7

01 Jun, 2023 8 commits

Use dlopen for FFmpeg (#3353) · b14ced1a

moto authored Jun 01, 2023

Summary:
This commit changes the way the FFmpeg extension is built and used.
Instead of linking (LGPL) FFmpeg libraries to torchaudio at build time,
it uses dlopen to search for and link them at run time.

For dlopen-ing, we use PyTorch's `at::DynamicLibrary` class, which provides
a portable wrapper.
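
As a rough analogy for run-time loading, the Python sketch below looks up libavcodec when the program runs instead of requiring it at build time. It uses the standard `ctypes` module rather than `at::DynamicLibrary`, and the helper name `try_load_avcodec` is hypothetical; it is not the mechanism implemented in this commit.

```python
import ctypes
import ctypes.util


def try_load_avcodec():
    """Return a handle to libavcodec if it can be found at run time, else None."""
    path = ctypes.util.find_library("avcodec")
    if path is None:
        return None
    try:
        return ctypes.CDLL(path)
    except OSError:
        # The library was located but could not be loaded.
        return None


lib = try_load_avcodec()
if lib is None:
    print("libavcodec not found; FFmpeg features would be unavailable")
else:
    # avcodec_version() is part of the public libavcodec API and returns
    # the version packed as (major << 16) | (minor << 8) | micro.
    lib.avcodec_version.restype = ctypes.c_uint
    print("libavcodec major version:", lib.avcodec_version() >> 16)
```

The point of the pattern is that a missing library becomes a run-time condition the program can handle, not a build-time or import-time failure.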

Pull Request resolved: https://github.com/pytorch/audio/pull/3353

Differential Revision: D46059199

Pulled By: mthrok

fbshipit-source-id: 4493a5fd8a4c802178d20276522f5334d637307d

b14ced1a

[BC-breaking] Remove file-like object support from sox_io backend (#3035) · bc54ac8a

moto authored Jun 01, 2023

Summary:
This commit removes file-like object support so that we can remove the custom patch.

The motivation and plan is outlined in https://github.com/pytorch/audio/issues/2950.

Pull Request resolved: https://github.com/pytorch/audio/pull/3035

Reviewed By: hwangjeff

Differential Revision: D44695647

Pulled By: mthrok

fbshipit-source-id: 13af0234e288c041bc7b490e1f967f85ce7eb8ec

bc54ac8a

[Nova] Deleting Remaining CircleCI jobs (#3399) · cc89f743

Omkar Salpekar authored Jun 01, 2023

Summary:
This PR completely deletes the CircleCI `config.yml`. Here is what remained in the config at the point of deletion:

Used Jobs:
* **Lint** - Now running on Nova - see https://github.com/pytorch/audio/actions/runs/5144082942 for an example run on the latest PR in trunk
* **CircleCI Consistency** - No longer needed now that there is no CCI config.

Unused Jobs:
* **build-ffmpeg-$OS** - For the build jobs, we are already building FFMPEG from source as part of the Nova workflows.
* **download-third-parties** - This is caching. We currently do not have caching in Nova jobs, but atalman is working on adding support for this as a future optimization.

Pull Request resolved: https://github.com/pytorch/audio/pull/3399

Reviewed By: mthrok

Differential Revision: D46363921

Pulled By: osalpekar

fbshipit-source-id: 8abf5b0c1612c3492908fb2f5797e6b0a3c70766

cc89f743

Fix style issue (#3398) · c7ac1aff

moto authored Jun 01, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3398

Reviewed By: nateanl

Differential Revision: D46354862

Pulled By: mthrok

fbshipit-source-id: b86dcdfeff8ed9db87b0b78eca20f6f18117e97e

c7ac1aff

Fix apply_codec to use named file (#3397) · 1dfac469

moto authored Jun 01, 2023

Summary:
Follow-up to https://github.com/pytorch/audio/issues/3386. The intended change was to use the path of a temporary file instead of a file-like object.

Pull Request resolved: https://github.com/pytorch/audio/pull/3397

Reviewed By: hwangjeff

Differential Revision: D46346189

Pulled By: mthrok

fbshipit-source-id: 44da799c6587bcb63a118a6313b7299bad742a40

1dfac469

Refactor arg mapping in ffmpeg save function (#3387) · b99e5f46

moto authored May 31, 2023

Summary:
The arguments of TorchAudio's save function ("format", "bits_per_sample" and "encoding")
do not map one-to-one to the arguments of FFmpeg encoding.

For example, to use the vorbis codec, FFmpeg expects the "ogg" container/extension with the "vorbis"
encoder. It does not recognize the "vorbis" extension the way TorchAudio (libsox) does.

This commit refactors the logic to parse/map the arguments.

As a result, it now works properly with the vorbis and mp3 extensions.
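
The vorbis case above can be sketched as a small lookup. This is a minimal illustration only: the function name `map_save_args` is hypothetical, only the vorbis entry comes from the commit message, and every other format is simply passed through with no explicit encoder.

```python
def map_save_args(format_name):
    """Map a TorchAudio-style format name to a (container, encoder) pair."""
    # Only the vorbis override is taken from the commit message.
    overrides = {"vorbis": ("ogg", "vorbis")}
    # Assumption: any other format is used as the container name unchanged,
    # and the encoder is left for the backend to choose (None).
    return overrides.get(format_name, (format_name, None))


print(map_save_args("vorbis"))  # ('ogg', 'vorbis')
print(map_save_args("wav"))     # ('wav', None)
```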

Pull Request resolved: https://github.com/pytorch/audio/pull/3387

Reviewed By: hwangjeff

Differential Revision: D46328787

Pulled By: mthrok

fbshipit-source-id: 36f993952a062bfec58a8b51be6aa86297571f90

b99e5f46

Update and deprecate apply_codec function (#3386) · d6dd497c

moto authored May 31, 2023

Summary:
To prepare for the upcoming removal of file-like object support from sox_io backend,
this commit changes apply_codec function to use tempfile.

The `apply_codec` function is now deprecated and users are encouraged to use `torchaudio.io.AudioEffector`.
We will not remove the function itself, but will remove the entry from the doc.
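
The general pattern of going through a named temporary file instead of a file-like object can be sketched with the standard library alone. Here the `wave` module stands in for the codec backend, and `roundtrip_via_named_file` is a hypothetical helper, not the code of `apply_codec`.

```python
import os
import tempfile
import wave


def roundtrip_via_named_file(pcm_bytes, sample_rate):
    """Write 16-bit mono PCM to a temporary WAV file by path, then read it back."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        # Both the writer and the reader receive a path, not a file object.
        path = os.path.join(tmp_dir, "audio.wav")
        with wave.open(path, "wb") as writer:
            writer.setnchannels(1)
            writer.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
            writer.setframerate(sample_rate)
            writer.writeframes(pcm_bytes)
        with wave.open(path, "rb") as reader:
            return reader.readframes(reader.getnframes()), reader.getframerate()


data, rate = roundtrip_via_named_file(b"\x00\x01" * 8, 8000)
print(len(data), rate)  # 16 8000
```

Using a temporary directory and an explicit file name keeps the file closed between the write and the read, which avoids problems on platforms where an open temporary file cannot be reopened by name.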

Pull Request resolved: https://github.com/pytorch/audio/pull/3386

Reviewed By: hwangjeff

Differential Revision: D46330610

Pulled By: mthrok

fbshipit-source-id: 3071bdefa05b4cbb9f00629bef50f0981eae89b4

d6dd497c

Delete CCI Linux and MacOS Unittest Jobs (#3391) · d5d94b7e

Omkar Salpekar authored May 31, 2023

Summary:
Deprecates the Linux and MacOS Unittest jobs now that they've been running on Nova for over a week.

Aside: There was also a stylecheck job that was dependent on the Linux Unittest job. I also put up https://github.com/pytorch/audio/pull/3390 to move that stylecheck job to Nova. I'm happy to reintroduce the CCI stylecheck job standalone in CCI if we want the Nova version to run on main for a week.

Pull Request resolved: https://github.com/pytorch/audio/pull/3391

Reviewed By: mthrok

Differential Revision: D46324198

Pulled By: osalpekar

fbshipit-source-id: 2115748e153c5dee1a38db2b6230acebc4f56927

d5d94b7e

31 May, 2023 6 commits

[Nova] Stylechecks on Nova (#3390) · f7cb6c68

Omkar Salpekar authored May 31, 2023

Summary:
Introducing the stylecheck job on Nova. It seems like it is failing on trunk, but the functionality of this job itself is working and it fails with the same error as it does on trunk with CCI.

Pull Request resolved: https://github.com/pytorch/audio/pull/3390

Reviewed By: mthrok

Differential Revision: D46324223

Pulled By: osalpekar

fbshipit-source-id: 1324202e53569d610559ef6f1b90cb5c364e6909

f7cb6c68

[Nova] Lint on GHA (#3341) · 5d0697bc

Omkar Salpekar authored May 31, 2023

Summary:
See title. If all is well, we can deprecate the CCI job in a few days.

Pull Request resolved: https://github.com/pytorch/audio/pull/3341

Reviewed By: mthrok

Differential Revision: D46324265

Pulled By: osalpekar

fbshipit-source-id: bc706c6ae4285d4085dc5f0223ea41d8fc290f1c

5d0697bc

Surface test failures on CI (#3394) · 2283df8a

moto authored May 31, 2023

Summary:
Set the directory of the JUnit XML files to the one that test-infra picks up, so that the test results are put in the summary.

Example: https://github.com/pytorch/audio/actions/runs/5136305988

Pull Request resolved: https://github.com/pytorch/audio/pull/3394

Differential Revision: D46328832

Pulled By: mthrok

fbshipit-source-id: f0b5020a911ca4ec09345a965bdec769300859f0

2283df8a

[Nova] Deprecate windows circleci unit tests (#3393) · c5d3706c

atalman authored May 31, 2023

Summary:
Nova - Deprecate windows circleci unit tests

Pull Request resolved: https://github.com/pytorch/audio/pull/3393

Reviewed By: malfet

Differential Revision: D46315608

Pulled By: atalman

fbshipit-source-id: 3d7b5d0618b9d2e12e5f97e21d7becdc61d85c69

c5d3706c

Windows GPU workflows (#3364) · 92d0fb55

atalman authored May 31, 2023

Summary:
Windows GPU workflows

Pull Request resolved: https://github.com/pytorch/audio/pull/3364

Reviewed By: mthrok

Differential Revision: D46292403

Pulled By: atalman

fbshipit-source-id: ee3c6f8082ca77bdc1ffdb930c59fa5a9cb25a4a

92d0fb55

Fixes to #3295 Improve RNN-T streaming decoding (#3379) · b8016e44

Jeff Hwang authored May 30, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3379

Fixes `RNNTBeamSearch.infer`'s docstring and removes unused import from tutorial.

Reviewed By: mthrok

Differential Revision: D46227174

fbshipit-source-id: 7c1c3f05a6476cb0437622dea6f3ae6cb3ea9468

b8016e44

30 May, 2023 3 commits

Disable failing GPU unit test (#3384) · caf3ac07

atalman authored May 30, 2023

Summary:
Disable failing GPU unit test.
See associated issue: https://github.com/pytorch/audio/issues/3376

Pull Request resolved: https://github.com/pytorch/audio/pull/3384

Reviewed By: mthrok

Differential Revision: D46279324

Pulled By: atalman

fbshipit-source-id: 3a606bb992e0261451f48d1fb458e054f7fd5583

caf3ac07

Use const reference (#3389) · 9cdf26fd

Moto Hira authored May 30, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3389

Use const references more widely in the sox source code.

Differential Revision: D46264068

fbshipit-source-id: 809d34a6e16f621c856d4278ef7ce45a5868a717

9cdf26fd

Simplify sox namespace (#3383) · a81b0ed2

Moto Hira authored May 30, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3383

This commit consolidates the `torchaudio::sox_*` namespaces into `torchaudio::sox`.
It also puts the Pybind11 registration and the TorchBind registration into an anonymous namespace.

Differential Revision: D46257367

fbshipit-source-id: 0f0f181eaa72036916e223263daf4b7c298fca0d

a81b0ed2

29 May, 2023 1 commit

[Nova] Windows CPU Unittests on Nova (#3329) · 6425d46c

Omkar Salpekar authored May 29, 2023

Summary:
Continuing with the job migrations from CCI to Nova, this PR introduces the Windows CPU Unittest job as a Nova workflow.

The job is passing: https://github.com/pytorch/audio/actions/runs/5094569687/jobs/9159020192?pr=3329.

Pull Request resolved: https://github.com/pytorch/audio/pull/3329

Reviewed By: huydhn

Differential Revision: D46265649

Pulled By: atalman

fbshipit-source-id: 7659dfbcc8ad400f2e109ff64530e1f768e82ef9

6425d46c

27 May, 2023 1 commit

Fix AudioEffector for mulaw (#3372) · af932cc7

moto authored May 26, 2023

Summary:
When encoding audio with mulaw, the resulting data does not have a header, and the StreamReader defaults to 16k Hz, which can stretch/shrink the resulting waveform.
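
The effect of a wrong assumed sample rate is simple arithmetic, sketched below. The 8 kHz source rate is an assumed example value; only the 16 kHz default comes from the commit message.

```python
def playback_duration(num_samples, assumed_rate_hz):
    """Duration in seconds when `num_samples` are interpreted at `assumed_rate_hz`."""
    return num_samples / assumed_rate_hz


num_samples = 8000     # one second of audio recorded at 8 kHz (example value)
true_rate = 8000
default_rate = 16000   # fallback rate mentioned in the commit message

print(playback_duration(num_samples, true_rate))     # 1.0
print(playback_duration(num_samples, default_rate))  # 0.5
```

With headerless data the decoder has no way to recover the true rate, so the same samples are interpreted as half the duration in this example.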

Pull Request resolved: https://github.com/pytorch/audio/pull/3372

Reviewed By: hwangjeff

Differential Revision: D46234772

Pulled By: mthrok

fbshipit-source-id: 942c89a8cfe29b0b6f57b3e5b6c9dfd3524ca552

af932cc7

26 May, 2023 6 commits

Fix encoding g722 format (#3373) · 1b05ca7e

moto authored May 26, 2023

Summary:
g722 format only supports 16k Hz, but AVCodec does not list this. The implementation does not insert resampling and the resulting audio can be slowed down or sped up.

Pull Request resolved: https://github.com/pytorch/audio/pull/3373

Reviewed By: hwangjeff

Differential Revision: D46233181

Pulled By: mthrok

fbshipit-source-id: 902b3f862a8f7269dc35bc871e868b0e78326c6c

1b05ca7e

Use the same CUDNN version on Windows as PyTorch (#3380) · c120f316

Huy Do authored May 26, 2023

Summary:
CUDA 11.7 uses cuDNN 8.5.0; 11.8 uses 8.7.0; 12.1 uses 8.8.1. Otherwise, the Windows vision job (8.5.0) would overwrite the cuDNN version set up by PyTorch (8.7.0), leading to flaky failures such as https://github.com/pytorch/pytorch/actions/runs/5088860652/jobs/9146641450

```
RuntimeError: cuDNN version incompatibility: PyTorch was compiled  against (8, 7, 0) but found runtime version (8, 5, 0). PyTorch already comes bundled with cuDNN. One option to resolving this error is to ensure PyTorch can find the bundled cuDNN.
```
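
The version pairing listed in this summary can be written as a lookup with a consistency check. This is a sketch only: the names `CUDNN_FOR_CUDA` and `check_cudnn` are hypothetical and do not come from the CI scripts.

```python
# CUDA version -> cuDNN version, as listed in the commit message.
CUDNN_FOR_CUDA = {
    "11.7": (8, 5, 0),
    "11.8": (8, 7, 0),
    "12.1": (8, 8, 1),
}


def check_cudnn(cuda_version, runtime_cudnn):
    """Raise if the cuDNN found at run time differs from the expected one."""
    expected = CUDNN_FOR_CUDA[cuda_version]
    if runtime_cudnn != expected:
        raise RuntimeError(
            f"cuDNN mismatch for CUDA {cuda_version}: "
            f"expected {expected}, found {runtime_cudnn}"
        )
    return expected


print(check_cudnn("11.8", (8, 7, 0)))  # (8, 7, 0)
```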

Pull Request resolved: https://github.com/pytorch/audio/pull/3380

Reviewed By: atalman

Differential Revision: D46236286

Pulled By: huydhn

fbshipit-source-id: 9ca12d5068c3029688347d52c5c284488f33728d

c120f316

Use cuda 11.8 for circleci tests (#3381) · 5c0249b0

atalman authored May 26, 2023

Summary:
Use cuda 11.8 for circleci tests.
11.7 was deprecated

Pull Request resolved: https://github.com/pytorch/audio/pull/3381

Reviewed By: osalpekar

Differential Revision: D46236223

Pulled By: atalman

fbshipit-source-id: 6d6a8e09603807a07241f31c1bd1e6d3a2b67d9d

5c0249b0

Temporarily remove test for extract_features (#3378) · 05649ca3

Zhaoheng Ni authored May 26, 2023

Summary:
The tests failed for several bundles. This PR removes them; they will be re-added once the root cause is figured out.

Pull Request resolved: https://github.com/pytorch/audio/pull/3378

Reviewed By: atalman

Differential Revision: D46230884

Pulled By: nateanl

fbshipit-source-id: 42056a29b2ec2335268b273d3e37fb517035be92

05649ca3

Revert "Upgrade to FFmpeg5 (#3298)" (#3377) · 37779ef9

atalman authored May 26, 2023

Summary:
This reverts commit d38a7854.

This is a temporary revert to unblock the unit test migration from CircleCI to GitHub.

Pull Request resolved: https://github.com/pytorch/audio/pull/3377

Reviewed By: mthrok

Differential Revision: D46230498

Pulled By: atalman

fbshipit-source-id: 000d8a9ca00750fc1ca61f4c2cdd6e930a5ce46d

37779ef9

Improve RNN-T streaming decoding (#3295) · 9fc0dcaa

Lakshmi Krishnan authored May 26, 2023

Summary:
This commit fixes the following issues affecting streaming decoding quality:
1. The `init_b` hypothesis is only regenerated from the blank token if no initial hypotheses are provided.
2. Allows the decoder to receive the top-K hypotheses to continue decoding from, instead of using just the top hypothesis at each decoding step. This dramatically affects decoding quality, especially for speech with long pauses and disfluencies.
3. Some minor errors regarding shape checking for length.

This also means that the resulting output is the entire transcript up until that time step, instead of just the incremental change in transcript.
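
The idea of carrying the top-K hypotheses from one decoding step to the next can be illustrated with a toy beam search. This is a heavily simplified sketch and not the `RNNTBeamSearch` implementation: the function `decode_step`, the per-step score dictionaries, and the use of an empty tuple as the starting hypothesis are all assumptions made for illustration.

```python
import heapq

EMPTY = ()  # empty token sequence, standing in for the blank-token start


def decode_step(step_scores, beam_width, hypotheses=None):
    """Extend every hypothesis by one token and keep the `beam_width` best.

    step_scores: dict mapping token -> log-probability for this step.
    hypotheses: list of (tokens, score) carried over from the previous step,
        or None to start from a single empty hypothesis.
    """
    if hypotheses is None:
        # Only start from scratch when no initial hypotheses are provided.
        hypotheses = [(EMPTY, 0.0)]
    expanded = [
        (tokens + (token,), score + logp)
        for tokens, score in hypotheses
        for token, logp in step_scores.items()
    ]
    return heapq.nlargest(beam_width, expanded, key=lambda h: h[1])


hyps = None
for scores in ({"a": -0.1, "b": -2.0}, {"c": -0.3, "d": -0.4}):
    # The full list of surviving hypotheses is fed back in, not just the best one.
    hyps = decode_step(scores, beam_width=2, hypotheses=hyps)

print(hyps[0][0])  # ('a', 'c')
```

Because each hypothesis holds its whole token sequence, the best hypothesis after any step is the full transcript so far, not only the newly added tokens.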

Pull Request resolved: https://github.com/pytorch/audio/pull/3295

Reviewed By: nateanl

Differential Revision: D46216113

Pulled By: hwangjeff

fbshipit-source-id: 8f7efae28dcca4a052f434ca55a2795c9e5ec0b0

9fc0dcaa

25 May, 2023 1 commit

Add LRS3 AV-ASR recipe (#3278) · c6624fa6

Pingchuan Ma authored May 25, 2023

Summary:
This PR adds AV-ASR recipe which contains sample implementations of training and evaluation pipelines for RNNT based automatic, visual, and audio-visual (ASR, VSR, AV-ASR) models on LRS3. This repository includes both streaming/non-streaming modes.

CC stavros99 xiaohui-zhang YumengTao mthrok nateanl hwangjeff

Pull Request resolved: https://github.com/pytorch/audio/pull/3278

Reviewed By: nateanl

Differential Revision: D46121550

Pulled By: mpc001

fbshipit-source-id: bb44b97ae25e87df2a73a707008be46af4ad0fc6

c6624fa6

24 May, 2023 6 commits

Add StreamReader/Writer custom IO to doc (#3367) · f41ba26d

moto authored May 24, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3367

Reviewed By: nateanl

Differential Revision: D46148139

Pulled By: mthrok

fbshipit-source-id: 50f297ac69bb95562976eb452e4e382b8c064c3c

f41ba26d

Fix build doc (#3349) · 8b85ca5d

moto authored May 24, 2023

Summary:
Follow-up https://github.com/pytorch/audio/issues/3045
- Revert the removal of HW acceleration doc
- comment out FFmpeg CLI test run

Pull Request resolved: https://github.com/pytorch/audio/pull/3349

Reviewed By: nateanl

Differential Revision: D46121899

Pulled By: mthrok

fbshipit-source-id: dfc030a69f05addec73637cfb6a720c184e37323

8b85ca5d

Update smoke test (#3346) · 71b2634b

moto authored May 24, 2023

Summary:
* Delay the import of torchaudio until the CLI options are parsed.
* Add option to set log level to DEBUG so that it's easy to see the issue with external libraries.
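
The two points above can be sketched as follows. This is a minimal illustration, not the actual smoke test script: the `--debug` flag name and the `module_name` parameter are assumptions, and `json` is imported in the usage line only so the sketch runs without torchaudio installed.

```python
import argparse
import importlib
import logging


def main(argv=None, module_name="torchaudio"):
    parser = argparse.ArgumentParser(description="Smoke test sketch")
    parser.add_argument("--debug", action="store_true", help="Set log level to DEBUG.")
    args = parser.parse_args(argv)

    # Configure logging before the import so that messages emitted while the
    # library loads its external dependencies are visible.
    logging.basicConfig(level=logging.DEBUG if args.debug else logging.INFO)

    # Import only after the CLI options have been parsed.
    module = importlib.import_module(module_name)
    return module.__name__


print(main(["--debug"], module_name="json"))  # json
```

Delaying the import means that `--help` and argument errors are handled even when the library itself fails to load.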

Pull Request resolved: https://github.com/pytorch/audio/pull/3346

Reviewed By: nateanl

Differential Revision: D46022546

Pulled By: mthrok

fbshipit-source-id: 9f988bbd770c2fd2bb260c3cfe02b238a9da2808

71b2634b

Amend commit to gh-pages branch (#3345) · a79cf3ba

moto authored May 24, 2023

Summary:
This commit changes the way the doc is pushed.
It amends the existing commit instead of adding a new one.

Currently, each commit in gh-pages contains about 100MB of data. The gh-pages branch is fetched by default by `git clone`, so the size of the torchaudio repo grows significantly.

Pull Request resolved: https://github.com/pytorch/audio/pull/3345

Reviewed By: nateanl

Differential Revision: D46136612

Pulled By: mthrok

fbshipit-source-id: 39479ee5d1a6888254ef50f0db252453d976d183

a79cf3ba

Remove CUDA 11.7 builds; replace with 11.8 (#3360) · 5a6f4eba

pbialecki authored May 24, 2023

Summary:
CC atalman malfet

Pull Request resolved: https://github.com/pytorch/audio/pull/3360

Reviewed By: mthrok

Differential Revision: D46150898

Pulled By: atalman

fbshipit-source-id: 985a0ef69406f48fb15f239d6b16616c0a5379f5

5a6f4eba

Resolve lint issue on LaTeX (#3366) · 8690e6ec

moto authored May 23, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3366

Reviewed By: nateanl

Differential Revision: D46136238

Pulled By: mthrok

fbshipit-source-id: 3432f5d007293831bab21460a79ae26b1bbc81a8

8690e6ec

23 May, 2023 3 commits

[BugFix] Fix extract_features method for WavLM models (#3350) · 7d0f3369

Zhaoheng Ni authored May 23, 2023

Summary:
resolve https://github.com/pytorch/audio/issues/3347

`position_bias` is ignored in the `extract_features` method. This doesn't affect Wav2Vec2 or HuBERT models, but it changes the output of the transformer layers (except the first layer) in the WavLM model. This PR fixes it by adding `position_bias` to the method.
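
A toy model of the bug, assuming that only the first layer computes the bias and later layers expect to receive it, is sketched below. `ToyLayer` and both loop functions are hypothetical stand-ins that use plain floats; they are not the torchaudio implementation.

```python
class ToyLayer:
    """Stand-in for a transformer layer; not the torchaudio implementation."""

    def __init__(self, computes_bias):
        self.computes_bias = computes_bias

    def __call__(self, x, position_bias=None):
        if position_bias is None and self.computes_bias:
            position_bias = 0.5  # arbitrary stand-in value
        bias = 0.0 if position_bias is None else position_bias
        return x + bias, position_bias


layers = [ToyLayer(True), ToyLayer(False), ToyLayer(False)]


def run_ignoring_bias(x):
    outputs = []
    for layer in layers:
        x, _ = layer(x)  # the bias returned by the first layer is dropped
        outputs.append(x)
    return outputs


def run_passing_bias(x):
    outputs, position_bias = [], None
    for layer in layers:
        x, position_bias = layer(x, position_bias=position_bias)
        outputs.append(x)
    return outputs


print(run_ignoring_bias(1.0))  # [1.5, 1.5, 1.5]
print(run_passing_bias(1.0))   # [1.5, 2.0, 2.5]
```

The first layer's output is identical in both loops, and every later layer differs, which mirrors the behavior described in the summary.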

Pull Request resolved: https://github.com/pytorch/audio/pull/3350

Reviewed By: mthrok

Differential Revision: D46112148

Pulled By: nateanl

fbshipit-source-id: 3d21aa4b32b22da437b440097fd9b00238152596

7d0f3369

[Nova] MacOS Unittests on Nova (#3324) · fce54fd1

Omkar Salpekar authored May 23, 2023

Summary:
As discussed in the [Torchaudio Migration Proposal](https://docs.google.com/document/d/1PF8biwiGzsjzfEBM78mlLiRrkcsGsvuYkeqkI66Ym8A/edit), this PR moves MacOS unittest job to Nova tooling. Note that this does not touch anything within the existing CircleCI job at the moment.

Passing job: https://github.com/pytorch/audio/actions/runs/4932497525/jobs/8815581251?pr=3324

Pull Request resolved: https://github.com/pytorch/audio/pull/3324

Reviewed By: atalman, mthrok

Differential Revision: D46113524

Pulled By: osalpekar

fbshipit-source-id: d048d300489f992fa187628cb6744d95ab4fb68a

fce54fd1

Fix cuda test failure (#3363) · fa59855f

Zhaoheng Ni authored May 23, 2023

Summary:
Fix https://github.com/pytorch/audio/issues/3361

When adding FunctionalCUDAOnlyTest, the class should inherit from `TestBaseMixin` instead of `Functional`
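
Why the base class matters can be shown with plain `unittest`. The class names below follow the commit message, but their bodies are hypothetical: inheriting from the full suite would pull in every generic test, while inheriting from the mixin pulls in only shared fixtures.

```python
import unittest


class TestBaseMixin:
    """Shared fixtures only; defines no test methods."""

    device = "cpu"


class Functional(TestBaseMixin):
    """Full device-agnostic suite."""

    def test_generic(self):
        self.assertIn(self.device, ("cpu", "cuda"))


class FunctionalCUDAOnlyTest(TestBaseMixin):
    """CUDA-specific tests only; does not inherit test_generic."""

    def test_cuda_specific(self):
        self.assertTrue(True)


class CudaCase(FunctionalCUDAOnlyTest, unittest.TestCase):
    device = "cuda"


names = unittest.TestLoader().getTestCaseNames(CudaCase)
print(list(names))  # ['test_cuda_specific']
```

Had `FunctionalCUDAOnlyTest` inherited from `Functional`, `test_generic` would also be collected for `CudaCase`.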

Pull Request resolved: https://github.com/pytorch/audio/pull/3363

Reviewed By: atalman, osalpekar

Differential Revision: D46112084

Pulled By: nateanl

fbshipit-source-id: 67c6472fda98cb718e0fc53ab248beda745feab5

fa59855f