Commits · 6fb68544e758eeceacebf6e215253187dcaf9983 · OpenDAS / Torchaudio

29 Aug, 2023 2 commits

Separate Test Token for Conda Uploads (#3582) · 6fb68544

Omkar Salpekar authored Aug 29, 2023

Summary:
We will use a separate token for uploading test binaries (instead of reusing the nightly token). This PR adds that token to the caller workflow.

Pull Request resolved: https://github.com/pytorch/audio/pull/3582

Reviewed By: atalman

Differential Revision: D48803009

Pulled By: osalpekar

fbshipit-source-id: c2af57f6946da51a7b56c975614e60f243e3f6fb


Remove random print statement (#3577) · 5ee254e3

moto authored Aug 28, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3577

Reviewed By: atalman

Differential Revision: D48763580

Pulled By: mthrok

fbshipit-source-id: 6ab155a5dd4cf11b2a58f26ced369107f0a2f08f


23 Aug, 2023 1 commit

update CUDA to 12.1 U1 (#3563) · 47eaab4d

pbialecki authored Aug 23, 2023

Summary:
Follow-up of: https://github.com/pytorch/builder/pull/1485

CC atalman

Pull Request resolved: https://github.com/pytorch/audio/pull/3563

Reviewed By: kit1980

Differential Revision: D48610200

Pulled By: atalman

fbshipit-source-id: 61c9981da5a343a3cbce97b0a77ab91f37560087


21 Aug, 2023 2 commits

Use FFmpeg6 in unittest (#3570) · 9d11563d

moto authored Aug 21, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3570

Reviewed By: huangruizhe

Differential Revision: D48518568

Pulled By: mthrok

fbshipit-source-id: 0fdfb8b3988789c7ded0fb336824034bedf6a394


Fix style (#3569) · 3318bcec

moto authored Aug 21, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3569

Reviewed By: huangruizhe

Differential Revision: D48508244

Pulled By: mthrok

fbshipit-source-id: 6e14267e2dbdf08ea3c25a1dab480cb0e908e0c3


20 Aug, 2023 3 commits

Fix I/O test (#3568) · 0688863c

moto authored Aug 20, 2023

Summary:
It turned out that FFmpeg 5 installed via conda reports a video frame rate of -1. FFmpeg 4 and 6 are fine. This is a regression either in FFmpeg or in the underlying decoding library.

Make the reference value adaptive.

Pull Request resolved: https://github.com/pytorch/audio/pull/3568

Reviewed By: huangruizhe

Differential Revision: D48499621

Pulled By: mthrok

fbshipit-source-id: fb64187bcf0dc57b753cb6c05f04d436238f5c51


Fix style check CI job (#3564) · a5da0a28

moto authored Aug 20, 2023

Summary:
It seems that the default Python version was updated to 3.11.
libcst does not have a binary release for 3.11, so the CI attempts to
build it from source, which fails because building libcst requires a
Rust compiler.

This commit pins the Python version of the style check job to 3.10 so that
the issue with the Rust compiler is avoided.

Pull Request resolved: https://github.com/pytorch/audio/pull/3564

Reviewed By: huangruizhe

Differential Revision: D48499560

Pulled By: mthrok

fbshipit-source-id: 53ab77268d8143f4946d92e8cd1f96aea55e7b72


Add detail about CTC peaky behavior (#3566) · a25bcb6b

moto authored Aug 20, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3566

Reviewed By: huangruizhe

Differential Revision: D48499338

Pulled By: mthrok

fbshipit-source-id: 7f837e1a1f8116d7d82411607c91628b729077d8


19 Aug, 2023 1 commit

Enable ROCm RNN-T Loss (#2485) · c5939616

Juan Villamizar authored Aug 18, 2023

Summary:
Added HIPIFY code and small changes for ROCm. Targeting RNN-T loss.

Pull Request resolved: https://github.com/pytorch/audio/pull/2485

Reviewed By: huangruizhe

Differential Revision: D43537864

Pulled By: mthrok

fbshipit-source-id: 4bdb1f291dc51a12232ccd072b97ae94ae20cc0c


18 Aug, 2023 1 commit

Update README.md (#3567) · 1638efee

moto authored Aug 18, 2023

Summary:
Remove mention of backend and quick usage. Those are explained in the documentation in detail.

Pull Request resolved: https://github.com/pytorch/audio/pull/3567

Reviewed By: huangruizhe

Differential Revision: D48471832

Pulled By: mthrok

fbshipit-source-id: 467efc1f11f66534c33cf4751de27b08176c31bf


15 Aug, 2023 2 commits

Use pytorch/manylinuxaarch64-builder:cpu-aarch64 docker image (#3560) · 126f9f6c

Andrey Talman authored Aug 15, 2023

Summary:
Use pytorch/manylinuxaarch64-builder:cpu-aarch64

Introduced in https://github.com/pytorch/builder/pull/1472

Pull Request resolved: https://github.com/pytorch/audio/pull/3560

Reviewed By: mthrok

Differential Revision: D48366572

Pulled By: atalman

fbshipit-source-id: 6de15f81abb09c737e6a1271226259483141e8f4


[BC-breaking] Update pre-built ffmpeg4 to 4.4.4 (#3561) · bf07ea6b

moto authored Aug 15, 2023

Summary:
In https://github.com/pytorch/audio/pull/3460, we switched the build process for FFmpeg extension.
Since it is complicated to install FFmpeg in some environments, at build time, pre-built binaries and their headers
are downloaded and used as scaffolding for the torchaudio build.

Even though we did not change any code or the FFmpeg version, it turned out that this causes a segmentation
fault on Ubuntu when using the system Python and FFmpeg 4.4 installed via aptitude.
While investigating the issue, I swapped the pre-built FFmpeg scaffolding with FFmpeg 4.4 from aptitude,
and the segmentation fault did not happen. This indicates that it is a binary compatibility issue.

Before https://github.com/pytorch/audio/issues/3460, each binary build job was building FFmpeg 4.1.8 using the same compiler used to build torchaudio,
but after https://github.com/pytorch/audio/issues/3460 the environments to build FFmpeg 4.1.8 and torchaudio are different. My hypothesis is that
this difference is causing some ABI incompatibility when linking against FFmpeg 4.4. (Also, I don't remember well,
but I read somewhere that 4.4 has a different ABI)

Through experiments, it turned out that upgrading the pre-built FFmpeg scaffolding to 4.4 resolves this.
So this commit upgrades the pre-built FFmpeg 4 to 4.4.
The potential (yet unconfirmed) downside is that torchaudio will no longer work with 4.1, 4.2, and 4.3.
Since FFmpeg 4.4 is what Ubuntu 20.04 and 22.04 support by default, and Google Colab is also on 20.04,
I think it is more important to support 4.4.

Therefore we drop support for 4.1-4.3 from the normal build (and official distributions). Those who wish to
use 4.1-4.3 can build torchaudio from source, linking against a specific FFmpeg version.

Pull Request resolved: https://github.com/pytorch/audio/pull/3561

Reviewed By: hwangjeff

Differential Revision: D48340201

Pulled By: mthrok

fbshipit-source-id: 7ece82910f290c7cf83f58311c4cf6a384e8795e


14 Aug, 2023 5 commits

Move essential backend implementations to _backend (#3549) · 2e0dfafa

moto authored Aug 14, 2023

Summary:
Move the actual I/O implementation to `_backend` submodule so that the existing `backend` submodule contains only what's related to legacy backend utilities.

Pull Request resolved: https://github.com/pytorch/audio/pull/3549

Reviewed By: huangruizhe

Differential Revision: D48253550

Pulled By: mthrok

fbshipit-source-id: c23f1664458c723f63e134c7974b3f7cf17a1e98


Update I/O and backend docs (#3555) · c0f25f21

moto authored Aug 14, 2023

Summary:
* Merge backend doc into torchaudio toplevel doc
* Update backend, dispatcher, installation doc

Pull Request resolved: https://github.com/pytorch/audio/pull/3555

Reviewed By: huangruizhe

Differential Revision: D48326812

Pulled By: mthrok

fbshipit-source-id: cc0d7326eacfebd341323b5d613ca1777255748b


Update integration test CI config (#3502) · 9d8f76d9

moto authored Aug 14, 2023

Summary:
Update the Ubuntu image so that CI is triggered.
There is an issue with FFmpeg 4 that prevents the CI from succeeding.
This will be handled separately.

Pull Request resolved: https://github.com/pytorch/audio/pull/3502

Reviewed By: huangruizhe

Differential Revision: D48327431

Pulled By: mthrok

fbshipit-source-id: 5ea639f3e20c3aaf460e6030f6cb1ad2daa00172


Update ffmpeg pre-built binary to 4.4.4 (#3557) · a9e38e74

moto authored Aug 14, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3557

Reviewed By: huangruizhe

Differential Revision: D48326462

Pulled By: mthrok

fbshipit-source-id: c37ae38e28e4514ea284613636604a725829346d


Add default use_tmp_hub_dir value for integration tests (#3558) · d1d41fd3

Jeff Hwang authored Aug 14, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3558

In the event that `use_tmp_hub_dir` isn't specified as an option, pytest shouldn't fail. To resolve such failures, this PR modifies function `temp_hub_dir` to fall back on a default value of `False` for `use_tmp_hub_dir`.

Reviewed By: mthrok

Differential Revision: D48318947

fbshipit-source-id: 5dd692f9202ef37ec3e2c9ea39896156f928d693


11 Aug, 2023 3 commits

Expose AudioMetadata (#3556) · 9467fc44

moto authored Aug 11, 2023

Summary:
`torchaudio.info` returns `AudioMetaData`. It should be exposed as a public API, without referring to the `backend` submodule.

Pull Request resolved: https://github.com/pytorch/audio/pull/3556

Reviewed By: huangruizhe

Differential Revision: D48267349

Pulled By: mthrok

fbshipit-source-id: 6ccc0c32bf62fbdcb71495fc7d8d4cc29891538a


Revise VGGish pipeline test again (#3551) · f2b2f05a

Jeff Hwang authored Aug 10, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3551

Restores VGGish pipeline test to be a function rather than class.

Reviewed By: mthrok

Differential Revision: D48236197

fbshipit-source-id: 25ac19d87a7a0964a9c3f7552037cd6c21dc38a9


Support writing opus and mp3 with soundfile (#3554) · 9bd7ca51

moto authored Aug 10, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3554

Reviewed By: huangruizhe

Differential Revision: D48240906

Pulled By: mthrok

fbshipit-source-id: 1936757646f8ebba74e8b65e2ffe2a8b74fdfeeb


10 Aug, 2023 6 commits

Refactor _backend module (#3547) · 1e6a8f93

moto authored Aug 10, 2023

Summary:
* Move Backend implementations to separate files

Pull Request resolved: https://github.com/pytorch/audio/pull/3547

Reviewed By: hwangjeff

Differential Revision: D48233538

Pulled By: mthrok

fbshipit-source-id: bcc63fc07a5dfcd48929f0a2fb64bfcb3282eb92


Add Frechet distance function (#3545) · 06301c0a

Jeff Hwang authored Aug 10, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3545

Adds function for computing the Fréchet distance between two multivariate normal distributions.
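
For reference, the quantity usually called the Fréchet distance between two Gaussians (as used in FID-style metrics) is ||mu_x - mu_y||^2 + Tr(S_x + S_y - 2 (S_x S_y)^(1/2)). The NumPy sketch below illustrates that formula and is not the implementation added by this PR; the function name and argument order are assumptions. It relies on the trace of the matrix square root being equal to the sum of the square roots of the eigenvalues of S_x S_y.

```python
import numpy as np


def frechet_distance(mu_x, sigma_x, mu_y, sigma_y) -> float:
    """Frechet distance between N(mu_x, sigma_x) and N(mu_y, sigma_y)."""
    mu_x = np.asarray(mu_x, dtype=float)
    mu_y = np.asarray(mu_y, dtype=float)
    sigma_x = np.asarray(sigma_x, dtype=float)
    sigma_y = np.asarray(sigma_y, dtype=float)

    # Squared Euclidean distance between the means.
    mean_term = np.sum((mu_x - mu_y) ** 2)
    # Tr(S_x) + Tr(S_y).
    trace_term = np.trace(sigma_x) + np.trace(sigma_y)
    # Tr((S_x S_y)^(1/2)) as the sum of square roots of the eigenvalues.
    # Tiny negative or imaginary parts caused by round-off are discarded.
    eigvals = np.linalg.eigvals(sigma_x @ sigma_y)
    sqrt_term = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    return float(mean_term + trace_term - 2.0 * sqrt_term)


if __name__ == "__main__":
    identity = np.eye(2)
    print(frechet_distance([0.0, 0.0], identity, [3.0, 4.0], identity))
```

With identical covariances the trace terms cancel, so only the squared distance between the means remains.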

Reviewed By: mthrok

Differential Revision: D48126102

fbshipit-source-id: e4e122b831e1e752037c03f5baa9451e81ef1697


[aarch64] Add aarch64 workflow (#3553) · 8d858c38

Mike Schneider authored Aug 10, 2023

Summary:
# Changes
* Adding workflow for building aarch64 wheels.

Pull Request resolved: https://github.com/pytorch/audio/pull/3553

Reviewed By: hwangjeff, osalpekar

Differential Revision: D48239384

Pulled By: atalman

fbshipit-source-id: dfa00edb3fee0acaf2b83fb420eaf12bddc6980e


Move backend initialization to toplevel (#3548) · 6fb21ab1

moto authored Aug 10, 2023

Summary:
The backend dispatcher is implemented in `torchaudio._backend`, while the legacy backend is implemented in `torchaudio.backend`.

The initialization happens in `torchaudio._backend`.
This commit moves it to `torchaudio.__init__`, so that `backend` and `_backend` are more independent.

Pull Request resolved: https://github.com/pytorch/audio/pull/3548

Reviewed By: huangruizhe

Differential Revision: D48219244

Pulled By: mthrok

fbshipit-source-id: e694cb232794f90902a60ee51c7bf11b7f0548a0


Fix SoundfileBackend method decorators (#3550) · 2d1138c5

moto authored Aug 10, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3550

Reviewed By: hwangjeff

Differential Revision: D48219176

Pulled By: mthrok

fbshipit-source-id: 4b11111dd3853cbef4ffe1859ec428ca05394824


Misc tutorial updates (#3546) · bc264256

moto authored Aug 10, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3546

Reviewed By: huangruizhe

Differential Revision: D48219274

Pulled By: mthrok

fbshipit-source-id: 6881f039bf70cf7240fbcfeb48443471ef457bd4


09 Aug, 2023 1 commit

Revise VGGish inference pipeline test (#3544) · 9f5fa84b

Jeff Hwang authored Aug 08, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3544

Revises VGGish inference pipeline test to support internal testing.

Reviewed By: mthrok

Differential Revision: D48058409

fbshipit-source-id: 045140a0e9d50128d32ef6510bdb2f642a365c83


08 Aug, 2023 6 commits

Updating CTC FA tutorial (#3542) · eab8aa74

moto authored Aug 08, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3542

Reviewed By: huangruizhe

Differential Revision: D48166025

Pulled By: mthrok

fbshipit-source-id: 29fee7dbf08394993972ec2967f94ce9fcb1c853


Add tutorial link to AVSR recipe (#3532) · f7ab406a

Pingchuan Ma authored Aug 08, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3532

Reviewed By: mthrok

Differential Revision: D48165499

Pulled By: mpc001

fbshipit-source-id: c87b3361f0e6282684f218b32888df883d56682b


Adopt MMS_FA bundle in multilingual FA tutorials (#3534) · 19e9046a

moto authored Aug 08, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3534

Reviewed By: huangruizhe

Differential Revision: D48155817

Pulled By: mthrok

fbshipit-source-id: a3d45fdfd360f9668063a3ecb3b00364290134c9


Fix FA bundle (#3538) · 7e85f625

moto authored Aug 08, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3538

Reviewed By: huangruizhe

Differential Revision: D48154056

Pulled By: mthrok

fbshipit-source-id: 72f58c501c5302d40f1d059f95bd6fe40d4a52aa


Librispeech RNNT recipe updates for PyTorch Lightning 2.0 (#3336) · e6c89731

Ruizhe (Ray) Huang authored Aug 08, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3336

Reviewed By: mthrok

Differential Revision: D47846814

Pulled By: huangruizhe

fbshipit-source-id: dc12362bf243c52222dccadec3176e25e43dd652


Add abstraction for download util (#1959) · 3f98fb96

Moto Hira authored Aug 07, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/1959

Reviewed By: hwangjeff

Differential Revision: D32078361

fbshipit-source-id: 50b56bac9593c36197998e89db19cd6d65b793cc


07 Aug, 2023 4 commits

Move alignment code to separate submodule (#3536) · 90143e96

moto authored Aug 07, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3536

Reviewed By: huangruizhe

Differential Revision: D48120170

Pulled By: mthrok

fbshipit-source-id: dec7575db07734490099b35a8bfc854252952c6e


Add MMS FA Bundle (#3521) · 5e211d66

moto authored Aug 07, 2023

Summary:
Port the MMS FA model from the tutorial to the library, along with a post-processing module.

Pull Request resolved: https://github.com/pytorch/audio/pull/3521

Reviewed By: huangruizhe

Differential Revision: D48038285

Pulled By: mthrok

fbshipit-source-id: 571cf0fceaaab4790983be2719f1a85805b814f5


Add merge_tokens / TokenSpan (#3535) · 30668afb

moto authored Aug 07, 2023

Summary:
This commit adds the `merge_tokens` function, which removes repeated tokens from CTC token sequences returned from `forced_align`.

Resolving repeated tokens is a necessary and almost universal step, so it makes sense to have such a helper function in torchaudio.
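
The idea can be illustrated with a small pure-Python sketch. It is not the torchaudio implementation: `Span` and `merge_repeated` are hypothetical names, the blank token is assumed to have ID 0, and the per-span score is assumed to be the mean of the frame scores.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Span:
    token: int    # token ID shared by every frame in the span
    start: int    # first frame index (inclusive)
    end: int      # last frame index (exclusive)
    score: float  # mean of the frame-level scores in the span


def merge_repeated(
    tokens: Sequence[int], scores: Sequence[float], blank: int = 0
) -> List[Span]:
    """Collapse runs of identical frame-level tokens and drop blank runs."""
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    spans: List[Span] = []
    start = 0
    for i in range(1, len(tokens) + 1):
        # A run ends at the end of the sequence or where the token changes.
        if i == len(tokens) or tokens[i] != tokens[start]:
            if tokens[start] != blank:
                mean = sum(scores[start:i]) / (i - start)
                spans.append(Span(tokens[start], start, i, mean))
            start = i
    return spans


if __name__ == "__main__":
    frame_tokens = [0, 1, 1, 0, 2, 2, 2, 0]
    frame_scores = [0.9, 0.8, 0.6, 0.9, 1.0, 0.5, 0.9, 0.7]
    for span in merge_repeated(frame_tokens, frame_scores):
        print(span)
```

Each resulting span covers the frames `[start, end)`, which can be converted to timestamps by multiplying by the frame duration.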

Pull Request resolved: https://github.com/pytorch/audio/pull/3535

Reviewed By: huangruizhe

Differential Revision: D48111202

Pulled By: mthrok

fbshipit-source-id: 25354bfa210aa5c03f8c1d3e201f253ca3761b24


Make target_lengths/input_lengths in forced_align optional (#3533) · cd80976e

moto authored Aug 07, 2023

Summary:
Currently, the `torchaudio.functional.forced_align` function requires full information on input/target lengths.
When performing non-batched alignment, these can be inferred from the size of the tensors.
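
A rough sketch of that inference, working on shapes only: it assumes emissions shaped `(batch, frames, classes)` and targets shaped `(batch, target_length)`, and `infer_lengths` is a hypothetical helper rather than part of torchaudio.

```python
from typing import List, Optional, Tuple


def infer_lengths(
    log_probs_shape: Tuple[int, int, int],
    targets_shape: Tuple[int, int],
    input_lengths: Optional[List[int]] = None,
    target_lengths: Optional[List[int]] = None,
) -> Tuple[List[int], List[int]]:
    """Fill in missing lengths from the shapes when the batch size is 1."""
    batch, num_frames, _ = log_probs_shape
    if input_lengths is None or target_lengths is None:
        # With padding in a real batch, the true lengths cannot be recovered
        # from the shape alone, so only the single-item case is supported.
        if batch != 1:
            raise ValueError("lengths can only be inferred when batch size is 1")
    if input_lengths is None:
        input_lengths = [num_frames]
    if target_lengths is None:
        target_lengths = [targets_shape[1]]
    return input_lengths, target_lengths


if __name__ == "__main__":
    print(infer_lengths((1, 50, 29), (1, 12)))
```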

Pull Request resolved: https://github.com/pytorch/audio/pull/3533

Reviewed By: nateanl

Differential Revision: D48111041

Pulled By: mthrok

fbshipit-source-id: fbf07124d3959c5cc5533dcd86296851587082fb


04 Aug, 2023 2 commits

Revise VGGish pipeline to accept arbitrary state dict function (#3531) · b976c8f1

Jeff Hwang authored Aug 04, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3531

Revises VGGish pipeline to accept arbitrary state dict function to accommodate loading weights from any source.

Reviewed By: mthrok

Differential Revision: D48056390

fbshipit-source-id: 2767699b58442ad132b518b4a6435f2772a637c3


Update ctc forced alignment tutorial (#3529) · b645c07b

moto authored Aug 04, 2023

Summary:
- Simplify the step to generate token-level alignment

Pull Request resolved: https://github.com/pytorch/audio/pull/3529

Reviewed By: huangruizhe

Differential Revision: D48066787

Pulled By: mthrok

fbshipit-source-id: 452c243d278e508926a59894928e280fea76dcc6


03 Aug, 2023 1 commit

Refactor wav2vec2 pipeline misc helper functions (#3527) · 09aabcc1

moto authored Aug 02, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3527

Reviewed By: huangruizhe

Differential Revision: D48008822

Pulled By: mthrok

fbshipit-source-id: 4beae2956dfd1f00534832b70a1bf0897cba7812
