Commits · c0f25f2161b518c641bbcf512b902995dd98f976 · OpenDAS / Torchaudio

14 Aug, 2023 4 commits

Update I/O and backend docs (#3555) · c0f25f21

moto authored Aug 14, 2023

Summary:
* Merge backend doc into torchaudio toplevel doc
* Update backend, dispatcher, installation doc

Pull Request resolved: https://github.com/pytorch/audio/pull/3555

Reviewed By: huangruizhe

Differential Revision: D48326812

Pulled By: mthrok

fbshipit-source-id: cc0d7326eacfebd341323b5d613ca1777255748b


Update integration test CI config (#3502) · 9d8f76d9

moto authored Aug 14, 2023

Summary:
Update the Ubuntu image so that CI is triggered.
There is an issue with FFmpeg 4 that prevents CI from succeeding.
This will be handled separately.

Pull Request resolved: https://github.com/pytorch/audio/pull/3502

Reviewed By: huangruizhe

Differential Revision: D48327431

Pulled By: mthrok

fbshipit-source-id: 5ea639f3e20c3aaf460e6030f6cb1ad2daa00172


Update ffmpeg pre-built binary to 4.4.4 (#3557) · a9e38e74

moto authored Aug 14, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3557

Reviewed By: huangruizhe

Differential Revision: D48326462

Pulled By: mthrok

fbshipit-source-id: c37ae38e28e4514ea284613636604a725829346d


Add default use_tmp_hub_dir value for integration tests (#3558) · d1d41fd3

Jeff Hwang authored Aug 14, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3558

In the event that `use_tmp_hub_dir` isn't specified as an option, pytest shouldn't fail. To resolve such failures, this PR modifies function `temp_hub_dir` to fall back on a default value of `False` for `use_tmp_hub_dir`.

Reviewed By: mthrok

Differential Revision: D48318947

fbshipit-source-id: 5dd692f9202ef37ec3e2c9ea39896156f928d693


11 Aug, 2023 3 commits

Expose AudioMetadata (#3556) · 9467fc44

moto authored Aug 11, 2023

Summary:
`torchaudio.info` returns `AudioMetaData`. It should be exposed as a public API, without referring to the `backend` submodule.

Pull Request resolved: https://github.com/pytorch/audio/pull/3556

Reviewed By: huangruizhe

Differential Revision: D48267349

Pulled By: mthrok

fbshipit-source-id: 6ccc0c32bf62fbdcb71495fc7d8d4cc29891538a


Revise VGGish pipeline test again (#3551) · f2b2f05a

Jeff Hwang authored Aug 10, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3551

Restores the VGGish pipeline test to be a function rather than a class.

Reviewed By: mthrok

Differential Revision: D48236197

fbshipit-source-id: 25ac19d87a7a0964a9c3f7552037cd6c21dc38a9


Support writing opus and mp3 with soundfile (#3554) · 9bd7ca51

moto authored Aug 10, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3554

Reviewed By: huangruizhe

Differential Revision: D48240906

Pulled By: mthrok

fbshipit-source-id: 1936757646f8ebba74e8b65e2ffe2a8b74fdfeeb


10 Aug, 2023 6 commits

Refactor _backend module (#3547) · 1e6a8f93

moto authored Aug 10, 2023

Summary:
* Move Backend implementations to separate files

Pull Request resolved: https://github.com/pytorch/audio/pull/3547

Reviewed By: hwangjeff

Differential Revision: D48233538

Pulled By: mthrok

fbshipit-source-id: bcc63fc07a5dfcd48929f0a2fb64bfcb3282eb92


Add Frechet distance function (#3545) · 06301c0a

Jeff Hwang authored Aug 10, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3545

Adds function for computing the Fréchet distance between two multivariate normal distributions.

Reviewed By: mthrok

Differential Revision: D48126102

fbshipit-source-id: e4e122b831e1e752037c03f5baa9451e81ef1697
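
For the Fréchet distance commit above, the closed form usually quoted for two multivariate normal distributions is ‖μx − μy‖² + Tr(Σx + Σy − 2(ΣxΣy)^(1/2)). The sketch below is plain Python, not the torchaudio function, and it assumes diagonal covariance matrices so that the trace term reduces to a per-dimension sum. The name `frechet_distance_diag` and its arguments are hypothetical.

```python
import math
from typing import Sequence


def frechet_distance_diag(
    mu_x: Sequence[float],
    var_x: Sequence[float],
    mu_y: Sequence[float],
    var_y: Sequence[float],
) -> float:
    """Fréchet distance between two Gaussians with diagonal covariances.

    var_x and var_y hold the diagonal entries (variances) of each
    covariance matrix. This is an illustrative special case only.
    """
    if not (len(mu_x) == len(var_x) == len(mu_y) == len(var_y)):
        raise ValueError("all inputs must have the same length")
    # Squared Euclidean distance between the means.
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_x, mu_y))
    # For diagonal covariances, Tr(Sx + Sy - 2 (Sx Sy)^(1/2)) is a sum
    # of vx + vy - 2 * sqrt(vx * vy) over the dimensions.
    cov_term = sum(
        vx + vy - 2.0 * math.sqrt(vx * vy) for vx, vy in zip(var_x, var_y)
    )
    return mean_term + cov_term


if __name__ == "__main__":
    print(frechet_distance_diag([0.0, 0.0], [1.0, 1.0], [3.0, 4.0], [1.0, 1.0]))
```

With full covariance matrices the square-root term needs a matrix square root (or the eigenvalues of ΣxΣy), which this sketch deliberately leaves out.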


[aarch64] Add aarch64 workflow (#3553) · 8d858c38

Mike Schneider authored Aug 10, 2023

Summary:
# Changes
* Adding workflow for building aarch64 wheels.

Pull Request resolved: https://github.com/pytorch/audio/pull/3553

Reviewed By: hwangjeff, osalpekar

Differential Revision: D48239384

Pulled By: atalman

fbshipit-source-id: dfa00edb3fee0acaf2b83fb420eaf12bddc6980e


Move backend initialization to toplevel (#3548) · 6fb21ab1

moto authored Aug 10, 2023

Summary:
The backend dispatcher is implemented in `torchaudio._backend`, while the legacy backend is implemented in `torchaudio.backend`.

The initialization happens in `torchaudio._backend`.
This commit moves it to `torchaudio.__init__`, so that `backend` and `_backend` are more independent.

Pull Request resolved: https://github.com/pytorch/audio/pull/3548

Reviewed By: huangruizhe

Differential Revision: D48219244

Pulled By: mthrok

fbshipit-source-id: e694cb232794f90902a60ee51c7bf11b7f0548a0


Fix SoundfileBackend method decorators (#3550) · 2d1138c5

moto authored Aug 10, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3550

Reviewed By: hwangjeff

Differential Revision: D48219176

Pulled By: mthrok

fbshipit-source-id: 4b11111dd3853cbef4ffe1859ec428ca05394824


Misc tutorial updates (#3546) · bc264256

moto authored Aug 10, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3546

Reviewed By: huangruizhe

Differential Revision: D48219274

Pulled By: mthrok

fbshipit-source-id: 6881f039bf70cf7240fbcfeb48443471ef457bd4


09 Aug, 2023 1 commit

Revise VGGish inference pipeline test (#3544) · 9f5fa84b

Jeff Hwang authored Aug 08, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3544

Revises VGGish inference pipeline test to support internal testing.

Reviewed By: mthrok

Differential Revision: D48058409

fbshipit-source-id: 045140a0e9d50128d32ef6510bdb2f642a365c83


08 Aug, 2023 6 commits

Updating CTC FA tutorial (#3542) · eab8aa74

moto authored Aug 08, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3542

Reviewed By: huangruizhe

Differential Revision: D48166025

Pulled By: mthrok

fbshipit-source-id: 29fee7dbf08394993972ec2967f94ce9fcb1c853


Add tutorial link to AVSR recipe (#3532) · f7ab406a

Pingchuan Ma authored Aug 08, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3532

Reviewed By: mthrok

Differential Revision: D48165499

Pulled By: mpc001

fbshipit-source-id: c87b3361f0e6282684f218b32888df883d56682b


Adopt MMS_FA bundle in multilingual FA tutorials (#3534) · 19e9046a

moto authored Aug 08, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3534

Reviewed By: huangruizhe

Differential Revision: D48155817

Pulled By: mthrok

fbshipit-source-id: a3d45fdfd360f9668063a3ecb3b00364290134c9


Fix FA bundle (#3538) · 7e85f625

moto authored Aug 08, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3538

Reviewed By: huangruizhe

Differential Revision: D48154056

Pulled By: mthrok

fbshipit-source-id: 72f58c501c5302d40f1d059f95bd6fe40d4a52aa


Librispeech RNNT recipe updates for PyTorch Lightning 2.0 (#3336) · e6c89731

Ruizhe (Ray) Huang authored Aug 08, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3336

Reviewed By: mthrok

Differential Revision: D47846814

Pulled By: huangruizhe

fbshipit-source-id: dc12362bf243c52222dccadec3176e25e43dd652


Add abstraction for download util (#1959) · 3f98fb96

Moto Hira authored Aug 07, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/1959

Reviewed By: hwangjeff

Differential Revision: D32078361

fbshipit-source-id: 50b56bac9593c36197998e89db19cd6d65b793cc


07 Aug, 2023 4 commits

Move alignment code to separate submodule (#3536) · 90143e96

moto authored Aug 07, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3536

Reviewed By: huangruizhe

Differential Revision: D48120170

Pulled By: mthrok

fbshipit-source-id: dec7575db07734490099b35a8bfc854252952c6e


Add MMS FA Bundle (#3521) · 5e211d66

moto authored Aug 07, 2023

Summary:
Port the MMS FA model from the tutorial to the library, with a post-processing module.

Pull Request resolved: https://github.com/pytorch/audio/pull/3521

Reviewed By: huangruizhe

Differential Revision: D48038285

Pulled By: mthrok

fbshipit-source-id: 571cf0fceaaab4790983be2719f1a85805b814f5


Add merge_tokens / TokenSpan (#3535) · 30668afb

moto authored Aug 07, 2023

Summary:
This commit adds the `merge_tokens` function, which removes repeated tokens from CTC token sequences returned from `forced_align`.

Resolving repeated tokens is a necessary and almost universal step, so it makes sense to have such a helper function in torchaudio.

Pull Request resolved: https://github.com/pytorch/audio/pull/3535

Reviewed By: huangruizhe

Differential Revision: D48111202

Pulled By: mthrok

fbshipit-source-id: 25354bfa210aa5c03f8c1d3e201f253ca3761b24
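
To illustrate the repeated-token step described in the `merge_tokens` commit above, here is a minimal plain-Python sketch. It is not the torchaudio implementation: `Span` and `merge_repeats` are hypothetical names, the blank index is assumed to be `0`, and per-frame scores are left out.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List, Sequence


@dataclass
class Span:
    token: int  # token index
    start: int  # first frame of the run (inclusive)
    end: int    # frame after the last frame of the run (exclusive)


def merge_repeats(tokens: Sequence[int], blank: int = 0) -> List[Span]:
    """Collapse consecutive repeats of a frame-level token sequence.

    Runs of the blank token are dropped. Two runs of the same token that
    are separated by a blank stay separate, as in CTC decoding.
    """
    spans: List[Span] = []
    frame = 0
    for token, run in groupby(tokens):
        length = sum(1 for _ in run)
        if token != blank:
            spans.append(Span(token, frame, frame + length))
        frame += length
    return spans


if __name__ == "__main__":
    for span in merge_repeats([0, 3, 3, 0, 3, 5, 5, 5, 0]):
        print(span)
```

Each resulting span records which frames a token occupies, which is the information needed to turn frame-level alignment into token-level timestamps.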


Make target_lengths/input_lengths in forced_align optional (#3533) · cd80976e

moto authored Aug 07, 2023

Summary:
Currently, the `torchaudio.functional.forced_align` function requires full information on input/target lengths.
When performing non-batched alignment, these can be inferred from the sizes of the input Tensors.

Pull Request resolved: https://github.com/pytorch/audio/pull/3533

Reviewed By: nateanl

Differential Revision: D48111041

Pulled By: mthrok

fbshipit-source-id: fbf07124d3959c5cc5533dcd86296851587082fb
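
The length inference described in the `forced_align` commit above can be sketched as follows. This is plain Python over nested lists rather than Tensors, and it is an illustration of the idea, not torchaudio's code: `resolve_lengths` is a hypothetical helper, and the (batch, frames, classes) and (batch, target_len) layouts are assumptions.

```python
from typing import List, Optional, Sequence, Tuple


def resolve_lengths(
    emission: Sequence[Sequence[Sequence[float]]],  # (batch, frames, classes)
    targets: Sequence[Sequence[int]],               # (batch, target_len)
    input_lengths: Optional[List[int]] = None,
    target_lengths: Optional[List[int]] = None,
) -> Tuple[List[int], List[int]]:
    """Fill in missing lengths when the batch holds a single item."""
    if input_lengths is None:
        if len(emission) != 1:
            raise ValueError("input_lengths is required for batched input")
        # With one item there is no padding, so every frame is valid.
        input_lengths = [len(emission[0])]
    if target_lengths is None:
        if len(targets) != 1:
            raise ValueError("target_lengths is required for batched input")
        target_lengths = [len(targets[0])]
    return input_lengths, target_lengths


if __name__ == "__main__":
    emission = [[[0.0] * 4 for _ in range(6)]]  # 1 item, 6 frames, 4 classes
    targets = [[1, 2, 3]]
    print(resolve_lengths(emission, targets))
```

Inference is only safe for a single item because a batch may contain padding, in which case the tensor size no longer equals each item's true length.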


04 Aug, 2023 2 commits

Revise VGGish pipeline to accept arbitrary state dict function (#3531) · b976c8f1

Jeff Hwang authored Aug 04, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3531

Revises VGGish pipeline to accept arbitrary state dict function to accommodate loading weights from any source.

Reviewed By: mthrok

Differential Revision: D48056390

fbshipit-source-id: 2767699b58442ad132b518b4a6435f2772a637c3


Update ctc forced alignment tutorial (#3529) · b645c07b

moto authored Aug 04, 2023

Summary:
- Simplify the step to generate token-level alignment

Pull Request resolved: https://github.com/pytorch/audio/pull/3529

Reviewed By: huangruizhe

Differential Revision: D48066787

Pulled By: mthrok

fbshipit-source-id: 452c243d278e508926a59894928e280fea76dcc6


03 Aug, 2023 2 commits

Refactor wav2vec2 pipeline misc helper functions (#3527) · 09aabcc1

moto authored Aug 02, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3527

Reviewed By: huangruizhe

Differential Revision: D48008822

Pulled By: mthrok

fbshipit-source-id: 4beae2956dfd1f00534832b70a1bf0897cba7812


Relax Conformer RNN-T numerical parity tests (#3525) · 72b0917d

hwangjeff authored Aug 02, 2023

Summary:
Increases numerical tolerance on Conformer RNN-T TorchScript consistency tests to resolve CI test failures.

Pull Request resolved: https://github.com/pytorch/audio/pull/3525

Reviewed By: mthrok

Differential Revision: D48000613

Pulled By: hwangjeff

fbshipit-source-id: 1d35ba58055a8346dc40e2b67f37ccfd2e015894


02 Aug, 2023 1 commit

Fix save INT16 sox backend (#3524) · 3f9b5171

moto authored Aug 02, 2023

Summary:
When passing an int16 tensor to `save(backend="sox")`, the resulting file should be 16-bit signed PCM, but it is instead 32-bit signed PCM.

Resolves https://github.com/pytorch/audio/issues/3304

Pull Request resolved: https://github.com/pytorch/audio/pull/3524

Reviewed By: huangruizhe

Differential Revision: D47941090

Pulled By: mthrok

fbshipit-source-id: 2622b31eb1cbf03969f67ab2b2adec6e2ba677c4
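
One way to check the bit depth of a PCM WAV file, such as the one discussed in the INT16 fix above, is to read its header with Python's standard `wave` module. The sketch below writes a small 16-bit file itself so that it is self-contained and does not call torchaudio; `write_int16_wav` and `bits_per_sample` are hypothetical helper names.

```python
import os
import struct
import tempfile
import wave
from typing import Sequence


def write_int16_wav(path: str, samples: Sequence[int], sample_rate: int = 16000) -> None:
    """Write mono 16-bit signed PCM samples to a WAV file."""
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16-bit
        f.setframerate(sample_rate)
        f.writeframes(struct.pack(f"<{len(samples)}h", *samples))


def bits_per_sample(path: str) -> int:
    """Return the bit depth recorded in a PCM WAV header."""
    with wave.open(path, "rb") as f:
        return 8 * f.getsampwidth()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "example.wav")
    write_int16_wav(path, [0, 1000, -1000, 32767, -32768])
    print(bits_per_sample(path))
```

Pointing `bits_per_sample` at a file written from an int16 tensor would show whether it was stored as 16-bit or 32-bit PCM.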


01 Aug, 2023 3 commits

Add cuctc tutorial, change blank skip threshold into prob (#3297) · 732c94a3

Yuekai Zhang authored Aug 01, 2023

Summary:
Add a separate tutorial for cuctc.
Resolves https://github.com/pytorch/audio/issues/3096

Pull Request resolved: https://github.com/pytorch/audio/pull/3297

Reviewed By: huangruizhe

Differential Revision: D47928400

Pulled By: mthrok

fbshipit-source-id: 8c16492fb4d007b6ea7969ba77c866a51749c0ec


Migrate weight_norm (#3523) · 144cfcfc

moto authored Aug 01, 2023

Summary:
`torch.nn.utils.weight_norm` is deprecated.
This commit replaces it with the new API.

Pull Request resolved: https://github.com/pytorch/audio/pull/3523

Reviewed By: huangruizhe

Differential Revision: D47932384

Pulled By: mthrok

fbshipit-source-id: 344abfa12bd11da779f7fd13b74a1e009a582b52


Add pretrained VGGish inference pipeline (#3491) · cbfde17b

hwangjeff authored Jul 31, 2023

Summary:
Adds pre-trained VGGish inference pipeline ported from https://github.com/harritaylor/torchvggish and https://github.com/tensorflow/models/tree/master/research/audioset.

Pull Request resolved: https://github.com/pytorch/audio/pull/3491

Reviewed By: mthrok

Differential Revision: D47738130

Pulled By: hwangjeff

fbshipit-source-id: 859c1ff1ec1b09dae4e26586169544571657cc67


31 Jul, 2023 2 commits

Migrate torch.norm to torch.linalg.vector_norm (#3522) · 8a2e12d3

moto authored Jul 31, 2023

Summary:
`torch.norm` is now deprecated.
The usages in torchaudio seem to be vector norms, so this commit replaces them with `torch.linalg.vector_norm`.

Resolves https://github.com/pytorch/audio/issues/3484

Pull Request resolved: https://github.com/pytorch/audio/pull/3522

Reviewed By: huangruizhe

Differential Revision: D47926659

Pulled By: mthrok

fbshipit-source-id: f7428cf0168109a3d340b8784adc99bb5f781084


Set and tweak global matplotlib configuration in tutorials (#3515) · 84b12306

moto authored Jul 31, 2023

Summary:
- Set global matplotlib rc params
- Fix style check
- Fix and update FA tutorial plots
- Add av-asr index cards

Pull Request resolved: https://github.com/pytorch/audio/pull/3515

Reviewed By: huangruizhe

Differential Revision: D47894156

Pulled By: mthrok

fbshipit-source-id: b40d8d31f12ffc2b337e35e632afc216e9d59a6e


29 Jul, 2023 1 commit

Refactor compat (#3518) · 8497ee91

moto authored Jul 29, 2023

Summary:
The I/O functions in the `_compat` module were placed there so that
everything related to FFmpeg lived in `torchaudio.io` and the FFmpeg library
initialization could be carried out in `torchaudio.io.__init__`.

Now that this constraint is removed (all the initialization happens
in `torchaudio._extension.__init__`) and `_compat` is only used by
the FFmpeg dispatcher backend, we move the module to `torchaudio._backend`
for better locality.

Pull Request resolved: https://github.com/pytorch/audio/pull/3518

Reviewed By: huangruizhe

Differential Revision: D47877412

Pulled By: mthrok

fbshipit-source-id: aa18c8cb6e5d5360950df5158c33c653e37c565f


28 Jul, 2023 5 commits

Amend amp_to_db docstring (#3519) · 61cbf791

moto authored Jul 28, 2023

Summary:
Context: https://github.com/pytorch/audio/issues/3448

The documentation of `amplitude_to_DB` is ambiguous about how cut-off values are computed when the input tensor is 3D.

This commit clarifies that.

Closes: https://github.com/pytorch/audio/issues/3448

Pull Request resolved: https://github.com/pytorch/audio/pull/3519

Reviewed By: huangruizhe

Differential Revision: D47875505

Pulled By: mthrok

fbshipit-source-id: e06bb997e7a27e2abe35c8e2ac91ddfbded4e641
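
To see why the grouping matters for the `amplitude_to_DB` docstring change above: a `top_db` cut-off clamps each value to at most `top_db` decibels below a peak, so the result depends on which values share that peak. The sketch below is plain Python, not torchaudio code. `clamp_top_db` is a hypothetical helper, the numbers are made up, and the sketch does not say which grouping torchaudio applies to 3D input.

```python
from typing import List, Sequence


def clamp_top_db(values_db: Sequence[float], top_db: float) -> List[float]:
    """Clamp dB values so none falls more than top_db below the group's peak."""
    floor = max(values_db) - top_db
    return [max(v, floor) for v in values_db]


# Two channels of dB values: one loud, one quiet.
loud = [0.0, -30.0, -100.0]
quiet = [-50.0, -90.0, -140.0]
top_db = 80.0

# Option A: each channel is clamped against its own peak.
per_channel = [clamp_top_db(loud, top_db), clamp_top_db(quiet, top_db)]

# Option B: both channels are clamped against one shared peak.
flat = clamp_top_db(loud + quiet, top_db)
joint = [flat[: len(loud)], flat[len(loud):]]

if __name__ == "__main__":
    print("per channel:", per_channel)
    print("joint:      ", joint)
```

The quiet channel keeps more of its low-level detail under option A than under option B, which is the kind of difference a docstring needs to spell out.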


Remove ffmpeg fallback from sox_io backend (#3516) · 2c8665de

moto authored Jul 28, 2023

Summary:
In https://github.com/pytorch/audio/issues/2419, we added ffmpeg as a fallback for the sox_io backend. This was a workaround for the issue caused by the removal of libmad.

Now that we have introduced the `backend` argument to I/O functions, and libsox integration has moved to dynamic binding, where users can use libsox with libmad integration, we no longer need the workaround.

This commit is based on reverting https://github.com/pytorch/audio/issues/2416 (fd7ace17).

Pull Request resolved: https://github.com/pytorch/audio/pull/3516

Reviewed By: huangruizhe

Differential Revision: D47855272

Pulled By: mthrok

fbshipit-source-id: 5af73af7865f6e545ccb052d478e86588ff2a014


Update documentation about dependencies (#3517) · a051985f

moto authored Jul 28, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3517

Reviewed By: huangruizhe

Differential Revision: D47858452

Pulled By: mthrok

fbshipit-source-id: 62ee6c8bb2669dd70f8ca25703a04dc8a9d19aec


Move TorchAudio-Squim models to Beta (#3512) · b7d2d928

Zhaoheng Ni authored Jul 28, 2023

Summary:
This PR moves the `SquimObjective` and `SquimSubjective` models, along with the corresponding factory functions and pre-trained pipelines, out of prototype and into the core directory. They will be included in the next official release.

Pull Request resolved: https://github.com/pytorch/audio/pull/3512

Reviewed By: mthrok

Differential Revision: D47837434

Pulled By: nateanl

fbshipit-source-id: d0639f29079f7e1afc30f236849e530c8cadffd8


Add real-time av-asr tutorial (#3511) · d6aeaa74

Pingchuan Ma authored Jul 28, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3511

Reviewed By: mthrok

Differential Revision: D47852108

Pulled By: mpc001

fbshipit-source-id: c0ecb4b5bcc8670013dcbe1164e3929f5793c8aa
