Commits · d4cf8d549734bf8fde04dedf8cbe2cee71a6ff6e · OpenDAS / Torchaudio

09 Nov, 2023 1 commit
- Update version matrix (#3690) · d4cf8d54
  moto authored Nov 09, 2023
  
  d4cf8d54
31 Oct, 2023 1 commit
- Update CITATION (#3687) · c5b69336
  moto authored Oct 31, 2023
  
  c5b69336
26 Oct, 2023 5 commits
- Fix doc config (#3683) · 6e265157
  moto authored Oct 26, 2023
  
  6e265157
- Update StreamReader/Writer name · fcf38946
  moto-meta authored Oct 26, 2023
```
Differential Revision: D50696105

Pull Request resolved: https://github.com/pytorch/audio/pull/3682
```
  fcf38946
- Swap decoder/encoder implementation · 36f5010b
  moto-meta authored Oct 26, 2023
```
Differential Revision: D50677606

Pull Request resolved: https://github.com/pytorch/audio/pull/3681
```
  36f5010b
- Fix doc on FA (#3679) · 2a0f4c06
  moto authored Oct 25, 2023
  
  2a0f4c06
- Fix doc (#3678) · 3ff5e8c8
  moto authored Oct 25, 2023
  
  3ff5e8c8
24 Oct, 2023 2 commits
- Update C++ API doc (#3671) · 1caa3fcc
  moto authored Oct 24, 2023
  
  1caa3fcc
- Change namespace to torio · a78ba389
  moto-meta authored Oct 24, 2023
```
Differential Revision: D50506299

Pull Request resolved: https://github.com/pytorch/audio/pull/3669
```
  a78ba389
13 Oct, 2023 1 commit

Add Ray Tracing (#3604) (#2850) (#3655) · fa78fb64

moto authored Oct 13, 2023

Summary:
Revamped version of https://github.com/pytorch/audio/pull/3234
(which was also a revamp of https://github.com/pytorch/audio/pull/2850)

fa78fb64

11 Oct, 2023 1 commit

Move libtorchaudio_ffmpeg to dedicated directory · 2836a23d

moto-meta authored Oct 11, 2023

Differential Revision: D50082877

Pull Request resolved: https://github.com/pytorch/audio/pull/3646

2836a23d

09 Oct, 2023 1 commit

Fix breadcrumbs for v2.1 · a8bb3973

Carl Parker authored Oct 09, 2023

Differential Revision: D50036850

Pull Request resolved: https://github.com/pytorch/audio/pull/3637

a8bb3973

19 Sep, 2023 1 commit

Fix doc nightly doc CI (#3611) · ac63c454

moto authored Sep 19, 2023

Some changes in matplotlib 3.8.0 reject `torch.Tensor` objects passed to the `plot` function.

ac63c454

04 Sep, 2023 2 commits

[BC-Breaking] Remove legacy global backend switch (#3559) · 454418d2

moto authored Sep 04, 2023

Summary:
This PR removes the legacy backend switch mechanism.
The implementation itself is still available.

Merge after v2.1 release

Pull Request resolved: https://github.com/pytorch/audio/pull/3559

Reviewed By: nateanl

Differential Revision: D48353764

Pulled By: mthrok

fbshipit-source-id: 4d3924dbe6f334ecebe2b12fcd4591c61c4aa656

454418d2

Fix doc link (#3593) · 3e7e696c

moto authored Sep 04, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3593

Reviewed By: nateanl

Differential Revision: D48933041

Pulled By: mthrok

fbshipit-source-id: cd05d3cf5006206ba441fdc05548bcd922ce0598

3e7e696c

20 Aug, 2023 1 commit

Add detail about CTC peaky behavior (#3566) · a25bcb6b

moto authored Aug 20, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3566

Reviewed By: huangruizhe

Differential Revision: D48499338

Pulled By: mthrok

fbshipit-source-id: 7f837e1a1f8116d7d82411607c91628b729077d8

a25bcb6b

15 Aug, 2023 1 commit

[BC-breaking] Update pre-built ffmpeg4 to 4.4.4 (#3561) · bf07ea6b

moto authored Aug 15, 2023

Summary:
In https://github.com/pytorch/audio/pull/3460, we switched the build process for FFmpeg extension.
Since it is complicated to install FFmpeg in some environments, pre-built binaries and their headers
are downloaded at build time and used as scaffolding for the torchaudio build.

Even though we did not change any code or the FFmpeg version, it turned out that this causes a segmentation
fault on Ubuntu when using system Python and FFmpeg 4.4 installed via aptitude.
While investigating the issue, I swapped the pre-built FFmpeg scaffolding with FFmpeg 4.4 from aptitude,
and the segmentation fault did not happen. This indicates that it is a binary compatibility issue.

Before https://github.com/pytorch/audio/issues/3460, each binary build job was building FFmpeg 4.1.8 using the same compiler used to build torchaudio,
but after https://github.com/pytorch/audio/issues/3460 the environments to build FFmpeg 4.1.8 and torchaudio are different. My hypothesis is that
this difference is causing some ABI incompatibility when linking against FFmpeg 4.4. (Also, I don't remember well,
but I read somewhere that 4.4 has a different ABI)

Through experiments, it turned out that upgrading the pre-built FFmpeg scaffolding to 4.4 resolves this.
So this commit upgrades the pre-built FFmpeg 4 to 4.4.
The potential (yet unconfirmed) downside is that torchaudio will no longer work with 4.1, 4.2, and 4.3.
Since FFmpeg 4.4 is what Ubuntu 20.04 and 22.04 support by default, and Google Colab is also on 20.04,
I think it is more important to support 4.4.

Therefore we drop support for 4.1-4.3 from the normal build (and official distributions). Those who wish to
use 4.1-4.3 can build torchaudio from source, linking against a specific FFmpeg version.

Pull Request resolved: https://github.com/pytorch/audio/pull/3561

Reviewed By: hwangjeff

Differential Revision: D48340201

Pulled By: mthrok

fbshipit-source-id: 7ece82910f290c7cf83f58311c4cf6a384e8795e

bf07ea6b

14 Aug, 2023 1 commit

Update I/O and backend docs (#3555) · c0f25f21

moto authored Aug 14, 2023

Summary:
* Merge backend doc into torchaudio toplevel doc
* Update backend, dispatcher, installation doc

Pull Request resolved: https://github.com/pytorch/audio/pull/3555

Reviewed By: huangruizhe

Differential Revision: D48326812

Pulled By: mthrok

fbshipit-source-id: cc0d7326eacfebd341323b5d613ca1777255748b

c0f25f21

11 Aug, 2023 1 commit

Expose AudioMetadata (#3556) · 9467fc44

moto authored Aug 11, 2023

Summary:
`torchaudio.info` returns `AudioMetaData`. It should be exposed as a public API, without referring to the `backend` submodule.

Pull Request resolved: https://github.com/pytorch/audio/pull/3556

Reviewed By: huangruizhe

Differential Revision: D48267349

Pulled By: mthrok

fbshipit-source-id: 6ccc0c32bf62fbdcb71495fc7d8d4cc29891538a

9467fc44

10 Aug, 2023 1 commit

Add Frechet distance function (#3545) · 06301c0a

Jeff Hwang authored Aug 10, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3545

Adds function for computing the Fréchet distance between two multivariate normal distributions.
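
For reference, the Fréchet distance between two Gaussians has a closed form. The sketch below is not the torchaudio implementation: it assumes diagonal covariances, so the matrix square root reduces to element-wise square roots, and it returns the squared form commonly used in FID-style metrics. The function name `frechet_distance_diag` is hypothetical.

```python
import math
from typing import Sequence


def frechet_distance_diag(
    mu_x: Sequence[float],
    var_x: Sequence[float],
    mu_y: Sequence[float],
    var_y: Sequence[float],
) -> float:
    """Squared Frechet distance between N(mu_x, diag(var_x)) and N(mu_y, diag(var_y)).

    General form: ||mu_x - mu_y||^2 + Tr(S_x + S_y - 2 (S_x S_y)^(1/2)).
    With diagonal covariances the trace term becomes sum((sd_x - sd_y)^2).
    """
    if not (len(mu_x) == len(var_x) == len(mu_y) == len(var_y)):
        raise ValueError("all inputs must have the same length")
    # Squared Euclidean distance between the means.
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_x, mu_y))
    # Trace term, simplified under the diagonal-covariance assumption.
    cov_term = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(var_x, var_y))
    return mean_term + cov_term
```

Identical distributions give a distance of zero; full (non-diagonal) covariances need a proper matrix square root, which this sketch deliberately leaves out.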

Reviewed By: mthrok

Differential Revision: D48126102

fbshipit-source-id: e4e122b831e1e752037c03f5baa9451e81ef1697

06301c0a

07 Aug, 2023 2 commits

Add MMS FA Bundle (#3521) · 5e211d66

moto authored Aug 07, 2023

Summary:
Port the MMS FA model from the tutorial to the library, along with a post-processing module.

Pull Request resolved: https://github.com/pytorch/audio/pull/3521

Reviewed By: huangruizhe

Differential Revision: D48038285

Pulled By: mthrok

fbshipit-source-id: 571cf0fceaaab4790983be2719f1a85805b814f5

5e211d66

Add merge_tokens / TokenSpan (#3535) · 30668afb

moto authored Aug 07, 2023

Summary:
This commit adds the `merge_tokens` function, which removes repeated tokens from CTC token sequences returned from `forced_align`.

Resolving repeated tokens is a necessary and almost universal step, so it makes sense to have such a helper function in torchaudio.
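
To illustrate what resolving repeated tokens involves, here is a minimal pure-Python sketch. It is not the torchaudio implementation: `Span` and `merge_repeats` are hypothetical names, and averaging the per-frame scores within each run is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Span:
    token: int
    start: int  # first frame of the run (inclusive)
    end: int  # one past the last frame of the run (exclusive)
    score: float  # mean of the per-frame scores in the run


def merge_repeats(tokens: Sequence[int], scores: Sequence[float], blank: int = 0) -> List[Span]:
    """Collapse consecutive identical tokens into spans and drop blank runs."""
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    spans: List[Span] = []
    start = 0
    for i in range(1, len(tokens) + 1):
        # A run ends at the end of the sequence or where the token changes.
        if i == len(tokens) or tokens[i] != tokens[start]:
            if tokens[start] != blank:
                run = scores[start:i]
                spans.append(Span(tokens[start], start, i, sum(run) / len(run)))
            start = i
    return spans
```

For a frame-level sequence such as `[0, 1, 1, 0, 2, 2, 2]` with blank `0`, this yields one span for token `1` covering frames 1-2 and one for token `2` covering frames 4-6. The same token separated by a blank stays as two spans, which matches CTC semantics.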

Pull Request resolved: https://github.com/pytorch/audio/pull/3535

Reviewed By: huangruizhe

Differential Revision: D48111202

Pulled By: mthrok

fbshipit-source-id: 25354bfa210aa5c03f8c1d3e201f253ca3761b24

30668afb

03 Aug, 2023 1 commit

Refactor wav2vec2 pipeline misc helper functions (#3527) · 09aabcc1

moto authored Aug 02, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3527

Reviewed By: huangruizhe

Differential Revision: D48008822

Pulled By: mthrok

fbshipit-source-id: 4beae2956dfd1f00534832b70a1bf0897cba7812

09aabcc1

01 Aug, 2023 2 commits

Add cuctc tutorial, change blank skip threshold into prob (#3297) · 732c94a3

Yuekai Zhang authored Aug 01, 2023

Summary:
Add a separate tutorial for cuctc.
Resolve https://github.com/pytorch/audio/issues/3096

Pull Request resolved: https://github.com/pytorch/audio/pull/3297

Reviewed By: huangruizhe

Differential Revision: D47928400

Pulled By: mthrok

fbshipit-source-id: 8c16492fb4d007b6ea7969ba77c866a51749c0ec

732c94a3

Add pretrained VGGish inference pipeline (#3491) · cbfde17b

hwangjeff authored Jul 31, 2023

Summary:
Adds pre-trained VGGish inference pipeline ported from https://github.com/harritaylor/torchvggish and https://github.com/tensorflow/models/tree/master/research/audioset.

Pull Request resolved: https://github.com/pytorch/audio/pull/3491

Reviewed By: mthrok

Differential Revision: D47738130

Pulled By: hwangjeff

fbshipit-source-id: 859c1ff1ec1b09dae4e26586169544571657cc67

cbfde17b

31 Jul, 2023 1 commit

Set and tweak global matplotlib configuration in tutorials (#3515) · 84b12306

moto authored Jul 31, 2023

Summary:
- Set global matplotlib rc params
- Fix style check
- Fix and update FA tutorial plots
- Add av-asr index cards

Pull Request resolved: https://github.com/pytorch/audio/pull/3515

Reviewed By: huangruizhe

Differential Revision: D47894156

Pulled By: mthrok

fbshipit-source-id: b40d8d31f12ffc2b337e35e632afc216e9d59a6e

84b12306

28 Jul, 2023 3 commits

Update documentation about dependencies (#3517) · a051985f

moto authored Jul 28, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3517

Reviewed By: huangruizhe

Differential Revision: D47858452

Pulled By: mthrok

fbshipit-source-id: 62ee6c8bb2669dd70f8ca25703a04dc8a9d19aec

a051985f

Move TorchAudio-Squim models to Beta (#3512) · b7d2d928

Zhaoheng Ni authored Jul 28, 2023

Summary:
This PR moves the `SquimObjective` and `SquimSubjective` models, along with the corresponding factory functions and pre-trained pipelines, out of prototype and into the core directory. They will be included in the next official release.

Pull Request resolved: https://github.com/pytorch/audio/pull/3512

Reviewed By: mthrok

Differential Revision: D47837434

Pulled By: nateanl

fbshipit-source-id: d0639f29079f7e1afc30f236849e530c8cadffd8

b7d2d928

Add real-time av-asr tutorial (#3511) · d6aeaa74

Pingchuan Ma authored Jul 28, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3511

Reviewed By: mthrok

Differential Revision: D47852108

Pulled By: mpc001

fbshipit-source-id: c0ecb4b5bcc8670013dcbe1164e3929f5793c8aa

d6aeaa74

27 Jul, 2023 1 commit

Replace libsox with stub library (#3497) · 8588fba1

moto authored Jul 27, 2023

Summary:
This commit updates the way libsox is integrated into torchaudio.

1. We stop statically linking libsox, so torchaudio will not ship libsox.
2. We link libsox dynamically. Users are expected to install libsox by themselves.
3. We use a stub library to build torchaudio.
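
Since libsox is now resolved at runtime, it can be useful to check whether the system can locate it. The sketch below uses only the Python standard library and is not how torchaudio itself loads libsox; the helper names `find_libsox` and `libsox_available` are hypothetical.

```python
import ctypes.util
from typing import Optional


def find_libsox() -> Optional[str]:
    """Return the name or path of libsox if the platform's library search finds it."""
    # find_library takes the name without the "lib" prefix or file extension.
    return ctypes.util.find_library("sox")


def libsox_available() -> bool:
    """True if a shared libsox appears to be installed on this system."""
    return find_libsox() is not None
```

A `False` result only means the standard search did not find the library; a libsox installed in a non-standard location may still be loadable through other means.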

Pull Request resolved: https://github.com/pytorch/audio/pull/3497

Differential Revision: D47803706

Pulled By: mthrok

fbshipit-source-id: 31b05495d81069186fa52d67beea360cc7e817a8

8588fba1

25 Jul, 2023 2 commits

Update nvdec/nvenc tutorials (#3483) · 56e22664

moto authored Jul 25, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3483

Differential Revision: D47725664

Pulled By: mthrok

fbshipit-source-id: e4249e1488fa7af8670be4a5077957912ff3420b

56e22664

Update AV-ASR recipe link to index.rst. (#3492) · ae8c131e

Pingchuan Ma authored Jul 25, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3492

Reviewed By: mthrok

Differential Revision: D47755638

Pulled By: mpc001

fbshipit-source-id: 729efdb2a69b5656dbc0b70dd623c1509123d3aa

ae8c131e

18 Jul, 2023 1 commit

Extract NVDEC tutorial from the current notebook (#3478) · 63244623

moto authored Jul 17, 2023

Summary:
Now that GPU video decoders are available in doc CI, we run the tutorials with GPU decoders.

Pull Request resolved: https://github.com/pytorch/audio/pull/3478

Differential Revision: D47519672

Pulled By: mthrok

fbshipit-source-id: 2f95243100e9c75e17c2b4d306da164f0e31f8f2

63244623

15 Jul, 2023 1 commit

Update notes on FFmpeg version (#3480) · 5a809aa0

moto authored Jul 15, 2023

Summary:
The nightly builds support FFmpeg versions 4, 5, and 6.

Pull Request resolved: https://github.com/pytorch/audio/pull/3480

Differential Revision: D47482841

Pulled By: mthrok

fbshipit-source-id: 88267f5e83ddc7b1e866b35e57a87b985e2c78c9

5a809aa0

12 Jul, 2023 1 commit

Support multiple FFmpeg versions (#3464) · 786066b4

moto authored Jul 11, 2023

Summary:
This commit introduces support for multiple FFmpeg versions for OSS binary distributions.

Currently torchaudio only works with FFmpeg 4. This is inconvenient at every stage, from installation to runtime linking.
This commit allows picking FFmpeg 4, 5, or 6 at runtime, instead of just looking for v4.

The way it works is that we compile the FFmpeg extension three times, once against each FFmpeg version, and ship all of them.
At runtime, we look for a specific version of libavutil and, when one is found, load the corresponding FFmpeg extension.
The order of preference is 6, 5, then 4.
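
The lookup described above can be sketched roughly as follows. This is an illustration, not the actual torchaudio loader: the names `LIBAVUTIL_MAJOR`, `_probe_linux`, and `pick_ffmpeg_version` are hypothetical, and the default probe assumes Linux-style sonames. The mapping relies on FFmpeg 4, 5, and 6 shipping libavutil major versions 56, 57, and 58 respectively.

```python
import ctypes
from typing import Callable, Optional

# FFmpeg major version -> libavutil major version.
LIBAVUTIL_MAJOR = {6: 58, 5: 57, 4: 56}


def _probe_linux(avutil_major: int) -> bool:
    """Return True if libavutil with the given major version can be loaded (Linux soname)."""
    try:
        ctypes.CDLL(f"libavutil.so.{avutil_major}")
    except OSError:
        return False
    return True


def pick_ffmpeg_version(probe: Callable[[int], bool] = _probe_linux) -> Optional[int]:
    """Return the preferred available FFmpeg major version, or None if none is found."""
    # Order of preference: 6, then 5, then 4.
    for ffmpeg_major in (6, 5, 4):
        if probe(LIBAVUTIL_MAJOR[ffmpeg_major]):
            return ffmpeg_major
    return None
```

Passing the probe as a parameter keeps the preference logic testable without FFmpeg installed, for example `pick_ffmpeg_version(lambda major: major == 57)` selects version 5.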

To make the build process simple and reproducible, we use pre-built binaries of FFmpeg during the build.
They are LGPL and downloaded from S3 at build time, instead of building every time.

The use of pre-built binaries as scaffolding limits the systems that can build torchaudio, so this commit also introduces
a single FFmpeg version support mode. Setting FFMPEG_ROOT during the build changes the way binaries are built
so that the result supports only one specific version of FFmpeg.

Pull Request resolved: https://github.com/pytorch/audio/pull/3464

Differential Revision: D47300223

Pulled By: mthrok

fbshipit-source-id: 560c7968315e4c8922afa11a4693f648c0356d04

786066b4

11 Jul, 2023 1 commit

Update doc analytics (#3469) · 216146ab

moto authored Jul 11, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3469

Differential Revision: D47368140

Pulled By: mthrok

fbshipit-source-id: d82ddb91ae1f6612298486fb8401f95c48db5620

216146ab

28 Jun, 2023 1 commit

include a link to index.rst (#3441) · a8ce4a87

Pingchuan Ma authored Jun 28, 2023

Summary:
Include links to the Conformer/Emformer RNN-T ASR/VSR/AV-ASR recipes in index.rst

Pull Request resolved: https://github.com/pytorch/audio/pull/3441

Differential Revision: D47094158

Pulled By: mthrok

fbshipit-source-id: 9ab42ac2bf52a5ce488003897ffba2f10a6ca941

a8ce4a87

21 Jun, 2023 2 commits

Introduce chroma spectrogram transform (#3427) · 70968293

Jeff Hwang authored Jun 21, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3427

Adds transform `ChromaSpectrogram` for generating chromagrams from waveforms as well as transform `ChromaScale` for generating chromagrams from linear-frequency spectrograms.
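
As background, a chromagram folds spectral energy into the 12 pitch classes of the octave. The sketch below shows only that folding idea for a single frequency, using nearest-semitone rounding in equal temperament; it is not the `ChromaSpectrogram` or `ChromaScale` implementation, which would apply a filter bank over spectrogram bins. The names `PITCH_CLASSES` and `pitch_class` are hypothetical.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class(freq_hz: float, a4_hz: float = 440.0) -> int:
    """Map a frequency to its nearest pitch class index (0 = C, ..., 11 = B)."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    # MIDI note number in 12-tone equal temperament, with A4 = note 69.
    midi = 69 + 12 * math.log2(freq_hz / a4_hz)
    # Octaves are 12 semitones apart, so they collapse to the same index.
    return round(midi) % 12
```

Frequencies an octave apart, such as 440 Hz and 880 Hz, map to the same index, which is the property a chromagram relies on.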

Reviewed By: mthrok

Differential Revision: D46547418

fbshipit-source-id: 250f298b8e11d8cf82f05536c29d51cf8d77a960

70968293

Split the CTC forced alignment API tutorial into two tutorials (#3443) · 627c37a9

Xiaohui Zhang authored Jun 20, 2023

Summary:
Splitting the multilingual example part into another tutorial.

Pull Request resolved: https://github.com/pytorch/audio/pull/3443

Reviewed By: mthrok

Differential Revision: D46802844

Pulled By: xiaohui-zhang

fbshipit-source-id: a7093053cac8b79d650d4f665db7fde2d8254998

627c37a9

08 Jun, 2023 1 commit

Introduce chroma filter bank function (#3395) · dfd0c5fd

Jeff Hwang authored Jun 08, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3395

Adds chroma filter bank function `chroma_filterbank` to `torchaudio.prototype.functional`.

Reviewed By: mthrok

Differential Revision: D46307672

fbshipit-source-id: c5d8104a8bb03da70d0629b5cc224e0d897148d5

dfd0c5fd