Commits · f082e6c1a4b6eb00f47417f035166fe636722bf8 · OpenDAS / Torchaudio

26 Jul, 2023 1 commit

Add nightly doc update (#3496) · f082e6c1

moto authored Jul 26, 2023

Summary:
Add scheduled doc update job so that docs are updated at least once a day.

Pull Request resolved: https://github.com/pytorch/audio/pull/3496

Differential Revision: D47795577

Pulled By: mthrok

fbshipit-source-id: aba5376ec51f07560014d250a16fef8b8a11b43e

f082e6c1

25 Jul, 2023 7 commits

Disable some tests that need libsox (#3494) · 49e9ed94

moto authored Jul 25, 2023

Summary:
In preparation for https://github.com/pytorch/audio/pull/3082

Disable those FFmpeg tests that depend on sox CLI. These tests need to be updated or removed so as not to use sox CLI.

Auto-skip some sox tests if decoder/encoder are not available

Pull Request resolved: https://github.com/pytorch/audio/pull/3494

Differential Revision: D47761948

Pulled By: mthrok

fbshipit-source-id: 3a48d7f280f8376a48d223947dd41a7cdc8cbc30

49e9ed94

Fix and update doc deployment (#3495) · e483a67a

moto authored Jul 25, 2023

Summary:
- Fix condition to add new commit to gh-pages
- Allow deploying docs from workflow dispatch

Pull Request resolved: https://github.com/pytorch/audio/pull/3495

Differential Revision: D47767443

Pulled By: mthrok

fbshipit-source-id: 9ca858868f3e822e532c21cde9d7499af9891a51

e483a67a

Update avsr recipe (#3493) · d4644793

Pingchuan Ma authored Jul 25, 2023

Summary:
This PR includes a few changes to the AV-ASR recipe: better results, a faster face detector (Mediapipe), renamed variables, a streamlined dataloader, and a few illustrated examples. These changes were made to improve the usability of the recipe.

Pull Request resolved: https://github.com/pytorch/audio/pull/3493

Reviewed By: mthrok

Differential Revision: D47758072

Pulled By: mpc001

fbshipit-source-id: 4533587776f3a7a74f3f11b0ece773a0934bacdc

d4644793

Update nvdec/nvenc tutorials (#3483) · 56e22664

moto authored Jul 25, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3483

Differential Revision: D47725664

Pulled By: mthrok

fbshipit-source-id: e4249e1488fa7af8670be4a5077957912ff3420b

56e22664

Run GPU video decoder/encoder tests in CI (#3490) · df655604

moto authored Jul 25, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3490

Differential Revision: D47757316

Pulled By: mthrok

fbshipit-source-id: cfb376be29980f9e452f291c4fa25780e9f85a97

df655604

Fix typo in melscale_fbank (#3487) · 135cb7ba

moto authored Jul 25, 2023

Summary:
Resolves https://github.com/pytorch/audio/issues/3486

Pull Request resolved: https://github.com/pytorch/audio/pull/3487

Differential Revision: D47724733

Pulled By: mthrok

fbshipit-source-id: 26f5641a8271a7e50c4a33861d09b0c8274b29e4

135cb7ba

Update AV-ASR recipe link to index.rst. (#3492) · ae8c131e

Pingchuan Ma authored Jul 25, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3492

Reviewed By: mthrok

Differential Revision: D47755638

Pulled By: mpc001

fbshipit-source-id: 729efdb2a69b5656dbc0b70dd623c1509123d3aa

ae8c131e

24 Jul, 2023 1 commit

Move examples/asr/avsr_rnnt to examples/avsr folder (#3489) · 66f661df

Pingchuan Ma authored Jul 24, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3489

Reviewed By: mthrok

Differential Revision: D47726448

Pulled By: mpc001

fbshipit-source-id: 3d5aa7646c6bb816dcbbf70c61e98404bb148841

66f661df

18 Jul, 2023 1 commit

Extract NVDEC tutorial from the current notebook (#3478) · 63244623

moto authored Jul 17, 2023

Summary:
Now that GPU video decoders are available in doc CI, we run the tutorials with GPU decoders.

Pull Request resolved: https://github.com/pytorch/audio/pull/3478

Differential Revision: D47519672

Pulled By: mthrok

fbshipit-source-id: 2f95243100e9c75e17c2b4d306da164f0e31f8f2

63244623

17 Jul, 2023 1 commit

Ensure StreamReader returns tensors with requires_grad is False (#3467) · 44b92062

moto authored Jul 17, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3467

Differential Revision: D47482388

Pulled By: mthrok

fbshipit-source-id: abff36491dc28b83270673860d6457a084b1327d

44b92062

15 Jul, 2023 2 commits

Use more recent FFmpeg in unit tests (#3476) · ea7a96dd

moto authored Jul 15, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3476

Differential Revision: D47494211

Pulled By: mthrok

fbshipit-source-id: 230bbf0a271b070d1dea34146d0d466e666cccdc

ea7a96dd

Update notes on FFmpeg version (#3480) · 5a809aa0

moto authored Jul 15, 2023

Summary:
The nightly builds support FFmpeg version 4, 5 and 6.

Pull Request resolved: https://github.com/pytorch/audio/pull/3480

Differential Revision: D47482841

Pulled By: mthrok

fbshipit-source-id: 88267f5e83ddc7b1e866b35e57a87b985e2c78c9

5a809aa0

14 Jul, 2023 1 commit

Update the logic to fetch pixel format from filter graph (#3479) · cf53a486

moto authored Jul 14, 2023

Summary:
When using GPU decoder in some environments, attempting to read the output formats from filter graph caused an issue in which the software pixel format cannot be determined.

We do not know the exact cause but when it happens, the input link of buffer sink does not have HW frames context.

Since currently no filter can convert the pixel format of CUDA frame, we resort to the HW frames context of the output link of buffer source.

Environments where this was observed:

Env1
- OS: Fedora 36 (x86_64)
- GCC 12.2.1
- Python 3.10.12
- GPU: GeForce RTX 3070 Ti Laptop GPU
- FFmpeg: 5.1.3
- nv-codec-header: n11.1.5.2
- CUDA: 12.1

Env2
- Ubuntu 20.04.4 LTS (x86_64)
- GCC 9.4.0
- Python 3.11.3
- GPU: Quadro GV100
- FFmpeg: 5.1.3
- nv-codec-header: n11.1.5.2
- CUDA: 11.4

Pull Request resolved: https://github.com/pytorch/audio/pull/3479

Differential Revision: D47482407

Pulled By: mthrok

fbshipit-source-id: 1c53096b27824453b260138ab64e1948afeeefc7

cf53a486

13 Jul, 2023 2 commits

Linux CPU job should respect set Python version (#3477) · 86cb1e09

Omkar Salpekar authored Jul 13, 2023

Summary:
Reintroduce a conda environment in which we do all dependency installation, audio builds, and test runs. This conda environment uses the Python version set by the GHA job. Previously, the job defaulted to the system Python 3.10, which was the default inside the container.

Pull Request resolved: https://github.com/pytorch/audio/pull/3477

Reviewed By: mthrok

Differential Revision: D47414572

Pulled By: osalpekar

fbshipit-source-id: 80760f82c7726205b29812d576e498db2a7a80a0

86cb1e09

Revert D47402174: [audio][PR] Resolve some compilation warnings · 155d1bae

Moto Hira authored Jul 13, 2023

Differential Revision:
D47402174

Original commit changeset: 00c0719ab184

Original Phabricator Diff: D47402174

fbshipit-source-id: b1f6ea4cc3ecef3f72a87bf2f67bf9644c847546

155d1bae

12 Jul, 2023 5 commits

Resolve some compilation warnings (#3471) · a6d1fec0

moto authored Jul 12, 2023

Summary:
- FFmpeg 6 deprecated attributes
- Guard CUDA specific functions not used in CPU builds

Pull Request resolved: https://github.com/pytorch/audio/pull/3471

Differential Revision: D47402174

Pulled By: mthrok

fbshipit-source-id: 00c0719ab1849b50c0b56b03d8fb38bc7aa74538

a6d1fec0

Fix resampling to support dynamic input lengths for onnx exports. (#3473) · a3b6bfb6

Bogdan Teleaga authored Jul 12, 2023

Summary:
This is a port of https://github.com/adefossez/julius/pull/17 for torchaudio.

Not sure if it's possible/desirable to add tests to test the functionality of ONNX exports, but I did a quick test on my machine to ensure this works. The logic is a bit simpler compared to the other PR because the torchaudio version does not support the additional flags available in julius.
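
For context, the output length of a resampler depends on the input length, so a length that gets fixed as a constant at export time no longer matches inputs of other lengths. The following is a minimal sketch of that dependency only. It assumes the common rule `ceil(new_freq * length / orig_freq)`; the function name is hypothetical and this is not the code from the PR.

```python
def resampled_length(length: int, orig_freq: int, new_freq: int) -> int:
    """Number of output samples for `length` input samples.

    Assumes the ceil(new_freq * length / orig_freq) rule. Integer
    ceiling division is used to avoid floating-point rounding.
    """
    if orig_freq <= 0 or new_freq <= 0:
        raise ValueError("sampling rates must be positive")
    if length < 0:
        raise ValueError("length must be non-negative")
    return -(-(new_freq * length) // orig_freq)


# The result changes with the input length, so it has to be computed
# from the actual input rather than stored as a constant.
lengths = [resampled_length(n, 48000, 16000) for n in (3, 4, 48000)]
```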

Pull Request resolved: https://github.com/pytorch/audio/pull/3473

Differential Revision: D47401988

Pulled By: mthrok

fbshipit-source-id: 62fa1e4388923f6a62cef2c0f902a79ea179cec4

a3b6bfb6

Use FFmpeg6 in build doc (#3475) · 989702b3

moto authored Jul 12, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3475

Differential Revision: D47403772

Pulled By: mthrok

fbshipit-source-id: 5cdde521dbbbbf33856470a9dc79419b4a3a1683

989702b3

Fix FFmpeg initialization logic (#3474) · 49e269ab

Moto Hira authored Jul 12, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3474

Differential Revision: D47398447

fbshipit-source-id: f77b685d54ddfc222b806475707d4a10239872f5

49e269ab

Support multiple FFmpeg versions (#3464) · 786066b4

moto authored Jul 11, 2023

Summary:
This commit introduces support for multiple FFmpeg versions for OSS binary distributions.

Currently torchaudio only works with FFmpeg 4. This is inconvenient at every stage, from installation to runtime linking.
This commit allows picking FFmpeg 4, 5 or 6 at runtime, instead of just looking for v4.

The way it works is that we compile the FFmpeg extension three times, once against each FFmpeg version, and ship all three.
At runtime, we look for libavutil of a specific version and, when one is found, load the corresponding FFmpeg extension.
The order of preference is 6, 5, then 4.
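
The lookup order described above can be sketched roughly as follows. This is an illustration, not the torchaudio implementation: the function names are hypothetical, the Linux-style `libavutil.so.N` file names are an assumption, and so is the version table (libavutil major versions 58, 57 and 56 for FFmpeg 6, 5 and 4).

```python
import ctypes
from typing import Callable, Optional

# Assumed mapping: FFmpeg major version -> libavutil major version.
LIBAVUTIL_MAJOR = {6: 58, 5: 57, 4: 56}


def can_dlopen(soname: str) -> bool:
    """Return True if the dynamic loader can open `soname`."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        return False
    return True


def pick_ffmpeg_version(
    has_library: Callable[[str], bool] = can_dlopen,
) -> Optional[int]:
    """Return the preferred FFmpeg major version, or None if none is found.

    Probes libavutil in the order of preference 6, 5, then 4.
    """
    for ffmpeg_major in (6, 5, 4):
        soname = f"libavutil.so.{LIBAVUTIL_MAJOR[ffmpeg_major]}"
        if has_library(soname):
            return ffmpeg_major
    return None
```

Passing the probe as a parameter keeps the selection logic testable without FFmpeg installed; a caller would then load whichever extension build matches the returned version.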

To make the build process simple and reproducible, we use pre-built binaries of FFmpeg during the build.
They are LGPL and downloaded from S3 at build time, instead of building every time.

The use of pre-built binaries as scaffolding limits the systems that can build torchaudio, so this commit also introduces
a single-FFmpeg-version support mode. Setting FFMPEG_ROOT during the build changes the way binaries are built
so that the result supports only one specific version of FFmpeg.

Pull Request resolved: https://github.com/pytorch/audio/pull/3464

Differential Revision: D47300223

Pulled By: mthrok

fbshipit-source-id: 560c7968315e4c8922afa11a4693f648c0356d04

786066b4

11 Jul, 2023 4 commits

Clean up FFmpeg build scripts (#3470) · cc41178b

moto authored Jul 11, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3470

Differential Revision: D47374347

Pulled By: mthrok

fbshipit-source-id: 003b83e50a70f6e1d06eb196f0be5dbba1640226

cc41178b

Fix doc style (#3468) · 18b20f77

moto authored Jul 11, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3468

Differential Revision: D47368070

Pulled By: mthrok

fbshipit-source-id: 9b5d57b0cb861a2556a1903121f526f8011a0e2d

18b20f77

Update doc analytics (#3469) · 216146ab

moto authored Jul 11, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3469

Differential Revision: D47368140

Pulled By: mthrok

fbshipit-source-id: d82ddb91ae1f6612298486fb8401f95c48db5620

216146ab

Clean up FFMPEG env var and remove pre/post build script (#3466) · c825c019

moto authored Jul 11, 2023

Summary:
Now that we do not build FFmpeg as part of CI build process, we can remove the pre/post build scripts.

Needs to land after https://github.com/pytorch/test-infra/pull/4358

Pull Request resolved: https://github.com/pytorch/audio/pull/3466

Reviewed By: atalman

Differential Revision: D47367022

Pulled By: mthrok

fbshipit-source-id: 17aafff74ee7d269236cffb8a88c803a8d4c44b7

c825c019

10 Jul, 2023 1 commit

Update package smoke test (#3465) · 589de109

moto authored Jul 10, 2023

Summary:
1. Update smoke test script to change directory so that there is no `torchaudio` directory in CWD when smoke test is being executed.
2. Disable the part of the smoke test that requires FFmpeg for wheels. This is in preparation for https://github.com/pytorch/test-infra/pull/4358
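
The reason for item 1 is that a `torchaudio` directory in the current working directory can be picked up by `import torchaudio` ahead of the installed package. A minimal sketch of the check, using hypothetical helper names and temporary directories in place of a real checkout:

```python
import os
import tempfile


def local_dir_shadows(package: str, cwd: str) -> bool:
    """Return True if `cwd` holds a directory named like `package`.

    Such a directory can be imported instead of the installed copy
    when the interpreter is started from `cwd`.
    """
    return os.path.isdir(os.path.join(cwd, package))


def demo() -> tuple:
    """Compare a fake source checkout with an empty scratch directory."""
    with tempfile.TemporaryDirectory() as repo, \
            tempfile.TemporaryDirectory() as scratch:
        # Simulate the source tree, which contains a `torchaudio` directory.
        os.mkdir(os.path.join(repo, "torchaudio"))
        return (
            local_dir_shadows("torchaudio", repo),
            local_dir_shadows("torchaudio", scratch),
        )
```

Running the smoke test from a directory like `scratch` ensures that the installed package is the one being exercised.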

Pull Request resolved: https://github.com/pytorch/audio/pull/3465

Reviewed By: nateanl

Differential Revision: D47345117

Pulled By: mthrok

fbshipit-source-id: 95aad0a22922d44ee9a24a05d9ece85166b8c17e

589de109

07 Jul, 2023 3 commits

Set the default #threads to 1 in StreamWriter (#3370) · 9c7bf1bc

moto authored Jul 07, 2023

Summary:
Similar to https://github.com/pytorch/audio/issues/2949

Pull Request resolved: https://github.com/pytorch/audio/pull/3370

Differential Revision: D47298746

Pulled By: mthrok

fbshipit-source-id: 0cc0f395772b33f8b2f5f55253d659e451f506c4

9c7bf1bc

Fix StreamWriter regression around RGB0/BGR0 (#3428) · 9210cba2

moto authored Jul 07, 2023

Summary:
- Add RGB0/BGR0 support to CPU encoder
- Allow passing RGB/BGR when the expected format is RGB0/BGR0
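
For readers unfamiliar with the formats: RGB0/BGR0 are packed 32-bit layouts in which each pixel carries a fourth, unused byte. The sketch below shows what padding packed RGB data into that layout looks like. It is an illustration with a hypothetical function name and does not show how StreamWriter performs the conversion.

```python
def rgb_to_rgb0(packed_rgb: bytes) -> bytes:
    """Pad packed 24-bit RGB pixels to the 32-bit RGB0 layout.

    Each R, G, B triple is followed by one zero byte. The same
    padding applies to BGR -> BGR0, since channel order is untouched.
    """
    if len(packed_rgb) % 3 != 0:
        raise ValueError("input length must be a multiple of 3")
    out = bytearray()
    for i in range(0, len(packed_rgb), 3):
        out += packed_rgb[i:i + 3]  # copy R, G, B unchanged
        out.append(0)               # unused fourth byte
    return bytes(out)
```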

Pull Request resolved: https://github.com/pytorch/audio/pull/3428

Differential Revision: D47274370

Pulled By: mthrok

fbshipit-source-id: d34d940e04b07673bb86f518fe895c0735912444

9210cba2

Use pre-built binaries for ffmpeg extension (#3460) · f77c3e5b

moto authored Jul 07, 2023

Summary:
This commit changes the way FFmpeg extension is built.

Originally, the build process expected the FFmpeg binaries to be somehow available in the build environment.
This made the build process unpredictable and prevented enabling the FFmpeg extension by default.

The proposed change uses pre-built FFmpeg binaries as build-time only scaffold, which are built in our CI job https://github.com/pytorch/audio/actions/workflows/ffmpeg.yml.

This makes the build process more predictable and removes the necessity to build FFmpeg in our CI.
Currently, it supports macOS (arm64, x86_64), unix (x86_64, aarch64) and Windows (amd64).
The downside is that it no longer works on architectures not listed above.
We could potentially work around this by searching for the FFmpeg binaries available on the system (the old way) on
these systems, but since they are not supported by PyTorch, the priority is low.

Pull Request resolved: https://github.com/pytorch/audio/pull/3460

Differential Revision: D47261885

Pulled By: mthrok

fbshipit-source-id: 223a15e95c9140c95688af968beb35ff40354476

f77c3e5b

06 Jul, 2023 2 commits

Add ARM linux ffmpeg build (#3462) · d9f51ce5

moto authored Jul 06, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3462

Differential Revision: D47270241

Pulled By: mthrok

fbshipit-source-id: 6a3b02380dfb381ffb47c1f46b46f4833c765246

d9f51ce5

Fix mac ffmpeg build (#3459) · 2fa39dbd

moto authored Jul 06, 2023

Summary:
Follow-up to https://github.com/pytorch/audio/pull/3455

The FFMPEG_VERSION env var is not defined in existing CI jobs.

Pull Request resolved: https://github.com/pytorch/audio/pull/3459

Reviewed By: atalman

Differential Revision: D47249074

Pulled By: mthrok

fbshipit-source-id: 20f82d749adef5f45a984ab8125592ef36279e94

2fa39dbd

05 Jul, 2023 4 commits

Revert "[audio][PR] Add option to dlopen FFmpeg libraries (#3402)" (#3456) · ca66a1d3

moto authored Jul 05, 2023

Summary:
This reverts commit b7d3e89a.

We will use pre-built binaries instead of dlopen.

Pull Request resolved: https://github.com/pytorch/audio/pull/3456

Differential Revision: D47239681

Pulled By: mthrok

fbshipit-source-id: 0446a62410d914081184fc20c386afa00b1e41b6

ca66a1d3

Add stand alone job to build FFmpeg binaries (#3455) · 662f067b

moto authored Jul 05, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3455

Differential Revision: D47242316

Pulled By: mthrok

fbshipit-source-id: 0eb4bdb0a45fccfe9ff97eaed79db63cd7bfc7d8

662f067b

Untangle third party inclusion in CMake (#3457) · c34a1d6d

moto authored Jul 05, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3457

Differential Revision: D47241343

Pulled By: mthrok

fbshipit-source-id: fd1bfd1531397cb59e9cf11de9dede6949f8517e

c34a1d6d

Update forced_align method to only support batch Tensors (#3433) · cc164478

Zhaoheng Ni authored Jul 05, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3433

The current design of forced_align accepts a 2D Tensor for `log_probs` and a 1D Tensor for `targets`. To keep the API simple, this PR changes it to support only batch Tensors (a 3D Tensor for `log_probs` and a 2D Tensor for `targets`).
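
For callers with unbatched inputs, the change amounts to adding a leading batch dimension of size 1. The sketch below does the shape bookkeeping on plain tuples; the helper name is hypothetical, and the reading of the dimensions as (time, classes) and (target length) is an assumption. With real tensors the equivalent step would be `unsqueeze(0)` on each input.

```python
from typing import Tuple


def to_batched_shapes(
    log_probs_shape: Tuple[int, int],
    targets_shape: Tuple[int],
) -> Tuple[Tuple[int, int, int], Tuple[int, int]]:
    """Map unbatched shapes to the batched shapes with batch size 1.

    (T, C) -> (1, T, C) for log_probs and (L,) -> (1, L) for targets.
    """
    if len(log_probs_shape) != 2:
        raise ValueError("expected a 2D shape for log_probs")
    if len(targets_shape) != 1:
        raise ValueError("expected a 1D shape for targets")
    return (1, *log_probs_shape), (1, *targets_shape)
```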

Reviewed By: mthrok

Differential Revision: D46657526

fbshipit-source-id: af17ec3f92f1a2c46dba91c6db2488a11de36f89

cc164478

03 Jul, 2023 1 commit

Update README (#3434) · 163157d3

Zhaoheng Ni authored Jul 03, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3434

Add one bullet point for `torchaudio.functional` and forced alignment as one example.

Reviewed By: mthrok

Differential Revision: D46658058

fbshipit-source-id: 6e037b7bb6ed2fc2e27ad1e55c5728c17ce69ce8

163157d3

28 Jun, 2023 2 commits

include a link to index.rst (#3441) · a8ce4a87

Pingchuan Ma authored Jun 28, 2023

Summary:
Include Conformer/Emformer RNN-T ASR/VSR/AV-ASR link to index.rst

Pull Request resolved: https://github.com/pytorch/audio/pull/3441

Differential Revision: D47094158

Pulled By: mthrok

fbshipit-source-id: 9ab42ac2bf52a5ce488003897ffba2f10a6ca941

a8ce4a87

Follow up on tutorial update (#3449) · 4a121aa5

moto authored Jun 28, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3449

Differential Revision: D47094402

Pulled By: mthrok

fbshipit-source-id: 43e6994604f0e6c06a5f19c5e8599e2ce12ae622

4a121aa5

26 Jun, 2023 1 commit

Add more explanation about `n_fft` (#3442) · 105b77fe

moto authored Jun 26, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3442

Differential Revision: D46797481

Pulled By: mthrok

fbshipit-source-id: 3513037cbb8f2edb70fdab0fec5c7c554a697abe

105b77fe

21 Jun, 2023 1 commit

Introduce chroma spectrogram transform (#3427) · 70968293

Jeff Hwang authored Jun 21, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3427

Adds transform `ChromaSpectrogram` for generating chromagrams from waveforms as well as transform `ChromaScale` for generating chromagrams from linear-frequency spectrograms.

Reviewed By: mthrok

Differential Revision: D46547418

fbshipit-source-id: 250f298b8e11d8cf82f05536c29d51cf8d77a960

70968293