Commits · 05d2580a6886f2ba47a20b2b6f1382c25c9fc397 · OpenDAS / Torchaudio

12 Jul, 2022 1 commit

Zhaoheng Ni authored Jul 11, 2022

Summary:
The docstring of `apply_beamforming` triggers a warning when the documentation page is built. This PR fixes it.

Pull Request resolved: https://github.com/pytorch/audio/pull/2540

Reviewed By: mthrok

Differential Revision: D37763745

Pulled By: nateanl

fbshipit-source-id: 0e9f1e098865af032b00ac56d918cb9d2ffc5024

05d2580a

11 Jul, 2022 1 commit

Revise LibriSpeech Conformer RNN-T recipe (#2535) · a7d1b31c

Jeff Hwang authored Jul 11, 2022

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/2535

Modifies LibriSpeech Conformer RNN-T example recipe to make the Lightning module and datamodule more generic and reusable.

Reviewed By: mthrok

Differential Revision: D36731576

fbshipit-source-id: 4643e86fac78f3c2bacc15f5d385bc7b10f410a2

a7d1b31c

08 Jul, 2022 1 commit

Put StreamReader source code into dedicated directory (#2531) · 54eb0991

moto authored Jul 07, 2022

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2531

Reviewed By: carolineechen

Differential Revision: D37698120

Pulled By: mthrok

fbshipit-source-id: d0fd6445d69758cd803a485cd17836d1936aa1ee

54eb0991

07 Jul, 2022 5 commits

Rename AVContextPtr with AVContextInputPtr (#2530) · 08597236

moto authored Jul 07, 2022

Summary:
Preparation to add save features with ffmpeg.

Pull Request resolved: https://github.com/pytorch/audio/pull/2530

Reviewed By: carolineechen

Differential Revision: D37698147

Pulled By: mthrok

fbshipit-source-id: feb5cbb6349a2b6b7faf44b629c574fdae47ecab

08597236

Update CircleCI Xcode image (#2529) · 8b70c93e

moto authored Jul 07, 2022

Summary:
CircleCI is removing the Xcode 12.4.0 image in August, and there was a planned
brownout on July 6th. [[detail](https://discuss.circleci.com/t/xcode-image-deprecation/44294?mkt_tok=NDg1LVpNSC02MjYAAAGFbbxbX7nSPCzN0MCKN078pw0VLJ-TMdICr8_gouRNYBM8C55RL8NDKLXA_9CQGPqnhJE5lsSFdetLRF-nH7iBLzoPGBfYpf2vuJ-XkW_C4__4)]

https://app.circleci.com/pipelines/github/pytorch/audio/11566/workflows/da167296-a84f-4dfe-b1b9-60d67e7a3d1c/jobs/771638

This commit updates the Xcode image to 12.5.

Pull Request resolved: https://github.com/pytorch/audio/pull/2529

Reviewed By: atalman

Differential Revision: D37688122

Pulled By: mthrok

fbshipit-source-id: 1095edbf0d920c4dc772555915bce93875b74671

8b70c93e

Add YUV444P support to StreamReader (#2516) · b2a90f91

moto authored Jul 06, 2022

Summary:
This commit adds support for `"yuv444p"` as an output format of StreamReader.

Pull Request resolved: https://github.com/pytorch/audio/pull/2516

Reviewed By: hwangjeff

Differential Revision: D37659715

Pulled By: mthrok

fbshipit-source-id: eae9b5590d8f138a6ebf3808c08adfe068f11a2b

b2a90f91

Move helper functions out of common utility for better locality (#2512) · 10ac6d2b

moto authored Jul 06, 2022

Summary:
This commit moves helper functions and definitions around to achieve better locality of logic.

## Detail

`ffmpeg.[h|cpp]` implements classes that wrap FFmpeg structures with RAII semantics.
Initially these classes included the construction logic in their constructors, but that logic was
extracted into factory functions in https://github.com/pytorch/audio/issues/2373.

The factory functions stayed in `ffmpeg.[h|cpp]` because the logic for
the initialization and clean-up of the AVDictionary class was only available in `ffmpeg.cpp`.

Now that AVDictionary handling is properly defined in https://github.com/pytorch/audio/issues/2507, the factory functions, which are not
very reusable, are better placed with the implementations that use them.

This makes `ffmpeg.h` lean and clean, and makes it easier to see what can be reused.

Pull Request resolved: https://github.com/pytorch/audio/pull/2512

Reviewed By: hwangjeff

Differential Revision: D37477592

Pulled By: mthrok

fbshipit-source-id: 8c1b5059ea5f44649cc0eb1f82d1a92877ef186e

10ac6d2b

Update lint config (#2389) · 515fd01c

moto authored Jul 06, 2022

Summary:
Following the formatter changes that happened in fbcode, this commit updates the linter config.

Pull Request resolved: https://github.com/pytorch/audio/pull/2389

Reviewed By: hwangjeff

Differential Revision: D37659649

Pulled By: mthrok

fbshipit-source-id: 1c52ff93f0b10cb2e7303d2ad13b2d65ffccfcb0

515fd01c

06 Jul, 2022 1 commit

Fix fluent test for windows (#2510) · 09daa438

Caroline Chen authored Jul 05, 2022

Summary:
The fluent dataset test currently fails on Windows, due to newline generation by the CSV writer in the test and incorrect path parsing in the dataset implementation.
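
Both failure modes can be illustrated with the standard library alone. The following is a minimal sketch, not the code changed in this PR: the row contents, the paths, and the helper name `write_rows` are made up for illustration. It shows that `csv.writer` emits `\r\n` line endings unless told otherwise, and that splitting a Windows path on `/` does not isolate the file name.

```python
import csv
import io
from pathlib import PurePosixPath, PureWindowsPath


def write_rows(rows, lineterminator="\r\n"):
    """Write rows to an in-memory CSV and return the raw text (hypothetical helper)."""
    buf = io.StringIO(newline="")  # newline="" keeps line endings untranslated
    writer = csv.writer(buf, lineterminator=lineterminator)
    writer.writerows(rows)
    return buf.getvalue()


rows = [["file", "label"], ["a.wav", "on"]]
default_text = write_rows(rows)      # csv.writer defaults to "\r\n" line endings
unix_text = write_rows(rows, "\n")   # explicit "\n" line endings

# Path parsing: pathlib understands the separator of each platform flavour,
# while a manual split on "/" leaves a Windows path in one piece.
win_id = PureWindowsPath(r"C:\data\wavs\a.wav").stem
posix_id = PurePosixPath("/data/wavs/a.wav").stem
naive_id = r"C:\data\wavs\a.wav".split("/")[-1]
```

When writing to a real file, opening it with `newline=""` avoids an extra `\r` being added on Windows on top of the one the CSV writer already emits.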

Pull Request resolved: https://github.com/pytorch/audio/pull/2510

Reviewed By: carolineechen

Differential Revision: D37573203

Pulled By: mthrok

fbshipit-source-id: 4868bc649690c7e596b002686c6128ce735d3564

09daa438

29 Jun, 2022 1 commit

Fix build doc job (#2520) · ef8bd7b6

moto authored Jun 29, 2022

Summary:
The build doc job has been failing recently because CUDA 11.6 requires different handling.

Pull Request resolved: https://github.com/pytorch/audio/pull/2520

Reviewed By: xiaohui-zhang

Differential Revision: D37527088

Pulled By: mthrok

fbshipit-source-id: 34c23bdbf70ba9fb8e315c7036cff01b3ddf4c91

ef8bd7b6

28 Jun, 2022 3 commits

Add 0.12.0 to version compatibility matrix (#2513) · d3b4ce68

hwangjeff authored Jun 28, 2022

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2513

Reviewed By: mthrok

Differential Revision: D37491994

Pulled By: hwangjeff

fbshipit-source-id: 2c164bcec39342fd94abf4cc148d96dc9844699e

d3b4ce68

Refactor FilterGraph interface (#2508) · 0dd57236

moto authored Jun 27, 2022

Summary:
FilterGraph is necessary for StreamWriter when saving video, as
the Tensor array format cannot express common video formats like yuv420.

The current implementation of FilterGraph is specific to StreamReader,
as it takes an AVCodecParameters object rather than individual parameters.

This PR refactors the FilterGraph interface so that it can be constructed
from more primitive information.

Pull Request resolved: https://github.com/pytorch/audio/pull/2508

Reviewed By: hwangjeff

Differential Revision: D37466033

Pulled By: mthrok

fbshipit-source-id: 8414e985da7579c2dfe260b4dccd2afe113bb573

0dd57236

Refactor AVDictionary clean up (#2507) · 0ad03adf

moto authored Jun 27, 2022

Summary:
Small clean up in ffmpeg binding code.

1. Make `get_option_dict` and `clean_up_dict` public utility
2. Merge the exception into `clean_up_dict`
3. Get rid of custom string join function and use `c10::Join`.

Pull Request resolved: https://github.com/pytorch/audio/pull/2507

Reviewed By: hwangjeff

Differential Revision: D37466022

Pulled By: mthrok

fbshipit-source-id: 44b769ac6ff1ab20e6d6ae086cd1447deacb5969

0ad03adf

27 Jun, 2022 5 commits

Add missing __init__ in io test directory (#2511) · d50ed521

moto authored Jun 27, 2022

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2511

Reviewed By: nateanl

Differential Revision: D37461021

Pulled By: mthrok

fbshipit-source-id: 6f894c02bbefc5afda0f9584d26ad785f7c71ee4

d50ed521

Fix download links of RNNT pipelines in prototype (#2444) · 9b4ee17c

Zhaoheng Ni authored Jun 27, 2022

Summary:
In https://github.com/pytorch/audio/issues/2283, torchaudio's downloading function is updated to reduce code duplication. The links in `EMFORMER_RNNT_BASE_LIBRISPEECH` are updated, but the ones in prototype pipelines are not. This PR addresses it by updating the download links of `EMFORMER_RNNT_BASE_MUSTC` and `EMFORMER_RNNT_BASE_TEDLIUM3` in prototype. Corresponding integration tests are added as well.

Pull Request resolved: https://github.com/pytorch/audio/pull/2444

Reviewed By: mthrok

Differential Revision: D37389178

Pulled By: nateanl

fbshipit-source-id: 46598dd71c95be47d1e1b54cef89ea51d280e17a

9b4ee17c

Add utility function to fetch FFmpeg library versions (#2467) · 4ba7dc38

moto authored Jun 27, 2022

Summary:
Follow-up of https://github.com/pytorch/audio/issues/2464. Add utility function to fetch the versions of FFmpeg.

Pull Request resolved: https://github.com/pytorch/audio/pull/2467

Reviewed By: carolineechen

Differential Revision: D37028006

Pulled By: mthrok

fbshipit-source-id: 72adce1e6b43985760ce55b715b0e59af5244fdb

4ba7dc38

Fix for the cuda 11.6 and usage of cudatoolkit (#2501) · 8ede3e1e

Andrey Talman authored Jun 27, 2022

Summary:
Fix for the cuda 11.6 and usage of cudatoolkit

Pull Request resolved: https://github.com/pytorch/audio/pull/2501

Reviewed By: mthrok

Differential Revision: D37388598

Pulled By: atalman

fbshipit-source-id: 41add7ad6fbb3d156cc1270625dc085c62f7a531

8ede3e1e

Add VoxCeleb1 dataset (#2349) · 21b2d139

Zhaoheng Ni authored Jun 27, 2022

Summary:
This PR adds two dataset classes for the VoxCeleb1 corpus.
- `VoxCeleb1Identification`:
  Each data sample contains the waveform, sample rate, speaker id, and the file id.
- `VoxCeleb1Verification`:
  Each data sample contains a pair of waveforms, sample rate, the label indicating if they are from the same speaker, and the file ids.

Pull Request resolved: https://github.com/pytorch/audio/pull/2349

Reviewed By: carolineechen

Differential Revision: D35927921

Pulled By: nateanl

fbshipit-source-id: 3e07ddd329178777698841565053eb59befe6449

21b2d139

24 Jun, 2022 1 commit

Fix version number on main branch (#2509) · 49551eed

moto authored Jun 24, 2022

Summary:
The source build still reports its version as 0.12.

Pull Request resolved: https://github.com/pytorch/audio/pull/2509

Reviewed By: carolineechen

Differential Revision: D37427703

Pulled By: mthrok

fbshipit-source-id: a6e455ba7c583af7b1a2a355ca45a9e5ab5fe30d

49551eed

23 Jun, 2022 1 commit

[AutoAccept][Codemod][FBSourceBlackLinter] Daily `arc lint --take BLACK` · fee994ce

CodemodService FBSourceBlackLinterBot authored Jun 23, 2022

Summary:
Meta:
**If you take no action, this diff will be automatically accepted on 2022-06-23.**
(To remove yourself from auto-accept diffs and just let them all land, add yourself to [this Butterfly rule](https://www.internalfb.com/butterfly/rule/904302247110220))

Produced by `tools/arcanist/lint/codemods/black-fbsource`.

#nocancel

Rules run:
- CodemodTransformerSimpleShell

Config Oncall: [lint](https://our.intern.facebook.com/intern/oncall3/?shortname=lint)
CodemodConfig: [CodemodConfigFBSourceBlackLinter](https://www.internalfb.com/code/www/flib/intern/codemod_service/config/fbsource_arc_f/CodemodConfigFBSourceBlackLinter.php)
ConfigType: php
Sandcastle URL: https://www.internalfb.com/intern/sandcastle/job/13510799586951394/
This diff was automatically created with CodemodService.
To learn more about CodemodService, check out the [CodemodService wiki](https://fburl.com/CodemodService).

_____

## Questions / Comments / Feedback?

**[Click here to give feedback about this diff](https://www.internalfb.com/codemod_service/feedback?sandcastle_job_id=13510799586951394).**

* Returning back to author or abandoning this diff will only cause the diff to be regenerated in the future.
* Do **NOT** post in the CodemodService Feedback group about this specific diff.

drop-conflicts

Reviewed By: adamjernst

Differential Revision: D37375235

fbshipit-source-id: 3d7eb39e5c0539a78d1412f37562dec90b0fc759

fee994ce

21 Jun, 2022 1 commit

Create musdb handler and tests (#2484) · b92a8a09

Sean Kim authored Jun 21, 2022

Summary:
Create a dataset handler and tests for the new dataset. Manually tested and unit tested for validity. Pre-commit was run for style checks.

Pull Request resolved: https://github.com/pytorch/audio/pull/2484

Reviewed By: carolineechen, nateanl

Differential Revision: D37250556

Pulled By: skim0514

fbshipit-source-id: d2c8d73d22fd9d7282026265676f3eab1e178d51

b92a8a09

20 Jun, 2022 1 commit

Add fluent speech commands (#2480) · 66a67d2e

Caroline Chen authored Jun 20, 2022

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2480

Reviewed By: nateanl

Differential Revision: D37249571

Pulled By: carolineechen

fbshipit-source-id: caefeec4253c91f2579655a0c1735edaeed51be9

66a67d2e

17 Jun, 2022 1 commit

Make lazy import for joblib (#2498) · 10195316

Zhaoheng Ni authored Jun 17, 2022

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2498

Reviewed By: mthrok

Differential Revision: D37224024

Pulled By: nateanl

fbshipit-source-id: 5d5d561c43d1ee323ae0cc599ffa1479208ea09a

10195316

16 Jun, 2022 1 commit

Add special handling to filelike object mp3 (#2478) · 74dcfba3

moto authored Jun 15, 2022

Summary:
When loading or querying a file-like object, it is not possible to use the fallback
mechanism introduced in https://github.com/pytorch/audio/issues/2419, because file-like objects are not seekable.

This commit adds special-case handling for mp3.

For file-like object mp3 input, it was required to pass `format="mp3"`
because libsox did not auto-detect the format.

With the transition of mp3 handling from libsox to ffmpeg, the logic
is to let ffmpeg handle it without waiting for libsox to fail
when `format="mp3"` is given.

Note: This is a backport of https://github.com/pytorch/audio/issues/2477.
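
The dispatch rule described above can be sketched as a small decision function. This is an illustrative assumption about the control flow, not the torchaudio implementation: the function name `pick_decoders` and the returned labels are hypothetical, and the behaviour for path inputs simply follows the libsox-then-ffmpeg fallback mentioned in the summary.

```python
def pick_decoders(is_file_like, format=None):
    """Return the ordered list of decoders to try (hypothetical sketch)."""
    if is_file_like:
        # A consumed stream cannot be rewound, so only one attempt is possible.
        if format == "mp3":
            return ["ffmpeg"]   # go straight to ffmpeg for mp3
        return ["libsox"]       # no fallback available for other formats
    # A path can be reopened, so libsox can fail first and ffmpeg can retry.
    return ["libsox", "ffmpeg"]


mp3_stream = pick_decoders(is_file_like=True, format="mp3")
wav_stream = pick_decoders(is_file_like=True, format="wav")
from_path = pick_decoders(is_file_like=False)
```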

Pull Request resolved: https://github.com/pytorch/audio/pull/2478

Reviewed By: carolineechen

Differential Revision: D37177123

Pulled By: mthrok

fbshipit-source-id: 997eead01c0ad1f04ffa0daa1039302a75f62b63

74dcfba3

15 Jun, 2022 7 commits

Making sure channel flag is set correctly (#2496) · 5e966711

Andrey Talman authored Jun 15, 2022

Summary:
Making sure channel flag is set correctly for the test channel

Pull Request resolved: https://github.com/pytorch/audio/pull/2496

Reviewed By: hwangjeff, mthrok

Differential Revision: D37183083

Pulled By: atalman

fbshipit-source-id: 5df8aad1bceb22ad65b0942bf370480bb1cbd44a

5e966711

Fix typo in release build step (#2495) · c46a00c2

Andrey Talman authored Jun 15, 2022

Summary:
Fix typo in release build step

Pull Request resolved: https://github.com/pytorch/audio/pull/2495

Reviewed By: hwangjeff

Differential Revision: D37176695

Pulled By: atalman

fbshipit-source-id: 37b4e30c1084e506f3a45cf7427784c955868909

c46a00c2

Fix push on release reference name (#2492) · 6b152cb1

Andrey Talman authored Jun 15, 2022

Summary:
Fix push on release reference name
We want to compare it against refs/heads/release rather than release.
Tests: https://github.com/atalman/vision/commit/af17cd95d2d43ca13354fb700e2da42108dd5a87
Correctly sets the release channel (wheels): https://github.com/atalman/vision/runs/6901327010?check_suite_focus=true

Pull Request resolved: https://github.com/pytorch/audio/pull/2492

Reviewed By: hwangjeff

Differential Revision: D37174090

Pulled By: atalman

fbshipit-source-id: e114972935572a701eb7daff429a0df0ed5a75e4

6b152cb1

Making sure we are picking correct release branch (#2489) · 722982ea

Andrey Talman authored Jun 14, 2022

Summary:
Making sure we are picking correct release branch
Ref: https://github.com/pytorch/vision/pull/6168

Pull Request resolved: https://github.com/pytorch/audio/pull/2489

Reviewed By: mthrok

Differential Revision: D37160145

Pulled By: atalman

fbshipit-source-id: 3e4a2208cbe47f85147573159f9adb8d6a824956

722982ea

Update config.guess to the latest (#2479) · 575478ec

moto authored Jun 14, 2022

Summary:
closes https://github.com/pytorch/audio/issues/2420

Pull Request resolved: https://github.com/pytorch/audio/pull/2479

Reviewed By: carolineechen

Differential Revision: D37142717

Pulled By: mthrok

fbshipit-source-id: c3d4cc1435a74dfa6992112590c988c2903511a8

575478ec

Disable lint CI signal (#2487) · 1e3cc6b2

moto authored Jun 14, 2022

Summary:
Lint style has diverged since fb-internal lint engine has been changed.

Backport of https://github.com/pytorch/audio/issues/2466.

Pull Request resolved: https://github.com/pytorch/audio/pull/2487

Reviewed By: carolineechen

Differential Revision: D37160193

Pulled By: mthrok

fbshipit-source-id: cf4e2091a78a0da53269ae1251a55d4d1e52ead2

1e3cc6b2

Pin MKL to 2020.04 (#2486) · be213bfb

moto authored Jun 14, 2022

Summary:
The version of MKL that is installed alongside PyTorch has been bumped
to 2022.1 on Windows, and it is causing installation issues in unit tests.

This commit pins the previous version.

Backport of https://github.com/pytorch/audio/issues/2463

Pull Request resolved: https://github.com/pytorch/audio/pull/2486

Reviewed By: nateanl

Differential Revision: D37160156

Pulled By: mthrok

fbshipit-source-id: 7e3a30c25782b349a3cad2ee6d1141affc921881

be213bfb

14 Jun, 2022 2 commits

Adding conda builds for M1 (#2473) · 489999e2

Andrey Talman authored Jun 14, 2022

Summary:
Adding conda builds for M1

Pull Request resolved: https://github.com/pytorch/audio/pull/2473

Reviewed By: mthrok

Differential Revision: D37151454

Pulled By: atalman

fbshipit-source-id: 0108b937a4c7048bd4bb03b2b5a367704d7b78cc

489999e2

Add note about `normalize` argument (#2449) · 6fa5732c

moto authored Jun 13, 2022

Summary:
The `load` function has a `normalize` argument, which converts the native
sample type to `torch.float32`.

This argument is confusing for audio practitioners as it seems
to perform [volume normalization](https://en.wikipedia.org/wiki/Audio_normalization).

See https://github.com/pytorch/audio/issues/2253

Due to BC-breaking concerns, we cannot easily change the argument name.
This commit adds warnings to the documentation.
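
The distinction can be shown with plain Python lists. The following is a minimal sketch, not torchaudio code: it assumes 16-bit integer samples are mapped to floats by dividing by 2**15, and the function names `to_float` and `peak_normalize` are made up for illustration.

```python
INT16_SCALE = 32768  # 2 ** 15, assumed scale for 16-bit integer samples


def to_float(samples_int16):
    """Type conversion only: map the int16 range onto [-1.0, 1.0)."""
    return [s / INT16_SCALE for s in samples_int16]


def peak_normalize(samples_float):
    """Volume normalization: rescale so the loudest sample has magnitude 1.0."""
    peak = max(abs(s) for s in samples_float)
    if peak == 0:
        return list(samples_float)
    return [s / peak for s in samples_float]


quiet = [0, 8192, -16384]          # a quiet int16 signal
converted = to_float(quiet)        # relative loudness is unchanged
louder = peak_normalize(converted) # loudness is changed
```

Here `converted` is still a quiet signal with a peak of 0.5, while `louder` has been rescaled to a peak of 1.0; only the first operation corresponds to what the summary says `normalize` does.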

Fix https://github.com/pytorch/audio/issues/2253

Pull Request resolved: https://github.com/pytorch/audio/pull/2449

Reviewed By: nateanl

Differential Revision: D36995756

Pulled By: carolineechen

fbshipit-source-id: 0b7db2758a355f6aafe06a2273bc72a1027690bd

6fa5732c

13 Jun, 2022 2 commits

Fix typo in nightly m1 ref (#2474) · a9c1e3a3

Andrey Talman authored Jun 13, 2022

Summary:
Fix typo in nightly m1 ref
See: https://github.com/pytorch/vision/pull/6158

Pull Request resolved: https://github.com/pytorch/audio/pull/2474

Reviewed By: malfet, mthrok

Differential Revision: D37117637

Pulled By: atalman

fbshipit-source-id: 2a8f7b5bf3506f2a53884424799919137870a0ad

a9c1e3a3

[AutoAccept][Codemod][FBSourceBlackLinter] Daily `arc lint --take BLACK` · 71ed457e

CodemodService FBSourceBlackLinterBot authored Jun 13, 2022

Reviewed By: ivanmurashko

Differential Revision: D37103342

fbshipit-source-id: adc908c790a413384bd88a75d3c2b4b0974c6674

71ed457e

10 Jun, 2022 2 commits

Adding tagged builds to torchaudio (#2471) · 19d93282

Andrey Talman authored Jun 10, 2022

Summary:
Adding tagged builds for torchaudio
see: https://github.com/pytorch/vision/pull/6140

Pull Request resolved: https://github.com/pytorch/audio/pull/2471

Reviewed By: hwangjeff

Differential Revision: D37080828

Pulled By: atalman

fbshipit-source-id: 13d754f522510514f0148ba465ce12a320058722

19d93282

Modifying Pitchshift for faster resampling (#2441) · df2262b5

Sean Kim authored Jun 10, 2022

Summary:
Split the existing Pitchshift into multiple helper functions in order to cache the kernel and speed up the overall process, addressing https://github.com/pytorch/audio/issues/2359.
Existing unit tests pass.

edit: functional and transforms unit tests pass. Adopted lazy initialization to avoid BC-breaking changes.
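
The caching idea can be sketched in pure Python. This is a hypothetical stand-in, not the PitchShift code: the class `LazyLowPass`, its parameters, and the Hann-windowed sinc kernel are assumptions chosen only to show a kernel that is built on first use and then reused, while the constructor signature stays unchanged.

```python
import math


class LazyLowPass:
    """Hypothetical transform that needs a precomputed kernel."""

    def __init__(self, cutoff, width=6):
        self.cutoff = cutoff   # normalized cutoff in (0, 1]
        self.width = width     # number of taps on each side of the center
        self._kernel = None    # built on first call, then reused
        self.builds = 0        # counts kernel constructions, for illustration

    def _build_kernel(self):
        """Compute a Hann-windowed sinc kernel (the expensive step)."""
        self.builds += 1
        taps = []
        for n in range(-self.width, self.width + 1):
            x = n * self.cutoff
            sinc = 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)
            hann = 0.5 * (1.0 + math.cos(math.pi * n / (self.width + 1)))
            taps.append(self.cutoff * sinc * hann)
        return taps

    def __call__(self, samples):
        if self._kernel is None:  # lazy initialization
            self._kernel = self._build_kernel()
        out = []
        for i in range(len(samples)):
            acc = 0.0
            for j, tap in enumerate(self._kernel):
                idx = i + j - self.width
                if 0 <= idx < len(samples):  # treat out-of-range samples as zero
                    acc += tap * samples[idx]
            out.append(acc)
        return out


lowpass = LazyLowPass(cutoff=0.5)
first = lowpass([0.0, 1.0, 0.0, -1.0])
second = lowpass([1.0, 1.0, 1.0, 1.0])  # reuses the cached kernel
```

After both calls, `lowpass.builds` is 1, since the kernel is computed only once.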

Pull Request resolved: https://github.com/pytorch/audio/pull/2441

Reviewed By: carolineechen

Differential Revision: D36905582

Pulled By: skim0514

fbshipit-source-id: 6780db3ac8a29d59017a6abe7e82ce1fd17aaac2

df2262b5

08 Jun, 2022 3 commits

Fix metadata fetch (#2464) · 4d2fa190

moto authored Jun 08, 2022

Summary:
In https://github.com/pytorch/audio/issues/2461, `metadata` field was added to StreamInfo.
However, the value attached to this new field was source-level metadata,
while each stream can have different metadata.

* source level metadata
[AVFormatContext->metadata](https://ffmpeg.org/doxygen/4.1/structAVFormatContext.html#a3019a56080ed2e3297ff25bc2ff88adf)
* stream level metadata
[AVFormatContext->streams[]->metadata](https://ffmpeg.org/doxygen/4.1/structAVStream.html#a50d250a128a3da9ce3d135e84213fb82)

This commit moves source-level metadata to a dedicated method, `get_metadata`, and
fixes the stream-level `metadata` field so that it reports stream metadata.
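
The two levels of metadata can be modelled with a toy data structure. This is a sketch under assumptions, not the StreamReader implementation: only the names `StreamInfo`, `metadata`, and `get_metadata` come from the summary above, while the `Source` class, the `codec` field, and the sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StreamInfo:
    """Per-stream information; `metadata` belongs to this stream only."""
    codec: str
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class Source:
    """Hypothetical container: one source holding several streams."""
    metadata: Dict[str, str]
    streams: List[StreamInfo]

    def get_metadata(self) -> Dict[str, str]:
        """Return source-level metadata, separate from any stream."""
        return dict(self.metadata)


src = Source(
    metadata={"title": "demo"},
    streams=[
        StreamInfo("aac", {"language": "eng"}),
        StreamInfo("h264"),
    ],
)
source_meta = src.get_metadata()
audio_meta = src.streams[0].metadata
video_meta = src.streams[1].metadata
```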

Pull Request resolved: https://github.com/pytorch/audio/pull/2464

Reviewed By: hwangjeff, xiaohui-zhang

Differential Revision: D36995452

Pulled By: mthrok

fbshipit-source-id: 534be1f7feb07790a0ce8624c336cdb7b65a8697

4d2fa190

Update HW decoding tutorial and add notes about unseekable object (#2408) · 711d6016

moto authored Jun 08, 2022

Summary:
https://output.circle-artifacts.com/output/job/75187a52-b0d8-4cac-89f3-24e10889a36a/artifacts/0/docs/hw_acceleration_tutorial.html

1. Update the HW decoding tutorial to include file-like objects
2. Add a note about unseekable objects in the streaming API tutorial

Pull Request resolved: https://github.com/pytorch/audio/pull/2408

Reviewed By: hwangjeff

Differential Revision: D36632702

Pulled By: mthrok

fbshipit-source-id: 17be2fb8528cb1d2d1ee11901b6a95c512466feb

711d6016

Split Streaming API tutorials into two (#2446) · 2d846263

moto authored Jun 07, 2022

Summary:
The Streaming API tutorial has gotten long, so this commit splits it into two.

Pull Request resolved: https://github.com/pytorch/audio/pull/2446

Reviewed By: hwangjeff

Differential Revision: D36987513

Pulled By: mthrok

fbshipit-source-id: 13e3aad74c0d0e654c39c0eeceffca1a00b0dac4

2d846263