Commits · 21b2d1392c4ff998fed71d14d2cb5892afc445b8 · OpenDAS / Torchaudio

27 Jun, 2022 1 commit

Add VoxCeleb1 dataset (#2349) · 21b2d139

Zhaoheng Ni authored Jun 27, 2022

Summary:
This PR adds two dataset classes for the VoxCeleb1 corpus.
- `VoxCeleb1Identification`
Each data sample contains the waveform, sample rate, speaker id, and the file id.
- `VoxCeleb1Verification`
Each data sample contains a pair of waveforms, sample rate, the label indicating if they are from the same speaker, and the file ids.
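
The two sample layouts described above can be pictured as plain tuples. The sketch below is not the torchaudio implementation: `IdentificationSample`, `VerificationSample`, and `same_speaker` are hypothetical stand-ins, the field order simply follows the summary, waveforms are shown as lists of floats rather than tensors, the label encoding (1 = same speaker) is an assumption, and the file ids are placeholders.

```python
from typing import List, NamedTuple


class IdentificationSample(NamedTuple):
    """Hypothetical stand-in for one identification sample."""
    waveform: List[float]
    sample_rate: int
    speaker_id: int
    file_id: str


class VerificationSample(NamedTuple):
    """Hypothetical stand-in for one verification sample (a pair of utterances)."""
    waveform_a: List[float]
    waveform_b: List[float]
    sample_rate: int
    label: int  # assumed encoding: 1 = same speaker, 0 = different speakers
    file_id_a: str
    file_id_b: str


def same_speaker(sample: VerificationSample) -> bool:
    """Interpret the label under the assumed 1/0 encoding."""
    return sample.label == 1


# Placeholder values, for illustration only.
ident = IdentificationSample([0.0, 0.1], 16000, 1, "spk1-utt1")
pair = VerificationSample([0.0, 0.1], [0.2, 0.3], 16000, 1, "spk1-utt1", "spk1-utt2")

# Samples unpack positionally, like ordinary tuples.
waveform, sample_rate, speaker_id, file_id = ident
print(speaker_id, file_id, same_speaker(pair))
```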

Pull Request resolved: https://github.com/pytorch/audio/pull/2349

Reviewed By: carolineechen

Differential Revision: D35927921

Pulled By: nateanl

fbshipit-source-id: 3e07ddd329178777698841565053eb59befe6449

21b2d139

24 Jun, 2022 1 commit

Fix version number on main branch (#2509) · 49551eed

moto authored Jun 24, 2022

Summary:
The source build still reports its version as 0.12.

Pull Request resolved: https://github.com/pytorch/audio/pull/2509

Reviewed By: carolineechen

Differential Revision: D37427703

Pulled By: mthrok

fbshipit-source-id: a6e455ba7c583af7b1a2a355ca45a9e5ab5fe30d

49551eed

23 Jun, 2022 1 commit

[AutoAccept][Codemod][FBSourceBlackLinter] Daily `arc lint --take BLACK` · fee994ce

CodemodService FBSourceBlackLinterBot authored Jun 23, 2022

Summary:
Meta:
**If you take no action, this diff will be automatically accepted on 2022-06-23.**
(To remove yourself from auto-accept diffs and just let them all land, add yourself to [this Butterfly rule](https://www.internalfb.com/butterfly/rule/904302247110220))

Produced by `tools/arcanist/lint/codemods/black-fbsource`.

#nocancel

Rules run:
- CodemodTransformerSimpleShell

Config Oncall: [lint](https://our.intern.facebook.com/intern/oncall3/?shortname=lint)
CodemodConfig: [CodemodConfigFBSourceBlackLinter](https://www.internalfb.com/code/www/flib/intern/codemod_service/config/fbsource_arc_f/CodemodConfigFBSourceBlackLinter.php)
ConfigType: php
Sandcastle URL: https://www.internalfb.com/intern/sandcastle/job/13510799586951394/
This diff was automatically created with CodemodService.
To learn more about CodemodService, check out the [CodemodService wiki](https://fburl.com/CodemodService).

_____

## Questions / Comments / Feedback?

**[Click here to give feedback about this diff](https://www.internalfb.com/codemod_service/feedback?sandcastle_job_id=13510799586951394).**

* Returning back to author or abandoning this diff will only cause the diff to be regenerated in the future.
* Do **NOT** post in the CodemodService Feedback group about this specific diff.

drop-conflicts

Reviewed By: adamjernst

Differential Revision: D37375235

fbshipit-source-id: 3d7eb39e5c0539a78d1412f37562dec90b0fc759

fee994ce

21 Jun, 2022 1 commit

Create musdb handler and tests (#2484) · b92a8a09

Sean Kim authored Jun 21, 2022

Summary:
Create a dataset handler and tests for the new dataset. Validity was checked with both manual and unit tests, and pre-commit was run for style checks.

Pull Request resolved: https://github.com/pytorch/audio/pull/2484

Reviewed By: carolineechen, nateanl

Differential Revision: D37250556

Pulled By: skim0514

fbshipit-source-id: d2c8d73d22fd9d7282026265676f3eab1e178d51

b92a8a09

20 Jun, 2022 1 commit

Add fluent speech commands (#2480) · 66a67d2e

Caroline Chen authored Jun 20, 2022

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2480

Reviewed By: nateanl

Differential Revision: D37249571

Pulled By: carolineechen

fbshipit-source-id: caefeec4253c91f2579655a0c1735edaeed51be9

66a67d2e

17 Jun, 2022 1 commit

Make lazy import for joblib (#2498) · 10195316

Zhaoheng Ni authored Jun 17, 2022

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2498
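
As an illustration of what a lazy import looks like in general, here is a minimal sketch. It does not reproduce the change in this PR: `lazy_import` and `serialize` are hypothetical helpers, and the standard-library `json` module stands in for `joblib` so that the example runs without extra dependencies.

```python
import importlib
from types import ModuleType


def lazy_import(name: str) -> ModuleType:
    """Import a module only when it is first needed, with a clearer error if missing."""
    try:
        return importlib.import_module(name)
    except ImportError as err:
        raise ImportError(f"{name} is required for this feature; please install it") from err


def serialize(values):
    # The import happens here, at call time, not when this file is imported.
    json = lazy_import("json")  # stand-in for an optional dependency such as joblib
    return json.dumps(values)


print(serialize([1, 2]))
```

With this pattern, importing the enclosing module succeeds even when the optional dependency is absent; the error surfaces only when the feature is used.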

Reviewed By: mthrok

Differential Revision: D37224024

Pulled By: nateanl

fbshipit-source-id: 5d5d561c43d1ee323ae0cc599ffa1479208ea09a

10195316

16 Jun, 2022 1 commit

Add special handling to filelike object mp3 (#2478) · 74dcfba3

moto authored Jun 15, 2022

Summary:
File-like objects are not seekable, so loading and querying them cannot use the fallback
mechanism introduced in https://github.com/pytorch/audio/issues/2419.

This commit adds special-case handling for mp3.

For file-like mp3 input, passing `format="mp3"` was already required
because libsox did not auto-detect the format.

With the transition of mp3 handling from libsox to ffmpeg, the logic
is to let ffmpeg handle the input directly, without waiting for libsox to fail,
if `format="mp3"` is given.
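
The dispatch rule described above can be sketched as follows. This is a simplified model, not the torchaudio code: `decoder_order` is a hypothetical function that only returns the order in which the two backends would be tried, and the assumption that path inputs keep the libsox-then-ffmpeg fallback is inferred from the summary.

```python
import io
from typing import List, Optional


def decoder_order(src, format: Optional[str] = None) -> List[str]:
    """Return the (hypothetical) order in which decoders are attempted."""
    is_file_like = hasattr(src, "read")
    if is_file_like:
        if format == "mp3":
            # Skip libsox entirely: the stream cannot be rewound for a second attempt.
            return ["ffmpeg"]
        # No fallback for other formats, since the stream is treated as unseekable.
        return ["libsox"]
    # A path can be reopened, so a failed libsox attempt can fall back to ffmpeg.
    return ["libsox", "ffmpeg"]


print(decoder_order(io.BytesIO(b""), format="mp3"))
print(decoder_order("audio.mp3"))
```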

Note: This is a backport of https://github.com/pytorch/audio/issues/2477.

Pull Request resolved: https://github.com/pytorch/audio/pull/2478

Reviewed By: carolineechen

Differential Revision: D37177123

Pulled By: mthrok

fbshipit-source-id: 997eead01c0ad1f04ffa0daa1039302a75f62b63

74dcfba3

15 Jun, 2022 7 commits

Making sure channel flag is set correctly (#2496) · 5e966711

Andrey Talman authored Jun 15, 2022

Summary:
Making sure channel flag is set correctly for the test channel

Pull Request resolved: https://github.com/pytorch/audio/pull/2496

Reviewed By: hwangjeff, mthrok

Differential Revision: D37183083

Pulled By: atalman

fbshipit-source-id: 5df8aad1bceb22ad65b0942bf370480bb1cbd44a

5e966711

Fix typo in release build step (#2495) · c46a00c2

Andrey Talman authored Jun 15, 2022

Summary:
Fix typo in release build step

Pull Request resolved: https://github.com/pytorch/audio/pull/2495

Reviewed By: hwangjeff

Differential Revision: D37176695

Pulled By: atalman

fbshipit-source-id: 37b4e30c1084e506f3a45cf7427784c955868909

c46a00c2

Fix push on release reference name (#2492) · 6b152cb1

Andrey Talman authored Jun 15, 2022

Summary:
Fix push on release reference name
We want to compare it against refs/heads/release rather than release.
Tests: https://github.com/atalman/vision/commit/af17cd95d2d43ca13354fb700e2da42108dd5a87
Correctly sets the release channel (wheels): https://github.com/atalman/vision/runs/6901327010?check_suite_focus=true
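
To show why the distinction matters, here is a small sketch of matching a full Git ref against the release prefix. It is not the CI configuration touched by this PR: `is_release_ref` is a hypothetical helper, and the example branch name `release/0.12` is an assumption about how release branches are named.

```python
RELEASE_PREFIX = "refs/heads/release"


def is_release_ref(ref: str) -> bool:
    """True when the full Git ref names a release branch."""
    return ref == RELEASE_PREFIX or ref.startswith(RELEASE_PREFIX + "/")


# CI systems typically report the full ref, so a bare branch name never matches.
print(is_release_ref("refs/heads/release/0.12"))
print(is_release_ref("release/0.12"))
```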

Pull Request resolved: https://github.com/pytorch/audio/pull/2492

Reviewed By: hwangjeff

Differential Revision: D37174090

Pulled By: atalman

fbshipit-source-id: e114972935572a701eb7daff429a0df0ed5a75e4

6b152cb1

Making sure we are picking correct release branch (#2489) · 722982ea

Andrey Talman authored Jun 14, 2022

Summary:
Making sure we are picking correct release branch
Ref: https://github.com/pytorch/vision/pull/6168

Pull Request resolved: https://github.com/pytorch/audio/pull/2489

Reviewed By: mthrok

Differential Revision: D37160145

Pulled By: atalman

fbshipit-source-id: 3e4a2208cbe47f85147573159f9adb8d6a824956

722982ea

Update config.guess to the latest (#2479) · 575478ec

moto authored Jun 14, 2022

Summary:
closes https://github.com/pytorch/audio/issues/2420

Pull Request resolved: https://github.com/pytorch/audio/pull/2479

Reviewed By: carolineechen

Differential Revision: D37142717

Pulled By: mthrok

fbshipit-source-id: c3d4cc1435a74dfa6992112590c988c2903511a8

575478ec

Disable lint CI signal (#2487) · 1e3cc6b2

moto authored Jun 14, 2022

Summary:
The lint style has diverged since the fb-internal lint engine was changed.

Backport of https://github.com/pytorch/audio/issues/2466.

Pull Request resolved: https://github.com/pytorch/audio/pull/2487

Reviewed By: carolineechen

Differential Revision: D37160193

Pulled By: mthrok

fbshipit-source-id: cf4e2091a78a0da53269ae1251a55d4d1e52ead2

1e3cc6b2

Pin MKL to 2020.04 (#2486) · be213bfb

moto authored Jun 14, 2022

Summary:
The version of MKL that is installed alongside PyTorch has been bumped
to 2022.1 on Windows, and it is causing installation issues in unit tests.

This commit pins the previous version.

Backport of https://github.com/pytorch/audio/issues/2463

Pull Request resolved: https://github.com/pytorch/audio/pull/2486

Reviewed By: nateanl

Differential Revision: D37160156

Pulled By: mthrok

fbshipit-source-id: 7e3a30c25782b349a3cad2ee6d1141affc921881

be213bfb

14 Jun, 2022 2 commits

Adding conda builds for M1 (#2473) · 489999e2

Andrey Talman authored Jun 14, 2022

Summary:
Adding conda builds for M1

Pull Request resolved: https://github.com/pytorch/audio/pull/2473

Reviewed By: mthrok

Differential Revision: D37151454

Pulled By: atalman

fbshipit-source-id: 0108b937a4c7048bd4bb03b2b5a367704d7b78cc

489999e2

Add note about `normalize` argument (#2449) · 6fa5732c

moto authored Jun 13, 2022

Summary:
The `load` function has a `normalize` argument, which converts the native
sample type to `torch.float32`.

This argument is confusing for audio practitioners, as it seems
to perform [volume normalization](https://en.wikipedia.org/wiki/Audio_normalization).

See https://github.com/pytorch/audio/issues/2253

Due to BC-breaking concerns, we cannot easily change the argument name.
This commit adds warnings to the documentation.
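
The difference between the two meanings of "normalize" can be illustrated with plain Python lists. This is a sketch, not torchaudio code: `int16_to_float` and `peak_normalize` are hypothetical helpers, and dividing by 32768 is an assumed scaling for 16-bit input (the commit only states that the sample type is converted to `torch.float32`).

```python
from typing import List


def int16_to_float(samples: List[int]) -> List[float]:
    """Sample-type conversion: rescale int16 values into [-1.0, 1.0)."""
    return [s / 32768 for s in samples]


def peak_normalize(samples: List[float]) -> List[float]:
    """Volume (peak) normalization: scale so the loudest sample has magnitude 1.0."""
    peak = max(abs(s) for s in samples)
    return list(samples) if peak == 0 else [s / peak for s in samples]


quiet = [8192, -16384]             # a quiet 16-bit signal
as_float = int16_to_float(quiet)   # type conversion keeps the signal quiet
louder = peak_normalize(as_float)  # volume normalization changes the loudness

print(as_float)
print(louder)
```

Only the second function changes how loud the signal is, which is the behavior the argument name might wrongly suggest.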

Fixes https://github.com/pytorch/audio/issues/2253

Pull Request resolved: https://github.com/pytorch/audio/pull/2449

Reviewed By: nateanl

Differential Revision: D36995756

Pulled By: carolineechen

fbshipit-source-id: 0b7db2758a355f6aafe06a2273bc72a1027690bd

6fa5732c

13 Jun, 2022 2 commits

Fix typo in nightly m1 ref (#2474) · a9c1e3a3

Andrey Talman authored Jun 13, 2022

Summary:
Fix typo in nightly m1 ref
See: https://github.com/pytorch/vision/pull/6158

Pull Request resolved: https://github.com/pytorch/audio/pull/2474

Reviewed By: malfet, mthrok

Differential Revision: D37117637

Pulled By: atalman

fbshipit-source-id: 2a8f7b5bf3506f2a53884424799919137870a0ad

a9c1e3a3

[AutoAccept][Codemod][FBSourceBlackLinter] Daily `arc lint --take BLACK` · 71ed457e

CodemodService FBSourceBlackLinterBot authored Jun 13, 2022

Reviewed By: ivanmurashko

Differential Revision: D37103342

fbshipit-source-id: adc908c790a413384bd88a75d3c2b4b0974c6674

71ed457e

10 Jun, 2022 2 commits

Adding tagged builds to torchaudio (#2471) · 19d93282

Andrey Talman authored Jun 10, 2022

Summary:
Adding tagged builds for torchaudio
see: https://github.com/pytorch/vision/pull/6140

Pull Request resolved: https://github.com/pytorch/audio/pull/2471

Reviewed By: hwangjeff

Differential Revision: D37080828

Pulled By: atalman

fbshipit-source-id: 13d754f522510514f0148ba465ce12a320058722

19d93282

Modifying Pitchshift for faster resampling (#2441) · df2262b5

Sean Kim authored Jun 10, 2022

Summary:
Split the existing Pitchshift into multiple helper functions in order to cache the kernel and speed up the overall process, addressing https://github.com/pytorch/audio/issues/2359.
Existing unit tests pass.

Edit: functional and transforms unit tests pass. Adopted lazy initialization to avoid BC-breaking changes.
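
The general idea of caching a kernel with lazy initialization can be sketched as below. This is a toy model and not the Pitchshift code: `LazyKernelFilter` is a hypothetical class, its kernel is a small sinc low-pass filter chosen only for illustration, and it does not change the sample rate. The point is that the kernel is built on the first call and reused afterwards, while the constructor signature stays cheap and unchanged.

```python
import math
from typing import List, Optional


class LazyKernelFilter:
    """Toy filter that builds its kernel on first use and caches it."""

    def __init__(self, cutoff: float = 0.5, width: int = 3):
        self.cutoff = cutoff
        self.width = width
        self._kernel: Optional[List[float]] = None  # not built yet
        self.builds = 0  # counts how often the kernel was computed

    def _build_kernel(self) -> List[float]:
        self.builds += 1
        taps = []
        for n in range(-self.width, self.width + 1):
            x = n * self.cutoff
            sinc = 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)
            taps.append(self.cutoff * sinc)
        return taps

    def kernel(self) -> List[float]:
        if self._kernel is None:  # lazy initialization
            self._kernel = self._build_kernel()
        return self._kernel

    def __call__(self, signal: List[float]) -> List[float]:
        kernel = self.kernel()
        half = len(kernel) // 2
        out = []
        for i in range(len(signal)):
            acc = 0.0
            for j, tap in enumerate(kernel):
                k = i + j - half
                if 0 <= k < len(signal):
                    acc += tap * signal[k]
            out.append(acc)
        return out


filt = LazyKernelFilter()
first = filt([0.0, 1.0, 0.0, 0.0])
second = filt([0.0, 1.0, 0.0, 0.0])
print(filt.builds)
```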

Pull Request resolved: https://github.com/pytorch/audio/pull/2441

Reviewed By: carolineechen

Differential Revision: D36905582

Pulled By: skim0514

fbshipit-source-id: 6780db3ac8a29d59017a6abe7e82ce1fd17aaac2

df2262b5

08 Jun, 2022 5 commits

Fix metadata fetch (#2464) · 4d2fa190

moto authored Jun 08, 2022

Summary:
In https://github.com/pytorch/audio/issues/2461, a `metadata` field was added to StreamInfo.
However, the value attached to this new field was source-level metadata,
while each stream can have different metadata.

* source level metadata
[AVFormatContext->metadata](https://ffmpeg.org/doxygen/4.1/structAVFormatContext.html#a3019a56080ed2e3297ff25bc2ff88adf)
* stream level metadata
[AVFormatContext->streams[]->metadata](https://ffmpeg.org/doxygen/4.1/structAVStream.html#a50d250a128a3da9ce3d135e84213fb82)

This commit moves source-level metadata to a dedicated method, `get_metadata`, and
fixes the stream-level field to report stream metadata.
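
The separation between the two levels of metadata can be modeled with a pair of small data classes. This is a sketch under stated assumptions, not the torchaudio API: `Source`, `get_stream_info`, and the sample tag values are hypothetical, and only the names `StreamInfo`, `metadata`, and `get_metadata` come from the summary.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StreamInfo:
    """Hypothetical stand-in: one stream with its own metadata."""
    codec: str
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class Source:
    """Hypothetical stand-in: a media source holding several streams."""
    metadata: Dict[str, str] = field(default_factory=dict)
    streams: List[StreamInfo] = field(default_factory=list)

    def get_metadata(self) -> Dict[str, str]:
        # Source-level metadata, shared by the whole container.
        return dict(self.metadata)

    def get_stream_info(self, index: int) -> StreamInfo:
        # Stream-level metadata lives on each stream.
        return self.streams[index]


src = Source(
    metadata={"title": "example"},
    streams=[
        StreamInfo("mp3", {"language": "eng"}),
        StreamInfo("h264", {"language": "und"}),
    ],
)
print(src.get_metadata())
print(src.get_stream_info(0).metadata, src.get_stream_info(1).metadata)
```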

Pull Request resolved: https://github.com/pytorch/audio/pull/2464

Reviewed By: hwangjeff, xiaohui-zhang

Differential Revision: D36995452

Pulled By: mthrok

fbshipit-source-id: 534be1f7feb07790a0ce8624c336cdb7b65a8697

4d2fa190

Update HW decoding tutorial and add notes about unseekable object (#2408) · 711d6016

moto authored Jun 08, 2022

Summary:
https://output.circle-artifacts.com/output/job/75187a52-b0d8-4cac-89f3-24e10889a36a/artifacts/0/docs/hw_acceleration_tutorial.html

1. Update the HW decoding tutorial to include file-like objects
2. Add a note about unseekable objects in the streaming API tutorial

Pull Request resolved: https://github.com/pytorch/audio/pull/2408

Reviewed By: hwangjeff

Differential Revision: D36632702

Pulled By: mthrok

fbshipit-source-id: 17be2fb8528cb1d2d1ee11901b6a95c512466feb

711d6016

Split Streaming API tutorials into two (#2446) · 2d846263

moto authored Jun 07, 2022

Summary:
The Streaming API tutorial has gotten long, so this commit splits it into two.

Pull Request resolved: https://github.com/pytorch/audio/pull/2446

Reviewed By: hwangjeff

Differential Revision: D36987513

Pulled By: mthrok

fbshipit-source-id: 13e3aad74c0d0e654c39c0eeceffca1a00b0dac4

2d846263

Add metadata to source stream info (#2461) · 10d1bd89

moto authored Jun 07, 2022

Summary:
Add metadata, such as an ID3 (https://github.com/pytorch/audio/commit/7d98db0567cb60fabcc173949b8c08e3a3487ac2) tag, to `StreamReaderSourceAudioStream`.

Pull Request resolved: https://github.com/pytorch/audio/pull/2461

Reviewed By: hwangjeff

Differential Revision: D36985656

Pulled By: mthrok

fbshipit-source-id: e66f9e6e980eb57c378cc643a8979b6b7813dae7

10d1bd89

Bump version to 0.13 (#2460) · 7d98db05

hwangjeff authored Jun 07, 2022

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2460

Reviewed By: nateanl, mthrok

Differential Revision: D36992043

Pulled By: hwangjeff

fbshipit-source-id: 3a2a7f8991beaeaa2af0f620985230a68df201c2

7d98db05

07 Jun, 2022 7 commits

Quesst14 return type change (#2458) · a5a7849a

Sean Kim authored Jun 07, 2022

Summary:
Fixing return types for quesst14

Pull Request resolved: https://github.com/pytorch/audio/pull/2458

Reviewed By: carolineechen

Differential Revision: D36977139

Pulled By: skim0514

fbshipit-source-id: f8f5a2de7cab2de1bec49c529c3bb9316145403d

a5a7849a

Remove CTC decoder prototype message (#2459) · da3ffe9b

Caroline Chen authored Jun 07, 2022

Summary:
The CTC decoder has been moved to beta, so remove the prototype message from the tutorial.

(this is done on the release branch in https://github.com/pytorch/audio/issues/2457)

Pull Request resolved: https://github.com/pytorch/audio/pull/2459

Reviewed By: hwangjeff

Differential Revision: D36978417

Pulled By: carolineechen

fbshipit-source-id: e580c1e8475a1a0aa924d44deea3852adc332a86

da3ffe9b

Add HuBERT fine-tuning recipe (#2352) · ab5edfcd

Zhaoheng Ni authored Jun 07, 2022

Summary:
The PR contains the CTC fine-tuning recipe for the HuBERT Base model.
The files include:
- lightning module
- training script
- README and the result table
- evaluation scripts

Pull Request resolved: https://github.com/pytorch/audio/pull/2352

Reviewed By: hwangjeff

Differential Revision: D36915712

Pulled By: nateanl

fbshipit-source-id: 0249635ad5e81a8aa2d228c1d5fe84d78b62a15b

ab5edfcd

Update audio I/O tutorials (#2385) · 4c19e2cb

moto authored Jun 07, 2022

Summary:
- Adopt `torchaudio.utils.download_asset` to simplify asset management.
- Break down the first section about helper functions.
- Use tempfile so that executing the tutorial won't leave any artifacts on the local file system.
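
The tempfile approach mentioned in the last item can be sketched with the standard library alone. This is not the tutorial code: `write_silence` is a hypothetical helper, and the `wave` module stands in for the audio I/O functions the tutorial actually demonstrates.

```python
import os
import tempfile
import wave


def write_silence(path: str, sample_rate: int = 8000, num_frames: int = 80) -> None:
    """Write a short mono 16-bit WAV file containing silence."""
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(sample_rate)
        f.writeframes(b"\x00\x00" * num_frames)


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.wav")
    write_silence(path)
    with wave.open(path, "rb") as f:
        frames_read = f.getnframes()

# The directory and everything in it are removed when the block exits.
leftover = os.path.exists(path)
print(frames_read, leftover)
```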

Example: https://output.circle-artifacts.com/output/job/b11a0087-8bf9-4999-a74f-b53798eaa77f/artifacts/0/docs/tutorials/audio_io_tutorial.html

Pull Request resolved: https://github.com/pytorch/audio/pull/2385

Reviewed By: hwangjeff

Differential Revision: D36404399

Pulled By: mthrok

fbshipit-source-id: 106af34e8ddd22a061aa12767b444b32aef07bad

4c19e2cb

[DOC/CI] Store doc as tar archive (#2448) · 550e6dcb

moto authored Jun 07, 2022

Summary:
At release time, we need to download the documentation built by CI.
CircleCI does not have a feature to download multiple files.

This commit adds an archive of the built documentation as a
CI artifact so that the whole documentation can be downloaded
at once.
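
Bundling a documentation tree into a single downloadable file can be sketched with Python's `tarfile` module. This is not the CI step added by this PR: `archive_directory`, the `docs` directory name, and the file names are hypothetical and serve only to illustrate the idea.

```python
import os
import tarfile
import tempfile


def archive_directory(src_dir: str, archive_path: str) -> str:
    """Pack a directory tree into one gzip-compressed tar archive."""
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(src_dir, arcname=os.path.basename(src_dir))
    return archive_path


with tempfile.TemporaryDirectory() as work:
    # Build a tiny stand-in for a generated documentation tree.
    docs = os.path.join(work, "docs")
    os.makedirs(os.path.join(docs, "tutorials"))
    for rel in ("index.html", os.path.join("tutorials", "intro.html")):
        with open(os.path.join(docs, rel), "w") as f:
            f.write("<html></html>")

    archive = archive_directory(docs, os.path.join(work, "docs.tar.gz"))
    with tarfile.open(archive) as tar:
        names = sorted(m.name for m in tar.getmembers() if m.isfile())

print(names)
```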

Resolves https://github.com/pytorch/audio/issues/2340

Pull Request resolved: https://github.com/pytorch/audio/pull/2448

Reviewed By: hwangjeff

Differential Revision: D36942077

Pulled By: mthrok

fbshipit-source-id: 61dde0d71841434a3d0624404d99911aa6956f88

550e6dcb

Update smoke test (#2455) · d2d8b670

moto authored Jun 06, 2022

Summary:
Import StreamReader from the new location

Pull Request resolved: https://github.com/pytorch/audio/pull/2455

Reviewed By: nateanl

Differential Revision: D36959668

Pulled By: mthrok

fbshipit-source-id: c2b8c9f9dff1ec306ea39c495294faa9208b3c4e

d2d8b670

Fix decoder compilation (#2450) · f11fc7cf

moto authored Jun 06, 2022

Summary:
Address https://github.com/pytorch/audio/issues/2445

Pull Request resolved: https://github.com/pytorch/audio/pull/2450

Reviewed By: carolineechen

Differential Revision: D36945877

Pulled By: mthrok

fbshipit-source-id: c7f9ba8093c8dc03b27582b9c608b023c7700332

f11fc7cf

06 Jun, 2022 1 commit

Set the default ffmpeg log level to FATAL (#2447) · 4e761081

moto authored Jun 06, 2022

Summary:
With the default log level, a completely sane operation like converting
YUV to RGB issues a bunch of warnings like

`[swscaler @ 0x128aa8000] No accelerated colorspace conversion found from yuv420p to rgb24.`

This commit sets the log level to FATAL.
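
The effect of a FATAL threshold can be illustrated with FFmpeg's numeric log levels, where smaller numbers are more severe. In the sketch below the constants mirror the `AV_LOG_*` values from FFmpeg's `libavutil/log.h`, while `is_emitted` is a hypothetical helper that models the filtering rule; it is not part of torchaudio or FFmpeg, and it assumes the swscaler message above is logged at warning level.

```python
# Numeric values of FFmpeg's AV_LOG_* constants (smaller means more severe).
AV_LOG_QUIET = -8
AV_LOG_PANIC = 0
AV_LOG_FATAL = 8
AV_LOG_ERROR = 16
AV_LOG_WARNING = 24
AV_LOG_INFO = 32
AV_LOG_VERBOSE = 40
AV_LOG_DEBUG = 48


def is_emitted(message_level: int, threshold: int = AV_LOG_FATAL) -> bool:
    """A message is printed only if it is at least as severe as the threshold."""
    return message_level <= threshold


# With the threshold at FATAL, warnings are dropped.
print(is_emitted(AV_LOG_WARNING))
# Raising the threshold to DEBUG brings them back for troubleshooting.
print(is_emitted(AV_LOG_WARNING, threshold=AV_LOG_DEBUG))
```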

Pull Request resolved: https://github.com/pytorch/audio/pull/2447

Reviewed By: hwangjeff

Differential Revision: D36938728

Pulled By: mthrok

fbshipit-source-id: 39c2e6a4307f1eac577fd606e17ab0f298079b54

4e761081

04 Jun, 2022 3 commits

Refactor LibriSpeech Lightning datamodule to accommodate different dataset implementations (#2437) · a63629b6

Jeff Hwang authored Jun 04, 2022

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/2437

Refactors LibriSpeech Lightning datamodule to accommodate different dataset implementations.

Reviewed By: carolineechen, nateanl

Differential Revision: D36731577

fbshipit-source-id: 4ba91044311fa3f99a928aef6ef411316955f6b5

a63629b6

Make FFmpeg log level configurable (#2439) · 877a88c5

moto authored Jun 03, 2022

Summary:
Undesired logs are one of the loudest UX complaints we get.
Yet, loading media files involves uncertainty, which is
difficult to debug without debug logs.

This commit introduces utility functions to configure the logging level
so that we can ask users to enable it when they encounter an issue,
while defaulting to a non-verbose option.

Pull Request resolved: https://github.com/pytorch/audio/pull/2439

Reviewed By: hwangjeff, xiaohui-zhang

Differential Revision: D36903763

Pulled By: mthrok

fbshipit-source-id: f4ddd9915b13197c2a2eb97e965005b8b5b8d987

877a88c5

Update CTC decoder docs (#2443) · 3229fc55

Caroline Chen authored Jun 03, 2022

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2443

Reviewed By: nateanl

Differential Revision: D36909822

Pulled By: carolineechen

fbshipit-source-id: ef3ab2345e7a4666cf29dd02c83d03504e8aa62c

3229fc55

03 Jun, 2022 4 commits

Update audio data augmentation tutorial (#2388) · 41082eb0

moto authored Jun 03, 2022

Summary:
- Adopt `torchaudio.utils.download_asset` to simplify asset management.
- Break down the first section about helper functions.
- Reduce the number of helper functions

https://output.circle-artifacts.com/output/job/d7dd1b93-6dfe-46da-a080-109bfdc63881/artifacts/0/docs/tutorials/audio_data_augmentation_tutorial.html

Pull Request resolved: https://github.com/pytorch/audio/pull/2388

Reviewed By: carolineechen

Differential Revision: D36404405

Pulled By: mthrok

fbshipit-source-id: f460ed810519797fce6e2fa7baaee110bddd1d06

41082eb0

Update audio resampling tutorial (#2386) · fd2be89a

moto authored Jun 03, 2022

Summary:
- Replace misuse of plot_specgram with plot_sweep, and remove plot_specgram
- Move `benchmark_resample` to later section

https://output.circle-artifacts.com/output/job/9f7af187-777d-4d75-840f-2630a36295b7/artifacts/0/docs/tutorials/audio_resampling_tutorial.html

Pull Request resolved: https://github.com/pytorch/audio/pull/2386

Reviewed By: carolineechen

Differential Revision: D36404403

Pulled By: mthrok

fbshipit-source-id: f9df8453e3f531bdc4549b0134e5dbba90653bf7

fd2be89a

Update audio feature extraction tutorial (#2391) · 8e20d546

moto authored Jun 03, 2022

Summary:
- Adopt torchaudio.utils.download_asset to simplify asset management.
- Break down the first section about helper functions.
- Reduce the number of helper functions

Pull Request resolved: https://github.com/pytorch/audio/pull/2391

Reviewed By: carolineechen, nateanl

Differential Revision: D36885626

Pulled By: mthrok

fbshipit-source-id: 1306f22ab70ab1e7f74ed7e43bf43150015448b6

8e20d546

Remove possible manual seeds from test files. (#2436) · f0bc00c9

Sean Kim authored Jun 03, 2022

Summary:
Removed manual seeds from test files where applicable. Refactoring for https://github.com/pytorch/audio/issues/2267

Pull Request resolved: https://github.com/pytorch/audio/pull/2436

Reviewed By: carolineechen

Differential Revision: D36896854

Pulled By: skim0514

fbshipit-source-id: 7b4dd8a8dbfbef271f5cc56564dc83a760407e6c

f0bc00c9