Commits · 8b85ca5d326465a4344decc63653cb539a1b2f66 · OpenDAS / Torchaudio

24 May, 2023 5 commits

moto authored May 24, 2023

Summary:
Follow-up https://github.com/pytorch/audio/issues/3045
- Revert the removal of HW acceleration doc
- comment out FFmpeg CLI test run

Pull Request resolved: https://github.com/pytorch/audio/pull/3349

Reviewed By: nateanl

Differential Revision: D46121899

Pulled By: mthrok

fbshipit-source-id: dfc030a69f05addec73637cfb6a720c184e37323

8b85ca5d

Update smoke test (#3346) · 71b2634b

moto authored May 24, 2023

Summary:
* Delay the import of torchaudio until the CLI options are parsed.
* Add option to set log level to DEBUG so that it's easy to see the issue with external libraries.
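
A minimal sketch of the pattern described above, not the actual smoke-test script: the import is deferred until after option parsing, so a DEBUG log level can be in place before the library loads its external dependencies. The `--debug` flag name and the function names are assumptions made for illustration.

```python
import argparse
import logging


def parse_args(argv=None):
    # Hypothetical option name; the real script may use a different flag.
    parser = argparse.ArgumentParser(description="Smoke test sketch")
    parser.add_argument("--debug", action="store_true", help="Set log level to DEBUG")
    return parser.parse_args(argv)


def main(argv=None):
    options = parse_args(argv)
    if options.debug:
        # Configure logging first so messages emitted during import are visible.
        logging.basicConfig(level=logging.DEBUG)
    # Deferred import: runs only after the CLI options have been handled.
    import torchaudio

    print(torchaudio.__version__)
```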

Pull Request resolved: https://github.com/pytorch/audio/pull/3346

Reviewed By: nateanl

Differential Revision: D46022546

Pulled By: mthrok

fbshipit-source-id: 9f988bbd770c2fd2bb260c3cfe02b238a9da2808

71b2634b

Amend commit to gh-pages branch (#3345) · a79cf3ba

moto authored May 24, 2023

Summary:
This commit changes the way the doc is pushed.
It amends the existing commit instead of adding a new one.

Currently, each commit in gh-pages contains about 100 MB of data. The gh-pages branch is fetched by default by `git clone`, so the size of the torchaudio repo grows significantly.

Pull Request resolved: https://github.com/pytorch/audio/pull/3345

Reviewed By: nateanl

Differential Revision: D46136612

Pulled By: mthrok

fbshipit-source-id: 39479ee5d1a6888254ef50f0db252453d976d183

a79cf3ba

Remove CUDA 11.7 builds; replace with 11.8 (#3360) · 5a6f4eba

pbialecki authored May 24, 2023

Summary:
CC atalman malfet

Pull Request resolved: https://github.com/pytorch/audio/pull/3360

Reviewed By: mthrok

Differential Revision: D46150898

Pulled By: atalman

fbshipit-source-id: 985a0ef69406f48fb15f239d6b16616c0a5379f5

5a6f4eba

Resolve lint issue on LaTeX (#3366) · 8690e6ec

moto authored May 23, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3366

Reviewed By: nateanl

Differential Revision: D46136238

Pulled By: mthrok

fbshipit-source-id: 3432f5d007293831bab21460a79ae26b1bbc81a8

8690e6ec

23 May, 2023 6 commits

[BugFix] Fix extract_features method for WavLM models (#3350) · 7d0f3369

Zhaoheng Ni authored May 23, 2023

Summary:
resolve https://github.com/pytorch/audio/issues/3347

`position_bias` is ignored in the `extract_features` method. This doesn't affect Wav2Vec2 or HuBERT models, but it changes the output of the transformer layers (except the first layer) in the WavLM model. This PR fixes it by adding `position_bias` to the method.

Pull Request resolved: https://github.com/pytorch/audio/pull/3350

Reviewed By: mthrok

Differential Revision: D46112148

Pulled By: nateanl

fbshipit-source-id: 3d21aa4b32b22da437b440097fd9b00238152596

7d0f3369

[Nova] MacOS Unittests on Nova (#3324) · fce54fd1

Omkar Salpekar authored May 23, 2023

Summary:
As discussed in the [Torchaudio Migration Proposal](https://docs.google.com/document/d/1PF8biwiGzsjzfEBM78mlLiRrkcsGsvuYkeqkI66Ym8A/edit), this PR moves MacOS unittest job to Nova tooling. Note that this does not touch anything within the existing CircleCI job at the moment.

Passing job: https://github.com/pytorch/audio/actions/runs/4932497525/jobs/8815581251?pr=3324

Pull Request resolved: https://github.com/pytorch/audio/pull/3324

Reviewed By: atalman, mthrok

Differential Revision: D46113524

Pulled By: osalpekar

fbshipit-source-id: d048d300489f992fa187628cb6744d95ab4fb68a

fce54fd1

Fix cuda test failure (#3363) · fa59855f

Zhaoheng Ni authored May 23, 2023

Summary:
Fix https://github.com/pytorch/audio/issues/3361

When adding FunctionalCUDAOnlyTest, the class should inherit from `TestBaseMixin` instead of `Functional`

Pull Request resolved: https://github.com/pytorch/audio/pull/3363

Reviewed By: atalman, osalpekar

Differential Revision: D46112084

Pulled By: nateanl

fbshipit-source-id: 67c6472fda98cb718e0fc53ab248beda745feab5

fa59855f

Unset BPS when using sox vorbis (#3359) · d850ff61

moto authored May 23, 2023

Summary:
When saving audio with vorbis, BPS should not be specified, otherwise warnings that cannot be turned off are shown.

Address: https://github.com/pytorch/audio/issues/3358

Pull Request resolved: https://github.com/pytorch/audio/pull/3359

Reviewed By: nateanl

Differential Revision: D46095037

Pulled By: mthrok

fbshipit-source-id: 6885a12dc3ec84bf39f0159ee58d1a2a87cff7e4

d850ff61

[Nova] Linux CPU Unittests to Nova (#3323) · 2255a0fc

Omkar Salpekar authored May 23, 2023

Summary:
As discussed in the [Torchaudio Migration Proposal](https://docs.google.com/document/d/1PF8biwiGzsjzfEBM78mlLiRrkcsGsvuYkeqkI66Ym8A/edit), this PR moves the Linux CPU unittest job to Nova tooling. Note that this does not disable the existing CircleCI job at the moment.

Passing Job: https://github.com/pytorch/audio/actions/runs/4986115298/jobs/8926499354?pr=3323

Pull Request resolved: https://github.com/pytorch/audio/pull/3323

Reviewed By: atalman, mthrok

Differential Revision: D46113506

Pulled By: osalpekar

fbshipit-source-id: 1778c360e17b9d02c63bcc60100834c75798d380

2255a0fc

[audio] add CTC forced alignment API tutorial to torchaudio (#3356) · f046f7e3

Xiaohui Zhang authored May 22, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3356

move the forced aligner tutorial to torchaudio, with some formatting changes

Reviewed By: mthrok

Differential Revision: D46060238

fbshipit-source-id: d90e7db5669a58d1e9ef5c2ec3c6d175b4e394ec

f046f7e3

22 May, 2023 4 commits

Cleaning up Deprecated Jobs from CCI Config (#3340) · 150234bd

Omkar Salpekar authored May 22, 2023

Summary:
Cleaning up CCI configs that are no longer used.

Pull Request resolved: https://github.com/pytorch/audio/pull/3340

Reviewed By: mthrok

Differential Revision: D46077882

Pulled By: osalpekar

fbshipit-source-id: 0dce08fc14b5efc4517ab1f559e7ef7eb245af64

150234bd

Update forced_align document (#3357) · c0702338

Zhaoheng Ni authored May 22, 2023

Summary:
- Fix latex formula rendering issue
- Add `devices` and `properties` tags
- Fix grammar

Pull Request resolved: https://github.com/pytorch/audio/pull/3357

Reviewed By: mthrok

Differential Revision: D46068633

Pulled By: nateanl

fbshipit-source-id: 80cb84508396fbcaf81c068228d46a24bb63b975

c0702338

Fix CPU kernel of forced_align function (#3354) · 8a893fb3

Zhaoheng Ni authored May 21, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3354

When start == 0, the first item of row t in backPtr_a, rather than the S-th item, should be 0.

Reviewed By: xiaohui-zhang

Differential Revision: D46059971

fbshipit-source-id: 89933134878513034eae033764b19f8562f24cb8

8a893fb3

Add doc for forced_align (#3355) · 011f7f3d

Zhaoheng Ni authored May 21, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3355

Reviewed By: xiaohui-zhang

Differential Revision: D46060254

Pulled By: nateanl

fbshipit-source-id: c2e44f994739755daf049fe350dd24a987a9cc29

011f7f3d

21 May, 2023 2 commits

Revert D45960556: add CTC forced alignment API tutorial to torchaudio · f9b4f74f

Moto Hira authored May 20, 2023

Differential Revision:
D45960556

Original commit changeset: 93f2271f7130

Original Phabricator Diff: D45960556

fbshipit-source-id: d22883fbcf9c5f2bb5d49274bcc194bdffaca72a

f9b4f74f

add CTC forced alignment API tutorial to torchaudio (#3351) · 93adc3e4

Xiaohui Zhang authored May 20, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3351

move the forced aligner tutorial to torchaudio, with some formatting changes

Reviewed By: vineelpratap, nateanl

Differential Revision: D45960556

fbshipit-source-id: 93f2271f71307404e6a7732385cf7d646dc8ceaa

93adc3e4

20 May, 2023 1 commit

[audio][PR] Add forced_align function to torchaudio (#3348) · e7935cff

Zhaoheng Ni authored May 19, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3348

The pull request adds a CTC-based forced alignment function that supports both CPU and CUDA devices. The function takes the CTC emissions and target labels as inputs and generates the corresponding labels for each frame.
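
To illustrate what "generates the corresponding labels for each frame" means, here is a pure-Python Viterbi sketch of CTC forced alignment. It is not the torchaudio kernel: the function name, the list-based inputs, and the choice of `blank=0` are assumptions, and it expects at least one frame.

```python
import math


def ctc_forced_align(log_probs, targets, blank=0):
    """Return one label per frame (hypothetical helper, not the torchaudio API).

    log_probs: T rows of per-class log-probabilities; targets: label indices.
    """
    num_frames = len(log_probs)
    # Interleave blanks: [blank, t1, blank, t2, ..., blank].
    ext = [blank]
    for token in targets:
        ext += [token, blank]
    size = len(ext)
    neg_inf = -math.inf
    score = [[neg_inf] * size for _ in range(num_frames)]
    back = [[0] * size for _ in range(num_frames)]
    # A path may start on the leading blank or on the first label.
    score[0][0] = log_probs[0][ext[0]]
    if size > 1:
        score[0][1] = log_probs[0][ext[1]]
    for t in range(1, num_frames):
        for s in range(size):
            candidates = [(score[t - 1][s], s)]  # stay on the same symbol
            if s >= 1:
                candidates.append((score[t - 1][s - 1], s - 1))  # advance by one
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                candidates.append((score[t - 1][s - 2], s - 2))  # skip a blank
            best, prev = max(candidates)
            if best > neg_inf:
                score[t][s] = best + log_probs[t][ext[s]]
                back[t][s] = prev
    # A valid path ends on the last label or on the trailing blank.
    s = size - 1
    if size > 1 and score[-1][size - 2] > score[-1][size - 1]:
        s = size - 2
    if score[-1][s] == neg_inf:
        raise ValueError("targets cannot be aligned to this number of frames")
    path = [blank] * num_frames
    for t in range(num_frames - 1, -1, -1):
        path[t] = ext[s]
        s = back[t][s]
    return path
```

Each frame receives either a target label or the blank, and the labels appear in target order.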

Reviewed By: vineelpratap, xiaohui-zhang

Differential Revision: D45867265

fbshipit-source-id: 3e25b06bf9bc8bb1bdcdc08de7f4434d912154cb

e7935cff

19 May, 2023 1 commit

Build and use GPU-enabled FFmpeg in doc CI (#3045) · 0db5ab25

moto authored May 19, 2023

Summary:
This commit adds a step to build FFmpeg with the GPU decoder in the build_doc job so that we can use the GPU decoder/encoder in the documentation.

Pull Request resolved: https://github.com/pytorch/audio/pull/3045

Reviewed By: nateanl

Differential Revision: D45965739

Pulled By: mthrok

fbshipit-source-id: c167eb3ef347860a51efa906068fa2daa556f017

0db5ab25

17 May, 2023 4 commits

Improve the performance of YUV420P frame conversion (#3342) · 72d3fe09

moto authored May 17, 2023

Summary:
This commit improves the performance of converting YUV420P frames from AVFrame to torch Tensor.

It changes two things:
1. Change the implementation of nearest-neighbor upsampling from `torch::nn::functional::interpolate` to manual data copy.
2. Get rid of intermediate UV plane copy
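
The idea behind the first item can be sketched in a few lines. In YUV420P each chroma plane has half the luma resolution in both dimensions, so nearest-neighbor upsampling by a factor of two amounts to copying every chroma sample into a 2x2 block. The sketch below uses plain Python lists and a hypothetical function name; the actual change is in the C++ conversion code.

```python
def upsample_nearest_2x(plane):
    """Duplicate every sample into a 2x2 block (nearest-neighbor, factor 2)."""
    out = []
    for row in plane:
        # Repeat each sample horizontally.
        wide = [value for value in row for _ in range(2)]
        # Repeat the widened row vertically.
        out.append(wide)
        out.append(list(wide))
    return out


chroma = [[1, 2], [3, 4]]
for row in upsample_nearest_2x(chroma):
    print(row)
```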

The following table compares the time taken to process 30 seconds of YUV420P video at 25 FPS and 320x240 resolution. The measurements are sorted by value.

Some observations
* `torch::nn::functional::interpolate` with `torch::kNearest` option is not as fast as copying data manually.
* switching from `interpolate` to manual data copy reduces the variance.

run | main | 1 | 1+2 | improvement (from main to 1+2)
-- | -- | -- | -- | --
1 | 0.452250583 | 0.417490125 | 0.40155375 | 11.21%
2 | 0.462039958 | 0.42006675 | 0.401764125 | 13.05%
3 | 0.463067666 | 0.42416 | 0.402651334 | 13.05%
4 | 0.464228166 | 0.424545458 | 0.402985667 | 13.19%
5 | 0.465777375 | 0.425629208 | 0.405604625 | 12.92%
6 | 0.469628666 | 0.427044333 | 0.40628525 | 13.49%
7 | 0.475935125 | 0.42805875 | 0.406412167 | 14.61%
8 | 0.482277667 | 0.429921209 | 0.407279 | 15.55%
9 | 0.496695208 | 0.431182792 | 0.442013791 | 11.01%
10 | 0.546653625 | 0.541639584 | 0.4711585 | 13.81%

[second]
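
The improvement column appears to be the relative reduction in elapsed time with respect to `main`. A small sketch of that arithmetic, using the first row of the table above (the function name is an assumption):

```python
def improvement_percent(baseline_seconds, new_seconds):
    """Relative time reduction, in percent, with respect to the baseline."""
    return (baseline_seconds - new_seconds) / baseline_seconds * 100.0


# Run 1: main = 0.452250583 s, 1+2 = 0.40155375 s
print(f"{improvement_percent(0.452250583, 0.40155375):.2f}%")
```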

At a higher resolution, the improvement is smaller but consistent.

run | main | 1+2 | improvement
-- | -- | -- | --
1 | 4.032393 | 3.991784667 | 1.01%
2 | 4.052248084 | 3.992672208 | 1.47%
3 | 4.07705575 | 4.000541666 | 1.88%
4 | 4.143954792 | 4.020671584 | 2.98%
5 | 4.170711959 | 4.025753125 | 3.48%
6 | 4.240229292 | 4.045504875 | 4.59%
7 | 4.267384042 | 4.045588125 | 5.20%
8 | 4.277025958 | 4.061980083 | 5.03%
9 | 4.312192042 | 4.163251959 | 3.45%
10 | 4.406109875 | 4.312560334 | 2.12%

<details><summary>code</summary>

```python
import time

from torchaudio.io import StreamReader

def test():
    r = StreamReader(src="testsrc=duration=30", format="lavfi")
    # r = StreamReader(src="testsrc=duration=30:size=1080x720", format="lavfi")
    r.add_video_stream(-1, filter_desc="format=yuv420p")
    t0 = time.monotonic()
    r.process_all_packets()
    elapsed = time.monotonic() - t0
    print(elapsed)

for _ in range(10):
    test()
```
</details>

<details><summary>env</summary>

```
PyTorch version: 2.1.0.dev20230325
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 13.3.1 (arm64)
GCC version: Could not collect
Clang version: 14.0.6
CMake version: version 3.22.1
Libc version: N/A

Python version: 3.9.16 (main, Mar  8 2023, 04:29:24)  [Clang 14.0.6 ] (64-bit runtime)
Python platform: macOS-13.3.1-arm64-arm-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Apple M1

Versions of relevant libraries:
[pip3] torch==2.1.0.dev20230325
[pip3] torchaudio==2.1.0a0+541b525
[conda] pytorch                   2.1.0.dev20230325         py3.9_0    pytorch-nightly
[conda] torchaudio                2.1.0a0+541b525           dev_0    <develop>
```

</details>

Pull Request resolved: https://github.com/pytorch/audio/pull/3342

Reviewed By: xiaohui-zhang

Differential Revision: D45947716

Pulled By: mthrok

fbshipit-source-id: 17e5930f57544b4f2e48a9b2185464694a88ab68

72d3fe09

Improve the performance of NV12 frame conversion (#3344) · c11661e0

moto authored May 17, 2023

Summary:
Similar to https://github.com/pytorch/audio/pull/3342, this commit improves the performance of NV12 frame conversion.

It changes two things:

- Change the implementation of nearest-neighbor upsampling from `torch::nn::functional::interpolate` to manual data copy.
- Get rid of intermediate UV plane copy
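
NV12 differs from YUV420P in that the U and V samples share a single interleaved plane at half resolution. A sketch of how such a plane could be split and upsampled by direct copying, assuming rows laid out as `[U0, V0, U1, V1, ...]`; the function name and list-based layout are illustrative and not taken from the C++ code.

```python
def nv12_chroma_to_planes(uv_rows):
    """Split interleaved UV rows into full-resolution U and V planes."""
    u_plane, v_plane = [], []
    for row in uv_rows:
        # Even positions hold U, odd positions hold V; repeat each sample twice.
        u_row = [value for value in row[0::2] for _ in range(2)]
        v_row = [value for value in row[1::2] for _ in range(2)]
        # Each half-resolution row covers two output rows.
        u_plane += [u_row, list(u_row)]
        v_plane += [v_row, list(v_row)]
    return u_plane, v_plane
```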

with 320x240

run | main | pr | improvement
-- | -- | -- | --
1 | 0.600671417 | 0.464993125 | 22.59%
2 | 0.638846084 | 0.456763542 | 28.50%
3 | 0.64158175 | 0.458295333 | 28.57%
4 | 0.649868584 | 0.455450583 | 29.92%
5 | 0.612171333 | 0.462435625 | 24.46%
6 | 0.6128095 | 0.456716166 | 25.47%
7 | 0.632084583 | 0.463357083 | 26.69%
8 | 0.610733083 | 0.46148625 | 24.44%
9 | 0.613825834 | 0.4559555 | 25.72%
10 | 0.653857458 | 0.455375375 | 30.36%

[second]

with 1080x720 video

run | main | pr | improvement
-- | -- | -- | --
1 | 4.984154333 | 4.21090375 | 15.51%
2 | 4.988090625 | 4.239649375 | 15.00%
3 | 4.988896375 | 4.227277458 | 15.27%
4 | 4.998186584 | 4.161077042 | 16.75%
5 | 5.06180425 | 4.191672584 | 17.19%
6 | 5.108769667 | 4.198468458 | 17.82%
7 | 5.151363625 | 4.181942167 | 18.82%
8 | 5.199527875 | 4.239319084 | 18.47%
9 | 5.224903708 | 4.194901959 | 19.71%
10 | 5.333422583 | 4.320925792 | 18.98%

[second]

<details><summary>code</summary>

```python
import time

from torchaudio.io import StreamReader

def test():
    r = StreamReader(src="testsrc=duration=30", format="lavfi")
    # r = StreamReader(src="testsrc=duration=30:size=1080x720", format="lavfi")
    r.add_video_stream(-1, filter_desc="format=nv12")
    t0 = time.monotonic()
    r.process_all_packets()
    elapsed = time.monotonic() - t0
    print(elapsed)

for _ in range(10):
    test()
```
</details>

<details><summary>env</summary>

```
PyTorch version: 2.1.0.dev20230325
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 13.3.1 (arm64)
GCC version: Could not collect
Clang version: 14.0.6
CMake version: version 3.22.1
Libc version: N/A

Python version: 3.9.16 (main, Mar  8 2023, 04:29:24)  [Clang 14.0.6 ] (64-bit runtime)
Python platform: macOS-13.3.1-arm64-arm-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Apple M1

Versions of relevant libraries:
[pip3] torch==2.1.0.dev20230325
[pip3] torchaudio==2.1.0a0+541b525
[conda] pytorch                   2.1.0.dev20230325         py3.9_0    pytorch-nightly
[conda] torchaudio                2.1.0a0+541b525           dev_0    <develop>
```

</details>

Pull Request resolved: https://github.com/pytorch/audio/pull/3344

Reviewed By: xiaohui-zhang

Differential Revision: D45948511

Pulled By: mthrok

fbshipit-source-id: ae9b300cbcb4295f3f7470736f258280005a21e5

c11661e0

Fix for breadcrumbs displaying "Old version (stable)" on Nightly build (#3333) · 3ffd76c8

Carl Parker authored May 16, 2023

Summary:
Previously, `breadcrumbs.html` identified a nightly build version by the prefix "Nightly" which would normally be prepended to the version in `conf.py`. However, the version string is coming through without the "Nightly" prefix, so this change causes `breadcrumbs.html` to key on the substring "dev" instead.

The reason we aren't getting "Nightly" is apparently because the environment variable BUILD_VERSION is available, so `conf.py` is using the value of that env var instead of the version string imported from the `torchaudio` module itself, which actually appears to be incorrect; see below.

If I install torchaudio using

conda install torchaudio -c pytorch-nightly

then `torchaudio.__version__` returns the incorrect version string:

2.0.0.dev20230309
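
The difference between the two checks can be shown with the version string above. This is a Python stand-in for the template logic, with hypothetical function names; `breadcrumbs.html` itself is a template, not Python.

```python
def is_nightly_by_prefix(version):
    # Previous check: relies on a "Nightly" prefix being prepended.
    return version.startswith("Nightly")


def is_nightly_by_substring(version):
    # New check: keys on the "dev" substring of the version string.
    return "dev" in version


version = "2.0.0.dev20230309"
print(is_nightly_by_prefix(version), is_nightly_by_substring(version))
```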

Pull Request resolved: https://github.com/pytorch/audio/pull/3333

Reviewed By: mthrok

Differential Revision: D45926466

Pulled By: carljparker

fbshipit-source-id: d5516f2d9f1716c2400d3e9b285bd5d32b4b3a77

3ffd76c8

Add 420p10le CPU support to StreamReader (#3332) · c12f4734

moto authored May 16, 2023

Summary:
This commit adds support for decoding the YUV420P10LE format.

The image tensor returned for this format has:
- NCHW format (C == 3)
- int16 type
- value range [0, 2^10).

Note that the value range is different from what the "hevc_cuvid" decoder
returns. The "hevc_cuvid" decoder uses the full range of int16 (internally,
it's uint16) to express the color (with some intervals), but the values
returned by the CPU "hevc" decoder are within [0, 2^10).
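
Because the CPU decoder keeps samples in [0, 2^10), a consumer that wants 8-bit values can drop the two least significant bits. A sketch on plain Python integers rather than tensor elements; the function name is hypothetical.

```python
def ten_bit_to_eight_bit(value):
    """Map a 10-bit sample in [0, 2**10) to an 8-bit sample in [0, 2**8)."""
    if not 0 <= value < 2 ** 10:
        raise ValueError("expected a 10-bit sample")
    # Dropping the two lowest bits divides the range by four.
    return value >> 2
```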

Address https://github.com/pytorch/audio/issues/3331

Pull Request resolved: https://github.com/pytorch/audio/pull/3332

Reviewed By: hwangjeff

Differential Revision: D45925097

Pulled By: mthrok

fbshipit-source-id: 4e669b65c030f388bba2fdbb8f00faf7e2981508

c12f4734

16 May, 2023 3 commits

Upgrade to FFmpeg5 (#3298) · d38a7854

moto authored May 16, 2023

Summary:
This commit upgrades the version of FFmpeg that the TorchAudio binary distribution is compiled against to 5.0.4.

FFmpeg 5.0 was released in Jan 2022, and many package managers provide a version of FFmpeg v5.
Conda-forge lists 5.1 for all the platforms TorchAudio supports. https://anaconda.org/conda-forge/ffmpeg

Pull Request resolved: https://github.com/pytorch/audio/pull/3298

Reviewed By: hwangjeff

Differential Revision: D45865599

Pulled By: mthrok

fbshipit-source-id: d95638eb80daaf477a710a992f4ead9b9009bb9b

d38a7854

Remove obsolete third party dependencies of CTC decoder (#3339) · e4c1d70b

moto authored May 16, 2023

Summary:
TorchAudio has migrated the CTC decoder to flashlight-text, and the code related to the CTC decoder was removed in https://github.com/pytorch/audio/issues/3236.

This commit cleans up the residue, removing the third-party libraries used for the CTC decoder and the mention of the environment variable for the CTC decoder.

Pull Request resolved: https://github.com/pytorch/audio/pull/3339

Reviewed By: nateanl

Differential Revision: D45920878

Pulled By: mthrok

fbshipit-source-id: 8d93e64138697781570e5b0b1c9f86e1a7923a89

e4c1d70b

[Doc] Fix a word in documents (#3334) · 04f67546

Amir Masoud Nourollah authored May 15, 2023

Summary:
Removed a redundant "and".

Pull Request resolved: https://github.com/pytorch/audio/pull/3334

Reviewed By: xiaohui-zhang

Differential Revision: D45864314

Pulled By: mthrok

fbshipit-source-id: ad67bde8fa73eac995fbd0d3809709cc38486884

04f67546

15 May, 2023 1 commit

Switch windows nightly builds to GHA (#3330) · 00247576

atalman authored May 15, 2023

Summary:
Switch windows nightly builds to GHA

Similar to: https://github.com/pytorch/vision/pull/7578

Pull Request resolved: https://github.com/pytorch/audio/pull/3330

Reviewed By: mthrok

Differential Revision: D45871892

Pulled By: atalman

fbshipit-source-id: 817490a2abcaffceec5174c624f9e7d0377bbc4a

00247576

11 May, 2023 3 commits

Clean-up StreamReader/StreamWriter interface (#3328) · d9643f50

Moto Hira authored May 11, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3328

Make the `AVIOContext`-based constructor protected for better encapsulation.
AVFormatContext and optional AVIOContext are managed by StreamReader/Writer, so it's better that they are abstracted away from client code.

Reviewed By: hwangjeff

Differential Revision: D45779629

fbshipit-source-id: 44c31e8af785447cb47aad0c44bf4ecf1aeebeaa

d9643f50

Add doc preview (#3326) · 1c7309d2

moto authored May 11, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3326

Reviewed By: hwangjeff

Differential Revision: D45760678

Pulled By: mthrok

fbshipit-source-id: 79b5d846c93516ca90c9700279124a9a04470242

1c7309d2

Add 2.0.1 to the version compatibility matrix (#3325) · 608775bf

moto authored May 11, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3325

Reviewed By: hwangjeff

Differential Revision: D45759434

Pulled By: mthrok

fbshipit-source-id: f3b1127fcf3b23beeab61fb7ff18f1b89b11ddc6

608775bf

10 May, 2023 4 commits

[BC-Breaking] Switch to the backend dispatcher (#3241) · 4463fbdf

moto authored May 10, 2023

Summary:
This commit makes the code use the backend dispatcher by default. Enabling the backend dispatcher gives the FFmpeg-based I/O implementation higher priority (if the corresponding FFmpeg is available), and allows individual function calls to specify the backend.
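
The dispatch behavior described here can be pictured with a small stand-alone sketch. It is not torchaudio's implementation: the function name, the backend names other than FFmpeg, and the priority order after FFmpeg are assumptions made for illustration.

```python
def select_backend(available, requested=None, priority=("ffmpeg", "sox", "soundfile")):
    """Pick an I/O backend: explicit per-call request first, else by priority."""
    if requested is not None:
        # A caller can name the backend for an individual call.
        if requested not in available:
            raise RuntimeError(f"backend {requested!r} is not available")
        return requested
    for name in priority:
        # FFmpeg comes first, but only if it is actually available.
        if name in available:
            return name
    raise RuntimeError("no I/O backend is available")
```

With `{"ffmpeg", "sox"}` available and no request, this picks `"ffmpeg"`. Without FFmpeg it falls through to the next entry.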

See also https://github.com/pytorch/audio/issues/2950

Pull Request resolved: https://github.com/pytorch/audio/pull/3241

Reviewed By: hwangjeff

Differential Revision: D44709068

Pulled By: mthrok

fbshipit-source-id: 43aac3433f78a681df6669e9ac46e8ecf3beb1be

4463fbdf

Add AudioEffector tutorial (#3226) · 2ab49e5b

moto authored May 09, 2023

Summary:
https://output.circle-artifacts.com/output/job/fbfa6d9a-5014-42ac-8e77-c1e9565747e8/artifacts/0/docs/tutorials/effector_tutorial.html

Pull Request resolved: https://github.com/pytorch/audio/pull/3226

Reviewed By: nateanl

Differential Revision: D45402724

Pulled By: mthrok

fbshipit-source-id: bc9d1bc071f6f5062b9cc35d743b4a3016306262

2ab49e5b

Update `torchaudio` doc and tutorial (#3285) · 667c6a9e

moto authored May 09, 2023

Summary:
This commit is preparation for landing the dispatcher switch in https://github.com/pytorch/audio/issues/3241

Making the FFmpeg backend the default causes some issues in tutorials, so this commit disables it.
The IO tutorial will be updated after https://github.com/pytorch/audio/issues/3241 has landed to accommodate the change.

Since it is necessary to mention the migration-related changes in the IO tutorial,
I also updated the IO documentation to include the migration work so that it's easy to redirect.

Pull Request resolved: https://github.com/pytorch/audio/pull/3285

Reviewed By: nateanl

Differential Revision: D45671237

Pulled By: mthrok

fbshipit-source-id: cb541f6bd93cd9920019b8ec83210ea69d34f133

667c6a9e

[BC-Breaking] Update InverseMelScale solution (#3280) · 5a85a461

Zhaoheng Ni authored May 09, 2023

Summary:
Address https://github.com/pytorch/audio/issues/2643

- Replace `SGD` optimization with `torch.linalg.lstsq`, which is much faster.
- Add autograd test for `InverseMelScale`
- Update other tests
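
The change of solver can be read as the following least-squares problem. This is a sketch of the general formulation, and the symbols are assumptions that do not appear in the commit: $F \in \mathbb{R}^{n_{\text{freq}} \times n_{\text{mel}}}$ is the mel filter bank, $S$ the linear-frequency spectrogram, and $M$ the mel spectrogram.

$$
M = F^{\top} S, \qquad \hat{S} = \arg\min_{S} \left\lVert F^{\top} S - M \right\rVert_F^2
$$

An iterative optimizer such as SGD approaches $\hat{S}$ over many update steps, whereas a direct least-squares solver such as `torch.linalg.lstsq` computes a solution in a single call.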

Pull Request resolved: https://github.com/pytorch/audio/pull/3280

Reviewed By: hwangjeff

Differential Revision: D45679988

Pulled By: nateanl

fbshipit-source-id: a42e8bff9dc0f38e47e0482fd8a2aad902eedd59

5a85a461

09 May, 2023 6 commits

Remove NumPy from conda build env (#3315) · 282ed27a

moto authored May 09, 2023

Summary:
NumPy is an optional runtime dependency of TorchAudio, and it is not required at build time.

Pull Request resolved: https://github.com/pytorch/audio/pull/3315

Reviewed By: nateanl

Differential Revision: D45702243

Pulled By: mthrok

fbshipit-source-id: 6ca6598931764c46be6323868e8cce7c8adc5024

282ed27a

Refactor StreamReader/Writer PyBinding (#3296) · 8d7268f1

Moto Hira authored May 09, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3296

Reviewed By: hwangjeff

Differential Revision: D45503774

fbshipit-source-id: 806c22bd0f54fd0cea43d61ef3dbedd67ffeb012

8d7268f1

Add StreamReaderCustomIO (#3320) · 007cca23

Moto Hira authored May 09, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3320

Add StreamReaderCustomIO, which is analogous to StreamWriterCustomIO and which takes custom read/seek functions to fetch media data.

Reviewed By: hwangjeff

Differential Revision: D45482843

fbshipit-source-id: 3ccf771c0fdce153aaa7551053e9a77facedc983

007cca23

Refactor StreamWriterCustomIO (#3319) · 51767917

Moto Hira authored May 09, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3319

* Merge the source with StreamWriter
* Add docstrings
* Move CustomIO to detail::CustomOutput to prepare for adding CustomInput

Reviewed By: hwangjeff

Differential Revision: D45481807

fbshipit-source-id: 4a9ac8a57acda47b126f8ae18e607b72919f9988

51767917

Fix batch consistency test for InverseBarkScale (#3322) · 51cc1cbf

Zhaoheng Ni authored May 09, 2023

Summary:
The batch consistency test function should call `InverseBarkScale` instead of `InverseMelScale`.

Pull Request resolved: https://github.com/pytorch/audio/pull/3322

Reviewed By: mthrok

Differential Revision: D45691769

Pulled By: nateanl

fbshipit-source-id: 4a1ed80c4a56c3a847a49a8d02f8b5cbe4f09045

51cc1cbf

[BE] Add description to wheel package (#3321) · 3a49a2d2

Nikita Shulga authored May 09, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3321

Reviewed By: atalman, mthrok

Differential Revision: D45673225

Pulled By: malfet

fbshipit-source-id: f2b915f3307ba95445702e3018254ad254fe2bb3

3a49a2d2