Commits · f6d1bc96b61c22987ca9b2969339fe61976f77e8 · OpenDAS / Torchaudio

24 Feb, 2023 1 commit

Use autosummary for torchaudio.prototype.models documentation (#3084) · f6d1bc96

Zhaoheng Ni authored Feb 24, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3084

Reviewed By: mthrok

Differential Revision: D43550150

Pulled By: nateanl

fbshipit-source-id: 5c5e3d9461e375be202493e3399ff38ce5cd7690

f6d1bc96

23 Feb, 2023 5 commits

Replace c10::Dict with std::map in StreamReader/Writer (#3092) · c3310018

moto authored Feb 23, 2023

Summary:
This commit is a cleanup in preparation for future development.

We plan to pass more complicated objects between StreamReader and StreamWriter, and TorchBind is not expressive enough for defining intermediate objects, so we want to use PyBind11 for binding StreamReader/Writer.

PyBind11 converts a Python dict into std::map, while TorchBind converts it into c10::Dict. Because of this discrepancy, conversion from c10::Dict to std::map has to happen in multiple places, which makes the binding code thicker because it requires extra wrapper methods.

Using std::map reduces the number of wrapper methods / conversions, because the same method can be bound for file-like object and the others.

Pull Request resolved: https://github.com/pytorch/audio/pull/3092

Reviewed By: nateanl

Differential Revision: D43524808

Pulled By: mthrok

fbshipit-source-id: f7467c66ccd37dbf4abc337bbb18ffaac21a0058

c3310018

Add TCPGen context-biasing Conformer RNN-T (#2890) · 1ed330b5

G. Sun authored Feb 23, 2023

Summary:
This commit adds the implementation of the tree-constrained pointer generator (TCPGen) for contextual biasing.

An example for Librispeech can be found in audio/examples/asr/librispeech_biasing.

Maintainer's note (mthrok):
It seems that TrieNode would be better typed as a tuple, but changing the implementation from list to tuple
without running the code could introduce issues, so the code is unchanged; only the annotation uses tuple.

Pull Request resolved: https://github.com/pytorch/audio/pull/2890

Reviewed By: nateanl

Differential Revision: D43171447

Pulled By: mthrok

fbshipit-source-id: 372bb077d997d720401dbf2dbfa131e6a958e37e

1ed330b5

Remove Tensor binding from StreamReader (#3093) · d3c9295c

mthrok authored Feb 23, 2023

Summary:
Remove the Tensor input support from StreamReader

Follow up of https://github.com/pytorch/audio/pull/3086

Pull Request resolved: https://github.com/pytorch/audio/pull/3093

Reviewed By: xiaohui-zhang

Differential Revision: D43526066

Pulled By: mthrok

fbshipit-source-id: 57ba4866c413649173e1c2c3b23ba7de3231b7bc

d3c9295c

Deprecate the use of Tensor as a means of passing byte string (#3086) · a26c2f27

moto authored Feb 22, 2023

Summary:
The same functionality can be achieved by passing io.BytesIO to the constructor.
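
As a rough sketch of that replacement: encoded bytes can be wrapped in `io.BytesIO`, which gives a seekable file-like object. The helper name `as_file_like` and the stand-in payload are hypothetical, and the `StreamReader` call in the trailing comment is an assumption about how the object would be handed to the constructor.

```python
import io


def as_file_like(data: bytes) -> io.BytesIO:
    """Wrap an in-memory byte string in a seekable file-like object."""
    return io.BytesIO(data)


# Stand-in payload; in practice this would be the encoded media bytes.
payload = b"RIFF....WAVEfmt "
fileobj = as_file_like(payload)

# A file-like object supports chunked reads and seeking.
header = fileobj.read(4)
fileobj.seek(0)

# Assumed usage (not executed here):
#   from torchaudio.io import StreamReader
#   reader = StreamReader(fileobj)
```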

Pull Request resolved: https://github.com/pytorch/audio/pull/3086

Reviewed By: nateanl

Differential Revision: D43500360

Pulled By: mthrok

fbshipit-source-id: 2c6f37d100f50553b283c75c04fe57c8f9c07dc9

a26c2f27

Update CTCDecoder static build deprecation message (#3089) · 3b75b74f

moto authored Feb 22, 2023

Summary:
1. Fix spacing.
2. Move it to after successful import
3. Add link to the announcement issue

Pull Request resolved: https://github.com/pytorch/audio/pull/3089

Reviewed By: nateanl, xiaohui-zhang

Differential Revision: D43514075

Pulled By: mthrok

fbshipit-source-id: 3b2a24c65c63dab8c12c9c6aa1942a8354b2c0f1

3b75b74f

22 Feb, 2023 3 commits

Rename SQUIM_OBJECTIVE model to SquimObjective (#3087) · b0155938

Zhaoheng Ni authored Feb 22, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3087

Reviewed By: xiaohui-zhang, mthrok

Differential Revision: D43509865

Pulled By: nateanl

fbshipit-source-id: 569cc2ee8edd9de0b7d255a1e1075ac812b26cc8

b0155938

Fix ConformerWav2Vec2PretrainModel (#3085) · b35a5fcf

Zhaoheng Ni authored Feb 22, 2023

Summary:
Negative sampling should be applied to the unmasked features at the masked indices. This PR fixes that logic in ConformerWav2Vec2PretrainModel.

Pull Request resolved: https://github.com/pytorch/audio/pull/3085

Reviewed By: mthrok

Differential Revision: D43488570

Pulled By: nateanl

fbshipit-source-id: 3820400d50b74216bb98ca6a40dc6a7acca01564

b35a5fcf

Add objective metric estimation model for speech enhancement (#3042) · 3267c7ed

Zhaoheng Ni authored Feb 21, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3042

Reviewed By: mthrok

Differential Revision: D43405932

Pulled By: nateanl

fbshipit-source-id: 88f6dabae35565b699230e9909b8f68f4a57f5c7

3267c7ed

21 Feb, 2023 1 commit

Fix contiguous error when backpropagating through lfilter (#3080) · 6ab1325a

Chin-Yun Yu authored Feb 21, 2023

Summary:
I encountered the following error when using the filter with gradients enabled.

```sh
Traceback (most recent call last):
  File "/home/ycy/working/audio/test_backward.py", line 20, in <module>
    loss.backward()
  File "/home/ycy/miniconda3/envs/nightly/lib/python3.10/site-packages/torch/_tensor.py", line 488, in backward
    torch.autograd.backward(
  File "/home/ycy/miniconda3/envs/nightly/lib/python3.10/site-packages/torch/autograd/__init__.py", line 197, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: Expected input_signal_windows.is_contiguous() && a_coeff_flipped.is_contiguous() && padded_output_waveform.is_contiguous() to be true, but got false.  (Could this error message be improved?  If so, please report an enhancement request to PyTorch.)
```
This can happen when the output of lfilter is used by other operations.

### How to reproduce
The following script can reproduce the error on the stable and nightly versions.

```python
import torch
import torch.nn.functional as F
from torchaudio.functional import lfilter

a = torch.rand(250, 26, requires_grad=True)
b = torch.ones(250, 26, requires_grad=True)
x = torch.rand(250, 1024, requires_grad=True)
w = torch.eye(1024).unsqueeze(1)

y = lfilter(x, a, b, False)
y = F.conv_transpose1d(
    y.t().unsqueeze(0),
    w,
    stride=256,
).squeeze()
print(y.shape)
target = torch.ones_like(y)
loss = torch.nn.functional.mse_loss(y, target)
loss.backward()
```

### Cause

The inner call to the differentiable IIR in the backward pass needs to ensure its input is contiguous. Adding a `contiguous()` call solves the problem.

Pull Request resolved: https://github.com/pytorch/audio/pull/3080

Reviewed By: xiaohui-zhang

Differential Revision: D43466612

Pulled By: mthrok

fbshipit-source-id: 375e0a147988656da47ac8397f7de6eae512a655

6ab1325a

17 Feb, 2023 3 commits

Make lengths optional for speed functions and modules (#3072) · 5af309d3

hwangjeff authored Feb 16, 2023

Summary:
Makes lengths input optional for `torchaudio.functional.speed`, `torchaudio.transforms.Speed`, and `torchaudio.transforms.SpeedPerturbation`.

Pull Request resolved: https://github.com/pytorch/audio/pull/3072

Reviewed By: nateanl, mthrok

Differential Revision: D43371406

Pulled By: hwangjeff

fbshipit-source-id: ecb38bcc2bfff5c5a396a37eff238b22238e795a

5af309d3

Add py3.11 to windows nightly conda (#3071) · e663095c

atalman authored Feb 16, 2023

Summary:
Same as: https://github.com/pytorch/vision/pull/7263

Pull Request resolved: https://github.com/pytorch/audio/pull/3071

Reviewed By: weiwangmeta

Differential Revision: D43377741

Pulled By: atalman

fbshipit-source-id: 0dbe0aaa10b9a4bad713563e98642b1a65e9ac07

e663095c

Add precondition check for contiguous emissions tensor (#3074) · 06b1cc9d

Daniel Walker authored Feb 16, 2023

Summary:
This PR adds a precondition check to the `CTCDecoder` that raises a helpful exception when called on a noncontiguous emissions tensor.

Currently, noncontiguous tensors can be passed into the CTCDecoder, which in turn passes the tensors to the backing Flashlight C++ library and results in undefined behavior, since Flashlight requires the tensors to be laid out in contiguous memory. The following code demonstrates the problem:

```python
import torch
from torchaudio.models.decoder import ctc_decoder

tokens = ['a', '-', '|']
decoder = ctc_decoder(lexicon=None, tokens=tokens)

emissions = torch.rand(len(tokens), 2)  # N x T contiguous
emissions = emissions.t()  # T x N noncontiguous

batch = emissions.unsqueeze(0)
result = decoder(batch)  # undefined behavior!!!
```

I stumbled on the issue accidentally when I noticed the decoder wasn't giving the expected results on my input only to realize, finally, that the tensor I had passed in was noncontiguous. In my case, Flashlight was iterating over unrelated segments of memory where it had expected to find a contiguous tensor. A precondition check will hopefully save others from making the same mistake.
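
To illustrate why a noncontiguous tensor is a problem for code that walks raw memory, here is a minimal pure-Python model of strided storage. It is a simplified sketch, not torchaudio or Flashlight code: the helpers `contiguous_strides`, `is_contiguous`, and `check_emissions` are hypothetical names, and the model ignores special cases such as size-1 dimensions. In PyTorch the corresponding calls are `Tensor.is_contiguous()` and `Tensor.contiguous()`.

```python
from typing import Sequence, Tuple


def contiguous_strides(shape: Sequence[int]) -> Tuple[int, ...]:
    """Row-major strides (in elements) for a densely packed array."""
    strides = []
    step = 1
    for size in reversed(shape):
        strides.append(step)
        step *= size
    return tuple(reversed(strides))


def is_contiguous(shape: Sequence[int], strides: Sequence[int]) -> bool:
    """True when the strides match a dense row-major layout."""
    return tuple(strides) == contiguous_strides(shape)


def check_emissions(shape: Sequence[int], strides: Sequence[int]) -> None:
    """Precondition check: refuse layouts that raw-pointer code would misread."""
    if not is_contiguous(shape, strides):
        raise RuntimeError("emissions must be contiguous")


# A 3 x 2 (N x T) buffer stored row by row.
storage = [0, 1, 2, 3, 4, 5]
shape = (3, 2)
strides = contiguous_strides(shape)

# Transposing swaps shape and strides but leaves the storage untouched.
t_shape, t_strides = shape[::-1], strides[::-1]

# Logical T x N contents, read through the strides.
logical = [
    storage[i * t_strides[0] + j * t_strides[1]]
    for i in range(t_shape[0])
    for j in range(t_shape[1])
]
# Reading `storage` front to back gives a different order than `logical`,
# which is the mismatch a consumer that ignores strides would run into.
```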

Pull Request resolved: https://github.com/pytorch/audio/pull/3074

Reviewed By: nateanl, xiaohui-zhang

Differential Revision: D43376011

Pulled By: mthrok

fbshipit-source-id: 7c95aa8016d8f9f2d65b5b816a859b28ea4629f5

06b1cc9d

16 Feb, 2023 5 commits

Add guards to prevent ffmpeg failures during dispatcher import (#3073) · 85f8fc54

hwangjeff authored Feb 16, 2023

Summary:
With the introduction of the backend dispatcher, importing torchaudio fails when ffmpeg is not available. This PR adds guards to resolve these failures.

Pull Request resolved: https://github.com/pytorch/audio/pull/3073

Reviewed By: NivekT, mthrok

Differential Revision: D43372870

Pulled By: hwangjeff

fbshipit-source-id: 7f6c2795430d7aeb742c2feb97984d5273f20aac

85f8fc54

Fix DDP training in HuBERT recipes (#3068) · 2c9b3e59

Zhaoheng Ni authored Feb 16, 2023

Summary:
`BucketizeBatchSampler` may return a different iter_list on each node if `shuffle` is `True`, which causes DDP training to hang forever.
In addition, `shuffle` in `DistributedSampler` only happens at initialization, so the same subset is assigned to each replica in every training epoch. This PR fixes both issues.
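
One common way to address both problems is to derive the shuffle from a seed shared by all ranks plus the epoch number, so every replica computes the same permutation and the permutation changes from epoch to epoch. The sketch below uses only the standard library; `batches_for_rank` and its parameters are hypothetical, and this is not the code from the PR.

```python
import random
from typing import List


def batches_for_rank(
    batches: List[List[int]],
    rank: int,
    world_size: int,
    epoch: int,
    seed: int = 0,
) -> List[List[int]]:
    """Return the batches assigned to `rank` for a given epoch."""
    order = list(range(len(batches)))
    # The same seed and epoch on every rank give the same permutation everywhere.
    random.Random(seed + epoch).shuffle(order)
    # Drop the tail so every rank receives the same number of batches;
    # unequal counts are one way for collective operations to hang.
    usable = len(order) - len(order) % world_size
    return [batches[i] for i in order[rank:usable:world_size]]


batches = [[i] for i in range(10)]
epoch0 = [batches_for_rank(batches, r, world_size=4, epoch=0) for r in range(4)]
epoch1 = [batches_for_rank(batches, r, world_size=4, epoch=1) for r in range(4)]
```

For the built-in `DistributedSampler`, PyTorch provides `set_epoch` for the same purpose: calling it at the start of each epoch changes the shuffle order consistently across replicas.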

cc arlofaria

Pull Request resolved: https://github.com/pytorch/audio/pull/3068

Reviewed By: mthrok

Differential Revision: D43372110

Pulled By: nateanl

fbshipit-source-id: a162728406ae995e05d2a07cfc2444fb76cf345e

2c9b3e59

Update WER results for CTC n-gram decoding (#3070) · 11bdafc3

Zhaoheng Ni authored Feb 16, 2023

Summary:
In https://github.com/pytorch/audio/issues/2873, layer normalization is applied to waveforms for SSL models trained on large scale datasets. The word error rate is significantly reduced after the change. The PR updates the results for the affected models.

Without the change in https://github.com/pytorch/audio/issues/2873, here is the WER result table:
|                                                                                            Model | dev-clean | dev-other | test-clean | test-other |
|:------------------------------------------------------------------------------------------------|-----------:|-----------:|-----------:|-----------:|
| [WAV2VEC2_ASR_LARGE_LV60K_10M](https://pytorch.org/audio/main/generated/torchaudio.pipelines.WAV2VEC2_ASR_LARGE_LV60K_10M.html#torchaudio.pipelines.WAV2VEC2_ASR_LARGE_LV60K_10M) |        10.59|        15.62|        9.58|        16.33|
| [WAV2VEC2_ASR_LARGE_LV60K_100H](https://pytorch.org/audio/main/generated/torchaudio.pipelines.WAV2VEC2_ASR_LARGE_LV60K_100H.html#torchaudio.pipelines.WAV2VEC2_ASR_LARGE_LV60K_100H) |        2.80|        6.01|        2.82|        6.34|
| [WAV2VEC2_ASR_LARGE_LV60K_960H](https://pytorch.org/audio/main/generated/torchaudio.pipelines.WAV2VEC2_ASR_LARGE_LV60K_960H.html#torchaudio.pipelines.WAV2VEC2_ASR_LARGE_LV60K_960H) |        2.36|        4.43|        2.41|        4.96|
| [HUBERT_ASR_LARGE](https://pytorch.org/audio/main/generated/torchaudio.pipelines.HUBERT_ASR_LARGE.html#torchaudio.pipelines.HUBERT_ASR_LARGE) |        1.85|        3.46|        2.09|        3.89|
| [HUBERT_ASR_XLARGE](https://pytorch.org/audio/main/generated/torchaudio.pipelines.HUBERT_ASR_XLARGE.html#torchaudio.pipelines.HUBERT_ASR_XLARGE) |         2.21|        3.40|        2.26|        4.05|

After applying layer normalization, here is the updated result:
|                                                                                            Model | dev-clean | dev-other | test-clean | test-other |
|:------------------------------------------------------------------------------------------------|-----------:|-----------:|-----------:|-----------:|
| [WAV2VEC2_ASR_LARGE_LV60K_10M](https://pytorch.org/audio/main/generated/torchaudio.pipelines.WAV2VEC2_ASR_LARGE_LV60K_10M.html#torchaudio.pipelines.WAV2VEC2_ASR_LARGE_LV60K_10M) |        6.77|        10.03|        6.87|        10.51|
| [WAV2VEC2_ASR_LARGE_LV60K_100H](https://pytorch.org/audio/main/generated/torchaudio.pipelines.WAV2VEC2_ASR_LARGE_LV60K_100H.html#torchaudio.pipelines.WAV2VEC2_ASR_LARGE_LV60K_100H) |        2.19|        4.55|        2.32|        4.64|
| [WAV2VEC2_ASR_LARGE_LV60K_960H](https://pytorch.org/audio/main/generated/torchaudio.pipelines.WAV2VEC2_ASR_LARGE_LV60K_960H.html#torchaudio.pipelines.WAV2VEC2_ASR_LARGE_LV60K_960H) |        1.78|        3.51|        2.03|        3.68|
| [HUBERT_ASR_LARGE](https://pytorch.org/audio/main/generated/torchaudio.pipelines.HUBERT_ASR_LARGE.html#torchaudio.pipelines.HUBERT_ASR_LARGE) |        1.77|        3.32|        2.03|        3.68|
| [HUBERT_ASR_XLARGE](https://pytorch.org/audio/main/generated/torchaudio.pipelines.HUBERT_ASR_XLARGE.html#torchaudio.pipelines.HUBERT_ASR_XLARGE) |         1.73|        2.72|        1.90|        3.16|
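
To quantify the change, the relative WER reduction per model can be computed from the two tables above. The snippet below does this for the test-other column only; the dictionary layout and the helper `relative_reduction` are just for illustration.

```python
# test-other WER (%) copied from the tables: (before, after) layer normalization.
test_other = {
    "WAV2VEC2_ASR_LARGE_LV60K_10M": (16.33, 10.51),
    "WAV2VEC2_ASR_LARGE_LV60K_100H": (6.34, 4.64),
    "WAV2VEC2_ASR_LARGE_LV60K_960H": (4.96, 3.68),
    "HUBERT_ASR_LARGE": (3.89, 3.68),
    "HUBERT_ASR_XLARGE": (4.05, 3.16),
}


def relative_reduction(before: float, after: float) -> float:
    """Relative WER reduction in percent."""
    return 100.0 * (before - after) / before


reductions = {name: relative_reduction(b, a) for name, (b, a) in test_other.items()}
for name, value in reductions.items():
    print(f"{name}: {value:.1f}% relative reduction")
```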

Pull Request resolved: https://github.com/pytorch/audio/pull/3070

Reviewed By: mthrok

Differential Revision: D43365313

Pulled By: nateanl

fbshipit-source-id: 34a60ad2e5eb1299da64ef88ff0208ec8ec76e91

11bdafc3

Add deprecation warning to decoder (#3055) · 6b2086cf

moto authored Feb 16, 2023

Summary:
Flashlight Text decoder is now available on PyPI and KenLM support is being added at
https://github.com/flashlight/text/pull/43

Once this work is merged, we can rely on the official distribution of the Flashlight Text package, so we are adding a deprecation warning.

Once the decoder is fully available, one can install it with

```
pip install flashlight-text
pip install git+https://github.com/kpu/kenlm.git
```

Pull Request resolved: https://github.com/pytorch/audio/pull/3055

Reviewed By: hwangjeff, nateanl

Differential Revision: D43239150

Pulled By: mthrok

fbshipit-source-id: 728cb208b8403100cd4ccd80c6295d454756b414

6b2086cf

Introduce I/O backend dispatcher (#3015) · b799fcd6

hwangjeff authored Feb 16, 2023

Summary:
Adds I/O backend dispatcher that routes I/O requests to FFmpeg, SoX, or Soundfile backend, per library availability. It allows users to specify a backend mapped to a media library, i.e. one of `["ffmpeg", "sox", "soundfile"]`, to use via keyword argument, with FFmpeg being the default. Environment variable `TORCHAUDIO_USE_BACKEND_DISPATCHER` gates enablement of the dispatcher; specifically, if `TORCHAUDIO_USE_BACKEND_DISPATCHER` is explicitly set to `1`, importing TorchAudio makes it accessible via `torchaudio.info`, `torchaudio.load`, and `torchaudio.save`.
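
The routing described above can be pictured roughly as follows. This is a stdlib-only sketch of the idea, not TorchAudio's implementation: `select_backend`, `dispatcher_enabled`, and the `available` argument are hypothetical, and the fallback order after FFmpeg is an assumption. Only the backend names, FFmpeg being the default, and the environment variable come from the description.

```python
import os
from typing import Iterable, Mapping, Optional

# Assumed preference order: FFmpeg is the default, then SoX, then Soundfile.
BACKEND_ORDER = ("ffmpeg", "sox", "soundfile")


def dispatcher_enabled(environ: Mapping[str, str] = os.environ) -> bool:
    """The dispatcher is opt-in: the variable must be explicitly set to "1"."""
    return environ.get("TORCHAUDIO_USE_BACKEND_DISPATCHER") == "1"


def select_backend(available: Iterable[str], backend: Optional[str] = None) -> str:
    """Pick the requested backend, or the first available one in preference order."""
    available = set(available)
    if backend is not None:
        if backend not in BACKEND_ORDER:
            raise ValueError(f"Unknown backend: {backend!r}")
        if backend not in available:
            raise RuntimeError(f"Backend {backend!r} is not available")
        return backend
    for name in BACKEND_ORDER:
        if name in available:
            return name
    raise RuntimeError("No I/O backend is available")
```

With this shape, a caller that passes no `backend` gets FFmpeg whenever it is installed, while an explicit keyword argument either selects that backend or fails loudly.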

Pull Request resolved: https://github.com/pytorch/audio/pull/3015

Reviewed By: mthrok

Differential Revision: D43258649

Pulled By: hwangjeff

fbshipit-source-id: 8f12e4e56b9fa3f0814dd3fed3e1783ab23a53a1

b799fcd6

15 Feb, 2023 5 commits

Implement exp sigmoid (#3056) · 9db4bdf1

Cole Li authored Feb 15, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3056

Task #2 from https://github.com/pytorch/audio/issues/2835

Reviewed By: mthrok

Differential Revision: D42854156

fbshipit-source-id: e1b3bd992c91fedc55f30a814e16efd7c51e0c80

9db4bdf1

Enable broadcasting for inputs to convolve (#3061) · a49edea5

hwangjeff authored Feb 15, 2023

Summary:
Relaxes input dimension matching constraint on `convolve` to enable broadcasting for inputs.

Pull Request resolved: https://github.com/pytorch/audio/pull/3061

Reviewed By: mthrok

Differential Revision: D43298078

Pulled By: hwangjeff

fbshipit-source-id: a6cc36674754523b88390fac0a05f06562921319

a49edea5

Add FFmpeg compat save function (#3058) · fb932674

Jeff Hwang authored Feb 15, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3058

Adds FFmpeg-based save function.

Reviewed By: mthrok

Differential Revision: D43264858

fbshipit-source-id: ae3f89012bc2520f3de11af65348ba8f77f0acff

fb932674

Update data augmentation tutorial to use new operators (#3062) · b9ef69d1

hwangjeff authored Feb 15, 2023

Summary:
Updates tutorial "Audio Data Augmentation" to use two of the newly introduced data augmentation operators in beta: `torchaudio.functional.fftconvolve` and `torchaudio.functional.add_noise`.

Pull Request resolved: https://github.com/pytorch/audio/pull/3062

Reviewed By: mthrok

Differential Revision: D43298120

Pulled By: hwangjeff

fbshipit-source-id: 09ca736a5c67242568515d600b7d31eab32c2df1

b9ef69d1

Tweak docs around IO (#3064) · 12e8cb97

moto authored Feb 15, 2023

Summary:
* Mention context manager in StreamWriter
* Add FFmpeg as optional dependency

Pull Request resolved: https://github.com/pytorch/audio/pull/3064

Reviewed By: hwangjeff

Differential Revision: D43307818

Pulled By: mthrok

fbshipit-source-id: 86339d973aba85e090f520e08af65b5d736e3d18

12e8cb97

14 Feb, 2023 4 commits

Adding RC triggers for all build jobs (#3057) · b0af1406

Omkar Salpekar authored Feb 14, 2023

Summary:
Add triggers for RC branches and tags to all build workflows. This will ensure that the release-candidate builds will run with `CHANNEL=test`.

Pull Request resolved: https://github.com/pytorch/audio/pull/3057

Reviewed By: atalman

Differential Revision: D43279657

Pulled By: osalpekar

fbshipit-source-id: 5abf3994b9b4a4897f53c540bd1db6c3d624b3e0

b0af1406

Update ssl example (#3060) · ff01be0f

Zhaoheng Ni authored Feb 14, 2023

Summary:
- Rename the current `ssl` example to `self_supervised_learning`
- Add README to demonstrate how to run the recipe with hubert task

Pull Request resolved: https://github.com/pytorch/audio/pull/3060

Reviewed By: mthrok

Differential Revision: D43287868

Pulled By: nateanl

fbshipit-source-id: 10352682485ef147ca32f4c4c9f9cde995444aa0

ff01be0f

Redirect build instruction to official doc (#3053) · 73b29fc9

moto authored Feb 14, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3053

Reviewed By: nateanl

Differential Revision: D43238766

Pulled By: mthrok

fbshipit-source-id: 4f82878b1c97b0e6a35af75855849b86200e6061

73b29fc9

Add simulate_rir_ism method for room impulse response simulation (#2880) · 8c5c9a9b

Zhaoheng Ni authored Feb 14, 2023

Summary:
replicate of https://github.com/pytorch/audio/issues/2644

Pull Request resolved: https://github.com/pytorch/audio/pull/2880

Reviewed By: mthrok

Differential Revision: D41633911

Pulled By: nateanl

fbshipit-source-id: 73cf145d75c389e996aafe96571ab86dc21f86e5

8c5c9a9b

11 Feb, 2023 1 commit

Update hardware accelerated video processing tutorial (#3050) · 3f02b898

moto authored Feb 10, 2023

Summary:
Per https://github.com/pytorch/audio/issues/3040 and https://github.com/pytorch/audio/issues/3041, it turned out Google Colab now has FFmpeg with GPU decoder/encoder preinstalled, and installing FFmpeg manually corrupts the environment.

This commit updates the tutorial by extracting and moving the how-to-install part to installation/build section.

closes https://github.com/pytorch/audio/issues/3041
closes https://github.com/pytorch/audio/issues/3040

Pull Request resolved: https://github.com/pytorch/audio/pull/3050

Reviewed By: nateanl

Differential Revision: D43166054

Pulled By: mthrok

fbshipit-source-id: 32667f292a796344d5fcde86e8231e15ad904e58

3f02b898

10 Feb, 2023 1 commit

Add python 3.11 support for torchaudio and add workflow concurrency rule (#3039) · fadb5ae5

Wei Wang authored Feb 09, 2023

Summary:
So far, Linux and macOS have been tested and work out of the box. This PR was created to verify that; Windows jobs and configs are disabled for now.

Pull Request resolved: https://github.com/pytorch/audio/pull/3039

Reviewed By: osalpekar

Differential Revision: D43174745

Pulled By: weiwangmeta

fbshipit-source-id: 81766905256e03c5a01cb5448a350f5d409ca4b8

fadb5ae5

09 Feb, 2023 3 commits

Follow-up on audio playback function (#3051) · 91b05e2e

moto authored Feb 09, 2023

Summary:
- Add documentation
- Tweak docstring
- Fix import

Pull Request resolved: https://github.com/pytorch/audio/pull/3051

Reviewed By: weiwangmeta, atalman, nateanl

Differential Revision: D43166081

Pulled By: mthrok

fbshipit-source-id: 7d77aa34a6318a64824626cff8372f8b9aebf6f9

91b05e2e

Follow-up fix policy set (#3046) · 70acff7a

moto authored Feb 09, 2023

Summary:
Commit b4c66d1f broke all the CIs.
The new policy changes the timestamp of configuration files of third party libraries,
which triggers re-configuration which requires extra tools.

This commit fixes it by reverting to the old behavior.
It also adds a guard for older CMake versions.

Pull Request resolved: https://github.com/pytorch/audio/pull/3046

Reviewed By: atalman

Differential Revision: D43133536

Pulled By: mthrok

fbshipit-source-id: 357055c8c1b53e593b8b7880f2045e13512c7a8f

70acff7a

Updated USE_ROCM detection (#3008) · 05d597fa

DanilBaibak authored Feb 09, 2023

Summary:
We don't need the presence of physical HW to compile with CUDA.

This is a follow up PR regarding `USE_ROCM` for issue https://github.com/pytorch/audio/issues/2979.

Pull Request resolved: https://github.com/pytorch/audio/pull/3008

Reviewed By: malfet

Differential Revision: D42708862

Pulled By: DanilBaibak

fbshipit-source-id: 90cedc80a2d180ca1e0912ad5b644398182417b8

05d597fa

08 Feb, 2023 4 commits

Update the guard mechanism for FFmpeg-related features (#3028) · 98b3ac17

moto authored Feb 08, 2023

Summary:
Instead of raising an error when the lazy import happens, this approach allows features to be imported and raises an error only when a feature is actually used.

This makes it easy to adopt the same error mechanism across different modules. It is already how sox-related features are handled.
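
A minimal sketch of this kind of guard, assuming a decorator-based design: the decorated function can always be imported, and the error is raised only when it is called without the dependency present. `requires_module` is a hypothetical name, not necessarily what TorchAudio uses internally, and the module names below are placeholders.

```python
import functools
import importlib.util


def requires_module(name: str):
    """Defer the missing-dependency error from import time to call time."""
    # For a top-level module name, find_spec returns None when it is absent.
    available = importlib.util.find_spec(name) is not None

    def decorator(func):
        if available:
            return func

        @functools.wraps(func)
        def unavailable(*args, **kwargs):
            raise RuntimeError(f"{func.__name__} requires module: {name}")

        return unavailable

    return decorator


@requires_module("json")  # present in the standard library
def dump(obj):
    import json

    return json.dumps(obj)


@requires_module("surely_not_installed_pkg")  # placeholder for a missing dependency
def fancy(obj):
    return obj
```

Defining `fancy` succeeds even though its dependency is missing; only calling it raises `RuntimeError`.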

Pull Request resolved: https://github.com/pytorch/audio/pull/3028

Reviewed By: xiaohui-zhang

Differential Revision: D42966976

Pulled By: mthrok

fbshipit-source-id: 423dfe0b8a3970cd07f20e841c794c7f2809f993

98b3ac17

Build doc on GHA (#3043) · a0f8af4b

moto authored Feb 08, 2023

Summary:
The first step to migrate doc build to GHA.

Pull Request resolved: https://github.com/pytorch/audio/pull/3043

Reviewed By: xiaohui-zhang

Differential Revision: D43110816

Pulled By: mthrok

fbshipit-source-id: 91de5f3ac567188e7030f14c2827a202a1901f1a

a0f8af4b

Suppress warning about archive timestamp (#3044) · b4c66d1f

moto authored Feb 08, 2023

Summary:
Currently, for each third party library checked out with ExternalProject_Add, the following warning is shown.

This commit sets the policy so that the warning is not shown.

```
CMake Warning (dev) at ci_env/lib/python3.10/site-packages/cmake/data/share/cmake-3.25/Modules/ExternalProject.cmake:3075 (message):
  The DOWNLOAD_EXTRACT_TIMESTAMP option was not given and policy CMP0135 is
  not set.  The policy's OLD behavior will be used.  When using a URL
  download, the timestamps of extracted files should preferably be that of
  the time of extraction, otherwise code that depends on the extracted
  contents might not be rebuilt if the URL changes.  The OLD behavior
  preserves the timestamps from the archive instead, but this is usually not
  what you want.  Update your project to the NEW behavior or specify the
  DOWNLOAD_EXTRACT_TIMESTAMP option with a value of true to avoid this
  robustness issue.
```

Pull Request resolved: https://github.com/pytorch/audio/pull/3044

Reviewed By: xiaohui-zhang

Differential Revision: D43110818

Pulled By: mthrok

fbshipit-source-id: d2e20c9fdbbeeedb5ad546fe32dbda28c5bdd431

b4c66d1f

Switch to Nova MacOS Conda (#2908) · de54d864

DanilBaibak authored Feb 08, 2023

Summary:
Switch to Nova M1 Conda

Pull Request resolved: https://github.com/pytorch/audio/pull/2908

Reviewed By: seemethere, osalpekar

Differential Revision: D43093605

Pulled By: DanilBaibak

fbshipit-source-id: 9e44f26cfb87e277c3808ee59f50218b4629e86e

de54d864

07 Feb, 2023 2 commits

Add installation / build instruction to doc (#3038) · 3c121a59

moto authored Feb 07, 2023

Summary:
Add a section about installation/build

https://output.circle-artifacts.com/output/job/f121cd38-68f3-47a3-ac29-c7b0cfe94c77/artifacts/0/docs/installation.html
<img width="1102" alt="Screenshot 2023-02-06 at 6 13 50 PM" src="https://user-images.githubusercontent.com/855818/217108551-622b117b-209e-4776-b5d6-d6934c8126a4.png">

https://output.circle-artifacts.com/output/job/f121cd38-68f3-47a3-ac29-c7b0cfe94c77/artifacts/0/docs/build.html
<img width="1072" alt="Screenshot 2023-02-06 at 6 13 57 PM" src="https://user-images.githubusercontent.com/855818/217108568-c125cdc2-9d6a-4c1d-a155-2cee40c9dac6.png">

Pull Request resolved: https://github.com/pytorch/audio/pull/3038

Reviewed By: hwangjeff, nateanl

Differential Revision: D43083469

Pulled By: mthrok

fbshipit-source-id: e0b5b76dbf706552dd60ae26ea40ebc98627e3b0

3c121a59

Add playback function (#3026) · 2ead941e

juan.azcarreta.ortiz authored Feb 07, 2023

Summary:
Allows users to play audio through the device speaker.

Pull Request resolved: https://github.com/pytorch/audio/pull/3026

Test Plan:
Created a new test that mocks a call to the write audio chunk method from StreamWriter. To run the test:

`pytest test/torchaudio_unittest/io/_playback_test.py`

Reviewed By: mthrok

Differential Revision: D43082062

Pulled By: jazcarretao

fbshipit-source-id: 01a85b32ce925687a633d1208d15d54556e89dd8

2ead941e

06 Feb, 2023 1 commit

Switch circleci jobs from cu116 to cu117 (#3034) · 9368f33b

atalman authored Feb 06, 2023

Summary:
Switch circleci jobs from cu116 to cu117

Pull Request resolved: https://github.com/pytorch/audio/pull/3034

Reviewed By: DanilBaibak

Differential Revision: D43042385

Pulled By: atalman

fbshipit-source-id: 636e3d86d66a6091d13d731238550d800e77ccc8

9368f33b

04 Feb, 2023 1 commit

Add rgb48le and CUDA p010 support (HDR/10bit) to StreamReader (#3023) · b7e173fa

Tristan Rice authored Feb 04, 2023

Summary:
This adds two 10-bit pixel formats, one for CPU and one for CUDA. This allows training on HDR/10-bit video datasets.

Pull Request resolved: https://github.com/pytorch/audio/pull/3023

Test Plan:
```py
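from torchaudio.io import StreamReader

# `reader` is not defined in the original test plan; it is assumed to be
# a path or file-like object pointing at an HEVC source.
r = StreamReader(
    reader, format='hevc',
)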
stream = r.add_video_stream(
    frames_per_chunk=-1,
    decoder="hevc_cuvid",
    hw_accel="cuda",
)
frame = next(r.stream())
```

```py
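from torchaudio.io import StreamReader

# `reader` is not defined in the original test plan; it is assumed to be
# a path or file-like object pointing at an HEVC source.
r = StreamReader(
    reader, format='hevc',
)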
stream = r.add_video_stream(
    frames_per_chunk=-1,
    filter_desc="format=rgb48le",
)
frame = next(r.stream())
```

![audio-example](https://user-images.githubusercontent.com/909104/215696543-ed3dc5a3-3013-4a57-8b98-05aa4a5a9a7c.png)

Reviewed By: xiaohui-zhang

Differential Revision: D43019191

Pulled By: mthrok

fbshipit-source-id: fe4359e525b24c8b856dfdf3d2f8596871566350

b7e173fa