Commits · 60af60a8aefa4d4830231a3c2f9a57384493b953 · OpenDAS / Torchaudio

01 Feb, 2023 1 commit

Drop python 3.7 support (#3020) · 60af60a8

Wei Wang authored Jan 31, 2023

Summary:
PyTorch core has dropped Python 3.7 support (https://github.com/pytorch/pytorch/pull/93155).

Pull Request resolved: https://github.com/pytorch/audio/pull/3020

Reviewed By: mthrok

Differential Revision: D42902346

Pulled By: weiwangmeta

fbshipit-source-id: 07ab1aff0e128c5960d87e5fa29e341310dea388


31 Jan, 2023 1 commit

Remove unnecessary AVFrame allocation (#3021) · 0709cadc

Moto Hira authored Jan 31, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3021

When the input format and the encode format differ in StreamWriter, a filter for format conversion is inserted.

A temporary AVFrame (`dst_frame`) is used for this case,
but FilterGraph handles the memory allocation,
so there is no need to perform the allocation ourselves.

This `dst_frame` is otherwise not used, so we do not have to allocate memory for it at all.
This commit removes the unnecessary memory allocation.

Reviewed By: xiaohui-zhang

Differential Revision: D42865042

fbshipit-source-id: 2673b06de1e905dc73a11e2ec1cc6ce7b525d451


30 Jan, 2023 2 commits

Fix hybrid demucs tutorial for CUDA (#3017) · da9d1627

Yan Li authored Jan 30, 2023

Summary:
Currently, this tutorial raises a few errors when it is run with a CUDA device.

The reasons are:
- The source audio waveform is not properly moved to the GPU. The `to()` method is not in-place for Tensors, so we need to assign the return value of the method call to the variable (otherwise the Tensor would still be on the CPU).
- When performing further analysis and displaying the output audio, we need to move the Tensors back from the GPU to the CPU. This is because some of the functions we call require the Tensor to be on the CPU (e.g. `stft()` and `bss_eval_sources()`).
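
The two fixes above can be sketched as follows. This is a minimal illustration, not the tutorial's code: the tensor shape and the variable names are assumptions, and the device falls back to the CPU when CUDA is unavailable.

```python
import torch

# Pick the GPU when one is available; otherwise the sketch still runs on CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Placeholder for the tutorial's source audio (shape chosen arbitrarily).
waveform = torch.zeros(2, 16000)

# `Tensor.to()` does not move `waveform` in place; it returns a tensor on the
# target device, so the result has to be assigned back to the variable.
waveform = waveform.to(device)

# Move the result back to the CPU before handing it to CPU-only code.
waveform_cpu = waveform.cpu()
```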

Pull Request resolved: https://github.com/pytorch/audio/pull/3017

Reviewed By: mthrok

Differential Revision: D42828526

Pulled By: nateanl

fbshipit-source-id: c28bc855e79e3363a011f4a35a69aae1764e7762


Add get_build_config ffmpeg utility function (#3014) · 635d8cff

moto authored Jan 29, 2023

Summary:
We often need to look at which FFmpeg was found and linked when debugging an issue.

The version number is often not enough, and there is no easy way to find out where the library was found either.

This commit adds a utility function that prints the build-time configuration.

It helps to distinguish whether the linked FFmpeg is the one from the binary distribution built in CI or a locally built one.

Pull Request resolved: https://github.com/pytorch/audio/pull/3014

Reviewed By: hwangjeff

Differential Revision: D42794952

Pulled By: mthrok

fbshipit-source-id: 91ed358fde8cfe9d6d950f34742b1722e729cf4e


27 Jan, 2023 3 commits

Replace torchaudio::ffmpeg with torchaudio::io (#3013) · 51aae466

Moto Hira authored Jan 27, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3013

Namespace clean up before publishing the torchaudio C++ API as prototype.

Reviewed By: hwangjeff

Differential Revision: D42699903

fbshipit-source-id: 8a9eed0390dfa4a152124b42f2b927dbdd3e23d2


Switch to Nova Linux Conda build (#2899) · 12f960b2

DanilBaibak authored Jan 27, 2023

Summary:
Switch to Nova Linux Conda build.

Pull Request resolved: https://github.com/pytorch/audio/pull/2899

Reviewed By: seemethere, osalpekar, mthrok

Differential Revision: D42416835

Pulled By: DanilBaibak

fbshipit-source-id: 70886c4ff6f3243b80059be9385269cc0f2d4764


Move data augmentation transforms out of prototype (#3009) · b4cc0f33

hwangjeff authored Jan 26, 2023

Summary:
Moves `AddNoise`, `Convolve`, `FFTConvolve`, `Speed`, `SpeedPerturbation`, `Deemphasis`, and `Preemphasis` out of `torchaudio.prototype.transforms` and into `torchaudio.transforms`.

Pull Request resolved: https://github.com/pytorch/audio/pull/3009

Reviewed By: xiaohui-zhang, mthrok

Differential Revision: D42730322

Pulled By: hwangjeff

fbshipit-source-id: 43739ac31437150d3127e51eddc0f0bba5facb15


26 Jan, 2023 3 commits

Abstract away AVFormatContext from StreamReader/Writer constructor (#3007) · 7ea69e61

Moto Hira authored Jan 26, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3007

Simplify the construction of StreamReader/Writer in C++.

Currently these classes require client code to build AVFormatContext
manually. This is tedious and not user-friendly.

Some client code actually uses the same helper function that
TorchAudio codebase uses.

This commit moves the helper logic inside of the constructor of
StreamReader/Writer, so that the signatures of these constructors
are easy to use and similar to the Python interface.

Reviewed By: xiaohui-zhang

Differential Revision: D42662520

fbshipit-source-id: d95e5236810c48d7d9bd2d89c05d4f60a44b3ba1


Remove function input parameters from data aug functional tests (#3011) · 2f5fcf4f

hwangjeff authored Jan 25, 2023

Summary:
Passing functions as test parameters causes issues on some platforms. This PR updates the functional tests to pass functions by name instead.
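
A minimal sketch of the "pass by name" pattern, under stated assumptions: the standard `math` module stands in for the module under test, and `PARAMS` and `run_case` are hypothetical names, not taken from the torchaudio test suite.

```python
import math

# Test parameters hold function *names* (plain strings), not function objects.
PARAMS = [("sqrt", 9.0, 3.0), ("floor", 2.7, 2), ("fabs", -1.5, 1.5)]


def run_case(module, fn_name, arg):
    # Resolve the function by name only inside the test body.
    fn = getattr(module, fn_name)
    return fn(arg)


results = [run_case(math, name, arg) for name, arg, _ in PARAMS]
expected = [want for _, _, want in PARAMS]
```

Because the parameters are plain strings, they can be printed, hashed, and serialized regardless of platform.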

Pull Request resolved: https://github.com/pytorch/audio/pull/3011

Reviewed By: mthrok

Differential Revision: D42748106

Pulled By: hwangjeff

fbshipit-source-id: 4d81dabe4aff2293bc344a457a034a2d9af024e2


Deprecate sox initialization/shutdown public API functions (#3010) · aa760caf

moto authored Jan 25, 2023

Summary:
These functions are called as part of sox initialization, so they are no longer needed as public API.

Pull Request resolved: https://github.com/pytorch/audio/pull/3010

Reviewed By: hwangjeff

Differential Revision: D42744478

Pulled By: mthrok

fbshipit-source-id: 17d715b328392397ec47d81a533a307aac22862d


24 Jan, 2023 1 commit

Move data augmentation functions out of prototype (#3001) · 41b88314

hwangjeff authored Jan 23, 2023

Summary:
Moves `add_noise`, `fftconvolve`, `convolve`, `speed`, `preemphasis`, and `deemphasis` out of `torchaudio.prototype.functional` and into `torchaudio.functional`.
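
For readers unfamiliar with two of these operations, here is a pure-Python sketch of what pre-emphasis and de-emphasis conventionally compute. It is not torchaudio's implementation: it works on plain lists rather than Tensors, and the default coefficient of 0.97 is an assumption of this sketch.

```python
def preemphasis(x, coeff=0.97):
    # y[0] = x[0]; y[n] = x[n] - coeff * x[n - 1]
    if not x:
        return []
    return [x[0]] + [x[n] - coeff * x[n - 1] for n in range(1, len(x))]


def deemphasis(y, coeff=0.97):
    # Inverse recursion: z[n] = y[n] + coeff * z[n - 1], with z[-1] taken as 0.
    z = []
    prev = 0.0
    for sample in y:
        prev = sample + coeff * prev
        z.append(prev)
    return z


signal = [1.0, 0.5, -0.25, 0.0]
restored = deemphasis(preemphasis(signal))
```

Applying `deemphasis` to the output of `preemphasis` with the same coefficient recovers the input up to floating-point error.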

Pull Request resolved: https://github.com/pytorch/audio/pull/3001

Reviewed By: mthrok

Differential Revision: D42688971

Pulled By: hwangjeff

fbshipit-source-id: 43280bd3ffeccddae57f1092ac45afb64dd426cc


23 Jan, 2023 3 commits

Tweak `USE_CUDA` detection (#3005) · 09e7d818

Nikita Shulga authored Jan 23, 2023

Summary:
We don't need the presence of physical HW to compile with CUDA.

Likely one of the causes of https://github.com/pytorch/audio/issues/2979 (i.e. in CircleCI builds `USE_CUDA` was defined by the CI environment, so nobody ever checked the default, but this is not the case in Nova builds).

Pull Request resolved: https://github.com/pytorch/audio/pull/3005

Test Plan:
Check that `compute.cu` is mentioned in builds, for example see https://github.com/pytorch/audio/actions/runs/3990295262/jobs/6843771056#step:9:829
```
[193/202] /usr/local/cuda-11.6/bin/nvcc -forward-unknown-to-host-compiler -DINCLUDE_KALDI -DUSE_C10D_GLOO -DUSE_C10D_NCCL -DUSE_CUDA -DUSE_DISTRIBUTED -DUSE_RPC -DUSE_TENSORPIPE -Dlibtorchaudio_EXPORTS -I/__w/audio/audio/pytorch/audio -I/__w/audio/audio/pytorch/audio/third_party/kaldi/src -I/__w/audio/audio/pytorch/audio/third_party/kaldi/submodule/src -isystem=/__w/_temp/conda_environment_3990295262/lib/python3.7/site-packages/torch/include -isystem=/__w/_temp/conda_environment_3990295262/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem=/usr/local/cuda-11.6/include -DONNX_NAMESPACE=onnx_c2 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_50,code=compute_50 -Xcudafe --diag_suppress=cc_clobber_ignored,--diag_suppress=integer_sign_change,--diag_suppress=useless_using_declaration,--diag_suppress=set_but_not_used,--diag_suppress=field_without_dll_interface,--diag_suppress=base_class_has_different_dll_interface,--diag_suppress=dll_interface_conflict_none_assumed,--diag_suppress=dll_interface_conflict_dllexport_assumed,--diag_suppress=implicit_return_from_non_void_function,--diag_suppress=unsigned_compare_with_zero,--diag_suppress=declared_but_not_referenced,--diag_suppress=bad_friend_decl --expt-relaxed-constexpr --expt-extended-lambda -O3 -DNDEBUG -Xcompiler=-fPIC -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++17 -MD -MT torchaudio/csrc/CMakeFiles/libtorchaudio.dir/rnnt/gpu/compute.cu.o -MF torchaudio/csrc/CMakeFiles/libtorchaudio.dir/rnnt/gpu/compute.cu.o.d -x cu -c /__w/audio/audio/pytorch/audio/torchaudio/csrc/rnnt/gpu/compute.cu -o torchaudio/csrc/CMakeFiles/libtorchaudio.dir/rnnt/gpu/compute.cu.o
```

Reviewed By: mthrok

Differential Revision: D42687455

Pulled By: malfet

fbshipit-source-id: c37ad58cc62439d1268865e9bf0bcb97079a529f


Merge pop_chunks methods (#3002) · 54196fd3

Moto Hira authored Jan 23, 2023

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/3002

This commit merges `pop_chunks` and `pop_chunks_with_metadata`.

In #2975 (D42526945 (https://github.com/pytorch/audio/commit/0dd59e0dda22eabf54fc95ad8050094df239bd39)), we updated StreamReader so that it returns PTS.
In that PR, we introduced the `pop_chunks_with_metadata` method, so that
the original `pop_chunks` method kept returning the same type and we could
focus on the PTS logic in the code review.

Now that the commit has landed, we merge the two methods, so that the original
`pop_chunks` returns Tensor frames and metadata (PTS).

Reviewed By: xiaohui-zhang

Differential Revision: D42662321

fbshipit-source-id: 37ae088bc63fc516ea068698088925e8b31bc0a1


Update highlighting in doc (#3000) · 1f9b9104

moto authored Jan 23, 2023

Summary:
This change fixes the issue where syntax highlighting is broken up per word.

## Plain
Before
<img width="243" alt="Screenshot 2023-01-20 at 1 28 48 PM" src="https://user-images.githubusercontent.com/855818/213778202-27ec8030-3f2f-4ef9-8210-bce7cfc3cb38.png">
After
<img width="244" alt="Screenshot 2023-01-20 at 1 29 01 PM" src="https://user-images.githubusercontent.com/855818/213778231-61c52825-d63a-4913-b10d-a65f3b2cfbbb.png">

## In articles
Before
<img width="786" alt="Screenshot 2023-01-20 at 1 34 12 PM" src="https://user-images.githubusercontent.com/855818/213779050-c21ba5e2-84b3-4935-bbab-6edcb7bc89ce.png">
After
<img width="783" alt="Screenshot 2023-01-20 at 1 34 17 PM" src="https://user-images.githubusercontent.com/855818/213779069-f1406422-27a4-41cf-8ccd-5058f80860bd.png">

## In tables
Before
<img width="813" alt="Screenshot 2023-01-20 at 1 27 35 PM" src="https://user-images.githubusercontent.com/855818/213778039-fede6f18-5a35-47f2-9e0b-a9be5716dc73.png">
After
<img width="813" alt="Screenshot 2023-01-20 at 1 27 51 PM" src="https://user-images.githubusercontent.com/855818/213778073-e26275a9-d380-4601-aa92-84af7aeab00f.png">

Pull Request resolved: https://github.com/pytorch/audio/pull/3000

Reviewed By: xiaohui-zhang

Differential Revision: D42642522

Pulled By: mthrok

fbshipit-source-id: 6831bb90da005aff8d7f178ef768e967bc6d2640


22 Jan, 2023 1 commit

Make StreamReader return PTS (#2975) · 0dd59e0d

moto authored Jan 22, 2023

Summary:
This commit makes `StreamReader` report PTS (presentation time stamp) of the returned chunk as well.

Example

```python
from torchaudio.io import StreamReader

s = StreamReader(...)
s.add_video_stream(...)
for (video_chunk, ) in s.stream():
    # video_chunk is Torch tensor type but has extra attribute of PTS
    print(video_chunk.pts)  # reports the PTS of the first frame of the video chunk.
```

For backward compatibility, we introduce `_ChunkTensor`, which is a composition
of Tensor and metadata, but works like a normal tensor in PyTorch operations.

The implementation of `_ChunkTensor` is based on [TrivialTensorViaComposition](https://github.com/albanD/subclass_zoo/blob/0eeb1d68fb59879029c610bc407f2997ae43ba0a/trivial_tensors.py#L83).

It was also suggested to attach metadata directly to the Tensor object,
but the possibility of a collision between torchaudio's metadata and new attributes introduced in
PyTorch cannot be ignored, so we use a Tensor subclass implementation.

If any unexpected issue arises from metadata attribute name collision, client code can
fetch the bare Tensor and continue.
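
The composition idea can be illustrated without PyTorch. The sketch below is not `_ChunkTensor`: it wraps a plain list instead of a Tensor, and the names `ChunkWithPTS` and `_elem` are hypothetical. It only shows how a wrapper can carry metadata while delegating everything else to the wrapped object.

```python
class ChunkWithPTS:
    """Holds a payload plus a PTS; unknown attributes fall through to the payload."""

    def __init__(self, frames, pts):
        self._elem = frames
        self.pts = pts

    def __getattr__(self, name):
        # Only called when normal lookup fails, so `pts` and `_elem` are served
        # by the wrapper and everything else by the wrapped payload.
        if name == "_elem":
            # Avoid infinite recursion if the payload was never set.
            raise AttributeError(name)
        return getattr(self._elem, name)

    def __len__(self):
        return len(self._elem)

    def __getitem__(self, index):
        return self._elem[index]


chunk = ChunkWithPTS([0.1, 0.2, 0.3], pts=1.5)
bare = chunk._elem  # fall back to the bare payload if a name ever collides
```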

Pull Request resolved: https://github.com/pytorch/audio/pull/2975

Reviewed By: hwangjeff

Differential Revision: D42526945

Pulled By: mthrok

fbshipit-source-id: b4e9422e914ff328421b975120460f3001268f35


20 Jan, 2023 4 commits

Document StreamReader/Writer C++ code (#2997) · de628226

moto authored Jan 20, 2023

Summary:
Extraction from https://github.com/pytorch/audio/issues/2994

Add docstrings to C++ StreamReader/Writer.

Pull Request resolved: https://github.com/pytorch/audio/pull/2997

Reviewed By: nateanl

Differential Revision: D42628016

Pulled By: mthrok

fbshipit-source-id: b22c43b80997af4a9087142340c67bed28e54917


Fix error message (#2999) · bcfa9eed

moto authored Jan 20, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2999

Reviewed By: hwangjeff

Differential Revision: D42637618

Pulled By: mthrok

fbshipit-source-id: 35a7976c316e3b3899ae9c2202f132f1a960b736


Move drain method to private (#2996) · de9473a4

moto authored Jan 19, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2996

Reviewed By: nateanl

Differential Revision: D42624655

Pulled By: mthrok

fbshipit-source-id: 8273cbfa529fbc2bd28adc9c63ceb9453838baa4


Remove unused/redundant things (#2995) · 6a5efe6c

moto authored Jan 19, 2023

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2995

Reviewed By: nateanl

Differential Revision: D42624676

Pulled By: mthrok

fbshipit-source-id: 10fbdaada06ae78e5fa2253eb3331c93c032eeb3


19 Jan, 2023 3 commits

Add modularized SSL training recipe (#2876) · 2eaefe27

Zhaoheng Ni authored Jan 19, 2023

Summary:
TorchAudio currently has one training recipe, for HuBERT + LibriSpeech pre-training. It may not be suitable when users want to use a customized dataset or a new training objective (such as the contrastive loss in Wav2Vec2). This PR addresses the issue by providing a modularized training recipe for audio self-supervised learning. Users can inject a customized model module, loss function, optimizer, lr scheduler, and datamodule for training an SSL model.

Pull Request resolved: https://github.com/pytorch/audio/pull/2876

Reviewed By: hwangjeff

Differential Revision: D42617414

Pulled By: nateanl

fbshipit-source-id: 6413df45a9d106ed1d5ff830bf628c54368c5792


Simplify train step in Conformer RNN-T LibriSpeech recipe (#2981) · c6a52355

hwangjeff authored Jan 19, 2023

Summary:
In the Conformer RNN-T LibriSpeech recipe, there's no need to perform manual optimization. This PR modifies the recipe to use automatic optimization instead.

Pull Request resolved: https://github.com/pytorch/audio/pull/2981

Reviewed By: mthrok

Differential Revision: D42507228

Pulled By: hwangjeff

fbshipit-source-id: 9712add951eba356e39f7e8c8dc3bf584ba48309


Make lengths optional for additive noise operators (#2977) · bb077284

hwangjeff authored Jan 19, 2023

Summary:
For greater flexibility, this PR makes argument `lengths` optional for `add_noise` and `AddNoise`.
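
A pure-Python sketch of additive noise at a target SNR with an optional length. This is not torchaudio's implementation: the signature, the single-sequence (unbatched) input, and the choice to pass samples beyond `length` through unchanged are all assumptions of the sketch. It also assumes the noise has nonzero energy.

```python
import math


def add_noise(waveform, noise, snr_db, length=None):
    # With no length given, treat every sample as valid.
    n = len(waveform) if length is None else length
    energy_signal = sum(s * s for s in waveform[:n])
    energy_noise = sum(v * v for v in noise[:n])
    # Choose `scale` so that
    # 10 * log10(energy_signal / (scale**2 * energy_noise)) == snr_db.
    scale = math.sqrt(energy_signal / (energy_noise * 10 ** (snr_db / 10)))
    noisy = [s + scale * v for s, v in zip(waveform[:n], noise[:n])]
    # Sketch-only choice: samples past `length` are returned untouched.
    return noisy + list(waveform[n:])


clean = [1.0, -1.0, 1.0, -1.0]
noise = [0.5, 0.5, -0.5, -0.5]
noisy_full = add_noise(clean, noise, snr_db=10.0)
noisy_part = add_noise(clean, noise, snr_db=10.0, length=2)
```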

Pull Request resolved: https://github.com/pytorch/audio/pull/2977

Reviewed By: nateanl

Differential Revision: D42484211

Pulled By: hwangjeff

fbshipit-source-id: 54757dcc73df194bb98c1d9d42a2f43f3027b190


17 Jan, 2023 2 commits

Fix buffer flushing mechanism · 51731bf9

Moto Hira authored Jan 16, 2023

Summary:
When buffered data were cleared from ChunkedBuffer,
the `num_buffered_frames` variable was not updated.

This commit fixes that.
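
The class of bug described above can be sketched in Python. The actual fix is in the C++ `ChunkedBuffer`; the class below only mirrors the idea, and its method names are hypothetical.

```python
class ChunkedBufferSketch:
    def __init__(self):
        self.chunks = []
        self.num_buffered_frames = 0

    def push(self, chunk):
        self.chunks.append(chunk)
        self.num_buffered_frames += len(chunk)

    def flush(self):
        self.chunks.clear()
        # The counter must be reset together with the storage; leaving it
        # stale would make the buffer report frames it no longer holds.
        self.num_buffered_frames = 0


buf = ChunkedBufferSketch()
buf.push([1, 2, 3])
buf.push([4])
frames_before_flush = buf.num_buffered_frames
buf.flush()
```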

Reviewed By: xiaohui-zhang

Differential Revision: D42538519

fbshipit-source-id: a24a9afcebebd8956d977f05e9c2f0b603d060d1


Fix mel spectrogram visualization in TTS tutorial (#2989) · b983c665

Zhaoheng Ni authored Jan 16, 2023

Summary:
The mel spectrograms in the TTS tutorial are upside down. This PR fixes it by passing `origin="lower"` to `imshow`.

Pull Request resolved: https://github.com/pytorch/audio/pull/2989

Reviewed By: mthrok

Differential Revision: D42538349

Pulled By: nateanl

fbshipit-source-id: 4388103a49bdfabf1705c1f979d44ecedd5c910a


16 Jan, 2023 4 commits

Refactor buffer common utils (#2988) · e259f156

moto authored Jan 16, 2023

Summary:
Split `convert_video` into memory allocation function and write function.

Also put all the buffer implementations into detail namespace.

Pull Request resolved: https://github.com/pytorch/audio/pull/2988

Reviewed By: xiaohui-zhang

Differential Revision: D42536769

Pulled By: mthrok

fbshipit-source-id: 36fbf437d4bfd521322846161ae08a48c782c540


Fixes examples/source_separation for WSJ0_2mix dataset (#2987) · f9d38796

Robin Scheibler authored Jan 16, 2023

Summary:
The `examples/source_separation` scripts use inconsistent keywords to indicate the WSJ0_2mix dataset. This PR does the following.

1. Uses `wsj0mix` consistently as the keyword indicating the WSJ0_2mix dataset
2. Corrects `args.data_dir` to `args.root_dir` in eval.py
3. Modifies the parameters of `pytorch_lightning.Trainer` according to the latest version (uses `accelerator="gpu"` and `devices=args.num_devices`, instead of just `gpus=args.num_devices`)
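
The third change can be sketched as a keyword-argument mapping. The `--num-devices` flag name is an assumption of this sketch, and the `Trainer` call is left as a comment so the snippet runs without `pytorch_lightning` installed.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--num-devices", type=int, default=1)  # hypothetical flag name
args = parser.parse_args([])  # empty list: use the defaults in this sketch

# Older style: Trainer(gpus=args.num_devices)
# Newer style, as described in item 3:
trainer_kwargs = {"accelerator": "gpu", "devices": args.num_devices}

# trainer = pytorch_lightning.Trainer(**trainer_kwargs)
```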

Pull Request resolved: https://github.com/pytorch/audio/pull/2987

Reviewed By: xiaohui-zhang

Differential Revision: D42536992

Pulled By: nateanl

fbshipit-source-id: 10a80263ad7054b1629d8fa023676b607e633d76


Refactor chunked buffer implementation (#2984) · 52b6bc3b

moto authored Jan 16, 2023

Summary:
Refactors the chunked buffer so that the number of Tensor frames stored in buffers is always a multiple of `frames_per_chunk`.

This makes it easy to store PTS values in an aligned manner.
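
A sketch of why chunk-aligned storage helps with PTS bookkeeping. It is not the C++ implementation: the function name, the `(pts, frames)` pairing, the fixed `frame_duration`, and returning leftover frames separately are all assumptions made for illustration.

```python
from fractions import Fraction


def split_into_chunks(frames, frames_per_chunk, first_pts, frame_duration):
    """Return (complete_chunks, leftover_frames).

    Each complete chunk is a (pts, frames) pair, where pts is the timestamp
    of the chunk's first frame.
    """
    chunks = []
    num_complete = len(frames) // frames_per_chunk
    for i in range(num_complete):
        start = i * frames_per_chunk
        pts = first_pts + start * frame_duration
        chunks.append((pts, frames[start:start + frames_per_chunk]))
    leftover = frames[num_complete * frames_per_chunk:]
    return chunks, leftover


chunks, leftover = split_into_chunks(
    list(range(7)), 3, Fraction(0), Fraction(1, 30)
)
```

With every stored chunk holding exactly `frames_per_chunk` frames, one PTS per chunk is enough to locate any frame inside it.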

Pull Request resolved: https://github.com/pytorch/audio/pull/2984

Reviewed By: nateanl

Differential Revision: D42526670

Pulled By: mthrok

fbshipit-source-id: d83ee914b7e50de3b51758069b0e0b6b3ebe2e54


Set filter graph #threads to 1 (#2985) · 3ecf78d6

moto authored Jan 16, 2023

Summary:
FilterGraph supports multithreading, and by default, the number of threads is determined automatically.

Rather than an automatic behavior, which is unpredictable, it is better to fix the number of threads to 1.

Follow-up: Add an interface to adjust it.

Similar to https://github.com/pytorch/audio/pull/2949.

Pull Request resolved: https://github.com/pytorch/audio/pull/2985

Reviewed By: nateanl

Differential Revision: D42526958

Pulled By: mthrok

fbshipit-source-id: c4f7f95317e93a39378107636a3ca30f6ddfe466


15 Jan, 2023 1 commit

Add pre-trained pipelines for XLS-R models (#2978) · 9b7b64e4

Zhaoheng Ni authored Jan 15, 2023

Summary:
This PR adds three `Wav2Vec2Bundle` pipeline objects for XLS-R models:
- WAV2VEC2_XLSR_300M
- WAV2VEC2_XLSR_1B
- WAV2VEC2_XLSR_2B

All three models use layer normalization in the feature extraction layers, hence `_normalize_waveform` is set to `True`.

Pull Request resolved: https://github.com/pytorch/audio/pull/2978

Reviewed By: hwangjeff

Differential Revision: D42501491

Pulled By: nateanl

fbshipit-source-id: 2429ec880cc14798034843381e458e1b4664dac3


14 Jan, 2023 1 commit

Fix CI tests on gpu machines (#2982) · 82ded7e7

Zhaoheng Ni authored Jan 14, 2023

Summary:
XLS-R tests are supposed to be skipped on gpu machines, but they are forced to run in [_skipIf](https://github.com/pytorch/audio/blob/main/test/torchaudio_unittest/common_utils/case_utils.py#L143-L145) decorator. This PR skips the XLS-R tests if the machine is CI and CUDA is available.

Pull Request resolved: https://github.com/pytorch/audio/pull/2982

Reviewed By: xiaohui-zhang

Differential Revision: D42520292

Pulled By: nateanl

fbshipit-source-id: c6ee4d4a801245226c26d9cd13e039e8d910add2


13 Jan, 2023 2 commits

Add mel spectrogram visualization to Streaming ASR tutorial (#2974) · 55575a53

moto authored Jan 12, 2023

Summary:
Per the suggestion by nateanl, this adds a visualization of the features fed to ASR.

<img width="688" alt="Screen Shot 2023-01-12 at 8 19 59 PM" src="https://user-images.githubusercontent.com/855818/212215190-23be7553-4c04-40d9-944e-3ee2ff69c49b.png">

Pull Request resolved: https://github.com/pytorch/audio/pull/2974

Reviewed By: nateanl

Differential Revision: D42484088

Pulled By: mthrok

fbshipit-source-id: 2c839492869416554eac04aa06cd12078db21bd7


Add XLS-R models (#2959) · a5664ca9

Zhaoheng Ni authored Jan 12, 2023

Summary:
XLSR (cross-lingual speech representation) is a set of cross-lingual self-supervised learning models for generating cross-lingual speech representations. It was first proposed in https://arxiv.org/pdf/2006.13979.pdf, where the model is trained on 53 languages (so-called XLSR-53). This PR adds support for the XLS-R models from https://arxiv.org/pdf/2111.09296.pdf, which have more parameters (300M, 1B, 2B) and are trained on 128 languages.

Pull Request resolved: https://github.com/pytorch/audio/pull/2959

Reviewed By: mthrok

Differential Revision: D42397643

Pulled By: nateanl

fbshipit-source-id: 23e8e51a7cde0a226db4f4028db7df8f02b986ce


12 Jan, 2023 4 commits

Refactor extension modules initialization (#2968) · 5dfe0b22

mthrok authored Jan 12, 2023

Summary:
* Refactor _extension module so that
  * the implementation of initialization logic and its execution are separated.
    * logic goes to `_extension.utils`
    * the execution is at `_extension.__init__`
    * global variables are defined and modified in `__init__`.
* Replace `is_sox_available()` with `_extension._SOX_INITIALIZED`
* Replace `is_kaldi_available()` with `_extension._IS_KALDI_AVAILABLE`
* Move `requires_sox()` and `requires_kaldi()` to break the circular dependency between `_extension` and `_internal.module_utils`.
* Merge the sox-related initialization logic in `_extension.utils` module.

Pull Request resolved: https://github.com/pytorch/audio/pull/2968

Reviewed By: hwangjeff

Differential Revision: D42387251

Pulled By: mthrok

fbshipit-source-id: 0c3245dfab53f9bc1b8a83ec2622eb88ec96673f


Add query methods to FilterGraph (#2976) · 32d46f94

moto authored Jan 11, 2023

Summary:
This commit adds methods to query the output configuration from a FilterGraph object.
* time_base -> required to compute PTS of output frame
* sample_rate, num_channels -> required to compute PTS and pre allocate buffers for audio.
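
To show how these values are typically used, here is a small arithmetic sketch. The helper names are hypothetical and not part of FilterGraph; the only facts relied on are that a PTS counts `time_base` units and that an audio chunk of `num_samples` samples lasts `num_samples / sample_rate` seconds.

```python
from fractions import Fraction


def pts_to_seconds(pts, time_base):
    # A PTS is an integer count of `time_base` units.
    return pts * time_base


def next_audio_pts(pts, num_samples, sample_rate, time_base):
    # Chunk duration in seconds, converted to time_base ticks
    # (truncated toward zero when it is not a whole number of ticks).
    ticks = Fraction(num_samples, sample_rate) / time_base
    return pts + int(ticks)


def audio_buffer_numel(num_frames, num_channels):
    # Number of elements to pre-allocate for an audio buffer.
    return num_frames * num_channels


one_second = pts_to_seconds(90000, Fraction(1, 90000))
following_pts = next_audio_pts(0, 1024, 44100, Fraction(1, 44100))
numel = audio_buffer_numel(1024, 2)
```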

Pull Request resolved: https://github.com/pytorch/audio/pull/2976

Reviewed By: xiaohui-zhang

Differential Revision: D42466744

Pulled By: mthrok

fbshipit-source-id: dd27109819bfb1fbe37b8233dd6a5e4224fe3f6c


Add `buffer_chunk_size=-1` option (#2969) · 22788a8f

moto authored Jan 11, 2023

Summary:
This commit adds `buffer_chunk_size=-1`, which does not drop buffered frames.
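
The semantics can be sketched with a `deque`. This is an illustration rather than the StreamReader implementation: `make_chunk_buffer` is a hypothetical helper, and dropping the oldest chunks first when the limit is exceeded is an assumption of the sketch.

```python
from collections import deque


def make_chunk_buffer(buffer_chunk_size):
    # A negative size means "never drop"; deque(maxlen=None) is unbounded.
    maxlen = None if buffer_chunk_size < 0 else buffer_chunk_size
    return deque(maxlen=maxlen)


bounded = make_chunk_buffer(2)
unbounded = make_chunk_buffer(-1)
for chunk_id in range(5):
    bounded.append(chunk_id)    # oldest chunks fall off once 2 are held
    unbounded.append(chunk_id)  # everything is kept
```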

Pull Request resolved: https://github.com/pytorch/audio/pull/2969

Reviewed By: xiaohui-zhang

Differential Revision: D42403467

Pulled By: mthrok

fbshipit-source-id: a0847e6878874ce7e4b0ec3f56e5fbb8ebdb5992


Update C++ standard to 17 (#2973) · d1cc1da6

moto authored Jan 11, 2023

Summary:
Following the change in PyTorch core.

https://github.com/pytorch/pytorch/commit/87e4a087784c805312a2b48bb063d2400df26c5e

Pull Request resolved: https://github.com/pytorch/audio/pull/2973

Reviewed By: xiaohui-zhang

Differential Revision: D42462709

Pulled By: mthrok

fbshipit-source-id: 60c2aa3d63fe25d8e0b7aa476404e7a55d6eb87f


11 Jan, 2023 1 commit

add CUDA 11.8 builds (#2951) · 7e7b60c1

pbialecki authored Jan 11, 2023

Summary:
CC atalman

Pull Request resolved: https://github.com/pytorch/audio/pull/2951

Reviewed By: mthrok

Differential Revision: D42459205

Pulled By: atalman

fbshipit-source-id: b2d7c5604ba1f3bb4d9a45a052ac41054acd52dd


10 Jan, 2023 2 commits

Update the handling of videos without PTS values (#2970) · 1717edaa

moto authored Jan 10, 2023

Summary:
The filter graph does not fall back to `best_effort_timestamp`, so applying filters (like changing fps) to videos without PTS values failed.

This commit changes the behavior by overwriting the PTS values with best_effort_timestamp.
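
The change can be sketched in Python. The real code operates on FFmpeg frames in C++; the `Frame` dataclass and `prepare_for_filter` below are hypothetical stand-ins that only mirror the two fields involved.

```python
from dataclasses import dataclass

AV_NOPTS_VALUE = -(2 ** 63)  # FFmpeg's "no timestamp" sentinel (INT64_MIN)


@dataclass
class Frame:
    pts: int
    best_effort_timestamp: int


def prepare_for_filter(frame):
    # Overwrite the PTS with the best-effort timestamp before the frame
    # reaches the filter graph, as the commit message describes.
    frame.pts = frame.best_effort_timestamp
    return frame


frame = prepare_for_filter(Frame(pts=AV_NOPTS_VALUE, best_effort_timestamp=42))
```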

Pull Request resolved: https://github.com/pytorch/audio/pull/2970

Reviewed By: YosuaMichael

Differential Revision: D42425771

Pulled By: mthrok

fbshipit-source-id: 7b7a033ea2ad89bb49d6e1663d35d377dab2aae9


Fix fill_buffer method (#2971) · e1cddb46

moto authored Jan 09, 2023

Summary:
* Add missing docstrings
* Add default values

Pull Request resolved: https://github.com/pytorch/audio/pull/2971

Reviewed By: xiaohui-zhang

Differential Revision: D42425796

Pulled By: mthrok

fbshipit-source-id: a6a946875142a54424c059bbfbab1908a1564bd3


06 Jan, 2023 1 commit

Fix document for MelScale and InverseMelScale (#2967) · 4a037b03

Zhaoheng Ni authored Jan 06, 2023

Summary:
`InverseMelScale` is missing from the nightly documentation webpage, and `MelScale` fits better in the Feature Extractions section. This PR moves both documents into the Feature Extractions section.

Pull Request resolved: https://github.com/pytorch/audio/pull/2967

Reviewed By: mthrok

Differential Revision: D42387886

Pulled By: nateanl

fbshipit-source-id: cdac020887817ea2530bfb26e8ed414ae4761420
