Commits · 69e8dbb21d66120a72c9e0b076c017dde1a0da74 · OpenDAS / Torchaudio

22 Dec, 2022 1 commit

moto authored Dec 21, 2022

Summary:
This commit adds "filter_waveform" prototype function.

This function can apply non-stationary filters across the time.
It also performs cropping at the end to compensate the delay introduced by filtering.
The figure bellow illustrates this.

See [subtractive_synthesis_tutorial](https://output.circle-artifacts.com/output/job/5233fda9-dadb-4710-9389-7e8ac20a062f/artifacts/0/docs/tutorials/subtractive_synthesis_tutorial.html) for example usages.

![figure](https://download.pytorch.org/torchaudio/doc-assets/filter_waveform.png)

Pull Request resolved: https://github.com/pytorch/audio/pull/2928

Reviewed By: carolineechen

Differential Revision: D42199955

Pulled By: mthrok

fbshipit-source-id: e822510ab8df98393919bea33768f288f4d661b2

69e8dbb2

21 Dec, 2022 1 commit

Extract libsox integration from libtorchaudio (#2929) · 1706a72f

moto authored Dec 21, 2022

Summary:
This commit makes the following changes to the C++ library organization
- Move sox-related feature implementations from `libtorchaudio` to `libtorchaudio_sox`.
- Remove C++ implementation of `is_sox_available` and `is_ffmpeg_available` as it is now sufficient to check the existence of `libtorchaudio_sox` and `libtorchaudio_ffmpeg` to check the availability. This makes `libtorchaudio_sox` and `libtorchaudio_ffmpeg` independent from `libtorchaudio`.
- Move PyBind11-based bindings (`_torchaudio_sox`, `_torchaudio_ffmpeg`) into `torchaudio.lib` so that the built library structure is less cluttered.

Background:
Originally, when the `libsox` was the only C++ extension and `libtorchaudio` was supposed to contain all the C++ code.
The things are different now. We have a bunch of C++ extensions and we need to make the code/build structure more modular.

The new `libtorchaudio_sox` contains the implementations and `_torchaudio_sox` contains the PyBin11-based bindings.

Pull Request resolved: https://github.com/pytorch/audio/pull/2929

Reviewed By: hwangjeff

Differential Revision: D42159594

Pulled By: mthrok

fbshipit-source-id: 1a0fbca9e4143137f6363fc001b2378ce6029aa7

1706a72f

20 Dec, 2022 1 commit

Fallback to best_effort_timestamp in case of invalid PTS (#2916) · c6bc65fd

moto authored Dec 20, 2022

Summary:
If the input video has invalid PTS, the current precise seek fails except when seeking into t=0.

This commit updates the discard mechanism to fallback to `best_effort_timestamp` in such cases.

`best_effort_timestamp` is just the number of frames went through decoder starting from the beginning of the file.

This means if the input file is very long, but seeking towards the end of the file, the StreamReader still decodes all the frames.

For videos with valid PTS, `best_effort_timestamp` should be same as `pts`. [[src](https://ffmpeg.org/doxygen/4.1/decode_8c.html#a8d86329cf58a4adbd24ac840d47730cf)]

Pull Request resolved: https://github.com/pytorch/audio/pull/2916

Reviewed By: YosuaMichael

Differential Revision: D42170204

Pulled By: mthrok

fbshipit-source-id: 80c04dc376e0f427d41eb9feb44c251a1648a998

c6bc65fd

19 Dec, 2022 2 commits

Split extract_archive into dedicated functions. (#2927) · 5807078c

moto authored Dec 19, 2022

Summary:
`extra_archive` in `datasets.utils` does not distinguish the input type, and blindly treats it as tar, then zip in case of failure.

This is an anti-pattern. All the dataset implementations know which archive type the downloaded files are.

This commit splits extract_archive function into dedicated functions, and make each dataset use the correct one.

Pull Request resolved: https://github.com/pytorch/audio/pull/2927

Reviewed By: carolineechen

Differential Revision: D42154069

Pulled By: mthrok

fbshipit-source-id: bc46cc2af26aa086ef49aa1f9a94b6dedb55f85e

5807078c

Remove deprecated/unused functions from datasets.utils (#2926) · d744f33f

moto authored Dec 19, 2022

Summary:
`stream_url`, `download_url` and `validate_file` are not used and not listed in documentation (`download_url` is marked as deprecated) so remove them.

This will also fix the failing bandit workflow.

Pull Request resolved: https://github.com/pytorch/audio/pull/2926

Reviewed By: carolineechen

Differential Revision: D42153484

Pulled By: mthrok

fbshipit-source-id: 0fccdc7b7e0e40db8046e12f46eb68de57d838ca

d744f33f

17 Dec, 2022 1 commit

Add filter design tutorial (#2894) · 9c4f71a6

moto authored Dec 16, 2022

Summary:
Adds filter design tutorial, which demonstrates `sinc_impulse_response` and `frequency_impulse_response`.

Example:
 - [filter_design_tutorial](https://output.circle-artifacts.com/output/job/bd22c615-9215-4b17-a52c-b171a47f646c/artifacts/0/docs/tutorials/filter_design_tutorial.html)

Pull Request resolved: https://github.com/pytorch/audio/pull/2894

Reviewed By: xiaohui-zhang

Differential Revision: D42117658

Pulled By: mthrok

fbshipit-source-id: f7dd04980e8557bb6f0e0ec26ac2c7f53314ea16

9c4f71a6

16 Dec, 2022 3 commits

Rename resampling_method options (#2922) · e6bebe6a

Caroline Chen authored Dec 16, 2022

Summary:
resolves https://github.com/pytorch/audio/issues/2891

Rename `resampling_method` options to more accurately describe what is happening. Previously the methods were set to `sinc_interpolation` and `kaiser_window`, which can be confusing as both options actually use sinc interpolation methodology, but differ in the window function used. As a result, rename `sinc_interpolation` to `sinc_interp_hann` and `kaiser_window` to `sinc_interp_kaiser`. Using an old option will throw a warning, and those options will be deprecated in 2 released. The numerical behavior is unchanged.

Pull Request resolved: https://github.com/pytorch/audio/pull/2922

Reviewed By: mthrok

Differential Revision: D42083619

Pulled By: carolineechen

fbshipit-source-id: 9a9a7ea2d2daeadc02d53dddfd26afe249459e70

e6bebe6a

Bump version to match PyTorch (#2924) · 3cf266a3

moto authored Dec 16, 2022

Summary:
Updating the version to 2.0.0 to match PyTorch core

Pull Request resolved: https://github.com/pytorch/audio/pull/2924

Reviewed By: carolineechen

Differential Revision: D42098238

Pulled By: mthrok

fbshipit-source-id: 19cb290e493293d1db4211f9bfcdcdffa3e96e45

3cf266a3

Update version matrix for 0.13.1 release (#2923) · f0986b60

moto authored Dec 15, 2022

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2923

Reviewed By: carolineechen

Differential Revision: D42090164

Pulled By: mthrok

fbshipit-source-id: 4816d9922cc54f42ad2137a3bba1a61f3be68f57

f0986b60

15 Dec, 2022 1 commit

Switch to Nova MacOS Wheel (#2907) · 0be8423d

DanilBaibak authored Dec 14, 2022

Summary:
Switch to Nova MacOS and M1 Wheels. This PR is a step in migrating from CircleCI to the Nova workflow.

- [x] Disable the CircleCI builds for MacOS Wheel.
- [x] Disable the CircleCI builds for M1 Wheel.
- [x] Enable the Nova workflow for MacOS Wheel.
- [x] Enable the Nova workflow for M1 Wheel.

Pull Request resolved: https://github.com/pytorch/audio/pull/2907

Reviewed By: osalpekar, mthrok

Differential Revision: D42040965

Pulled By: DanilBaibak

fbshipit-source-id: b87f028cf5686bf97265109591fb0a8c1190324c

0be8423d

13 Dec, 2022 2 commits

Return version from cuda version check (#2919) · 392481f0

atalman authored Dec 13, 2022

Summary:
Return version from cuda version check so that we can display it in the test details

Pull Request resolved: https://github.com/pytorch/audio/pull/2919

Reviewed By: mthrok

Differential Revision: D42001252

Pulled By: atalman

fbshipit-source-id: 0d8fc3812a13fe098cdf8e0f3df7b66161ba3f95

392481f0

Switch to nova Linux Wheel build (#2896) · ad596b37

DanilBaibak authored Dec 13, 2022

Summary:
Switch to Nova Linux Wheel build.

- [x] Disable the CircleCI builds for Linux Wheel.
- [x] Enable the Nova workflow for Linux Wheel.

~The Linux Wheel Python3.8 build has been kept because it is a dependency for the `docstring_parameters_sync` job.~ As Omkar pointed out, Docstring Parameters Sync also runs on GHA (https://github.com/pytorch/audio/actions/runs/3638187635/jobs/6140090209). So, we completely switched to the Nova Linux Wheel build.

Pull Request resolved: https://github.com/pytorch/audio/pull/2896

Reviewed By: mthrok

Differential Revision: D41994993

Pulled By: DanilBaibak

fbshipit-source-id: 2bafdbe1b62ef8fa194ce96f4d4667a3ed3dc44f

ad596b37

12 Dec, 2022 1 commit

Update precise seek behavior for t=0 (#2915) · cbd35438

moto authored Dec 12, 2022

Summary:
It was reported that when videos with invalid PTS values are fed to StreamReader, StreamReader returns only the last frame.

https://github.com/pytorch/vision/blob/677fc939b21a8893f07db4c1f90482b648b6573f/test/assets/videos/RATRACE_wave_f_nm_np1_fr_goo_37.avi

```
import torchaudio

src = "RATRACE_wave_f_nm_np1_fr_goo_37.avi"

streamer = torchaudio.io.StreamReader(src=src)
streamer.add_basic_video_stream(frames_per_chunk=-1)
streamer.process_all_packets()
video, = streamer.pop_chunks()

print(video.size(0)) # prints 1, but there are more than 70 frames
```

The reason why all the frames are not returned is due to invalid PTS values. All the frames's PTS values are `-9223372036854775808` so the internal mechanism discards them.

The reason why the last frame is output is because when entering drain mode, the discard value of -1 is used, which is interpreted as no discard.

For the second issue, the discard behavior should be consistent across regular decoding and drain mode.

For the first issue, although the normal behavior is not guaranteed for such invalid input, we can support the case where one reads video from start (or when one seeks into t=0)

---

This commits make the following changes to address the above two.

1. Define the discard_before_pts attribtue on StreamProcessor, so that StreamProcessor is aware of the discard behavior without being told by StreamReader, and its behavior is consistent between regular decoding and drain.

This gets rid of the discard_before_pts computation that is currently happening at the every time a frame is processed, so this should improve the peformance a bit.

2. Change the meaning of discard_before_pts, so that when it's 0, no discard happens. With this change, the negative value is not necessary so we put it a UB status.

Note:
Even with those changes seeking videos with invalid PTS is not plausible, client codes can implement a fallback which decodes frames first and discard undesired ones.

Pull Request resolved: https://github.com/pytorch/audio/pull/2915

Reviewed By: nateanl

Differential Revision: D41957784

Pulled By: mthrok

fbshipit-source-id: 2dafdbada5aa33bfc81c986306f80642ba6277df

cbd35438

11 Dec, 2022 1 commit

Update PR labels (#2912) · 123c7319

Caroline Chen authored Dec 10, 2022

Summary:
replace "example" primary label with "tutorial" and "recipe"

cc pytorch/team-audio-core

Pull Request resolved: https://github.com/pytorch/audio/pull/2912

Reviewed By: nateanl, mthrok

Differential Revision: D41897772

Pulled By: carolineechen

fbshipit-source-id: e20540c6f302b5dcc56663a2c074b14a48e4d3e1

123c7319

10 Dec, 2022 2 commits

Fix type of arguments in torchaudio.io classes (#2913) · 54a664b9

Zhaoheng Ni authored Dec 10, 2022

Summary:
The `src` or `dst` argument can be `str` or `file-like object`. Setting it to `str` in type annotation will confuse users that it only accepts `str` type.

Pull Request resolved: https://github.com/pytorch/audio/pull/2913

Reviewed By: mthrok

Differential Revision: D41896668

Pulled By: nateanl

fbshipit-source-id: 1446a9f84186a0376cdbe4c61817fae4d5eaaab4

54a664b9

Update model documentation structure (#2902) · 9912e54d

moto authored Dec 09, 2022

Summary:
Currently, the documentation page for `torchaudio.models` have separate sections for model definitions and factory functions.

The relationships between models and factory functions are not immediately clear.

This commit moves the list of factory functions to the list of models.

After:
 - https://output.circle-artifacts.com/output/job/242a9521-7460-4043-895b-9995bf5093b5/artifacts/0/docs/generated/torchaudio.models.Wav2Vec2Model.html

<img width="1171" alt="Screen Shot 2022-12-08 at 8 41 03 PM" src="https://user-images.githubusercontent.com/855818/206603743-74a6e368-c3cf-4b87-b854-518a95893f06.png">

Pull Request resolved: https://github.com/pytorch/audio/pull/2902

Reviewed By: carolineechen

Differential Revision: D41897800

Pulled By: mthrok

fbshipit-source-id: a3c01d28d80e755596a9bc37c951960eb84870b9

9912e54d

09 Dec, 2022 5 commits

Fix integration test for WAV2VEC2_ASR_LARGE_LV60K_10M (#2910) · 90162812

Zhaoheng Ni authored Dec 09, 2022

Summary:
After https://github.com/pytorch/audio/issues/2873, the pre-trained Wav2Vec2 models with larger datasets can get better performances. The PR fixes the integration test of bundle `WAV2VEC2_ASR_LARGE_LV60K_10M` which predicts the word `CURIOUSITY` to `CURIOUSSITY` before but now to `CURIOUSITY` correctly.

Pull Request resolved: https://github.com/pytorch/audio/pull/2910

Reviewed By: mthrok

Differential Revision: D41881919

Pulled By: nateanl

fbshipit-source-id: 236fd00b983a5205c731f3efa31033a6b8257cab

90162812

Update author and maintainer info (#2911) · eb8b1bda

moto authored Dec 09, 2022

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2911

Reviewed By: carolineechen

Differential Revision: D41887854

Pulled By: mthrok

fbshipit-source-id: eb91773ec67b4cda2d70733df450956d83742509

eb8b1bda

Fix duplicated memory allocation in StreamWriter (#2906) · 90c456de

Moto Hira authored Dec 09, 2022

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/2906

The correct way to create AVFormatContext* for output is to pass an address of an uninitialized *AVFormatContext struct to `avformat_alloc_output_context2` function.

The current code pre-allocates AVFormatContext* with `avformat_alloc_context`, then this allocated object is lost inside of `avformat_alloc_output_context2`.

Reviewed By: xiaohui-zhang

Differential Revision: D41865685

fbshipit-source-id: 9a9dc83b5acfe9b450f191fe716c85ebb5a5d842

90c456de

Fix wrong frame allocation in StreamWriter (#2905) · 3518df48

Moto Hira authored Dec 09, 2022

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/2905

In StreamWriter, if the tensor format is different from the encoding format, then a FilterGraph object is automatically inserted to convert the format.

The FilterGraph object operates on AVFrames. The input AVFrame must be allocated by us, but the output AVFrames is filled by FilterGraph, thus no need to allocate it.

Now the output AVFrame is used as input to encoder regardless of whether FilterGraph was inserted. Thus the output AVFrame has to be manually allocated by us when FilterGraph is not used.

The current code flips this condition and incorrectly allocates AVFrame when FilterGraph is present and does not allocate otherwise.

This commit fix that.

Reviewed By: xiaohui-zhang

Differential Revision: D41866198

fbshipit-source-id: 40799c147dc8166a979ecfb58ed8e502539a6aed

3518df48

Toggle on/off ffmpeg test if needed (#2901) · ccda545c

atalman authored Dec 09, 2022

Summary:
Toggle on/off ffmpeg test if needed
By default it ON, hence should not affect any current tests.
To toggle ON no change required.
To toggle OFF use:
```
smoke_test.py --no-ffmpeg
```

To be used when calling from builder currently. Since we do not install ffmpeg currently.

Pull Request resolved: https://github.com/pytorch/audio/pull/2901

Reviewed By: carolineechen, mthrok

Differential Revision: D41874976

Pulled By: atalman

fbshipit-source-id: c57b19f37c63a1f476f93a5211550e980e67d9c7

ccda545c

08 Dec, 2022 4 commits

Follow up on WavLM bundles (#2895) · 41d007b4

Grigory Sizov authored Dec 08, 2022

Summary:
Addressed mthrok's comments in https://github.com/pytorch/audio/pull/2833:
- Moved model type from `_params` directly into the bundle definition. For now I defined model type as "WavLM" for WavLM bundles and "Wav2Vec2" for everything else. We can also distinguish between different Wav2Vec2 falvours - Hubert, VoxPopuli etc, but at the moment this won't imply any functional differences, so I didn't do it
- Expanded the title underline to match the title length

Pull Request resolved: https://github.com/pytorch/audio/pull/2895

Reviewed By: nateanl, mthrok

Differential Revision: D41799875

Pulled By: sgrigory

fbshipit-source-id: 0730d4f91ed60e900643bb74d6cccdd7aa5d7b39

41d007b4

Fix docs warnings for conformer w2v2 (#2900) · 88927e84

Caroline Chen authored Dec 08, 2022

Summary:
cc mthrok

Pull Request resolved: https://github.com/pytorch/audio/pull/2900

Reviewed By: mthrok

Differential Revision: D41839924

Pulled By: carolineechen

fbshipit-source-id: ba3ada7d04a86d99e08c9044de05a1c48b05d036

88927e84

Add HiFi GAN Generator to prototypes (#2860) · b5e4663a

Grigory Sizov authored Dec 08, 2022

Summary:
Part 1 of [T138011314](https://www.internalfb.com/intern/tasks/?t=138011314)

This PR ports the generator part of [HiFi GAN](https://arxiv.org/abs/2010.05646v2) from [the original implementation](https://github.com/jik876/hifi-gan/blob/4769534d45265d52a904b850da5a622601885777/models.py#L75)

Adds tests:
- Smoke tests for architectures V1, V2, V3
- Check that output shapes are correct
- Check that the model is torchscriptable and scripting doesn't change the output
- Check that our code's output matches the original implementation. Here I clone the original repo inside `/tmp` and import necessary objects from inside the test function.  On test teardown I restore `PATH`, but don't remove the cloned code, so that it can be reused on subsequent runs - let me know if removing it would be a better practice

There are no quantization tests, because the model consists mainly of `Conv1d` and `ConvTransposed1d`, and they are [not supported by dynamic quantization](https://pytorch.org/docs/stable/quantization.html)

Pull Request resolved: https://github.com/pytorch/audio/pull/2860

Reviewed By: nateanl

Differential Revision: D41433416

Pulled By: sgrigory

fbshipit-source-id: f135c560df20f5138f01e3efdd182621edabb4f5

b5e4663a

Add feature badges to preemphasis and deemphasis functions (#2892) · ba683bd1

hwangjeff authored Dec 08, 2022

Summary:
Adds feature badges to preemphasis and deemphasis functions

Pull Request resolved: https://github.com/pytorch/audio/pull/2892

Reviewed By: carolineechen

Differential Revision: D41830782

Pulled By: hwangjeff

fbshipit-source-id: 487ce9afa8dc8fe321aa9e02cc88bb1453985d39

ba683bd1

07 Dec, 2022 3 commits

Add additive noise transform (#2889) · 29ecf7e8

hwangjeff authored Dec 07, 2022

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2889

Reviewed By: xiaohui-zhang

Differential Revision: D41760084

Pulled By: hwangjeff

fbshipit-source-id: d2f5253e1fae7e7aafa9fa6043c6a7045c5b33a0

29ecf7e8

Introduce MUSAN dataset (#2888) · 45c7d05a

hwangjeff authored Dec 06, 2022

Summary:
Introduces the MUSAN dataset (https://www.openslr.org/17/), which contains music, speech, and noise recordings.

Pull Request resolved: https://github.com/pytorch/audio/pull/2888

Reviewed By: xiaohui-zhang

Differential Revision: D41762164

Pulled By: hwangjeff

fbshipit-source-id: 14d5baaa4d40f065dd5d99bf7f2e0a73aa6c31a9

45c7d05a

Upgrade nightly wheels to ROCm5.3 (#2853) · e97f3a32

Jithun Nair authored Dec 06, 2022

Summary:
Dependent on PR https://github.com/pytorch/pytorch/pull/89101

Pull Request resolved: https://github.com/pytorch/audio/pull/2853

Reviewed By: atalman, osalpekar

Differential Revision: D41737634

Pulled By: malfet

fbshipit-source-id: 715a97a2da8ef309cea78d971b47c07463495683

e97f3a32

06 Dec, 2022 1 commit

Add frequency_impulse_response (#2879) · d234498c

moto authored Dec 06, 2022

Summary:
This commit adds `frequency_impulse_response` function, which generates filter from desired frequency response.

[Example](https://output.circle-artifacts.com/output/job/5233fda9-dadb-4710-9389-7e8ac20a062f/artifacts/0/docs/tutorials/filter_design_tutorial.html#frequency-sampling)

Pull Request resolved: https://github.com/pytorch/audio/pull/2879

Reviewed By: hwangjeff

Differential Revision: D41767787

Pulled By: mthrok

fbshipit-source-id: 6d5e44c6390e8cf3028994a1b1de590ff3aaf6c2

d234498c

04 Dec, 2022 1 commit

Fix _init_hubert_pretrain_model (#2886) · d8a5a11d

Zhaoheng Ni authored Dec 03, 2022

Summary:
address https://github.com/pytorch/audio/issues/2885

In `_init_hubert_pretrain_model ` method which initialize the hubert pretrain models, `kaiming_normal_` should be applied on `ConvLayerBlock` instead of `LayerNorm` layer. This PR fixes it and adds more unit tests.

Pull Request resolved: https://github.com/pytorch/audio/pull/2886

Reviewed By: hwangjeff

Differential Revision: D41713801

Pulled By: nateanl

fbshipit-source-id: ed199baf7504d06bbf2d31c522ae708a75426a2d

d8a5a11d

02 Dec, 2022 1 commit

Add pre-emphasis and de-emphasis functions (#2871) · 55e9978a

hwangjeff authored Dec 01, 2022

Summary:
Adds pre-emphasis and de-emphasis functions.

Pull Request resolved: https://github.com/pytorch/audio/pull/2871

Reviewed By: carolineechen

Differential Revision: D41651097

Pulled By: hwangjeff

fbshipit-source-id: 7a3cf6ce68b6ce1b9ae315ddd8bd8ed71acccdf1

55e9978a

30 Nov, 2022 2 commits

Add speed and speed perturbation functions and transforms (#2829) · c28073cc

hwangjeff authored Nov 30, 2022

Summary:
Adds functions and transforms for speed and speed perturbation (https://www.isca-speech.org/archive/interspeech_2015/ko15_interspeech.html).

Pull Request resolved: https://github.com/pytorch/audio/pull/2829

Reviewed By: xiaohui-zhang

Differential Revision: D41285114

Pulled By: hwangjeff

fbshipit-source-id: 114740507698e01f35d4beb2c568a2479e847506

c28073cc

Add layer normalization to wav2vec2 large+ pretrained models (#2873) · aca61bc0

Andreas Floros authored Nov 29, 2022

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/2873

The original fairseq implementation had an extra layer normalization
preprocessings for large/xlarge models.

https://github.com/facebookresearch/fairseq/blob/fcca32258c8e8bcc9f9890bf4714fa2f96b6b3e1/fairseq/data/audio/hubert_dataset.py#L355-L357

This commit modifies the pre-trained model bundle to include this
preprocessing to the impacted pre-trained models listed bellow.
For the sake of keeping the interface identical to the other models,
since the additional preprocessing is rather simple, the returned
pre-trained model instance is modified ot include the preprocess,
instead of adding a method for preprocessing.

- WAV2VEC2_LARGE_LV60K
- WAV2VEC2_ASR_LARGE_LV60K_10M
- WAV2VEC2_ASR_LARGE_LV60K_100H
- WAV2VEC2_ASR_LARGE_LV60K_960H
- WAV2VEC2_XLSR53
- HUBERT_LARGE
- HUBERT_XLARGE
- HUBERT_ASR_LARGE
- HUBERT_ASR_XLARGE
- WAVLM_LARGE

Reviewed By: nateanl

Differential Revision: D41520183

fbshipit-source-id: 83d72fe692e8b9fc25df144deb4ca946fcd09615

aca61bc0

29 Nov, 2022 5 commits

Add sinc_impulse_response op (#2875) · fc0720b4

moto authored Nov 29, 2022

Summary:
This commit adds `sinc_impulse_response`, which generates windowed-sinc low-pass filters for given cutoff frequencies.

Example usage:
 - [Filter Design Tutorial](https://output.circle-artifacts.com/output/job/c0085baa-5345-4aeb-bd44-448034caa9e1/artifacts/0/docs/tutorials/filter_design_tutorial.html)

Pull Request resolved: https://github.com/pytorch/audio/pull/2875

Reviewed By: carolineechen

Differential Revision: D41586631

Pulled By: mthrok

fbshipit-source-id: a9991dbe5b137b0b4679228ec37072a1da7e50bb

fc0720b4

Add logging to StreamReader/Writer (#2878) · 19e1a84d

moto authored Nov 29, 2022

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2878

Reviewed By: carolineechen

Differential Revision: D41587081

Pulled By: mthrok

fbshipit-source-id: da7f3647083a3566ce94070ce2bd30bf99e1db76

19e1a84d

Add additive synthesis tutorial (#2877) · 1a003c3f

moto authored Nov 29, 2022

Summary:
This commit adds the tutorial for additive synthesis, using torchaudio's prototype DSP ops.

[Review here](https://output.circle-artifacts.com/output/job/3dc83322-832a-4272-9c13-df752c97b660/artifacts/0/docs/tutorials/additive_synthesis_tutorial.html)

Pull Request resolved: https://github.com/pytorch/audio/pull/2877

Reviewed By: carolineechen

Differential Revision: D41585425

Pulled By: mthrok

fbshipit-source-id: b81283b90e4779c8054fd030a1d8c3d39d676bbd

1a003c3f

Extend fftconvolve to support broadcast-able shapes (#2874) · 7a05622e

moto authored Nov 29, 2022

Summary:
Currently, fftconvolve only accepts the tensors for the exact same leading dimensions.
This commit loosens the restriction to allow shapes that are broadcast-able.

This makes the fftconvolve operation more efficient for cases like signal filtering where one operand (waveform) is larger than the other (filter kernel) and the same filter kernels are applied across channels and batches.

Pull Request resolved: https://github.com/pytorch/audio/pull/2874

Reviewed By: carolineechen

Differential Revision: D41581588

Pulled By: mthrok

fbshipit-source-id: c0117e11b979fb53236cc307a970a461b0e50134

7a05622e

Add conformer wav2vec2 pretrain model (#2827) · 8bde6a54

Caroline Chen authored Nov 29, 2022

Summary:
modeled after [paper](https://arxiv.org/pdf/2110.07313.pdf) and internal flow f288347302

internal comparison tests: D40080919

Pull Request resolved: https://github.com/pytorch/audio/pull/2827

Reviewed By: nateanl

Differential Revision: D41569046

Pulled By: carolineechen

fbshipit-source-id: 43c5313074af05972d93da55b2029c746b75c380

8bde6a54

28 Nov, 2022 2 commits

Add aux_num_out to emformer_hubert_model (#2868) · b0795ebe

Zhaoheng Ni authored Nov 28, 2022

Summary:
- layer_norm in `EmformerEncoder` is set as default in emformer_hubert_model, change the type to be non-optional.
- add `aux_num_out` to emformer_hubert_model to support fine-tuning model.
- update unit tests.

Pull Request resolved: https://github.com/pytorch/audio/pull/2868

Reviewed By: carolineechen

Differential Revision: D41451311

Pulled By: nateanl

fbshipit-source-id: 5fa0f19255e4f01e001d62f8689e36f134030083

b0795ebe

Add oscillator tutorial (#2862) · 52e89756

moto authored Nov 28, 2022

Summary:
This commits add tutorial for oscillator_bank and adsr_envelope, which will be a basis for DDSP.

 - [Review here](https://output.circle-artifacts.com/output/job/cf1d3001-88e5-418b-8cf8-ae22b4445dba/artifacts/0/docs/tutorials/oscillator_tutorial.html)

Pull Request resolved: https://github.com/pytorch/audio/pull/2862

Reviewed By: carolineechen

Differential Revision: D41559503

Pulled By: mthrok

fbshipit-source-id: 3f1689186db7d246de14f228fc2f91bf37db98cd

52e89756