- 09 Dec, 2022 2 commits
-
-
Moto Hira authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2905 In StreamWriter, if the tensor format is different from the encoding format, a FilterGraph object is automatically inserted to convert the format. The FilterGraph object operates on AVFrames. The input AVFrame must be allocated by us, but the output AVFrame is filled by FilterGraph, so there is no need to allocate it. The output AVFrame is then used as input to the encoder regardless of whether a FilterGraph was inserted, so it has to be manually allocated by us when FilterGraph is not used. The current code flips this condition: it incorrectly allocates the AVFrame when FilterGraph is present and does not allocate it otherwise. This commit fixes that. Reviewed By: xiaohui-zhang Differential Revision: D41866198 fbshipit-source-id: 40799c147dc8166a979ecfb58ed8e502539a6aed
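A hedged usage sketch of the conversion path this fix touches, assuming the StreamWriter API of this release and that the `format`/`encoder_format` arguments are what trigger the automatic FilterGraph insertion:
```
import torch
from torchaudio.io import StreamWriter

# The input tensor format ("flt") differs from the encoder format ("s16"),
# so a FilterGraph is inserted automatically to convert between them.
writer = StreamWriter(dst="out.wav")
writer.add_audio_stream(sample_rate=16000, num_channels=1, format="flt", encoder_format="s16")
chunk = torch.rand(16000, 1)  # (time, channel) float32 samples
with writer.open():
    writer.write_audio_chunk(0, chunk)
```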
-
atalman authored
Summary: Toggle the ffmpeg test on/off as needed. By default it is ON, so current tests are not affected and no change is required to keep it on. To toggle it OFF use: ``` smoke_test.py --no-ffmpeg ``` This is intended for use when calling from builder, since we do not currently install ffmpeg there. Pull Request resolved: https://github.com/pytorch/audio/pull/2901 Reviewed By: carolineechen, mthrok Differential Revision: D41874976 Pulled By: atalman fbshipit-source-id: c57b19f37c63a1f476f93a5211550e980e67d9c7
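An illustrative sketch of how such a toggle is typically wired up with argparse; the actual smoke_test.py may be structured differently:
```
import argparse

def main() -> None:
    parser = argparse.ArgumentParser(description="torchaudio smoke test")
    # ffmpeg-dependent tests run by default; pass --no-ffmpeg to skip them.
    parser.add_argument("--no-ffmpeg", dest="ffmpeg", action="store_false")
    args = parser.parse_args()

    # ... core smoke tests ...
    if args.ffmpeg:
        # ffmpeg-dependent checks would go here.
        pass

if __name__ == "__main__":
    main()
```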
-
- 08 Dec, 2022 4 commits
-
-
Grigory Sizov authored
Summary: Addressed mthrok's comments in https://github.com/pytorch/audio/pull/2833: - Moved model type from `_params` directly into the bundle definition. For now I defined the model type as "WavLM" for WavLM bundles and "Wav2Vec2" for everything else. We could also distinguish between different Wav2Vec2 flavours - Hubert, VoxPopuli etc. - but at the moment this won't imply any functional differences, so I didn't do it - Expanded the title underline to match the title length Pull Request resolved: https://github.com/pytorch/audio/pull/2895 Reviewed By: nateanl, mthrok Differential Revision: D41799875 Pulled By: sgrigory fbshipit-source-id: 0730d4f91ed60e900643bb74d6cccdd7aa5d7b39
-
Caroline Chen authored
Summary: cc mthrok Pull Request resolved: https://github.com/pytorch/audio/pull/2900 Reviewed By: mthrok Differential Revision: D41839924 Pulled By: carolineechen fbshipit-source-id: ba3ada7d04a86d99e08c9044de05a1c48b05d036
-
Grigory Sizov authored
Summary: Part 1 of [T138011314](https://www.internalfb.com/intern/tasks/?t=138011314) This PR ports the generator part of [HiFi GAN](https://arxiv.org/abs/2010.05646v2) from [the original implementation](https://github.com/jik876/hifi-gan/blob/4769534d45265d52a904b850da5a622601885777/models.py#L75) Adds tests: - Smoke tests for architectures V1, V2, V3 - Check that output shapes are correct - Check that the model is torchscriptable and that scripting doesn't change the output - Check that our code's output matches the original implementation. Here I clone the original repo inside `/tmp` and import the necessary objects from inside the test function (see the sketch below). On test teardown I restore `PATH`, but don't remove the cloned code, so that it can be reused on subsequent runs - let me know if removing it would be better practice There are no quantization tests, because the model consists mainly of `Conv1d` and `ConvTranspose1d`, which are [not supported by dynamic quantization](https://pytorch.org/docs/stable/quantization.html) Pull Request resolved: https://github.com/pytorch/audio/pull/2860 Reviewed By: nateanl Differential Revision: D41433416 Pulled By: sgrigory fbshipit-source-id: f135c560df20f5138f01e3efdd182621edabb4f5
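A rough sketch of the parity-test setup described above; the paths and imports are illustrative, not the actual test code:
```
import subprocess
import sys

# Clone the reference implementation into /tmp and make it importable so that
# torchaudio's port can be compared against the original generator.
CLONE_DIR = "/tmp/hifi-gan"
subprocess.run(
    ["git", "clone", "https://github.com/jik876/hifi-gan", CLONE_DIR],
    check=False,  # the clone is left in place so subsequent runs can reuse it
)
sys.path.insert(0, CLONE_DIR)
# from models import Generator  # the original repo's generator, importable once cloned
```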
-
hwangjeff authored
Summary: Adds feature badges to preemphasis and deemphasis functions Pull Request resolved: https://github.com/pytorch/audio/pull/2892 Reviewed By: carolineechen Differential Revision: D41830782 Pulled By: hwangjeff fbshipit-source-id: 487ce9afa8dc8fe321aa9e02cc88bb1453985d39
-
- 07 Dec, 2022 3 commits
-
-
hwangjeff authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2889 Reviewed By: xiaohui-zhang Differential Revision: D41760084 Pulled By: hwangjeff fbshipit-source-id: d2f5253e1fae7e7aafa9fa6043c6a7045c5b33a0
-
hwangjeff authored
Summary: Introduces the MUSAN dataset (https://www.openslr.org/17/), which contains music, speech, and noise recordings. Pull Request resolved: https://github.com/pytorch/audio/pull/2888 Reviewed By: xiaohui-zhang Differential Revision: D41762164 Pulled By: hwangjeff fbshipit-source-id: 14d5baaa4d40f065dd5d99bf7f2e0a73aa6c31a9
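A minimal usage sketch, assuming the dataset follows the usual torchaudio dataset interface (root directory plus subset name); the return layout is an assumption:
```
import torchaudio

dataset = torchaudio.datasets.MUSAN(root="./musan", subset="noise")
waveform, sample_rate, filename = dataset[0]  # assumed (waveform, sample_rate, filename)
```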
-
Jithun Nair authored
Summary: Dependent on PR https://github.com/pytorch/pytorch/pull/89101 Pull Request resolved: https://github.com/pytorch/audio/pull/2853 Reviewed By: atalman, osalpekar Differential Revision: D41737634 Pulled By: malfet fbshipit-source-id: 715a97a2da8ef309cea78d971b47c07463495683
-
- 06 Dec, 2022 1 commit
-
-
moto authored
Summary: This commit adds the `frequency_impulse_response` function, which generates a filter from a desired frequency response. [Example](https://output.circle-artifacts.com/output/job/5233fda9-dadb-4710-9389-7e8ac20a062f/artifacts/0/docs/tutorials/filter_design_tutorial.html#frequency-sampling) Pull Request resolved: https://github.com/pytorch/audio/pull/2879 Reviewed By: hwangjeff Differential Revision: D41767787 Pulled By: mthrok fbshipit-source-id: 6d5e44c6390e8cf3028994a1b1de590ff3aaf6c2
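A minimal sketch under the assumption that the prototype op takes the desired magnitude response sampled from 0 Hz to Nyquist and returns time-domain filter coefficients:
```
import torch
from torchaudio.prototype import functional as F

desired_response = torch.linspace(1.0, 0.0, 256)          # a simple low-pass shape
kernel = F.frequency_impulse_response(desired_response)   # time-domain filter taps
```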
-
- 04 Dec, 2022 1 commit
-
-
Zhaoheng Ni authored
Summary: address https://github.com/pytorch/audio/issues/2885 In the `_init_hubert_pretrain_model` method, which initializes the HuBERT pretrain models, `kaiming_normal_` should be applied to `ConvLayerBlock` instead of the `LayerNorm` layer. This PR fixes it and adds more unit tests. Pull Request resolved: https://github.com/pytorch/audio/pull/2886 Reviewed By: hwangjeff Differential Revision: D41713801 Pulled By: nateanl fbshipit-source-id: ed199baf7504d06bbf2d31c522ae708a75426a2d
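An illustrative sketch of the corrected behavior (hypothetical helper, not the actual torchaudio code): Kaiming initialization targets the convolution weights of the feature extractor rather than the LayerNorm parameters:
```
import torch.nn as nn

def init_feature_extractor(module: nn.Module) -> None:
    # Apply Kaiming init to convolution weights only, leaving LayerNorm untouched.
    for m in module.modules():
        if isinstance(m, nn.Conv1d):
            nn.init.kaiming_normal_(m.weight)
```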
-
- 02 Dec, 2022 1 commit
-
-
hwangjeff authored
Summary: Adds pre-emphasis and de-emphasis functions. Pull Request resolved: https://github.com/pytorch/audio/pull/2871 Reviewed By: carolineechen Differential Revision: D41651097 Pulled By: hwangjeff fbshipit-source-id: 7a3cf6ce68b6ce1b9ae315ddd8bd8ed71acccdf1
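A minimal sketch; pre-emphasis computes y[n] = x[n] - coeff * x[n - 1] and de-emphasis inverts it. The exact namespace (functional vs. prototype) and signatures are assumptions here:
```
import torch
import torchaudio.functional as F

waveform = torch.randn(1, 16000)
emphasized = F.preemphasis(waveform, coeff=0.97)
restored = F.deemphasis(emphasized, coeff=0.97)
```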
-
- 30 Nov, 2022 2 commits
-
-
hwangjeff authored
Summary: Adds functions and transforms for speed and speed perturbation (https://www.isca-speech.org/archive/interspeech_2015/ko15_interspeech.html). Pull Request resolved: https://github.com/pytorch/audio/pull/2829 Reviewed By: xiaohui-zhang Differential Revision: D41285114 Pulled By: hwangjeff fbshipit-source-id: 114740507698e01f35d4beb2c568a2479e847506
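A minimal sketch of the new transforms, assuming their documented interfaces (Speed applies a fixed factor, SpeedPerturbation samples one of the given factors per call); the namespace and return layout are assumptions:
```
import torch
import torchaudio.transforms as T

waveform = torch.randn(1, 16000)
speed = T.Speed(orig_freq=16000, factor=1.1)
perturb = T.SpeedPerturbation(orig_freq=16000, factors=[0.9, 1.0, 1.1])
sped_up, sped_up_lengths = speed(waveform)        # lengths returned alongside the output
perturbed, perturbed_lengths = perturb(waveform)
```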
-
Andreas Floros authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2873 The original fairseq implementation has an extra layer normalization preprocessing step for large/xlarge models. https://github.com/facebookresearch/fairseq/blob/fcca32258c8e8bcc9f9890bf4714fa2f96b6b3e1/fairseq/data/audio/hubert_dataset.py#L355-L357 This commit modifies the pre-trained model bundles to include this preprocessing for the impacted pre-trained models listed below. For the sake of keeping the interface identical to the other models, and since the additional preprocessing is rather simple, the returned pre-trained model instance is modified to include the preprocessing, instead of adding a separate preprocessing method. - WAV2VEC2_LARGE_LV60K - WAV2VEC2_ASR_LARGE_LV60K_10M - WAV2VEC2_ASR_LARGE_LV60K_100H - WAV2VEC2_ASR_LARGE_LV60K_960H - WAV2VEC2_XLSR53 - HUBERT_LARGE - HUBERT_XLARGE - HUBERT_ASR_LARGE - HUBERT_ASR_XLARGE - WAVLM_LARGE Reviewed By: nateanl Differential Revision: D41520183 fbshipit-source-id: 83d72fe692e8b9fc25df144deb4ca946fcd09615
-
- 29 Nov, 2022 5 commits
-
-
moto authored
Summary: This commit adds `sinc_impulse_response`, which generates windowed-sinc low-pass filters for given cutoff frequencies. Example usage: - [Filter Design Tutorial](https://output.circle-artifacts.com/output/job/c0085baa-5345-4aeb-bd44-448034caa9e1/artifacts/0/docs/tutorials/filter_design_tutorial.html) Pull Request resolved: https://github.com/pytorch/audio/pull/2875 Reviewed By: carolineechen Differential Revision: D41586631 Pulled By: mthrok fbshipit-source-id: a9991dbe5b137b0b4679228ec37072a1da7e50bb
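A minimal sketch, assuming cutoff frequencies are specified as fractions of the Nyquist frequency and one windowed-sinc low-pass kernel is returned per cutoff:
```
import torch
from torchaudio.prototype import functional as F

cutoffs = torch.tensor([0.1, 0.25, 0.5])                     # relative to Nyquist
kernels = F.sinc_impulse_response(cutoffs, window_size=513)  # one kernel per cutoff
```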
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2878 Reviewed By: carolineechen Differential Revision: D41587081 Pulled By: mthrok fbshipit-source-id: da7f3647083a3566ce94070ce2bd30bf99e1db76
-
moto authored
Summary: This commit adds the tutorial for additive synthesis, using torchaudio's prototype DSP ops. [Review here](https://output.circle-artifacts.com/output/job/3dc83322-832a-4272-9c13-df752c97b660/artifacts/0/docs/tutorials/additive_synthesis_tutorial.html) Pull Request resolved: https://github.com/pytorch/audio/pull/2877 Reviewed By: carolineechen Differential Revision: D41585425 Pulled By: mthrok fbshipit-source-id: b81283b90e4779c8054fd030a1d8c3d39d676bbd
-
moto authored
Summary: Currently, fftconvolve only accepts tensors with exactly the same leading dimensions. This commit loosens the restriction to allow shapes that are broadcastable. This makes the fftconvolve operation more efficient for cases like signal filtering, where one operand (the waveform) is larger than the other (the filter kernel) and the same filter kernels are applied across channels and batches. Pull Request resolved: https://github.com/pytorch/audio/pull/2874 Reviewed By: carolineechen Differential Revision: D41581588 Pulled By: mthrok fbshipit-source-id: c0117e11b979fb53236cc307a970a461b0e50134
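A minimal sketch of the relaxed shape requirement: a single filter kernel broadcast across the batch and channel dimensions of the waveform:
```
import torch
import torchaudio.functional as F

waveform = torch.randn(4, 2, 16000)  # (batch, channel, time)
kernel = torch.randn(1, 1, 512)      # one kernel shared across batch and channel
filtered = F.fftconvolve(waveform, kernel, mode="full")
```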
-
Caroline Chen authored
Summary: modeled after [paper](https://arxiv.org/pdf/2110.07313.pdf) and internal flow f288347302 internal comparison tests: D40080919 Pull Request resolved: https://github.com/pytorch/audio/pull/2827 Reviewed By: nateanl Differential Revision: D41569046 Pulled By: carolineechen fbshipit-source-id: 43c5313074af05972d93da55b2029c746b75c380
-
- 28 Nov, 2022 3 commits
-
-
Zhaoheng Ni authored
Summary: - layer_norm in `EmformerEncoder` is set as the default in emformer_hubert_model; change the type to be non-optional. - add `aux_num_out` to emformer_hubert_model to support the fine-tuning model. - update unit tests. Pull Request resolved: https://github.com/pytorch/audio/pull/2868 Reviewed By: carolineechen Differential Revision: D41451311 Pulled By: nateanl fbshipit-source-id: 5fa0f19255e4f01e001d62f8689e36f134030083
-
moto authored
Summary: This commit adds a tutorial for oscillator_bank and adsr_envelope, which will be a basis for DDSP. - [Review here](https://output.circle-artifacts.com/output/job/cf1d3001-88e5-418b-8cf8-ae22b4445dba/artifacts/0/docs/tutorials/oscillator_tutorial.html) Pull Request resolved: https://github.com/pytorch/audio/pull/2862 Reviewed By: carolineechen Differential Revision: D41559503 Pulled By: mthrok fbshipit-source-id: 3f1689186db7d246de14f228fc2f91bf37db98cd
-
moto authored
Summary: Add the `extend_pitch` function, which can be used for augmenting fundamental frequencies with their harmonic overtones or inharmonic partials. It can be used for amplitudes as well. For example usages, see https://output.circle-artifacts.com/output/job/4ad0c29a-d75a-4244-baad-f5499f11d94b/artifacts/0/docs/tutorials/synthesis_tutorial.html Part of https://github.com/pytorch/audio/issues/2835 Extracted from https://github.com/pytorch/audio/issues/2808 Pull Request resolved: https://github.com/pytorch/audio/pull/2863 Reviewed By: carolineechen Differential Revision: D41543880 Pulled By: mthrok fbshipit-source-id: 4f20e55770b0b3bee825ec07c73f9ec7cb181109
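A minimal sketch, assuming the op accepts either a number of harmonics or an explicit list of multipliers, with the generated pitches stacked along the trailing dimension:
```
import torch
from torchaudio.prototype import functional as F

f0 = torch.full((100, 1), 220.0)                 # fundamental frequency per frame
harmonics = F.extend_pitch(f0, 8)                # 220 Hz times 1..8
partials = F.extend_pitch(f0, [1.0, 2.1, 3.4])   # inharmonic multipliers
```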
-
- 19 Nov, 2022 1 commit
-
-
moto authored
Summary: Missing from https://github.com/pytorch/audio/issues/2848 Pull Request resolved: https://github.com/pytorch/audio/pull/2864 Reviewed By: carolineechen Differential Revision: D41413381 Pulled By: mthrok fbshipit-source-id: 4377ed4a59504c6ade9ee6f42938a2bc3f04fb73
-
- 18 Nov, 2022 2 commits
-
-
Zhaoheng Ni authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2836 Reviewed By: carolineechen Differential Revision: D41208630 Pulled By: nateanl fbshipit-source-id: 625e1651f0b8a6e20876409739cf7084cb7c748b
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2865 Reviewed By: carolineechen Differential Revision: D41403756 Pulled By: mthrok fbshipit-source-id: d193caa90e786f08f28e4cc2df4b4fb77aa8f592
-
- 17 Nov, 2022 4 commits
-
-
hwangjeff authored
Summary: Adds API usage logging to MelSpectrogram and Spectrogram. Pull Request resolved: https://github.com/pytorch/audio/pull/2861 Reviewed By: carolineechen Differential Revision: D41384080 Pulled By: hwangjeff fbshipit-source-id: caf4b0fa6e4cc3954384bfdd08a183b90d07d974
-
moto authored
Summary: Add the adsr_envelope op, which generates ADSR envelopes * Supports generation of the envelope on GPU * Supports an optional Hold phase * Supports polynomial decay <img src='https://download.pytorch.org/torchaudio/doc-assets/adsr_examples.png'> Pull Request resolved: https://github.com/pytorch/audio/pull/2859 Reviewed By: nateanl Differential Revision: D41379601 Pulled By: mthrok fbshipit-source-id: 3717a6e0360d2a24913c2a836c57c5edec1d7b31
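A minimal sketch, assuming the attack/hold/decay/release arguments are given as fractions of the total number of frames and `n_decay` selects the polynomial decay order; the exact signature is an assumption:
```
from torchaudio.prototype import functional as F

envelope = F.adsr_envelope(
    num_frames=1000,
    attack=0.1, hold=0.1, decay=0.2, sustain=0.6, release=0.2,
    n_decay=2,  # polynomial decay order
)
```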
-
vasiliy authored
Summary: This code was added by https://github.com/pytorch/audio/commit/4d0095a528412cfec2a549204fc01d9ebb15df7a Seems that the original code had a typo? Pull Request resolved: https://github.com/pytorch/audio/pull/2858 Test Plan: ``` // the import of `mustc` now succeeds, previously crashed python examples/asr/emformer_rnnt/global_stats.py --model-type librispeech --dataset-path /home/vasiliy/local/librispeech/ ``` Reviewed By: carolineechen Differential Revision: D41355663 Pulled By: nateanl fbshipit-source-id: 92507e529d41b984b9dd400ad24a55d130372b7d
-
moto authored
Summary: This commit adds the `oscillator_bank` op, which is the core of (differentiable) digital signal processing ops. The implementation itself is pretty simple: sum instantaneous frequencies into phase, take the sine, and multiply with amplitudes. Following the magenta implementation, amplitudes for frequencies outside of [-Nyquist, Nyquist] are suppressed. The differentiability is tested within the frequency range of [-Nyquist, Nyquist] and the amplitude range of [-5, 5], which should be enough. For example usages: - https://output.circle-artifacts.com/output/job/129f3e21-41ce-406b-bc6b-833efb3c3141/artifacts/0/docs/tutorials/oscillator_tutorial.html - https://output.circle-artifacts.com/output/job/129f3e21-41ce-406b-bc6b-833efb3c3141/artifacts/0/docs/tutorials/synthesis_tutorial.html Part of https://github.com/pytorch/audio/issues/2835 Extracted from https://github.com/pytorch/audio/issues/2808 Pull Request resolved: https://github.com/pytorch/audio/pull/2848 Reviewed By: carolineechen Differential Revision: D41353075 Pulled By: mthrok fbshipit-source-id: 80e60772fb555760f2396f7df40458803c280225
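A minimal sketch following the description above: per-frame frequencies are integrated into phase, the sine is taken, scaled by per-frame amplitudes, and the bank is summed over oscillators; the shapes and argument names are assumptions:
```
import torch
from torchaudio.prototype import functional as F

sample_rate = 8000
num_frames, num_osc = sample_rate, 3  # one second, three harmonics of 440 Hz
freqs = torch.full((num_frames, num_osc), 440.0) * torch.tensor([1.0, 2.0, 3.0])
amps = torch.full((num_frames, num_osc), 0.3)
waveform = F.oscillator_bank(freqs, amps, sample_rate=sample_rate)
```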
-
- 16 Nov, 2022 2 commits
-
-
Zhaoheng Ni authored
Summary: address https://github.com/pytorch/audio/issues/2847 In mixed precision training, the dtype of `mask_embedding` is **not** converted to fp16 automatically. This PR addresses the issue by casting `mask_embedding` to the dtype of `x`, enabling mixed precision training. Pull Request resolved: https://github.com/pytorch/audio/pull/2854 Reviewed By: carolineechen Differential Revision: D41343486 Pulled By: nateanl fbshipit-source-id: 4a5cbb429ff8ba5d3c439a3d5acb5094f66bf705
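A toy module illustrating the fix (not the actual torchaudio code): the learned mask embedding is cast to the dtype of the incoming features so fp16/autocast activations do not clash with an fp32 parameter:
```
import torch
import torch.nn as nn

class MaskedFeatures(nn.Module):
    def __init__(self, dim: int) -> None:
        super().__init__()
        self.mask_embedding = nn.Parameter(torch.zeros(dim).uniform_())

    def forward(self, x: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        # Cast the embedding to x's dtype so the assignment works under autocast/fp16.
        x = x.clone()
        x[mask] = self.mask_embedding.to(dtype=x.dtype)
        return x
```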
-
Zhaoheng Ni authored
Summary: - `_get_fileids_paths` in the `LibriLightLimited` dataset was changed in https://github.com/pytorch/audio/issues/2653: absolute paths became relative paths. This PR fixes the usage in the HuBERT fine-tuning recipe to get the correct audio paths. - model options should be `hubert_pretrain_large` and `hubert_pretrain_xlarge` instead of `hubert_large` and `hubert_xlarge`. - The input dimension of the CTC linear layer varies depending on the model architecture; update it in the lightning module. cc simpleoier Pull Request resolved: https://github.com/pytorch/audio/pull/2851 Reviewed By: carolineechen Differential Revision: D41327998 Pulled By: nateanl fbshipit-source-id: f92248ee84ec860b4e4dbef880c5794b338e1e2d
-
- 15 Nov, 2022 3 commits
-
-
Grigory Sizov authored
Summary: Closes T136364380, follow-up to https://github.com/pytorch/audio/issues/2822 - Added "base", "base+", and "large" bundles for WavLM - Expanded `wav2vec2_pipeline_test.py` to include the new bundles - Added the new bundles to docs in `pipelines.rst` Pull Request resolved: https://github.com/pytorch/audio/pull/2833 Reviewed By: nateanl Differential Revision: D41194796 Pulled By: sgrigory fbshipit-source-id: bf8e96c05b6a81ac5c5a014c46adeeac12685328
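A minimal sketch of using one of the new bundles, assuming the standard wav2vec2-style pipeline interface:
```
import torch
import torchaudio

bundle = torchaudio.pipelines.WAVLM_BASE  # also WAVLM_BASE_PLUS, WAVLM_LARGE
model = bundle.get_model()
waveform = torch.randn(1, bundle.sample_rate)  # one second of audio
with torch.inference_mode():
    features, lengths = model.extract_features(waveform)
```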
-
Grigory Sizov authored
Summary: Closes T137506059 Replaces functional multi-head attention in `WavLMSelfAttention` with the module `torch.nn.MultiheadAttention`. The reason is that the latter uses a native CPU/CUDA implementation ([BetterTransformer](https://pytorch.org/blog/a-better-transformer-for-fast-transformer-encoder-inference/)) under certain conditions and can achieve significant speedup. It also simplifies the code in `WavLMSelfAttention`. Note: the definition of the `bias` parameter in `WavLMSelfAttention.forward` has changed slightly, because `torch.nn.MultiheadAttention` has no parameter controlling the presence of bias for the projections of `k`, `v`, and `q` independently. In WavLM we only use `bias=True`, so this won't have any effect on users of WavLM or on tests. Pull Request resolved: https://github.com/pytorch/audio/pull/2842 Reviewed By: nateanl Differential Revision: D41186166 Pulled By: sgrigory fbshipit-source-id: e791c68106ad89f96c1abf046de699cb8ec7b595
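A generic sketch of the module now used internally; with batch-first inputs and `need_weights=False`, `nn.MultiheadAttention` can dispatch to the fused "BetterTransformer" fast path under certain conditions:
```
import torch
import torch.nn as nn

attn = nn.MultiheadAttention(embed_dim=768, num_heads=12, batch_first=True)
x = torch.randn(2, 100, 768)  # (batch, time, feature)
out, _ = attn(x, x, x, need_weights=False)
```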
-
moto authored
Summary: * Add the new official torchaudio logo to the documentation/README. * Add a page for downloading the logo. https://output.circle-artifacts.com/output/job/e9eb1292-7c10-4fef-adc3-ad568802aa59/artifacts/0/docs/index.html <img width="1068" alt="Screen Shot 2022-11-14 at 10 30 27 AM" src="https://user-images.githubusercontent.com/855818/201738349-9e248f15-dce2-4931-9066-aa898a53d6ad.png"> https://output.circle-artifacts.com/output/job/e9eb1292-7c10-4fef-adc3-ad568802aa59/artifacts/0/docs/logo.html <img width="617" alt="Screen Shot 2022-11-14 at 10 30 47 AM" src="https://user-images.githubusercontent.com/855818/201738420-ad0fda2f-f310-4802-851c-bbdf6c84c045.png"> Pull Request resolved: https://github.com/pytorch/audio/pull/2802 Reviewed By: carolineechen Differential Revision: D41295277 Pulled By: mthrok fbshipit-source-id: 6615d00799c9611f875e8485459d800e350b3486
-
- 14 Nov, 2022 2 commits
-
-
moto authored
Summary: Removing the LTS mention and packages from the README as LTS is discontinued. Pull Request resolved: https://github.com/pytorch/audio/pull/2844 Reviewed By: hwangjeff, xiaohui-zhang Differential Revision: D41200886 Pulled By: mthrok fbshipit-source-id: 0da0afe68df51826075ce945cf0cf1de901e1c8f
-
Caroline Chen authored
Summary: follow up to https://github.com/pytorch/audio/issues/2823 - move bark spectrogram to prototype - decrease autograd test tolerance (passing on circle ci) - add diagram for bark fbanks cc jdariasl Pull Request resolved: https://github.com/pytorch/audio/pull/2843 Reviewed By: nateanl Differential Revision: D41199522 Pulled By: carolineechen fbshipit-source-id: 8e6c2e20fb7b14f39477683b3c6ed8356359a213
-
- 13 Nov, 2022 1 commit
-
-
Zhaoheng Ni authored
Summary: address https://github.com/pytorch/audio/issues/2845 Pull Request resolved: https://github.com/pytorch/audio/pull/2846 Reviewed By: carolineechen Differential Revision: D41251624 Pulled By: nateanl fbshipit-source-id: 1a363d2314d6a452f35c109b9730da64ada5a2fd
-
- 11 Nov, 2022 1 commit
-
-
DanilBaibak authored
Summary: Added missing build workflows for MacOS and Linux: - [x] Linux conda - [x] MacOS conda This does not change the existing builds/uploads in CircleCI, and should not break any existing jobs/workflows. This just adds back workflows for the MacOS and Linux conda builds with Nova. We will create a workflow (most likely in test-infra) that compares the binaries to ensure there is parity between them before we start uploading with Nova. Pull Request resolved: https://github.com/pytorch/audio/pull/2800 Reviewed By: osalpekar Differential Revision: D41181467 Pulled By: DanilBaibak fbshipit-source-id: a5c5d4dcfdd778b4045203f6016c20fb42daa01b
-
- 10 Nov, 2022 2 commits
-
-
moto authored
Summary: Currently `discard_before_pts=-1` is used to indicate that no AVFrame should be skipped. It was reported that some corrupted videos can have a constant negative pts value. The behavior is technically undefined for such corrupted data, but all AVFrames should still be decoded as long as `seek` is not used. This commit changes the decoder so that it processes AVFrames when `discard_before_pts == -1`, regardless of the AVFrame::pts value. Pull Request resolved: https://github.com/pytorch/audio/pull/2841 Reviewed By: hwangjeff Differential Revision: D41174442 Pulled By: mthrok fbshipit-source-id: e9d2fab4b0e2bc47146eda8e1dd377a74c087590
-
Omkar Salpekar authored
Summary: Adding Nova Reusable Workflow for M1 Wheels Build. Once this has been running well for a while, we can replace the old `build-m1-binaries.yml` workflow. Pull Request resolved: https://github.com/pytorch/audio/pull/2839 Reviewed By: DanilBaibak Differential Revision: D41195316 Pulled By: osalpekar fbshipit-source-id: f3754043f384b1645e5fcfaebf465f6839f72461
-