- 12 May, 2022 1 commit
Zhaoheng Ni authored
Summary:
- When cropping the waveform and its corresponding labels, we use the formula `torch.div(audio_start - kernel_size * sample_rate, stride * sample_rate, rounding_mode="floor")` to align the audio start and label start indices. However, the value can sometimes be negative, which results in an empty label. Such a training example hurts performance after zero-padding, since the labels are all zero for the input waveform. This PR fixes the bug by checking whether `label_start` is negative and setting it to zero if so.
- If `pad` is True, `length` should be the length of each waveform rather than the maximum length in the batch. Fix it so the model ignores the padded portion in pre-training.

Pull Request resolved: https://github.com/pytorch/audio/pull/2296
Reviewed By: mthrok
Differential Revision: D36323217
Pulled By: nateanl
fbshipit-source-id: 1ffa71e39bbc0e8dee55c3b829911bc2e785b423
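A minimal sketch of the clamping fix, using the formula quoted in the commit message; the standalone function and its name are illustrative, not the actual dataset code:

```python
import torch

def align_label_start(audio_start: torch.Tensor, kernel_size: int, stride: int, sample_rate: int) -> torch.Tensor:
    """Map an audio crop offset to the corresponding label start index (illustrative sketch)."""
    label_start = torch.div(
        audio_start - kernel_size * sample_rate,
        stride * sample_rate,
        rounding_mode="floor",
    )
    # The fix: a negative start index would slice an empty label,
    # so clamp it to zero before cropping the label sequence.
    return torch.clamp(label_start, min=0)
```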
- 22 Apr, 2022 1 commit
Zhaoheng Ni authored
Summary:
When using a customized `batch_sampler`, pytorch_lightning can't wrap its distributed sampler around it. Hence we provide a `DistributedBatchSampler` that supports `BucketizeBatchSampler` in `ddp` mode. The `DistributedBatchSampler` assumes `BucketizeBatchSampler.iter_list` is a list of lists, where each sub-list contains a batch of indices. Setting `shuffle` to `True` shuffles the lists based on `seed` and the current `epoch`. The shuffle happens only at initialization and won't change unless the user resets it. The reason is that shuffling can give `BucketizeBatchSampler` a different length than before, so shuffling in `__iter__` may cause a mismatch between `__len__` and the real length. Hence users need to set `reload_dataloaders_every_n_epochs=1` in pytorch_lightning's `Trainer`; then `__len__` and the real length stay in sync.

Pull Request resolved: https://github.com/pytorch/audio/pull/2299
Reviewed By: hwangjeff
Differential Revision: D35781538
Pulled By: nateanl
fbshipit-source-id: 6e8396615497f1aeddab1ee5678830c0445c2b2a
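A hedged sketch of the idea, not the actual torchaudio implementation: shuffle the list of batches once at construction, seeded by `seed` and `epoch`, and give each rank an interleaved share of them.

```python
import torch

class DistributedBatchSampler:
    """Illustrative sketch: partition a bucketized batch sampler's batches across ranks."""

    def __init__(self, batch_sampler, num_replicas, rank, shuffle=True, seed=0, epoch=0):
        # `iter_list` is assumed to be a list of lists, one sub-list per batch of indices.
        batches = list(batch_sampler.iter_list)
        if shuffle:
            # Shuffle once here, not in __iter__, so __len__ stays consistent.
            g = torch.Generator()
            g.manual_seed(seed + epoch)
            batches = [batches[i] for i in torch.randperm(len(batches), generator=g).tolist()]
        # Each rank takes every `num_replicas`-th batch, starting at its own rank.
        self.batches = batches[rank::num_replicas]

    def __iter__(self):
        return iter(self.batches)

    def __len__(self):
        return len(self.batches)
```

Because the shuffle is fixed at construction, pairing this with `reload_dataloaders_every_n_epochs=1` in the Lightning `Trainer` recreates the sampler each epoch, which is what keeps `__len__` accurate.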
- 22 Jan, 2022 1 commit
Zhaoheng Ni authored
Summary:
- Rename `BucketizeSampler` to `BucketizeBatchSampler`.
- Fix bugs in `BucketizeBatchSampler`.
- Adjust `HuBERTDataset` to work with the latest `BucketizeBatchSampler`.

Pull Request resolved: https://github.com/pytorch/audio/pull/2150
Reviewed By: mthrok
Differential Revision: D33689963
Pulled By: nateanl
fbshipit-source-id: 203764e9af5b7577ba08ebaa30ba5da3b67fb7e7
- 06 Jan, 2022 1 commit
Elijah Rippeth authored
Summary: This PR:
- Replaces the `data_source` argument with `lengths`.
- Adds a `shuffle` argument to decide whether to shuffle the samples in the buckets.
- Adds `max_len` and `min_len` to filter out samples that are longer than `max_len` or shorter than `min_len`.

cc nateanl

Pull Request resolved: https://github.com/pytorch/audio/pull/2147
Reviewed By: carolineechen
Differential Revision: D33454369
Pulled By: nateanl
fbshipit-source-id: 3835169ec7f808f8dd9650e7f183f79091efe886
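A small sketch of the length filtering described above, assuming `lengths` holds per-sample lengths in samples (the values are made up):

```python
lengths = [12000, 250000, 480000, 90000]  # hypothetical sample lengths
min_len, max_len = 16000, 400000

# Keep only the indices whose lengths fall inside [min_len, max_len];
# out-of-range samples never enter a bucket.
kept = [i for i, n in enumerate(lengths) if min_len <= n <= max_len]
assert kept == [1, 3]
```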
- 23 Dec, 2021 1 commit
Joao Gomes authored
Summary:
Run: `arc lint --apply-patches --paths-cmd 'hg files -I "./**/*.py"'`

Pull Request resolved: https://github.com/pytorch/audio/pull/2096
Reviewed By: mthrok
Differential Revision: D33297351
fbshipit-source-id: 7bf5956edf0717c5ca90219f72414ff4eeaf5aa8
- 10 Dec, 2021 1 commit
nateanl authored
Summary:
The PR adds a PyTorch Lightning-based training script for the HuBERT Base model. There are two iterations of pre-training and one iteration of ASR fine-tuning on the LibriSpeech dataset.

Pull Request resolved: https://github.com/pytorch/audio/pull/2000
Reviewed By: carolineechen
Differential Revision: D33021467
Pulled By: nateanl
fbshipit-source-id: 77fe5a751943b56b63d5f1fb4e6ef35946e081db
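For orientation, a minimal LightningModule skeleton of the kind such a script builds on; the class name and loss handling are hypothetical, and the real recipe lives in the repository's HuBERT example:

```python
import pytorch_lightning as pl
import torch

class HuBERTPreTrainModule(pl.LightningModule):
    """Hypothetical skeleton; the actual training logic lives in the example recipe."""

    def __init__(self, model, learning_rate=5e-4):
        super().__init__()
        self.model = model
        self.learning_rate = learning_rate

    def training_step(self, batch, batch_idx):
        # Assume the wrapped model computes and returns the pre-training loss.
        loss = self.model(*batch)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=self.learning_rate)
```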