"vscode:/vscode.git/clone" did not exist on "fed7f9ffc034d5f97a5660bafbdae0d2eaf8617b"
- 12 Oct, 2022 1 commit
Zhaoheng Ni authored
Summary: Following PR https://github.com/pytorch/audio/issues/2716

- For preprocessing
  - The HuBERT features take a lot of memory, which may not fit on some machines. Allow training the k-means model on a subset of the features.
- For pre-training
  - Normalize the loss based on the total number of masked frames across all GPUs.
  - Use mixed precision training; fp16 is not well supported in pytorch_lightning.
  - Log accuracies of masked/unmasked frames during training.
  - Clip the gradients with norm `10.0`.
- For ASR fine-tuning
  - Normalize the loss based on the total number of batches across all GPUs, same as in the conformer recipe of TorchAudio.
  - Use mixed precision training.
  - Add "|" after the end of each transcription to capture the silence/word termination, same as in the fairseq recipe.
- Update the WER results on LibriSpeech dev and test sets.

|            | WER% (Viterbi) | WER% (KenLM) |
|:----------:|---------------:|-------------:|
| dev-clean  |           10.9 |          4.2 |
| dev-other  |           17.5 |          9.4 |
| test-clean |           10.9 |          4.4 |
| test-other |           17.8 |          9.5 |

Pull Request resolved: https://github.com/pytorch/audio/pull/2744 Reviewed By: carolineechen Differential Revision: D40282322 Pulled By: nateanl fbshipit-source-id: 4723584c912e70e8970149fe09de005385eaab90
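The loss-normalization point above is the most implementation-sensitive part of the pre-training changes. Below is a minimal, hypothetical PyTorch sketch of normalizing a per-rank loss by the global number of masked frames under DDP; the function and variable names are illustrative and this is not the recipe's actual code. The gradient-clipping change can be expressed in pytorch_lightning via `Trainer(gradient_clip_val=10.0)`.

```python
# Hypothetical sketch (not the recipe's code): normalize the masked-prediction
# loss by the total number of masked frames across all GPUs under DDP.
import torch
import torch.distributed as dist

def normalize_masked_loss(loss_sum: torch.Tensor, num_masked: torch.Tensor) -> torch.Tensor:
    """loss_sum: loss summed over this rank's masked frames.
    num_masked: number of masked frames on this rank."""
    total_masked = num_masked.clone()
    if dist.is_available() and dist.is_initialized():
        # Count masked frames over all ranks.
        dist.all_reduce(total_masked, op=dist.ReduceOp.SUM)
        # DDP averages gradients across ranks, so scaling by world_size makes the
        # effective objective equal to sum(loss) / total_masked over all GPUs.
        return loss_sum * dist.get_world_size() / total_masked
    return loss_sum / total_masked
```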
- 28 Jul, 2022 1 commit
Zhaoheng Ni authored
Summary:
- The optimizer in the fine-tuning recipe should also be `AdamW`. See https://github.com/pytorch/audio/pull/2412
- Fix the import of `DistributedBatchSampler` in the HuBERT dataset.
- Fix `dataset_path` in the fine-tuning module.

Pull Request resolved: https://github.com/pytorch/audio/pull/2588 Reviewed By: carolineechen Differential Revision: D38243423 Pulled By: nateanl fbshipit-source-id: badc88ce9eddfd71270201a65ae89433fae2733f
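As a quick illustration of the first fix, switching to `AdamW` (decoupled weight decay) only changes the optimizer construction. The module and hyperparameter values below are placeholders, not the recipe's settings.

```python
# Illustrative only: use AdamW instead of Adam for fine-tuning.
import torch

head = torch.nn.Linear(768, 29)  # stand-in for a fine-tuning CTC head
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-5, weight_decay=1e-2)
```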
- 07 Jun, 2022 1 commit
Zhaoheng Ni authored
Summary: The PR contains the CTC fine-tuning recipe for the HuBERT Base model. The files include:
- lightning module
- training script
- README and the result table
- evaluation scripts

Pull Request resolved: https://github.com/pytorch/audio/pull/2352 Reviewed By: hwangjeff Differential Revision: D36915712 Pulled By: nateanl fbshipit-source-id: 0249635ad5e81a8aa2d228c1d5fe84d78b62a15b
- 15 May, 2022 1 commit
John Reese authored
Summary: Applies new import merging and sorting from µsort v1.0. When merging imports, µsort will make a best effort to move associated comments to match merged elements, but there are known limitations due to the dynamic nature of Python and developer tooling. These changes should not produce any dangerous runtime changes, but may require touch-ups to satisfy linters and other tooling. Note that µsort uses case-insensitive, lexicographical sorting, which results in a different ordering compared to isort. This provides a more consistent sorting order, matching the case-insensitive order used when sorting import statements by module name, and ensures that "frog", "FROG", and "Frog" always sort next to each other. For details on µsort's sorting and merging semantics, see the user guide: https://usort.readthedocs.io/en/stable/guide.html#sorting Reviewed By: lisroach Differential Revision: D36402214 fbshipit-source-id: b641bfa9d46242188524d4ae2c44998922a62b4c
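The case-insensitive ordering described above can be reproduced with plain Python; the snippet below only illustrates the sorting rule and is not µsort itself.

```python
# Case-insensitive, lexicographical ordering keeps "frog", "FROG", and "Frog"
# next to each other; a case-sensitive sort would separate them.
names = ["FROG", "apple", "frog", "Banana", "Frog"]
print(sorted(names, key=str.casefold))  # ['apple', 'Banana', 'FROG', 'frog', 'Frog']
```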
- 12 May, 2022 1 commit
Zhaoheng Ni authored
Summary:
- When cropping the waveform and the corresponding label, we use the formula `torch.div(audio_start - kernel_size * sample_rate, stride * sample_rate, rounding_mode="floor")` to align the audio start and label start indices. However, the value can sometimes be negative, which results in an empty label. After zero-padding, such a training example hurts performance (i.e., the labels are all zero for the input waveform). This PR fixes the bug by checking whether `label_start` is negative and changing it to zero if so.
- If `pad` is True, `length` should be the length of each waveform instead of the max length. Fix it so the model ignores the padding component in pre-training.

Pull Request resolved: https://github.com/pytorch/audio/pull/2296 Reviewed By: mthrok Differential Revision: D36323217 Pulled By: nateanl fbshipit-source-id: 1ffa71e39bbc0e8dee55c3b829911bc2e785b423
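A minimal sketch of the first fix, built around the formula quoted above; the helper name and argument handling are hypothetical, not the dataset's actual code.

```python
import torch

def label_start_index(audio_start, kernel_size, stride, sample_rate):
    # Floor-division formula quoted in the summary, applied to a scalar tensor.
    label_start = torch.div(
        torch.tensor(audio_start - kernel_size * sample_rate),
        stride * sample_rate,
        rounding_mode="floor",
    )
    # The fix: clamp a negative start to zero so the crop can't yield an empty label.
    return max(int(label_start.item()), 0)
```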
- 22 Apr, 2022 1 commit
Zhaoheng Ni authored
Summary: When using a customized `batch_sampler`, pytorch_lightning can't wrap the distributed sampler around it. Hence we provide a `DistributedBatchSampler` that supports `BucketizeBatchSampler` in `ddp` mode. The `DistributedBatchSampler` assumes `BucketizeBatchSampler.iter_list` is a list of lists, where each sub-list contains a batch of indices. Setting `shuffle` to `True` shuffles the lists based on `seed` and the current `epoch`. Shuffling happens only at initialization and won't change unless the user resets it. The reason is that re-shuffling `BucketizeBatchSampler` may change its length, so shuffling in ``__iter__`` could result in a mismatch between ``__len__`` and the real length. Hence users need to set `reload_dataloaders_every_n_epochs=1` in pytorch_lightning's Trainer, so that the value of ``__len__`` and the real length stay the same. Pull Request resolved: https://github.com/pytorch/audio/pull/2299 Reviewed By: hwangjeff Differential Revision: D35781538 Pulled By: nateanl fbshipit-source-id: 6e8396615497f1aeddab1ee5678830c0445c2b2a
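To make the design concrete, here is a simplified, hypothetical sketch of a distributed batch sampler that partitions pre-built batches across ranks and shuffles them only once at construction from `seed + epoch`. It is not the TorchAudio implementation; the class name and constructor arguments are illustrative.

```python
import torch
import torch.distributed as dist
from torch.utils.data import Sampler

class SimpleDistributedBatchSampler(Sampler):
    def __init__(self, batches, shuffle=False, seed=0, epoch=0):
        rank = dist.get_rank() if dist.is_initialized() else 0
        world_size = dist.get_world_size() if dist.is_initialized() else 1
        if shuffle:
            # Shuffle happens only here; __iter__ never re-shuffles, so __len__
            # always matches the number of batches actually yielded.
            g = torch.Generator()
            g.manual_seed(seed + epoch)
            order = torch.randperm(len(batches), generator=g).tolist()
            batches = [batches[i] for i in order]
        # Each rank keeps every world_size-th batch.
        self.batches = batches[rank::world_size]

    def __iter__(self):
        return iter(self.batches)

    def __len__(self):
        return len(self.batches)
```

Because the order is fixed at construction, reloading the dataloader every epoch (as the summary recommends via `reload_dataloaders_every_n_epochs=1`) rebuilds the sampler with a new `epoch` value, giving a fresh shuffle while keeping ``__len__`` consistent.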
- 22 Jan, 2022 1 commit
Zhaoheng Ni authored
Summary:
- Rename `BucketizeSampler` to `BucketizeBatchSampler`.
- Fix bugs in `BucketizeBatchSampler`.
- Adjust HuBERTDataset based on the latest `BucketizeBatchSampler`.

Pull Request resolved: https://github.com/pytorch/audio/pull/2150 Reviewed By: mthrok Differential Revision: D33689963 Pulled By: nateanl fbshipit-source-id: 203764e9af5b7577ba08ebaa30ba5da3b67fb7e7
- 06 Jan, 2022 1 commit
Elijah Rippeth authored
Summary: This PR:
- Replaces `data_source` with `lengths`.
- Adds a `shuffle` argument to decide whether to shuffle the samples in the buckets.
- Adds `max_len` and `min_len` to filter out samples that are > `max_len` or < `min_len`.

cc nateanl

Pull Request resolved: https://github.com/pytorch/audio/pull/2147 Reviewed By: carolineechen Differential Revision: D33454369 Pulled By: nateanl fbshipit-source-id: 3835169ec7f808f8dd9650e7f183f79091efe886
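A tiny illustration of the `min_len`/`max_len` filtering described in the last bullet, in generic Python; the variable names are placeholders and this is not the sampler's actual code.

```python
# Keep only sample indices whose length falls inside [min_len, max_len].
lengths = [120, 95000, 40, 64000, 250000]
min_len, max_len = 100, 200000

kept = [i for i, length in enumerate(lengths) if min_len <= length <= max_len]
print(kept)  # [0, 1, 3]
```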
- 23 Dec, 2021 1 commit
Joao Gomes authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2096 run: `arc lint --apply-patches --paths-cmd 'hg files -I "./**/*.py"'` Reviewed By: mthrok Differential Revision: D33297351 fbshipit-source-id: 7bf5956edf0717c5ca90219f72414ff4eeaf5aa8
- 10 Dec, 2021 1 commit
nateanl authored
Summary: The PR adds a PyTorch Lightning-based training script for the HuBERT Base model. There are two iterations of pre-training and one iteration of ASR fine-tuning on the LibriSpeech dataset. Pull Request resolved: https://github.com/pytorch/audio/pull/2000 Reviewed By: carolineechen Differential Revision: D33021467 Pulled By: nateanl fbshipit-source-id: 77fe5a751943b56b63d5f1fb4e6ef35946e081db
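For readers unfamiliar with the framework, this is a bare-bones sketch of how a PyTorch Lightning training script is typically organized; the toy module below is a stand-in, not the HuBERT recipe itself.

```python
import torch
import pytorch_lightning as pl

class ToyModule(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.model = torch.nn.Linear(16, 4)  # stand-in for the real encoder

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = torch.nn.functional.cross_entropy(self.model(x), y)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=1e-4)

# trainer = pl.Trainer(max_epochs=1)
# trainer.fit(ToyModule(), train_dataloaders=...)
```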