- 02 Sep, 2024 1 commit
mayp777 authored
- 16 Nov, 2022 2 commits
Zhaoheng Ni authored
Summary: Addresses https://github.com/pytorch/audio/issues/2847. In mixed precision training, the dtype of `mask_embedding` is **not** converted to fp16 automatically. This PR fixes the issue by casting `mask_embedding` to the dtype of `x`, enabling mixed precision training (see the sketch below).

Pull Request resolved: https://github.com/pytorch/audio/pull/2854
Reviewed By: carolineechen
Differential Revision: D41343486
Pulled By: nateanl
fbshipit-source-id: 4a5cbb429ff8ba5d3c439a3d5acb5094f66bf705
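A minimal sketch of the kind of cast described above, assuming a module that writes a learned embedding into masked frames; the class name and shapes are illustrative, not the actual torchaudio internals:

```python
import torch
from torch import nn

class MaskedEncoderSketch(nn.Module):
    """Illustrative only: a learned embedding is written into masked frames."""

    def __init__(self, embed_dim: int = 768):
        super().__init__()
        # The parameter is created in fp32; autocast does not convert it in place.
        self.mask_embedding = nn.Parameter(torch.empty(embed_dim).uniform_())

    def forward(self, x: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        # Under autocast, `x` may be fp16 while the parameter stays fp32, and the
        # indexed assignment requires matching dtypes, so cast to `x.dtype` first.
        x[mask] = self.mask_embedding.to(x.dtype)
        return x
```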
Zhaoheng Ni authored
Summary:
- `_get_fileids_paths` in the `LibriLightLimited` dataset was changed in https://github.com/pytorch/audio/issues/2653 so that it returns relative paths instead of absolute paths. This PR fixes the usage in the HuBERT fine-tuning recipe so that the correct audio paths are resolved (see the path-resolution sketch after this commit message).
- The model options should be `hubert_pretrain_large` and `hubert_pretrain_xlarge` instead of `hubert_large` and `hubert_xlarge`.
- The input dimension of the CTC linear layer varies with the model architecture; update it in the lightning module.

cc simpleoier

Pull Request resolved: https://github.com/pytorch/audio/pull/2851
Reviewed By: carolineechen
Differential Revision: D41327998
Pulled By: nateanl
fbshipit-source-id: f92248ee84ec860b4e4dbef880c5794b338e1e2d
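A hypothetical illustration of the path handling, assuming `root` is the dataset root and `fileid_path` is the relative path now returned by `_get_fileids_paths`; the function name and paths are illustrative, not the recipe's actual code:

```python
import os

def resolve_audio_path(root: str, fileid_path: str) -> str:
    # After the dataset change, `fileid_path` is relative to `root`,
    # so the recipe must join the two instead of using it as-is.
    return os.path.join(root, fileid_path)

# Hypothetical usage:
# resolve_audio_path("/datasets/librilight_limited", "relative/path/to/audio.flac")
```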
- 12 Oct, 2022 1 commit
Zhaoheng Ni authored
Summary: Follow-up to https://github.com/pytorch/audio/issues/2716.

- Preprocessing
  - The HuBERT features take a lot of memory, which may not fit on some machines. Enable training the k-means model on a subset of the features.
- Pre-training
  - Normalize the loss by the total number of masked frames across all GPUs (see the sketch after this commit message).
  - Use mixed precision training; fp16 is not well supported in pytorch_lightning.
  - Log accuracies of masked/unmasked frames during training.
  - Clip the gradients with norm `10.0`.
- ASR fine-tuning
  - Normalize the loss by the total number of batches across all GPUs, as in the conformer recipe of TorchAudio.
  - Use mixed precision training.
  - Append "|" to the end of each transcription to capture silence/word termination, as in the fairseq recipe.
  - Update the WER results on the LibriSpeech dev and test sets.

|            | WER% (Viterbi) | WER% (KenLM) |
|:----------:|---------------:|-------------:|
| dev-clean  | 10.9           | 4.2          |
| dev-other  | 17.5           | 9.4          |
| test-clean | 10.9           | 4.4          |
| test-other | 17.8           | 9.5          |

Pull Request resolved: https://github.com/pytorch/audio/pull/2744
Reviewed By: carolineechen
Differential Revision: D40282322
Pulled By: nateanl
fbshipit-source-id: 4723584c912e70e8970149fe09de005385eaab90
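A rough sketch of the cross-GPU loss normalization and the trainer flags described above, under the assumption that DDP averages gradients across ranks; the function name and the surrounding configuration are illustrative, not the recipe's actual code:

```python
import torch
import torch.distributed as dist
from pytorch_lightning import Trainer

def normalize_loss_over_gpus(loss_sum: torch.Tensor, num_masked_frames: torch.Tensor) -> torch.Tensor:
    """Divide the summed loss by the total number of masked frames on all GPUs."""
    if dist.is_available() and dist.is_initialized():
        # Share the total masked-frame count across ranks.
        dist.all_reduce(num_masked_frames, op=dist.ReduceOp.SUM)
        # DDP averages gradients over ranks, so scale the local loss back up
        # to keep the effective objective equal to total_loss / total_frames.
        loss_sum = loss_sum * dist.get_world_size()
    return loss_sum / num_masked_frames

# Trainer flags matching the description above; everything else is assumed.
trainer = Trainer(
    precision=16,            # mixed precision training
    gradient_clip_val=10.0,  # clip gradient norm at 10.0
)
```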
- 28 Jul, 2022 1 commit
Zhaoheng Ni authored
Summary:
- The optimizer in the fine-tuning recipe should also be `AdamW`; see https://github.com/pytorch/audio/pull/2412 and the sketch below.
- Fix the import of `DistributedBatchSampler` in the HuBERT dataset.
- Fix `dataset_path` in the fine-tuning module.

Pull Request resolved: https://github.com/pytorch/audio/pull/2588
Reviewed By: carolineechen
Differential Revision: D38243423
Pulled By: nateanl
fbshipit-source-id: badc88ce9eddfd71270201a65ae89433fae2733f
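A minimal sketch of switching the optimizer to `AdamW` in a LightningModule; the learning rate and betas are placeholders, not the recipe's hyperparameters:

```python
import torch
from pytorch_lightning import LightningModule

class FineTuneModuleSketch(LightningModule):
    # Only the optimizer choice matters here; hyperparameters are illustrative.
    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=1e-5, betas=(0.9, 0.98))
```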
- 07 Jun, 2022 1 commit
Zhaoheng Ni authored
Summary: This PR contains the CTC fine-tuning recipe for the HuBERT Base model (a minimal CTC sketch follows below). The files include:
- the lightning module
- the training script
- the README and the result table
- the evaluation scripts

Pull Request resolved: https://github.com/pytorch/audio/pull/2352
Reviewed By: hwangjeff
Differential Revision: D36915712
Pulled By: nateanl
fbshipit-source-id: 0249635ad5e81a8aa2d228c1d5fe84d78b62a15b
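A minimal sketch of the CTC objective used in such a recipe; the encoder output is faked with random features, and the 768-dimensional encoder width and 29-symbol character set (28 characters plus blank) are assumptions for illustration:

```python
import torch
from torch import nn

torch.manual_seed(0)
batch, frames, enc_dim, vocab = 2, 100, 768, 29

encoder_out = torch.randn(batch, frames, enc_dim)    # stand-in for HuBERT features
aux = nn.Linear(enc_dim, vocab)                      # per-frame character logits
log_probs = aux(encoder_out).log_softmax(dim=-1)     # (batch, frames, vocab)

targets = torch.randint(1, vocab, (batch, 20))       # fake character targets
input_lengths = torch.full((batch,), frames)
target_lengths = torch.full((batch,), 20)

ctc = nn.CTCLoss(blank=0, zero_infinity=True)
# nn.CTCLoss expects log-probabilities shaped (frames, batch, vocab).
loss = ctc(log_probs.transpose(0, 1), targets, input_lengths, target_lengths)
print(loss.item())
```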
- 26 May, 2022 1 commit
nateanl authored
- 23 May, 2022 1 commit
Zhaoheng Ni authored
Summary: Replaces https://github.com/pytorch/audio/issues/2129.

Pull Request resolved: https://github.com/pytorch/audio/pull/2198
Reviewed By: carolineechen
Differential Revision: D36544163
Pulled By: nateanl
fbshipit-source-id: 3f19ba5b0f2c2b9e93b0603c3b4491c1dbc40ef8