- 05 Oct, 2021 1 commit
moto authored
- 01 Oct, 2021 1 commit
moto authored
1. Fix the HuBERT xlarge model config. 2. In the 48 transformer layers of the HuBERT xlarge model, very few elements deviate from the equivalent fairseq model by more than the default atol of 1e-5. This commit relaxes it to 3e-5 for that specific test.
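The tolerance change can be illustrated with a small, hedged pure-Python sketch; the helper names and sample values below are illustrative, not the actual test code:

```python
def max_abs_diff(a, b):
    """Largest element-wise deviation between two sequences."""
    return max(abs(x - y) for x, y in zip(a, b))

def check_close(a, b, atol):
    """Mimic an allclose-style assertion with an absolute tolerance."""
    return max_abs_diff(a, b) <= atol

# Hypothetical per-element outputs of the two implementations.
ours = [0.100012, 0.200020, 0.300001]
theirs = [0.100000, 0.200000, 0.300000]

# A couple of elements deviate by ~2e-5: the comparison fails at
# atol=1e-5 but passes once the tolerance is relaxed to 3e-5.
assert not check_close(ours, theirs, atol=1e-5)
assert check_close(ours, theirs, atol=3e-5)
```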
- 30 Sep, 2021 1 commit
moto authored
Writing scripted HuBERT XLarge models fails on Windows CI.
- 29 Sep, 2021 2 commits
moto authored
* Rename factory functions `wav2vec2_asr_ARCH` to `wav2vec2_ft_ARCH`. In #1783, we split the wav2vec2 factory functions into ones for pre-training models and ones for fine-tuning models (pre-training model + extra Linear module). I picked the name scheme `wav2vec2_asr_ARCH` for the fine-tuning factory functions, but it did not feel right, because the architecture code is more generic. Even though the resulting model architecture was used for ASR fine-tuning in the paper, it does not have to be used for ASR. This became more evident as we added pre-trained parameter support, such as #1799. What matters for the weight files is the task and dataset they were trained on; for a factory function, the ASR task is not relevant. Therefore, this renames the functions by replacing `_asr_` with `_ft_` (fine-tuning). Note: since the new functions have not been released yet, this PR itself is not BC-breaking.
moto authored
- 28 Sep, 2021 1 commit
moto authored
This commit adds the following HuBERT model architectures:
- `base` (pre-training)
- `large` (pre-training / fine-tuning)
- `xlarge` (pre-training / fine-tuning)
Since the internal components are the same as `Wav2Vec2Model`, it reuses the existing modules. With these models, it is possible to:
- import the pre-trained models published by `fairseq` and TorchScript them
- fine-tune the existing models for downstream tasks
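The relationship between the HuBERT factories and `Wav2Vec2Model` can be sketched as follows. This is a hedged, simplified stand-in: the class body and hyperparameters are illustrative (layer/width numbers follow the papers), not the real torchaudio signatures:

```python
class Wav2Vec2Model:
    """Simplified stand-in: feature extractor + transformer encoder (+ optional head)."""
    def __init__(self, num_layers, embed_dim, aux=None):
        self.num_layers = num_layers
        self.embed_dim = embed_dim
        self.aux = aux  # e.g. a Linear head for fine-tuning; None for pre-training

def hubert_base():
    # `base` pre-training configuration (illustrative numbers)
    return Wav2Vec2Model(num_layers=12, embed_dim=768)

def hubert_xlarge(aux_num_out=None):
    # `xlarge` supports both pre-training (aux=None) and fine-tuning
    aux = ("linear", aux_num_out) if aux_num_out is not None else None
    return Wav2Vec2Model(num_layers=48, embed_dim=1280, aux=aux)
```

Because HuBERT shares the `Wav2Vec2Model` structure, the factories differ only in configuration, not in module code.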
- 24 Sep, 2021 1 commit
moto authored
* [BC-Breaking] Split pretraining and finetuning factory functions. Previously, the wav2vec2 factory functions only generated the fine-tuning architecture used in the wav2vec2 paper for the ASR task, i.e. the pre-training architecture + a Linear module, and they did not provide a straightforward way to generate architectures for pre-training. The goal of the original implementation was to allow inference of wav2vec2 in non-Python environments via TorchScript. Now we would like to expand it to pre-training / fine-tuning and to the HuBERT model as well, so we need factory functions for both pre-training and fine-tuning. This commit introduces new factory functions, separated by phase:
1. New functions for ASR fine-tuning: we introduce `wav2vec2_asr_XXX` functions, which generate the architecture used for the fine-tuning task in the wav2vec2 paper. *1
2. Re-purpose the old functions: the existing `wav2vec2_XXX` functions now generate the architecture with the pre-training modules only (no Linear module).
Note *1: This architecture is just one way to define an architecture for fine-tuning, not a universal definition. The new `wav2vec2_asr_XXX` functions are designed to provide these specific fine-tuning configurations; they are not meant to support generic architectures for downstream tasks.
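The split described above can be sketched in hedged pseudocode. The function names follow the commit message, but the classes are illustrative stand-ins, not the real modules:

```python
class PretrainModel:
    """Feature extractor + encoder only: what `wav2vec2_XXX` now builds."""
    def __init__(self, arch):
        self.arch = arch
        self.head = None  # no Linear module in the pre-training architecture

class FinetuneModel(PretrainModel):
    """Pre-training architecture + a Linear head, built by `wav2vec2_asr_XXX`."""
    def __init__(self, arch, num_out):
        super().__init__(arch)
        self.head = ("linear", num_out)

def wav2vec2_base():
    # pre-training architecture only (no Linear module)
    return PretrainModel("base")

def wav2vec2_asr_base(num_out):
    # the specific ASR fine-tuning setup from the wav2vec2 paper
    return FinetuneModel("base", num_out)
```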
- 22 Sep, 2021 3 commits
moto authored
Previously, the Linear module (called `readout`, used only for the ASR fine-tuning task) was placed in the encoder module. Conceptually, the encoder has nothing to do with a module specific to fine-tuning / downstream tasks. The problems here are:
1. The encoder can also be used in the pre-training phase, in which such a module should not be present.
2. The choice of a Linear module is arbitrary, and a hard-coded module structure in the encoder is inconvenient for users.
Therefore, this commit moves the Linear module out of the encoder and places it as the `aux` attribute of `Wav2Vec2Model`. (As a result, `Wav2Vec2Model` has `feature_extractor`, `encoder` and `aux` attributes.) An alternative approach is to define another module that places `Wav2Vec2Model` and the aux module alongside each other, but that would introduce a new class to maintain. The expected uses of `aux` are only 1. loading the pre-trained parameters published by `fairseq` (and its variations from HF) and 2. creating the same model architectures for comparison experiments. Such a newly introduced class would not be general enough for downstream adaptations, where there will be a bunch of different, more complicated models (i.e. s3prl). Therefore, taking the minimalistic approach, we put it inside `Wav2Vec2Model`.
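Structurally, the change amounts to the following hedged sketch (the classes and attribute values are illustrative, not torchaudio's code):

```python
# Before: the fine-tuning head lived inside the encoder.
class EncoderBefore:
    def __init__(self, num_out):
        self.transformer = "transformer-layers"
        self.readout = ("linear", num_out)  # fine-tuning-specific, hard-coded

# After: the encoder is task-agnostic, and the head is an optional `aux` attribute.
class Encoder:
    def __init__(self):
        self.transformer = "transformer-layers"

class Wav2Vec2Model:
    def __init__(self, feature_extractor, encoder, aux=None):
        self.feature_extractor = feature_extractor
        self.encoder = encoder
        # None for pre-training; a Linear head when importing fairseq ASR weights
        self.aux = aux
```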
moto authored
* Fix HF model integration. Previously, when testing wav2vec models from HF transformers, all the models were instantiated as the `Wav2Vec2ForCTC` class, while some of them were supposed to be `Wav2Vec2Model`. Fixing this revealed that the model importer could not correctly handle `Wav2Vec2Model` imports. This PR fixes these issues.
moto authored
Summary: Update fairseq reference from master to main elsewhere
Reviewed By: alexeib
Differential Revision: D30938472
fbshipit-source-id: 243b98550207f241c9d3265bf3d4060350aaf0a8
Co-authored-by: Diana Liskovich <dianaml@fb.com>
- 21 Sep, 2021 1 commit
moto authored
Tweak the test names so that it is easier to see which tests are failing. Before: `test_import_finetuned_model_2`. After: `test_import_finetuned_model_2_wav2vec2_large_lv60k`.
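Parameterized-test helpers typically support a naming hook for this (e.g. the `name_func` argument of the `parameterized` package). A hedged sketch of such a function, with illustrative names:

```python
def name_func(base_name, index, model_name):
    """Append the model name so a failing test is self-describing."""
    return f"{base_name}_{index}_{model_name}"

# Before: test_import_finetuned_model_2
# After:  test_import_finetuned_model_2_wav2vec2_large_lv60k
assert (
    name_func("test_import_finetuned_model", 2, "wav2vec2_large_lv60k")
    == "test_import_finetuned_model_2_wav2vec2_large_lv60k"
)
```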
- 20 Sep, 2021 2 commits
moto authored
* [BC-Breaking] Update `extract_features` of Wav2Vec2Model. Originally, the `extract_features` method returned the result of the convolutional feature extractor module. However, the features commonly used in downstream tasks are the outputs of intermediate transformer layers in the encoder. This commit updates the behavior of `extract_features` to allow selectively retrieving such features.
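The selective retrieval can be sketched with a toy encoder; this is a hedged illustration of the idea (collect intermediate outputs and stop after the requested number of layers), not the real module:

```python
class Encoder:
    """Toy transformer stack: each 'layer' just tags its input string."""
    def __init__(self, num_layers):
        self.num_layers = num_layers

    def extract_features(self, x, num_layers=None):
        """Return the outputs of the first `num_layers` intermediate layers
        (all layers when `num_layers` is None)."""
        outputs = []
        for i in range(self.num_layers):
            x = f"layer{i}({x})"  # stand-in for a real transformer layer
            outputs.append(x)
            if num_layers is not None and len(outputs) >= num_layers:
                break
        return outputs

enc = Encoder(num_layers=4)
assert len(enc.extract_features("x")) == 4
assert len(enc.extract_features("x", num_layers=2)) == 2
```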
nateanl authored
- 17 Sep, 2021 1 commit
nateanl authored
- 02 Sep, 2021 1 commit
jayleverett authored
* Put the output tensor on the correct device in `get_whitenoise()`
* Update `get_spectrogram()` so that the window uses the same device as the waveform
* Put the window on the proper device in `test_griffinlim()`
- 27 Aug, 2021 2 commits
- 26 Aug, 2021 2 commits
- 23 Aug, 2021 1 commit
yangarbiter authored
- 20 Aug, 2021 1 commit
hwangjeff authored
* Add basic filtfilt implementation
* Add filtfilt to functional package; add tests
Co-authored-by: V G <vladislav.goncharenko@phystech.edu>
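The idea behind `filtfilt` (zero-phase filtering: run an IIR filter forward, then backward over the result) can be sketched in plain Python. This is a hedged illustration of the algorithm with list-based signals, not torchaudio's tensor implementation:

```python
def lfilter(b, a, x):
    """Direct-form IIR filter: a[0]*y[n] = sum_k b[k]*x[n-k] - sum_{k>=1} a[k]*y[n-k]."""
    y = []
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        acc -= sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
        y.append(acc / a[0])
    return y

def filtfilt(b, a, x):
    """Apply the filter forward, then backward, cancelling the phase delay."""
    forward = lfilter(b, a, x)
    backward = lfilter(b, a, forward[::-1])
    return backward[::-1]

# An identity filter (b=[1], a=[1]) leaves the signal untouched in both passes.
assert filtfilt([1.0], [1.0], [1.0, 2.0, 3.0]) == [1.0, 2.0, 3.0]
```

In practice the coefficients `b` and `a` come from a filter design routine, and real implementations also pad the signal at the edges to reduce startup transients, which this sketch omits.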
- 19 Aug, 2021 1 commit
Caroline Chen authored
- 18 Aug, 2021 1 commit
yangarbiter authored
- 17 Aug, 2021 1 commit
Caroline Chen authored
- 12 Aug, 2021 1 commit
hwangjeff authored
* Reduce length of waveform in pitch_shift batch_consistency test
Summary: To address the test failure in T96406395
Reviewed By: carolineechen
Differential Revision: D30163741
fbshipit-source-id: f88d86b3da7b1ee52518934567b0b0a62700ee58
* Fix batch consistency test in transforms
Summary: The stress test still fails. Add n_fft to address it.
Reviewed By: mthrok
Differential Revision: D30218279
fbshipit-source-id: 7858efd3e5ac0073193a7883fd314486efc73814
Co-authored-by: Zhaoheng Ni <zni@fb.com>
- 11 Aug, 2021 1 commit
nateanl authored
- Provide InverseSpectrogram module that corresponds to the Spectrogram module
- Add length parameter to the forward method in transforms
Co-authored-by: dgenzel <dgenzel@fb.com>
Co-authored-by: Zhaoheng Ni <zni@fb.com>
- 10 Aug, 2021 2 commits
Chin-Yun Yu authored
yangarbiter authored
- 04 Aug, 2021 1 commit
moto authored
D30080845
- 03 Aug, 2021 2 commits
Caroline Chen authored
Caroline Chen authored
- 02 Aug, 2021 2 commits
yangarbiter authored
Joel Frank authored
- Renamed `torchaudio.functional.create_fb_matrix` to `torchaudio.functional.melscale_fbanks`.
- Added an interface that emits a warning for `create_fb_matrix`.
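The "old name with a warning" pattern can be sketched as follows. This is a hedged, generic sketch: the real `melscale_fbanks` computes a mel filter bank matrix, which is stubbed out here with zeros:

```python
import warnings

def melscale_fbanks(n_freqs, n_mels):
    """Stand-in for the real computation, which returns an (n_freqs, n_mels) matrix."""
    return [[0.0] * n_mels for _ in range(n_freqs)]

def create_fb_matrix(*args, **kwargs):
    """Deprecated alias kept for backward compatibility; forwards to the new name."""
    warnings.warn(
        "create_fb_matrix is deprecated; use melscale_fbanks instead",
        DeprecationWarning,
    )
    return melscale_fbanks(*args, **kwargs)
```

Existing callers keep working but are nudged toward the new name, which can then be removed in a later release.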
- 29 Jul, 2021 1 commit
Joel Frank authored
Summary:
- Add linear_fbank method
- Add LFCC in transforms
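A linear filter bank differs from the mel one mainly in how the filter center frequencies are spaced. A hedged pure-Python sketch of triangular filters on a linearly spaced grid (the function name and signature are illustrative, not the torchaudio API):

```python
def linear_fbanks(n_freqs, f_min, f_max, n_filter, sample_rate):
    """Build an (n_freqs, n_filter) matrix of triangular filters with
    linearly spaced center frequencies between f_min and f_max."""
    # Frequencies of the FFT bins, from 0 to Nyquist.
    all_freqs = [i * (sample_rate / 2) / (n_freqs - 1) for i in range(n_freqs)]
    # n_filter + 2 linearly spaced band-edge points.
    pts = [f_min + i * (f_max - f_min) / (n_filter + 1) for i in range(n_filter + 2)]
    fb = []
    for f in all_freqs:
        row = []
        for m in range(1, n_filter + 1):
            lo, center, hi = pts[m - 1], pts[m], pts[m + 1]
            if lo < f <= center:          # rising slope of the triangle
                row.append((f - lo) / (center - lo))
            elif center < f < hi:         # falling slope
                row.append((hi - f) / (hi - center))
            else:
                row.append(0.0)
        fb.append(row)
    return fb
```

A mel filter bank follows the same construction, except that the band-edge points are spaced evenly on the mel scale rather than in Hz.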
- 28 Jul, 2021 1 commit
yangarbiter authored
- 26 Jul, 2021 2 commits
yangarbiter authored
moto authored
- 22 Jul, 2021 1 commit
hwangjeff authored
Rebases #1571; addresses #1569: "In 0.9.0 we are deprecating the lazy behavior of MelScale because it can make an invalid TorchScript object and it does not align with the design of torchaudio. Now in master branch, we can remove the implementation."
Co-authored-by: Pankaj Patil <pankaj.patil2099@hotmail.com>
Co-authored-by: moto <855818+mthrok@users.noreply.github.com>
Co-authored-by: hwangjeff <jeffhwang@fb.com>
- 21 Jul, 2021 1 commit
Chin-Yun Yu authored
- 20 Jul, 2021 1 commit
hwangjeff authored