1. 28 Sep, 2021 1 commit
    • Add HuBERT model architectures (#1769) · a7854f33
      moto authored
      This commit adds the following HuBERT model architectures
      
       - `base` (pre-training)
       - `large` (pre-training / fine-tuning)
       - `xlarge` (pre-training / fine-tuning)
      
      Since the internal components are the same as `Wav2Vec2Model`, it reuses the existing modules.
      With these models, it is possible to
      - import the pre-trained model published by `fairseq` and TorchScript it (see the sketch below).
      - fine-tune the existing model for a downstream task.
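      
      A rough usage sketch of the new factory functions, assuming the released `torchaudio.models` names and their default configurations; the input shape is a dummy example.
      
      ```python
      import torch
      import torchaudio
      
      # Instantiate the new `base` pre-training architecture with its defaults.
      # (`hubert_large` / `hubert_xlarge` work the same way.)
      model = torchaudio.models.hubert_base()
      
      # The components are shared with `Wav2Vec2Model`, so the interface is the
      # same: a batch of waveforms in, frame-level features out.
      waveform = torch.randn(1, 16000)  # dummy 1-second batch at 16 kHz
      features, _ = model(waveform)
      
      # One stated goal of the commit: the model can be TorchScript-ed.
      scripted = torch.jit.script(model)
      ```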
  2. 24 Sep, 2021 1 commit
    • [BC-Breaking] Split pretraining and finetuning factory functions (#1783) · b2e9f1e4
      moto authored
      * [BC-Breaking] Split pretraining and finetuning factory functions
      
      Previously, the wav2vec2 factory functions only generated the fine-tuning
      architecture used in the wav2vec2 paper for the ASR task, that is, the
      pre-training architecture plus a Linear module; they did not provide a
      straightforward way to generate architectures for pre-training.
      
      The goal of the original implementation was to allow inference of
      wav2vec2 in non-Python environments via TorchScript. Now we would like to
      expand it to pre-training/fine-tuning and the HuBERT model as well.
      
      Therefore, we need factory functions for both pre-training and
      fine-tuning. This commit introduces separate factory functions for
      pre-training and fine-tuning.
      
      1. New functions for ASR fine-tuning.
      
      We introduce `wav2vec2_asr_XXX` functions, which generate the architecture
      used for the fine-tuning task in the wav2vec2 paper. *1
      
      2. Re-purpose the old functions
      
      The existing functions, `wav2vec2_XXX`, now generate the pre-training
      architecture only (no Linear module). See the sketch after the note below.
      
      Note
      *1 This architecture is just one way to define a fine-tuning architecture;
      it is not a universal definition. The new `wav2vec2_asr_XXX` functions are
      designed to provide this specific fine-tuning configuration, and they are
      not meant to support generic architectures for downstream tasks.
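      
      A minimal sketch of the resulting split, assuming the released `torchaudio.models` names; `aux_num_out=29` is an arbitrary example value (e.g. a character vocabulary size), not something fixed by this commit.
      
      ```python
      import torchaudio
      
      # Re-purposed function: pre-training architecture only, no Linear head.
      pretrain_model = torchaudio.models.wav2vec2_base()
      
      # New function: the fine-tuning architecture from the wav2vec2 paper,
      # i.e. the pre-training architecture plus a Linear output module.
      # `aux_num_out` sizes that Linear layer; 29 is an assumed example value.
      asr_model = torchaudio.models.wav2vec2_asr_base(aux_num_out=29)
      ```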
  3. 17 Sep, 2021 1 commit
  4. 23 Aug, 2021 1 commit
  5. 18 Aug, 2021 1 commit
  6. 20 Jul, 2021 1 commit
  7. 03 Jun, 2021 1 commit
    • Update docs (#1550) · 0166a851
      moto authored
      * Use `bibtex` for paper citations (see the sketch below).
        * add `override.css` for fixing back reference.
        * wav2vec2
        * wav2letter
        * convtasnet
        * deepspeech
        * rnnt-loss
        * griffinlim
      * Fix broken references in `filtering`.
      * Fix note in soundfile backends.
      * Tweak wav2vec2 example.
      * Remove unused `pytorch_theme.css`
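      
      For context, a minimal `sphinxcontrib-bibtex` setup looks roughly like the following; the `refs.bib` name and the exact wiring in torchaudio's `conf.py` are assumptions for illustration, not taken from this commit.
      
      ```python
      # conf.py -- minimal sketch of Sphinx + sphinxcontrib-bibtex wiring.
      # File names here (refs.bib) are assumptions; override.css is the
      # stylesheet named in this commit.
      extensions = [
          "sphinxcontrib.bibtex",  # enables BibTeX-backed citation roles
      ]
      bibtex_bibfiles = ["refs.bib"]  # BibTeX database with the paper entries
      
      # Extra stylesheet, e.g. to fix back-reference rendering as noted above.
      html_css_files = ["override.css"]
      ```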
  8. 01 Jun, 2021 1 commit
  9. 27 May, 2021 2 commits
  10. 11 May, 2021 1 commit
  11. 01 Oct, 2020 1 commit
  12. 29 Jul, 2020 1 commit
  13. 28 Apr, 2020 1 commit
    • Add model Wav2Letter (#462) · d678357f
      Tomás Osório authored
      * add wav2letter model (see the usage sketch below)
      
      * add unit tests for the model
      
      * add docstrings
      
      * add documentation
      
      * fix minor error, change logic on forward
      
      * update `same` padding to use ceil
      
      * add inline typing and minor fixes to docstrings
      
      * remove python2
      
      * add formula to docstrings, change param name
      
      * add test with mfcc, add pytest
      
      * fix bug, update docstrings
      
      * change parameter name
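      
      A short usage sketch of the model as exposed in `torchaudio.models`; the argument values (40 classes, raw-waveform input) are assumed examples, not fixed by the commit.
      
      ```python
      import torch
      from torchaudio.models import Wav2Letter
      
      # 40 output classes and raw-waveform input are assumed example settings.
      # For the MFCC variant referenced in the test bullet, pass
      # input_type="mfcc" and set num_features to the number of coefficients.
      model = Wav2Letter(num_classes=40, input_type="waveform", num_features=1)
      
      x = torch.randn(4, 1, 16000)  # (batch, num_features, time)
      log_probs = model(x)          # (batch, num_classes, reduced time)
      ```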