Commit 4cb895b6 authored by alexeib, committed by Facebook Github Bot

add pre-trained wav2vec model

Summary: Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/884

Differential Revision: D17774515

Pulled By: alexeib

fbshipit-source-id: d1ffe8ab723fa284c69b067bbd43d699eaa2f02f
parent 315c463d
@@ -99,6 +99,7 @@ as well as example training and evaluation commands.
- [Translation](examples/translation/README.md): convolutional and transformer models are available
- [Language Modeling](examples/language_model/README.md): convolutional and transformer models are available
- [wav2vec](examples/wav2vec/README.md): wav2vec large model is available
We also have more detailed READMEs to reproduce results from specific papers:
- [Jointly Learning to Align and Translate with Transformer Models (Garg et al., 2019)](examples/joint_alignment_translation/README.md)
...
@@ -2,6 +2,27 @@
Example to train a wav2vec model as described in [wav2vec: Unsupervised Pre-training for Speech Recognition (Schneider et al., 2019)](https://arxiv.org/abs/1904.05862).
## Pre-trained models
Description | Parameters | Dataset | Model
---|---:|---|---
Wav2Vec large <br> ([Schneider et al., 2019](https://arxiv.org/abs/1904.05862)) | 32.5M | [Librispeech](http://www.openslr.org/12) | [download](https://dl.fbaipublicfiles.com/fairseq/wav2vec/wav2vec_large.pt)
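
To try the model locally, the checkpoint linked in the table can be fetched with any HTTP client; a minimal Python sketch (the destination filename is arbitrary):

```python
# Download the released checkpoint listed in the table above.
import urllib.request

url = "https://dl.fbaipublicfiles.com/fairseq/wav2vec/wav2vec_large.pt"
urllib.request.urlretrieve(url, "wav2vec_large.pt")
```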
#### Example usage:
```python
import torch
from fairseq.models.wav2vec import Wav2VecModel

# Load the released checkpoint and rebuild the model from its saved args.
cp = torch.load('/path/to/wav2vec.pt')
model = Wav2VecModel.build_model(cp['args'], task=None)
model.load_state_dict(cp['model'])
model.eval()

# Dummy 16kHz input of shape (batch, samples); replace with real audio.
wav_input_16khz = torch.randn(1,10000)
z = model.feature_extractor(wav_input_16khz)  # local feature vectors
c = model.feature_aggregator(z)  # context representations
```
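
The snippet above feeds random noise through the network only to illustrate the API. To featurize a real recording, the input should be 16kHz mono audio shaped (batch, samples). A hedged sketch continuing from the code above (`model` is the loaded Wav2VecModel; the third-party `soundfile` package and the filename `utterance.wav` are assumptions, and the audio is assumed to already be 16kHz mono):

```python
import soundfile as sf
import torch

# `model` is the Wav2VecModel loaded as in the snippet above.
# sf.read returns a float numpy array and the sample rate.
wav, sr = sf.read('utterance.wav')
assert sr == 16000, "wav2vec expects 16kHz input"

wav_input = torch.from_numpy(wav).float().unsqueeze(0)  # (batch=1, samples)

with torch.no_grad():
    z = model.feature_extractor(wav_input)   # local feature vectors
    c = model.feature_aggregator(z)          # context representations
```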
## Training a new model with the CLI tools
Given a directory containing wav files to be used for pretraining (we recommend splitting each file into separate files 10 to 30 seconds in length)
...
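
Splitting long recordings into such clips is not handled by the fairseq CLI itself; one simple approach is to slice each file into fixed-length segments before building the training manifest. A rough sketch (the segment length, paths, and naming scheme are illustrative assumptions, and the third-party `soundfile` package is used for I/O):

```python
import os
import soundfile as sf

def split_wav(path, out_dir, segment_seconds=20):
    """Slice one wav file into consecutive clips of roughly segment_seconds each."""
    wav, sr = sf.read(path)
    samples_per_segment = segment_seconds * sr
    base = os.path.splitext(os.path.basename(path))[0]
    os.makedirs(out_dir, exist_ok=True)
    for i, start in enumerate(range(0, len(wav), samples_per_segment)):
        chunk = wav[start:start + samples_per_segment]
        sf.write(os.path.join(out_dir, f"{base}_{i:04d}.wav"), chunk, sr)
```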