Unverified commit fad855cd authored by moto, committed by GitHub

Move wav2vec2 pretrained models to pipelines module (#1876)

- Move the wav2vec2 pretrained weights to the `torchaudio.pipelines` namespace to align with #1872.
- Split `Wav2Vec2PretrainedModelBundle` into `Wav2Vec2Bundle` (for pretraining models) and `Wav2Vec2ASRBundle` (for models fine-tuned for ASR).
- Update the base URL from which the weights are downloaded.
parent c22962d1
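For downstream code the change amounts to an import-path move; here is a minimal migration sketch (the dummy waveform is illustrative only):

import torch
from torchaudio.pipelines import WAV2VEC2_ASR_BASE_960H  # previously: from torchaudio.models import ...

# Bundle objects keep the same names; only the namespace changed.
model = WAV2VEC2_ASR_BASE_960H.get_model()    # downloads and caches the weights
labels = WAV2VEC2_ASR_BASE_960H.get_labels()  # output token set for CTC decoding
waveform = torch.randn(1, 16000)              # illustrative 1-second dummy input
emissions, _ = model(waveform)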
@@ -35,6 +35,7 @@ The :mod:`torchaudio` package consists of I/O, popular datasets and common audio
transforms
datasets
models
+pipelines
sox_effects
compliance.kaldi
kaldi_io
@@ -111,160 +111,6 @@ hubert_xlarge
.. autofunction:: hubert_xlarge
-Pre-trained Models
-------------------
-.. autoclass:: Wav2Vec2PretrainedModelBundle
-  .. automethod:: get_model
-  .. automethod:: get_labels
-WAV2VEC2_BASE
-^^^^^^^^^^^^^
-.. container:: py attribute
-  .. autodata:: WAV2VEC2_BASE
-    :no-value:
-WAV2VEC2_ASR_BASE_10M
-^^^^^^^^^^^^^^^^^^^^^
-.. container:: py attribute
-  .. autodata:: torchaudio.models.WAV2VEC2_ASR_BASE_10M
-    :no-value:
-WAV2VEC2_ASR_BASE_100H
-^^^^^^^^^^^^^^^^^^^^^^
-.. container:: py attribute
-  .. autodata:: WAV2VEC2_ASR_BASE_100H
-    :no-value:
-WAV2VEC2_ASR_BASE_960H
-^^^^^^^^^^^^^^^^^^^^^^
-.. container:: py attribute
-  .. autodata:: WAV2VEC2_ASR_BASE_960H
-    :no-value:
-WAV2VEC2_LARGE
-^^^^^^^^^^^^^^
-.. container:: py attribute
-  .. autodata:: WAV2VEC2_LARGE
-    :no-value:
-WAV2VEC2_ASR_LARGE_10M
-^^^^^^^^^^^^^^^^^^^^^^
-.. container:: py attribute
-  .. autodata:: WAV2VEC2_ASR_LARGE_10M
-    :no-value:
-WAV2VEC2_ASR_LARGE_100H
-^^^^^^^^^^^^^^^^^^^^^^^
-.. container:: py attribute
-  .. autodata:: WAV2VEC2_ASR_LARGE_100H
-    :no-value:
-WAV2VEC2_ASR_LARGE_960H
-^^^^^^^^^^^^^^^^^^^^^^^
-.. container:: py attribute
-  .. autodata:: WAV2VEC2_ASR_LARGE_960H
-    :no-value:
-WAV2VEC2_LARGE_LV60K
-^^^^^^^^^^^^^^^^^^^^
-.. container:: py attribute
-  .. autodata:: WAV2VEC2_LARGE_LV60K
-    :no-value:
-WAV2VEC2_ASR_LARGE_LV60K_10M
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-.. container:: py attribute
-  .. autodata:: WAV2VEC2_ASR_LARGE_LV60K_10M
-    :no-value:
-WAV2VEC2_ASR_LARGE_LV60K_100H
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-.. container:: py attribute
-  .. autodata:: WAV2VEC2_ASR_LARGE_LV60K_100H
-    :no-value:
-WAV2VEC2_ASR_LARGE_LV60K_960H
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-.. container:: py attribute
-  .. autodata:: WAV2VEC2_ASR_LARGE_LV60K_960H
-    :no-value:
-WAV2VEC2_XLSR53
-^^^^^^^^^^^^^^^
-.. container:: py attribute
-  .. autodata:: WAV2VEC2_XLSR53
-    :no-value:
-HUBERT_BASE
-^^^^^^^^^^^
-.. container:: py attribute
-  .. autodata:: HUBERT_BASE
-    :no-value:
-HUBERT_LARGE
-^^^^^^^^^^^^
-.. container:: py attribute
-  .. autodata:: HUBERT_LARGE
-    :no-value:
-HUBERT_XLARGE
-^^^^^^^^^^^^^
-.. container:: py attribute
-  .. autodata:: HUBERT_XLARGE
-    :no-value:
-HUBERT_ASR_LARGE
-^^^^^^^^^^^^^^^^
-.. container:: py attribute
-  .. autodata:: HUBERT_ASR_LARGE
-    :no-value:
-HUBERT_ASR_XLARGE
-^^^^^^^^^^^^^^^^^
-.. container:: py attribute
-  .. autodata:: HUBERT_ASR_XLARGE
-    :no-value:
Utility Functions
-----------------
torchaudio.pipelines
====================
.. currentmodule:: torchaudio.pipelines
The pipelines subpackage contains APIs for accessing models with pretrained weights, along with the information and helper functions associated with those weights.
wav2vec 2.0 / HuBERT - Representation Learning
----------------------------------------------
.. autoclass:: Wav2Vec2Bundle
.. automethod:: get_model
WAV2VEC2_BASE
-------------
.. container:: py attribute
.. autodata:: WAV2VEC2_BASE
:no-value:
WAV2VEC2_LARGE
--------------
.. container:: py attribute
.. autodata:: WAV2VEC2_LARGE
:no-value:
WAV2VEC2_LARGE_LV60K
--------------------
.. container:: py attribute
.. autodata:: WAV2VEC2_LARGE_LV60K
:no-value:
WAV2VEC2_XLSR53
---------------
.. container:: py attribute
.. autodata:: WAV2VEC2_XLSR53
:no-value:
HUBERT_BASE
-----------
.. container:: py attribute
.. autodata:: HUBERT_BASE
:no-value:
HUBERT_LARGE
------------
.. container:: py attribute
.. autodata:: HUBERT_LARGE
:no-value:
HUBERT_XLARGE
-------------
.. container:: py attribute
.. autodata:: HUBERT_XLARGE
:no-value:
wav2vec 2.0 / HuBERT - ASR fine-tuning
--------------------------------------
.. autoclass:: Wav2Vec2ASRBundle
.. automethod:: get_model
.. automethod:: get_labels
WAV2VEC2_ASR_BASE_10M
---------------------
.. container:: py attribute
.. autodata:: WAV2VEC2_ASR_BASE_10M
:no-value:
WAV2VEC2_ASR_BASE_100H
----------------------
.. container:: py attribute
.. autodata:: WAV2VEC2_ASR_BASE_100H
:no-value:
WAV2VEC2_ASR_BASE_960H
----------------------
.. container:: py attribute
.. autodata:: WAV2VEC2_ASR_BASE_960H
:no-value:
WAV2VEC2_ASR_LARGE_10M
----------------------
.. container:: py attribute
.. autodata:: WAV2VEC2_ASR_LARGE_10M
:no-value:
WAV2VEC2_ASR_LARGE_100H
-----------------------
.. container:: py attribute
.. autodata:: WAV2VEC2_ASR_LARGE_100H
:no-value:
WAV2VEC2_ASR_LARGE_960H
-----------------------
.. container:: py attribute
.. autodata:: WAV2VEC2_ASR_LARGE_960H
:no-value:
WAV2VEC2_ASR_LARGE_LV60K_10M
----------------------------
.. container:: py attribute
.. autodata:: WAV2VEC2_ASR_LARGE_LV60K_10M
:no-value:
WAV2VEC2_ASR_LARGE_LV60K_100H
-----------------------------
.. container:: py attribute
.. autodata:: WAV2VEC2_ASR_LARGE_LV60K_100H
:no-value:
WAV2VEC2_ASR_LARGE_LV60K_960H
-----------------------------
.. container:: py attribute
.. autodata:: WAV2VEC2_ASR_LARGE_LV60K_960H
:no-value:
HUBERT_ASR_LARGE
----------------
.. container:: py attribute
.. autodata:: HUBERT_ASR_LARGE
:no-value:
HUBERT_ASR_XLARGE
-----------------
.. container:: py attribute
.. autodata:: HUBERT_ASR_XLARGE
:no-value:
References
----------
.. footbibliography::
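The two bundle classes documented above correspond to the two usage patterns; a minimal end-to-end sketch combining them ('my_speech.wav' is a placeholder path):

import torchaudio
from torchaudio.pipelines import WAV2VEC2_BASE, WAV2VEC2_ASR_BASE_960H

waveform, sample_rate = torchaudio.load('my_speech.wav')  # placeholder file

# Wav2Vec2Bundle: representation learning; exposes get_model() only.
features, _ = WAV2VEC2_BASE.get_model().extract_features(waveform)

# Wav2Vec2ASRBundle: additionally exposes get_labels() for decoding.
asr_model = WAV2VEC2_ASR_BASE_960H.get_model()
emissions, _ = asr_model(waveform)
labels = WAV2VEC2_ASR_BASE_960H.get_labels()
# Pair emissions with labels in a CTC decoder of your choice.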
import torchaudio
-from torchaudio.models import (
+from torchaudio.pipelines import (
    WAV2VEC2_BASE,
    WAV2VEC2_LARGE,
    WAV2VEC2_LARGE_LV60K,
@@ -4,6 +4,7 @@ from torchaudio import (
    datasets,
    functional,
    models,
+    pipelines,
    kaldi_io,
    utils,
    sox_effects,
@@ -26,6 +27,7 @@ __all__ = [
    'datasets',
    'functional',
    'models',
+    'pipelines',
    'kaldi_io',
    'utils',
    'sox_effects',
@@ -13,27 +13,6 @@ from .wav2vec2 import (
    hubert_large,
    hubert_xlarge,
)
-from .wav2vec2.pretrained import (
-    Wav2Vec2PretrainedModelBundle,
-    WAV2VEC2_BASE,
-    WAV2VEC2_LARGE,
-    WAV2VEC2_LARGE_LV60K,
-    WAV2VEC2_ASR_BASE_10M,
-    WAV2VEC2_ASR_BASE_100H,
-    WAV2VEC2_ASR_BASE_960H,
-    WAV2VEC2_ASR_LARGE_10M,
-    WAV2VEC2_ASR_LARGE_100H,
-    WAV2VEC2_ASR_LARGE_960H,
-    WAV2VEC2_ASR_LARGE_LV60K_10M,
-    WAV2VEC2_ASR_LARGE_LV60K_100H,
-    WAV2VEC2_ASR_LARGE_LV60K_960H,
-    WAV2VEC2_XLSR53,
-    HUBERT_BASE,
-    HUBERT_LARGE,
-    HUBERT_XLARGE,
-    HUBERT_ASR_LARGE,
-    HUBERT_ASR_XLARGE,
-)
__all__ = [
    'Wav2Letter',
@@ -49,25 +28,6 @@ __all__ = [
    'hubert_base',
    'hubert_large',
    'hubert_xlarge',
-    'Wav2Vec2PretrainedModelBundle',
-    'WAV2VEC2_BASE',
-    'WAV2VEC2_LARGE',
-    'WAV2VEC2_LARGE_LV60K',
-    'WAV2VEC2_ASR_BASE_10M',
-    'WAV2VEC2_ASR_BASE_100H',
-    'WAV2VEC2_ASR_BASE_960H',
-    'WAV2VEC2_ASR_LARGE_10M',
-    'WAV2VEC2_ASR_LARGE_100H',
-    'WAV2VEC2_ASR_LARGE_960H',
-    'WAV2VEC2_ASR_LARGE_LV60K_10M',
-    'WAV2VEC2_ASR_LARGE_LV60K_100H',
-    'WAV2VEC2_ASR_LARGE_LV60K_960H',
-    'WAV2VEC2_XLSR53',
-    'HUBERT_BASE',
-    'HUBERT_LARGE',
-    'HUBERT_XLARGE',
-    'HUBERT_ASR_LARGE',
-    'HUBERT_ASR_XLARGE',
    'Tacotron2',
    'tacotron2',
]
from ._wav2vec2 import (
    Wav2Vec2Bundle,
    Wav2Vec2ASRBundle,
    WAV2VEC2_BASE,
    WAV2VEC2_LARGE,
    WAV2VEC2_LARGE_LV60K,
    WAV2VEC2_ASR_BASE_10M,
    WAV2VEC2_ASR_BASE_100H,
    WAV2VEC2_ASR_BASE_960H,
    WAV2VEC2_ASR_LARGE_10M,
    WAV2VEC2_ASR_LARGE_100H,
    WAV2VEC2_ASR_LARGE_960H,
    WAV2VEC2_ASR_LARGE_LV60K_10M,
    WAV2VEC2_ASR_LARGE_LV60K_100H,
    WAV2VEC2_ASR_LARGE_LV60K_960H,
    WAV2VEC2_XLSR53,
    HUBERT_BASE,
    HUBERT_LARGE,
    HUBERT_XLARGE,
    HUBERT_ASR_LARGE,
    HUBERT_ASR_XLARGE,
)
__all__ = [
    'Wav2Vec2Bundle',
    'Wav2Vec2ASRBundle',
    'WAV2VEC2_BASE',
    'WAV2VEC2_LARGE',
    'WAV2VEC2_LARGE_LV60K',
    'WAV2VEC2_ASR_BASE_10M',
    'WAV2VEC2_ASR_BASE_100H',
    'WAV2VEC2_ASR_BASE_960H',
    'WAV2VEC2_ASR_LARGE_10M',
    'WAV2VEC2_ASR_LARGE_100H',
    'WAV2VEC2_ASR_LARGE_960H',
    'WAV2VEC2_ASR_LARGE_LV60K_10M',
    'WAV2VEC2_ASR_LARGE_LV60K_100H',
    'WAV2VEC2_ASR_LARGE_LV60K_960H',
    'WAV2VEC2_XLSR53',
    'HUBERT_BASE',
    'HUBERT_LARGE',
    'HUBERT_XLARGE',
    'HUBERT_ASR_LARGE',
    'HUBERT_ASR_XLARGE',
]
from dataclasses import dataclass
-from typing import Dict, Tuple, Any, Optional
+from typing import Dict, Tuple, Any
from torch.hub import load_state_dict_from_url
-from .model import wav2vec2_model, Wav2Vec2Model
+from torchaudio.models import wav2vec2_model, Wav2Vec2Model
__all__ = []
@dataclass
-class Wav2Vec2PretrainedModelBundle:
-    """torchaudio.models.Wav2Vec2PretrainedModelBundle()
+class Wav2Vec2Bundle:
+    """torchaudio.pipelines.Wav2Vec2Bundle()
    Data class that bundles associated information to use pretrained Wav2Vec2Model.
@@ -24,7 +24,7 @@ class Wav2Vec2PretrainedModelBundle:
    Please see below for the usage and the available values.
-    Example - Pretraining model
+    Example - Feature Extraction
        >>> import torchaudio
        >>>
        >>> # Build the model and load pretrained weight.
@@ -34,32 +34,14 @@ class Wav2Vec2PretrainedModelBundle:
        >>> # Extract acoustic features
        >>> waveform, sample_rate = torchaudio.load('my_speech.mp3')
        >>> features, _ = model.extract_features(waveform)
-    Example - Model fine-tuned for ASR
-        >>> import torchaudio
-        >>>
-        >>> # Build the model and load pretrained weight.
-        >>> model = torchaudio.models.HUBERT_ASR_LARGE.get_model()
-        Downloading:
-        100%|███████████████████████████████| 1.18G/1.18G [00:17<00:00, 73.8MB/s]
-        >>> # Check the corresponding labels of the output.
-        >>> labels = torchaudio.models.HUBERT_ASR_LARGE.get_labels()
-        >>> print(labels)
-        ('<s>', '<pad>', '</s>', '<unk>', '|', 'E', 'T', 'A', 'O', 'N', 'I', 'H', 'S', 'R', 'D', 'L', 'U', 'M', 'W', 'C', 'F', 'G', 'Y', 'P', 'B', 'V', 'K', "'", 'X', 'J', 'Q', 'Z')
-        >>> # Infer the label probability distribution
-        >>> waveform, sample_rate = torchaudio.load('my_speech.mp3')
-        >>> emissions, _ = model(waveform)
-        >>> # Pass emission to decoder
-        >>> # `ctc_decode` is for illustration purpose only
-        >>> transcripts = ctc_decode(emissions, labels)
    """ # noqa: E501
    _path: str
    _params: Dict[str, Any]
-    _labels: Optional[Tuple[str]]
    def get_model(self, *, dl_kwargs=None) -> Wav2Vec2Model:
-        """Construct the model and load the pretrained weight.
+        """get_model(self, *, dl_kwargs=None) -> torchaudio.models.Wav2Vec2Model
+        Construct the model and load the pretrained weight.
        The weight file is downloaded from the internet and cached with
        :func:`torch.hub.load_state_dict_from_url`
@@ -68,13 +50,50 @@ class Wav2Vec2PretrainedModelBundle:
            dl_kwargs (dictionary of keyword arguments): Passed to :func:`torch.hub.load_state_dict_from_url`.
        """
        model = wav2vec2_model(**self._params)
-        url = f'https://download.pytorch.org/models/audio/{self._path}'
+        url = f'https://download.pytorch.org/torchaudio/models/{self._path}'
        dl_kwargs = {} if dl_kwargs is None else dl_kwargs
        state_dict = load_state_dict_from_url(url, **dl_kwargs)
        model.load_state_dict(state_dict)
        model.eval()
        return model
+@dataclass
+class Wav2Vec2ASRBundle(Wav2Vec2Bundle):
+    """torchaudio.pipelines.Wav2Vec2ASRBundle()
+    Data class that bundles associated information to use pretrained Wav2Vec2Model.
+    This class provides interfaces for instantiating the pretrained model along with
+    the information necessary to retrieve pretrained weights and additional data
+    to be used with the model.
+    Torchaudio library instantiates objects of this class, each of which represents
+    a different pretrained model. Client code should access pretrained models via these
+    instances.
+    Please see below for the usage and the available values.
+    Example - ASR
+        >>> import torchaudio
+        >>>
+        >>> # Build the model and load pretrained weight.
+        >>> model = torchaudio.models.HUBERT_ASR_LARGE.get_model()
+        Downloading:
+        100%|███████████████████████████████| 1.18G/1.18G [00:17<00:00, 73.8MB/s]
+        >>> # Check the corresponding labels of the output.
+        >>> labels = torchaudio.models.HUBERT_ASR_LARGE.get_labels()
+        >>> print(labels)
+        ('<s>', '<pad>', '</s>', '<unk>', '|', 'E', 'T', 'A', 'O', 'N', 'I', 'H', 'S', 'R', 'D', 'L', 'U', 'M', 'W', 'C', 'F', 'G', 'Y', 'P', 'B', 'V', 'K', "'", 'X', 'J', 'Q', 'Z')
+        >>> # Infer the label probability distribution
+        >>> waveform, sample_rate = torchaudio.load('my_speech.mp3')
+        >>> emissions, _ = model(waveform)
+        >>> # Pass emission to decoder
+        >>> # `ctc_decode` is for illustration purpose only
+        >>> transcripts = ctc_decode(emissions, labels)
+    """ # noqa: E501
+    _labels: Tuple[str]
    def get_labels(
        self,
        *,
@@ -143,7 +162,7 @@ def _get_labels():
    )
-WAV2VEC2_BASE = Wav2Vec2PretrainedModelBundle(
+WAV2VEC2_BASE = Wav2Vec2Bundle(
    _path='wav2vec2_fairseq_base_ls960.pth',
    _params={
        'extractor_mode': 'group_norm',
@@ -171,7 +190,6 @@ WAV2VEC2_BASE = Wav2Vec2PretrainedModelBundle(
        'encoder_layer_drop': 0.05,
        "aux_num_out": None,
    },
-    _labels=None,
)
WAV2VEC2_BASE.__doc__ = """wav2vec 2.0 model with "Base" configuration.
@@ -183,9 +201,11 @@ Originally published by the authors of *wav2vec 2.0* [:footcite:`baevski2020wav2
redistributed with the same license.
[`License <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/LICENSE>`__,
`Source <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/examples/wav2vec#pre-trained-models>`__]
+Please refer to :func:`torchaudio.pipelines.Wav2Vec2Bundle` for the usage.
""" # noqa: E501
-WAV2VEC2_ASR_BASE_10M = Wav2Vec2PretrainedModelBundle(
+WAV2VEC2_ASR_BASE_10M = Wav2Vec2ASRBundle(
    _path='wav2vec2_fairseq_base_ls960_asr_ll10m.pth',
    _params={
        'extractor_mode': 'group_norm',
@@ -226,9 +246,11 @@ Originally published by the authors of *wav2vec 2.0* [:footcite:`baevski2020wav2
redistributed with the same license.
[`License <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/LICENSE>`__,
`Source <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/examples/wav2vec#pre-trained-models>`__]
+Please refer to :func:`torchaudio.pipelines.Wav2Vec2ASRBundle` for the usage.
""" # noqa: E501
-WAV2VEC2_ASR_BASE_100H = Wav2Vec2PretrainedModelBundle(
+WAV2VEC2_ASR_BASE_100H = Wav2Vec2ASRBundle(
    'wav2vec2_fairseq_base_ls960_asr_ls100.pth',
    {
        'extractor_mode': 'group_norm',
@@ -269,9 +291,11 @@ Originally published by the authors of *wav2vec 2.0* [:footcite:`baevski2020wav2
redistributed with the same license.
[`License <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/LICENSE>`__,
`Source <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/examples/wav2vec#pre-trained-models>`__]
+Please refer to :func:`torchaudio.pipelines.Wav2Vec2ASRBundle` for the usage.
""" # noqa: E501
-WAV2VEC2_ASR_BASE_960H = Wav2Vec2PretrainedModelBundle(
+WAV2VEC2_ASR_BASE_960H = Wav2Vec2ASRBundle(
    'wav2vec2_fairseq_base_ls960_asr_ls960.pth',
    {
        "extractor_mode": "group_norm",
@@ -311,9 +335,11 @@ Originally published by the authors of *wav2vec 2.0* [:footcite:`baevski2020wav2
redistributed with the same license.
[`License <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/LICENSE>`__,
`Source <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/examples/wav2vec#pre-trained-models>`__]
+Please refer to :func:`torchaudio.pipelines.Wav2Vec2ASRBundle` for the usage.
""" # noqa: E501
-WAV2VEC2_LARGE = Wav2Vec2PretrainedModelBundle(
+WAV2VEC2_LARGE = Wav2Vec2Bundle(
    'wav2vec2_fairseq_large_ls960.pth',
    {
        "extractor_mode": "group_norm",
@@ -341,7 +367,6 @@ WAV2VEC2_LARGE = Wav2Vec2PretrainedModelBundle(
        "encoder_layer_drop": 0.2,
        "aux_num_out": None,
    },
-    _labels=None,
)
WAV2VEC2_LARGE.__doc__ = """Build "large" wav2vec2 model.
@@ -353,9 +378,11 @@ Originally published by the authors of *wav2vec 2.0* [:footcite:`baevski2020wav2
redistributed with the same license.
[`License <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/LICENSE>`__,
`Source <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/examples/wav2vec#pre-trained-models>`__]
+Please refer to :func:`torchaudio.pipelines.Wav2Vec2Bundle` for the usage.
""" # noqa: E501
-WAV2VEC2_ASR_LARGE_10M = Wav2Vec2PretrainedModelBundle(
+WAV2VEC2_ASR_LARGE_10M = Wav2Vec2ASRBundle(
    'wav2vec2_fairseq_large_ls960_asr_ll10m.pth',
    {
        "extractor_mode": "group_norm",
@@ -396,9 +423,11 @@ Originally published by the authors of *wav2vec 2.0* [:footcite:`baevski2020wav2
redistributed with the same license.
[`License <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/LICENSE>`__,
`Source <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/examples/wav2vec#pre-trained-models>`__]
+Please refer to :func:`torchaudio.pipelines.Wav2Vec2ASRBundle` for the usage.
""" # noqa: E501
-WAV2VEC2_ASR_LARGE_100H = Wav2Vec2PretrainedModelBundle(
+WAV2VEC2_ASR_LARGE_100H = Wav2Vec2ASRBundle(
    'wav2vec2_fairseq_large_ls960_asr_ls100.pth',
    {
        "extractor_mode": "group_norm",
@@ -439,9 +468,11 @@ Originally published by the authors of *wav2vec 2.0* [:footcite:`baevski2020wav2
redistributed with the same license.
[`License <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/LICENSE>`__,
`Source <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/examples/wav2vec#pre-trained-models>`__]
+Please refer to :func:`torchaudio.pipelines.Wav2Vec2ASRBundle` for the usage.
""" # noqa: E501
-WAV2VEC2_ASR_LARGE_960H = Wav2Vec2PretrainedModelBundle(
+WAV2VEC2_ASR_LARGE_960H = Wav2Vec2ASRBundle(
    'wav2vec2_fairseq_large_ls960_asr_ls960.pth',
    {
        "extractor_mode": "group_norm",
@@ -481,9 +512,11 @@ Originally published by the authors of *wav2vec 2.0* [:footcite:`baevski2020wav2
redistributed with the same license.
[`License <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/LICENSE>`__,
`Source <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/examples/wav2vec#pre-trained-models>`__]
+Please refer to :func:`torchaudio.pipelines.Wav2Vec2ASRBundle` for the usage.
""" # noqa: E501
-WAV2VEC2_LARGE_LV60K = Wav2Vec2PretrainedModelBundle(
+WAV2VEC2_LARGE_LV60K = Wav2Vec2Bundle(
    'wav2vec2_fairseq_large_lv60k.pth',
    {
        "extractor_mode": "layer_norm",
@@ -511,7 +544,6 @@ WAV2VEC2_LARGE_LV60K = Wav2Vec2PretrainedModelBundle(
        "encoder_layer_drop": 0.0,
        "aux_num_out": None,
    },
-    _labels=None,
)
WAV2VEC2_LARGE_LV60K.__doc__ = """Build "large-lv60k" wav2vec2 model.
@@ -523,9 +555,11 @@ Originally published by the authors of *wav2vec 2.0* [:footcite:`baevski2020wav2
redistributed with the same license.
[`License <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/LICENSE>`__,
`Source <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/examples/wav2vec#pre-trained-models>`__]
+Please refer to :func:`torchaudio.pipelines.Wav2Vec2Bundle` for the usage.
""" # noqa: E501
-WAV2VEC2_ASR_LARGE_LV60K_10M = Wav2Vec2PretrainedModelBundle(
+WAV2VEC2_ASR_LARGE_LV60K_10M = Wav2Vec2ASRBundle(
    'wav2vec2_fairseq_large_lv60k_asr_ll10m.pth',
    {
        "extractor_mode": "layer_norm",
@@ -566,9 +600,11 @@ Originally published by the authors of *wav2vec 2.0* [:footcite:`baevski2020wav2
redistributed with the same license.
[`License <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/LICENSE>`__,
`Source <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/examples/wav2vec#pre-trained-models>`__]
+Please refer to :func:`torchaudio.pipelines.Wav2Vec2ASRBundle` for the usage.
""" # noqa: E501
-WAV2VEC2_ASR_LARGE_LV60K_100H = Wav2Vec2PretrainedModelBundle(
+WAV2VEC2_ASR_LARGE_LV60K_100H = Wav2Vec2ASRBundle(
    'wav2vec2_fairseq_large_lv60k_asr_ls100.pth',
    {
        "extractor_mode": "layer_norm",
@@ -609,9 +645,11 @@ Originally published by the authors of *wav2vec 2.0* [:footcite:`baevski2020wav2
redistributed with the same license.
[`License <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/LICENSE>`__,
`Source <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/examples/wav2vec#pre-trained-models>`__]
+Please refer to :func:`torchaudio.pipelines.Wav2Vec2ASRBundle` for the usage.
""" # noqa: E501
-WAV2VEC2_ASR_LARGE_LV60K_960H = Wav2Vec2PretrainedModelBundle(
+WAV2VEC2_ASR_LARGE_LV60K_960H = Wav2Vec2ASRBundle(
    'wav2vec2_fairseq_large_lv60k_asr_ls960.pth',
    {
        "extractor_mode": "layer_norm",
@@ -653,9 +691,11 @@ Originally published by the authors of *wav2vec 2.0* [:footcite:`baevski2020wav2
redistributed with the same license.
[`License <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/LICENSE>`__,
`Source <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/examples/wav2vec#pre-trained-models>`__]
+Please refer to :func:`torchaudio.pipelines.Wav2Vec2ASRBundle` for the usage.
""" # noqa: E501
-WAV2VEC2_XLSR53 = Wav2Vec2PretrainedModelBundle(
+WAV2VEC2_XLSR53 = Wav2Vec2Bundle(
    'wav2vec2_fairseq_large_xlsr53.pth',
    {
        "extractor_mode": "layer_norm",
@@ -683,7 +723,6 @@ WAV2VEC2_XLSR53 = Wav2Vec2PretrainedModelBundle(
        "encoder_layer_drop": 0.0,
        "aux_num_out": None,
    },
-    _labels=None,
)
WAV2VEC2_XLSR53.__doc__ = """wav2vec 2.0 model with "Base" configuration.
@@ -698,9 +737,11 @@ Originally published by the authors of
[:footcite:`conneau2020unsupervised`] under MIT License and redistributed with the same license.
[`License <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/LICENSE>`__,
`Source <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/examples/wav2vec#pre-trained-models>`__]
+Please refer to :func:`torchaudio.pipelines.Wav2Vec2Bundle` for the usage.
""" # noqa: E501
-HUBERT_BASE = Wav2Vec2PretrainedModelBundle(
+HUBERT_BASE = Wav2Vec2Bundle(
    'hubert_fairseq_base_ls960.pth',
    {
        'extractor_mode': 'group_norm',
@@ -728,7 +769,6 @@ HUBERT_BASE = Wav2Vec2PretrainedModelBundle(
        'encoder_layer_drop': 0.05,
        'aux_num_out': None,
    },
-    _labels=None,
)
HUBERT_BASE.__doc__ = """HuBERT model with "Base" configuration.
@@ -740,9 +780,11 @@ Originally published by the authors of *HuBERT* [:footcite:`hsu2021hubert`] unde
redistributed with the same license.
[`License <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/LICENSE>`__,
`Source <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/examples/hubert#pre-trained-and-fine-tuned-asr-models>`__]
+Please refer to :func:`torchaudio.pipelines.Wav2Vec2Bundle` for the usage.
""" # noqa: E501
-HUBERT_LARGE = Wav2Vec2PretrainedModelBundle(
+HUBERT_LARGE = Wav2Vec2Bundle(
    'hubert_fairseq_large_ll60k.pth',
    {
        'extractor_mode': 'layer_norm',
@@ -770,7 +812,6 @@ HUBERT_LARGE = Wav2Vec2PretrainedModelBundle(
        'encoder_layer_drop': 0.0,
        'aux_num_out': None,
    },
-    _labels=None,
)
HUBERT_LARGE.__doc__ = """HuBERT model with "Large" configuration.
@@ -782,9 +823,11 @@ Originally published by the authors of *HuBERT* [:footcite:`hsu2021hubert`] unde
redistributed with the same license.
[`License <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/LICENSE>`__,
`Source <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/examples/hubert#pre-trained-and-fine-tuned-asr-models>`__]
+Please refer to :func:`torchaudio.pipelines.Wav2Vec2Bundle` for the usage.
""" # noqa: E501
-HUBERT_XLARGE = Wav2Vec2PretrainedModelBundle(
+HUBERT_XLARGE = Wav2Vec2Bundle(
    'hubert_fairseq_xlarge_ll60k.pth',
    {
        'extractor_mode': 'layer_norm',
@@ -812,7 +855,6 @@ HUBERT_XLARGE = Wav2Vec2PretrainedModelBundle(
        'encoder_layer_drop': 0.0,
        'aux_num_out': None,
    },
-    _labels=None,
)
HUBERT_XLARGE.__doc__ = """HuBERT model with "Extra Large" configuration.
@@ -824,9 +866,11 @@ Originally published by the authors of *HuBERT* [:footcite:`hsu2021hubert`] unde
redistributed with the same license.
[`License <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/LICENSE>`__,
`Source <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/examples/hubert#pre-trained-and-fine-tuned-asr-models>`__]
+Please refer to :func:`torchaudio.pipelines.Wav2Vec2Bundle` for the usage.
""" # noqa: E501
-HUBERT_ASR_LARGE = Wav2Vec2PretrainedModelBundle(
+HUBERT_ASR_LARGE = Wav2Vec2ASRBundle(
    'hubert_fairseq_large_ll60k_asr_ls960.pth',
    {
        'extractor_mode': 'layer_norm',
@@ -868,9 +912,11 @@ Originally published by the authors of *HuBERT* [:footcite:`hsu2021hubert`] unde
redistributed with the same license.
[`License <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/LICENSE>`__,
`Source <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/examples/hubert#pre-trained-and-fine-tuned-asr-models>`__]
+Please refer to :func:`torchaudio.pipelines.Wav2Vec2ASRBundle` for the usage.
""" # noqa: E501
-HUBERT_ASR_XLARGE = Wav2Vec2PretrainedModelBundle(
+HUBERT_ASR_XLARGE = Wav2Vec2ASRBundle(
    'hubert_fairseq_xlarge_ll60k_asr_ls960.pth',
    {
        'extractor_mode': 'layer_norm',
@@ -912,4 +958,6 @@ Originally published by the authors of *HuBERT* [:footcite:`hsu2021hubert`] unde
redistributed with the same license.
[`License <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/LICENSE>`__,
`Source <https://github.com/pytorch/fairseq/blob/ce6c9eeae163ac04b79539c78e74f292f29eaa18/examples/hubert#pre-trained-and-fine-tuned-asr-models>`__]
+Please refer to :func:`torchaudio.pipelines.Wav2Vec2ASRBundle` for the usage.
""" # noqa: E501