- 15 Oct, 2021 5 commits
  - moto authored
    Future work items:
    - length computation of GriffinLim
    - better way to make InverseMelScale work in inference_mode
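    On the second item: a minimal sketch of the limitation, assuming `InverseMelScale` still recovers the linear spectrogram through an internal gradient-based (SGD) optimization loop, which cannot operate on tensors created under `torch.inference_mode()`:

    ```python
    import torch
    import torchaudio

    inverse = torchaudio.transforms.InverseMelScale(n_stft=201, n_mels=128)

    mel = torch.rand(1, 128, 50)
    spec = inverse(mel)  # works: the internal optimization loop can use autograd

    with torch.inference_mode():
        try:
            inverse(torch.rand(1, 128, 50))
        except RuntimeError as err:
            # inference-mode tensors cannot participate in autograd,
            # so the internal optimization fails
            print("InverseMelScale under inference_mode:", err)
    ```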
  - moto authored
  - moto authored
  - moto authored
  - moto authored
    - Move wav2vec2 pretrained weights to the `torchaudio.pipelines` namespace to align with #1872.
    - Split `Wav2Vec2PretrainedModelBundle` into `Wav2Vec2Bundle` (for pre-trained models) and `Wav2Vec2ASRBundle` (for models fine-tuned for ASR).
    - Update the base URL.
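    A hedged sketch of the resulting split, assuming bundle instances such as `WAV2VEC2_BASE` and `WAV2VEC2_ASR_BASE_960H` under the new namespace:

    ```python
    import torch
    import torchaudio

    # Wav2Vec2Bundle: pre-trained only, no ASR head; used for feature extraction.
    bundle = torchaudio.pipelines.WAV2VEC2_BASE
    model = bundle.get_model()
    waveform = torch.randn(1, int(bundle.sample_rate))  # 1 second of dummy audio
    features, _ = model.extract_features(waveform)

    # Wav2Vec2ASRBundle: fine-tuned for ASR; additionally carries the label set.
    asr_bundle = torchaudio.pipelines.WAV2VEC2_ASR_BASE_960H
    asr_model = asr_bundle.get_model()
    emissions, _ = asr_model(waveform)
    labels = asr_bundle.get_labels()
    ```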
- 14 Oct, 2021 1 commit
  - Yi Zhang authored
    * Check CUDA installation
    * Check in build.sh
    * Use USE_CUDA
    * Update pkg_helpers.bash
    * Fix typo
- 13 Oct, 2021 4 commits
  - nateanl authored
  - Caroline Chen authored
  - moto authored
  - nateanl authored
- 12 Oct, 2021 3 commits
  - Caroline Chen authored
  - Yi Zhang authored
  - nateanl authored
- 11 Oct, 2021 6 commits
  - moto authored
    To handle batched input properly.
  - Yi Zhang authored
    * Set cu113 for unittest_windows_gpu
    * Fix old logic
    * Update .circleci/regenerate.py
    Co-authored-by: Nikita Shulga <nikita.shulga@gmail.com>
  - Nikita Shulga authored
    * Limit Windows GPU testing to CUDA 11.3 only, which is the only CUDA version planned to be supported on Windows for the upcoming release
    * Move unit tests to 11.3 as well
  - moto authored
  - moto authored
  - moto authored
- 10 Oct, 2021 1 commit
  - moto authored
    Move the computation of `#classes -> #bits` to the constructor of WaveRNN and attach it to the instance, so that it can be reused elsewhere.
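    A minimal sketch of the relation being cached (illustrative, not the actual torchaudio source): WaveRNN quantizes audio into `n_classes` buckets with `n_classes == 2 ** n_bits`, so the bit depth can be derived once in the constructor:

    ```python
    import math

    class WaveRNNLike:  # hypothetical stand-in for torchaudio.models.WaveRNN
        def __init__(self, n_classes: int = 256):
            self.n_classes = n_classes
            # computed once and attached to the instance for reuse elsewhere
            self.n_bits = int(math.log2(n_classes))  # e.g. 256 classes -> 8 bits

    assert WaveRNNLike().n_bits == 8
    ```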
- 09 Oct, 2021 1 commit
  - moto authored
- 08 Oct, 2021 6 commits
- 07 Oct, 2021 7 commits
  - Caroline Chen authored
  - nateanl authored
  - moto authored
  - Caroline Chen authored
  - moto authored
    This commit merges the wav2vec2/HuBERT factory functions for pre-training and fine-tuning. In #1829, we added parameters that customize aspects of the models that are not part of the architecture; `aux_num_out` falls into this category, so separate functions are no longer necessary. This concludes the wav2vec2/HuBERT API update for release 0.10.

    Summary of the BC-breaking changes to the wav2vec2 APIs between 0.9 and 0.10 (once this commit is incorporated):
    1. `Wav2Vec2Model.extract_features`
       In 0.9, it returned the output of the `FeatureExtractor` module. In 0.10, it returns the list of outputs from the intermediate layers of the `TransformerEncoder` block.
    2. `wav2vec2_base(num_out: int)` -> `wav2vec2_base(<dropout_params: float>, aux_num_out: Optional[int] = None)`
       - `num_out` was renamed to `aux_num_out` and made optional. If it is omitted, the resulting model does not have the linear layer for fine-tuning.
       - Added dropout parameters.
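    A hedged before/after sketch of change 2, using the parameter names from the message above:

    ```python
    import torchaudio

    # 0.9 (BC-broken): wav2vec2_base(num_out=32)

    # 0.10: aux_num_out is optional; omitting it yields a model
    # without the final linear layer used for fine-tuning.
    pretrain_model = torchaudio.models.wav2vec2_base()
    asr_model = torchaudio.models.wav2vec2_base(aux_num_out=32)

    # The newly exposed dropout parameters can be set explicitly.
    custom = torchaudio.models.wav2vec2_base(
        encoder_projection_dropout=0.0,
        encoder_attention_dropout=0.0,
        aux_num_out=32,
    )
    ```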
  - moto authored
  - moto authored
    This commit makes the following changes:
    1. Make the factory function with full customizability public, i.e. `_get_model(...)` -> `wav2vec2_model(...)`.
    2. Change the other architecture-specific factory functions so that they accept parameters that are not related to the model architecture (such as dropout), i.e. `wav2vec2_base()` -> `wav2vec2_base(encoder_projection_dropout, encoder_attention_dropout, encoder_ff_interm_dropout, ...)`.

    ### Why?
    While adding the pre-trained weight support, I realized that separating the APIs for model construction and for pre-trained weights gives a simple code organization, thanks to the good separation of concerns. As mentioned in #1821, in this framework:
    1. the model implementation is responsible for the computation logic,
    2. factory functions are responsible for customizability and model construction,
    3. and the pre-trained weight API is responsible for constructing a model and loading pre-trained weights along with complementary information (such as pre-processing and class labels).
    (Note: for simple models, combining 1 and 2 is also okay.)

    This means that the factory functions have to support all the customizability required by the pre-trained weight API. The previous implementation used an internal function, as in `from .model import Wav2Vec2Model, _get_model`, which was a bit strange. This commit rectifies that by making the mother factory function public. It also clarifies the purpose of the other public factory functions: they are just syntactic sugar for constructing an untrained model with a specific architecture. Accordingly, this commit also adds the supplemental parameters to them.
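    A sketch of calling the now-public mother factory function; the values below follow the standard wav2vec2 "base" configuration and are illustrative rather than copied from the source:

    ```python
    import torchaudio

    model = torchaudio.models.wav2vec2_model(
        extractor_mode="group_norm",
        extractor_conv_layer_config=None,  # None -> the default conv feature extractor
        extractor_conv_bias=False,
        encoder_embed_dim=768,
        encoder_projection_dropout=0.1,
        encoder_pos_conv_kernel=128,
        encoder_pos_conv_groups=16,
        encoder_num_layers=12,
        encoder_num_heads=12,
        encoder_attention_dropout=0.1,
        encoder_ff_interm_features=3072,
        encoder_ff_interm_dropout=0.1,
        encoder_dropout=0.1,
        encoder_layer_norm_first=False,
        encoder_layer_drop=0.1,
        aux_num_out=None,
    )

    # The architecture-specific functions remain as syntactic sugar:
    base = torchaudio.models.wav2vec2_base()
    ```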
- 06 Oct, 2021 6 commits
  - kingyiusuen authored
  - moto authored
    Add pretrained weights from https://github.com/pytorch/fairseq/tree/main/examples/wav2vec#pre-trained-models
    - Wav2Vec 2.0 Base / Large / Large (LV-60)
    - XLSR-53
  - hwangjeff authored
    Adds an implementation of Emformer, a memory-efficient transformer architecture introduced in https://ieeexplore.ieee.org/document/9414560 that targets low-latency streaming speech recognition applications.
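    A hedged usage sketch; the hyperparameters are illustrative, and the module may initially live under a prototype namespace rather than `torchaudio.models`:

    ```python
    import torch
    from torchaudio.models import Emformer

    emformer = Emformer(
        input_dim=80,          # e.g. log-mel filterbank features
        num_heads=4,
        ffn_dim=1024,
        num_layers=4,
        segment_length=16,     # frames processed per streaming segment
        left_context_length=30,
        right_context_length=4,
    )

    batch, frames = 2, 128
    # forward() consumes utterance frames right-padded with right-context frames
    inputs = torch.rand(batch, frames + 4, 80)
    lengths = torch.full((batch,), frames)
    output, output_lengths = emformer(inputs, lengths)
    ```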
  - moto authored
  - moto authored
    This commit adds:
    - HUBERT_LARGE
    - HUBERT_XLARGE
    - HUBERT_ASR_XLARGE
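    A sketch of loading one of the new bundles and running naive greedy decoding over dummy audio, assuming access via the `torchaudio.pipelines` namespace (see the 15 Oct move above; the path at the time of this commit may have differed) and that `-` is the blank token in the returned label set:

    ```python
    import torch
    import torchaudio

    bundle = torchaudio.pipelines.HUBERT_ASR_XLARGE
    model = bundle.get_model().eval()

    waveform = torch.randn(1, int(bundle.sample_rate))  # dummy 1-second input
    with torch.no_grad():
        emission, _ = model(waveform)

    labels = bundle.get_labels()
    # collapse repeated frames, then drop the CTC blank token
    indices = torch.unique_consecutive(emission[0].argmax(dim=-1))
    transcript = "".join(labels[i] for i in indices if labels[i] != "-")
    ```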
  - moto authored