- 13 Oct, 2021 1 commit
-
-
nateanl authored
-
- 12 Oct, 2021 3 commits
-
-
Caroline Chen authored
-
Yi Zhang authored
-
nateanl authored
-
- 11 Oct, 2021 6 commits
-
-
moto authored
To handle batched input properly.
-
Yi Zhang authored
* set cu113 for unittest_windows_gpu
* fix old logic
* Update .circleci/regenerate.py

Co-authored-by: Nikita Shulga <nikita.shulga@gmail.com>
-
Nikita Shulga authored
* Limit Windows GPU testing to CUDA-11.3 only, which is the only CUDA version planned to be supported on Windows for the upcoming release
* Move unittests to 11.3 as well
-
moto authored
-
moto authored
-
moto authored
-
- 10 Oct, 2021 1 commit
-
-
moto authored
Move the computation of `#classes -> #bits` to the constructor of WaveRNN and attach it to the instance, so that it can be reused elsewhere.
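A minimal sketch of the idea, not the actual torchaudio code: the class name `TinyWaveRNN` and the attribute names `n_classes` and `n_bits` are illustrative assumptions, and the sketch assumes the number of classes is a power of two.

```python
class TinyWaveRNN:
    """Hypothetical stand-in for a vocoder; only the bit-depth bookkeeping is shown."""

    def __init__(self, n_classes: int) -> None:
        # Assumption of this sketch: n_classes is a power of two (e.g. 256 or 512).
        if n_classes < 2 or n_classes & (n_classes - 1):
            raise ValueError("n_classes is assumed to be a power of two")
        self.n_classes = n_classes
        # Computed once in the constructor and stored on the instance,
        # so other code can read it instead of recomputing it.
        self.n_bits = n_classes.bit_length() - 1  # exact log2 for powers of two


model = TinyWaveRNN(n_classes=256)
print(model.n_bits)
```

Any code that needs the bit depth can then read `model.n_bits` rather than repeating the conversion.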
-
- 09 Oct, 2021 1 commit
-
-
moto authored
-
- 08 Oct, 2021 6 commits
- 07 Oct, 2021 7 commits
-
-
Caroline Chen authored
-
nateanl authored
-
moto authored
-
Caroline Chen authored
-
moto authored
This commit merges the wav2vec2/HuBERT factory functions for pre-training and fine-tuning.

In #1829, we added parameters that customize the models but are not part of the architecture. `aux_num_out` falls into this category, so it is no longer necessary to have separate functions.

This concludes the wav2vec2/HuBERT API update in release 0.10.

Summary of BC-breaking changes to the wav2vec2 APIs between 0.9 and 0.10 (once this commit is incorporated):

1. `Wav2Vec2Model.extract_features`
   In 0.9, it returned the output of the `FeatureExtractor` module. In 0.10, it returns the list of outputs from the intermediate layers of the `TransformerEncoder` block.
2. `wav2vec2_base(num_out: int)` -> `wav2vec2_base(<dropout_params:float>, aux_num_out: Optional[int]=None)`
   - `num_out` was renamed to `aux_num_out` and made optional. If it is omitted, the resulting model does not have the linear layer for fine-tuning.
   - Added dropout parameters.
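The effect of making `aux_num_out` optional can be sketched without torchaudio. Everything named `Sketch...` or `sketch_...` below is hypothetical, and the dimension 768 is only an illustrative value; the point is that a single factory covers both the pre-training and the fine-tuning case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchModel:
    """Hypothetical description of a wav2vec2-style model (no real layers)."""

    encoder_dim: int
    aux_out_dim: Optional[int]  # None means there is no fine-tuning head

    @property
    def has_aux_head(self) -> bool:
        return self.aux_out_dim is not None


def sketch_base(aux_num_out: Optional[int] = None) -> SketchModel:
    # One factory for both use cases: omitting aux_num_out yields a model
    # without the extra linear layer, passing it yields one with the layer.
    return SketchModel(encoder_dim=768, aux_out_dim=aux_num_out)


pretrain_model = sketch_base()
finetune_model = sketch_base(aux_num_out=32)
print(pretrain_model.has_aux_head, finetune_model.has_aux_head)
```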
moto authored
-
moto authored
This commit makes the following changes:

1. Make the fully customizable factory function public, i.e. `_get_model(...) -> wav2vec2_model(...)`.
2. Change the other architecture-specific factory functions so that they accept parameters not related to the model architecture (such as dropout), i.e. `wav2vec2_base() -> wav2vec2_base(encoder_projection_dropout, encoder_attention_dropout, encoder_ff_interm_dropout, ...)`

### Why?

While adding pre-trained weight support, I realized that separating the API for model construction from the API for pre-trained weights keeps the code organization simple because of the good separation of concerns. As mentioned in #1821, in this framework,

1. the model implementation is responsible for the computation logic,
2. factory functions are responsible for customizability and model construction,
3. and the pre-trained weight API is responsible for constructing a model and loading pre-trained weights along with the complementary information (such as pre-processing and class labels).

(note: for simple models, combining 1 and 2 is also okay.)

This means that factory functions have to support all the customizability required by the pre-trained weight API. The current implementation uses an internal function, as in `from .model import Wav2Vec2Model, _get_model`, which is a bit strange. This PR rectifies that by making the mother factory function public. It also clarifies the purpose of having the other factory functions as public API: they are just syntactic sugar for constructing an untrained model with a specific architecture. So this commit also adds supplemental parameters to them.
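The three responsibilities listed above can be illustrated with a torch-free sketch. All names here (`SketchModel`, `sketch_model`, `sketch_base`, `SketchBundle`) and the numeric values are hypothetical and do not reflect the torchaudio implementation; weight loading is deliberately left out.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass
class SketchModel:
    """1. Model: would own the computation logic (omitted in this sketch)."""

    num_layers: int
    embed_dim: int
    dropout: float


def sketch_model(num_layers: int, embed_dim: int, dropout: float) -> SketchModel:
    """2a. Public 'mother' factory: every parameter is exposed."""
    return SketchModel(num_layers, embed_dim, dropout)


def sketch_base(dropout: float = 0.1) -> SketchModel:
    """2b. Sugar: the architecture is fixed, non-architecture parameters stay open."""
    return sketch_model(num_layers=12, embed_dim=768, dropout=dropout)


@dataclass(frozen=True)
class SketchBundle:
    """3. Pre-trained API: model parameters plus complementary information."""

    params: Dict[str, Any]
    labels: Tuple[str, ...]

    def get_model(self) -> SketchModel:
        # Relies only on the public factory; loading real weights is omitted.
        return sketch_model(**self.params)


bundle = SketchBundle(
    params={"num_layers": 12, "embed_dim": 768, "dropout": 0.0},
    labels=("a", "b", "c"),
)
model = bundle.get_model()
print(model)
```

Because the bundle calls the public factory, it does not need to import anything private from the model module.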
-
- 06 Oct, 2021 7 commits
-
-
kingyiusuen authored
-
moto authored
Add pretrained weights from https://github.com/pytorch/fairseq/tree/main/examples/wav2vec#pre-trained-models
- Wav2Vec 2.0 Base / Large / Large (LV-60)
- XLSR-53
-
hwangjeff authored
Adds an implementation of Emformer, a memory-efficient transformer architecture introduced in https://ieeexplore.ieee.org/document/9414560 that targets low-latency streaming speech recognition applications.
-
moto authored
-
moto authored
This commit adds
- HUBERT_LARGE
- HUBERT_XLARGE
- HUBERT_ASR_XLARGE
-
moto authored
-
moto authored
-
- 05 Oct, 2021 5 commits
-
-
moto authored
-
moto authored
-
moto authored
Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/1817

This changes the imports in `torchaudio` to use the new import locations.

```
codemod -d pytorch/audio --extensions py 'torch.quantization' 'torch.ao.quantization'
```

Reviewed By: mthrok
Differential Revision: D31302450
fbshipit-source-id: f31a0d4f453f840ea690edb688555a9d585787b5
Co-authored-by: Zafar Takhirov <zaf@fb.com>
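For readers without the `codemod` tool, the same textual rename can be approximated in a few lines of Python. This is a rough sketch and not how the commit was produced; the function name `rewrite_quantization_imports` is an assumption made for illustration.

```python
import re

# Match the old module path as a whole dotted name only.
_OLD_PATH = re.compile(r"\btorch\.quantization\b")


def rewrite_quantization_imports(source: str) -> str:
    """Replace `torch.quantization` with `torch.ao.quantization` in source text."""
    return _OLD_PATH.sub("torch.ao.quantization", source)


before = "from torch.quantization import QuantStub\n"
after = rewrite_quantization_imports(before)
print(after, end="")
```

The rewrite is idempotent: text that already uses `torch.ao.quantization` does not contain the old path and is left unchanged.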
-
moto authored
-
nateanl authored
-
- 01 Oct, 2021 1 commit
-
-
moto authored
1. Fix the HuBERT xlarge model config.
2. In the 48 transformer layers of the HuBERT xlarge model, a very small number of elements deviate from the equivalent fairseq model and exceed the default atol of 1e-5. This commit relaxes the tolerance to 3e-5 for that specific test.
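To show what relaxing `atol` from 1e-5 to 3e-5 means in practice, here is a simplified sketch that uses a pure absolute-tolerance check on made-up numbers. The helper `all_close` and the sample values are assumptions for illustration; the real test compares tensors and may also apply a relative tolerance.

```python
from typing import Sequence


def all_close(actual: Sequence[float], expected: Sequence[float], atol: float) -> bool:
    """Return True if every element differs from its reference by at most atol."""
    if len(actual) != len(expected):
        return False
    return all(abs(a - e) <= atol for a, e in zip(actual, expected))


# Made-up values: one element is off by roughly 2e-5.
reference = [0.5, -1.25, 2.0]
ported = [0.5, -1.25, 2.00002]

print(all_close(ported, reference, atol=1e-5))
print(all_close(ported, reference, atol=3e-5))
```

With these values the stricter tolerance rejects the comparison while the relaxed one accepts it, which mirrors the situation described for the xlarge test.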
-
- 30 Sep, 2021 2 commits