1. 28 Sep, 2021 1 commit
    • Add HuBERT model architectures (#1769) · a7854f33
      moto authored
      This commit adds the following HuBERT model architectures
      
       - `base` (pre-training)
       - `large` (pre-training / fine-tuning)
       - `xlarge` (pre-training / fine-tuning)
      
      Since the internal components are the same as in `Wav2Vec2Model`, it reuses the existing modules.
      With these models, it is possible to
      - import the pre-trained models published by `fairseq` and compile them with TorchScript (see the sketch below).
      - fine-tune the existing models for downstream tasks.
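      A minimal sketch of building one of these architectures and scripting it, assuming
      the factory functions are exposed as `torchaudio.models.hubert_base` / `hubert_large`
      / `hubert_xlarge` and can be called without arguments:

          import torch
          import torchaudio

          # Build a randomly initialized HuBERT "base" pre-training architecture.
          # (Pre-trained weights, e.g. converted from fairseq, would be loaded separately.)
          model = torchaudio.models.hubert_base()
          model.eval()

          # The model is TorchScript-able, so it can be exported for non-Python inference.
          scripted = torch.jit.script(model)
          scripted.save("hubert_base.pt")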
  2. 25 Sep, 2021 1 commit
  3. 24 Sep, 2021 1 commit
    • [BC-Breaking] Split pretraining and finetuning factory functions (#1783) · b2e9f1e4
      moto authored
      * [BC-Breaking] Split pretraining and finetuning factory functions
      
      Previously, the wav2vec2 factory functions only generated the fine-tuning
      architecture used in the wav2vec2 paper for the ASR task, that is, the
      pre-training architecture plus a Linear module. They did not provide a
      straightforward way to generate architectures for pre-training.
      
      The goal of the original implementation was to allow inference of wav2vec2
      in non-Python environments via TorchScript. Now we would like to expand it
      to pre-training/fine-tuning and to the HuBERT model as well.
      
      Therefore, we need to have factory functions for both pre-training and
      fine-tuning. This commit introduces new factory functions and separate
      functions for pre-training and fine-tuning.
      
      1. New functions for ASR fine-tuning.
      
      We introduce `wav2vec2_asr_XXX` functions, which generate the architecture
      used for the fine-tuning task in the wav2vec2 paper. *1
      
      2. Re-purpose the old functions
      
      The existing functions, `wav2vec2_XXX`, now generate the architecture with
      the pre-training modules only (no Linear module).
      
      Note
      *1 This architecture is just one way to define an architecture for fine-tuning;
      it is not a universal definition. The new `wav2vec2_asr_XXX` functions are
      designed to provide these specific fine-tuning configurations, and they are not
      meant to support generic architectures for downstream tasks.
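      A minimal sketch of the resulting split, assuming the factory functions are
      exposed under `torchaudio.models` and that the ASR variants take the size of
      the Linear readout as an `aux_num_out` keyword (both names are assumptions here):

          import torchaudio

          # Pre-training architecture only: no Linear readout attached.
          pretrain_model = torchaudio.models.wav2vec2_base()

          # Fine-tuning (ASR) architecture: the pre-training architecture plus a
          # Linear readout producing, e.g., 29 output classes (characters + blank).
          asr_model = torchaudio.models.wav2vec2_asr_base(aux_num_out=29)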
  4. 22 Sep, 2021 1 commit
    • [BC-Breaking] Move fine-tune specific module out of wav2vec2 encoder (#1782) · 40f2a085
      moto authored
      Previously, the Linear module (called `readout`, which is used only for the ASR
      fine-tuning task) was placed in the encoder module. Conceptually, the encoder has
      nothing to do with a module specific to a fine-tuning / downstream task.
      
      The problems here are that:
      1. The encoder can also be used in the pre-training phase, in which such a module
      should not be present.
      2. The choice of the Linear module is arbitrary, and it is inconvenient for users
      to have a hard-coded module structure in the encoder.
      
      Therefore, this commit moves the Linear module out of the encoder and places it
      as the `aux` attribute of `Wav2Vec2Model`. (As a result, `Wav2Vec2Model` has
      `feature_extractor`, `encoder` and `aux` attributes.)
      
      An alternative approach is to define another module that holds `Wav2Vec2Model`
      and the aux module side by side, but that would introduce a new class we need
      to maintain.
      The expected use of `aux` is only for 1. loading the pre-trained parameters
      published by `fairseq` (and its variations from HF) and 2. creating the same model
      architectures for comparison experiments.
      The newly introduced class would not be general enough for downstream adaptations,
      where there will be a variety of more complicated models (e.g. s3prl).

      Therefore, following a minimalistic approach, we put them inside `Wav2Vec2Model`.
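      A small sketch of the resulting structure, assuming the `wav2vec2_asr_base` /
      `wav2vec2_base` factories and the `aux_num_out` keyword described above:

          import torchaudio

          # ASR fine-tuning architecture with a hypothetical 29-class readout.
          model = torchaudio.models.wav2vec2_asr_base(aux_num_out=29)

          # The three top-level attributes described above.
          print(type(model.feature_extractor).__name__)  # convolutional feature extractor
          print(type(model.encoder).__name__)            # transformer encoder
          print(model.aux)                               # Linear readout used for ASR fine-tuning

          # A pre-training model has no fine-tuning head, so `aux` is simply None.
          print(torchaudio.models.wav2vec2_base().aux)   # None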
  5. 20 Sep, 2021 1 commit
    • [BC-Breaking] Update `extract_features` of Wav2Vec2Model (#1776) · 78b08c26
      moto authored
      * [BC-Breaking] Update `extract_features` of Wav2Vec2Model
      
      Originally, the `extract_features` method returned the result from
      the convolutional feature extractor module.

      The features commonly used in downstream tasks, however, are the outputs of
      intermediate layers of the transformer block in the encoder.

      This commit updates the behavior of `extract_features` so that such features
      can be retrieved selectively.
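      A minimal usage sketch, assuming `extract_features` now takes an optional
      `num_layers` argument and returns the list of intermediate transformer outputs
      together with the (optional) valid lengths:

          import torch
          import torchaudio

          model = torchaudio.models.wav2vec2_base()
          model.eval()

          waveform = torch.randn(1, 16000)  # one second of dummy 16 kHz audio

          with torch.inference_mode():
              # Request the outputs of the first two transformer layers only.
              features, lengths = model.extract_features(waveform, num_layers=2)

          print(len(features))      # 2 (one tensor per requested layer)
          print(features[0].shape)  # (batch, frames, feature dim), e.g. (1, 49, 768)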
  6. 02 Sep, 2021 1 commit
  7. 14 Jun, 2021 1 commit
  8. 03 Jun, 2021 1 commit
    • Update docs (#1550) · 0166a851
      moto authored
      * Use `bibtex` for paper citations.
        * add `override.css` for fixing back reference.
        * wav2vec2
        * wav2letter
        * convtasnet
        * deepspeech
        * rnnt-loss
        * griffinlim
      * Fix broken references in `filtering`.
      * Fix note in soundfile backends.
      * Tweak wav2vec2 example.
      * Remove unused `pytorch_theme.css`.
  9. 27 May, 2021 1 commit
    • Add wav2vec2.0 model (#1529) · e6886a4d
      moto authored
      - TorchScript-able `Wav2Vec2Model` class
      - Factory functions for the three configurations presented in the paper (see the sketch below)
        - `wav2vec2_base`
        - `wav2vec2_large`
        - `wav2vec2_large_lv60k`
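      A small sketch instantiating the three configurations and comparing their sizes,
      assuming the factory functions above are exposed under `torchaudio.models` and
      can be called without arguments:

          import torchaudio

          for factory in (
              torchaudio.models.wav2vec2_base,
              torchaudio.models.wav2vec2_large,
              torchaudio.models.wav2vec2_large_lv60k,
          ):
              model = factory()
              n_params = sum(p.numel() for p in model.parameters())
              print(f"{factory.__name__}: {n_params / 1e6:.1f}M parameters")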