1. 27 Sep, 2021 1 commit
    • Enable audio windows cuda tests (#1777) · d98c8847
      Yi Zhang authored
      * enable windows cuda tests
      
      * add this dir
      
      * minor change
      
      * vs integration
      
      * Update cuda_install.bat
      
      * add logs
      
      * minor change
      
      * minor change
      
      * cp vision conda activate
      
      * mv vc_env_helper.bat
      
      * minor change
      
      * exit if cuda not available
      
      * install numpy
      
      * import CMakeLists
      
      * check cuda
      
      * minor change
      
      * change windows GPU image from previous to stable
      
      * set libtorch audio suffix as pyd on Windows
      
      * reduce changes
      
      * check env settings
  2. 26 Sep, 2021 1 commit
  3. 25 Sep, 2021 1 commit
  4. 24 Sep, 2021 4 commits
    • [BC-Breaking] Split pretraining and finetuning factory functions (#1783) · b2e9f1e4
      moto authored
      * [BC-Breaking] Split pretraining and finetuning factory functions
      
      Previously, the wav2vec2 factory functions only generated the fine-tuning
      architecture used in the wav2vec2 paper for the ASR task, that is, the
      pre-training architecture plus a Linear module. They did not provide a
      straightforward way to generate architectures for pre-training.
      
      The goal of the original implementation was to allow inference of
      wav2vec2 in non-Python environments via TorchScript. Now we would like to
      expand it to pre-training/fine-tuning and the HuBERT model as well.
      
      Therefore, we need factory functions for both pre-training and
      fine-tuning. This commit introduces new factory functions, with separate
      functions for pre-training and fine-tuning.
      
      1. New functions for ASR fine-tuning.

      We introduce `wav2vec2_asr_XXX` functions, which generate the architecture
      used for the fine-tuning task in the wav2vec2 paper. *1
      
      2. Re-purpose the old functions.

      The existing functions, `wav2vec2_XXX`, now generate the architecture with
      the pre-training modules only (no Linear module).
      
      Note
      *1 This architecture is just one way to define an architecture for
      fine-tuning; it is not a universal definition. The new `wav2vec2_asr_XXX`
      functions are designed to provide these specific fine-tuning configurations;
      they are not meant to support generic architectures for downstream tasks.
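
      To illustrate the split, here is a minimal sketch, assuming the base-size
      variants are named `wav2vec2_base` / `wav2vec2_asr_base` and that the ASR
      variants take the output dimension as `aux_num_out` (exact names and
      signatures may differ at this commit):

      ```
      import torch
      from torchaudio.models import wav2vec2_base, wav2vec2_asr_base

      # Pre-training architecture only: no task-specific Linear head.
      pretrain_model = wav2vec2_base()

      # Fine-tuning architecture for ASR: adds a Linear head mapping encoder
      # features to `aux_num_out` output classes (e.g. characters).
      asr_model = wav2vec2_asr_base(aux_num_out=32)

      waveform = torch.randn(1, 16000)   # one second of dummy audio at 16 kHz
      emission, _ = asr_model(waveform)  # shape: (batch, frame, aux_num_out)
      ```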
    • Fix build on Windows with CUDA (#1787) · cf0adb28
      Yi Zhang authored
      This commit fixes the local build on Windows with CUDA.
    • nateanl · 8d83a2f4
    • Yi Zhang · 56a010b0
  5. 23 Sep, 2021 1 commit
  6. 22 Sep, 2021 3 commits
    • [BC-Breaking] Move fine-tune specific module out of wav2vec2 encoder (#1782) · 40f2a085
      moto authored
      Previously, the Linear module (called `readout`, used only for an ASR
      fine-tuning task) was placed in the encoder module. Conceptually, the
      encoder has nothing to do with a module specific to a fine-tuning /
      downstream task.
      
      The problems here are that:
      1. The encoder can also be used in the pre-training phase, in which such a
      module should not be present.
      2. The choice of a Linear module is arbitrary, and a hard-coded module
      structure in the encoder is inconvenient for users.
      
      Therefore, this commit moves the Linear module out of the encoder and places
      it as the `aux` attribute of `Wav2Vec2Model`. (As a result, `Wav2Vec2Model`
      has `feature_extractor`, `encoder` and `aux` attributes.)
      
      An alternative approach is to define another module and place `Wav2Vec2Model`
      and the aux module alongside each other, but that would introduce a new class
      we need to maintain.
      The expected use of `aux` is only for 1. loading the pre-trained parameters
      published by `fairseq` (and its variations from HF) and 2. creating the same
      model architectures for comparison experiments.
      The newly introduced class would not be general enough for downstream
      adaptations, where there will be a bunch of different, more complicated
      models (e.g. s3prl).

      Therefore, based on the minimalistic approach, we put them inside `Wav2Vec2Model`.
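
      A minimal sketch of the resulting structure, assuming `aux` is `None` for
      the pre-training variants and a Linear head for the ASR variants (factory
      names as in #1783; details may differ at this commit):

      ```
      from torchaudio.models import wav2vec2_base, wav2vec2_asr_base

      # ASR fine-tuning model: the Linear head lives in `aux`, not in the encoder.
      asr_model = wav2vec2_asr_base(aux_num_out=32)
      print(asr_model.feature_extractor)  # convolutional feature extractor
      print(asr_model.encoder)            # transformer encoder, no readout inside
      print(asr_model.aux)                # Linear head applied after the encoder

      # Pre-training model: no task-specific head attached.
      pretrain_model = wav2vec2_base()
      assert pretrain_model.aux is None
      ```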
    • Fix HF model integration (#1781) · e9cab8f8
      moto authored
      * Fix HF model integration
      
      Previously, when testing wav2vec models from HF transformers, all the models
      were instantiated as the `Wav2Vec2ForCTC` class, while some of them were
      supposed to be `Wav2Vec2Model`.

      Fixing this revealed that the model importer could not correctly handle
      `Wav2Vec2Model` imports.

      This PR fixes these issues.
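
      For context, a minimal sketch of the import path in question, assuming the
      `import_huggingface_model` utility as the entry point (the exact location
      of the helper may differ at this commit):

      ```
      from transformers import Wav2Vec2Model, Wav2Vec2ForCTC
      from torchaudio.models.wav2vec2.utils import import_huggingface_model

      # A CTC fine-tuned checkpoint: carries a task-specific head.
      ctc = import_huggingface_model(
          Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h"))

      # A pre-trained-only checkpoint must be loaded as Wav2Vec2Model rather
      # than Wav2Vec2ForCTC; this distinction is what the fix addresses.
      pretrained = import_huggingface_model(
          Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base"))
      ```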
    • Update reference from master to main elsewhere (#1784) · 1b4b82e0
      moto authored

      Summary: Update fairseq reference from master to main elsewhere
      
      Reviewed By: alexeib
      
      Differential Revision: D30938472
      
      fbshipit-source-id: 243b98550207f241c9d3265bf3d4060350aaf0a8
      Co-authored-by: Diana Liskovich <dianaml@fb.com>
  7. 21 Sep, 2021 1 commit
  8. 20 Sep, 2021 3 commits
    • [BC-Breaking] Update `extract_features` of Wav2Vec2Model (#1776) · 78b08c26
      moto authored
      * [BC-Breaking] Update `extract_features` of Wav2Vec2Model
      
      Originally, the `extract_features` method returned the result from
      the convolutional feature extractor module.

      The features commonly used in downstream tasks are the outputs from
      intermediate layers of the transformer block in the encoder.

      This commit updates the behavior of `extract_features` so that such
      features can be retrieved selectively.
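
      A minimal usage sketch of the updated behavior, assuming the signature
      `extract_features(waveform, lengths=None, num_layers=None)` returning one
      Tensor per transformer layer (the signature may differ at this commit):

      ```
      import torch
      from torchaudio.models import wav2vec2_base

      model = wav2vec2_base()
      waveform = torch.randn(1, 16000)

      # Retrieve the outputs of the first four transformer layers instead of
      # the convolutional feature extractor output.
      features, lengths = model.extract_features(waveform, num_layers=4)
      print(len(features))      # 4: one Tensor per requested layer
      print(features[0].shape)  # (batch, frame, feature)
      ```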
    • Put libtorchaudio in lib directory (#1773) · 599a82b7
      moto authored
      Make the structure of the library files somewhat similar to PyTorch core, which has the following pattern:
      
      ```
      torch/_C.so
      torch/lib/libc10.so
      torch/lib/libtorch.so
      ...
      ```
      
      Correspondingly, `torchaudio` now uses:

      ```
      torchaudio/_torchaudio.so
      torchaudio/lib/libtorchaudio.so
      ```
    • Move MVDR and PSD modules to transforms (#1771) · ac97ad82
      nateanl authored
  9. 17 Sep, 2021 3 commits
  10. 16 Sep, 2021 1 commit
    • Split extension into custom impl and Python wrapper libraries (#1752) · 0f822179
      moto authored
      * Split `libtorchaudio` and `_torchaudio`
      
      This change extracts the core implementation from `_torchaudio` into
      `libtorchaudio`, so that `libtorchaudio` is reusable in TorchScript-based
      apps.

      `_torchaudio` is a wrapper around `libtorchaudio` and only provides the
      PyBind11-based features (currently, file-like object support in I/O).
      
      * Removed `BUILD_LIBTORCHAUDIO` option
      
      When invoking `cmake`, `libtorchaudio` is always built, so this option is removed.
      
      The new assumptions around library discoverability are:
      
      - In the regular OSS workflow (`pip`/`conda`-based binary installation), both `libtorchaudio` and `_torchaudio` are present.
          In this case, `libtorchaudio` has to be loaded manually with `torch.ops.load_library` and/or `torch.classes.load_library`; otherwise importing `_torchaudio` would not be able to resolve the symbols defined in `libtorchaudio`.
      - When `torchaudio` is deployed in the PEX format (a single zip file):
        - We expect `libtorchaudio.so` to exist as a file in some search path configured by the client code.
        - `_torchaudio` is still importable, and because we do not know where `libtorchaudio` will exist, we let the dynamic loader resolve the dependency from `_torchaudio` to `libtorchaudio`, which should work as long as `libtorchaudio` is in a library search path (the search path is not modifiable from an already-running Python process).
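
      A minimal sketch of the manual loading step described in the first bullet,
      assuming the library is shipped under `torchaudio/lib/` as in #1773 (the
      path resolution here is illustrative, not the package's actual loader):

      ```
      import os.path

      import torch
      import torchaudio

      # Illustrative path resolution: locate libtorchaudio inside the installed
      # torchaudio package (the file extension differs per platform).
      lib_path = os.path.join(
          os.path.dirname(torchaudio.__file__), "lib", "libtorchaudio.so")

      # Register the custom operators and TorchBind classes defined in
      # libtorchaudio before the symbols are needed by `_torchaudio`.
      torch.ops.load_library(lib_path)
      torch.classes.load_library(lib_path)
      ```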
  11. 15 Sep, 2021 4 commits
  12. 13 Sep, 2021 1 commit
  13. 07 Sep, 2021 2 commits
    • Update the way to access libsox global config (#1755) · e11d27ce
      moto authored
      * Update the way to access libsox global config
      
      Preparation for splitting `libtorchaudio` and `_torchaudio`.
      
      When two libraries are compiled separately and each of them does
      `#include <sox.h>` independently, two copies of libsox's static global
      variables (`sox_globals_t`) are created.

      Our code should refer to the same instance. To achieve this, `_torchaudio`
      should access the global variable defined in `libtorchaudio` via the custom
      utility functions, and it should not use `sox_get_globals` directly.
    • Extract PyBind11 feature implementations (#1739) · 2a67fcc1
      moto authored
      This PR moves the code related to PyBind11 to the dedicated directory `torchaudio/csrc/pybind`.

      Before, the features related to PyBind11 (I/O for file-like objects) were implemented in `torchaudio/csrc/sox`, and the binding was defined in `torchaudio/csrc/pybind.cpp`. We used the macro definition `TORCH_API_INCLUDE_EXTENSION_H` to turn the feature on/off, in addition to including/excluding `torchaudio/csrc/pybind.cpp` in the list of compiled sources.

      Previously, for the C++ example, one had to rebuild libtorchaudio separately; by splitting them completely at compile time, it should be conceptually possible to distribute libtorchaudio within the torchaudio Python package and reuse it for the C++ example.
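
      For reference, a minimal sketch of the file-like object I/O that this
      PyBind11 layer backs, assuming the sox_io backend accepts any object with
      a `read` method (illustrative usage, not a spec of the binding):

      ```
      import io

      import torchaudio

      # File-like object support is the PyBind11-backed feature: torchaudio.load
      # accepts an in-memory buffer in addition to a file path.
      with open("test.wav", "rb") as f:
          buffer = io.BytesIO(f.read())

      waveform, sample_rate = torchaudio.load(buffer)
      ```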
  14. 02 Sep, 2021 6 commits
  15. 01 Sep, 2021 1 commit
  16. 31 Aug, 2021 4 commits
  17. 30 Aug, 2021 3 commits