1. 04 Jun, 2022 1 commit
  2. 03 Jun, 2022 5 commits
  3. 02 Jun, 2022 5 commits
  4. 01 Jun, 2022 8 commits
  5. 31 May, 2022 2 commits
  6. 30 May, 2022 1 commit
  7. 29 May, 2022 2 commits
  8. 28 May, 2022 1 commit
    • moto's avatar
      Update I/O initialization (#2417) · 65ab62e6
      moto authored
      Summary:
      Attempt to load ffmpeg extension at the top level import
      
      Preparation to use ffmpeg-based I/O as a fallback for sox_io backend.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2417
      
      Reviewed By: carolineechen
      
      Differential Revision: D36736989
      
      Pulled By: mthrok
      
      fbshipit-source-id: 0beb6f459313b5ea91597393ccb12571444c54d9
      65ab62e6
  9. 27 May, 2022 1 commit
    • moto's avatar
      Refactor Streamer to StreamReader in C++ codebase (#2403) · 9ef6c23d
      moto authored
      Summary:
      * `Streamer` has been renamed to `StreamReader` when it was moved from prototype to beta.
      This commit applies the same name change to the C++ source code.
      
      * Fix miscellaneous lint issues
      
      * Make the code compilable on FFmpeg 5
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2403
      
      Reviewed By: carolineechen
      
      Differential Revision: D36613053
      
      Pulled By: mthrok
      
      fbshipit-source-id: 69fedd6720d488dadf4dfe7d375ee76d216b215d
      9ef6c23d
  10. 26 May, 2022 1 commit
  11. 24 May, 2022 2 commits
  12. 23 May, 2022 3 commits
  13. 21 May, 2022 1 commit
    • moto's avatar
      Add file-like object support to Streaming API (#2400) · a984872d
      moto authored
      Summary:
      This commit adds file-like object support to Streaming API.
      
      ## Features
      - File-like objects are expected to implement `read(self, n)`.
      - Additionally `seek(self, offset, whence)` is used if available.
      - Without `seek` method, some formats cannot be decoded properly.
        - To work around this, one can use the existing `decoder` option to tell what decoder it should use.
        - The set of `decoder` and `decoder_option` arguments were added to `add_basic_[audio|video]_stream` method, similar to `add_[audio|video]_stream`.
        - So as to have the arguments common to both audio and video in front of the rest of the arguments, the order of the arguments are changed.
        - Also `dtype` and `format` arguments were changed to make them consistent across audio/video methods.
      
      ## Code structure
      
      The approach is very similar to how file-like object is supported in sox-based I/O.
      In Streaming API if the input src is string, it is passed to the implementation bound with TorchBind,
      if the src has `read` attribute, it is passed to the same implementation bound via PyBind 11.
      
      ![Untitled drawing](https://user-images.githubusercontent.com/855818/169098391-6116afee-7b29-460d-b50d-1037bb8a359d.png)
      
      ## Refactoring involved
      - Extracted to https://github.com/pytorch/audio/issues/2402
        - Some implementation in the original TorchBind surface layer is converted to Wrapper class so that they can be re-used from PyBind11 bindings. The wrapper class serves to simplify the binding.
        - `add_basic_[audio|video]_stream` methods were removed from C++ layer as it was just constructing string and passing it to `add_[audio|video]_stream` method, which is simpler to do in Python.
        - The original core Streamer implementation kept the use of types in `c10` namespace minimum. All the `c10::optional` and `c10::Dict` were converted to the equivalents of `std` at binding layer. But since they work fine with PyBind11, Streamer core methods deal them directly.
      
      ## TODO:
      - [x] Check if it is possible to stream MP4 (yuv420p) from S3 and directly decode (with/without HW decoding).
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2400
      
      Reviewed By: carolineechen
      
      Differential Revision: D36520073
      
      Pulled By: mthrok
      
      fbshipit-source-id: a11d981bbe99b1ff0cc356e46264ac8e76614bc6
      a984872d
  14. 20 May, 2022 3 commits
  15. 19 May, 2022 2 commits
    • Eli Uriegas's avatar
      ci: Install libomp on macos (#2404) · 38cf5b7a
      Eli Uriegas authored
      Summary:
      To resolve nightly / general build issues relating to OpenMP not being found, see https://hud.pytorch.org/pytorch/audio/commit/c6a376cc5679c1940e49fc3e0ba22eaead6c2467
      
      
      
      ```
      -- Found Torch: /Users/distiller/miniconda3/envs/env3.10/lib/python3.10/site-packages/torch/lib/libtorch.dylib
      CMake Error at /Users/distiller/miniconda3/envs/env3.10/lib/python3.10/site-packages/cmake/data/CMake.app/Contents/share/cmake-3.22/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
        Could NOT find OpenMP_C (missing: OpenMP_C_FLAGS OpenMP_C_LIB_NAMES)
      Call Stack (most recent call first):
        /Users/distiller/miniconda3/envs/env3.10/lib/python3.10/site-packages/cmake/data/CMake.app/Contents/share/cmake-3.22/Modules/FindPackageHandleStandardArgs.cmake:594 (_FPHSA_FAILURE_MESSAGE)
        /Users/distiller/miniconda3/envs/env3.10/lib/python3.10/site-packages/cmake/data/CMake.app/Contents/share/cmake-3.22/Modules/FindOpenMP.cmake:544 (find_package_handle_standard_args)
        CMakeLists.txt:131 (find_package)
      
      -- Configuring incomplete, errors occurred!
      ```
      Signed-off-by: default avatarEli Uriegas <eliuriegas@fb.com>
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2404
      
      Reviewed By: atalman
      
      Differential Revision: D36495791
      
      Pulled By: seemethere
      
      fbshipit-source-id: 7b6fa2a62fda6fc468cfcbdf8d2163e6b9c327b0
      38cf5b7a
    • moto's avatar
      Refactor Streamer implementation (#2402) · eed57534
      moto authored
      Summary:
      * Move the helper wrapping code in TorchBind layer to proper wrapper class for so that it will be re-used in PyBind11.
      * Move `add_basic_[audio|video]_stream` methods from C++ to Python, as they are just string manipulation. This will make PyBind11-based binding simpler as it needs not to deal with dtype.
      * Move `add_[audio|video]_stream` wrapper signature to Streamer core, so that Streamer directly deals with `c10::optional`.†
      
      † Related to this, there is a slight change in how the empty filter expression is stored. Originally, if an empty filter expression was given to `add_[audio|video]_stream` method, the `StreamReaderOutputStream` was showing it as empty string `""`, even though internally it was using `"anull"` or `"null"`. Now `StreamReaderOutputStream` shows the corresponding filter expression that is actually being used.
      
      Ref https://github.com/pytorch/audio/issues/2400
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2402
      
      Reviewed By: nateanl
      
      Differential Revision: D36488808
      
      Pulled By: mthrok
      
      fbshipit-source-id: 877ca731364d10fc0cb9d97e75d55df9180f2047
      eed57534
  16. 18 May, 2022 1 commit
    • Zhaoheng Ni's avatar
      Add feature_grad_mult argument to HuBERTPretrainModel (#2335) · 647f28e4
      Zhaoheng Ni authored
      Summary:
      In Wav2Vec2 and HuBERT model training, the convolutional feature extraction layers use `group_norm` for normalization in `Base` model, while they use `layer_norm` in `Large` and `XLarge` models. For `Base` model, the gradients of feature extraction layers will be unstable in pre-training, thus we need to scale down the gradient by multiplying 0.1.
      
      In this PR, we add such argument to `HuBERTPretrainModel` to control the gradient of feature extractor layers. We also put the argument in the factory functions (`hubert_pretrain_base`, `hubert_pretrain_large`, and `hubert_pretrain_xlarge`. The reason is in finetuning, the feature extractor's parameters are fixed, we can multiply the gradient with 0.0 to avoid back propagating gradients.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2335
      
      Reviewed By: xiaohui-zhang, mthrok
      
      Differential Revision: D35646928
      
      Pulled By: nateanl
      
      fbshipit-source-id: 6a9563e227aac6e3127b634357946d860f26c994
      647f28e4
  17. 17 May, 2022 1 commit