1. 25 Aug, 2022 1 commit
  2. 24 Aug, 2022 1 commit
    • moto's avatar
      Add StreamWriter (#2628) · 72404de9
      moto authored
      Summary:
      This commit adds FFmpeg-based encoder StreamWriter class.
      StreamWriter is pretty much the opposite of StreamReader class, and
      it supports;
      
      * Encoding audio / still image / video
      * Exporting to local file / streaming protocol / devices etc...
      * File-like object support (in later commit)
      * HW video encoding (in later commit)
      
      See also: https://fburl.com/gslide/z85kn5a9 (Meta internal)
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2628
      
      Reviewed By: nateanl
      
      Differential Revision: D38816650
      
      Pulled By: mthrok
      
      fbshipit-source-id: a9343b0d55755e186971dc96fb86eb52daa003c8
      72404de9
  3. 23 Aug, 2022 2 commits
  4. 22 Aug, 2022 2 commits
  5. 20 Aug, 2022 1 commit
  6. 19 Aug, 2022 2 commits
    • Moto Hira's avatar
      Refactor sox pybind source code (#2636) · 789adf07
      Moto Hira authored
      Summary:
      Pull Request resolved: https://github.com/pytorch/audio/pull/2636
      
      At the early stage of torchaudio extension module,
      `torchaudio/csrc/pybind` directory was created so that
      all the code defining Python interface would be placed
      there and there will be only one extension module called
      `torchaudio._torchaudio`.
      
      However, the codebase has been evolved in a way separate
      extensions are defined for each feature (third party
      dependency) for the sake of more moduler file organization.
      
      What is left in `csrc/pybind` is libsox Python bindings.
      This commit moves it under `csrc/sox`.
      
      Follow-up rename `torchaudio._torchaudio` to `torchaudio._torchaudio_sox`.
      
      Reviewed By: carolineechen
      
      Differential Revision: D38829253
      
      fbshipit-source-id: 3554af45a2beb0f902810c5548751264e093f28d
      789adf07
    • moto's avatar
      Update README.md (#2633) · 0b7f2fba
      moto authored
      Summary:
      Update compatibility matrix
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2633
      
      Reviewed By: nateanl
      
      Differential Revision: D38827670
      
      Pulled By: mthrok
      
      fbshipit-source-id: 5c66bf60a06e37919ee725a5f4adf571e6c89100
      0b7f2fba
  7. 18 Aug, 2022 6 commits
  8. 16 Aug, 2022 4 commits
  9. 15 Aug, 2022 3 commits
  10. 12 Aug, 2022 1 commit
  11. 11 Aug, 2022 1 commit
  12. 10 Aug, 2022 3 commits
  13. 09 Aug, 2022 1 commit
    • Caroline Chen's avatar
      Add NNLM support to CTC Decoder (#2528) · 03a0d68e
      Caroline Chen authored
      Summary:
      Expose flashlight's LM and LMState classes to support decoding with custom language models, including NN LMs.
      
      The `ctc_decoder` API is as follows
      - To decode with KenLM, pass in KenLM language model path to `lm` variable
      - To decode with custom LM, create Python class with `CTCDecoderLM` subclass, and pass in the class to `lm` variable. Additionally create a file of LM words listed in order of the LM index, with a word per line, and pass in the file to `lm_path`.
      - To decode without a language model, set `lm` to `None` (default)
      
      Validated against fairseq w2l decoder on sample LibriSpeech dataset and LM. Code for validation can be found [here](https://github.com/facebookresearch/fairseq/compare/main...carolineechen:fairseq:ctc-decoder). Also added unit tests to validate custom implementations of ZeroLM and KenLM, and also using a biased LM.
      
      Follow ups:
      - Train simple LM on LibriSpeech and demonstrate usage in tutorial or examples directory
      
      cc jacobkahn
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2528
      
      Reviewed By: mthrok
      
      Differential Revision: D38243802
      
      Pulled By: carolineechen
      
      fbshipit-source-id: 445e78f6c20bda655aabf819fc0f771fe68c73d7
      03a0d68e
  14. 08 Aug, 2022 1 commit
  15. 05 Aug, 2022 4 commits
  16. 04 Aug, 2022 1 commit
  17. 03 Aug, 2022 2 commits
    • Sean Kim's avatar
      Add HDEMUCS_HIGH_MUSDB (#2601) · 6ecc11c2
      Sean Kim authored
      Summary:
      Add new model pretrained weights and tests
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2601
      
      Reviewed By: carolineechen, nateanl
      
      Differential Revision: D38396673
      
      Pulled By: skim0514
      
      fbshipit-source-id: e06f97d28508543bc18e671344386a947bc870c1
      6ecc11c2
    • bshall's avatar
      An implemenation of the ITU-R BS.1770-4 loudness recommendation (#2472) · 946b180a
      bshall authored
      Summary:
      I took a stab at implementing the ITU-R BS.1770-4 loudness recommendation (closes https://github.com/pytorch/audio/issues/1205). To give some more details:
      - I've implemented K-weighting following csteinmetz1 instead of BrechtDeMan since it fit well with torchaudio's already implemented filters (`treble_biquad` and `highpass_biquad`).
      - I've added four audio files to test compliance with the recommendation. These are linked in [this pdf](https://www.itu.int/dms_pub/itu-r/opb/rep/R-REP-BS.2217-2-2016-PDF-E.pdf). There are many more test files there but I didn't want to bog down the assets directory with too many files. Let me know if I should add or remove anything.
      - I've kept many of the constant internal to the function (e.g. the block duration, overlap, and the absolute threshold gamma). I'm not sure if these should be exposed in the signature.
      - I've implemented support for up to 5 channels (following both csteinmetz1 and BrechtDeMan). The recommendation includes weights for up to 24 channels. Is there any convention for how many channels to support?
      
      I hope this is helpful! looking forward to hearing from you.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2472
      
      Reviewed By: hwangjeff
      
      Differential Revision: D38389155
      
      Pulled By: carolineechen
      
      fbshipit-source-id: fcc86d864c04ab2bedaa9acd941ebc4478ca6904
      946b180a
  18. 02 Aug, 2022 1 commit
  19. 01 Aug, 2022 3 commits