1. 06 Sep, 2022 3 commits
    • Fix random Gaussian generation (#2639) · 3430fd68
      Ravi Makhija authored
      Summary:
      This PR addresses the bug raised in issue https://github.com/pytorch/audio/issues/2634.
      
      In particular, the Box-Muller transform was previously used to generate Gaussian variates for dithering from `torch.rand` uniform variates, but it was implemented incorrectly (e.g. the same uniform variate was fed to both inputs of the transform, rather than two different uniform variates), which produced a different (non-Gaussian) distribution. This PR instead uses `torch.randn` to generate the Gaussian variates.
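
The difference between the two variants can be sketched with the standard library alone. This is an illustration of the class of bug described above, not the torchaudio code that was fixed; the function names `box_muller` and `box_muller_same_input` are hypothetical.

```python
import math
import random


def box_muller(u1: float, u2: float) -> float:
    """Standard Box-Muller: two independent uniforms in (0, 1] -> one N(0, 1) variate."""
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)


def box_muller_same_input(u: float) -> float:
    """The kind of mistake described above: one uniform reused for both inputs."""
    return box_muller(u, u)


def sample_stats(samples):
    """Return (mean, population variance) of a list of floats."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    return mean, var


rng = random.Random(0)
n = 50_000
# 1 - rng.random() lies in (0, 1], which keeps log() finite.
correct = [box_muller(1.0 - rng.random(), 1.0 - rng.random()) for _ in range(n)]
reused = [box_muller_same_input(1.0 - rng.random()) for _ in range(n)]

correct_mean, correct_var = sample_stats(correct)
reused_mean, reused_var = sample_stats(reused)
print(f"independent inputs: mean={correct_mean:.3f} var={correct_var:.3f}")
print(f"reused input:       mean={reused_mean:.3f} var={reused_var:.3f}")
```

With independent inputs the sample mean and variance should land close to 0 and 1. With a reused input the radius and the angle are no longer independent, so nothing guarantees a Gaussian result. Drawing the variates directly, as the PR does with `torch.randn`, avoids the hand-written transform altogether.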
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2639
      
      Reviewed By: mthrok
      
      Differential Revision: D39101144
      
      Pulled By: carolineechen
      
      fbshipit-source-id: 691e49679f6598ef0a1675f6f4ee721ef32215fd
    • Add metadata function for LibriSpeech (#2653) · 08d3bb17
      Caroline Chen authored
      Summary:
      Adds support for metadata mode, requested in https://github.com/pytorch/audio/issues/2539, via a new public `get_metadata()` function on the dataset. Users can call this function directly to fetch metadata for individual dataset indices, or subclass the dataset and override `__getitem__` to call `get_metadata`, creating a dataset class that operates in metadata mode.
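
The subclassing pattern described above might look roughly like the following. `ToyDataset` is a hypothetical stand-in so the sketch runs without downloading LibriSpeech; the tuple fields (path, sample rate, transcript) are made up for illustration and are not taken from the commit.

```python
from typing import List, Tuple


class ToyDataset:
    """Hypothetical stand-in for a dataset that exposes get_metadata()."""

    def __init__(self, entries: List[Tuple[str, int, str]]) -> None:
        # Each entry: (relative audio path, sample rate, transcript).
        self._entries = entries

    def get_metadata(self, n: int) -> Tuple[str, int, str]:
        """Return the n-th entry without reading any audio."""
        return self._entries[n]

    def _load_audio(self, path: str) -> List[float]:
        # Placeholder for real decoding; returns silence so the sketch runs anywhere.
        return [0.0] * 4

    def __getitem__(self, n: int):
        """Return the n-th entry with the path replaced by decoded audio."""
        path, sample_rate, transcript = self.get_metadata(n)
        return self._load_audio(path), sample_rate, transcript

    def __len__(self) -> int:
        return len(self._entries)


class ToyMetadataDataset(ToyDataset):
    """Metadata mode: indexing returns metadata only and never decodes audio."""

    def __getitem__(self, n: int) -> Tuple[str, int, str]:
        return self.get_metadata(n)


entries = [("a/0001.flac", 16000, "HELLO WORLD")]
full = ToyDataset(entries)
meta = ToyMetadataDataset(entries)
print(full[0])
print(meta[0])
```

Keeping `get_metadata` public means both access styles share one source of truth: the default `__getitem__` builds on it, and the metadata-only subclass simply returns it.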
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2653
      
      Reviewed By: nateanl, mthrok
      
      Differential Revision: D39105114
      
      Pulled By: carolineechen
      
      fbshipit-source-id: 6f26f1402a053dffcfcc5d859f87271ed5923348
    • Remove obsolete examples (#2655) · 4a20c412
      Peter Albert authored
      Summary:
      Pull Request resolved: https://github.com/pytorch/audio/pull/2655
      
      Removed obsolete example and the corresponding test
      
      Reviewed By: mthrok
      
      Differential Revision: D39260253
      
      fbshipit-source-id: 0bde71ffd75dd0c94a5cc4a9940f4648a5d61bd7
  2. 02 Sep, 2022 1 commit
  3. 01 Sep, 2022 1 commit
  4. 26 Aug, 2022 3 commits
  5. 25 Aug, 2022 1 commit
  6. 24 Aug, 2022 1 commit
    • Add StreamWriter (#2628) · 72404de9
      moto authored
      Summary:
      This commit adds the FFmpeg-based encoder class StreamWriter.
      StreamWriter is essentially the counterpart of the StreamReader class, and
      it supports:
      
      * Encoding audio / still image / video
      * Exporting to a local file / streaming protocol / device, etc.
      * File-like object support (in later commit)
      * HW video encoding (in later commit)
      
      See also: https://fburl.com/gslide/z85kn5a9 (Meta internal)
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2628
      
      Reviewed By: nateanl
      
      Differential Revision: D38816650
      
      Pulled By: mthrok
      
      fbshipit-source-id: a9343b0d55755e186971dc96fb86eb52daa003c8
  7. 23 Aug, 2022 2 commits
  8. 22 Aug, 2022 2 commits
  9. 20 Aug, 2022 1 commit
  10. 19 Aug, 2022 2 commits
    • Refactor sox pybind source code (#2636) · 789adf07
      Moto Hira authored
      Summary:
      Pull Request resolved: https://github.com/pytorch/audio/pull/2636
      
      In the early stages of the torchaudio extension module, the
      `torchaudio/csrc/pybind` directory was created so that
      all the code defining the Python interface would live
      there, and there would be only one extension module, called
      `torchaudio._torchaudio`.
      
      However, the codebase has since evolved so that separate
      extensions are defined for each feature (third-party
      dependency), for the sake of a more modular file organization.
      
      All that is left in `csrc/pybind` is the libsox Python bindings.
      This commit moves them under `csrc/sox`.
      
      Follow-up: rename `torchaudio._torchaudio` to `torchaudio._torchaudio_sox`.
      
      Reviewed By: carolineechen
      
      Differential Revision: D38829253
      
      fbshipit-source-id: 3554af45a2beb0f902810c5548751264e093f28d
    • Update README.md (#2633) · 0b7f2fba
      moto authored
      Summary:
      Update compatibility matrix
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2633
      
      Reviewed By: nateanl
      
      Differential Revision: D38827670
      
      Pulled By: mthrok
      
      fbshipit-source-id: 5c66bf60a06e37919ee725a5f4adf571e6c89100
  11. 18 Aug, 2022 6 commits
  12. 16 Aug, 2022 4 commits
  13. 15 Aug, 2022 3 commits
  14. 12 Aug, 2022 1 commit
  15. 11 Aug, 2022 1 commit
  16. 10 Aug, 2022 3 commits
  17. 09 Aug, 2022 1 commit
    • Add NNLM support to CTC Decoder (#2528) · 03a0d68e
      Caroline Chen authored
      Summary:
      Expose flashlight's LM and LMState classes to support decoding with custom language models, including NN LMs.
      
      The `ctc_decoder` API is as follows:
      - To decode with KenLM, pass the path to the KenLM language model to the `lm` variable.
      - To decode with a custom LM, create a Python class that subclasses `CTCDecoderLM` and pass it in to the `lm` variable. Additionally, create a file of LM words listed in order of the LM index, one word per line, and pass the file to `lm_path`.
      - To decode without a language model, set `lm` to `None` (default).
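
The shape of a custom LM can be sketched in plain Python. Everything below is a hypothetical stand-in: `ToyLM` and `ToyLMState` are not the torchaudio or flashlight classes, and the method names `start`, `score`, and `finish` are an assumed flashlight-style interface rather than something stated in this commit. The sketch shows a zero-scoring LM and a biased LM of the kind the unit tests mention.

```python
from typing import Dict, List, Tuple


class ToyLMState:
    """Hypothetical LM state: a trie node keyed by word index."""

    def __init__(self) -> None:
        self.children: Dict[int, "ToyLMState"] = {}

    def child(self, word_index: int) -> "ToyLMState":
        # Reuse an existing child so identical histories share one state object.
        return self.children.setdefault(word_index, ToyLMState())


class ToyLM:
    """Hypothetical base interface (assumed): start, score one word, finish."""

    def start(self, start_with_nothing: bool) -> ToyLMState:
        raise NotImplementedError

    def score(self, state: ToyLMState, word_index: int) -> Tuple[ToyLMState, float]:
        raise NotImplementedError

    def finish(self, state: ToyLMState) -> Tuple[ToyLMState, float]:
        raise NotImplementedError


class ZeroLM(ToyLM):
    """Assigns score 0 to every word, so it never changes the ranking."""

    def __init__(self) -> None:
        self._root = ToyLMState()

    def start(self, start_with_nothing: bool) -> ToyLMState:
        return self._root

    def score(self, state: ToyLMState, word_index: int) -> Tuple[ToyLMState, float]:
        return state.child(word_index), 0.0

    def finish(self, state: ToyLMState) -> Tuple[ToyLMState, float]:
        return state, 0.0


class BiasedLM(ZeroLM):
    """Adds a fixed bonus whenever one favoured word index is scored."""

    def __init__(self, favoured_index: int, bonus: float) -> None:
        super().__init__()
        self._favoured_index = favoured_index
        self._bonus = bonus

    def score(self, state: ToyLMState, word_index: int) -> Tuple[ToyLMState, float]:
        bonus = self._bonus if word_index == self._favoured_index else 0.0
        return state.child(word_index), bonus


def sequence_score(lm: ToyLM, word_indices: List[int]) -> float:
    """Total LM score of a word sequence, accumulated the way a decoder would query it."""
    state = lm.start(False)
    total = 0.0
    for index in word_indices:
        state, step = lm.score(state, index)
        total += step
    _, final = lm.finish(state)
    return total + final


# The word file lists one word per line, in LM-index order (index 0 first).
words = ["hello", "world"]
word_file_text = "\n".join(words)

print(sequence_score(ZeroLM(), [0, 1]))
print(sequence_score(BiasedLM(favoured_index=1, bonus=1.0), [1, 0, 1]))
```

The position of each word in the file is what ties the decoder's word indices to the LM's vocabulary, which is why the order of lines matters.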
      
      Validated against the fairseq w2l decoder on a sample LibriSpeech dataset and LM. Code for validation can be found [here](https://github.com/facebookresearch/fairseq/compare/main...carolineechen:fairseq:ctc-decoder). Also added unit tests that validate custom implementations of ZeroLM and KenLM, as well as a biased LM.
      
      Follow ups:
      - Train simple LM on LibriSpeech and demonstrate usage in tutorial or examples directory
      
      cc jacobkahn
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2528
      
      Reviewed By: mthrok
      
      Differential Revision: D38243802
      
      Pulled By: carolineechen
      
      fbshipit-source-id: 445e78f6c20bda655aabf819fc0f771fe68c73d7
  18. 08 Aug, 2022 1 commit
  19. 05 Aug, 2022 3 commits