1. 16 Sep, 2022 2 commits
  2. 15 Sep, 2022 3 commits
  3. 14 Sep, 2022 4 commits
  4. 13 Sep, 2022 1 commit
  5. 12 Sep, 2022 1 commit
  6. 07 Sep, 2022 1 commit
    • Tweak documentation (#2656) · 8a0d7b36
      moto authored
      Summary:
      1. Override class `__module__` attribute in `conf.py` so that no manual override is necessary
      2. Fix SourceSeparationBundle member attribute
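The first item works because documentation tooling generally reports a class under whatever its `__module__` attribute says. A minimal sketch of the idea, assuming a hypothetical package `demo` that re-exports a class defined in a private submodule. The helper name `override_module` and the package layout are illustrative only and are not taken from the actual `conf.py`:

```python
import inspect
import types


def override_module(package, target=None):
    """Point `__module__` of each public class/function at the public package."""
    target = target or package.__name__
    for name in getattr(package, "__all__", []):
        obj = getattr(package, name)
        if inspect.isclass(obj) or inspect.isfunction(obj):
            obj.__module__ = target


# Hypothetical stand-in: a class whose defining module is a private submodule.
Reader = type("Reader", (), {"__module__": "demo._impl.reader"})
demo = types.ModuleType("demo")
demo.Reader = Reader
demo.__all__ = ["Reader"]

override_module(demo)
print(demo.Reader.__module__)  # prints "demo"
```

Doing this once in the documentation config avoids repeating the override by hand next to every class definition.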
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2656
      
      Reviewed By: carolineechen
      
      Differential Revision: D39293053
      
      Pulled By: mthrok
      
      fbshipit-source-id: 2b8d6be1aee517d0e692043c26ac2438a787adc6
  7. 24 Aug, 2022 1 commit
    • Add StreamWriter (#2628) · 72404de9
      moto authored
      Summary:
      This commit adds the FFmpeg-based encoder class StreamWriter.
      StreamWriter is essentially the counterpart of the StreamReader class, and
      it supports:
      
      * Encoding audio / still image / video
      * Exporting to local file / streaming protocol / devices, etc.
      * File-like object support (in later commit)
      * HW video encoding (in later commit)
      
      See also: https://fburl.com/gslide/z85kn5a9 (Meta internal)
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2628
      
      Reviewed By: nateanl
      
      Differential Revision: D38816650
      
      Pulled By: mthrok
      
      fbshipit-source-id: a9343b0d55755e186971dc96fb86eb52daa003c8
  8. 22 Aug, 2022 1 commit
  9. 18 Aug, 2022 2 commits
  10. 15 Aug, 2022 2 commits
  11. 11 Aug, 2022 1 commit
  12. 05 Aug, 2022 1 commit
    • Add convolution operator (#2602) · b396157d
      hwangjeff authored
      Summary:
      Adds functions `convolve` and `fftconvolve`, which compute the convolution of two tensors along their trailing dimension. The former performs the convolution directly, whereas the latter performs it using FFT.
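The difference between the two approaches can be sketched in plain Python on 1-D lists. This is an illustration of direct versus FFT-based convolution, not the library's implementation: the function names `convolve_direct` and `convolve_fft` are hypothetical, the sketch ignores batching over leading dimensions, and it returns the full-length result of size `len(x) + len(y) - 1`.

```python
import cmath


def convolve_direct(x, y):
    """Full convolution by the definition: out[n] = sum_k x[k] * y[n - k]."""
    out = [0.0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[i + j] += xi * yj
    return out


def _fft(a, invert=False):
    """Recursive radix-2 FFT (unnormalized); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = _fft(a[0::2], invert)
    odd = _fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def convolve_fft(x, y):
    """Same result via the convolution theorem: multiply spectra, then invert."""
    size = len(x) + len(y) - 1
    n = 1
    while n < size:  # zero-pad to a power of two to avoid circular wrap-around
        n *= 2
    fx = _fft([complex(v) for v in x] + [0j] * (n - len(x)))
    fy = _fft([complex(v) for v in y] + [0j] * (n - len(y)))
    inverse = _fft([a * b for a, b in zip(fx, fy)], invert=True)
    return [v.real / n for v in inverse[:size]]


x = [1.0, 2.0, 3.0]
y = [0.0, 1.0, 0.5]
print(convolve_direct(x, y))  # prints [0.0, 1.0, 2.5, 4.0, 1.5]
print(convolve_fft(x, y))     # same values up to floating-point rounding
```

The direct form costs on the order of `len(x) * len(y)` multiplications, while the FFT form scales roughly as `n log n`, which is why the latter tends to pay off for long inputs.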
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2602
      
      Reviewed By: nateanl, mthrok
      
      Differential Revision: D38450771
      
      Pulled By: hwangjeff
      
      fbshipit-source-id: b2d1e063ba21eafeddf317d60749e7120b14292b
  13. 03 Aug, 2022 2 commits
    • Add HDEMUCS_HIGH_MUSDB (#2601) · 6ecc11c2
      Sean Kim authored
      Summary:
      Add new model pretrained weights and tests
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2601
      
      Reviewed By: carolineechen, nateanl
      
      Differential Revision: D38396673
      
      Pulled By: skim0514
      
      fbshipit-source-id: e06f97d28508543bc18e671344386a947bc870c1
    • An implementation of the ITU-R BS.1770-4 loudness recommendation (#2472) · 946b180a
      bshall authored
      Summary:
      I took a stab at implementing the ITU-R BS.1770-4 loudness recommendation (closes https://github.com/pytorch/audio/issues/1205). To give some more details:
      - I've implemented K-weighting following csteinmetz1 instead of BrechtDeMan since it fit well with torchaudio's already implemented filters (`treble_biquad` and `highpass_biquad`).
      - I've added four audio files to test compliance with the recommendation. These are linked in [this pdf](https://www.itu.int/dms_pub/itu-r/opb/rep/R-REP-BS.2217-2-2016-PDF-E.pdf). There are many more test files there but I didn't want to bog down the assets directory with too many files. Let me know if I should add or remove anything.
      - I've kept many of the constants internal to the function (e.g. the block duration, overlap, and the absolute threshold gamma). I'm not sure if these should be exposed in the signature.
      - I've implemented support for up to 5 channels (following both csteinmetz1 and BrechtDeMan). The recommendation includes weights for up to 24 channels. Is there any convention for how many channels to support?
      
      I hope this is helpful! Looking forward to hearing from you.
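For orientation, the gating stage that the internal constants refer to can be sketched in plain Python. This is a simplified illustration, not the code from the PR: it skips the K-weighting filters entirely and starts from raw samples, the function names are hypothetical, and the constants (400 ms blocks, 75% overlap, -70 LKFS absolute threshold, -10 LU relative offset, channel weights 1.0/1.0/1.0/1.41/1.41) are stated here as I understand the recommendation.

```python
import math

CHANNEL_WEIGHTS = (1.0, 1.0, 1.0, 1.41, 1.41)  # L, R, C, Ls, Rs
ABSOLUTE_THRESHOLD = -70.0  # LKFS
RELATIVE_OFFSET = -10.0  # LU below the absolute-gated loudness


def block_energies(channels, sample_rate, block_sec=0.4, overlap=0.75):
    """Mean square per channel for each overlapping block."""
    size = int(round(block_sec * sample_rate))
    hop = int(round(size * (1 - overlap)))
    n = len(channels[0])
    return [
        [sum(v * v for v in ch[start:start + size]) / size for ch in channels]
        for start in range(0, n - size + 1, hop)
    ]


def _loudness(energies):
    """Loudness of one set of per-channel energies (weighted sum, then dB)."""
    weighted = sum(g * z for g, z in zip(CHANNEL_WEIGHTS, energies))
    return -0.691 + 10 * math.log10(weighted) if weighted > 0 else -math.inf


def _mean_loudness(blocks):
    count = len(blocks)
    mean = [sum(b[i] for b in blocks) / count for i in range(len(blocks[0]))]
    return _loudness(mean)


def gated_loudness(blocks):
    """Two-stage gating: absolute threshold first, then relative threshold."""
    per_block = [_loudness(b) for b in blocks]
    kept = [b for b, l in zip(blocks, per_block) if l > ABSOLUTE_THRESHOLD]
    if not kept:
        return -math.inf
    relative = _mean_loudness(kept) + RELATIVE_OFFSET
    kept = [
        b for b, l in zip(blocks, per_block)
        if l > ABSOLUTE_THRESHOLD and l > relative
    ]
    return _mean_loudness(kept) if kept else -math.inf


# Three loud mono blocks, one quiet block, one silent block (made-up energies).
example_blocks = [[1.0], [1.0], [1.0], [1e-3], [0.0]]
print(gated_loudness(example_blocks))
```

In this made-up input the silent block fails the absolute gate and the quiet block fails the relative gate, so only the three loud blocks contribute to the result.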
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2472
      
      Reviewed By: hwangjeff
      
      Differential Revision: D38389155
      
      Pulled By: carolineechen
      
      fbshipit-source-id: fcc86d864c04ab2bedaa9acd941ebc4478ca6904
  14. 29 Jul, 2022 1 commit
    • Improve speech enhancement tutorial (#2527) · d6267031
      Zhaoheng Ni authored
      Summary:
      - The "speech + noise" mixture still has a high SNR, which can't show the effectiveness of MVDR beamforming. To make the task more challenging, amplify the noise waveform to reduce the SNR of mixture speech.
      - Show the Si-SNR score of mixture speech when visualizing the mixture spectrogram.
      - Fix the figure in `rtf_power` subsection.
          - The description of enhanced spectrogram by `rtf_power` is wrong. Correct it to `rtf_power`.
      - Print PESQ, STOI, and SDR metric scores.
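The first two items can be illustrated with a small plain-Python sketch: scaling a noise signal so the mixture reaches a requested SNR, then scoring the mixture with Si-SNR. The function names and the synthetic sine/cosine signals are assumptions made for this example and are not taken from the tutorial.

```python
import math


def power(x):
    """Mean square of a signal."""
    return sum(v * v for v in x) / len(x)


def scale_noise_to_snr(speech, noise, snr_db):
    """Scale `noise` so that 10*log10(P_speech / P_noise) equals `snr_db`."""
    gain = math.sqrt(power(speech) / (power(noise) * 10 ** (snr_db / 10)))
    return [gain * v for v in noise]


def si_snr(estimate, reference):
    """Scale-invariant SNR in dB between an estimate and a reference."""
    e_mean = sum(estimate) / len(estimate)
    r_mean = sum(reference) / len(reference)
    e = [v - e_mean for v in estimate]
    r = [v - r_mean for v in reference]
    # Project the estimate onto the reference to get the target component.
    alpha = sum(a * b for a, b in zip(e, r)) / sum(b * b for b in r)
    target = [alpha * b for b in r]
    residual = [a - t for a, t in zip(e, target)]
    return 10 * math.log10(
        sum(t * t for t in target) / sum(v * v for v in residual)
    )


n = 1000
speech = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]
noise = [0.1 * math.cos(2 * math.pi * 13 * i / n) for i in range(n)]

scaled = scale_noise_to_snr(speech, noise, snr_db=-5.0)
mixture = [s + v for s, v in zip(speech, scaled)]
print(si_snr(mixture, speech))
```

Because the synthetic noise here is orthogonal to the speech, the Si-SNR of the mixture comes out very close to the requested SNR; with real recordings the two numbers generally differ.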
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2527
      
      Reviewed By: mthrok
      
      Differential Revision: D38190218
      
      Pulled By: nateanl
      
      fbshipit-source-id: 39562850a67f58a16e0a2866ed95f78c3f4dc7de
  15. 28 Jul, 2022 1 commit
    • Create tutorial for HDemucs (#2572) · 919fd0c4
      Sean Kim authored
      Summary:
      Add the tutorial Python file. This is a draft PR and will be modified according to feedback.
      
      Future plan: modify spectrogram and bottom audio design and work on finding best audio track and segments
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2572
      
      Reviewed By: carolineechen, nateanl, mthrok
      
      Differential Revision: D38234001
      
      Pulled By: skim0514
      
      fbshipit-source-id: fe9207864f354dec5cf5ff52bf7d9ddcf4a001d5
  16. 26 Jul, 2022 1 commit
  17. 25 Jul, 2022 1 commit
  18. 22 Jul, 2022 1 commit
    • Add documents for SourceSeparationBundle (#2559) · 6cee56ab
      Zhaoheng Ni authored
      Summary:
      - Add documentation page for `SourceSeparationBundle` and `CONVTASNET_BASE_LIBRI2MIX`.
      - Add citation of Libri2Mix dataset in the bundle documentation.
      - The URL in the integration test should be joined with slashes instead of `os.path.join`, which produces a broken URL on Windows. Change it to an f-string.
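The last item can be reproduced on any platform, because `os.path.join` is `ntpath.join` on Windows and `posixpath.join` elsewhere. A short sketch, where the base URL and file name are hypothetical placeholders:

```python
import ntpath
import posixpath

base = "https://example.com/models"  # hypothetical URL
name = "weights.pt"  # hypothetical file name

# What `os.path.join` does on Windows: it inserts a backslash separator.
windows_joined = ntpath.join(base, name)
# What `os.path.join` does on POSIX systems: it happens to produce a valid URL.
posix_joined = posixpath.join(base, name)
# Platform-independent alternative: build the URL explicitly.
fstring_joined = f"{base}/{name}"

print(windows_joined)
print(posix_joined)
print(fstring_joined)
```

Only the f-string form gives the same forward-slash URL regardless of the operating system the test runs on.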
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2559
      
      Reviewed By: carolineechen
      
      Differential Revision: D38036116
      
      Pulled By: nateanl
      
      fbshipit-source-id: 736732805191113955badfec3955e2e24e8f4836
  19. 19 Jul, 2022 1 commit
  20. 12 Jul, 2022 2 commits
  21. 07 Jul, 2022 1 commit
    • Update lint config (#2389) · 515fd01c
      moto authored
      Summary:
      Following the formatter changes that happened in fbcode, this commit updates the linter config.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2389
      
      Reviewed By: hwangjeff
      
      Differential Revision: D37659649
      
      Pulled By: mthrok
      
      fbshipit-source-id: 1c52ff93f0b10cb2e7303d2ad13b2d65ffccfcb0
  22. 27 Jun, 2022 1 commit
    • Add VoxCeleb1 dataset (#2349) · 21b2d139
      Zhaoheng Ni authored
      Summary:
      This PR adds two dataset classes for the VoxCeleb1 corpus.
      - `VoxCeleb1Identification`
      Each data sample contains the waveform, sample rate, speaker id, and the file id.
      - `VoxCeleb1Verification`
      Each data sample contains a pair of waveforms, sample rate, the label indicating if they are from the same speaker, and the file ids.
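The two sample layouts described above can be pictured as named tuples. This is only a sketch of the described fields: the class names, field names, field types, and placeholder values are hypothetical, plain lists stand in for waveform tensors, and the real datasets may order or type the fields differently.

```python
from typing import List, NamedTuple


class IdentificationSample(NamedTuple):
    """One sample as described for `VoxCeleb1Identification`."""
    waveform: List[float]
    sample_rate: int
    speaker_id: int
    file_id: str


class VerificationSample(NamedTuple):
    """One sample as described for `VoxCeleb1Verification`."""
    waveform_a: List[float]
    waveform_b: List[float]
    sample_rate: int
    same_speaker: int  # 1 if both waveforms come from the same speaker
    file_id_a: str
    file_id_b: str


# Placeholder values, not real corpus entries.
ident = IdentificationSample([0.0, 0.1], 16000, 1, "file-a")
verif = VerificationSample([0.0, 0.1], [0.2, 0.3], 16000, 0, "file-a", "file-b")
print(ident.speaker_id, verif.same_speaker)
```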
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2349
      
      Reviewed By: carolineechen
      
      Differential Revision: D35927921
      
      Pulled By: nateanl
      
      fbshipit-source-id: 3e07ddd329178777698841565053eb59befe6449
  23. 21 Jun, 2022 1 commit
    • Create musdb handler and tests (#2484) · b92a8a09
      Sean Kim authored
      Summary:
      Create a dataset handler and tests for the new dataset. Validated with manual and unit tests; pre-commit was run for style checks.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2484
      
      Reviewed By: carolineechen, nateanl
      
      Differential Revision: D37250556
      
      Pulled By: skim0514
      
      fbshipit-source-id: d2c8d73d22fd9d7282026265676f3eab1e178d51
  24. 20 Jun, 2022 1 commit
  25. 08 Jun, 2022 2 commits
  26. 04 Jun, 2022 1 commit
    • Make FFmpeg log level configurable (#2439) · 877a88c5
      moto authored
      Summary:
      Undesired logs are one of the loudest UX complaints we get.
      Yet, loading media files involves uncertainty, which is
      difficult to debug without debug logs.
      
      This commit introduces utility functions to configure the logging level
      so that we can ask users to enable it when they encounter an issue,
      while defaulting to the non-verbose option.
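The pattern described here, a quiet default with a switch users can flip when reporting a problem, can be sketched as a pair of getter/setter helpers. The numeric values are FFmpeg's `AV_LOG_*` levels; everything else (the function names, the module-level state, the choice of `fatal` as the default) is an assumption for illustration and is not the API added by this commit.

```python
# FFmpeg's libavutil log levels (AV_LOG_QUIET ... AV_LOG_TRACE).
FFMPEG_LOG_LEVELS = {
    "quiet": -8,
    "panic": 0,
    "fatal": 8,
    "error": 16,
    "warning": 24,
    "info": 32,
    "verbose": 40,
    "debug": 48,
    "trace": 56,
}

# Assumed non-verbose default for this sketch.
_level = FFMPEG_LOG_LEVELS["fatal"]


def get_log_level():
    """Return the current numeric log level."""
    return _level


def set_log_level(level):
    """Set the log level from a name such as "debug" or a numeric value."""
    global _level
    if isinstance(level, str):
        level = FFMPEG_LOG_LEVELS[level.lower()]
    _level = int(level)


def should_emit(message_level):
    """A message is shown when its level does not exceed the configured one."""
    return message_level <= _level


print(should_emit(FFMPEG_LOG_LEVELS["debug"]))  # prints False
set_log_level("debug")  # what a user would be asked to do when debugging
print(should_emit(FFMPEG_LOG_LEVELS["debug"]))  # prints True
```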
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2439
      
      Reviewed By: hwangjeff, xiaohui-zhang
      
      Differential Revision: D36903763
      
      Pulled By: mthrok
      
      fbshipit-source-id: f4ddd9915b13197c2a2eb97e965005b8b5b8d987
  27. 01 Jun, 2022 2 commits
  28. 24 May, 2022 1 commit