  1. 03 Oct, 2022 2 commits
  2. 23 Sep, 2022 1 commit
  3. 22 Sep, 2022 1 commit
  4. 21 Sep, 2022 2 commits
  5. 20 Sep, 2022 1 commit
  6. 16 Sep, 2022 3 commits
  7. 15 Sep, 2022 3 commits
  8. 14 Sep, 2022 4 commits
  9. 13 Sep, 2022 1 commit
  10. 12 Sep, 2022 1 commit
  11. 07 Sep, 2022 1 commit
    • Tweak documentation (#2656) · 8a0d7b36
      moto authored
      Summary:
      1. Override class `__module__` attribute in `conf.py` so that no manual override is necessary
      2. Fix SourceSeparationBundle member attribute
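
One way the first item could work is sketched below. It assumes that public classes are defined in private submodules and re-exported at the package level; the helper name `override_module` and the fake `demo_pkg` package are hypothetical and are not taken from the PR.

```python
import inspect
import types


def override_module(package: types.ModuleType) -> None:
    """Point `__module__` of classes re-exported by `package` at the package.

    Hypothetical helper: documentation tools read `__module__` to decide
    where a class lives, so this makes the public import path appear.
    """
    prefix = package.__name__ + "."
    for _, obj in inspect.getmembers(package, inspect.isclass):
        # Only touch classes defined in a submodule of this package.
        if obj.__module__.startswith(prefix):
            obj.__module__ = package.__name__


# Demo with a fake package so the sketch runs without Sphinx or torchaudio.
demo_pkg = types.ModuleType("demo_pkg")


class Bundle:
    pass


Bundle.__module__ = "demo_pkg._private.impl"  # pretend definition site
demo_pkg.Bundle = Bundle  # re-exported at the package level

override_module(demo_pkg)
print(Bundle.__module__)  # demo_pkg
```

Calling such a helper once from `conf.py` would replace per-class manual overrides in the documentation sources.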
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2656
      
      Reviewed By: carolineechen
      
      Differential Revision: D39293053
      
      Pulled By: mthrok
      
      fbshipit-source-id: 2b8d6be1aee517d0e692043c26ac2438a787adc6
  12. 24 Aug, 2022 1 commit
    • Add StreamWriter (#2628) · 72404de9
      moto authored
      Summary:
      This commit adds StreamWriter, an FFmpeg-based encoder class.
      StreamWriter is essentially the counterpart of the StreamReader class, and
      it supports:
      
      * Encoding audio / still image / video
      * Exporting to local files / streaming protocols / devices, etc.
      * File-like object support (in later commit)
      * HW video encoding (in later commit)
      
      See also: https://fburl.com/gslide/z85kn5a9 (Meta internal)
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2628
      
      Reviewed By: nateanl
      
      Differential Revision: D38816650
      
      Pulled By: mthrok
      
      fbshipit-source-id: a9343b0d55755e186971dc96fb86eb52daa003c8
  13. 22 Aug, 2022 1 commit
  14. 18 Aug, 2022 2 commits
  15. 15 Aug, 2022 2 commits
  16. 11 Aug, 2022 1 commit
  17. 05 Aug, 2022 1 commit
    • Add convolution operator (#2602) · b396157d
      hwangjeff authored
      Summary:
      Adds functions `convolve` and `fftconvolve`, which compute the convolution of two tensors along their trailing dimension. The former performs the convolution directly, whereas the latter performs it using FFT.
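
The difference between the two approaches can be illustrated with a small pure-Python sketch. It is not the torchaudio implementation: it handles only 1-D lists rather than tensors, and the names `convolve_direct`, `convolve_fft` and `_fft` are made up for this example.

```python
import cmath


def convolve_direct(x, y):
    """Full linear convolution of two 1-D sequences, computed directly."""
    out = [0.0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[i + j] += xi * yj
    return out


def _fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two. No 1/n scaling."""
    n = len(a)
    if n == 1:
        return list(a)
    even = _fft(a[0::2], invert)
    odd = _fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def convolve_fft(x, y):
    """Same result as convolve_direct, via multiplication in frequency."""
    size = len(x) + len(y) - 1
    n = 1
    while n < size:
        n *= 2
    # Zero-pad so that circular convolution equals linear convolution.
    fx = _fft([complex(v) for v in x] + [0j] * (n - len(x)))
    fy = _fft([complex(v) for v in y] + [0j] * (n - len(y)))
    prod = [a * b for a, b in zip(fx, fy)]
    # Inverse transform, apply the 1/n scaling, and drop the padding.
    return [v.real / n for v in _fft(prod, invert=True)][:size]


x = [1.0, 2.0, 3.0]
y = [0.0, 1.0, 0.5]
direct = convolve_direct(x, y)
via_fft = convolve_fft(x, y)
print(direct)  # [0.0, 1.0, 2.5, 4.0, 1.5]
```

The direct form costs on the order of `len(x) * len(y)` multiplications, while the FFT form scales as `n log n` in the padded length, which is why an FFT variant tends to pay off for long inputs.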
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2602
      
      Reviewed By: nateanl, mthrok
      
      Differential Revision: D38450771
      
      Pulled By: hwangjeff
      
      fbshipit-source-id: b2d1e063ba21eafeddf317d60749e7120b14292b
  18. 03 Aug, 2022 2 commits
    • Add HDEMUCS_HIGH_MUSDB (#2601) · 6ecc11c2
      Sean Kim authored
      Summary:
      Add new model pretrained weights and tests
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2601
      
      Reviewed By: carolineechen, nateanl
      
      Differential Revision: D38396673
      
      Pulled By: skim0514
      
      fbshipit-source-id: e06f97d28508543bc18e671344386a947bc870c1
    • An implementation of the ITU-R BS.1770-4 loudness recommendation (#2472) · 946b180a
      bshall authored
      Summary:
      I took a stab at implementing the ITU-R BS.1770-4 loudness recommendation (closes https://github.com/pytorch/audio/issues/1205). To give some more details:
      - I've implemented K-weighting following csteinmetz1 instead of BrechtDeMan since it fit well with torchaudio's already implemented filters (`treble_biquad` and `highpass_biquad`).
      - I've added four audio files to test compliance with the recommendation. These are linked in [this pdf](https://www.itu.int/dms_pub/itu-r/opb/rep/R-REP-BS.2217-2-2016-PDF-E.pdf). There are many more test files there but I didn't want to bog down the assets directory with too many files. Let me know if I should add or remove anything.
      - I've kept many of the constants internal to the function (e.g. the block duration, overlap, and the absolute threshold gamma). I'm not sure if these should be exposed in the signature.
      - I've implemented support for up to 5 channels (following both csteinmetz1 and BrechtDeMan). The recommendation includes weights for up to 24 channels. Is there any convention for how many channels to support?
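
For context, the gating stage of BS.1770 can be sketched as below. This is a simplified illustration and not the code from the PR: it assumes the K-weighting filter and the block splitting (400 ms blocks with 75% overlap in the recommendation) have already been applied, so the input is a list of per-block, per-channel mean-square values. The function names are hypothetical.

```python
import math

# Channel weights from BS.1770 for L, R, C, Ls, Rs.
CHANNEL_WEIGHTS = [1.0, 1.0, 1.0, 1.41, 1.41]
ABSOLUTE_THRESHOLD = -70.0  # LKFS, the absolute gate
RELATIVE_OFFSET = -10.0  # LU, relative gate below the stage-1 loudness


def block_loudness(mean_squares):
    """Loudness of one block from per-channel mean squares (K-weighted)."""
    weighted = sum(g * z for g, z in zip(CHANNEL_WEIGHTS, mean_squares))
    if weighted <= 0:
        return -math.inf
    return -0.691 + 10.0 * math.log10(weighted)


def gated_loudness(blocks):
    """Two-stage gated loudness over a list of per-block mean squares."""

    def average(selected):
        n_ch = len(selected[0])
        means = [sum(b[c] for b in selected) / len(selected) for c in range(n_ch)]
        return block_loudness(means)

    # Stage 1: drop blocks at or below the absolute threshold.
    kept = [b for b in blocks if block_loudness(b) > ABSOLUTE_THRESHOLD]
    if not kept:
        return -math.inf
    # Stage 2: drop blocks at or below the relative threshold.
    relative_threshold = average(kept) + RELATIVE_OFFSET
    kept = [b for b in kept if block_loudness(b) > relative_threshold]
    return average(kept)


# Mono example: two full-scale blocks and one near-silent block.
blocks = [[1.0], [1.0], [1e-10]]
result = gated_loudness(blocks)
print(round(result, 3))  # -0.691
```

The near-silent block falls below the absolute gate and is ignored, so it does not pull the integrated value down.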
      
      I hope this is helpful! Looking forward to hearing from you.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2472
      
      Reviewed By: hwangjeff
      
      Differential Revision: D38389155
      
      Pulled By: carolineechen
      
      fbshipit-source-id: fcc86d864c04ab2bedaa9acd941ebc4478ca6904
  19. 29 Jul, 2022 1 commit
    • Improve speech enhancement tutorial (#2527) · d6267031
      Zhaoheng Ni authored
      Summary:
      - The "speech + noise" mixture still has a high SNR, which can't show the effectiveness of MVDR beamforming. To make the task more challenging, amplify the noise waveform to reduce the SNR of mixture speech.
      - Show the Si-SNR score of mixture speech when visualizing the mixture spectrogram.
      - Fix the figure in `rtf_power` subsection.
          - The description of enhanced spectrogram by `rtf_power` is wrong. Correct it to `rtf_power`.
      - Print PESQ, STOI, and SDR metric scores.
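
The first item, amplifying the noise to lower the mixture SNR, can be sketched as follows. This is an illustrative pure-Python version operating on lists of samples, not the tutorial's code; `scale_noise_to_snr` and the toy signals are assumptions made for the example.

```python
import math


def power(signal):
    """Mean power of a 1-D sequence of samples."""
    return sum(s * s for s in signal) / len(signal)


def snr_db(speech, noise):
    """Signal-to-noise ratio in decibels."""
    return 10.0 * math.log10(power(speech) / power(noise))


def scale_noise_to_snr(speech, noise, target_snr_db):
    """Return `noise` scaled so that the SNR equals `target_snr_db`."""
    current = snr_db(speech, noise)
    # The amplitude gain in dB is the gap between current and target SNR.
    gain = 10.0 ** ((current - target_snr_db) / 20.0)
    return [gain * n for n in noise]


speech = [0.5, -0.5, 0.5, -0.5]
noise = [0.05, -0.05, 0.05, -0.05]  # 20 dB below the speech
louder_noise = scale_noise_to_snr(speech, noise, target_snr_db=3.0)
mixture = [s + n for s, n in zip(speech, louder_noise)]
```

With a lower SNR the mixture is harder to enhance, which makes the before/after difference of a beamformer easier to observe.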
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2527
      
      Reviewed By: mthrok
      
      Differential Revision: D38190218
      
      Pulled By: nateanl
      
      fbshipit-source-id: 39562850a67f58a16e0a2866ed95f78c3f4dc7de
  20. 28 Jul, 2022 1 commit
    • Create tutorial for HDemucs (#2572) · 919fd0c4
      Sean Kim authored
      Summary:
      Add tutorial Python file. This is a draft PR and will continue to be modified according to feedback.
      
      Future plan: modify spectrogram and bottom audio design and work on finding best audio track and segments
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2572
      
      Reviewed By: carolineechen, nateanl, mthrok
      
      Differential Revision: D38234001
      
      Pulled By: skim0514
      
      fbshipit-source-id: fe9207864f354dec5cf5ff52bf7d9ddcf4a001d5
  21. 26 Jul, 2022 1 commit
  22. 25 Jul, 2022 1 commit
  23. 22 Jul, 2022 1 commit
    • Add documents for SourceSeparationBundle (#2559) · 6cee56ab
      Zhaoheng Ni authored
      Summary:
      - Add documentation page for `SourceSeparationBundle` and `CONVTASNET_BASE_LIBRI2MIX`.
      - Add citation of Libri2Mix dataset in the bundle documentation.
      - The URL in the integration test should be joined with a slash instead of `os.path.join`, which fails on Windows. Change it to an f-string.
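
The Windows failure mentioned in the last item can be reproduced on any platform with `ntpath`, the module that `os.path` resolves to on Windows. The URL and file name below are placeholders, not the ones used in the test.

```python
import ntpath  # Windows flavour of os.path, importable on every platform

base_url = "https://example.com/models"  # placeholder URL
filename = "model.pt"  # placeholder file name

# os.path.join is ntpath.join on Windows, which inserts a backslash.
joined_with_path = ntpath.join(base_url, filename)
# An f-string always produces the forward slash a URL needs.
joined_with_fstring = f"{base_url}/{filename}"

print(joined_with_path)     # https://example.com/models\model.pt
print(joined_with_fstring)  # https://example.com/models/model.pt
```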
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2559
      
      Reviewed By: carolineechen
      
      Differential Revision: D38036116
      
      Pulled By: nateanl
      
      fbshipit-source-id: 736732805191113955badfec3955e2e24e8f4836
  24. 19 Jul, 2022 1 commit
  25. 12 Jul, 2022 2 commits
  26. 07 Jul, 2022 1 commit
    • Update lint config (#2389) · 515fd01c
      moto authored
      Summary:
      Following the formatter changes that happened in fbcode, this commit updates the linter config.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2389
      
      Reviewed By: hwangjeff
      
      Differential Revision: D37659649
      
      Pulled By: mthrok
      
      fbshipit-source-id: 1c52ff93f0b10cb2e7303d2ad13b2d65ffccfcb0
  27. 27 Jun, 2022 1 commit
    • Add VoxCeleb1 dataset (#2349) · 21b2d139
      Zhaoheng Ni authored
      Summary:
      This PR adds two dataset classes for the VoxCeleb1 corpus.
      - `VoxCeleb1Identification`
      Each data sample contains the waveform, sample rate, speaker id, and the file id.
      - `VoxCeleb1Verification`
      Each data sample contains a pair of waveforms, sample rate, the label indicating if they are from the same speaker, and the file ids.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2349
      
      Reviewed By: carolineechen
      
      Differential Revision: D35927921
      
      Pulled By: nateanl
      
      fbshipit-source-id: 3e07ddd329178777698841565053eb59befe6449