1. 15 Aug, 2023 1 commit
    • [BC-breaking] Update pre-built ffmpeg4 to 4.4.4 (#3561) · bf07ea6b
      moto authored
      Summary:
      In https://github.com/pytorch/audio/pull/3460, we switched the build process for FFmpeg extension.
      Since it is complicated to install FFmpeg in some environments, pre-built binaries and their headers
      are downloaded at build time and used as scaffolding for the torchaudio build.
      
      Even though we did not change any code or the FFmpeg version, it turned out that this causes a segmentation
      fault on Ubuntu when using the system Python and FFmpeg 4.4 installed via aptitude.
      While investigating the issue, I swapped the pre-built FFmpeg scaffolding with FFmpeg 4.4 from aptitude,
      and the segmentation fault did not happen. This indicates that it is a binary compatibility issue.
      
      Before https://github.com/pytorch/audio/issues/3460, each binary build job was building FFmpeg 4.1.8 using the same compiler used to build torchaudio,
      but after https://github.com/pytorch/audio/issues/3460 the environments to build FFmpeg 4.1.8 and torchaudio are different. My hypothesis is that
      this difference is causing some ABI incompatibility when linking against FFmpeg 4.4. (Also, I don't remember well,
      but I read somewhere that 4.4 has a different ABI)
      
      Through experiments, it turned out that upgrading the pre-built FFmpeg scaffolding to 4.4 resolves this.
      So this commit upgrades the pre-built FFmpeg 4 to 4.4.
      The potential (yet unconfirmed) downside is that torchaudio will no longer work with 4.1, 4.2, and 4.3.
      Since FFmpeg 4.4 is what Ubuntu 20.04 and 22.04 support by default, and Google Colab is also on 20.04,
      I think it is more important to support 4.4.
      
      Therefore we drop the support for 4.1-4.3 from normal build (and official distributions). Those who wish to
      use 4.1-4.3 can build torchaudio from source by linking to specific FFmpeg.
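
As a rough illustration of the version constraint described above, the sketch below checks whether an FFmpeg version string belongs to the 4.4 series. The helper names (`parse_ffmpeg_version`, `is_ffmpeg4_minor_supported`, `installed_ffmpeg_version`) are hypothetical and not part of torchaudio. It assumes the `ffmpeg -version` banner starts with `ffmpeg version X.Y`, and it uses the command-line tool's version only as a proxy for the shared libraries that torchaudio actually links against.

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple


def parse_ffmpeg_version(banner: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from the text printed by `ffmpeg -version`."""
    match = re.search(r"ffmpeg version n?(\d+)\.(\d+)", banner)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_ffmpeg4_minor_supported(version: Tuple[int, int]) -> bool:
    """Within the FFmpeg 4 series, the commit keeps 4.4 and drops 4.1-4.3."""
    major, minor = version
    if major != 4:
        # Other major versions are outside the scope of this sketch.
        raise ValueError("this sketch only covers the FFmpeg 4 series")
    return minor == 4


def installed_ffmpeg_version() -> Optional[Tuple[int, int]]:
    """Query the `ffmpeg` executable on PATH, if there is one."""
    exe = shutil.which("ffmpeg")
    if exe is None:
        return None
    result = subprocess.run([exe, "-version"], capture_output=True, text=True)
    return parse_ffmpeg_version(result.stdout)


# Illustrative banner string, not captured from a real system.
sample = "ffmpeg version 4.4.4 Copyright (c) 2000-2023 the FFmpeg developers"
print(parse_ffmpeg_version(sample))  # (4, 4)
```

A check like this can only hint at a mismatch. As the message notes, the actual failure was an ABI-level incompatibility observed at run time.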
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3561
      
      Reviewed By: hwangjeff
      
      Differential Revision: D48340201
      
      Pulled By: mthrok
      
      fbshipit-source-id: 7ece82910f290c7cf83f58311c4cf6a384e8795e
  2. 10 Aug, 2023 1 commit
  3. 08 Aug, 2023 2 commits
  4. 04 Aug, 2023 1 commit
  5. 01 Aug, 2023 1 commit
  6. 31 Jul, 2023 2 commits
  7. 29 Jul, 2023 1 commit
    • Refactor compat (#3518) · 8497ee91
      moto authored
      Summary:
      The I/O functions in the `_compat` module were originally placed there so that
      everything related to FFmpeg lived in `torchaudio.io` and FFmpeg library
      initialization could be carried out in `torchaudio.io.__init__`.
      
      Now that this constraint is removed (all the initialization happens
      at `torchaudio._extension.__init__`) and `_compat` is only used by the
      FFmpeg dispatcher backend, we move the module to `torchaudio._backend`
      for better locality.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3518
      
      Reviewed By: huangruizhe
      
      Differential Revision: D47877412
      
      Pulled By: mthrok
      
      fbshipit-source-id: aa18c8cb6e5d5360950df5158c33c653e37c565f
  8. 28 Jul, 2023 2 commits
  9. 25 Jul, 2023 1 commit
  10. 18 Jul, 2023 1 commit
  11. 15 Jul, 2023 1 commit
  12. 05 Jul, 2023 1 commit
  13. 28 Jun, 2023 1 commit
  14. 26 Jun, 2023 1 commit
  15. 21 Jun, 2023 1 commit
  16. 15 Jun, 2023 1 commit
    • Update forced alignment tutorial (#3440) · 18601691
      moto authored
      Summary:
      * Fix backtrack visualization (the coordinate was off by one)
      * Add note about the simplification and the new align API
      * Explicitly handle SOS and EOS
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3440
      
      Reviewed By: xiaohui-zhang
      
      Differential Revision: D46761282
      
      Pulled By: mthrok
      
      fbshipit-source-id: b0b6c9754674e8e23543e9f002e29b55102c92f8
  17. 07 Jun, 2023 1 commit
  18. 02 Jun, 2023 2 commits
    • [BC-Breaking] Remove compute_kaldi_pitch (#3368) · 5bbbb1d5
      moto authored
      Summary:
      This commit removes compute_kaldi_pitch function and the underlying Kaldi integration from torchaudio.
      
      The Kaldi pitch function was added within a short period of time by integrating the original Kaldi implementation instead of reimplementing it in PyTorch.
      
      The Kaldi integration employed a hack that replaces the base vector/matrix implementation of Kaldi with PyTorch Tensor so that there is only one BLAS library within torchaudio.
      
      Recently, we have been making torchaudio leaner, and we have not seen wide adoption of the kaldi_pitch feature, so we decided to remove them.
      
      See some of the discussion at https://github.com/pytorch/audio/issues/1269
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3368
      
      Differential Revision: D46406176
      
      Pulled By: mthrok
      
      fbshipit-source-id: ee5e24d825188f379979ddccd680c7323b119b1e
    • Update data augmentation tutorial (#3375) · 2ba36b47
      moto authored
      Summary:
      Replace sox_effects with `torchaudio.io.AudioEffector`:
      
      1. To showcase the new and better feature
      2. To prepare for the upcoming removal of file-like object support
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3375
      
      Reviewed By: nateanl
      
      Differential Revision: D46379016
      
      Pulled By: mthrok
      
      fbshipit-source-id: 70f24b62494204949f327f6ac6c49f315c9ee315
  19. 31 May, 2023 1 commit
  20. 26 May, 2023 2 commits
    • Revert "Upgrade to FFmpeg5 (#3298)" (#3377) · 37779ef9
      atalman authored
      Summary:
      This reverts commit d38a7854.
      
      This is a temporary revert to unblock the unit test migration from CircleCI to GitHub.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3377
      
      Reviewed By: mthrok
      
      Differential Revision: D46230498
      
      Pulled By: atalman
      
      fbshipit-source-id: 000d8a9ca00750fc1ca61f4c2cdd6e930a5ce46d
    • Improve RNN-T streaming decoding (#3295) · 9fc0dcaa
      Lakshmi Krishnan authored
      Summary:
      This commit fixes the following issues affecting streaming decoding quality:
      1. The `init_b` hypothesis is only regenerated from the blank token if no initial hypotheses are provided.
      2. Allows the decoder to receive the top-K hypotheses to continue decoding from, instead of using just the top hypothesis at each decoding step. This dramatically affects decoding quality, especially for speech with long pauses and disfluencies.
      3. Fixes some minor errors in shape checking for length.
      
      This also means that the resulting output is the entire transcript up until that time step, instead of just the incremental change in transcript.
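
The idea of carrying the top-K hypotheses from one chunk to the next can be sketched with a toy beam search. This is not the torchaudio RNN-T decoder: it has no blank token or joiner network, and the function name `decode_chunk`, its parameters, and the per-step score dictionaries are all hypothetical. It only shows that passing the whole beam forward lets each call return the full transcript so far rather than an increment.

```python
import heapq
from typing import Dict, List, Optional, Tuple

# A hypothesis is (tokens so far, cumulative log-probability).
Hypothesis = Tuple[Tuple[str, ...], float]


def decode_chunk(
    step_scores: List[Dict[str, float]],
    hypotheses: Optional[List[Hypothesis]] = None,
    beam_width: int = 3,
) -> List[Hypothesis]:
    """Extend the carried-over hypotheses with one chunk of per-step token scores."""
    # Start from a single empty hypothesis only when none are carried over.
    beam = hypotheses if hypotheses else [((), 0.0)]
    for scores in step_scores:
        candidates = [
            (tokens + (token,), total + log_prob)
            for tokens, total in beam
            for token, log_prob in scores.items()
        ]
        # Keep the K best candidates instead of only the single best one.
        beam = heapq.nlargest(beam_width, candidates, key=lambda h: h[1])
    return beam


# Two chunks decoded in sequence; the beam from the first call seeds the second.
hyps = decode_chunk([{"a": -0.1, "b": -2.0}], beam_width=2)
hyps = decode_chunk([{"c": -0.5, "d": -0.7}], hyps, beam_width=2)
print(hyps[0][0])  # ('a', 'c')
```

Because every hypothesis keeps its full token history, the best entry after each chunk is the complete transcript up to that point.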
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3295
      
      Reviewed By: nateanl
      
      Differential Revision: D46216113
      
      Pulled By: hwangjeff
      
      fbshipit-source-id: 8f7efae28dcca4a052f434ca55a2795c9e5ec0b0
  21. 23 May, 2023 1 commit
  22. 21 May, 2023 2 commits
  23. 16 May, 2023 1 commit
  24. 10 May, 2023 2 commits
  25. 05 May, 2023 1 commit
    • Update squim tutorial (#3313) · 05ef7dc6
      Zhaoheng Ni authored
      Summary:
      Add scatter plots for STOI, PESQ, Si-SDR, and MOS scores to demonstrate the performance of `SquimObjective` and `SquimSubjective` models and how close they are to the ground truths.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3313
      
      Reviewed By: hwangjeff
      
      Differential Revision: D45620311
      
      Pulled By: nateanl
      
      fbshipit-source-id: cb58ffd3744df4749b9385876da8de0cffd93557
  26. 29 Apr, 2023 1 commit
  27. 31 Mar, 2023 1 commit
  28. 29 Mar, 2023 1 commit
    • Remove the note about AAC (#3214) · c07a96ab
      moto authored
      Summary:
      A part of the StreamWriter tutorial warns about corrupted AAC audio output, but this is no longer relevant, so this commit deletes it.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3214
      
      Reviewed By: nateanl
      
      Differential Revision: D44504030
      
      Pulled By: mthrok
      
      fbshipit-source-id: 4d26d582e9fb87d4e6fa674c05fe3192bc223eef
  29. 28 Mar, 2023 1 commit
  30. 16 Mar, 2023 1 commit
  31. 02 Mar, 2023 1 commit
  32. 15 Feb, 2023 1 commit
  33. 30 Jan, 2023 1 commit
    • Fix hybrid demucs tutorial for CUDA (#3017) · da9d1627
      Yan Li authored
      Summary:
      Currently, a few errors occur when this tutorial is run with a CUDA device.
      
      The reasons are:
      - The source audio waveform is not properly moved to the GPU. The `to()` method is not in-place for Tensors, so we need to assign the return value of the method call to the variable (otherwise the Tensor would still be on the CPU).
      - When performing further analysis and displaying the output audio, we need to move the Tensors back from the GPU to the CPU. This is because some of the functions we call require the Tensor to be on the CPU (e.g. `stft()` and `bss_eval_sources()`).
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3017
      
      Reviewed By: mthrok
      
      Differential Revision: D42828526
      
      Pulled By: nateanl
      
      fbshipit-source-id: c28bc855e79e3363a011f4a35a69aae1764e7762