- 05 Aug, 2022 1 commit
-
-
Ravi Makhija authored
Summary: Added example for Vad transform as mentioned in issue https://github.com/pytorch/audio/issues/1564 Pull Request resolved: https://github.com/pytorch/audio/pull/2598 Reviewed By: mthrok Differential Revision: D38432103 Pulled By: carolineechen fbshipit-source-id: 8f7e26c48d4ffb6bfe55bba6f9c7ee915e6edaef
-
- 04 Aug, 2022 1 commit
-
-
Omkar Vichare authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2599 Bootcamp task T127107566. Replace assert statements with `if ... then raise` so that the checks also run in optimized mode. Reviewed By: mthrok Differential Revision: D38370108 fbshipit-source-id: 74eaf5b72c511b62ddbb8e0e3b0ed638ad49e4f2
-
- 03 Aug, 2022 2 commits
-
-
Sean Kim authored
Summary: Add new model pretrained weights and tests Pull Request resolved: https://github.com/pytorch/audio/pull/2601 Reviewed By: carolineechen, nateanl Differential Revision: D38396673 Pulled By: skim0514 fbshipit-source-id: e06f97d28508543bc18e671344386a947bc870c1
-
bshall authored
Summary: I took a stab at implementing the ITU-R BS.1770-4 loudness recommendation (closes https://github.com/pytorch/audio/issues/1205). To give some more details: - I've implemented K-weighting following csteinmetz1 instead of BrechtDeMan since it fit well with torchaudio's already implemented filters (`treble_biquad` and `highpass_biquad`). - I've added four audio files to test compliance with the recommendation. These are linked in [this pdf](https://www.itu.int/dms_pub/itu-r/opb/rep/R-REP-BS.2217-2-2016-PDF-E.pdf). There are many more test files there but I didn't want to bog down the assets directory with too many files. Let me know if I should add or remove anything. - I've kept many of the constants internal to the function (e.g. the block duration, overlap, and the absolute threshold gamma). I'm not sure if these should be exposed in the signature. - I've implemented support for up to 5 channels (following both csteinmetz1 and BrechtDeMan). The recommendation includes weights for up to 24 channels. Is there any convention for how many channels to support? I hope this is helpful! Looking forward to hearing from you. Pull Request resolved: https://github.com/pytorch/audio/pull/2472 Reviewed By: hwangjeff Differential Revision: D38389155 Pulled By: carolineechen fbshipit-source-id: fcc86d864c04ab2bedaa9acd941ebc4478ca6904
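The gating constants mentioned in this summary (absolute threshold, relative gate) can be illustrated with a minimal pure-Python sketch. It is not torchaudio's implementation: it assumes the per-block, channel-weighted mean-square energies of the K-weighted signal have already been computed, and the names `block_loudness` and `gated_loudness` are hypothetical.

```python
import math
from typing import Sequence

ABSOLUTE_THRESHOLD_LKFS = -70.0  # absolute gate defined by BS.1770
RELATIVE_GATE_LU = -10.0         # relative gate, measured from the ungated mean


def block_loudness(energy: float) -> float:
    """Loudness of one block from its channel-weighted mean-square energy."""
    return -0.691 + 10.0 * math.log10(energy)


def gated_loudness(block_energies: Sequence[float]) -> float:
    """Two-stage gating over per-block energies (assumed K-weighted upstream)."""
    # Stage 1: drop silent blocks and blocks below the absolute threshold.
    above_abs = [
        z for z in block_energies
        if z > 0.0 and block_loudness(z) > ABSOLUTE_THRESHOLD_LKFS
    ]
    if not above_abs:
        return float("-inf")
    # Stage 2: the relative threshold sits 10 LU below the loudness of the
    # mean energy of the blocks that survived stage 1.
    mean_energy = sum(above_abs) / len(above_abs)
    relative_threshold = block_loudness(mean_energy) + RELATIVE_GATE_LU
    gated = [z for z in above_abs if block_loudness(z) > relative_threshold]
    return block_loudness(sum(gated) / len(gated))
```

A quiet block that passes the absolute gate can still be removed by the relative gate, which is why the two thresholds are applied in sequence.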
-
- 02 Aug, 2022 1 commit
-
-
Eli Uriegas authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2581 Also removes spurious lines of code that were erroring out silently Signed-off-by: Eli Uriegas <eliuriegas@fb.com> Test Plan: Imported from OSS Reviewed By: carolineechen Differential Revision: D38336705 Pulled By: seemethere fbshipit-source-id: 700a969a4bace7d9ca94a9db908b29f383b7d94e
-
- 01 Aug, 2022 3 commits
-
-
Ravi Makhija authored
Summary: Added example for [Vol transform](https://pytorch.org/audio/stable/transforms.html#torchaudio.transforms.Vol) as mentioned in this issue https://github.com/pytorch/audio/issues/1564. Also made a minor edit to the docstring for `class Vol` to fix a grammar typo and use more common verbiage. Pull Request resolved: https://github.com/pytorch/audio/pull/2597 Reviewed By: nateanl, mthrok Differential Revision: D38316433 Pulled By: carolineechen fbshipit-source-id: 0be8fc505800a59acdab843813767acfdeac8243
-
Ravi Makhija authored
Summary: Fixed minor typo in `Contributing.md`: "diemension" -> "dimension" Pull Request resolved: https://github.com/pytorch/audio/pull/2596 Reviewed By: mthrok Differential Revision: D38315517 Pulled By: carolineechen fbshipit-source-id: 5e771f22a5be008d3be30b4699fb5cc5637c627d
-
moto authored
Summary: In https://github.com/pytorch/audio/pull/2285, the SNR calculation was fixed, but there was still one that was not fixed. This commit fixes it. Also following the feedback https://github.com/pytorch/tutorials/issues/1930#issuecomment-1199741336, update the variable name. Pull Request resolved: https://github.com/pytorch/audio/pull/2595 Reviewed By: carolineechen Differential Revision: D38314672 Pulled By: mthrok fbshipit-source-id: b2015e2709729190d97264aa191651b3af4ba856
-
- 30 Jul, 2022 1 commit
-
-
Ansh Nanda authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2590 Converted assert checks for argument validation to if-else checks so that they are executed in optimized mode as well. Reviewed By: mthrok Differential Revision: D38211246 fbshipit-source-id: 922b5bcafe8214980e535527dd94c3345c1ff3e2
-
- 29 Jul, 2022 4 commits
-
-
moto authored
Summary: 1. Fix initialization. Previously, the SOS token score was initialized to 0 across the time axis. This was biasing the alignment to delay the start. The proper way to delay the SOS is via the blank token. The new initialization takes the cumulative sum of blank scores. 2. Fill the end of the trellis with Inf. Similar to the start, at the end, where the number of remaining time frames is less than the number of tokens, it is no longer possible to align the text, so we fill with Inf for better visualization. 3. Clean up asset management code. Pull Request resolved: https://github.com/pytorch/audio/pull/2544 Reviewed By: nateanl Differential Revision: D38276478 Pulled By: mthrok fbshipit-source-id: 6d934cc850a0790b8c463a4f69f8f1143633d299
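Items 1 and 2 of this summary can be sketched in plain Python. This is an illustration under stated assumptions rather than the tutorial's code: `emission` is assumed to be a list of per-frame log-probabilities, the trellis is assumed to carry one extra row and column for the SOS state, and `build_trellis` is a hypothetical name.

```python
from typing import List, Sequence


def build_trellis(emission: Sequence[Sequence[float]],
                  tokens: Sequence[int],
                  blank_id: int = 0) -> List[List[float]]:
    num_frames, num_tokens = len(emission), len(tokens)
    inf = float("inf")
    # Row 0 and column 0 are the extra entries for the SOS state.
    trellis = [[0.0] * (num_tokens + 1) for _ in range(num_frames + 1)]
    # Item 1: staying in SOS accumulates blank scores (cumulative sum)
    # instead of being free (all zeros) along the time axis.
    for t in range(num_frames):
        trellis[t + 1][0] = trellis[t][0] + emission[t][blank_id]
    # No token can have been emitted before the first frame.
    for j in range(1, num_tokens + 1):
        trellis[0][j] = -inf
    # Item 2: once fewer frames remain than tokens, SOS is no longer a
    # valid state, so mark those entries with Inf.
    for t in range(num_frames + 1 - num_tokens, num_frames + 1):
        trellis[t][0] = inf
    # Standard recursion: either stay on the token (blank) or advance.
    for t in range(num_frames):
        for j in range(1, num_tokens + 1):
            stay = trellis[t][j] + emission[t][blank_id]
            change = trellis[t][j - 1] + emission[t][tokens[j - 1]]
            trellis[t + 1][j] = max(stay, change)
    return trellis
```

Because `max` propagates the Inf entries, the region at the end where alignment is impossible is filled automatically, while the final entry `trellis[-1][-1]` stays finite.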
-
moto authored
Summary: This commit enables the CTC decoder on Windows. The functionality seems to work fine: the tests pass and the decoding tutorial runs. The only difference from the Linux/macOS version is that loading a model in XZ compression format is not supported. Pull Request resolved: https://github.com/pytorch/audio/pull/2587 Reviewed By: carolineechen, nateanl Differential Revision: D38276490 Pulled By: mthrok fbshipit-source-id: f2203b2235c5bbb0220fe560aaaf0e1d5530347a
-
Javier Cardenete Morales authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2592 std::runtime_error does not preserve the C++ stack trace, so it is unclear to users what went wrong internally. PyTorch's TORCH_CHECK macro can print the C++ stack trace when the TORCH_SHOW_CPP_STACKTRACES environment variable is set to 1. Reviewed By: mthrok Differential Revision: D38219331 fbshipit-source-id: f51c27111077e927f97127f73f83a31b8e74f61f
-
Zhaoheng Ni authored
Summary: - The "speech + noise" mixture still has a high SNR, which can't show the effectiveness of MVDR beamforming. To make the task more challenging, amplify the noise waveform to reduce the SNR of mixture speech. - Show the Si-SNR score of mixture speech when visualizing the mixture spectrogram. - Fix the figure in `rtf_power` subsection. - The description of enhanced spectrogram by `rtf_power` is wrong. Correct it to `rtf_power`. - Print PESQ, STOI, and SDR metric scores. Pull Request resolved: https://github.com/pytorch/audio/pull/2527 Reviewed By: mthrok Differential Revision: D38190218 Pulled By: nateanl fbshipit-source-id: 39562850a67f58a16e0a2866ed95f78c3f4dc7de
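Amplifying the noise to reach a lower SNR, as described in the first bullet, might look like the following sketch. It works on plain Python lists rather than tensors, and the helper names (`power`, `snr_db`, `scale_noise_to_snr`) are hypothetical and not taken from the tutorial.

```python
import math
from typing import List, Sequence


def power(signal: Sequence[float]) -> float:
    """Mean-square power of a waveform."""
    return sum(v * v for v in signal) / len(signal)


def snr_db(speech: Sequence[float], noise: Sequence[float]) -> float:
    """Signal-to-noise ratio in decibels."""
    return 10.0 * math.log10(power(speech) / power(noise))


def scale_noise_to_snr(speech: Sequence[float],
                       noise: Sequence[float],
                       target_snr_db: float) -> List[float]:
    """Rescale the noise so that speech vs. scaled noise has the target SNR."""
    current = snr_db(speech, noise)
    # An amplitude gain g changes the SNR by -20*log10(g) dB.
    gain = 10.0 ** ((current - target_snr_db) / 20.0)
    return [gain * v for v in noise]
```

The mixture is then the element-wise sum of the speech and the scaled noise; a lower target SNR yields a gain above 1, i.e. louder noise.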
-
- 28 Jul, 2022 7 commits
-
-
Sean Kim authored
Summary: Add str to normalized parameter to enable frame_length based normalization to align with torch implementation of stft. Addresses issue https://github.com/pytorch/audio/issues/2104 Pull Request resolved: https://github.com/pytorch/audio/pull/2554 Reviewed By: carolineechen, mthrok Differential Revision: D38247554 Pulled By: skim0514 fbshipit-source-id: c243c7a6b8fda2a1e565cef4600f7c5a06baf602
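A rough sketch of how a `normalized` argument that accepts either a bool or a str could map to a scale factor follows. The option names `"window"` and `"frame_length"`, and the choice that `True` keeps the window-based behaviour, are assumptions made for illustration; `spectrogram_scale` is a hypothetical helper, not a torchaudio function.

```python
import math
from typing import Sequence, Union


def spectrogram_scale(normalized: Union[bool, str],
                      window: Sequence[float],
                      frame_length: int) -> float:
    """Return the factor by which STFT output would be multiplied."""
    window_scale = 1.0 / math.sqrt(sum(w * w for w in window))
    if isinstance(normalized, str):
        if normalized == "frame_length":
            # Matches the convention of scaling by frame_length ** -0.5.
            return 1.0 / math.sqrt(frame_length)
        if normalized == "window":
            return window_scale
        raise ValueError(f"Invalid normalized parameter: {normalized!r}")
    # Assumed for backward compatibility: True means window normalization.
    return window_scale if normalized else 1.0
```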
-
Sean Kim authored
Summary: Edit factory function's docstrings. Pull Request resolved: https://github.com/pytorch/audio/pull/2570 Reviewed By: carolineechen Differential Revision: D38250369 Pulled By: skim0514 fbshipit-source-id: fa777e37d7cc517cf4ff1842d5585bf36558f50a
-
moto authored
Summary: This commit gets rid of our copy of the CTC decoder code and replaces it with the upstream Flashlight-Text repo. Pull Request resolved: https://github.com/pytorch/audio/pull/2580 Reviewed By: carolineechen Differential Revision: D38244906 Pulled By: mthrok fbshipit-source-id: d274240fc67675552d19ff35e9a363b9b9048721
-
Sean Kim authored
Summary: Add tutorial Python file (draft PR); will continue to modify it according to feedback. Future plan: modify the spectrogram and bottom audio design, and work on finding the best audio track and segments. Pull Request resolved: https://github.com/pytorch/audio/pull/2572 Reviewed By: carolineechen, nateanl, mthrok Differential Revision: D38234001 Pulled By: skim0514 fbshipit-source-id: fe9207864f354dec5cf5ff52bf7d9ddcf4a001d5
-
Vamsi Desu authored
Summary: CTC decoder and StreamReader are now in the main library. This commit removes their aliases in `torchaudio.prototypes` Pull Request resolved: https://github.com/pytorch/audio/pull/2583 Reviewed By: mthrok Differential Revision: D38189314 fbshipit-source-id: c62209f2ad4f7052c6756a537b6fc509064e428c
-
Zhaoheng Ni authored
Summary: - The optimizer in fine-tuning recipe should also be `AdamW`. See https://github.com/pytorch/audio/pull/2412 - Fix the import of `DistributedBatchSampler` in hubert dataset - Fix `dataset_path` in fine-tuning module. Pull Request resolved: https://github.com/pytorch/audio/pull/2588 Reviewed By: carolineechen Differential Revision: D38243423 Pulled By: nateanl fbshipit-source-id: badc88ce9eddfd71270201a65ae89433fae2733f
-
moto authored
Summary: Extract the helper functions for defining library and extension so that they can be reused for building flashlight library and binding in https://github.com/pytorch/audio/issues/2580. Pull Request resolved: https://github.com/pytorch/audio/pull/2585 Reviewed By: carolineechen Differential Revision: D38233407 Pulled By: mthrok fbshipit-source-id: 96f7c62a8b70bb3ff5caede9730165d54a55272f
-
- 27 Jul, 2022 3 commits
-
-
Eli Uriegas authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2582 CHECK_ were deprecated in upstream so we should replace them here as well Similar to https://github.com/pytorch/vision/pull/6322, relates to https://github.com/pytorch/pytorch/pull/82032 Signed-off-by: Eli Uriegas <eliuriegas@fb.com> Test Plan: Imported from OSS Reviewed By: malfet, mthrok Differential Revision: D38208356 Pulled By: seemethere fbshipit-source-id: 6f42d517362f415e0775803514eee2628402918f
-
Son Dinh authored
Summary: This commit replaces the use of assert with the `if ~ then raise` idiom, so that the checks are executed even when Python is running in optimized mode. Pull Request resolved: https://github.com/pytorch/audio/pull/2578 Reviewed By: mthrok Differential Revision: D38158122 fbshipit-source-id: da561145a6e021238e9e9df10ab8d2d3a751fb69
-
Piyush Soni authored
Summary: `assert` is not executed when running in optimized mode. This commit replaces all instances of "assert" in /fbcode/pytorch/audio/torchaudio/functional/functional.py Pull Request resolved: https://github.com/pytorch/audio/pull/2579 Reviewed By: mthrok Differential Revision: D38158280 fbshipit-source-id: f8d7fca1c8f9b3955c6ca312b16947eb12894d81
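The difference between the two idioms can be reproduced in-process with the built-in `compile(..., optimize=1)`, which strips `assert` statements the same way `python -O` does. The function names and the `n_fft` argument below are hypothetical and only serve as an illustration.

```python
SOURCE = '''
def validate_assert(n_fft):
    assert n_fft > 0, "n_fft must be positive"
    return n_fft

def validate_raise(n_fft):
    if n_fft <= 0:
        raise ValueError("n_fft must be positive")
    return n_fft
'''


def load(optimize: int) -> dict:
    """Compile SOURCE at the given optimization level and return its namespace."""
    namespace: dict = {}
    code = compile(SOURCE, "<validation-sketch>", "exec", optimize=optimize)
    exec(code, namespace)
    return namespace


optimized = load(optimize=1)
# The assert has been removed, so the invalid value passes through.
print(optimized["validate_assert"](-1))
# The explicit check still fires in optimized mode.
try:
    optimized["validate_raise"](-1)
except ValueError as err:
    print("rejected:", err)
```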
-
- 26 Jul, 2022 5 commits
-
-
Yu Shi authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2571 Per T127106783, replace `assert` statement with `if _ then raise` statement to enforce the assertion even in optimized mode Reviewed By: mthrok Differential Revision: D38123481 fbshipit-source-id: 19321f7467bfd993b38bd9e44fcd01e5f5e64b87
-
Sean Kim authored
Summary: Quick docstring change, adding extra line to properly parse Pull Request resolved: https://github.com/pytorch/audio/pull/2575 Reviewed By: mthrok Differential Revision: D38138566 Pulled By: skim0514 fbshipit-source-id: fc1ed68ed0050e194944714c753fb35adc85b27e
-
Moto Hira authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2557 Allow the use of flashlight-decoder from upstream Reviewed By: carolineechen Differential Revision: D37983846 fbshipit-source-id: edb1b701bd18718b3b10cf51cc63d3924d4cc073
-
Sean Kim authored
Summary: Created new branch and brought in commits due to rebasing issues, resolved conflicts on new branch, close old branch. Pull Request resolved: https://github.com/pytorch/audio/pull/2565 Reviewed By: nateanl, mthrok Differential Revision: D38131189 Pulled By: skim0514 fbshipit-source-id: 96531480cf50562944abb28d70879f21b4609f15
-
Abhinav Gupta authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2573 Moved the import of kaldi_io into each function (instead of the top of the module) to delay it until first use. Reviewed By: mthrok Differential Revision: D38108022 fbshipit-source-id: 4ba8cc6a942a00de83668bbb7e361d5ae8b773eb
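The deferred-import pattern described here can be sketched as follows. The standard-library `json` module stands in for the optional dependency so the example runs anywhere; the function names and the missing module name are hypothetical.

```python
def serialize(obj) -> str:
    # Imported on first call rather than at module import time, so merely
    # importing this module never pays for (or requires) the dependency.
    import json  # stand-in for an optional dependency
    return json.dumps(obj, sort_keys=True)


def needs_missing_dependency():
    # Defining this function succeeds even though the module is absent;
    # the ImportError only surfaces when the function is actually called.
    import a_module_that_is_not_installed  # hypothetical module name
    return a_module_that_is_not_installed
```

Python caches imported modules in `sys.modules`, so repeating the `import` statement inside each function costs only a dictionary lookup after the first call.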
-
- 25 Jul, 2022 3 commits
-
-
proxyphi authored
Summary: The momentum in the GriffinLim transform is modified before being passed to the functional, causing inconsistency between the functional and the transform. Fix this by passing it through unchanged in the transform. Fixes https://github.com/pytorch/audio/issues/2567 Pull Request resolved: https://github.com/pytorch/audio/pull/2568 Reviewed By: nateanl Differential Revision: D38117632 Pulled By: mthrok fbshipit-source-id: 99754be4b3b6dea45ba115aaea9fb6d7285bc2c9
-
Sean Kim authored
Summary: Previous Issue: --use-tmp-hub-dir expected the temp directories used to store large files to be deleted after each test case, but pytest erases directories after 3 full test sessions. This commit fixes this by manually deleting a new subdirectory created in each test case. https://github.com/pytorch/audio/pull/2565#discussion_r929007101 Pull Request resolved: https://github.com/pytorch/audio/pull/2569 Reviewed By: nateanl Differential Revision: D38117848 Pulled By: skim0514 fbshipit-source-id: 3767cb8df1238fd6218f6aaa58d5d583cea72699
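One way to guarantee per-case cleanup, independent of the test runner's retention policy, is a context-managed directory. This is a generic sketch using the standard library and not the fixture from the PR; `run_case_in_private_dir` is a hypothetical helper.

```python
import os
import tempfile
from typing import Callable, Tuple


def run_case_in_private_dir(work: Callable[[str], int]) -> Tuple[int, str]:
    """Run `work` inside a fresh directory that is deleted right afterwards."""
    with tempfile.TemporaryDirectory() as subdir:
        result = work(subdir)
    # Leaving the `with` block removes the directory and everything in it.
    return result, subdir


def write_large_file(directory: str) -> int:
    """Stand-in for a test case that downloads a large asset."""
    path = os.path.join(directory, "weights.bin")
    with open(path, "wb") as handle:
        handle.write(b"\0" * 1024)
    return os.path.getsize(path)
```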
-
moto authored
Summary: This commit fixes the build_docs job timeout by pinning `resampy=0.2.2`. For some mysterious reason, `resampy=0.3.1` causes a slowdown of unrelated code. https://github.com/bmcfee/resampy/issues/106 Pull Request resolved: https://github.com/pytorch/audio/pull/2543 Reviewed By: carolineechen Differential Revision: D38115003 Pulled By: mthrok fbshipit-source-id: 67cd1c73dd4adb3091e0b88aaf5c31de0dd4b87e
-
- 22 Jul, 2022 2 commits
-
-
Sean Kim authored
Summary: Don't allow users to input incorrect dimensions Pull Request resolved: https://github.com/pytorch/audio/pull/2563 Reviewed By: carolineechen Differential Revision: D38074360 Pulled By: skim0514 fbshipit-source-id: 7bcae515706eb358ca6f68c50c7c0ccace1c3f95
-
Zhaoheng Ni authored
Summary: - Add documentation page for `SourceSeparationBundle` and `CONVTASNET_BASE_LIBRI2MIX`. - Add citation of Libri2Mix dataset in the bundle documentation. - The URL in the integration test should be built with a slash instead of `os.path.join`, which fails on Windows. Change it to an f-string. Pull Request resolved: https://github.com/pytorch/audio/pull/2559 Reviewed By: carolineechen Differential Revision: D38036116 Pulled By: nateanl fbshipit-source-id: 736732805191113955badfec3955e2e24e8f4836
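The Windows failure mode can be reproduced on any platform with `ntpath`, the module that backs `os.path` on Windows. The base URL and file name below are placeholders, not the ones used in the test.

```python
import ntpath
import posixpath

BASE_URL = "https://example.com/assets"  # placeholder URL
FILE_NAME = "mixture.wav"                # placeholder file name

# What os.path.join produces on Windows: a backslash separator.
windows_joined = ntpath.join(BASE_URL, FILE_NAME)
# What os.path.join produces on Linux/macOS: works only by coincidence.
posix_joined = posixpath.join(BASE_URL, FILE_NAME)
# Platform-independent: URLs always use forward slashes.
url = f"{BASE_URL}/{FILE_NAME}"

print(windows_joined)
print(posix_joined)
print(url)
```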
-
- 21 Jul, 2022 4 commits
-
-
Sean Kim authored
Summary: Added back device in case of tensor creation Pull Request resolved: https://github.com/pytorch/audio/pull/2561 Reviewed By: mthrok Differential Revision: D38035351 Pulled By: skim0514 fbshipit-source-id: bdea07cbb34d0aa487187cded1a5636da6623d96
-
Jumon Nozaki authored
Summary: Fix the fallback function of load fileobj function in sox_io backend. The typo in the fallback function prevents showing the intended error message. Pull Request resolved: https://github.com/pytorch/audio/pull/2560 Reviewed By: carolineechen, nateanl Differential Revision: D38035077 Pulled By: mthrok fbshipit-source-id: 53c91c0569c7e7bba611aed6ea748dbd2f323221
-
Eli Uriegas authored
Summary: Updates the runner to the latest apple silicon machines we have that also run on macOS 12.4 Similar to https://github.com/pytorch/vision/pull/6290 Signed-off-by: Eli Uriegas <eliuriegas@fb.com> Pull Request resolved: https://github.com/pytorch/audio/pull/2556 Reviewed By: atalman, mthrok Differential Revision: D37999959 Pulled By: seemethere fbshipit-source-id: 01d2ff01e48dcc0c4e33ed81758886fa19642aa3
-
Zhaoheng Ni authored
Summary: - Add SourceSeparationBundle class for source separation pipeline - Add `CONVTASNET_BASE_LIBRI2MIX` that is trained on Libri2Mix dataset. - Add integration test with example mixture audio and expected scale-invariant signal-to-distortion ratio (Si-SDR) score. The test computes the Si-SDR score with the permutation-invariant training (PIT) criterion for all permutations of sources and uses the highest value as the final output. The test verifies that the score is equal to or larger than the expected value. Pull Request resolved: https://github.com/pytorch/audio/pull/2440 Reviewed By: mthrok Differential Revision: D37997646 Pulled By: nateanl fbshipit-source-id: c951bcbbe8b7ed9553cb8793d6dc1ef90d5a29fe
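The scoring scheme described in the last bullet (Si-SDR evaluated over every source permutation, keeping the best) can be sketched in pure Python. This is an illustrative version on plain lists, not the integration test itself; `si_sdr`, `pit_si_sdr`, and the `eps` stabilizer are assumptions.

```python
import math
from itertools import permutations
from typing import Sequence, Tuple


def _zero_mean(x: Sequence[float]) -> list:
    mean = sum(x) / len(x)
    return [v - mean for v in x]


def si_sdr(estimate: Sequence[float], reference: Sequence[float],
           eps: float = 1e-8) -> float:
    """Scale-invariant signal-to-distortion ratio in dB."""
    est, ref = _zero_mean(estimate), _zero_mean(reference)
    # Project the estimate onto the reference to find the optimal scaling.
    scale = sum(e * r for e, r in zip(est, ref)) / (sum(r * r for r in ref) + eps)
    target = [scale * r for r in ref]
    residual = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    residual_energy = sum(n * n for n in residual)
    return 10.0 * math.log10((target_energy + eps) / (residual_energy + eps))


def pit_si_sdr(estimates: Sequence[Sequence[float]],
               references: Sequence[Sequence[float]]) -> Tuple[float, tuple]:
    """Best mean Si-SDR over all assignments of estimates to references."""
    best_score, best_perm = float("-inf"), ()
    for perm in permutations(range(len(references))):
        score = sum(
            si_sdr(estimates[i], references[p]) for i, p in enumerate(perm)
        ) / len(perm)
        if score > best_score:
            best_score, best_perm = score, perm
    return best_score, best_perm
```

Trying every permutation costs factorial time in the number of sources, which is acceptable for a two-speaker mixture such as Libri2Mix.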
-
- 20 Jul, 2022 1 commit
-
-
Sean Kim authored
Summary: Modification from pull request https://github.com/pytorch/audio/issues/2415 to improve resample. Benchmarked for an 89% time reduction, tested in comparison to the original resample method. Pull Request resolved: https://github.com/pytorch/audio/pull/2553 Reviewed By: carolineechen Differential Revision: D37997533 Pulled By: skim0514 fbshipit-source-id: ef4b719450ac26794db6ea01f9882509f4fda5cf
-
- 19 Jul, 2022 2 commits
-
-
John Lu authored
Summary: `std::runtime_error` does not preserve the C++ stack trace, so it is unclear to users what went wrong internally. PyTorch's `TORCH_CHECK` macro can print the C++ stack trace when the `TORCH_SHOW_CPP_STACKTRACES` environment variable is set to 1. Pull Request resolved: https://github.com/pytorch/audio/pull/2551 Improve assertion for TorchAudio ffmpeg directory Reviewed By: mthrok Differential Revision: D37915732 fbshipit-source-id: 9f597eb00cadd0dc6a1bbf8f7d5c8092804ef685
-
moto authored
Summary: After reviewing the code for KenLM it turned out that we can build it without boost. Pull Request resolved: https://github.com/pytorch/audio/pull/2552 Reviewed By: xiaohui-zhang Differential Revision: D37949699 Pulled By: mthrok fbshipit-source-id: 4a4ffae4220d0b764b53f52b93040670d91a84a3
-