Commits · c26b38b29b3f3f972f50057df20d0a226dc062a4 · OpenDAS / Torchaudio

29 Jul, 2022 4 commits

Update forced alignment tutorial (#2544) · c26b38b2

moto authored Jul 29, 2022

Summary:
1. Fix initialization.
Previously, the SOS token score was initialized to 0 across the time axis.
This was biasing the alignment to delay the start.
The proper way to delay the SOS is via blank token.
The new initialization takes the cumulative sum of blank scores.
2. Fill the end of trellis with Inf
Similar to the start, at the end where the number of remaining time frames is less
than the number of tokens, it is no longer possible to align the text, thus
we fill with Inf for better visualization.
3. Clean up asset management code.
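
A minimal pure-Python sketch of items 1 and 2 above, written against plain lists rather than tensors. The function name `build_trellis`, the extra row and column for the start state, and the toy emission values are illustrative assumptions, not the tutorial's actual code.

```python
import math


def build_trellis(emission, tokens, blank_id=0):
    """emission[t][c] is a log-probability; tokens is a list of label indices."""
    num_frames = len(emission)
    num_tokens = len(tokens)
    # Rows: time steps 0..num_frames. Columns: SOS followed by each token.
    trellis = [[0.0] * (num_tokens + 1) for _ in range(num_frames + 1)]

    # (1) SOS column: delaying the start costs one blank score per frame,
    #     so the column is the cumulative sum of blank scores (not zeros).
    for t in range(num_frames):
        trellis[t + 1][0] = trellis[t][0] + emission[t][blank_id]

    # No token can have been emitted before the first frame.
    for j in range(1, num_tokens + 1):
        trellis[0][j] = -math.inf

    # (2) Tail of the SOS column: fewer frames remain than tokens,
    #     so alignment can no longer start here. Mark it with +inf.
    for t in range(num_frames + 1 - num_tokens, num_frames + 1):
        trellis[t][0] = math.inf

    # Standard recursion: stay on the same token (blank) or move to the next.
    for t in range(num_frames):
        for j in range(1, num_tokens + 1):
            stay = trellis[t][j] + emission[t][blank_id]
            change = trellis[t][j - 1] + emission[t][tokens[j - 1]]
            trellis[t + 1][j] = max(stay, change)
    return trellis


# Toy log-probabilities: 4 frames, 3 classes (class 0 is blank).
emission = [
    [-0.2, -2.0, -3.0],
    [-1.5, -0.4, -2.5],
    [-1.0, -2.0, -0.6],
    [-0.3, -2.2, -2.8],
]
trellis = build_trellis(emission, tokens=[1, 2])
```

With two tokens and four frames, the last two entries of the SOS column are `inf`, and the bottom-right cell (all tokens consumed at the last frame) stays finite.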

Pull Request resolved: https://github.com/pytorch/audio/pull/2544

Reviewed By: nateanl

Differential Revision: D38276478

Pulled By: mthrok

fbshipit-source-id: 6d934cc850a0790b8c463a4f69f8f1143633d299

Enable CTC decoder in Windows (#2587) · 67cb420d

moto authored Jul 29, 2022

Summary:
This commit enables CTC decoder on Windows.

The functionality seems to work fine.
The tests are passing, the decoding tutorial runs fine.

The only difference to the Linux/macOS version is that
loading model in XZ compression format is not supported.

![289961785_399620772041679_7768117002438616376_n](https://user-images.githubusercontent.com/855818/181420923-cfbd8402-20de-4e63-b9e4-e39f9aa9fc50.png)

Pull Request resolved: https://github.com/pytorch/audio/pull/2587

Reviewed By: carolineechen, nateanl

Differential Revision: D38276490

Pulled By: mthrok

fbshipit-source-id: f2203b2235c5bbb0220fe560aaaf0e1d5530347a

Replace 'runtime_error' exception with 'TORCH_CHECK' in TorchAudio sox (#2592) · f234e51f

Javier Cardenete Morales authored Jul 29, 2022

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/2592

std::runtime_error does not preserve the C++ stack trace, so it is unclear to users what went wrong internally.

PyTorch's TORCH_CHECK macro can print the C++ stack trace when the TORCH_SHOW_CPP_STACKTRACES environment variable is set to 1.

Reviewed By: mthrok

Differential Revision: D38219331

fbshipit-source-id: f51c27111077e927f97127f73f83a31b8e74f61f

Improve speech enhancement tutorial (#2527) · d6267031

Zhaoheng Ni authored Jul 29, 2022

Summary:
- The "speech + noise" mixture still has a high SNR, which can't show the effectiveness of MVDR beamforming. To make the task more challenging, amplify the noise waveform to reduce the SNR of mixture speech.
- Show the Si-SNR score of mixture speech when visualizing the mixture spectrogram.
- Fix the figure in `rtf_power` subsection.
    - The description of enhanced spectrogram by `rtf_power` is wrong. Correct it to `rtf_power`.
- Print PESQ, STOI, and SDR metric scores.
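
As a rough illustration of the first bullet, the noise can be scaled so that the mixture reaches a chosen SNR. This is a standard-library sketch using plain SNR on synthetic signals (the tutorial itself reports Si-SNR on real recordings); the helper names and the 3 dB target are assumptions made for the example.

```python
import math


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(v * v for v in signal) / len(signal)


def snr_db(speech, noise):
    """Signal-to-noise ratio in decibels."""
    return 10.0 * math.log10(power(speech) / power(noise))


def scale_noise_to_snr(speech, noise, target_snr_db):
    """Return a copy of `noise` amplified (or attenuated) to hit the target SNR."""
    gain = math.sqrt(power(speech) / (power(noise) * 10 ** (target_snr_db / 10)))
    return [gain * v for v in noise]


# Synthetic stand-ins for the speech and noise waveforms.
speech = [math.sin(0.1 * n) for n in range(1000)]
noise = [0.01 * math.cos(0.37 * n) for n in range(1000)]

original_snr = snr_db(speech, noise)  # high SNR: the noise is barely audible
loud_noise = scale_noise_to_snr(speech, noise, target_snr_db=3.0)
mixture = [s + n for s, n in zip(speech, loud_noise)]
```

Lowering the target SNR makes the enhancement task harder, which is the effect the commit is after.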

Pull Request resolved: https://github.com/pytorch/audio/pull/2527

Reviewed By: mthrok

Differential Revision: D38190218

Pulled By: nateanl

fbshipit-source-id: 39562850a67f58a16e0a2866ed95f78c3f4dc7de

28 Jul, 2022 7 commits

Add Union normalization parameter on spectrogram and inverse spectrogram (#2554) · 0fde7c57

Sean Kim authored Jul 28, 2022

Summary:
Add str to normalized parameter to enable frame_length based normalization to align with torch implementation of stft. Addresses issue https://github.com/pytorch/audio/issues/2104
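
The difference between the two normalization modes can be sketched as a scale factor applied to the STFT output. This sketch assumes that `"window"` divides by the L2 norm of the window, that `"frame_length"` multiplies by `frame_length ** -0.5` as `torch.stft(normalized=True)` does, and that `True` maps to `"window"`; `normalization_scale` is a hypothetical helper, not a torchaudio function.

```python
import math


def hann_window(n_fft):
    """Periodic Hann window as a plain list."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / n_fft) for n in range(n_fft)]


def normalization_scale(normalized, window):
    """Factor by which the STFT output is multiplied (assumed behavior)."""
    if normalized is False:
        return 1.0
    if normalized is True or normalized == "window":
        # Window-based: divide by sqrt(sum(window ** 2)).
        return 1.0 / math.sqrt(sum(w * w for w in window))
    if normalized == "frame_length":
        # Frame-length-based: multiply by frame_length ** -0.5.
        return 1.0 / math.sqrt(len(window))
    raise ValueError(f"Unsupported value for normalized: {normalized!r}")


window = hann_window(400)
scale_window = normalization_scale("window", window)
scale_frame = normalization_scale("frame_length", window)
```

For a Hann window the two factors differ, so results normalized one way do not match results normalized the other way without rescaling.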

Pull Request resolved: https://github.com/pytorch/audio/pull/2554

Reviewed By: carolineechen, mthrok

Differential Revision: D38247554

Pulled By: skim0514

fbshipit-source-id: c243c7a6b8fda2a1e565cef4600f7c5a06baf602

Change docstring for easier understanding (#2570) · 338e3104

Sean Kim authored Jul 28, 2022

Summary:
Edit factory function's docstrings.

Pull Request resolved: https://github.com/pytorch/audio/pull/2570

Reviewed By: carolineechen

Differential Revision: D38250369

Pulled By: skim0514

fbshipit-source-id: fa777e37d7cc517cf4ff1842d5585bf36558f50a

Migrate CTC decoder code (#2580) · 39b6343d

moto authored Jul 28, 2022

Summary:
This commit gets rid of our copy of the CTC decoder code and
replaces it with the upstream Flashlight-Text repo.

Pull Request resolved: https://github.com/pytorch/audio/pull/2580

Reviewed By: carolineechen

Differential Revision: D38244906

Pulled By: mthrok

fbshipit-source-id: d274240fc67675552d19ff35e9a363b9b9048721

Create tutorial for HDemucs (#2572) · 919fd0c4

Sean Kim authored Jul 28, 2022

Summary:
Add tutorial Python file. Draft PR; will continue to modify according to feedback.

Future plan: modify spectrogram and bottom audio design and work on finding best audio track and segments

Pull Request resolved: https://github.com/pytorch/audio/pull/2572

Reviewed By: carolineechen, nateanl, mthrok

Differential Revision: D38234001

Pulled By: skim0514

fbshipit-source-id: fe9207864f354dec5cf5ff52bf7d9ddcf4a001d5

Remove deprecated prototype alias (#2583) · 08395ba6

Vamsi Desu authored Jul 28, 2022

Summary:
CTC decoder and StreamReader are now in the main library.
This commit removes their aliases in `torchaudio.prototype`.

Pull Request resolved: https://github.com/pytorch/audio/pull/2583

Reviewed By: mthrok

Differential Revision: D38189314

fbshipit-source-id: c62209f2ad4f7052c6756a537b6fc509064e428c

Fix hubert fine-tuning recipe bugs (#2588) · 0092aa3c

Zhaoheng Ni authored Jul 28, 2022

Summary:
- The optimizer in fine-tuning recipe should also be `AdamW`. See https://github.com/pytorch/audio/pull/2412
- Fix the import of `DistributedBatchSampler` in hubert dataset
- Fix `dataset_path` in fine-tuning module.

Pull Request resolved: https://github.com/pytorch/audio/pull/2588

Reviewed By: carolineechen

Differential Revision: D38243423

Pulled By: nateanl

fbshipit-source-id: badc88ce9eddfd71270201a65ae89433fae2733f

Refactor cmake (#2585) · d84ce3b2

moto authored Jul 28, 2022

Summary:
Extract the helper functions for defining library and extension so that they can be reused for building flashlight library and binding in https://github.com/pytorch/audio/issues/2580.

Pull Request resolved: https://github.com/pytorch/audio/pull/2585

Reviewed By: carolineechen

Differential Revision: D38233407

Pulled By: mthrok

fbshipit-source-id: 96f7c62a8b70bb3ff5caede9730165d54a55272f

27 Jul, 2022 3 commits

Replaced CHECK_ by TORCH_CHECK_ (#2582) · 04057fa6

Eli Uriegas authored Jul 27, 2022

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/2582

The CHECK_ macros were deprecated upstream, so we should replace them here as
well.

Similar to https://github.com/pytorch/vision/pull/6322, relates to https://github.com/pytorch/pytorch/pull/82032

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Test Plan: Imported from OSS

Reviewed By: malfet, mthrok

Differential Revision: D38208356

Pulled By: seemethere

fbshipit-source-id: 6f42d517362f415e0775803514eee2628402918f

Replace assert with raise in prototypes.models (#2578) · 34ef7e9c

Son Dinh authored Jul 27, 2022

Summary:
This commit replaces the use of assert with the `if ~ then raise` idiom,
so that the checks are executed even when Python is running in optimized mode.
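
The effect of optimized mode can be reproduced without restarting the interpreter, because the built-in `compile()` accepts an `optimize` argument (level 1 corresponds to `python -O`). The `check` function and the `run`/`outcome` helpers below are hypothetical and exist only for this illustration.

```python
ASSERT_VERSION = """
def check(n):
    assert n > 0, "n must be positive"
    return n
"""

RAISE_VERSION = """
def check(n):
    if n <= 0:
        raise ValueError("n must be positive")
    return n
"""


def run(source, optimize):
    """Compile `source` at the given optimization level and return its namespace."""
    code = compile(source, "<sketch>", "exec", optimize=optimize)
    namespace = {}
    exec(code, namespace)
    return namespace


def outcome(check, n):
    """Return the result of check(n), or the name of the exception it raised."""
    try:
        return check(n)
    except (AssertionError, ValueError) as err:
        return type(err).__name__


results = {
    ("assert", 0): outcome(run(ASSERT_VERSION, 0)["check"], -1),  # "AssertionError"
    ("assert", 1): outcome(run(ASSERT_VERSION, 1)["check"], -1),  # -1: check was stripped
    ("raise", 0): outcome(run(RAISE_VERSION, 0)["check"], -1),    # "ValueError"
    ("raise", 1): outcome(run(RAISE_VERSION, 1)["check"], -1),    # "ValueError"
}
```

Only the explicit `raise` rejects the invalid input at both optimization levels.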

Pull Request resolved: https://github.com/pytorch/audio/pull/2578

Reviewed By: mthrok

Differential Revision: D38158122

fbshipit-source-id: da561145a6e021238e9e9df10ab8d2d3a751fb69

Replace assert with raise (#2579) · 0f4e1e8c

Piyush Soni authored Jul 27, 2022

Summary:
`assert` is not executed when running in optimized mode.

This commit replaces all instances of "assert" in /fbcode/pytorch/audio/torchaudio/functional/functional.py

Pull Request resolved: https://github.com/pytorch/audio/pull/2579

Reviewed By: mthrok

Differential Revision: D38158280

fbshipit-source-id: f8d7fca1c8f9b3955c6ca312b16947eb12894d81

26 Jul, 2022 5 commits

Fix argument validation in TorchAudio datasets (#2571) · 5bf73b59

Yu Shi authored Jul 26, 2022

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/2571

Per T127106783, replace `assert` statement with `if _ then raise` statement to enforce the assertion even in optimized mode

Reviewed By: mthrok

Differential Revision: D38123481

fbshipit-source-id: 19321f7467bfd993b38bd9e44fcd01e5f5e64b87

Dataset docstring change (#2575) · 379487de

Sean Kim authored Jul 25, 2022

Summary:
Quick docstring change: add an extra line so that the docstring is parsed properly.

Pull Request resolved: https://github.com/pytorch/audio/pull/2575

Reviewed By: mthrok

Differential Revision: D38138566

Pulled By: skim0514

fbshipit-source-id: fc1ed68ed0050e194944714c753fb35adc85b27e

Switch to flashlight decoder from upstream (#2557) · 075a7706

Moto Hira authored Jul 25, 2022

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/2557

Allow the use of flashlight-decoder from upstream

Reviewed By: carolineechen

Differential Revision: D37983846

fbshipit-source-id: edb1b701bd18718b3b10cf51cc63d3924d4cc073

New Pipeline edits for HDemucs (#2565) · 4c4da32c

Sean Kim authored Jul 25, 2022

Summary:
Created new branch and brought in commits due to rebasing issues, resolved conflicts on new branch, close old branch.

Pull Request resolved: https://github.com/pytorch/audio/pull/2565

Reviewed By: nateanl, mthrok

Differential Revision: D38131189

Pulled By: skim0514

fbshipit-source-id: 96531480cf50562944abb28d70879f21b4609f15

Delay the import of kaldi_io (#2573) · 45f512f6

Abhinav Gupta authored Jul 25, 2022

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/2573

Moved the import of kaldi_io into each function (instead of the top of the module) to delay it until the function is called.
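
The general pattern, importing an optional dependency inside the function that needs it, might look like the sketch below. The helper `lazy_import` and the function `to_json` are hypothetical, and the standard-library `json` module stands in for `kaldi_io` so that the example runs anywhere.

```python
import importlib


def lazy_import(name):
    """Import `name` at call time and give a clearer error if it is missing."""
    try:
        return importlib.import_module(name)
    except ImportError as err:
        raise ImportError(
            f"This function requires the optional package '{name}'."
        ) from err


def to_json(obj, module_name="json"):
    # The import happens here, not when the enclosing module is loaded,
    # so importing the enclosing module never fails on a missing dependency.
    backend = lazy_import(module_name)
    return backend.dumps(obj)


encoded = to_json({"a": 1})
```

Users who never call the function never pay the import cost and never see an import error for a package they do not use.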

Reviewed By: mthrok

Differential Revision: D38108022

fbshipit-source-id: 4ba8cc6a942a00de83668bbb7e361d5ae8b773eb

25 Jul, 2022 3 commits

[BC-breaking] Fix momentum in transforms.GriffinLim (#2568) · 1634ed01

proxyphi authored Jul 25, 2022

Summary:
The momentum in the GriffinLim transform is modified before being passed
to the functional, causing inconsistency between the functional and the transform.

Fix this by making it pass through in transform.

Fixes https://github.com/pytorch/audio/issues/2567
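
To see why an extra modification in the transform matters, consider the sketch below. It assumes the modification was a rescaling of the form `m / (1 + m)` that the functional also applies internally; the commit message does not state this, and all function names here are illustrative.

```python
def rescale(momentum):
    # Assumed form of the rescaling; not confirmed by the commit message.
    return momentum / (1 + momentum)


def functional_effective_momentum(momentum):
    """Momentum effectively used when calling the functional directly."""
    return rescale(momentum)


def transform_effective_momentum_before_fix(momentum):
    """The transform rescaled first, then the functional rescaled again."""
    return functional_effective_momentum(rescale(momentum))


def transform_effective_momentum_after_fix(momentum):
    """The transform passes the value through unchanged."""
    return functional_effective_momentum(momentum)


user_value = 0.99
before = transform_effective_momentum_before_fix(user_value)
after = transform_effective_momentum_after_fix(user_value)
direct = functional_effective_momentum(user_value)
```

Under this assumption the same user-facing `momentum` produced a smaller effective value through the transform than through the functional. That would also explain the BC-breaking label, since transform outputs change for a given argument.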

Pull Request resolved: https://github.com/pytorch/audio/pull/2568

Reviewed By: nateanl

Differential Revision: D38117632

Pulled By: mthrok

fbshipit-source-id: 99754be4b3b6dea45ba115aaea9fb6d7285bc2c9

Integration test fix deleting temporary directory (#2569) · 8dcf06ac

Sean Kim authored Jul 25, 2022

Summary:
Previous issue: `--use-tmp-hub-dir` expected the temporary directories used to store large files to be deleted after each test case, but pytest only erases them after 3 full test sessions. This commit fixes this by creating a new subdirectory in each test case and deleting it manually. https://github.com/pytorch/audio/pull/2565#discussion_r929007101
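
A minimal sketch of the manual-cleanup idea, using only the standard library. The context manager `per_test_subdir` is a hypothetical name; the real test suite wires this into its own fixtures.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def per_test_subdir(parent):
    """Create a fresh subdirectory and remove it as soon as the test is done."""
    path = tempfile.mkdtemp(dir=parent)
    try:
        yield path
    finally:
        # Delete immediately instead of waiting for the test runner's own cleanup.
        shutil.rmtree(path, ignore_errors=True)


parent = tempfile.mkdtemp()
with per_test_subdir(parent) as sub:
    # Stand-in for a large downloaded model file.
    with open(os.path.join(sub, "large.bin"), "wb") as f:
        f.write(b"\0" * 1024)
    existed_during = os.path.isdir(sub)
exists_after = os.path.exists(sub)
shutil.rmtree(parent)
```

Because the subdirectory is removed in a `finally` clause, the disk space is released even when the test body raises.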

Pull Request resolved: https://github.com/pytorch/audio/pull/2569

Reviewed By: nateanl

Differential Revision: D38117848

Pulled By: skim0514

fbshipit-source-id: 3767cb8df1238fd6218f6aaa58d5d583cea72699

Fix build_docs job (#2543) · 81780c95

moto authored Jul 25, 2022

Summary:
This commit fixes the build_docs job timeout by pinning `resampy=0.2.2`.

For some mysterious reason, `resampy=0.3.1` causes slowdown of unrelated code. https://github.com/bmcfee/resampy/issues/106

Pull Request resolved: https://github.com/pytorch/audio/pull/2543

Reviewed By: carolineechen

Differential Revision: D38115003

Pulled By: mthrok

fbshipit-source-id: 67cd1c73dd4adb3091e0b88aaf5c31de0dd4b87e

22 Jul, 2022 2 commits

Add dimension and shape check (#2563) · b1f510fa

Sean Kim authored Jul 22, 2022

Summary:
Don't allow users to input incorrect dimensions

Pull Request resolved: https://github.com/pytorch/audio/pull/2563

Reviewed By: carolineechen

Differential Revision: D38074360

Pulled By: skim0514

fbshipit-source-id: 7bcae515706eb358ca6f68c50c7c0ccace1c3f95

Add documents for SourceSeparationBundle (#2559) · 6cee56ab

Zhaoheng Ni authored Jul 22, 2022

Summary:
- Add documentation page for `SourceSeparationBundle` and `CONVTASNET_BASE_LIBRI2MIX`.
- Add citation of Libri2Mix dataset in the bundle documentation.
- The URL in the integration test should be built with a slash instead of `os.path.join`, which fails on Windows. Change it to an f-string.
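
The Windows failure in the last bullet can be reproduced on any platform, because `ntpath` is the implementation behind `os.path` on Windows and `posixpath` the one on Linux and macOS. The base URL and file name below are placeholders.

```python
import ntpath
import posixpath

BASE_URL = "https://example.com/assets"  # hypothetical base URL
FILE_NAME = "mixture.wav"                # hypothetical file name

# What os.path.join produces on Windows: a backslash ends up inside the URL.
joined_on_windows = ntpath.join(BASE_URL, FILE_NAME)

# What os.path.join produces on Linux/macOS: happens to look right.
joined_on_posix = posixpath.join(BASE_URL, FILE_NAME)

# Platform-independent: URLs always use forward slashes.
url = f"{BASE_URL}/{FILE_NAME}"
```

`os.path.join` is meant for file-system paths; for URLs, plain string formatting or `urllib.parse.urljoin` avoids the platform dependence.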

Pull Request resolved: https://github.com/pytorch/audio/pull/2559

Reviewed By: carolineechen

Differential Revision: D38036116

Pulled By: nateanl

fbshipit-source-id: 736732805191113955badfec3955e2e24e8f4836

21 Jul, 2022 4 commits

fix resample (#2561) · c18a103b

Sean Kim authored Jul 21, 2022

Summary:
Added back device in case of tensor creation

Pull Request resolved: https://github.com/pytorch/audio/pull/2561

Reviewed By: mthrok

Differential Revision: D38035351

Pulled By: skim0514

fbshipit-source-id: bdea07cbb34d0aa487187cded1a5636da6623d96

Fix fall back failure in sox_io backend (#2560) · 4778c2e5

Jumon Nozaki authored Jul 21, 2022

Summary:
Fix the fallback function of load fileobj function in sox_io backend.

The typo in the fallback function prevents showing the intended error message.

Pull Request resolved: https://github.com/pytorch/audio/pull/2560

Reviewed By: carolineechen, nateanl

Differential Revision: D38035077

Pulled By: mthrok

fbshipit-source-id: 53c91c0569c7e7bba611aed6ea748dbd2f323221

ci: Update macos runners to AWS self hosted (#2556) · f0088599

Eli Uriegas authored Jul 21, 2022

Summary:
Updates the runner to the latest Apple silicon machines we have, which
also run macOS 12.4

Similar to https://github.com/pytorch/vision/pull/6290

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Pull Request resolved: https://github.com/pytorch/audio/pull/2556

Reviewed By: atalman, mthrok

Differential Revision: D37999959

Pulled By: seemethere

fbshipit-source-id: 01d2ff01e48dcc0c4e33ed81758886fa19642aa3

Add SourceSeparationBundle to prototype (#2440) · 83362580

Zhaoheng Ni authored Jul 20, 2022

Summary:
- Add SourceSeparationBundle class for source separation pipeline
- Add `CONVTASNET_BASE_LIBRI2MIX` that is trained on Libri2Mix dataset.
- Add integration test with example mixture audio and expected scale-invariant signal-to-distortion ratio (Si-SDR) score. The test computes the Si-SDR score with the permutation-invariant training (PIT) criterion for all permutations of sources and uses the highest value as the final output. The test verifies that the score is equal to or larger than the expected value.
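
The scoring described in the last bullet can be sketched in pure Python as follows. The function names, the synthetic signals, and the 20 dB threshold are assumptions made for illustration; the actual test operates on tensors and real audio.

```python
import math
from itertools import permutations


def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant signal-to-distortion ratio in dB (no mean removal)."""
    dot = sum(e * r for e, r in zip(estimate, reference))
    ref_energy = sum(r * r for r in reference) + eps
    alpha = dot / ref_energy  # optimal scaling of the reference
    target = [alpha * r for r in reference]
    residual = [e - t for e, t in zip(estimate, target)]
    target_energy = sum(t * t for t in target)
    residual_energy = sum(v * v for v in residual)
    return 10.0 * math.log10((target_energy + eps) / (residual_energy + eps))


def best_permutation_si_sdr(estimates, references):
    """Try every assignment of estimates to references and keep the best mean score."""
    best_score, best_perm = -math.inf, None
    for perm in permutations(range(len(references))):
        scores = [si_sdr(estimates[i], references[p]) for i, p in enumerate(perm)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_score, best_perm = mean_score, perm
    return best_score, best_perm


# Two synthetic sources; the "model" outputs them in swapped order with small errors.
n = range(2000)
source_a = [math.sin(0.05 * i) for i in n]
source_b = [math.sin(0.13 * i) for i in n]
error = [0.01 * math.sin(1.7 * i) for i in n]
estimates = [
    [b + e for b, e in zip(source_b, error)],
    [a - e for a, e in zip(source_a, error)],
]

best_score, best_perm = best_permutation_si_sdr(estimates, [source_a, source_b])
EXPECTED_MIN_SI_SDR = 20.0  # hypothetical threshold
passed = best_score >= EXPECTED_MIN_SI_SDR
```

Searching over permutations is what makes the check independent of the order in which the model emits the separated sources.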

Pull Request resolved: https://github.com/pytorch/audio/pull/2440

Reviewed By: mthrok

Differential Revision: D37997646

Pulled By: nateanl

fbshipit-source-id: c951bcbbe8b7ed9553cb8793d6dc1ef90d5a29fe

20 Jul, 2022 1 commit

Speed up resample with kernel generation modification (#2553) · 5c6e602c

Sean Kim authored Jul 20, 2022

Summary:
Modification from pull request https://github.com/pytorch/audio/issues/2415 to improve resample.

Benchmarked at an 89% time reduction, tested in comparison to the original resample method.

Pull Request resolved: https://github.com/pytorch/audio/pull/2553

Reviewed By: carolineechen

Differential Revision: D37997533

Pulled By: skim0514

fbshipit-source-id: ef4b719450ac26794db6ea01f9882509f4fda5cf

19 Jul, 2022 3 commits

Replace `runtime_error` exception with `TORCH_CHECK` in TorchAudio ffmpeg dir (2/2) (#2551) · a2d6fee2

John Lu authored Jul 19, 2022

Summary:
`std::runtime_error` does not preserve the C++ stack trace, so it is unclear to users what went wrong internally.

PyTorch's `TORCH_CHECK` macro can print the C++ stack trace when the `TORCH_SHOW_CPP_STACKTRACES` environment variable is set to 1.

Pull Request resolved: https://github.com/pytorch/audio/pull/2551

Improve assertion for TorchAudio ffmpeg directory

Reviewed By: mthrok

Differential Revision: D37915732

fbshipit-source-id: 9f597eb00cadd0dc6a1bbf8f7d5c8092804ef685

Remove boost (#2552) · ee631d6b

moto authored Jul 19, 2022

Summary:
After reviewing the code for KenLM it turned out that we can build it without boost.

Pull Request resolved: https://github.com/pytorch/audio/pull/2552

Reviewed By: xiaohui-zhang

Differential Revision: D37949699

Pulled By: mthrok

fbshipit-source-id: 4a4ffae4220d0b764b53f52b93040670d91a84a3

Adding pipeline changes, factory functions to HDemucs (#2547) · 62854588

Sean Kim authored Jul 19, 2022

Summary:
Add factory functions for the HDemucs class and test them in the test files.

Pull Request resolved: https://github.com/pytorch/audio/pull/2547

Reviewed By: carolineechen

Differential Revision: D37948600

Pulled By: skim0514

fbshipit-source-id: 7ac4e4a71519450cfbbc24ff7d7e70521f676040

18 Jul, 2022 1 commit

Replace `runtime_error` exception with `TORCH_CHECK` in TorchAudio ffmpeg dir (1/2) (#2550) · af6ebbae

John Lu authored Jul 18, 2022

Summary:
`std::runtime_error` does not preserve the C++ stack trace, so it is unclear to users what went wrong internally.

PyTorch's `TORCH_CHECK` macro can print the C++ stack trace when the `TORCH_SHOW_CPP_STACKTRACES` environment variable is set to 1.

Pull Request resolved: https://github.com/pytorch/audio/pull/2550

Improves assertion for TorchAudio ffmpeg directory

Reviewed By: mthrok

Differential Revision: D37914953

fbshipit-source-id: 7704c41bb88b0616ae2e73961a5496bc0d95cf13

15 Jul, 2022 1 commit

Set MACOSX_DEPLOYMENT_TARGET=10.9 in binary build jobs (#2546) · b53ff1b9

moto authored Jul 15, 2022

Summary:
Recent CircleCI migration https://github.com/pytorch/audio/pull/2529
silently bumped the minimum supported macOS version to 11.

PyTorch still supports 10.9 and the ecosystem still uses 10.9.
Issue: https://github.com/pytorch/audio/issues/2536

This commit sets MACOSX_DEPLOYMENT_TARGET=10.9, so that binary
distributions are compatible with macOS 10.9.

Pull Request resolved: https://github.com/pytorch/audio/pull/2546

Reviewed By: atalman

Differential Revision: D37854586

Pulled By: mthrok

fbshipit-source-id: a43986ae4de9ef51a4261e0f9fe58e88b4b72148

12 Jul, 2022 6 commits

Simplify the requirements to minimum runtime dependencies (#2313) · 632ea670

moto authored Jul 12, 2022

Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2313

Reviewed By: carolineechen, nateanl

Differential Revision: D37799552

Pulled By: mthrok

fbshipit-source-id: 12e27fccb7098f3142e9ca0b748c71325cd324ee

Docstring change for Hybrid Demucs (#2542) · 99303143

Sean Kim authored Jul 12, 2022

Summary:
Small edit to docstring for kernel

Pull Request resolved: https://github.com/pytorch/audio/pull/2542

Reviewed By: carolineechen

Differential Revision: D37797937

Pulled By: skim0514

fbshipit-source-id: 4bdd1e3ddb49cbdf2bd5367edb03cf9603d4ec6e

Simplify HW acceleration code (#2534) · 4ba56323

moto authored Jul 12, 2022

Summary:
FFmpeg's API provides multiple ways to initialize a decoder. This PR simplifies the initialization by delegating the HW device context management to FFmpeg's native code.

Pull Request resolved: https://github.com/pytorch/audio/pull/2534

Reviewed By: hwangjeff

Differential Revision: D37734573

Pulled By: mthrok

fbshipit-source-id: e61736b4d4d2ca6e94d8965abd93b4e9a68e7351

Hybrid Demucs model implementation (#2506) · 608b8ea6

Sean Kim authored Jul 12, 2022

Summary:
Draft PR with initial model implementation with minor changes from previous implementation

Pull Request resolved: https://github.com/pytorch/audio/pull/2506

Reviewed By: nateanl

Differential Revision: D37762671

Pulled By: skim0514

fbshipit-source-id: b7dc0a6ef725d6ae6d76c23c882623f7d339977c

Clean up the interface around dictionary (#2533) · e2641452

moto authored Jul 11, 2022

Summary:
Python dictionary is bound to different types in TorchBind and PyBind.
StreamReader has methods that receive and return dictionary.

This commit cleans up the treatment of dictionaries and consolidates
the helper functions.

* The core implementation and TorchBind both use `c10::Dict`.
* PyBind version uses `std::map` and converts it to `c10::Dict`.
* The helper functions to convert `std::map` <-> `c10::Dict` are consolidated in pybind directory.
* The wrapper methods are implemented in `pybind` dir.

Pull Request resolved: https://github.com/pytorch/audio/pull/2533

Reviewed By: hwangjeff

Differential Revision: D37731866

Pulled By: mthrok

fbshipit-source-id: 5a5cf1372668f7d3aacc0bb461bc69fa07212f3f

Fix docstring (#2540) · 05d2580a

Zhaoheng Ni authored Jul 11, 2022

Summary:
The docstring of `apply_beamforming` triggers a warning when building the documentation page. Fix it in this PR.

Pull Request resolved: https://github.com/pytorch/audio/pull/2540

Reviewed By: mthrok

Differential Revision: D37763745

Pulled By: nateanl

fbshipit-source-id: 0e9f1e098865af032b00ac56d918cb9d2ffc5024
