"examples/pytorch/vscode:/vscode.git/clone" did not exist on "b4ad59d77f3fff30f148a0391b4cdfc6ef19915c"
- 14 Sep, 2022 1 commit
-
-
Caroline Chen authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2673 Reviewed By: mthrok Differential Revision: D39507612 Pulled By: carolineechen fbshipit-source-id: 3a9ee53f72cabd6e3085c76867017be4a6ed7f53
-
- 13 Sep, 2022 1 commit
-
-
Zhaoheng Ni authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2669 Reviewed By: carolineechen, mthrok Differential Revision: D39433560 Pulled By: nateanl fbshipit-source-id: 5b652b31c00badb37b27a32ac25b422a5bcc74cb
-
- 12 Sep, 2022 1 commit
-
-
Caroline Chen authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2668 Reviewed By: nateanl, mthrok Differential Revision: D39433671 Pulled By: carolineechen fbshipit-source-id: 3545a5b4019832861c34fd8c05e5f8600fd80d5c
-
- 01 Sep, 2022 1 commit
-
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2648 Reviewed By: nateanl Differential Revision: D38976874 Pulled By: mthrok fbshipit-source-id: 0541dea2a633d97000b4b8609ff6b83f6b82c864
-
- 24 Aug, 2022 1 commit
-
-
moto authored
Summary: This commit adds FFmpeg-based encoder StreamWriter class. StreamWriter is pretty much the opposite of StreamReader class, and it supports; * Encoding audio / still image / video * Exporting to local file / streaming protocol / devices etc... * File-like object support (in later commit) * HW video encoding (in later commit) See also: https://fburl.com/gslide/z85kn5a9 (Meta internal) Pull Request resolved: https://github.com/pytorch/audio/pull/2628 Reviewed By: nateanl Differential Revision: D38816650 Pulled By: mthrok fbshipit-source-id: a9343b0d55755e186971dc96fb86eb52daa003c8
-
- 11 Aug, 2022 1 commit
-
-
hwangjeff authored
Summary: Adds function `add_noise`, which computes and returns the sum of a waveform and scaled noise. Pull Request resolved: https://github.com/pytorch/audio/pull/2608 Reviewed By: nateanl Differential Revision: D38557141 Pulled By: hwangjeff fbshipit-source-id: 1457fa213f43ca5b4333d3c7580971655d4260a0
-
- 09 Aug, 2022 1 commit
-
-
Caroline Chen authored
Summary: Expose flashlight's LM and LMState classes to support decoding with custom language models, including NN LMs. The `ctc_decoder` API is as follows - To decode with KenLM, pass in KenLM language model path to `lm` variable - To decode with custom LM, create Python class with `CTCDecoderLM` subclass, and pass in the class to `lm` variable. Additionally create a file of LM words listed in order of the LM index, with a word per line, and pass in the file to `lm_path`. - To decode without a language model, set `lm` to `None` (default) Validated against fairseq w2l decoder on sample LibriSpeech dataset and LM. Code for validation can be found [here](https://github.com/facebookresearch/fairseq/compare/main...carolineechen:fairseq:ctc-decoder). Also added unit tests to validate custom implementations of ZeroLM and KenLM, and also using a biased LM. Follow ups: - Train simple LM on LibriSpeech and demonstrate usage in tutorial or examples directory cc jacobkahn Pull Request resolved: https://github.com/pytorch/audio/pull/2528 Reviewed By: mthrok Differential Revision: D38243802 Pulled By: carolineechen fbshipit-source-id: 445e78f6c20bda655aabf819fc0f771fe68c73d7
-
- 05 Aug, 2022 1 commit
-
-
hwangjeff authored
Summary: Adds functions `convolve` and `fftconvolve`, which compute the convolution of two tensors along their trailing dimension. The former performs the convolution directly, whereas the latter performs it using FFT. Pull Request resolved: https://github.com/pytorch/audio/pull/2602 Reviewed By: nateanl, mthrok Differential Revision: D38450771 Pulled By: hwangjeff fbshipit-source-id: b2d1e063ba21eafeddf317d60749e7120b14292b
-
- 03 Aug, 2022 2 commits
-
-
Sean Kim authored
Summary: Add new model pretrained weights and tests Pull Request resolved: https://github.com/pytorch/audio/pull/2601 Reviewed By: carolineechen, nateanl Differential Revision: D38396673 Pulled By: skim0514 fbshipit-source-id: e06f97d28508543bc18e671344386a947bc870c1
-
bshall authored
Summary: I took a stab at implementing the ITU-R BS.1770-4 loudness recommendation (closes https://github.com/pytorch/audio/issues/1205). To give some more details: - I've implemented K-weighting following csteinmetz1 instead of BrechtDeMan since it fit well with torchaudio's already implemented filters (`treble_biquad` and `highpass_biquad`). - I've added four audio files to test compliance with the recommendation. These are linked in [this pdf](https://www.itu.int/dms_pub/itu-r/opb/rep/R-REP-BS.2217-2-2016-PDF-E.pdf). There are many more test files there but I didn't want to bog down the assets directory with too many files. Let me know if I should add or remove anything. - I've kept many of the constant internal to the function (e.g. the block duration, overlap, and the absolute threshold gamma). I'm not sure if these should be exposed in the signature. - I've implemented support for up to 5 channels (following both csteinmetz1 and BrechtDeMan). The recommendation includes weights for up to 24 channels. Is there any convention for how many channels to support? I hope this is helpful! looking forward to hearing from you. Pull Request resolved: https://github.com/pytorch/audio/pull/2472 Reviewed By: hwangjeff Differential Revision: D38389155 Pulled By: carolineechen fbshipit-source-id: fcc86d864c04ab2bedaa9acd941ebc4478ca6904
-
- 28 Jul, 2022 2 commits
-
-
Sean Kim authored
Summary: Add str to normalized parameter to enable frame_length based normalization to align with torch implementation of stft. Addresses issue https://github.com/pytorch/audio/issues/2104 Pull Request resolved: https://github.com/pytorch/audio/pull/2554 Reviewed By: carolineechen, mthrok Differential Revision: D38247554 Pulled By: skim0514 fbshipit-source-id: c243c7a6b8fda2a1e565cef4600f7c5a06baf602
-
Sean Kim authored
Summary: Edit factory function's docstrings. Pull Request resolved: https://github.com/pytorch/audio/pull/2570 Reviewed By: carolineechen Differential Revision: D38250369 Pulled By: skim0514 fbshipit-source-id: fa777e37d7cc517cf4ff1842d5585bf36558f50a
-
- 26 Jul, 2022 1 commit
-
-
Sean Kim authored
Summary: Created new branch and brought in commits due to rebasing issues, resolved conflicts on new branch, close old branch. Pull Request resolved: https://github.com/pytorch/audio/pull/2565 Reviewed By: nateanl, mthrok Differential Revision: D38131189 Pulled By: skim0514 fbshipit-source-id: 96531480cf50562944abb28d70879f21b4609f15
-
- 25 Jul, 2022 1 commit
-
-
Sean Kim authored
Summary: Previous Issue: --use-tmp-hub-dir expected the temp directories used to store large file to be deleted after each test case, but pytest erases directories after 3 full test sessions. This commit fixes by manually deleting a new subdirectory created in each test case. https://github.com/pytorch/audio/pull/2565#discussion_r929007101 Pull Request resolved: https://github.com/pytorch/audio/pull/2569 Reviewed By: nateanl Differential Revision: D38117848 Pulled By: skim0514 fbshipit-source-id: 3767cb8df1238fd6218f6aaa58d5d583cea72699
-
- 22 Jul, 2022 1 commit
-
-
Zhaoheng Ni authored
Summary: - Add documentation page for `SourceSeparationBundle` and `CONVTASNET_BASE_LIBRI2MIX`. - Add citation of Libri2Mix dataset in the bundle documentation. - url in integration test should use slash instead of `os.path.join` as it will fail on Windows. Change it to f-string. Pull Request resolved: https://github.com/pytorch/audio/pull/2559 Reviewed By: carolineechen Differential Revision: D38036116 Pulled By: nateanl fbshipit-source-id: 736732805191113955badfec3955e2e24e8f4836
-
- 21 Jul, 2022 1 commit
-
-
Zhaoheng Ni authored
Summary: - Add SourceSeparationBundle class for source separation pipeline - Add `CONVTASNET_BASE_LIBRI2MIX` that is trained on Libri2Mix dataset. - Add integration test with example mixture audio and expected scale-invariant signal-to-distortion ratio (Si-SDR) score. The test computes the Si-SDR score with permutation-invariant training (PIT) criterion for all permutations of sources and use the highest value as the final output. The test verifies if the score is equal to or larger than the expected value. Pull Request resolved: https://github.com/pytorch/audio/pull/2440 Reviewed By: mthrok Differential Revision: D37997646 Pulled By: nateanl fbshipit-source-id: c951bcbbe8b7ed9553cb8793d6dc1ef90d5a29fe
-
- 19 Jul, 2022 1 commit
-
-
Sean Kim authored
Summary: Factory functions have been added to HDemucs class and test the implementation within the testing files. Pull Request resolved: https://github.com/pytorch/audio/pull/2547 Reviewed By: carolineechen Differential Revision: D37948600 Pulled By: skim0514 fbshipit-source-id: 7ac4e4a71519450cfbbc24ff7d7e70521f676040
-
- 12 Jul, 2022 1 commit
-
-
Sean Kim authored
Summary: Draft PR with initial model implementation with minor changes from previous implementation Pull Request resolved: https://github.com/pytorch/audio/pull/2506 Reviewed By: nateanl Differential Revision: D37762671 Pulled By: skim0514 fbshipit-source-id: b7dc0a6ef725d6ae6d76c23c882623f7d339977c
-
- 07 Jul, 2022 1 commit
-
-
moto authored
Summary: This commit add support for `"yuv444p"` type as output format of StreamReader. Pull Request resolved: https://github.com/pytorch/audio/pull/2516 Reviewed By: hwangjeff Differential Revision: D37659715 Pulled By: mthrok fbshipit-source-id: eae9b5590d8f138a6ebf3808c08adfe068f11a2b
-
- 06 Jul, 2022 1 commit
-
-
Caroline Chen authored
Summary: fluent dataset test currently fails on windows, due to new line generation in csv writer in testing and incorrect path parsing in dataset impl. Pull Request resolved: https://github.com/pytorch/audio/pull/2510 Reviewed By: carolineechen Differential Revision: D37573203 Pulled By: mthrok fbshipit-source-id: 4868bc649690c7e596b002686c6128ce735d3564
-
- 28 Jun, 2022 1 commit
-
-
moto authored
Summary: Small clean up in ffmpeg binding code. 1. Make `get_option_dict` and `clean_up_dict` public utility 2. Merge the exception into `clean_up_dict` 3. Get rid of custom string join function and use `c10::Join`. Pull Request resolved: https://github.com/pytorch/audio/pull/2507 Reviewed By: hwangjeff Differential Revision: D37466022 Pulled By: mthrok fbshipit-source-id: 44b769ac6ff1ab20e6d6ae086cd1447deacb5969
-
- 27 Jun, 2022 4 commits
-
-
moto authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2511 Reviewed By: nateanl Differential Revision: D37461021 Pulled By: mthrok fbshipit-source-id: 6f894c02bbefc5afda0f9584d26ad785f7c71ee4
-
Zhaoheng Ni authored
Summary: In https://github.com/pytorch/audio/issues/2283, torchaudio's downloading function is updated to reduce code duplication. The links in `EMFORMER_RNNT_BASE_LIBRISPEECH` are updated, but the ones in prototype pipelines are not. This PR addresses it by updating the download links of `EMFORMER_RNNT_BASE_MUSTC` and `EMFORMER_RNNT_BASE_TEDLIUM3` in prototype. Corresponding integration tests are added as well. Pull Request resolved: https://github.com/pytorch/audio/pull/2444 Reviewed By: mthrok Differential Revision: D37389178 Pulled By: nateanl fbshipit-source-id: 46598dd71c95be47d1e1b54cef89ea51d280e17a
-
moto authored
Summary: Follow-up of https://github.com/pytorch/audio/issues/2464. Add utility function to fetch the versions of FFmpeg. Pull Request resolved: https://github.com/pytorch/audio/pull/2467 Reviewed By: carolineechen Differential Revision: D37028006 Pulled By: mthrok fbshipit-source-id: 72adce1e6b43985760ce55b715b0e59af5244fdb
-
Zhaoheng Ni authored
Summary: This PR adds two dataset classes of VoxCeleb1 corpus. - `VoxCeleb1Identification` Each data sample contains the waveform, sample rate, speaker id, and the file id. - `VoxCeleb1Verification` Each data sample contains a pair of waveforms, sample rate, the label indicating if they are from the same speaker, and the file ids. Pull Request resolved: https://github.com/pytorch/audio/pull/2349 Reviewed By: carolineechen Differential Revision: D35927921 Pulled By: nateanl fbshipit-source-id: 3e07ddd329178777698841565053eb59befe6449
-
- 23 Jun, 2022 1 commit
-
-
Summary: Meta: **If you take no action, this diff will be automatically accepted on 2022-06-23.** (To remove yourself from auto-accept diffs and just let them all land, add yourself to [this Butterfly rule](https://www.internalfb.com/butterfly/rule/904302247110220)) Produced by `tools/arcanist/lint/codemods/black-fbsource`. #nocancel Rules run: - CodemodTransformerSimpleShell Config Oncall: [lint](https://our.intern.facebook.com/intern/oncall3/?shortname=lint) CodemodConfig: [CodemodConfigFBSourceBlackLinter](https://www.internalfb.com/code/www/flib/intern/codemod_service/config/fbsource_arc_f/CodemodConfigFBSourceBlackLinter.php) ConfigType: php Sandcastle URL: https://www.internalfb.com/intern/sandcastle/job/13510799586951394/ This diff was automatically created with CodemodService. To learn more about CodemodService, check out the [CodemodService wiki](https://fburl.com/CodemodService). _____ ## Questions / Comments / Feedback? **[Click here to give feedback about this diff](https://www.internalfb.com/codemod_service/feedback?sandcastle_job_id=13510799586951394).** * Returning back to author or abandoning this diff will only cause the diff to be regenerated in the future. * Do **NOT** post in the CodemodService Feedback group about this specific diff. drop-conflicts Reviewed By: adamjernst Differential Revision: D37375235 fbshipit-source-id: 3d7eb39e5c0539a78d1412f37562dec90b0fc759
-
- 21 Jun, 2022 1 commit
-
-
Sean Kim authored
Summary: Create dataset handler and tests for new dataset. Manually tested and unit tested to test validity. Pre-commit ran for style checks. Pull Request resolved: https://github.com/pytorch/audio/pull/2484 Reviewed By: carolineechen, nateanl Differential Revision: D37250556 Pulled By: skim0514 fbshipit-source-id: d2c8d73d22fd9d7282026265676f3eab1e178d51
-
- 20 Jun, 2022 1 commit
-
-
Caroline Chen authored
Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2480 Reviewed By: nateanl Differential Revision: D37249571 Pulled By: carolineechen fbshipit-source-id: caefeec4253c91f2579655a0c1735edaeed51be9
-
- 13 Jun, 2022 1 commit
-
-
Reviewed By: ivanmurashko Differential Revision: D37103342 fbshipit-source-id: adc908c790a413384bd88a75d3c2b4b0974c6674
-
- 10 Jun, 2022 1 commit
-
-
Sean Kim authored
Summary: Split existing Pitchshift into multiple helper functions in order to cache kernel and speed up overall process addressing https://github.com/pytorch/audio/issues/2359. Existing unit tests pass. edit: functional and transforms unit test pass. Adopted lazy initialization to avoid BC-breaking. Pull Request resolved: https://github.com/pytorch/audio/pull/2441 Reviewed By: carolineechen Differential Revision: D36905582 Pulled By: skim0514 fbshipit-source-id: 6780db3ac8a29d59017a6abe7e82ce1fd17aaac2
-
- 08 Jun, 2022 2 commits
-
-
moto authored
Summary: In https://github.com/pytorch/audio/issues/2461, `metadata` field was added to StreamInfo. However, the value attached to this new field was source-level metadata, while each stream can have different metadata. * source level metadata [AVFormatContext->metadata](https://ffmpeg.org/doxygen/4.1/structAVFormatContext.html#a3019a56080ed2e3297ff25bc2ff88adf) * stream level metadata [AVFormatContext->streams[]->metadata](https://ffmpeg.org/doxygen/4.1/structAVStream.html#a50d250a128a3da9ce3d135e84213fb82) This commit moves source level metadata to dedicated method, `get_metadata`, and fix the stream-level metadata to report stream metadata. Pull Request resolved: https://github.com/pytorch/audio/pull/2464 Reviewed By: hwangjeff, xiaohui-zhang Differential Revision: D36995452 Pulled By: mthrok fbshipit-source-id: 534be1f7feb07790a0ce8624c336cdb7b65a8697
-
moto authored
Summary: Add metadata, such as ID3 (https://github.com/pytorch/audio/commit/7d98db0567cb60fabcc173949b8c08e3a3487ac2)tag to `StreamReaderSourceAudioStream`. Pull Request resolved: https://github.com/pytorch/audio/pull/2461 Reviewed By: hwangjeff Differential Revision: D36985656 Pulled By: mthrok fbshipit-source-id: e66f9e6e980eb57c378cc643a8979b6b7813dae7
-
- 07 Jun, 2022 1 commit
-
-
moto authored
Summary: Import StreamReader from the new location Pull Request resolved: https://github.com/pytorch/audio/pull/2455 Reviewed By: nateanl Differential Revision: D36959668 Pulled By: mthrok fbshipit-source-id: c2b8c9f9dff1ec306ea39c495294faa9208b3c4e
-
- 04 Jun, 2022 1 commit
-
-
moto authored
Summary: Undesired logs are one of the loudest UX complains we get. Yet, loading media files involves uncertainty which is difficult to debug without debug log. This commit introduces utility functions to configure logging level so that we can ask users to enable it when they encounter an issue, while defaulting to non-verbose option. Pull Request resolved: https://github.com/pytorch/audio/pull/2439 Reviewed By: hwangjeff, xiaohui-zhang Differential Revision: D36903763 Pulled By: mthrok fbshipit-source-id: f4ddd9915b13197c2a2eb97e965005b8b5b8d987
-
- 03 Jun, 2022 1 commit
-
-
Sean Kim authored
Summary: For test files where applicable, removed manual seeds where applicable. Refactoring https://github.com/pytorch/audio/issues/2267 Pull Request resolved: https://github.com/pytorch/audio/pull/2436 Reviewed By: carolineechen Differential Revision: D36896854 Pulled By: skim0514 fbshipit-source-id: 7b4dd8a8dbfbef271f5cc56564dc83a760407e6c
-
- 02 Jun, 2022 3 commits
-
-
Caroline Chen authored
Summary: update QUESST14 getitem to include docstrings and additionally return sample rate Pull Request resolved: https://github.com/pytorch/audio/pull/2435 Reviewed By: nateanl Differential Revision: D36864254 Pulled By: carolineechen fbshipit-source-id: 9e68bbc5de27ad2f32f6b298414103c4f6784801
-
moto authored
Summary: Remove the code related to libmad, which had been disabled in https://github.com/pytorch/audio/issues/2354 In https://github.com/pytorch/audio/issues/2419, we mp3 decoding to ffmpeg. But CI tests were still using libmad. This commit completely removes libmad from torchaudio. This is BC-breaking change as `apply_sox_effects_file` function cannot handle MP3, and it cannot fallback to ffmpeg. The workaround for this is to use `torchaudio.load` then `apply_sox_effects_tensor`. Pull Request resolved: https://github.com/pytorch/audio/pull/2428 Reviewed By: carolineechen Differential Revision: D36851805 Pulled By: mthrok fbshipit-source-id: f98795c59a1ac61cef511f2bbeac37f7c3c69d55
-
moto authored
Summary: This commit add fallback mechanism to `info` and `load` functions of sox_io backend. If torchaudio is compiled to use FFmpeg, and runtime dependencies are properly loaded, in case `info` and `load` fail, it fallback to FFmpeg-based implementation. BC-breaking changes: - FFmpeg does not report the number of frames for MP3, this is because MP3 does not store the information of the number of frames. It can be estimated from the audio duration and sample rate, but it might be inaccurate, so we keep it 0. Depends on - https://github.com/pytorch/audio/issues/2416 - https://github.com/pytorch/audio/issues/2417 - https://github.com/pytorch/audio/issues/2418 - https://github.com/pytorch/audio/issues/2423 - https://github.com/pytorch/audio/issues/2427 Pull Request resolved: https://github.com/pytorch/audio/pull/2419 Reviewed By: carolineechen Differential Revision: D36740306 Pulled By: mthrok fbshipit-source-id: 9e2ad095b8b39e41404970de0d8d9b5aaa856c97
-
- 01 Jun, 2022 2 commits
-
-
moto authored
Summary: * Update error messages * Update audio stream tests Pull Request resolved: https://github.com/pytorch/audio/pull/2429 Reviewed By: carolineechen, nateanl Differential Revision: D36812769 Pulled By: mthrok fbshipit-source-id: 7a51d0c4dbae558010d2e59412333e4a7f00d318
-
Sean Kim authored
Summary: Bringing in move seed commit from previous open commit https://github.com/pytorch/audio/issues/2267. Organizes seed to utils. Pull Request resolved: https://github.com/pytorch/audio/pull/2425 Reviewed By: carolineechen, nateanl Differential Revision: D36787599 Pulled By: skim0514 fbshipit-source-id: 37a0d632d13d4336a830c4b98bdb04828ed88c20
-