1. 19 Oct, 2022 1 commit
  2. 11 Oct, 2022 1 commit
  3. 10 Oct, 2022 1 commit
• Add unit test for LibriMix dataset (#2659) · c5b8e585
      Zhaoheng Ni authored
      Summary:
      Besides the unit test, the PR also addresses these issues:
- The original `LibriMix` dataset only supports the "min" mode, in which the audio length is the minimum length over all clean sources; this is the default for the source separation task. Users may also want the "max" mode, which allows for end-to-end separation and recognition. The PR adds a ``mode`` argument to let users decide which version of the dataset they want to use (a usage sketch follows below).
- If the task is ``"enh_both"``, the target should be the audio in ``mix_clean`` instead of the separate clean sources. The PR fixes the dataset to use ``mix_clean`` as the target.
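A minimal sketch of the new behavior (the exact keyword names and sample layout are assumptions based on this description, not the definitive API):

```python
from torchaudio.datasets import LibriMix

# "max" mode keeps full-length audio for end-to-end separation and recognition;
# "min" (the previous behavior) truncates to the shortest clean source.
dataset = LibriMix(root="/data/LibriMix", task="enh_both", mode="max")

# Assumed sample layout: (sample_rate, mixture, sources). With task="enh_both",
# the target comes from mix_clean rather than the separate clean sources.
sample_rate, mixture, sources = dataset[0]
```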
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2659
      
      Reviewed By: carolineechen
      
      Differential Revision: D40229227
      
      Pulled By: nateanl
      
      fbshipit-source-id: fc07e0d88a245e1367656d3767cf98168a799235
  4. 09 Oct, 2022 1 commit
  5. 06 Jul, 2022 1 commit
• Fix fluent test for windows (#2510) · 09daa438
      Caroline Chen authored
      Summary:
The fluent dataset test currently fails on Windows, due to newline generation in the CSV writer used in testing and incorrect path parsing in the dataset implementation.
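For reference, the usual way to avoid the extra blank rows Python's ``csv`` writer produces on Windows is to open the file with ``newline=""`` (a general sketch, not the exact change made in this PR):

```python
import csv

# Opening with newline="" stops the csv module from writing "\r\r\n" line
# endings on Windows, which would otherwise show up as blank rows.
with open("data.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["path", "transcription"])
```

Path handling is likewise more robust when built with ``os.path.join`` or ``pathlib`` instead of hard-coded "/" separators.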
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2510
      
      Reviewed By: carolineechen
      
      Differential Revision: D37573203
      
      Pulled By: mthrok
      
      fbshipit-source-id: 4868bc649690c7e596b002686c6128ce735d3564
  6. 27 Jun, 2022 1 commit
• Add VoxCeleb1 dataset (#2349) · 21b2d139
      Zhaoheng Ni authored
      Summary:
This PR adds two dataset classes for the VoxCeleb1 corpus (a brief usage sketch follows this list).
- `VoxCeleb1Identification`
Each data sample contains the waveform, sample rate, speaker id, and file id.
- `VoxCeleb1Verification`
Each data sample contains a pair of waveforms, the sample rate, a label indicating whether the two waveforms are from the same speaker, and the file ids.
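A minimal sketch of reading samples from the two classes, assuming tuple layouts that follow the descriptions above and a ``root``/``subset`` constructor (to be checked against the released documentation):

```python
from torchaudio.datasets import VoxCeleb1Identification, VoxCeleb1Verification

ident = VoxCeleb1Identification(root="/data/VoxCeleb1", subset="train")
waveform, sample_rate, speaker_id, file_id = ident[0]

verif = VoxCeleb1Verification(root="/data/VoxCeleb1")
# label indicates whether the two waveforms come from the same speaker (assumed 1/0).
wav1, wav2, sample_rate, label, file_id1, file_id2 = verif[0]
```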
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2349
      
      Reviewed By: carolineechen
      
      Differential Revision: D35927921
      
      Pulled By: nateanl
      
      fbshipit-source-id: 3e07ddd329178777698841565053eb59befe6449
  7. 23 Jun, 2022 1 commit
  8. 21 Jun, 2022 1 commit
• Create musdb handler and tests (#2484) · b92a8a09
      Sean Kim authored
      Summary:
Create the dataset handler and tests for the new MUSDB dataset. Manually tested and unit tested to verify validity. Pre-commit was run for style checks.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2484
      
      Reviewed By: carolineechen, nateanl
      
      Differential Revision: D37250556
      
      Pulled By: skim0514
      
      fbshipit-source-id: d2c8d73d22fd9d7282026265676f3eab1e178d51
  9. 20 Jun, 2022 1 commit
  10. 02 Jun, 2022 1 commit
  11. 23 May, 2022 1 commit
• Add LibriLightLimited dataset (#2302) · af9cab3b
      Zhaoheng Ni authored
      Summary:
The `LibriLightLimited` dataset is created for fine-tuning SSL models, such as Wav2Vec2 and HuBERT. It is a supervised subset of the [Libri-Light](https://github.com/facebookresearch/libri-light) dataset. To distinguish the unsupervised subset from the supervised one, it is clearer to put the latter in a separate dataset class for fine-tuning purposes.
It contains the "10 min", "1 hour", and "10 hour" splits.
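A minimal usage sketch, assuming the split is selected via a ``subset`` argument and that samples follow the LibriSpeech-style layout (both are assumptions to verify against the documentation):

```python
from torchaudio.datasets import LibriLightLimited

# Supervised 10-hour split for fine-tuning a pre-trained SSL model such as HuBERT.
dataset = LibriLightLimited(root="/data/librilight_limited", subset="10h")

waveform, sample_rate, transcript, speaker_id, chapter_id, utterance_id = dataset[0]
```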
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2302
      
      Reviewed By: mthrok
      
      Differential Revision: D36388188
      
      Pulled By: nateanl
      
      fbshipit-source-id: ba49f1c9996be17db5db41127d8ca96224c94249
  12. 20 May, 2022 1 commit
  13. 15 May, 2022 1 commit
• [codemod][usort] apply import merging for fbcode (8 of 11) · d62875cc
      John Reese authored
      Summary:
      Applies new import merging and sorting from µsort v1.0.
      
      When merging imports, µsort will make a best-effort to move associated
      comments to match merged elements, but there are known limitations due to
the dynamic nature of Python and developer tooling. These changes should
      not produce any dangerous runtime changes, but may require touch-ups to
      satisfy linters and other tooling.
      
      Note that µsort uses case-insensitive, lexicographical sorting, which
      results in a different ordering compared to isort. This provides a more
      consistent sorting order, matching the case-insensitive order used when
      sorting import statements by module name, and ensures that "frog", "FROG",
      and "Frog" always sort next to each other.
      
      For details on µsort's sorting and merging semantics, see the user guide:
      https://usort.readthedocs.io/en/stable/guide.html#sorting
      
      Reviewed By: lisroach
      
      Differential Revision: D36402214
      
      fbshipit-source-id: b641bfa9d46242188524d4ae2c44998922a62b4c
  14. 18 Apr, 2022 1 commit
  15. 30 Dec, 2021 1 commit
  16. 23 Dec, 2021 1 commit
  17. 08 Oct, 2021 1 commit
  18. 06 Oct, 2021 2 commits
  19. 05 Oct, 2021 1 commit
  20. 02 Aug, 2021 1 commit
  21. 02 Mar, 2021 1 commit
  22. 24 Feb, 2021 1 commit
  23. 08 Feb, 2021 1 commit
  24. 05 Jan, 2021 6 commits
  25. 30 Dec, 2020 4 commits
  26. 27 Dec, 2020 1 commit
  27. 21 Dec, 2020 1 commit
• Remove walk_files (#1111) · 8187dc0a
      Aziz authored
The use of `walk_files` made it ambiguous who is responsible for locating
the correct set of files (the Dataset class, or the utility?).
In fact, globbing everything is not the right approach when implementing a
Dataset: for a specific dataset, the directory structure and file locations
are already determined, so there is no need for an arbitrary number of recursive walks.
Each Dataset implementation should glob exactly the set of files it requires.
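As an illustration of that principle, a dataset with a known layout can glob exactly the files it needs in one pass (a sketch with a hypothetical directory layout, not code from this change):

```python
from pathlib import Path

from torch.utils.data import Dataset


class ExampleDataset(Dataset):
    """Hypothetical dataset whose layout is root/<speaker>/<utterance>.wav."""

    def __init__(self, root: str):
        # The directory structure is fixed, so one targeted glob replaces a
        # recursive walk over the whole tree.
        self._files = sorted(Path(root).glob("*/*.wav"))

    def __len__(self) -> int:
        return len(self._files)

    def __getitem__(self, index: int) -> Path:
        return self._files[index]
```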
  28. 18 Dec, 2020 1 commit
  29. 11 Dec, 2020 1 commit
  30. 03 Dec, 2020 1 commit
  31. 18 Nov, 2020 1 commit