1. 06 Mar, 2020 1 commit
  2. 05 Mar, 2020 3 commits
  3. 28 Feb, 2020 1 commit
    • Add test for InverseMelScale (#448) · babc24af
      moto authored
      
      
      * Inverse Mel Scale Implementation
      
      * Inverse Mel Scale Docs
      
      * Better working version.
      
      * GPU fix
      
      * These shouldn't go on git..
      
      * Even better one, but does not support JITability.
      
      * Remove JITability test
      
      * Flake8
      
      * n_stft is a must
      
      * minor clean up of initialization
      
      * Add librosa consistency test
      
      This PR follows up on #366 and adds tests for `InverseMelScale` (and `MelScale`) to check compatibility with librosa.
      
      For the `MelScale` compatibility test:
      1. Generate a spectrogram.
      2. Feed the spectrogram to a `torchaudio.transforms.MelScale` instance.
      3. Feed the spectrogram to the `librosa.feature.melspectrogram` function.
      4. Compare the results from steps 2 and 3 element-wise.
      Element-wise numerical comparison is possible because, under the hood, both implementations use the same algorithm.
      
      For the `InverseMelScale` compatibility test, the procedure is more elaborate:
      1. Generate the original spectrogram
      2. Convert the original spectrogram to Mel scale using `torchaudio.transforms.MelScale` instance
      3. Reconstruct spectrogram using torchaudio implementation
      3.1. Feed the Mel spectrogram to `torchaudio.transforms.InverseMelScale` instance and get reconstructed spectrogram.
      3.2. Compute the sum of element-wise P1 distance of the original spectrogram and that from 3.1.
      4. Reconstruct spectrogram using librosa
      4.1. Feed the Mel spectrogram to `librosa.feature.inverse.mel_to_stft` function and get reconstructed spectrogram.
      4.2. Compute the sum of element-wise P1 distance of the original spectrogram and that from 4.1. (this is the reference.)
      5. Check that the resulting P1 distances are in roughly the same value range.
      
      Element-wise numerical comparison is not possible here because different algorithms are used to compute the inverse. Some values in the reconstructed spectrograms can differ in magnitude.
      Therefore, the strategy is to check that the P1 distance (reconstruction loss) does not differ much from the value obtained using `librosa`. The threshold for this check was chosen empirically.
      
      ```
      print('p1 dist (orig <-> ta):', torch.dist(spec_orig, spec_ta, p=1))
      print('p1 dist (orig <-> lr):', torch.dist(spec_orig, spec_lr, p=1))
      # Output:
      # p1 dist (orig <-> ta): tensor(1482.1917)
      # p1 dist (orig <-> lr): tensor(1420.7103)
      ```
      
      This value can vary based on the length and the kind of the signal being processed, so it was handpicked.
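The comparison strategy described above can be sketched roughly as follows. This is a minimal illustration, not the test from the PR: the helper names `reconstruction_loss` and `roughly_equal`, the relative tolerance of `0.1`, and the tiny stand-in tensors are all assumptions made for the example.

```python
import torch


def reconstruction_loss(spec_orig, spec_rec):
    # Sum of element-wise absolute differences (the p=1 distance).
    return torch.dist(spec_orig, spec_rec, p=1).item()


def roughly_equal(loss, ref_loss, rel_tol=0.1):
    # rel_tol is a hypothetical threshold; the PR picked its own value by hand.
    return abs(loss - ref_loss) <= rel_tol * ref_loss


# Stand-in data: two "reconstructions" that differ element-wise
# but are equally far from the original overall.
spec_orig = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
spec_ta = torch.tensor([[1.5, 2.0], [2.0, 4.5]])  # placeholder for torchaudio output
spec_lr = torch.tensor([[1.0, 2.5], [3.5, 3.0]])  # placeholder for librosa output

loss_ta = reconstruction_loss(spec_orig, spec_ta)
loss_lr = reconstruction_loss(spec_orig, spec_lr)  # reference value

# Element-wise comparison fails, while the loss comparison passes.
elementwise_match = torch.allclose(spec_ta, spec_lr)
losses_match = roughly_equal(loss_ta, loss_lr)
```

With the distances reported in the commit message (1482.1917 and 1420.7103), the relative difference is about 4%, so a tolerance of this kind would accept them.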
      
      * Address review feedback
      
      * Support arbitrary batch dimensions.
      
      * Add batch test
      
      * Use view for batch
      
      * fix sgd
      
      * Use negative indices and update docstring
      
      * Update threshold
      Co-authored-by: Charles J.Y. Yoon <jaeyeun97@gmail.com>
  4. 25 Feb, 2020 1 commit
  5. 24 Feb, 2020 1 commit
  6. 22 Feb, 2020 1 commit
    • Adding Speech Command Dataset (#437) · 4d58bc46
      Tomás Osório authored
      
      
      * add speechcommand dataset and test
      
      * prepend the full path to each result
      
      * add missing param on docstring in walk_files
      
      * add file to run tests on SpeechCommand Dataset
      
      * reduce logic
      
      * update test on SpeechCommands
      
      * correct the indentation on docstring walk_files
      
      * flake8 compliance
      
      * change tuple type returned. move path split logic in load item.
      
      * typo in name.
      
      * redundant file path.
      
      * filter background noise.
      Co-authored-by: Vincent QB <vincentqb@users.noreply.github.com>
  7. 20 Feb, 2020 1 commit
  8. 14 Feb, 2020 1 commit
  9. 12 Feb, 2020 1 commit
  10. 29 Jan, 2020 2 commits
  11. 22 Jan, 2020 2 commits
  12. 17 Jan, 2020 1 commit
  13. 16 Jan, 2020 3 commits
  14. 13 Jan, 2020 3 commits
  15. 09 Jan, 2020 3 commits
  16. 08 Jan, 2020 1 commit
    • Add Windows CI (#394) · be5b2d56
      peterjc123 authored
      * [WIP] Add Windows CI
      
      * Remove cu_version
      
      * checkout_merge -> checkout
      
      * Add build script
      
      * Switch backend to soundfile
      
      * Remove soundfile as dependency
      
      * Rename jobs
      
      * Fix lint
  17. 02 Jan, 2020 4 commits
  18. 27 Dec, 2019 2 commits
  19. 26 Dec, 2019 3 commits
  20. 23 Dec, 2019 1 commit
  21. 20 Dec, 2019 1 commit
    • Improve lfilter functional (#374) · f3365ecf
      David Pollack authored
      
      
      * Simplify lfilter functional
      
      * use `torch.clamp` instead of `torch.min(..., torch.max(...))`
      * remove unneeded creation of ones tensor for previous method
      
      The current lfilter function uses min and max to essentially do a clamp
      function.  I changed the code to use clamp instead.  It is more readable
      than the previous version.
      
      FYI, if you want to keep the previous way, you could make a
      broadcastable tensor of size 1 instead of creating a tensor the size of
      the input.
      Signed-off-by: David Pollack <david@da3.net>
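The equivalence this change relies on can be checked with a small sketch. The bounds of -1 and 1 and the sample values are assumptions chosen for illustration; the commit message does not state the actual range or show the original code.

```python
import torch

x = torch.tensor([-2.0, -0.5, 0.0, 0.7, 3.0])

# Previous style: build a tensor of ones the size of the input,
# then combine min and max to bound the values.
ones = torch.ones_like(x)
clamped_old = torch.min(ones, torch.max(-ones, x))

# New style: a single clamp call, with no temporary tensor needed.
clamped_new = torch.clamp(x, min=-1.0, max=1.0)

same = torch.equal(clamped_old, clamped_new)
```

Both forms produce the same values; the `clamp` version simply avoids allocating the extra `ones` tensor and states the intent directly.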
      
      * Parallelize waveform windows calculation
      
      I've parallelized the calculation of the waveform windows and also
      removed the inefficient calculation within the for-loop.
      Signed-off-by: David Pollack <david@da3.net>
      
      * Refactoring and minor readability changes
      Signed-off-by: David Pollack <david@da3.net>
      
      * Remove one more creation of a temporary tensor
      Signed-off-by: David Pollack <david@da3.net>
  22. 19 Dec, 2019 1 commit
    • Backend switch (#355) · 774ebc78
      Vincent QB authored
      * move sox inside function calls.
      
      * add backend switch mechanism.
      
      * import sox at runtime, not at import time.
      
      * add backend list.
      
      * backend tests.
      
      * creating hidden modules for backend.
      
      * naming backend same as file: soundfile.
      
      * remove docstring in backend file.
      
      * test soundfile info.
      
      * soundfile doesn't support int64.
      
      * adding test for wav file.
      
      * error with incorrect parameter instead of silent ignore.
      
      * adding test across backend. using float32 as done in sox.
      
      * backend guard decorator.
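A backend switch with a guard decorator, as listed in the commit above, might look roughly like the following. This is a simplified sketch using only the standard library, not the code from the PR: every name here (`_BACKENDS`, `set_backend`, `get_backend`, `list_backends`, `requires_backend`, `sox_only_operation`) is hypothetical, and only the backend names `sox` and `soundfile` come from the commit message.

```python
import functools

# Hypothetical registry of known backends and the currently selected one.
_BACKENDS = ["sox", "soundfile"]
_current_backend = "sox"


def list_backends():
    return list(_BACKENDS)


def get_backend():
    return _current_backend


def set_backend(name):
    # Raise on an unknown name instead of silently ignoring it.
    global _current_backend
    if name not in _BACKENDS:
        raise RuntimeError(f"Unknown backend: {name!r}")
    _current_backend = name


def requires_backend(*allowed):
    # Guard decorator: the wrapped function only runs under an allowed backend.
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if _current_backend not in allowed:
                raise RuntimeError(
                    f"{func.__name__} requires one of {allowed}, "
                    f"but the current backend is {_current_backend!r}"
                )
            return func(*args, **kwargs)
        return wrapper
    return decorator


@requires_backend("sox")
def sox_only_operation(value):
    # A real implementation would import the sox bindings here,
    # at call time, so that importing the package never requires sox.
    return value
```

Checking the backend at call time rather than at import time means that a missing optional dependency only matters when a function that needs it is actually used.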
  23. 18 Dec, 2019 2 commits