1. 25 Jun, 2020 2 commits
    • Add load function (#731) · 793eeab8
      moto authored
      This is part of a series of PRs adding the new "sox_io" backend (#726), and depends on #718 and #728.
      
      This PR adds a `load` function to the "sox_io" backend, which is tested on the following audio formats:
       - `wav`
       - `mp3`
       - `flac`
       - `ogg/vorbis` *
      
      By default, the "sox_io" backend returns a Tensor with `float32` dtype and shape `[channel, time]`. The samples are normalized to fit in the range `[-1.0, 1.0]`.
      
      Unlike the existing "sox" backend, the new `load` function can handle WAV files natively. When the input is a WAV file with an integer sample type (32-bit signed integer, 16-bit signed integer, or 8-bit unsigned integer), passing `normalize=False` makes the function return an integer Tensor whose samples span the whole range of the corresponding dtype: `int32` for `32-bit PCM`, `int16` for `16-bit PCM`, and `uint8` for `8-bit PCM`. This behavior follows [scipy.io.wavfile.read](https://docs.scipy.org/doc/scipy/reference/generated/scipy.io.wavfile.read.html). The `normalize` parameter has no effect for other formats, for which the `load` function always returns normalized values as a `float32` Tensor.
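
The difference between the two modes can be illustrated without torchaudio. The following is a minimal sketch using only the Python standard library, not the backend's implementation: `write_pcm16` and `load_pcm16` are hypothetical helpers, only mono 16-bit PCM is handled, and dividing by 32768 is an assumed scaling that maps `int16` samples into `[-1.0, 1.0]`.

```python
import array
import os
import sys
import tempfile
import wave


def write_pcm16(path, samples, sample_rate=8000):
    """Hypothetical helper: write mono 16-bit PCM samples to a WAV file."""
    data = array.array("h", samples)
    if sys.byteorder == "big":
        data.byteswap()  # WAV stores samples in little-endian order
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        f.setframerate(sample_rate)
        f.writeframes(data.tobytes())


def load_pcm16(path, normalize=True):
    """Hypothetical helper: read mono 16-bit PCM as floats or raw integers."""
    with wave.open(path, "rb") as f:
        assert f.getsampwidth() == 2, "this sketch handles 16-bit PCM only"
        raw = array.array("h")
        raw.frombytes(f.readframes(f.getnframes()))
    if sys.byteorder == "big":
        raw.byteswap()
    if not normalize:
        # Integer mode: samples keep the full int16 range.
        return list(raw)
    # Normalized mode (assumed scaling): map int16 into [-1.0, 1.0].
    return [s / 32768.0 for s in raw]


path = os.path.join(tempfile.mkdtemp(), "example.wav")
write_pcm16(path, [-32768, -16384, 0, 16384, 32767])
as_int = load_pcm16(path, normalize=False)
as_float = load_pcm16(path, normalize=True)
```

With `normalize=False` the caller gets back the stored integers unchanged, which is useful when bit-exact comparison with another reader matters. The normalized form is more convenient for signal processing.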
      
      __* Note__ The current binary distribution of torchaudio does not contain the `ogg/vorbis` and `opus` codecs. To handle these files, one needs to build torchaudio from source with the proper codecs installed on the system.
      
      __Note 2__ As of this PR, `scipy` is a required module for running the tests.
    • 0f0d0af3
  2. 24 Jun, 2020 2 commits
  3. 23 Jun, 2020 5 commits
  4. 22 Jun, 2020 1 commit
  5. 19 Jun, 2020 1 commit
    • Add TorchScript-able "info" func to sox_io backend (#728) · 88fccd14
      moto authored
      This is part of a series of PRs adding the new "sox_io" backend (#726), and depends on #718.
      
      This PR adds an `info` function to the "sox_io" backend, which allows users to fetch metadata of an audio file.
      At the moment, the following information is retrieved:
      
       - Number of samples in the audio file
       - Sampling rate
       - Number of channels
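
For WAV files, the same three pieces of metadata can be read with the Python standard library alone. This is a rough sketch, not the backend's API: `wav_info` and the `SignalInfo` field names are hypothetical, and "number of samples" is interpreted here as the number of frames per channel.

```python
import os
import tempfile
import wave
from collections import namedtuple

# Hypothetical container; the field names are chosen for illustration only.
SignalInfo = namedtuple("SignalInfo", ["num_frames", "sample_rate", "num_channels"])


def wav_info(path):
    """Read metadata from the WAV header without decoding the samples."""
    with wave.open(path, "rb") as f:
        return SignalInfo(f.getnframes(), f.getframerate(), f.getnchannels())


# Create a small stereo file: 100 frames of 16-bit silence at 16 kHz.
path = os.path.join(tempfile.mkdtemp(), "stereo.wav")
with wave.open(path, "wb") as f:
    f.setnchannels(2)
    f.setsampwidth(2)
    f.setframerate(16000)
    f.writeframes(b"\x00" * (100 * 2 * 2))  # frames * channels * bytes per sample

info = wav_info(path)
```

Reading only the header keeps the call cheap, which matters when scanning a large dataset to compute durations (`num_frames / sample_rate`) before loading any audio.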
  6. 18 Jun, 2020 3 commits
  7. 17 Jun, 2020 1 commit
  8. 16 Jun, 2020 3 commits
  9. 15 Jun, 2020 1 commit
  10. 11 Jun, 2020 4 commits
  11. 10 Jun, 2020 2 commits
    • Add cmu_arctic dataset (#710) · 55b5c80c
      jimchen90 authored
      
      
      * Add cmu_arctic dataset
      
      * add dataset name
      
      * update audio test file with whitenoise.wav file
      
      * add test text file
      
      * update text method and file name
      
      * update comment
      
      * change datasets order in doc
      
      * add line length
      Co-authored-by: Ji Chen <jimchen90@devfair0160.h2.fair>
    • Add sox_effects submodule and delegate sox_effects init/shutdown (#708) · c82a7f9c
      moto authored
      A couple of aspects of this PR improve the overall maintainability of the code base through "decoupling" and "separation of concerns".
      
      First, `sox_effects` functionality may or may not be available. From the viewpoint of the main `torchaudio` module, the looser the connection between `torchaudio` and `torchaudio.sox_effects`, the more manageable the code base becomes, because the two modules can be changed independently. This was mostly accomplished when the definitions of `initialize_sox` and `shutdown_sox` were moved from `torchaudio.__init__` to `torchaudio.sox_effects`, but the initialization of sox effects still happens in `torchaudio.__init__`. Moving the initialization to the `sox_effects` module also moves the responsibility for sox initialization there, along with the required module availability check, so the main `torchaudio` module no longer needs to know how the `sox_effects` module works.
      
      In addition, I found the names `initialize_sox` and `shutdown_sox` confusing because they sound as if they are required for `libsox`-based I/O. To make this clear, I renamed them to include `sox_effect` in the function names.
      
      Also, moving functions from their original locations is itself BC-breaking; therefore, these functions are re-imported in `torchaudio.__init__` under their original names. As a result, the PR is not BC-breaking.
  12. 09 Jun, 2020 2 commits
  13. 08 Jun, 2020 4 commits
  14. 05 Jun, 2020 5 commits
  15. 04 Jun, 2020 4 commits