  1. 31 Dec, 2021 2 commits
  2. 30 Dec, 2021 8 commits
  3. 29 Dec, 2021 9 commits
  4. 28 Dec, 2021 6 commits
  5. 24 Dec, 2021 2 commits
  6. 23 Dec, 2021 6 commits
  7. 22 Dec, 2021 2 commits
  8. 21 Dec, 2021 3 commits
    • Clean up CTC decoder binding code (#2092) · 4c2edd21
      Moto Hira authored
      Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/2092
      
      Reviewed By: carolineechen
      
      Differential Revision: D33169110
      
      fbshipit-source-id: e422ad93efe50b91f1ac5d572dc82768c1000c05
    • Update audio augmentation tutorial (#2082) · 3a03d8c0
      moto authored
      Summary:
      1. Reorder the audio display so that audio clips are playable from the browser in the docs
      2. Add links to function documentation
      
      https://470342-90321822-gh.circle-artifacts.com/0/docs/tutorials/audio_data_augmentation_tutorial.html
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2082
      
      Reviewed By: carolineechen
      
      Differential Revision: D33227725
      
      Pulled By: mthrok
      
      fbshipit-source-id: c7ee360b6f9b84c8e0a9b72193b98487d03b57ab
    • Fix load behavior for 24-bit input (#2084) · 4554d242
      moto authored
      Summary:
      ## Bug description
      
      When 24 bits-per-sample audio is loaded via a file-like object,
      the loaded Tensor is wrong. It is fine if the audio is loaded
      from a local file.
      
      ## The cause of the bug
      
      The core of sox's decoding mechanism is the `sox_read` function,
      one of whose parameters is the maximum number of samples to decode
      from the given buffer.
      
      https://fossies.org/dox/sox-14.4.2/formats_8c.html#a2a4f0194a0f919d4f38c57b81aa2c06f
      
      The `sox_read` function is called in the callback of the so-called
      `drain` effect, and this callback receives the output buffer and its
      size in bytes. The previous implementation passed this size value as
      the argument of `sox_read` for the maximum number of samples to
      read. Since the buffer size in bytes is larger than the number of
      samples that fit in the buffer, `sox_read` always consumed the entire
      buffer. (This behavior is not wrong except when the input is
      24 bits-per-sample and a file-like object.)
      
      When the input is read from a file-like object, inside the drain
      callback, new data is fetched via Python's `read` method and
      loaded into a fixed-size memory region. The size of this memory region
      can be adjusted via `torchaudio.utils.sox_utils.set_buffer_size`,
      but the default value is 8192.
      
      If the input format is 24 bits-per-sample, the end of the memory region
      does not necessarily correspond to the end of a valid sample.
      When `sox_read` consumes all the data in the buffer region, the
      incomplete data at the end introduces unexpected values.
      This causes the aforementioned bug.
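
The alignment problem can be illustrated with a little arithmetic. The sketch below is not taken from the torchaudio or sox sources; the helper name `trailing_bytes` and the region size of 4096 bytes are hypothetical values chosen only for illustration. It shows that a fixed-size byte region divides evenly into 8-, 16-, and 32-bit samples but leaves a partial sample at the end for 24-bit input.

```python
def trailing_bytes(region_size: int, bits_per_sample: int) -> int:
    """Return how many bytes at the end of a region do not form a whole sample."""
    bytes_per_sample = bits_per_sample // 8
    return region_size % bytes_per_sample


# Hypothetical region size, chosen only for illustration.
REGION_SIZE = 4096

for bits in (8, 16, 24, 32):
    # A non-zero result means the region ends in the middle of a sample.
    print(f"{bits} bits-per-sample: {trailing_bytes(REGION_SIZE, bits)} trailing byte(s)")
```

Only the 24-bit case reports a non-zero remainder, which matches the observation that the bug appears only for 24 bits-per-sample input.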
      
      ## Fix
      
      Pass a proper (better estimated) maximum number of decodable samples to
      `sox_read`.
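
As a rough illustration of the idea behind the fix, the following sketch decodes 24-bit little-endian PCM that arrives in arbitrary chunks. It is a simplified stand-in, not the actual C++ change in this PR: the names `max_decodable_samples` and `decode_24bit_chunks` are hypothetical, and carrying leftover bytes over to the next chunk is an assumption made so the example is self-contained. The key point is that the number of samples to decode is capped at what the buffered bytes fully cover.

```python
from typing import Iterable, List

BYTES_PER_SAMPLE = 3  # 24 bits-per-sample


def max_decodable_samples(num_bytes: int, bytes_per_sample: int = BYTES_PER_SAMPLE) -> int:
    """Number of complete samples contained in `num_bytes` bytes."""
    return num_bytes // bytes_per_sample


def decode_24bit_chunks(chunks: Iterable[bytes]) -> List[int]:
    """Decode signed 24-bit little-endian samples from a stream of byte chunks."""
    samples: List[int] = []
    pending = b""  # bytes of an incomplete sample carried over from the previous chunk
    for chunk in chunks:
        data = pending + chunk
        # Decode only as many samples as the available bytes fully cover.
        end = max_decodable_samples(len(data)) * BYTES_PER_SAMPLE
        for i in range(0, end, BYTES_PER_SAMPLE):
            samples.append(int.from_bytes(data[i:i + BYTES_PER_SAMPLE], "little", signed=True))
        pending = data[end:]
    return samples


# Usage: split the encoded stream into 4-byte chunks, which do not align with 3-byte samples.
original = [0, 1, -1, 8388607, -8388608, 12345]
raw = b"".join(s.to_bytes(BYTES_PER_SAMPLE, "little", signed=True) for s in original)
chunks = [raw[i:i + 4] for i in range(0, len(raw), 4)]
decoded = decode_24bit_chunks(chunks)
```

Because each pass decodes only whole samples, a chunk boundary that falls inside a sample no longer corrupts the output.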
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2084
      
      Reviewed By: carolineechen
      
      Differential Revision: D33236947
      
      Pulled By: mthrok
      
      fbshipit-source-id: 171d9b7945f81db54f98362a68b20f2f95bb11a4
  9. 20 Dec, 2021 2 commits
    • Standardize the location of third-party source code (#2086) · 2476dd2d
      moto authored
      Summary:
      Previously, sox-related third-party source code was archived at
      `third_party/sox/archives`.
      Recently, KenLM-related third-party source code was added, and
      it is archived at `third_party/archives`.
      
      This PR changes the sox archive location to `third_party/archives`,
      so that all the archives are cached at the same location.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2086
      
      Reviewed By: carolineechen
      
      Differential Revision: D33236927
      
      Pulled By: mthrok
      
      fbshipit-source-id: 2f2aa5f4b386fefb46d7c98f7179c04995219f3c
    • Update URLs for libritts (#2074) · f3f23e42
      Joao Gomes authored
      Summary:
      The URLs for this dataset seem to have changed, so I am updating them to the new location.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2074
      
      Reviewed By: mthrok
      
      Differential Revision: D33234996
      
      Pulled By: jdsgomes
      
      fbshipit-source-id: e09c35a122e8227fcce7fa97aeeeea312cb89173