  1. 13 May, 2022 1 commit
    • Move Streamer API out of prototype (#2378) · 72b712a1
      moto authored
      Summary:
      This commit moves the Streaming API out of the prototype module.
      
      * The related classes are renamed as follows:
      
        - `Streamer` -> `StreamReader`.
        - `SourceStream` -> `StreamReaderSourceStream`
        - `SourceAudioStream` -> `StreamReaderSourceAudioStream`
        - `SourceVideoStream` -> `StreamReaderSourceVideoStream`
        - `OutputStream` -> `StreamReaderOutputStream`
      
      This change is a preemptive measure in case a
      `StreamWriter` API is added later.
      
      * Replace the `BUILD_FFMPEG` build arg with `USE_FFMPEG`
      
      We are not building FFmpeg, so `USE_FFMPEG` is the more appropriate name.
      
       ---
      
      After https://github.com/pytorch/audio/issues/2377
      
      Remaining TODOs: (different PRs)
      - [ ] Introduce `is_ffmpeg_binding_available` function.
      - [ ] Refactor C++ code:
         - Rename `Streamer` to `StreamReader`.
         - Rename `streamer.[h|cpp]` to `stream_reader.[h|cpp]`.
         - Rename `prototype.cpp` to `stream_reader_binding.cpp`.
         - Introduce `stream_reader` directory.
      - [x] Enable FFmpeg in smoke test (https://github.com/pytorch/audio/issues/2381)
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2378
      
      Reviewed By: carolineechen
      
      Differential Revision: D36359299
      
      Pulled By: mthrok
      
      fbshipit-source-id: 6a57b702996af871e577fb7addbf3522081c1328
  2. 12 May, 2022 2 commits
    • Fix CollateFn in HuBERT pre-training recipe (#2296) · 09639680
      Zhaoheng Ni authored
      Summary:
      - When cropping the waveform and its corresponding label, we use the formula `torch.div(audio_start - kernel_size * sample_rate, stride * sample_rate, rounding_mode="floor")` to align the audio start and label start indices. However, the value is sometimes negative, which results in an empty label. After zero-padding, such a training example hurts performance (i.e., the labels are all zero for the input waveform).
      This PR fixes the bug by checking whether `label_start` is negative and setting it to zero if so.
      - If `pad` is True, `length` should be the length of each waveform rather than the max length. This PR fixes it so that the model ignores the padded portion in pre-training.
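The first fix above can be sketched in plain Python. This is a minimal illustration rather than the recipe's code: `align_label_start` is a hypothetical helper, integer floor division stands in for `torch.div(..., rounding_mode="floor")`, and the default values (a 25 ms kernel, a 20 ms stride, and a sample rate of 16 expressed in kHz) are assumptions chosen for the example.

```python
def align_label_start(audio_start, kernel_size=25, stride=20, sample_rate=16):
    """Map a waveform crop offset (in samples) to a label frame index.

    Hypothetical helper; the default values are illustrative assumptions.
    """
    # Floor division mirrors torch.div(..., rounding_mode="floor").
    label_start = (audio_start - kernel_size * sample_rate) // (stride * sample_rate)
    # A crop that begins before the first full kernel window yields a
    # negative index, so clamp it to zero.
    return max(label_start, 0)


print(align_label_start(100))   # crop starts inside the first kernel window
print(align_label_start(4000))  # crop starts well into the utterance
```

Without the clamp, the first call would return -1, which is the kind of negative start index the commit describes.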
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2296
      
      Reviewed By: mthrok
      
      Differential Revision: D36323217
      
      Pulled By: nateanl
      
      fbshipit-source-id: 1ffa71e39bbc0e8dee55c3b829911bc2e785b423
    • [black][codemod] formatting changes from black 22.3.0 · 595dc5d3
      John Reese authored
      Summary:
      Applies the black-fbsource codemod with the new build of pyfmt.
      
      paintitblack
      
      Reviewed By: lisroach
      
      Differential Revision: D36324783
      
      fbshipit-source-id: 280c09e88257e5e569ab729691165d8dedd767bc
  3. 11 May, 2022 1 commit
    • Refactor LibriSpeech Conformer RNN-T recipe (#2366) · 69467ea5
      hwangjeff authored
      Summary:
      Modifies the example LibriSpeech Conformer RNN-T recipe as follows:
      - Moves data loading and transforms logic from lightning module to data module (improves generalizability and reusability of lightning module and data module).
      - Moves transforms logic from dataloader collator function to dataset (resolves dataloader multiprocessing issues on certain platforms).
      - Replaces lambda functions with `partial` equivalents (resolves pickling issues in certain runtime environments).
      - Modifies the training script to allow specifying the path to a model checkpoint from which to restart training.
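The pickling issue behind the lambda-to-`partial` change can be reproduced with the standard library alone. The sketch below does not use the recipe's transforms: `is_picklable`, `halve_lambda`, and `halve_partial` are hypothetical names, and `operator.mul` stands in for whatever function the recipe actually binds.

```python
import operator
import pickle
from functools import partial


def is_picklable(obj):
    """Return True if `obj` survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# A lambda has no importable name, so pickle cannot serialize it.
halve_lambda = lambda x: operator.mul(0.5, x)

# partial wraps an importable function plus its bound arguments,
# and both of those can be pickled.
halve_partial = partial(operator.mul, 0.5)

print(is_picklable(halve_lambda))
print(is_picklable(halve_partial))
print(pickle.loads(pickle.dumps(halve_partial))(8.0))
```

Dataloader workers started with the spawn method receive their callables through pickling, which is why the `partial` form is the safer choice there.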
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2366
      
      Reviewed By: mthrok
      
      Differential Revision: D36305028
      
      Pulled By: hwangjeff
      
      fbshipit-source-id: 0b768da5d5909136c55418bf0a3c2ddd0c5683ba
  4. 28 Apr, 2022 1 commit
  5. 26 Apr, 2022 1 commit
  6. 22 Apr, 2022 1 commit
    • Introduce DistributedBatchSampler (#2299) · 6411c9ad
      Zhaoheng Ni authored
      Summary:
      When using a customized `batch_sampler`, pytorch_lightning can't wrap the distributed sampler around it. Hence we provide a `DistributedBatchSampler` that supports `BucketizeBatchSampler` in `ddp` mode.
      
      The `DistributedBatchSampler` assumes `BucketizeBatchSampler.iter_list` is a list of lists, where each sub-list contains a batch of indices. Setting `shuffle` to `True` will shuffle the lists based on `seed` and the current `epoch`.
      
      Shuffling happens only at initialization and is not repeated unless the user resets it. The reason is that a shuffled `BucketizeBatchSampler` may have a different length than before, so shuffling in ``__iter__`` may cause a mismatch between ``__len__`` and the real length.
      Hence users need to set `reload_dataloaders_every_n_epochs=1` in pytorch_lightning's Trainer so that the value of ``__len__`` matches the real length.
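The idea of sharding a list of batches across ranks can be sketched in plain Python. This is not torchaudio's `DistributedBatchSampler`: `shard_batches` is a hypothetical helper, and dropping the remainder so that every rank receives the same number of batches is an assumption made for this example.

```python
import random


def shard_batches(batches, rank, world_size, seed=0, epoch=0, shuffle=True):
    """Return the batches assigned to one rank.

    Hypothetical helper, not torchaudio's DistributedBatchSampler.
    `batches` is a list of lists of dataset indices.
    """
    order = list(batches)
    if shuffle:
        # Seeding with seed + epoch gives every rank the same permutation.
        random.Random(seed + epoch).shuffle(order)
    # Drop the remainder so that every rank gets the same number of batches.
    usable = len(order) - len(order) % world_size
    return order[rank:usable:world_size]


batches = [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]]
for rank in range(2):
    print(rank, shard_batches(batches, rank, world_size=2, seed=1, epoch=3))
```

Because the permutation depends only on `seed` and `epoch`, all ranks agree on the order without communicating, and the shards never overlap.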
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2299
      
      Reviewed By: hwangjeff
      
      Differential Revision: D35781538
      
      Pulled By: nateanl
      
      fbshipit-source-id: 6e8396615497f1aeddab1ee5678830c0445c2b2a
  7. 21 Apr, 2022 1 commit
    • Change underlying implementation of RNN-T hypothesis to tuple (#2339) · 6b242c29
      hwangjeff authored
      Summary:
      PyTorch Lite, which is becoming a standard for mobile PyTorch usage, does not support containers containing custom classes. Consequently, because TorchAudio's RNN-T decoder currently returns and accepts lists of `Hypothesis` namedtuples, it is not compatible with PyTorch Lite. This PR resolves said incompatibility by changing the underlying implementation of `Hypothesis` to tuple.
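The shape of this change can be illustrated with a simplified sketch. The two-field layout below (tokens and score) and the accessor names `get_tokens` and `get_score` are assumptions for illustration; the commit message does not list the fields of the actual `Hypothesis` type.

```python
from typing import List, NamedTuple, Tuple


# Before: a custom class. The field names are illustrative assumptions.
class HypothesisNT(NamedTuple):
    tokens: List[int]
    score: float


# After: a plain tuple alias, read through accessor helpers
# instead of attribute access.
Hypothesis = Tuple[List[int], float]


def get_tokens(hypo: Hypothesis) -> List[int]:
    return hypo[0]


def get_score(hypo: Hypothesis) -> float:
    return hypo[1]


hypos: List[Hypothesis] = [([1, 5, 9], -0.7), ([1, 5], -1.2)]
best = max(hypos, key=get_score)
print(get_tokens(best))
```

The trade-off is that callers lose named attribute access, so small accessor functions like these keep call sites readable.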
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2339
      
      Reviewed By: nateanl
      
      Differential Revision: D35806529
      
      Pulled By: hwangjeff
      
      fbshipit-source-id: 9cbae5504722390511d35e7f9966af2519ccede5
  8. 13 Apr, 2022 2 commits
    • Add Conformer RNN-T LibriSpeech training recipe (#2329) · c262758b
      hwangjeff authored
      Summary:
      Adds Conformer RNN-T LibriSpeech training recipe to examples directory.
      
      Produces 30M-parameter model that achieves the following WER:
      
      |                     |          WER |
      |:-------------------:|-------------:|
      | test-clean          |       0.0310 |
      | test-other          |       0.0805 |
      | dev-clean           |       0.0314 |
      | dev-other           |       0.0827 |
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2329
      
      Reviewed By: xiaohui-zhang
      
      Differential Revision: D35578727
      
      Pulled By: hwangjeff
      
      fbshipit-source-id: afa9146c5b647727b8605d104d928110a1d3976d
    • Add nightly build installation code snippet to prototype feature tutorials (#2325) · fb51cecc
      hwangjeff authored
      Summary:
      Tutorial notebooks that leverage TorchAudio prototype features don't run as-is on Google Colab because its runtime does not have nightly builds pre-installed. To make it easier for users to run these notebooks in Colab, this PR adds a commented-out code block that installs nightly PyTorch and TorchAudio builds, which users can copy and run locally.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2325
      
      Reviewed By: xiaohui-zhang
      
      Differential Revision: D35597753
      
      Pulled By: hwangjeff
      
      fbshipit-source-id: 59914e492ad72e31c0136a48cd88d697e8ea5f6c
  9. 05 Apr, 2022 1 commit
  10. 04 Apr, 2022 2 commits
  11. 01 Apr, 2022 1 commit
  12. 25 Mar, 2022 1 commit
  13. 24 Mar, 2022 2 commits
  14. 22 Mar, 2022 1 commit
    • Fix calculation of SNR value in tutorial (#2285) · 8395fe65
      Hagen Wierstorf authored
      Summary:
      The calculation of the SNR in the data augmentation examples seems to be wrong to me:
      
      ![image](https://user-images.githubusercontent.com/173624/159487032-c60470c6-ef8e-48a0-ad5e-a117fcb8d606.png)
      
      If we start from the definition of the signal-to-noise ratio using the root mean square value we get:
      
      ```
      SNR = 20 log10 ( rms(scale * speech) / rms(noise) )
      ```
      This can be rearranged to
      ```
      scale = 10^(SNR/20) rms(noise) / rms(speech)
      ```
      The example uses `lambda x: x.norm(p=2)` rather than `rms`, but since the speech and noise signals have the same length, we have
      ```
      rms(noise) / rms(speech) = noise.norm(p=2) / speech.norm(p=2)
      ```
      For the existing code to be correct, this would require
      ```
      10^(SNR/20) = e^(SNR / 10)
      ```
      which is not true.
      
      Hence I changed `e^(SNR / 10)` to `10^(SNR/20)`.
      
      For the proposed SNR values of 20 dB, 10 dB, 3 dB the value of the scale would change from 7.39, 2.72, 1.35 to 10.0, 3.16, 1.41.
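The corrected formula can be checked numerically with the standard library. In this sketch, `rms`, `snr_scale`, and `measure_snr_db` are hypothetical helper names, and the toy signals are chosen to have equal RMS so that the gain reduces to `10^(SNR/20)`.

```python
import math


def rms(signal):
    """Root mean square of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def snr_scale(speech, noise, target_db):
    """Gain to apply to `speech` so that speech over noise reaches `target_db`."""
    return 10 ** (target_db / 20) * rms(noise) / rms(speech)


def measure_snr_db(speech, noise):
    """SNR in dB, using the RMS-based definition."""
    return 20 * math.log10(rms(speech) / rms(noise))


# Toy signals with equal RMS, so the gain reduces to 10^(SNR/20).
speech = [1.0, -1.0, 1.0, -1.0]
noise = [1.0, 1.0, -1.0, -1.0]

for target in (20, 10, 3):
    scale = snr_scale(speech, noise, target)
    scaled = [scale * x for x in speech]
    print(target, round(scale, 2), round(measure_snr_db(scaled, noise), 2))
```

The printed gains round to 10.0, 3.16, and 1.41, matching the corrected values quoted above, and the measured SNR of the scaled speech recovers each target.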
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2285
      
      Reviewed By: nateanl
      
      Differential Revision: D35047737
      
      Pulled By: mthrok
      
      fbshipit-source-id: ac24c8fd48ef06b4b611e35163084644330a3ef3
  15. 17 Mar, 2022 1 commit
  16. 10 Mar, 2022 1 commit
  17. 08 Mar, 2022 1 commit
  18. 26 Feb, 2022 1 commit
    • Improve device streaming (#2202) · 365313ed
      moto authored
      Summary:
      This commit adds a tutorial for device ASR and updates the API for device streaming.
      
      The interface changes are:
      1. Add `timeout` and `backoff` parameters to the `process_packet` and `stream` methods.
      2. Make the `fill_buffer` method private.
      
      When dealing with a device stream, there are situations where the device buffer is not
      ready and the system returns `EAGAIN`. In such cases, the previous implementation of the
      `process_packet` method raised an exception in the Python layer, but for device ASR
      this is inefficient. A better approach is to retry within the C++ layer in a blocking manner.
      The new `timeout` parameter serves this purpose.
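The retry-with-backoff behavior described above can be sketched in Python, although the commit implements it in the C++ layer. Everything here is hypothetical: `read_with_retry` and `FakeDevice` are illustrative names, `BlockingIOError` stands in for `EAGAIN`, and expressing `timeout` and `backoff` in seconds is an assumption of this sketch.

```python
import time


def read_with_retry(read_fn, timeout=None, backoff=0.01):
    """Call `read_fn` until it succeeds or `timeout` seconds elapse.

    Hypothetical helper. `read_fn` is assumed to raise BlockingIOError
    (the Python counterpart of EAGAIN) while the device is not ready.
    timeout=None retries indefinitely.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        try:
            return read_fn()
        except BlockingIOError:
            if deadline is not None and time.monotonic() >= deadline:
                raise TimeoutError("device was not ready before the timeout")
            # Wait briefly before polling the device again.
            time.sleep(backoff)


class FakeDevice:
    """Fails with EAGAIN a fixed number of times, then returns a packet."""

    def __init__(self, failures):
        self.failures = failures

    def read(self):
        if self.failures > 0:
            self.failures -= 1
            raise BlockingIOError("resource temporarily unavailable")
        return b"packet"


print(read_with_retry(FakeDevice(failures=3).read, timeout=1.0, backoff=0.001))
```

Keeping the loop inside one blocking call means the caller sees either a packet or a single timeout error, rather than handling an exception for every unready poll.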
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2202
      
      Reviewed By: nateanl
      
      Differential Revision: D34475829
      
      Pulled By: mthrok
      
      fbshipit-source-id: bb6d0b125d800f87d189db40815af06fbd4cab59
  19. 24 Feb, 2022 1 commit
  20. 23 Feb, 2022 1 commit
  21. 17 Feb, 2022 2 commits
  22. 16 Feb, 2022 6 commits
  23. 15 Feb, 2022 1 commit
  24. 11 Feb, 2022 5 commits
  25. 10 Feb, 2022 1 commit
  26. 09 Feb, 2022 1 commit
    • Fix librosa calls (#2208) · e5d567c9
      hwangjeff authored
      Summary:
      Yesterday's release of librosa 0.9.0 made args keyword-only and changed default padding from "reflect" to "zero" for some functions. This PR adjusts callsites in our tutorials and tests accordingly.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2208
      
      Reviewed By: mthrok
      
      Differential Revision: D34099793
      
      Pulled By: hwangjeff
      
      fbshipit-source-id: 4e2642cdda8aae6d0a928befaf1bbb3873d229bc