"vscode:/vscode.git/clone" did not exist on "d3be97104b09bfcabc7de507bfe8a79455ebce30"
  1. 12 May, 2022 1 commit
  2. 11 May, 2022 1 commit
    • hwangjeff's avatar
      Refactor LibriSpeech Conformer RNN-T recipe (#2366) · 69467ea5
      hwangjeff authored
      Summary:
      Modifies the example LibriSpeech Conformer RNN-T recipe as follows:
      - Moves data loading and transforms logic from lightning module to data module (improves generalizability and reusability of lightning module and data module).
      - Moves transforms logic from dataloader collator function to dataset (resolves dataloader multiprocessing issues on certain platforms).
      - Replaces lambda functions with `partial` equivalents (resolves pickling issues in certain runtime environments).
      - Modifies training script to allow for specifying path model checkpoint to restart training from.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2366
      
      Reviewed By: mthrok
      
      Differential Revision: D36305028
      
      Pulled By: hwangjeff
      
      fbshipit-source-id: 0b768da5d5909136c55418bf0a3c2ddd0c5683ba
      69467ea5
  3. 28 Apr, 2022 1 commit
  4. 26 Apr, 2022 1 commit
  5. 22 Apr, 2022 1 commit
    • Zhaoheng Ni's avatar
      Introduce DistributedBatchSampler (#2299) · 6411c9ad
      Zhaoheng Ni authored
      Summary:
      When using customized `batch_sampler`, pytorch_lightning can't wrap the distributed sampler onto it. Hence we provide a `DistributedBatchSampler` that supports `BucketizeBatchSampler` in `ddp` mode.
      
      The `DistributedBatchSampler` assumes `BucketizeBatchSampler.iter_list` is a list of lists, where each sub-list contains a batch of indices. Setting `shuffle` to `True` will shuffle the lists based on `seed` and current `epoch`.
      
      The `shuffle` only happens in the initialization, and won't be changed if user don't reset it. The reason is shuffling `BucketizeBatchSampler` may have a different length than before, do shuffling in ``__iter__`` may result in mismatch between ``__len__`` and the real length value.
      Hence users need to set `reload_dataloaders_every_n_epochs=1` in pytorch_lightning's Trainer. Then the value of ``__len__``  and the real length is the same.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2299
      
      Reviewed By: hwangjeff
      
      Differential Revision: D35781538
      
      Pulled By: nateanl
      
      fbshipit-source-id: 6e8396615497f1aeddab1ee5678830c0445c2b2a
      6411c9ad
  6. 21 Apr, 2022 1 commit
    • hwangjeff's avatar
      Change underlying implementation of RNN-T hypothesis to tuple (#2339) · 6b242c29
      hwangjeff authored
      Summary:
      PyTorch Lite, which is becoming a standard for mobile PyTorch usage, does not support containers containing custom classes. Consequently, because TorchAudio's RNN-T decoder currently returns and accepts lists of `Hypothesis` namedtuples, it is not compatible with PyTorch Lite. This PR resolves said incompatibility by changing the underlying implementation of `Hypothesis` to tuple.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2339
      
      Reviewed By: nateanl
      
      Differential Revision: D35806529
      
      Pulled By: hwangjeff
      
      fbshipit-source-id: 9cbae5504722390511d35e7f9966af2519ccede5
      6b242c29
  7. 13 Apr, 2022 2 commits
    • hwangjeff's avatar
      Add Conformer RNN-T LibriSpeech training recipe (#2329) · c262758b
      hwangjeff authored
      Summary:
      Adds Conformer RNN-T LibriSpeech training recipe to examples directory.
      
      Produces 30M-parameter model that achieves the following WER:
      
      |                     |          WER |
      |:-------------------:|-------------:|
      | test-clean          |       0.0310 |
      | test-other          |       0.0805 |
      | dev-clean           |       0.0314 |
      | dev-other           |       0.0827 |
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2329
      
      Reviewed By: xiaohui-zhang
      
      Differential Revision: D35578727
      
      Pulled By: hwangjeff
      
      fbshipit-source-id: afa9146c5b647727b8605d104d928110a1d3976d
      c262758b
    • hwangjeff's avatar
      Add nightly build installation code snippet to prototype feature tutorials (#2325) · fb51cecc
      hwangjeff authored
      Summary:
      Tutorial notebooks that leverage TorchAudio prototype features don't run as-is on Google Colab due to its runtime's not having nightly builds pre-installed. To make it easier for users to run said notebooks in Colab, this PR adds a code block that installs nightly Pytorch and TorchAudio builds as a comment that users can copy and run locally.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2325
      
      Reviewed By: xiaohui-zhang
      
      Differential Revision: D35597753
      
      Pulled By: hwangjeff
      
      fbshipit-source-id: 59914e492ad72e31c0136a48cd88d697e8ea5f6c
      fb51cecc
  8. 05 Apr, 2022 1 commit
  9. 04 Apr, 2022 2 commits
  10. 01 Apr, 2022 1 commit
  11. 25 Mar, 2022 1 commit
  12. 24 Mar, 2022 2 commits
  13. 22 Mar, 2022 1 commit
    • Hagen Wierstorf's avatar
      Fix calculation of SNR value in tutorial (#2285) · 8395fe65
      Hagen Wierstorf authored
      Summary:
      The calculation of the SNR in tha data augmentation examples seems to be wrong to me:
      
      ![image](https://user-images.githubusercontent.com/173624/159487032-c60470c6-ef8e-48a0-ad5e-a117fcb8d606.png)
      
      If we start from the definition of the signal-to-noise ratio using the root mean square value we get:
      
      ```
      SNR = 20 log10 ( rms(scale * speech) / rms(noise) )
      ```
      this can be transformed to
      ```
      scale = 10^(SNR/20) rms(noise) / rms(speech)
      ```
      In the example not `rms` is used but `lambda x: x.norm(p=2)`, but as we have the same length of the speech and noise signal, we have
      ```
      rms(noise) / rms(speech) = noise.norm(p=2) / speech.norm(p=2)
      ```
      this would lead us to:
      ```
      10^(SNR/20) = e^(SNR / 10)
      ```
      which is not true.
      
      Hence I changed `e^(SNR / 10)` to `10^(SNR/20)`.
      
      For the proposed SNR values of 20 dB, 10 dB, 3 dB the value of the scale would change from 7.39, 2.72, 1.35 to 10.0, 3.16, 1.41.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2285
      
      Reviewed By: nateanl
      
      Differential Revision: D35047737
      
      Pulled By: mthrok
      
      fbshipit-source-id: ac24c8fd48ef06b4b611e35163084644330a3ef3
      8395fe65
  14. 17 Mar, 2022 1 commit
  15. 10 Mar, 2022 1 commit
  16. 08 Mar, 2022 1 commit
  17. 26 Feb, 2022 1 commit
    • moto's avatar
      Improve device streaming (#2202) · 365313ed
      moto authored
      Summary:
      This commit adds tutorial for device ASR, and update API for device streaming.
      
      The changes for the interface are
      1. Add `timeout` and `backoff` parameters to `process_packet` and `stream` methods.
      2. Move `fill_buffer` method to private.
      
      When dealing with device stream, there are situations where the device buffer is not
      ready and the system returns `EAGAIN`. In such case, the previous implementation of
      `process_packet` method raised an exception in Python layer , but for device ASR,
      this is inefficient. A better approach is to retry within C++ layer in blocking manner.
      The new `timeout` parameter serves this purpose.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2202
      
      Reviewed By: nateanl
      
      Differential Revision: D34475829
      
      Pulled By: mthrok
      
      fbshipit-source-id: bb6d0b125d800f87d189db40815af06fbd4cab59
      365313ed
  18. 24 Feb, 2022 1 commit
  19. 23 Feb, 2022 1 commit
  20. 17 Feb, 2022 2 commits
  21. 16 Feb, 2022 6 commits
  22. 15 Feb, 2022 1 commit
  23. 11 Feb, 2022 5 commits
  24. 10 Feb, 2022 1 commit
  25. 09 Feb, 2022 1 commit
    • hwangjeff's avatar
      Fix librosa calls (#2208) · e5d567c9
      hwangjeff authored
      Summary:
      Yesterday's release of librosa 0.9.0 made args keyword-only and changed default padding from "reflect" to "zero" for some functions. This PR adjusts callsites in our tutorials and tests accordingly.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/2208
      
      Reviewed By: mthrok
      
      Differential Revision: D34099793
      
      Pulled By: hwangjeff
      
      fbshipit-source-id: 4e2642cdda8aae6d0a928befaf1bbb3873d229bc
      e5d567c9
  26. 04 Feb, 2022 1 commit
  27. 03 Feb, 2022 1 commit