1. 03 May, 2023 4 commits
  2. 02 May, 2023 3 commits
  3. 01 May, 2023 2 commits
  4. 29 Apr, 2023 1 commit
  5. 28 Apr, 2023 1 commit
    • Add cuctc decoder (#3096) · 0a1801ed
      Yuekai Zhang authored
      Summary:
      This PR implements a CUDA-based CTC prefix beam search decoder.
      
      Several benchmark results measured on a V100 are attached below:
      | decoder type | model                 | dataset                      | decoding time (secs) | beam size | batch size | model unit | subsampling factor | vocab size |
      |--------------|-----------------------|------------------------------|----------------------|-----------|------------|------------|--------------------|------------|
      | cuctc        | conformer nemo        | dev clean                    | 7.68                 | 8         | 32         | bpe        | 4                  | 1000       |
      | cuctc        | conformer nemo        | dev clean (sorted by length) | 1.6                  | 8         | 32         | bpe        | 4                  | 1000       |
      | cuctc        | wav2vec2.0 torchaudio | dev clean                    | 22                   | 10        | 1          | char       | 2                  | 29         |
      | cuctc        | conformer espnet      | aishell1 test                | 5                    | 10        | 24         | char       | 4                  | 4233       |
      
      Note:
      1. The design parallelizes computation along the batch and vocabulary axes and loops sequentially over the frame axis. Compared with CPU implementations, it therefore favors shorter sequences and larger vocabularies.
      2. WER is the same as with the CPU implementations. However, decoding with a language model (LM) is not supported yet.
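
To make the note about the frame loop concrete, here is a minimal CPU reference sketch of CTC prefix beam search in pure Python. It is not the CUDA code from this PR; the function names (`log_add`, `ctc_prefix_beam_search`) and the data layout are assumptions chosen for illustration. The outer loop over frames is sequential because each step depends on the previous beams, while the inner loops over beams and vocabulary entries are the kind of work a GPU implementation can spread across threads.

```python
import math
from collections import defaultdict

NEG_INF = float("-inf")


def log_add(a, b):
    """Return log(exp(a) + exp(b)) without overflow."""
    if a == NEG_INF:
        return b
    if b == NEG_INF:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def ctc_prefix_beam_search(log_probs, beam_size, blank=0):
    """Decode one utterance.

    log_probs: list of frames, each a list of per-token log-probabilities.
    Returns a list of (token_ids, log_prob) sorted best first.
    """
    # prefix -> (log-prob of paths ending in blank, ending in non-blank)
    beams = {(): (0.0, NEG_INF)}
    for frame in log_probs:  # sequential over the frame axis
        nxt = defaultdict(lambda: (NEG_INF, NEG_INF))
        for prefix, (p_b, p_nb) in beams.items():
            for token, lp in enumerate(frame):  # parallelizable over vocab
                if token == blank:
                    # Blank keeps the prefix; paths now end in blank.
                    b, nb = nxt[prefix]
                    nxt[prefix] = (log_add(b, log_add(p_b, p_nb) + lp), nb)
                elif prefix and token == prefix[-1]:
                    # Repeated token without a blank collapses into the prefix.
                    b, nb = nxt[prefix]
                    nxt[prefix] = (b, log_add(nb, p_nb + lp))
                    # Repeated token after a blank extends the prefix.
                    ext = prefix + (token,)
                    b, nb = nxt[ext]
                    nxt[ext] = (b, log_add(nb, p_b + lp))
                else:
                    # A different token always extends the prefix.
                    ext = prefix + (token,)
                    b, nb = nxt[ext]
                    nxt[ext] = (b, log_add(nb, log_add(p_b, p_nb) + lp))
        # Keep only the beam_size most probable prefixes.
        ranked = sorted(nxt.items(), key=lambda kv: log_add(*kv[1]), reverse=True)
        beams = dict(ranked[:beam_size])
    return [(list(prefix), log_add(p_b, p_nb)) for prefix, (p_b, p_nb) in beams.items()]


# Two frames, vocabulary {0: blank, 1: "a"}, with P(blank)=0.6 and P(a)=0.4 per frame.
frames = [[math.log(0.6), math.log(0.4)]] * 2
results = ctc_prefix_beam_search(frames, beam_size=2)
```

In this toy input the most likely single path is blank-blank (0.36), but the prefix `[1]` accumulates probability from three paths (0.24 + 0.24 + 0.16 = 0.64), which is why prefix beam search ranks it first.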
      
      Resolves: https://github.com/pytorch/audio/issues/2957.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3096
      
      Reviewed By: nateanl
      
      Differential Revision: D44709397
      
      Pulled By: mthrok
      
      fbshipit-source-id: 3078c54a2b44dc00eb4a81b4c657487eeff8c155
  6. 25 Apr, 2023 1 commit
  7. 19 Apr, 2023 2 commits
  8. 18 Apr, 2023 1 commit
  9. 12 Apr, 2023 3 commits
  10. 11 Apr, 2023 2 commits
  11. 10 Apr, 2023 4 commits
  12. 07 Apr, 2023 5 commits
  13. 06 Apr, 2023 2 commits
    • Remove custom flashlight import (#3246) · ae614ed3
      moto authored
      Summary:
      In https://github.com/pytorch/audio/pull/3232, the CTC decoder was excluded from the binary distribution.
      To use CTCDecoder, users need to install flashlight-text.
      
      Currently, if flashlight-text is not available, torchaudio still attempts to import the custom bundle.
      This commit cleans up this behavior by delaying the error until one of the components is actually used,
      and providing a better message.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3246
      
      Test Plan:
      Binary smoke tests import torchaudio without installing flashlight.
      Unit test CI jobs run the CTC decoder with flashlight installed.
      
      Reviewed By: jacobkahn
      
      Differential Revision: D44748413
      
      Pulled By: mthrok
      
      fbshipit-source-id: 21d2cbd9961ed88405a739cc682071066712f5e4
    • Add frame writing API to StreamWriter (#3244) · f4d94cab
      Jeff Hwang authored
      Summary:
      Pull Request resolved: https://github.com/pytorch/audio/pull/3244
      
      Adds methods to `StreamWriter` that allow for passing in `AVFrame` instances rather than tensors.
      
      Reviewed By: mthrok
      
      Differential Revision: D44589256
      
      fbshipit-source-id: f100e0d349708482b873a9a4bae1eaf5eb65301a
  14. 05 Apr, 2023 2 commits
  15. 04 Apr, 2023 4 commits
    • [BC-breaking] Make I/O optional arguments kw-only (#3227) · ab40a3a3
      moto authored
      Summary:
      Recently, we added a number of options to make StreamReader/Writer more flexible. As a result, their methods take many arguments, some of which form semantic groups.

      For example, the arguments of ``StreamWriter.add_video_stream`` are roughly grouped as follows:
      
      - Information about input media format
         `frame_rate`, `width`, `height`, `format`
      - Information about encoder
         `encoder`, `encoder_option`
      - Information about codec configuration
         `codec_config`
      - Information about encode media format
         `encoder_format`, `encoder_frame_rate`, `encoder_width`, `encoder_height`
      - Information about additional processing
         `filter_desc`
      - Hardware acceleration
         `hw_accel`
      
      We do not know what arguments will be added in the future, but when we add them,
      we want to keep them roughly grouped by inserting each new argument
      somewhere in the middle without breaking backward compatibility.

      This commit makes most of them keyword-only arguments, so that we can
      rearrange them without breaking backward compatibility.
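
As a rough illustration of why keyword-only arguments allow reordering, consider the simplified function below. It is a hypothetical stand-in, not the real `StreamWriter.add_video_stream` signature: the parameter names are taken from the list above, but the position of the `*` marker and the default values are assumptions.

```python
def add_video_stream(
    frame_rate,
    width,
    height,
    format="rgb24",
    *,  # everything after this marker must be passed by keyword
    encoder=None,
    encoder_option=None,
    encoder_format=None,
    filter_desc=None,
    hw_accel=None,
):
    """Return the stream configuration as a dict (illustration only)."""
    return {
        "frame_rate": frame_rate,
        "width": width,
        "height": height,
        "format": format,
        "encoder": encoder,
        "encoder_option": encoder_option,
        "encoder_format": encoder_format,
        "filter_desc": filter_desc,
        "hw_accel": hw_accel,
    }


# Keyword arguments work in any order, so the definition can be rearranged
# or extended in the middle without affecting this call.
config = add_video_stream(30, 640, 480, encoder_format="yuv420p", encoder="libx264")

# Passing a keyword-only argument positionally is rejected.
try:
    add_video_stream(30, 640, 480, "rgb24", "libx264")
    positional_rejected = False
except TypeError:
    positional_rejected = True
```

Because callers can only name these options, inserting a new parameter between `encoder_option` and `encoder_format` later would not change the meaning of any existing call.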
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3227
      
      Reviewed By: hwangjeff
      
      Differential Revision: D44681620
      
      Pulled By: mthrok
      
      fbshipit-source-id: b55f6168f4c2f3d0f59731b9bb0db4ae54e5a90f
    • Disable CTC decoder bundle by default (#3232) · 3844a2bd
      moto authored
      Summary:
      As we migrate to upstream flashlight-text and KenLM, this PR disables building the CTC decoder by default.
      This stops shipping the flashlight-text and KenLM bundle in the torchaudio binary.
      
      Ref: https://github.com/pytorch/audio/issues/3088
      
      cc jacobkahn
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3232
      
      Reviewed By: hwangjeff
      
      Differential Revision: D44650872
      
      Pulled By: mthrok
      
      fbshipit-source-id: 2415623abaf3cafa181135db5112d3c711137cd7
    • Swap in assertions for decoder setup checks (#3235) · ea212c6e
      hwangjeff authored
      Summary: Pull Request resolved: https://github.com/pytorch/audio/pull/3235
      
      Reviewed By: mthrok
      
      Differential Revision: D44653654
      
      Pulled By: hwangjeff
      
      fbshipit-source-id: f28a6068e826581d76ed4a216adb6019b6486e53
    • Remove linux GPU unit test from CircleCI (#3231) · 0d57a3af
      moto authored
      Summary:
      The Linux GPU unit tests on CircleCI rely on a custom Docker image with CUDA 10.2.
      
      PyTorch 2.0 does not support CUDA 10, so these tests have not run for a while.
      
      We have GPU tests on GHA for Linux, so the CircleCI ones can be removed.
      
      The Windows GPU tests have not been ported to GHA yet, but they still work on CircleCI, so we keep them for now.
      
      Pull Request resolved: https://github.com/pytorch/audio/pull/3231
      
      Reviewed By: hwangjeff
      
      Differential Revision: D44639302
      
      Pulled By: mthrok
      
      fbshipit-source-id: c1fd39f4805a50a12af4259d423985fe453fd229
  16. 03 Apr, 2023 3 commits