Commits · 3ab521fa2b26b6dbfdb87d36739f0e4191a500a1 · OpenDAS / Torchaudio

04 May, 2021 1 commit
- Refactor libtorchaudio example (#1486) · 3ab521fa
  moto authored May 04, 2021
  
  3ab521fa
30 Apr, 2021 1 commit

Replace existing prototype RNNT Loss (#1479) · 0c263a93

Caroline Chen authored Apr 30, 2021

Replace the prototype RNNT implementation (using warp-transducer) with one without external library dependencies

0c263a93

23 Apr, 2021 1 commit
- Add WER to readme in wav2letter pipeline (#1470) · acf82d0c
  Vincent QB authored Apr 23, 2021
  
  acf82d0c
15 Apr, 2021 2 commits
- fix: dataset-folder-in-archive flag is empty (#1060) · 245da370
  Tran N.M. Hoang authored Apr 16, 2021
  
  245da370
- Use torchaudio melscale 'slaney' instead of librosa in WaveRNN pipeline preprocessing (#1444) · e061b268
  discort authored Apr 16, 2021
  
  * Use torchaudio melscale instead of librosa
  
  e061b268
16 Mar, 2021 1 commit
- Lint code style and remove PY2 compatibility (#1386) · 6bad3a66
  Ankit Dobhal authored Mar 16, 2021
  
  6bad3a66
04 Mar, 2021 1 commit
- Add libtorchaudio cpp example (#1349) · f4589714
  moto authored Mar 03, 2021
  
  f4589714
11 Feb, 2021 1 commit
- DOC Document undocumented parameters and add CI check (#1248) · 5efb13e3
  Nicolas Hug authored Feb 11, 2021
  
  5efb13e3
06 Nov, 2020 1 commit
- remove print-freq option and compute validation loss at each epoch. (#997) · 70fc197b
  Vincent QB authored Nov 06, 2020
  
  70fc197b
02 Nov, 2020 1 commit

Sync fbcode (#996) · 758f6c2a

moto authored Nov 02, 2020

fbshipit-source-id: 4fb853c391900d3070b936e5a3e4609eb78a780d

* 20200428 pytorch/audio import

Summary: [10:30:47: cpuhrsch@devvm3140 pytorch]$ ./fb_build/import_audio.sh

Reviewed By: vincentqb

Differential Revision: D21282421

fbshipit-source-id: 9bde1455ca6a19defbf33dbbfc5f0d49a8e4dc6a

* Import torchaudio 20200528

Summary: Import Up to #664

Reviewed By: cpuhrsch

Differential Revision: D21728204

fbshipit-source-id: 648dd622087fa762194ca5f89a310500e777263d

* Remove unnecessary config file from torchaudio

Summary: Turned out .use_external_sox is not necessary for building torchaudio in fbcode.

Reviewed By: vincentqb

Differential Revision: D21792939

fbshipit-source-id: c0fb5173c6533e67114f50ddc8e9425bd129574f

* Import torchaudio 20200605

Summary: import torchaudio 0.5.0 in fbcode using import_audio.sh:

Reviewed By: vincentqb

Differential Revision: D21884426

fbshipit-source-id: b6f2cc308e597caef2dd767c315b167c09fb0d4c

* Change parameterized testing system to be compatible with unittest

Summary: The previous implementation of parameterized testing worked by modifying test.common_utils in place. This doesn't work in general because unittest's contract with test modules is that it must be able to load the module and run the tests itself. Because the previous implementation needed to load the module and modify it, it is incompatible with unittest.

Reviewed By: mthrok

Differential Revision: D21964676

fbshipit-source-id: 9bb71e8c3f9fab074239b22306f3bbddb0f3975b
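
The constraint described above (unittest must be able to load the module and run the tests itself) can be met by generating the concrete test methods when the class is defined. The following is a minimal sketch of that idea, not the change made in D21964676; the `parameterize` decorator, the `param_` prefix, and `AddTest` are hypothetical names used only for illustration.

```python
import unittest


def parameterize(cases):
    """Hypothetical class decorator: expand every `param_*` method into
    one concrete `test_*` method per case, at class-definition time."""
    def decorate(cls):
        # Snapshot the names first, because we add attributes while looping.
        names = [n for n in vars(cls) if n.startswith("param_")]
        for name in names:
            func = getattr(cls, name)
            for i, args in enumerate(cases):
                # Bind func and args as defaults to avoid late-binding bugs.
                def test(self, _func=func, _args=args):
                    _func(self, *_args)
                setattr(cls, f"test_{name[len('param_'):]}_{i}", test)
        return cls
    return decorate


@parameterize([(1, 1, 2), (2, 3, 5)])
class AddTest(unittest.TestCase):
    def param_add(self, a, b, expected):
        self.assertEqual(a + b, expected)


def run_suite():
    """Load and run AddTest with the standard unittest loader."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AddTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Because the generated methods are ordinary attributes of the class, the standard loader discovers them without any post-load modification of the module.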

* Import torchaudio 20200618 #718

Summary: Import torchaudio up to #719

Reviewed By: zhangguanheng66

Differential Revision: D22119491

fbshipit-source-id: e14842278a32c9373179fc132e8111a0ffe66d93

* Import torchaudio 20200714 #782 (#784)

Summary:
Pull Request resolved: https://github.com/pytorch/audio/pull/784

 - Import torchaudio.
 - Change test util module name from test_case_utils to case_utils

Reviewed By: cpuhrsch

Differential Revision: D22261638

fbshipit-source-id: eb4df500c1d7db0a60baa100dd22795a63851438

* remediation of S205607

fbshipit-source-id: 5113fe0c527595e4227ff827253b7414abbdf7ac

* remediation of S205607

fbshipit-source-id: 798decc90db4f13770e97cdce3c0df7d5421b2a3

* Import torchaudio 20200723

Summary: Import torchaudio 20200723 #814

Reviewed By: fmassa

Differential Revision: D22666393

fbshipit-source-id: 50df07b5c158fe4e95ada7ea54381b2e26f6aecd

* Support custom exception message (#41907)

Summary:
Raise and assert used to have a hard-coded error message "Exception". The user-provided error message was ignored. This PR adds support to represent the user's error message in TorchScript.

This breaks backward compatibility because now we actually need to script the user's error message, which can potentially contain unscriptable expressions. Such programs can break when scripting, but saved models can still continue to work.

Increased an op count in test_mobile_optimizer.py because now we need aten::format to form the actual exception message.

This is built upon a WIP PR: https://github.com/pytorch/pytorch/pull/34112 by driazati

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41907



Reviewed By: ngimel

Differential Revision: D22778301

Pulled By: gmagogsfm

fbshipit-source-id: 2b94f0db4ae9fe70c4cd03f4048e519ea96323ad

* Import torchaudio 20200804

Summary: Up to #804

Reviewed By: vincentqb

Differential Revision: D22947671

fbshipit-source-id: d1a005cec2f1a00913c41eda380b9f4b993ef779

* Remove .python3 markers

Reviewed By: ashwinp-fb

Differential Revision: D22955630

fbshipit-source-id: f00ef17a905e4c7cd9196c8924db39f9cdfe8cfa

* Import torchaudio 20200821

Reviewed By: cpuhrsch

Differential Revision: D23273584

fbshipit-source-id: 2fe7effa11b7f7cdf0cee1da6b1cac5556e9f55b

* Import torchaudio 20200922

Summary: Up to #914

Reviewed By: vincentqb, cpuhrsch

Differential Revision: D23846718

fbshipit-source-id: 9feb4e58563b900965467bd9ff66c979211c50df

* replace max-sentences with batch-size for dependencies

Summary: this fixes some regressions introduced by D24121305. fairseq configuration is changing from command line to dataclasses (via hydra eventually) which no longer supports option aliases. one such alias is --max-sentences / --batch-size, and D24121305 removed --max-sentences as --batch-size is more appropriate (fairseq is not just an nlp framework dealing with sentences). unfortunately it seems some existing flows broke and this diff attempts to fix this

Differential Revision: D24142488

fbshipit-source-id: 075180ea10a9d706a3f8d64b978d66dfd83c3d2b
Co-authored-by: Vincent Quenneville-Belair <vincentqb@gmail.com>
Co-authored-by: cpuhrsch <cpuhrsch@fb.com>
Co-authored-by: Ji Chen <jimchen90@fb.com>
Co-authored-by: Ben Mehne <bmehne@fb.com>
Co-authored-by: Stanislau Hlebik <stash@fb.com>
Co-authored-by: Yanan Cao <gmagogsfm@gmail.com>
Co-authored-by: Andres Suarez <asuarez@fb.com>
Co-authored-by: Alexei Baevski <abaevski@fb.com>

758f6c2a

13 Oct, 2020 1 commit
- Add Conv-TasNet training script to example (#896) · 4e97213b
  moto authored Oct 13, 2020
  
  4e97213b
12 Oct, 2020 1 commit
- Add wsj0-mix dataset to source separation example (#895) · 2d879132
  moto authored Oct 12, 2020
  
  2d879132
06 Oct, 2020 1 commit
- Add metrics to source separation example (#894) · 725f8b06
  moto authored Oct 06, 2020
  
  725f8b06
24 Sep, 2020 1 commit

Example pipeline with wav2letter (#632) · 9c274228

Vincent QB authored Sep 24, 2020

* example pipeline, initial commit.

* removing notebook conversion artifacts.

* remove extra comments. lint.

* addressing some feedback.

* main function.

* defining args in function.

* refactor.

* lint.

* checkpoint.

* clean version to start with.

* adding more parameters.

* lint.

* cleaning full version.

* check for not None.

* cleaning.

* black -l 160

* black.

* fix runtime error.

* removing some print statements.

* add help to command line. add progress bar option.

* grouping librispeech-specific transform in subclass.

* typo.

* fix concatenation.

* typo.

* black. tqdm.

* missing transpose.

* renaming variables.

* sum cer and wer

* clip norm.

* second signal handler removed.

* cosmetic.

* default to no checkpoint.

* remove non_blocking.

* adadelta works better than sgd.

* anomaly detection.

* moving dataset to separate file.

* lint.

* move to separate module: languagemodel, decoder, metric.

* flush=True.

* renaming decoder.

* CTC Decoders.

* flush=True.

* pass length for viterbi decoder.

* progress bar. relative path.

* generalize transition matrix to n-gram. progress bar.

* choice of decoder.

* collate func.

* remove signal handling.

* adding distributed.

* lint.

* normalize w.r.t. length of dataset, and w.r.t. total number of characters.

* relative cer/wer.

* clip grad parameter. momentum back but not yet used.

* Switch to SGD.

* choice of optimizer.

* scheduler.

* move to utils file.

* metric log, and utils file.

* rename metric_logger.

* stderr and stdout. simpler metric logger.

* replace by logging.

* adding time measurement in metric logger.

* fix duplicate name. remove tqdm. keep track of epoch and iteration instead.

* rename main file. and add readme.

* refactor distributed.

* swap example and output in readme.

* remove time from logger.

* check non-empty tensor input.

* typo in variable name and log update.

* typo.

* compute cer/wer in training too.

* typo.

* add back slurm signal capture to resubmit job.

* update Levenshtein distance.

* adding tests for Levenshtein distance.

* record error rate during iteration.

* metric logger using setitem.

* moving signal break to end of loop and return loss so far.

* typo.

* add citation.

* change default to best run.

* adding other experiment with decoders.

* remove other decoders than greedy.

* Revert "remove other decoders than greedy."

This reverts commit fb114372e89e317bf48d0b1f846c60bca8efe1ac.

* changing name of folder.

* remove other decoders, and unused dataset class.

* rename functions to align with other pipeline.

* pick which parts to train with.

* adding specaugment to validation. note that caching prevents randomization from happening in validation.

* updating readme.

* typo in metric logging.

* Revert "typo in metric logging."

This reverts commit acac245eec250f61d2039a67933d3c01f1975ce9.

* Revert "Revert "typo in metric logging.""

This reverts commit 2c80d9691ed401044da734c40df3715dba92d0db.

* update metric logger.

* simplify metric logger implementation.

* use json dumps instead.

* group metric together.

* move function.

* lint.

* quick summary of files in folder.

* pass clip_grad explicitly.

* typo in default dataset name.

* option to disable logger.

* ergonomics for distributed.

* reminder about signal handler.

* minor refactor of main in main.

* replace by not_main_rank.

* raising error if parameter not supported.

* move model before invoking DDP.

* changing log level. using python 2 style string for logging.

* dynamic augmentations.

* update metric log.

batch cer/wer metric. correct typo in time. adding other dimensions in metric.

* save learning rate even if function not available.

* add type option to model.

* add adamw.

* reduce lr on validation step or training step.

* specify hop-length and win-length.

* normalize option.

* rename parameter.

* add dropout and tweak to number of channels.

* copy model in pipeline folder for experimentation.

* fix scheduler stepping.

* fix input_type and num_features.

* waveform mode changes shape more.

* adding best character error rate with current implementation of model with mfcc.

* comment update.

* remove signal. remove custom wav2letter model.

* remove comment.

* simpler import with pandas.

9c274228
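
The commit above mentions a Levenshtein distance used to compute CER and WER. The following is a minimal sketch of how such metrics can be computed, not the code from the pipeline itself; the function names are hypothetical, and normalizing by the reference length is an assumption (the commit messages mention normalizing by dataset length and by total number of characters).

```python
def levenshtein(ref, hyp):
    """Edit distance between two sequences, with unit cost for
    insertion, deletion, and substitution (row-by-row dynamic programming)."""
    prev = list(range(len(hyp) + 1))  # distances from empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp prefix
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # delete r
                curr[j - 1] + 1,         # insert h
                prev[j - 1] + (r != h),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def error_rate(ref, hyp):
    """Edit distance divided by reference length (assumed normalization)."""
    return levenshtein(ref, hyp) / max(len(ref), 1)


def cer(reference, hypothesis):
    """Character error rate: compare the strings character by character."""
    return error_rate(list(reference), list(hypothesis))


def wer(reference, hypothesis):
    """Word error rate: compare whitespace-separated tokens."""
    return error_rate(reference.split(), hypothesis.split())
```

For example, `wer("the cat sat", "the cat sat down")` counts one inserted word against a three-word reference.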

11 Sep, 2020 1 commit

Fix interactive asr (#900) · b6a61c3f

sdarkhovsky authored Sep 11, 2020

* updated the build_generator call to include the models argument

* fixed RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same

b6a61c3f

07 Aug, 2020 1 commit

Add spectrogram normalization option (#863) · 6aafbb6d

jimchen90 authored Aug 07, 2020



* Add spectrogram normalization option
Co-authored-by: Ji Chen <jimchen90@devfair0160.h2.fair>

6aafbb6d

30 Jul, 2020 1 commit
- Add libritts dataset option (#818) · 870811c7
  jimchen90 authored Jul 30, 2020
  
  Co-authored-by: Ji Chen <jimchen90@devfair0160.h2.fair>
  
  870811c7
29 Jul, 2020 1 commit

Remove underscore of wavernn model (#810) · f7549730

jimchen90 authored Jul 28, 2020



* Remove underscore of model name
Co-authored-by: Ji Chen <jimchen90@devfair0160.h2.fair>

f7549730

21 Jul, 2020 1 commit

Add wavernn example pipeline (#749) · fac1bba9

jimchen90 authored Jul 21, 2020

* Add WaveRNN example

This is a pipeline example based on the [WaveRNN model](https://github.com/pytorch/audio/pull/735) in torchaudio. The design of this pipeline is inspired by [#632](https://github.com/pytorch/audio/pull/632). It offers a standardized implementation of the WaveRNN vocoder in torchaudio.

* Add utils and readme

The metric logger is added based on the Wav2letter pipeline [#632](https://github.com/pytorch/audio/pull/632). It offers a way to parse the standard output as described in the readme.

* Add channel dimension

The channel dimension of waveform in datasets is added to match the input dimensions of WaveRNN model because the channel dimensions of waveform and spectrogram are added in [this part](https://github.com/pytorch/audio/blob/master/torchaudio/models/_wavernn.py#L281) of WaveRNN model.

* Update data split and transform

The design of the dataset structure is discussed in [this comment](https://github.com/pytorch/audio/pull/749#discussion_r454627027). Now the dataset file has a clearer workflow after using the random-split function instead of walking through all the files. All transform functions are put together inside the transforms block.
Co-authored-by: Ji Chen <jimchen90@devfair0160.h2.fair>

fac1bba9

01 Apr, 2020 1 commit
- Replace six with python3 version (#486) · d069fb9f
  Bhargav Kathivarapu authored Apr 01, 2020
  
  d069fb9f
06 Sep, 2019 1 commit
- lint. (#266) · f720aec0
  Vincent QB authored Sep 06, 2019
  
  f720aec0
21 Aug, 2019 1 commit
- Increasing test coverage (ASR demo) (#248) · ed175137
  jamarshon authored Aug 21, 2019
  
  ed175137
19 Aug, 2019 1 commit

Interactive speech recognition demo (#229) · fc9fb931

cpuhrsch authored Aug 19, 2019

Interactive speech recognition demo based on PySpeech's new model, Jason Lian's PR, and Vincent QB's VAD PR.

fc9fb931