- 13 Oct, 2020 1 commit
moto authored
- 12 Oct, 2020 1 commit
moto authored
- 06 Oct, 2020 1 commit
moto authored
- 24 Sep, 2020 1 commit
Vincent QB authored
* example pipeline, initial commit.
* removing notebook conversion artifacts.
* remove extra comments. lint.
* addressing some feedback.
* main function.
* defining args in function.
* refactor.
* lint.
* checkpoint.
* clean version to start with.
* adding more parameters.
* lint.
* cleaning full version.
* check for not None.
* cleaning.
* back -l 160
* black.
* fix runtime error.
* removing some print statements.
* add help to command line. add progress bar option.
* grouping librispeech-specific transform in subclass.
* typo.
* fix concatenation.
* typo.
* black. tqdm.
* missing transpose.
* renaming variables.
* sum cer and wer
* clip norm.
* second signal handler removed.
* cosmetic.
* default to no checkpoint.
* remove non_blocking.
* adadelta works better than sgd.
* anomaly detection.
* moving dataset to separate file.
* lint.
* move to separate module: languagemodel, decoder, metric.
* flush=True.
* renaming decoder.
* CTC Decoders.
* flush=True.
* pass length for viterbi decoder.
* progress bar. relative path.
* generalize transition matrix to n-gram. progress bar.
* choice of decoder.
* collate func.
* remove signal handling.
* adding distributed.
* lint.
* normalize w/r to length of dataset, and w/r to total number characters.
* relative cer/wer.
* clip grad parameter. momentum back but not yet used.
* Switch to SGD.
* choice of optimizer.
* scheduler.
* move to utils file.
* metric log, and utils file.
* rename metric_logger.
* stderr and stdout. simpler metric logger.
* replace by logging.
* adding time measurement in metric logger.
* fix duplicate name. remove tqdm. keep track of epoch instead and iteration instead.
* rename main file. and add readme.
* refactor distributed.
* swap example and output in readme.
* remove time from logger.
* check non-empty tensor input.
* typo in variable name and log update.
* typo.
* compute cer/wer in training too.
* typo.
* add back slurm signal capture to resubmit job.
* update Levenshtein distance.
* adding tests for Levenshtein distance.
* record error rate during iteration.
* metric logger using setitem.
* moving signal break to end of loop and return loss so far.
* typo.
* add citation.
* change default to best run.
* adding other experiment with decoders.
* remove other decoders than greedy.
* Revert "remove other decoders than greedy." This reverts commit fb114372e89e317bf48d0b1f846c60bca8efe1ac.
* changing name of folder.
* remove other decoders, and unused dataset class.
* rename functions to align with other pipeline.
* pick which parts to train with.
* adding specaugment to validation. note that caching prevents randomization from happening in validation.
* updating readme.
* typo in metric logging.
* Revert "typo in metric logging." This reverts commit acac245eec250f61d2039a67933d3c01f1975ce9.
* Revert "Revert "typo in metric logging."" This reverts commit 2c80d9691ed401044da734c40df3715dba92d0db.
* update metric logger.
* simplify metric logger implementation.
* use json dumps instead.
* group metric together.
* move function.
* lint.
* quick summary of files in folder.
* pass clip_grad explicitly.
* typo in default dataset name.
* option to disable logger.
* ergonomics for distributed.
* reminder about signal handler.
* minor refactor of main in main.
* replace by not_main_rank.
* raising error if parameter not supported.
* move model before invoking DDP.
* changing log level. using python 2 style string for logging.
* dynamic augmentations.
* update metric log. batch cer/wer metric. correct typo in time. adding other dimensions in metric.
* save learning rate even if function not available.
* add type option to model.
* add adamw.
* reduce lr on validation step or training step.
* specify hop-length and win-length.
* normalize option.
* rename parameter.
* add dropout and tweak to number of channels.
* copy model in pipeline folder for experimentation.
* fix scheduler stepping.
* fix input_type and num_features.
* waveform mode changes shape more.
* adding best character error rate with current implementation of model with mfcc.
* comment update.
* remove signal. remove custom wav2letter model.
* remove comment.
* simpler import with pandas.
- 11 Sep, 2020 1 commit
sdarkhovsky authored
* updated the build_generator call to include the models argument
* fixed RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same
- 07 Aug, 2020 1 commit
jimchen90 authored
* Add spectrogram normalization option

Co-authored-by: Ji Chen <jimchen90@devfair0160.h2.fair>
- 30 Jul, 2020 1 commit
jimchen90 authored
Co-authored-by: Ji Chen <jimchen90@devfair0160.h2.fair>
- 29 Jul, 2020 1 commit
jimchen90 authored
* Remove underscore of model name

Co-authored-by: Ji Chen <jimchen90@devfair0160.h2.fair>
- 21 Jul, 2020 1 commit
jimchen90 authored
* Add WaveRNN example

This is the pipeline example based on the [WaveRNN model](https://github.com/pytorch/audio/pull/735) in torchaudio. The design of this pipeline is inspired by [#632](https://github.com/pytorch/audio/pull/632). It offers a standardized implementation of the WaveRNN vocoder in torchaudio.

* Add utils and readme

The metric logger is added based on the Wav2letter pipeline [#632](https://github.com/pytorch/audio/pull/632). It offers a way to parse the standard output, as described in the readme.

* Add channel dimension

A channel dimension is added to the waveform in the datasets to match the input dimensions of the WaveRNN model, because the channel dimensions of waveform and spectrogram are added in [this part](https://github.com/pytorch/audio/blob/master/torchaudio/models/_wavernn.py#L281) of the WaveRNN model.

* Update data split and transform

The design of the dataset structure is discussed in [this comment](https://github.com/pytorch/audio/pull/749#discussion_r454627027). The dataset file now has a clearer workflow, using the random-split function instead of walking through all the files. All transform functions are put together inside the transforms block.

Co-authored-by: Ji Chen <jimchen90@devfair0160.h2.fair>
- 01 Apr, 2020 1 commit
Bhargav Kathivarapu authored
- 06 Sep, 2019 1 commit
Vincent QB authored
- 21 Aug, 2019 1 commit
jamarshon authored
- 19 Aug, 2019 1 commit
cpuhrsch authored
Interactive speech recognition demo based on PySpeech's new model, Jason Lian's PR, and Vincent QB's VAD PR.