1. 18 Mar, 2021 1 commit
    • wav2vec2: support datasets other than LibriSpeech (#10581) · af8afdc8
      Mohamed El-Geish authored
      * wav2vec2: support datasets other than LibriSpeech
      
      * Formatting run_asr.py to pass code quality test
      
      * bundled orthography options and added verbose logs
      
      * fixing a typo in timit fine-tuning script
      
      * update comment for clarity
      
      * resize_lm_head and load custom vocab from file
      
      * adding a max_duration_in_seconds filter
      
      * do not assign `duration_filter` lambda, use a def
      
      * log untransliterated text as well
      
      * fix base model for arabic
      
      * fix duration filter when target_sr is not set
      
      * drop duration_in_seconds when unneeded
      
      * script for wav2vec2-large-lv60-timit-asr
      
      * fix for "tha" in arabic corpus (huggingface#10581)
      
      * adding more options to work with common_voice
      
      * PR feedback (huggingface#10581)
      
      * small README change