    wav2vec2: support datasets other than LibriSpeech (#10581) · af8afdc8
    Mohamed El-Geish authored
    * wav2vec2: support datasets other than LibriSpeech
    
    * Formatting run_asr.py to pass code quality test
    
    * bundled orthography options and added verbose logs
    
    * fixing a typo in TIMIT fine-tuning script
    
    * update comment for clarity
    
    * resize_lm_head and load custom vocab from file
    
    * adding a max_duration_in_seconds filter
    
    * do not assign `duration_filter` lambda, use a def
    
    * log untransliterated text as well
    
    * fix base model for Arabic
    
    * fix duration filter when target_sr is not set
    
    * drop duration_in_seconds when unneeded
    
    * script for wav2vec2-large-lv60-timit-asr
    
    * fix for "tha" in Arabic corpus (huggingface#10581)
    
    * adding more options to work with common_voice
    
    * PR feedback (huggingface#10581)
    
    * small README change
README.md 5.6 KB