1. 23 Mar, 2022 1 commit
  2. 10 Jan, 2022 1 commit
  3. 14 Oct, 2021 1 commit
  4. 08 Aug, 2021 1 commit
  5. 09 Jun, 2021 1 commit
  6. 08 Jun, 2021 1 commit
  7. 18 Mar, 2021 1 commit
    • wav2vec2: support datasets other than LibriSpeech (#10581) · af8afdc8
      Mohamed El-Geish authored
      * wav2vec2: support datasets other than LibriSpeech
      
      * Formatting run_asr.py to pass code quality test
      
      * bundled orthography options and added verbose logs
      
      * fixing a typo in timit fine-tuning script
      
      * update comment for clarity
      
      * resize_lm_head and load custom vocab from file
      
      * adding a max_duration_in_seconds filter
      
      * do not assign `duration_filter` lambda, use a def
      
      * log untransliterated text as well
      
      * fix base model for arabic
      
      * fix duration filter when target_sr is not set
      
      * drop duration_in_seconds when unneeded
      
      * script for wav2vec2-large-lv60-timit-asr
      
      * fix for "tha" in arabic corpus (huggingface#10581)
      
      * adding more options to work with common_voice
      
      * PR feedback (huggingface#10581)
      
      * small README change
  8. 01 Mar, 2021 1 commit
    • Add Fine-Tuning for Wav2Vec2 (#10145) · 0234de84
      Patrick von Platen authored
      
      
      * add encode labels function to tokenizer
      
      * start adding finetuning
      
      * init dropout
      
      * upload
      
      * correct convert script
      
      * apply changes
      
      * fix second typo
      
      * make first dummy training run
      
      * adapt convert script
      
      * push config for comparison
      
      * remove conf
      
      * finish training
      
      * adapt data collator
      
      * add research folder
      
      * update according to fairseq feedback
      
      * some minor corrections
      
      * refactor masking indices a bit
      
      * some minor changes
      
      * clean tokenizer
      
      * finish clean-up
      
      * remove previous logic
      
      * update run script
      
      * correct training
      
      * finish changes
      
      * finish model
      
      * correct bug
      
      * fix training a bit more
      
      * add some tests
      
      * finish gradient checkpointing
      
      * finish example
      
      * correct gradient checkpointing
      
      * improve tokenization method
      
      * revert changes in tokenizer
      
      * revert general change
      
      * adapt fine-tuning
      
      * update
      
      * save intermediate test
      
      * Update README.md
      
      * finish finetuning
      
      * delete conversion script
      
      * Update src/transformers/models/wav2vec2/configuration_wav2vec2.py
      
      * Update src/transformers/models/wav2vec2/processing_wav2vec2.py
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * finish wav2vec2 script
      
      * finish wav2vec2 fine-tuning
      
      * finalize test
      
      * correct test
      
      * adapt tests
      
      * finish
      
      * remove test file
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>