1. 30 Jul, 2021 1 commit
    • fix typo in gradient_checkpointing arg (#12855) · 5c673efa
      21jun authored
      help for `ModelArguments.gradient_checkpointing` should be
      "If True, use gradient checkpointing to save memory
      at the expense of slower backward pass."
      not "Whether to freeze the feature extractor layers of the model."
      (which is duplicated from `freeze_feature_extractor` arg)
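
The fix described above can be sketched as follows. This is a minimal illustration, not the actual script: only the two argument names and the two help strings come from the commit message, while the `Optional[bool]` types, the default values, and the `help_texts` helper are assumptions made for the example.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class ModelArguments:
    # Defaults below are assumed for illustration; the commit only changes help text.
    freeze_feature_extractor: Optional[bool] = field(
        default=True,
        metadata={"help": "Whether to freeze the feature extractor layers of the model."},
    )
    gradient_checkpointing: Optional[bool] = field(
        default=False,
        metadata={
            "help": "If True, use gradient checkpointing to save memory "
            "at the expense of slower backward pass."
        },
    )


def help_texts() -> dict:
    """Hypothetical helper: map each argument name to its help string."""
    return {f.name: f.metadata["help"] for f in fields(ModelArguments)}
```

Comparing the values returned by `help_texts()` is a quick way to spot a help string that was copied from a neighbouring argument.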
  2. 25 Jun, 2021 1 commit
  3. 14 Jun, 2021 1 commit
  4. 12 May, 2021 1 commit
  5. 18 Mar, 2021 1 commit
    • wav2vec2: support datasets other than LibriSpeech (#10581) · af8afdc8
      Mohamed El-Geish authored
      * wav2vec2: support datasets other than LibriSpeech
      
      * Formatting run_asr.py to pass code quality test
      
      * bundled orthography options and added verbose logs
      
      * fixing a typo in timit fine-tuning script
      
      * update comment for clarity
      
      * resize_lm_head and load custom vocab from file
      
      * adding a max_duration_in_seconds filter
      
      * do not assign `duration_filter` lambda, use a def
      
      * log untransliterated text as well
      
      * fix base model for arabic
      
      * fix duration filter when target_sr is not set
      
      * drop duration_in_seconds when unneeded
      
      * script for wav2vec2-large-lv60-timit-asr
      
      * fix for "tha" in arabic corpus (huggingface#10581)
      
      * adding more options to work with common_voice
      
      * PR feedback (huggingface#10581)
      
      * small README change
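
The duration-filter items in this commit (a `max_duration_in_seconds` filter, written as a `def` instead of an assigned lambda, and dropping `duration_in_seconds` when unneeded) might look roughly like the sketch below. It is an assumption-laden illustration, not the code from `run_asr.py`: the `make_duration_filter` factory, the plain-dict example layout, and the sample values are hypothetical.

```python
from typing import Callable, Optional


def make_duration_filter(max_duration_in_seconds: Optional[float]) -> Callable[[dict], bool]:
    """Hypothetical factory returning a named predicate instead of a lambda."""

    def duration_filter(example: dict) -> bool:
        # With no limit configured, every example is kept.
        if max_duration_in_seconds is None:
            return True
        return example["duration_in_seconds"] <= max_duration_in_seconds

    return duration_filter


# Made-up examples standing in for dataset rows.
examples = [
    {"id": "a", "duration_in_seconds": 3.2},
    {"id": "b", "duration_in_seconds": 12.5},
]

keep = make_duration_filter(10.0)
kept = [ex for ex in examples if keep(ex)]

# Drop the helper column once filtering is done.
kept = [{k: v for k, v in ex.items() if k != "duration_in_seconds"} for ex in kept]
```

Using a `def` gives the predicate a real name in tracebacks and logs, which is the usual reason code-quality checks flag assigned lambdas.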
  6. 05 Mar, 2021 1 commit
  7. 01 Mar, 2021 1 commit
    • Add Fine-Tuning for Wav2Vec2 (#10145) · 0234de84
      Patrick von Platen authored
      
      
      * add encode labels function to tokenizer
      
      * start adding finetuning
      
      * init dropout
      
      * upload
      
      * correct convert script
      
      * apply changes
      
      * fix second typo
      
      * make first dummy training run
      
      * adapt convert script
      
      * push confg for comparison
      
      * remove conf
      
      * finish training
      
      * adapt data collator
      
      * add research folder
      
      * update according to fairseq feedback
      
      * some minor corrections
      
      * refactor masking indices a bit
      
      * some minor changes
      
      * clean tokenizer
      
      * finish clean-up
      
      * remove previous logic
      
      * update run script
      
      * correct training
      
      * finish changes
      
      * finish model
      
      * correct bug
      
      * fix training a bit more
      
      * add some tests
      
      * finish gradient checkpointing
      
      * finish example
      
      * correct gradient checkpointing
      
      * improve tokenization method
      
      * revert changes in tokenizer
      
      * revert general change
      
      * adapt fine-tuning
      
      * update
      
      * save intermediate test
      
      * Update README.md
      
      * finish finetuning
      
      * delete conversion script
      
      * Update src/transformers/models/wav2vec2/configuration_wav2vec2.py
      
      * Update src/transformers/models/wav2vec2/processing_wav2vec2.py
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * finish wav2vec2 script
      
      * finish wav2vec2 fine-tuning
      
      * finalize test
      
      * correct test
      
      * adapt tests
      
      * finish
      
      * remove test file
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>