1. 15 Aug, 2024 2 commits
    • Update training guide colab (#108) · 8e465f1b
      Yoach Lacombe authored
      * Update README.md
      
      * Update README.md
      
      * Update README.md
      
      * update configs and readme
      
      * fix training and eval single gpus and long audios errors
      
      * fix error transcriptions none
      
      * fix transcription null wer
      
      * Update README.md
      
      * Update README.md
      
      ---------
      
      Co-authored-by: Yoach Lacombe <yoach@huggingface.co>
    • Update training guide (#102) · 8f5ef3a2
      Yoach Lacombe authored
      * Update README.md
      
      * Update README.md
      
      * Update README.md
      
      * update configs and readme
      
      * fix training and eval single gpus and long audios errors
      
      * fix error transcriptions none
      
      * fix transcription null wer
      
      ---------
      
      Co-authored-by: Yoach Lacombe <yoach@huggingface.co>
  2. 08 Aug, 2024 1 commit
  3. 31 Jul, 2024 1 commit
    • Architecture improvements (#65) · 11b209e1
      Yoach Lacombe authored
      * add RoPE
      
      * don't include padding in rope
      
      * possibly use cross-attn for prompt
      
      * fix rope
      
      * fix cross-attn
      
      * fix self-attn
      
      * fix dummy model
      
      * clean-up rope
      
      * first gqa implementation
      
      * fix wer eval
      
      * feat: add flash attention and SDPA
      
      * chore: add README for flash attention
      
      * chore: add benchmark script
      
      * chore: add benchmark attention approach
      
      * multi node and fix wer and fix compile
      
      * Update modeling_parler_tts.py
      
      * fix FA2, SDPA and add cross-attn MHA and attention type forcing
      
      * better cross_attention key values number of heads default + add training arguments for attn implementation
      
      * fix audio padding when torch compile or pad_to_max_length=True
      
      * correct multi node
      
      * make rope faster
      
      * fix encoder sdpa
      
      * fix training with cross attention + with FA2
      
      * use fp32 as default model dtype + fix generation when using FA2 with autocast
      
      * remove redundant passes in generate + clean and fix attentions
      
      * fix edge case in WER evaluation when longform generation
      
      * better multi-node mapping and saving / add eval dataloader num workers
      
      * remove old benchmarks
      
      * faster audio encoding + checkpointing + fix generation step
      
      * better eval + add right padding + fix eval loss compute
      
      * correct README
      
      * correct config docstrings
      
      * remove comment
      
      * make style
      
      ---------
      Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
      Co-authored-by: sang-nguyen-ts <sang.nguyen@trustingsocial.com>
      Co-authored-by: Yoach Lacombe <yoach@huggingface.co>
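Several commits in the architecture-improvements entry above touch rotary position embeddings (adding RoPE, excluding padding from it, making it faster). As background only, the core RoPE operation rotates each adjacent pair of query/key features by an angle that depends on the token's position. The sketch below is illustrative and not taken from the parler_tts code; the name `rope_rotate` and the conventional `base=10000.0` are assumptions.

```python
import math

def rope_rotate(x, pos, base=10000.0):
    """Apply rotary position embedding to one feature vector.

    x: list of floats with even length; consecutive pairs (x[i], x[i+1])
       are rotated together by a frequency that shrinks with i.
    pos: integer position of the token (padding slots would typically be
       skipped so they do not advance `pos`).
    """
    d = len(x)
    out = []
    for i in range(0, d, 2):
        theta = pos / (base ** (i / d))  # per-pair rotation angle
        c, s = math.cos(theta), math.sin(theta)
        x1, x2 = x[i], x[i + 1]
        # 2-D rotation of the pair; preserves the pair's norm
        out.extend([x1 * c - x2 * s, x1 * s + x2 * c])
    return out

# At position 0 every angle is zero, so the rotation is the identity.
print(rope_rotate([1.0, 0.0, 1.0, 0.0], pos=0))  # → [1.0, 0.0, 1.0, 0.0]
```

Because each pair is rotated (not scaled), feature norms are preserved, and the relative angle between two positions depends only on their distance, which is what makes RoPE attention position-relative. The "don't include padding in rope" commit is consistent with computing `pos` over real tokens only.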
  4. 30 Apr, 2024 1 commit
  5. 12 Apr, 2024 1 commit
  6. 10 Apr, 2024 5 commits
  7. 09 Apr, 2024 6 commits
  8. 08 Apr, 2024 5 commits
  9. 05 Apr, 2024 1 commit
  10. 28 Feb, 2024 1 commit
  11. 14 Feb, 2024 2 commits
  12. 13 Feb, 2024 2 commits