1. 15 Aug, 2024 2 commits
    • Update training guide colab (#108) · 8e465f1b
      Yoach Lacombe authored
      * Update README.md
      
      * Update README.md
      
      * Update README.md
      
      * update configs and readme
      
      * fix training and eval single gpus and long audios errors
      
      * fix error transcriptions none
      
      * fix transcription null WER
      
      * Update README.md
      
      * Update README.md
      
      ---------
      
      Co-authored-by: Yoach Lacombe <yoach@huggingface.co>
    • Update training guide (#102) · 8f5ef3a2
      Yoach Lacombe authored
      * Update README.md
      
      * Update README.md
      
      * Update README.md
      
      * update configs and readme
      
      * fix training and eval single gpus and long audios errors
      
      * fix error transcriptions none
      
      * fix transcription null WER
      
      ---------
      
      Co-authored-by: Yoach Lacombe <yoach@huggingface.co>
  2. 13 Aug, 2024 1 commit
  3. 08 Aug, 2024 1 commit
  4. 07 Aug, 2024 2 commits
  5. 31 Jul, 2024 1 commit
    • Architecture improvements (#65) · 11b209e1
      Yoach Lacombe authored
      * add RoPe
      
      * don't include padding in rope
      
      * possibly use cross-attn for prompt
      
      * fix rope
      
      * fix cross-attn
      
      * fix self-attn
      
      * fix dummy model
      
      * clean-up rope
      
      * first gqa implementation
      
      * fix wer eval
      
      * feat: add flash attention and SDPA
      
      * chore: add README for flash attention
      
      * chore: add benchmark script
      
      * chore: add benchmark attention approach
      
      * multi node and fix wer and fix compile
      
      * Update modeling_parler_tts.py
      
      * fix FA2, SDPA and add cross-attn MHA and attention type forcing
      
      * better cross_attention key values number of heads default + add training arguments for attn implementation
      
      * fix audio padding when torch compile or pad_to_max_length=True
      
      * correct multi node
      
      * make rope faster
      
      * fix encoder sdpa
      
      * fix training with cross attention + with FA2
      
      * use fp32 as default model dtype + fix generation when using FA2 with autocast
      
      * remove redundant passes in generate + clean and fix attentions
      
      * fix edge case in WER evaluation when longform generation
      
      * better multi-node mapping and saving / add eval dataloader num workers
      
      * remove old benchmarks
      
      * faster audio encoding + checkpointing + fix generation step
      
      * better eval + add right padding + fix eval loss compute
      
      * correct README
      
      * correct config docstrings
      
      * remove comment
      
      * make style
      
      ---------
      Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
      Co-authored-by: sang-nguyen-ts <sang.nguyen@trustingsocial.com>
      Co-authored-by: Yoach Lacombe <yoach@huggingface.co>
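Several commits in the entry above ("add RoPe", "fix rope", "make rope faster") concern rotary position embeddings. As a rough illustration of the technique those commits refer to — not the repository's actual implementation — a minimal RoPE application in PyTorch might look like:

```python
import torch

def rotary_embed(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Apply rotary position embeddings to x of shape (batch, seq_len, dim).

    Illustrative sketch only: dim is assumed even, and pairs of channels
    (2i, 2i+1) are rotated by a position-dependent angle.
    """
    _, seq_len, dim = x.shape
    # One frequency per channel pair, decaying geometrically with channel index.
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    positions = torch.arange(seq_len).float()
    angles = torch.outer(positions, inv_freq)        # (seq_len, dim // 2)
    cos, sin = angles.cos(), angles.sin()
    x_even, x_odd = x[..., 0::2], x[..., 1::2]
    out = torch.empty_like(x)
    # 2-D rotation of each (even, odd) channel pair by its angle.
    out[..., 0::2] = x_even * cos - x_odd * sin
    out[..., 1::2] = x_even * sin + x_odd * cos
    return out
```

At position 0 every angle is zero, so the first token's embedding passes through unchanged; later commits in the list ("don't include padding in rope") suggest the real code additionally offsets positions to skip padding tokens.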
  6. 30 May, 2024 2 commits
  7. 23 May, 2024 2 commits
  8. 22 May, 2024 9 commits
  9. 18 May, 2024 1 commit
  10. 14 May, 2024 7 commits
  11. 09 May, 2024 4 commits
  12. 30 Apr, 2024 4 commits
  13. 25 Apr, 2024 2 commits
  14. 24 Apr, 2024 2 commits