1. 11 Mar, 2024 3 commits
    • Add Fill-in-the-middle training objective example - PyTorch (#27464) · 6d67837f
      Tanay Mehta authored
      * add: initial script to train clm fim
      
      * fix: if training model from scratch, new tokens will be added and embeddings resized
      
      * fix: fixed attention_mask errors when generating FIM data
      
      * fix: file formatted using black
      
      * add: run_fim_no_trainer.py and fixed some comments in run_fim.py
      
      * add: added fim examples to the README.md and ran code fixup
      
      * fix: little bug in both fim training scripts
      
      * fix: remove comment from notebook and added a note on fim related params
      
      * fix: minor typo in README
      
      * add: suggested minor changes to README and run_fim.py
      
      * add: gradient_accumulation_steps and gradient_checkpointing args
      
      * add: improved model embedding resizing
      
      * add: pad_to_multiple_of and attn_implementation params
      
      * add: requested minor changes
      
      * add: deepspeed zero compatibility
      
      * add: resize embeddings layer with zero3 support for fim model initialization
    • [`Docs`] fixed minor typo (#29555) · d80c9a34
      j-gc authored
    • [`Mamba doc`] Post merge updates (#29472) · 4f27ee93
      Arthur authored
      * post merge update
      
      * nit
      
      * oups
2. 08 Mar, 2024 13 commits
3. 07 Mar, 2024 9 commits
4. 06 Mar, 2024 13 commits
5. 05 Mar, 2024 2 commits