1. 11 Sep, 2023 2 commits
  2. 09 Sep, 2023 1 commit
  3. 07 Sep, 2023 1 commit
  4. 06 Sep, 2023 1 commit
  5. 05 Sep, 2023 2 commits
  6. 04 Sep, 2023 1 commit
  7. 03 Sep, 2023 1 commit
  8. 30 Aug, 2023 1 commit
  9. 28 Aug, 2023 1 commit
  10. 27 Aug, 2023 1 commit
  11. 26 Aug, 2023 3 commits
  12. 22 Aug, 2023 1 commit
  13. 21 Aug, 2023 1 commit
  14. 19 Aug, 2023 2 commits
  15. 18 Aug, 2023 2 commits
    • support when num_heads is not divisible by world_size; resolves #459 (#461) · bb4cded1
      Xuechen Li authored
      * unequal rank.
      
      * trim.
      
      * enable passing in number of heads for each rank.
      
      * simplify.
      
      * simplify.
      
      * cleanup.
      
      * fix col parallel.
      
      * fix bug with row parallel.
      
      * fit out proj.
      
      * refac.
      
      * fix sharding logic.
      
      * refac sharding.
      
      * refac.
      
      * support multiple of.
      
      * make fn reusable.
      
      * fix bug in dimensions.
      
      * scaffold.
      
      * test uneven heads.
      
      * fix test by adding barrier.
      
      * refac.
      
      * reuse code.
      
      * clean up.
    • [ViT] Minor fix so it runs · a81900d4
      Tri Dao authored
  16. 15 Aug, 2023 1 commit
    • enable loading hf llama checkpoints for training (#446) · 0f7853c6
      Xuechen Li authored
      * prelim.
      
      * add hf conversion fn.
      
      * mlp.
      
      * change name.
      
      * fix bug.
      
      * inverse permute.
      
      * change comment.
      
      * revert style changes.
      
      * fix.
      
      * add doc.
      
      * revert.
      
      * enable load safe.
      
      * fix safe load.
      
      * fix import.
      
      * fix typing-related lints.
      
      * fix ckpt loading logic.
      
      * make single gpu work.
      
      * test with parallel.
      
      * ckpt format.
      
      * enable pretrained state dict.
      
      * remove unused imports.
      
      * remove unused.
      
      * mark idea related.
  17. 29 Jul, 2023 1 commit
  18. 26 Jul, 2023 2 commits
  19. 23 Jul, 2023 2 commits
  20. 22 Jul, 2023 1 commit
  21. 02 Jul, 2023 1 commit
  22. 30 May, 2023 1 commit
  23. 05 May, 2023 1 commit
  24. 21 Apr, 2023 1 commit
  25. 19 Apr, 2023 1 commit
  26. 14 Apr, 2023 1 commit
  27. 31 Mar, 2023 1 commit
  28. 29 Mar, 2023 2 commits
  29. 22 Mar, 2023 1 commit
  30. 23 Jan, 2023 1 commit
  31. 18 Jan, 2023 1 commit