1. 08 Aug, 2024 3 commits
  2. 07 Aug, 2024 1 commit
  3. 06 Aug, 2024 3 commits
  4. 05 Aug, 2024 1 commit
    • drbh's avatar
      fix: attempt forward on flash attn2 to check hardware support (#2335) · 215ed3ad
      drbh authored
      * fix: attempt forward on flash attn2 to check hardware support
      
      * fix: warn window_size_left when using flash attn 1
      
      * fix: prefer version check over test op and avoid window_size_left if not flash attn2
      
      * fix: improve condtional and error message
      
      * fix: update sliding window conditional
      
      * fix: simplify changes and revert model changes
      
      * fix: avoid changing conditional
      
      * fix: typo tweak
      215ed3ad
  5. 01 Aug, 2024 2 commits
  6. 31 Jul, 2024 2 commits
  7. 30 Jul, 2024 1 commit
  8. 29 Jul, 2024 2 commits
  9. 26 Jul, 2024 2 commits
    • drbh's avatar
      feat: add ruff and resolve issue (#2262) · bab02ff2
      drbh authored
      * feat: add ruff and resolve issue
      
      * fix: update client exports and adjust after rebase
      
      * fix: adjust syntax to avoid circular import
      
      * fix: adjust client ruff settings
      
      * fix: lint and refactor import check and avoid model enum as global names
      
      * fix: improve fbgemm_gpu check and lints
      
      * fix: update lints
      
      * fix: prefer comparing model enum over str
      
      * fix: adjust lints and ignore specific rules
      
      * fix: avoid unneeded quantize check
      bab02ff2
    • Daniël de Kok's avatar
  10. 25 Jul, 2024 1 commit
  11. 24 Jul, 2024 4 commits
  12. 23 Jul, 2024 8 commits
  13. 22 Jul, 2024 3 commits
  14. 21 Jul, 2024 1 commit
  15. 20 Jul, 2024 1 commit
    • OlivierDehaene's avatar
      feat(fp8): use fbgemm kernels and load fp8 weights directly (#2248) · 53ec0b79
      OlivierDehaene authored
      * feat(fp8): add support for fbgemm
      
      * allow loading fp8 weights directly
      
      * update outlines
      
      * fix makefile
      
      * build fbgemm
      
      * avoid circular import and fix dockerfile
      
      * add default dtype
      
      * refactored weights loader
      
      * fix auto conversion
      
      * fix quantization config parsing
      
      * force new nccl on install
      
      * missing get_weights implementation
      
      * increase timeout
      53ec0b79
  16. 19 Jul, 2024 5 commits