  1. 27 Mar, 2025 1 commit
  2. 25 Mar, 2025 3 commits
    • Bump dev version · b86ff64b
      Matthew Douglas authored
    • PyTorch Custom Operator Integration (#1544) · e82f72b3
      Matthew Douglas authored
      
      
      * Sketch out first custom op registration
      
      * Add note
      
      * Initial int8 op registration
      
      * Cleanup some deprecated functions.
      
      * Int8 ops updates; tests
      
      * Implement 4bit quant/dequant ops
      
      * Fix nested quant
      
      * cleanup
      
      * Test improvements
      
      * Clean up and improve tests
      
      * Add higher level custom op for int8 matmul + dequant + bias
      
      * Add gemv 4bit custom op
      
      * Cleanup
      
      * Implement out kwarg overloads for custom ops
      
      * Update PyTorch minimum to 2.1
      
      * Deprecation updates
      
      * Deprecation updates
      
      * Cleanup; rename int8_linear_dequant -> int8_scaled_mm
      
      * Bump min pytorch to 2.2
      
      * cleanup
      
      * Test reorganization
      
      * Remove deprecated supports_igemmlt
      
      * More cleanup
      
      * Cleanup obsolete C++/CUDA code
      
      * Cleanup
      
      * Create 'default' backend for fallback op implementations; initial CPU nf4 work
      
      * Stub out for multi-platform
      
      * Fix serialization tests for torch>=2.6.0
      
      * Add example for torch.compile e2e inference
      
      * Test update
      
      ---------
      Co-authored-by: Titus von Koeller <9048635+Titus-von-Koeller@users.noreply.github.com>
    • Release 0.45.4 · f0735f95
      Matthew Douglas authored
  3. 19 Mar, 2025 1 commit
  4. 13 Mar, 2025 1 commit
  5. 07 Mar, 2025 1 commit
  6. 25 Feb, 2025 2 commits
  7. 24 Feb, 2025 4 commits
  8. 20 Feb, 2025 1 commit
  9. 19 Feb, 2025 4 commits
  10. 06 Feb, 2025 6 commits
  11. 28 Jan, 2025 2 commits
  12. 23 Jan, 2025 3 commits
  13. 22 Jan, 2025 1 commit
  14. 14 Jan, 2025 3 commits
  15. 17 Dec, 2024 2 commits
  16. 11 Dec, 2024 1 commit
  17. 10 Dec, 2024 2 commits
  18. 05 Dec, 2024 2 commits
    • Release 0.45.0 · 64d382da
      Matthew Douglas authored
    • LLM.int8() Refactoring: Part 1 (#1401) · 81e6345d
      Matthew Douglas authored
      
      
      * Start of int8 refactor: remove col32/col_ampere/col_turing transforms in new igemmlt implementation
      
      * Fix unintended change
      
      * New naive mm_dequant kernel for row-major; cleanup
      
      * fix
      
      * int8 refactor: initial sparse decomp, cleanup
      
      * Int8 refactoring: remove separate NO_CUBLASLT build; more cleanup
      
      * int8: inference optimizations, some cleanup
      
      * int8: more tests passing, cleanup
      
      * int8 - more cleanup, most tests passing
      
      * int8: specify CUDA stream for int8 ops
      
      * perf: reduce overhead from getting cudaStream ptr
      
      * Mark some functions for deprecation.
      
      * int8 sparse decomp: small perf improvement
      
      * update setup.py
      
      * Update bitsandbytes/autograd/_functions.py
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Update bitsandbytes/functional.py
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Update bitsandbytes/functional.py
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Update bitsandbytes/research/autograd/_functions.py
      ...