1. 06 Feb, 2025 2 commits
  2. 23 Jan, 2025 3 commits
  3. 22 Jan, 2025 1 commit
  4. 14 Jan, 2025 3 commits
  5. 17 Dec, 2024 2 commits
  6. 11 Dec, 2024 1 commit
  7. 10 Dec, 2024 2 commits
  8. 05 Dec, 2024 2 commits
    • Release 0.45.0 · 64d382da
      Matthew Douglas authored
    • LLM.int8() Refactoring: Part 1 (#1401) · 81e6345d
      Matthew Douglas authored
      
      
      * Start of int8 refactor: remove col32/col_ampere/col_turing transforms in new igemmlt implementation
      
      * Fix unintended change
      
      * New naive mm_dequant kernel for row-major; cleanup
      
      * fix
      
      * int8 refactor: initial sparse decomp, cleanup
      
      * Int8 refactoring: remove separate NO_CUBLASLT build; more cleanup
      
      * int8: inference optimizations, some cleanup
      
      * int8: more tests passing, cleanup
      
      * int8 - more cleanup, most tests passing
      
      * int8: specify CUDA stream for int8 ops
      
      * perf: reduce overhead from getting cudaStream ptr
      
      * Mark some functions for deprecation.
      
      * int8 sparse decomp: small perf improvement
      
      * update setup.py
      
      * Update bitsandbytes/autograd/_functions.py
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Update bitsandbytes/functional.py
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Update bitsandbytes/functional.py
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Update bitsandbytes/research/autograd/_functions.py
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * int8 - perf improvement for sparse decomposition inference; deprecate get_tensor_stream() in favor of new private fn
      
      * int8 cleanup
      
      * Ignore ruff rule ISC001 (incompatible with formatter)
      
      * add comment
      
      * int8 more cleanup
      
      * Update bitsandbytes/functional.py
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * int8: rename / deprecate old fn signatures
      
      * Update bitsandbytes/functional.py
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * type annotation
      
      * format update
      
      * Update bitsandbytes/research/autograd/_functions.py
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * cleanup
      
      * Add comment to explain division optimization
      
      * more cleanup
      
      * Update bitsandbytes/functional.py
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Update bitsandbytes/functional.py
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Update bitsandbytes/functional.py
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * cleanup
      
      * Type annotations, cleanup
      
      * remove unused kernels; improved type annotations
      
      * small perf optimization for single-GPU systems
      
      * small perf optimization for single-GPU systems
      
      * update docstrings
      
      * Improve docs and tests
      
      * Update docstring
      
      * Update test
      
      * add benchmarking script
      
      * test cleanup: add deprecated marker, move benchmarks out
      
      * Add int8 dequant function; misc improvements
      
      * int8 matmul fallback for inner dims not divisible by 4
      
      * improve register usage of kInt8VectorQuant - especially for A100/H100
      
      * disable fail-fast for package build
      
      * maxwell compat
      
      * ptxas verbose
      
      * docs update
      
      * doc update
      
      * backward fix
      
      * Bugfix sparse decomp
      
      * Int8 fix for PEFT OLoRA init
      
      * Fix test for deprecated spmm_coo
      
      * test improvement
      
      * doc update
      
      * typo
      
      * doc cleanup
      
      * docs
      
      * add inference benchmark script
      
      * Add benchmarks, doc update
      
      ---------
      Co-authored-by: Aarni Koskela <akx@iki.fi>
  9. 02 Dec, 2024 1 commit
  10. 19 Nov, 2024 1 commit
  11. 14 Nov, 2024 1 commit
  12. 23 Oct, 2024 1 commit
  13. 16 Oct, 2024 1 commit
    • Remove depth option in installation steps (#1395) · c8f2769b
      pnunna93 authored
      
      
      * Add build job for rocm
      
      * Add rocm build script
      
      * Copy shared obj file into output_dir
      
      * upload build artifacts and enable wheels build
      
      * Remove cuda build temporarily
      
      * Add ROCm version to .so filename
      
      * Add rocm_version to whls build
      
      * Revert "Remove cuda build temporarily"
      
      This reverts commit 1413c5f3a2aed51140b86daa8ee9283c67cce738.
      
      * Add rocm_version env var
      
      * Remove thrust header files
      
      * Print node info
      
      * print cuda node info
      
      * Revert "print cuda node info"
      
      This reverts commit cdb209a2eb896d9c4166f53e9b2aa580c10e42c0.
      
      * Revert "Print node info"
      
      This reverts commit 7e9a65c33f66fffcb14ee2438170718777c06022.
      
      * Add rocm arch to compile command
      
      * Rename .so files to rocm
      
      * Update default gpu arch
      
      * Skip cpu based igemmlt int tests on ROCm
      
      * Update Documentation
      
      * Update upstream repo name
      
      * Update docs
      
      * Update string format
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Remove pre-release option for torch install
      
      * Update pytorch install path
      Co-authored-by: Titus <9048635+Titus-von-Koeller@users.noreply.github.com>
      
      * Add messages for Heuristics error
      
      * Remove toolcache for disk space
      
      * print disk usage
      
      * Clean disk space for linux
      
      * Fix for ubuntu
      
      * Add sudo for apt clean
      
      * Update clean up disk list
      
      * remove disk usage print
      
      * Add BNB_BACKEND variable
      
      * Update diagnostic functions for ROCm
      
      * Fix tuple error
      
      * Fix library detection bug for recursive and symlink cases
      
      * fix pre-commit errors
      
      * Remove recursive path lib search
      
      * Create function for runtime lib patterns
      
      * Update logger format
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Update error reporting
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Remove commented code
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Update error reporting
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Update error reporting
      
      * Create hip diagnostics functions
      
      * Fix Typo
      
      * Fix pre-commit checks
      
      * Enable 6.2 build
      
      * Skip gemv 4 bit cpu test
      
      * Update documentation for 6.2.0 pip install
      
      * Update README for default branch change
      
      * Fix typo
      
      * Sync README with upstream
      
      * Remove depth
      
      ---------
      Co-authored-by: Aarni Koskela <akx@iki.fi>
      Co-authored-by: Titus <9048635+Titus-von-Koeller@users.noreply.github.com>
      Co-authored-by: Aswin John Mathews <81309834+amathews-amd@users.noreply.github.com>
      Co-authored-by: root <root@banff-cyxtera-s78-4.ctr.dcgpu>
  14. 14 Oct, 2024 1 commit
  15. 01 Oct, 2024 2 commits
  16. 30 Sep, 2024 7 commits
  17. 27 Sep, 2024 1 commit
  18. 24 Sep, 2024 3 commits
  19. 21 Sep, 2024 1 commit
  20. 20 Sep, 2024 2 commits
  21. 19 Sep, 2024 2 commits