1. 22 Apr, 2025 1 commit
    • Matthew Douglas's avatar
      Updates for device agnosticism (#1601) · 1088ec52
      Matthew Douglas authored
      * Include device support tags for transformers multi-backend compatability; add xpu() and cpu() to Params4bit
      
      * Make test suite more device-agnostic
      
      * Additional device agnostic tests
      
      * Additional device agnosticism for tests
      
      * Add BNB_TEST_DEVICE env var to manually select device for unit tests
      
      * Include device support tags for transformers multi-backend compatability; add xpu() and cpu() to Params4bit
      
      * Make test suite more device-agnostic
      
      * Additional device agnostic tests
      
      * Additional device agnosticism for tests
      
      * Add BNB_TEST_DEVICE env var to manually select device for unit tests
      
      * Small bugfix for int8 test
      
      * Exclude backward() from code coverage reports
      
      * Params4bit: don't try to quantize when moving to meta device
      1088ec52
  2. 05 Dec, 2024 1 commit
    • Matthew Douglas's avatar
      LLM.int8() Refactoring: Part 1 (#1401) · 81e6345d
      Matthew Douglas authored
      
      
      * Start of int8 refactor: remove col32/col_ampere/col_turing transforms in new igemmlt implementation
      
      * Fix unintended change
      
      * New naive mm_dequant kernel for row-major; cleanup
      
      * fix
      
      * int8 refactor: initial sparse decomp, cleanup
      
      * Int8 refactoring: remove separate NO_CUBLASLT build; more cleanup
      
      * int8: inference optimizations, some cleanup
      
      * int8: more tests passing, cleanup
      
      * int8 - more cleanup, most tests passing
      
      * int8: specify CUDA stream for int8 ops
      
      * perf: reduce overhead from getting cudaStream ptr
      
      * Mark some functions for deprecation.
      
      * int8 sparse decomp: small perf improvement
      
      * update setup.py
      
      * Update bitsandbytes/autograd/_functions.py
      Co-authored-by: default avatarAarni Koskela <akx@iki.fi>
      
      * Update bitsandbytes/functional.py
      Co-authored-by: default avatarAarni Koskela <akx@iki.fi>
      
      * Update bitsandbytes/functional.py
      Co-authored-by: default avatarAarni Koskela <akx@iki.fi>
      
      * Update bitsandbytes/research/autograd/_functions.py
      Co-authored-by: default avatarAarni Koskela <akx@iki.fi>
      
      * int8 - perf improvement for sparse decomposition inference; deprecate get_tensor_stream() in favor of new private fn
      
      * int8 cleanup
      
      * Ignore ruff rule ISC001 (incompatible with formatter)
      
      * add comment
      
      * int8 more cleanup
      
      * Update bitsandbytes/functional.py
      Co-authored-by: default avatarAarni Koskela <akx@iki.fi>
      
      * int8: rename / deprecate old fn signatures
      
      * Update bitsandbytes/functional.py
      Co-authored-by: default avatarAarni Koskela <akx@iki.fi>
      
      * type annotation
      
      * format update
      
      * Update bitsandbytes/research/autograd/_functions.py
      Co-authored-by: default avatarAarni Koskela <akx@iki.fi>
      
      * cleanup
      
      * Add comment to explain division optimization
      
      * more cleanup
      
      * Update bitsandbytes/functional.py
      Co-authored-by: default avatarAarni Koskela <akx@iki.fi>
      
      * Update bitsandbytes/functional.py
      Co-authored-by: default avatarAarni Koskela <akx@iki.fi>
      
      * Update bitsandbytes/functional.py
      Co-authored-by: default avatarAarni Koskela <akx@iki.fi>
      
      * cleanup
      
      * Type annotations, cleanup
      
      * remove unused kernels; improved type annotations
      
      * small perf optimization for single-GPU systems
      
      * small perf optimization for single-GPU systems
      
      * update docstrings
      
      * Improve docs and tests
      
      * Update docstring
      
      * Update test
      
      * add benchmarking script
      
      * test cleanup: add deprecated marker, move benchmarks out
      
      * Add int8 dequant function; misc improvements
      
      * int8 matmul fallback for inner dims not divisible by 4
      
      * improve register usage of kInt8VectorQuant - especially for A100/H100
      
      * disable fail-fast for package build
      
      * maxwell compat
      
      * ptxas verbose
      
      * docs update
      
      * doc update
      
      * backward fix
      
      * Bugfix sparse decomp
      
      * Int8 fix for PEFT OLoRA init
      
      * Fix test for deprecated spmm_coo
      
      * test improvement
      
      * doc update
      
      * typo
      
      * doc cleanup
      
      * docs
      
      * add inference benchmark script
      
      * Add benchmarks, doc update
      
      ---------
      Co-authored-by: default avatarAarni Koskela <akx@iki.fi>
      81e6345d
  3. 06 Aug, 2024 1 commit
  4. 15 Jul, 2024 1 commit
  5. 13 Mar, 2024 1 commit
  6. 21 Feb, 2024 1 commit
  7. 05 Feb, 2024 1 commit
  8. 01 Feb, 2024 2 commits
    • Aarni Koskela's avatar
      6974920b
    • Aarni Koskela's avatar
      Test improvements (#1001) · 2336a45c
      Aarni Koskela authored
      * test_nvidia_transform: fix variable reference
      
      `out_order` is the global parametrization list, not the test fixture argument
      
      * Make `parametrize` use more idiomatic
      
      * Use a more deterministic helper for `dim*` determination
      
      * Convert NO_CUBLASLT errors into skips too
      
      * Mark slow and benchmark tests as such (allows `-k "not benchmark"`)
      2336a45c
  9. 30 Jan, 2024 1 commit
    • Aarni Koskela's avatar
      Ruff fixes (#984) · 706ec24d
      Aarni Koskela authored
      
      
      * Adjust Ruff configuration
      
      * do not autofix always
      * be less strict around tests and benchmarks
      * adjust ignores for now
      
      * Ruff: autofix I and F401
      
      * Apply ruff autofixes
      
      * Fix RUF013 complaint
      
      * Fix mutable default in replace_linear
      
      * Don't use bare except
      
      * Wrap bitsandbytes.__main__ entrypoint in function; fix "sensible" typo
      
      * Fix ruff B008 (function call in arguments)
      
      * Add ruff noqas as suitable
      
      * Fix RUF005 (splat instead of concatenating)
      
      * Fix B018 (useless expression)
      
      * Add pre-commit configuration + GitHub Actions lint workflow
      
      * Fix unused `e` in bitsandbytes/__main__.py
      
      * fix merge conflict resolution error
      
      * run pre-commit hook
      
      ---------
      Co-authored-by: default avatarTitus <9048635+Titus-von-Koeller@users.noreply.github.com>
      706ec24d
  10. 24 Jan, 2024 1 commit
  11. 08 Jan, 2024 1 commit
  12. 22 Jul, 2023 1 commit
  13. 10 Jul, 2023 1 commit
  14. 07 May, 2023 1 commit
  15. 12 Apr, 2023 1 commit
  16. 04 Apr, 2023 1 commit
  17. 03 Apr, 2023 1 commit
  18. 01 Apr, 2023 1 commit
  19. 14 Feb, 2023 1 commit
  20. 05 Feb, 2023 2 commits
  21. 02 Feb, 2023 1 commit
  22. 27 Oct, 2022 1 commit
  23. 24 Oct, 2022 1 commit
  24. 20 Sep, 2022 2 commits
  25. 17 Sep, 2022 12 commits
  26. 17 Aug, 2022 1 commit