Commits · 9b3399522d228c61a607701d638ac24e6a0d9eed · OpenDAS / bitsandbytes

27 Mar, 2025 1 commit
- Bump CUDA 12.8.0 build to CUDA 12.8.1 (#1575) · 9b339952
  Matthew Douglas authored Mar 27, 2025
  
25 Mar, 2025 3 commits

Bump dev version · b86ff64b
Matthew Douglas authored Mar 25, 2025


PyTorch Custom Operator Integration (#1544) · e82f72b3

Matthew Douglas authored Mar 25, 2025



* Sketch out first custom op registration

* Add note

* Initial int8 op registration

* Cleanup some deprecated functions.

* Int8 ops updates; tests

* Implement 4bit quant/dequant ops

* Fix nested quant

* cleanup

* Test improvements

* Clean up and improve tests

* Add higher level custom op for int8 matmul + dequant + bias

* Add gemv 4bit custom op

* Cleanup

* Implement out kwarg overloads for custom ops

* Update PyTorch minimum to 2.1

* Deprecation updates

* Deprecation updates

* Cleanup; rename int8_linear_dequant -> int8_scaled_mm

* Bump min pytorch to 2.2

* cleanup

* Test reorganization

* Remove deprecated supports_igemmlt

* More cleanup

* Cleanup obsolete C++/CUDA code

* Cleanup

* Create 'default' backend for fallback op implementations; initial CPU nf4 work

* Stub out for multi-platform

* Fix serialization tests for torch>=2.6.0

* Add example for torch.compile e2e inference

* Test update

---------
Co-authored-by: Titus von Koeller <9048635+Titus-von-Koeller@users.noreply.github.com>


Release 0.45.4 · f0735f95
Matthew Douglas authored Mar 25, 2025


19 Mar, 2025 1 commit
- docs: fix typo · e1f515cd
  Titus authored Mar 19, 2025
  
13 Mar, 2025 1 commit
- disable dependabot for now until CI revamp · e772a9e8
  Titus authored Mar 13, 2025
  
07 Mar, 2025 1 commit
- Fix CPU dequantization to use nested dequantized scaling constant (#1549) · d8d157f4
  Ethan Kiang authored Mar 07, 2025
  
25 Feb, 2025 2 commits

Build: use ubuntu-22.04 instead of 24.04 for CPU build (glibc compat) (#1538) · b8223fed
Matthew Douglas authored Feb 25, 2025


Bump ruff in the minor-patch group across 1 directory (#1507) · 5d468883

dependabot[bot] authored Feb 25, 2025

Bumps the minor-patch group with 1 update in the / directory: [ruff](https://github.com/astral-sh/ruff).


Updates `ruff` from 0.6.9 to 0.9.6
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/ruff/compare/0.6.9...0.9.6)

---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: minor-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>


24 Feb, 2025 4 commits
- Bump dev version · ec2e5ad2
  Matthew Douglas authored Feb 24, 2025
  
- Release 0.45.3 · efc14c19
  Matthew Douglas authored Feb 24, 2025
  
- Update build-cuda.sh · fc6d8b24
  Matthew Douglas authored Feb 24, 2025
  
- Update build-cuda.sh · e4a9a94c
  Matthew Douglas authored Feb 24, 2025
  
20 Feb, 2025 1 commit
- deprecate alpha release notif (prepping merge to main) · 12cc3fb4
  Titus authored Feb 20, 2025
  
19 Feb, 2025 4 commits
- Installation doc updates (#1529) · 2ce3ed9e
  Matthew Douglas authored Feb 19, 2025
  
- QuantState.to(): move code tensor with others to correct device, fixes #1527 (#1528) · e4fe1cfb
  Matthew Douglas authored Feb 19, 2025
  
- Update cuda versions in error messages (#1520) · cb3adb03
  Fx Morin authored Feb 19, 2025
  
- Update create_dynamic_map to always return a float32 tensor (#1521) · 8ed7d97b
  Mitchell Goff authored Feb 18, 2025
  
06 Feb, 2025 6 commits
- Bump dev version · 86b6c37a
  Matthew Douglas authored Feb 06, 2025
  
- Merge branch 'patch/0.45.x' · bf244839
  Matthew Douglas authored Feb 06, 2025
  
- Release 0.45.2 · 7aec4a88
  Matthew Douglas authored Feb 06, 2025
  
- Improve guard for triton compatibility (#1497) · 507a4afc
  Matthew Douglas authored Feb 06, 2025
  
- Fix #1490 (#1496) · 6de46d99
  Matthew Douglas authored Jan 28, 2025
  
- Improve guard for triton compatibility (#1497) · 66da99aa
  Matthew Douglas authored Feb 06, 2025
  
28 Jan, 2025 2 commits
- Blackwell binaries! (#1491) · f3e8cbb2
  Johnny authored Jan 28, 2025
```
* blackwell

* blackwell

* Update python-package.yml
```
- Fix #1490 (#1496) · 410e79dc
  Matthew Douglas authored Jan 28, 2025
  
23 Jan, 2025 3 commits
- Release v0.45.1 · 8cd7793b
  Matthew Douglas authored Jan 23, 2025
  
- Exclude tests from distribution (#1486) · d6781bc6
  Aarni Koskela authored Jan 23, 2025
```
* Exclude tests from distribution

Fixes #1478

* Update pyproject.toml

* Update pyproject.toml

---------
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
```
- (build) include Ada/Hopper targets in cu118 build (#1487) · b4172770
  Matthew Douglas authored Jan 23, 2025
  
22 Jan, 2025 1 commit

Initial support blackwell (#1481) · db90effe

Johnny authored Jan 22, 2025



* initial support blackwell

* Update CHANGELOG.md
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>

* Update CMakeLists.txt

* Update CHANGELOG.md

* fix build-cuda.sh

* fix build-cuda.sh

* fix cuda 12.7 build-cuda.sh

* Update build-cuda.sh

* Update cuda from 12.6.2 to 12.6.3

* Update .github/workflows/python-package.yml
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>

* Update install_cuda.py
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>

* Update install_cuda.sh
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>

* Update .github/scripts/build-cuda.sh

* Update install_cuda.sh

---------
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>


14 Jan, 2025 3 commits
- doc updates (#1471) · a9cfd1b4
  Benjamin Badger authored Jan 14, 2025
```
Co-authored-by: Titus <9048635+Titus-von-Koeller@users.noreply.github.com>
```
- (Deps) Require torch 2.x and minor updates (#1459) · bcc052fd
  Matthew Douglas authored Jan 14, 2025
  
- cleanup: remove unused kernels/C++ code (#1458) · 58922237
  Matthew Douglas authored Jan 14, 2025
```
* (chore) Remove unused dotfiles

* cleanup: remove unused kernels/C++ code
```
17 Dec, 2024 2 commits

chore: migrate config files to `pyproject.toml` (#1373) · 5b015890

Saurav Maheshkar authored Dec 17, 2024



* chore: move configs to pyproject.toml

* fix: drop file from CI workflow

* feat: reorder pytest markers

* chore: retain comments

* chore(build): migrate build data to pyproject
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Aarni Koskela <akx@iki.fi>

* chore: move configs to pyproject.toml

* Apply suggestions from code review
Co-authored-by: Aarni Koskela <akx@iki.fi>

* bump ruff

---------
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
Co-authored-by: Aarni Koskela <akx@iki.fi>


Remove triton.ops, copy necessary bits here (#1413) · 032beb95

Bert Maher authored Dec 17, 2024

* Remove triton.ops, copy necessary bits here

Summary: Triton upstream removed `triton.ops` and moved it to a
semi-unmaintained `kernels` repo.  Since all that's needed here is the perf
model, just add those bits here.

* Add source reference/license comment


11 Dec, 2024 1 commit
- (chore) Remove unused dotfiles (#1445) · a676f6ed
  Matthew Douglas authored Dec 11, 2024
  
10 Dec, 2024 2 commits
- Bump version to 0.45.1.dev0 · bf58ad13
  Matthew Douglas authored Dec 10, 2024
  
- Add installation docs for Ascend NPU (#1442) · 55016dad
  Huazhong Ji authored Dec 11, 2024
  
05 Dec, 2024 2 commits

Release 0.45.0 · 64d382da
Matthew Douglas authored Dec 05, 2024


LLM.int8() Refactoring: Part 1 (#1401) · 81e6345d

Matthew Douglas authored Dec 05, 2024



* Start of int8 refactor: remove col32/col_ampere/col_turing transforms in new igemmlt implementation

* Fix unintended change

* New naive mm_dequant kernel for row-major; cleanup

* fix

* int8 refactor: initial sparse decomp, cleanup

* Int8 refactoring: remove separate NO_CUBLASLT build; more cleanup

* int8: inference optimizations, some cleanup

* int8: more tests passing, cleanup

* int8 - more cleanup, most tests passing

* int8: specify CUDA stream for int8 ops

* perf: reduce overhead from getting cudaStream ptr

* Mark some functions for deprecation.

* int8 sparse decomp: small perf improvement

* update setup.py

* Update bitsandbytes/autograd/_functions.py
Co-authored-by: Aarni Koskela <akx@iki.fi>

* Update bitsandbytes/functional.py
Co-authored-by: Aarni Koskela <akx@iki.fi>

* Update bitsandbytes/functional.py
Co-authored-by: Aarni Koskela <akx@iki.fi>

* Update bitsandbytes/research/autograd/_functions.py
Co-authored-by: Aarni Koskela <akx@iki.fi>

* int8 - perf improvement for sparse decomposition inference; deprecate get_tensor_stream() in favor of new private fn

* int8 cleanup

* Ignore ruff rule ISC001 (incompatible with formatter)

* add comment

* int8 more cleanup

* Update bitsandbytes/functional.py
Co-authored-by: Aarni Koskela <akx@iki.fi>

* int8: rename / deprecate old fn signatures

* Update bitsandbytes/functional.py
Co-authored-by: Aarni Koskela <akx@iki.fi>

* type annotation

* format update

* Update bitsandbytes/research/autograd/_functions.py
Co-authored-by: Aarni Koskela <akx@iki.fi>

* cleanup

* Add comment to explain division optimization

* more cleanup

* Update bitsandbytes/functional.py
Co-authored-by: Aarni Koskela <akx@iki.fi>

* Update bitsandbytes/functional.py
Co-authored-by: Aarni Koskela <akx@iki.fi>

* Update bitsandbytes/functional.py
Co-authored-by: Aarni Koskela <akx@iki.fi>

* cleanup

* Type annotations, cleanup

* remove unused kernels; improved type annotations

* small perf optimization for single-GPU systems

* small perf optimization for single-GPU systems

* update docstrings

* Improve docs and tests

* Update docstring

* Update test

* add benchmarking script

* test cleanup: add deprecated marker, move benchmarks out

* Add int8 dequant function; misc improvements

* int8 matmul fallback for inner dims not divisible by 4

* improve register usage of kInt8VectorQuant - especially for A100/H100

* disable fail-fast for package build

* maxwell compat

* ptxas verbose

* docs update

* doc update

* backward fix

* Bugfix sparse decomp

* Int8 fix for PEFT OLoRA init

* Fix test for deprecated spmm_coo

* test improvement

* doc update

* typo

* doc cleanup

* docs

* add inference benchmark script

* Add benchmarks, doc update

---------
Co-authored-by: Aarni Koskela <akx@iki.fi>
