Commits · 3b89a05e22074cd36d230c8c905c4f263b1e0871 · OpenDAS / bitsandbytes

14 Jul, 2025 2 commits
- Add 32bit optimizer interface · 3b89a05e
  Egor Krivov authored Jul 14, 2025
- enabled tests · abf4a1e3
  Egor Krivov authored Jul 14, 2025
11 Jul, 2025 2 commits
- Fixed bugs · 35ce337b
  Egor Krivov authored Jul 11, 2025
- Add interface for 8bit optimizer · b43edf56
  Egor Krivov authored Jul 11, 2025
08 Jul, 2025 2 commits
- Merge pull request #1696 from Egor-Krivov/egor/inf_benchmark · adc7fda7
  Matthew Douglas authored Jul 08, 2025
```
[XPU] Add inference benchmark for XPU
```
- Merge pull request #1697 from jiqing-feng/log · ee017367
  Matthew Douglas authored Jul 08, 2025
```
fix log
```
03 Jul, 2025 1 commit
- fix log · ea4b59f3
  jiqing-feng authored Jul 03, 2025
```
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
```
02 Jul, 2025 1 commit
- Added inference benchmark · 32786145
  Egor Krivov authored Jul 02, 2025
01 Jul, 2025 4 commits

- Automatically call CMake as part of PEP 517 build (#1512) · ed9c8fca
  Michał Górny authored Jul 01, 2025

* Automatically call CMake as part of PEP 517 build

Call CMake and build the CPU extension when invoking the build
via a PEP 517 backend, to ensure that at least some extension is built
when users are building from source.  This improves consistency with
other Python packages, and reduces the risk of accidents.

We are using `scikit-build-core` setuptools plugin to take care of CMake
dependencies and call into CMake.  However, we need to modify
the `build_py` command to ensure that CMake is called prior to
the setuptools command, as otherwise the newly built shared library
won't be picked up by `build_py`.

Since setuptools is still responsible for collecting the Python package,
it also collects all other shared libraries that were built earlier,
for example via manual CMake calls as done in the CI pipeline.
Furthermore, if the user does not have `scikit-build-core` installed
and calls `setup.py` directly, we output a warning but continue working
as before.

The logic can be further extended in the future, for example to detect
the best COMPUTE_BACKEND default.

Fixes #1511

* Include C sources and build files in source distribution

* Fix formatting


- CI: Test with PyTorch 2.8.0 RC (#1693) · ed398d28
  Matthew Douglas authored Jul 01, 2025
```
* Add torch 2.8 rc / 2.9 nightly to tests

* Update tests.yml

* Update tests.yml
```
- Update README.md · e28d4d91
  Matthew Douglas authored Jul 01, 2025
- fix triton kernel on the correct device (#1691) · bdcee0ff
  jiqing-feng authored Jul 01, 2025
```
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
```

30 Jun, 2025 1 commit
- Temporarily disable HPU tests · 6d0a5cd2
  Matthew Douglas authored Jun 30, 2025
27 Jun, 2025 1 commit

- Add CUDA 12.9 build (#1689) · 1abd5e78
  Matthew Douglas authored Jun 27, 2025

* Add CUDA 12.9 to build/test workflows

* Downgrade Jimver/cuda-toolkit to v0.2.24

* Update python-package.yml

* Update python-package.yml

* Update python-package.yml

* Update tests.yml

* Update tests.yml


24 Jun, 2025 1 commit
- Make minor improvements to optimizer.py (#1687) · aca9778e
  Aman Gupta authored Jun 24, 2025
23 Jun, 2025 1 commit
- Fix AdamW documentation (#1686) · fd2949ab
  Aman Gupta authored Jun 23, 2025
20 Jun, 2025 1 commit

- Enable ROCm backend with custom ops integration (#1683) · 888788d7
  pnunna93 authored Jun 20, 2025

* Port ROCm changes from multi-backend-refactor branch

* Update ops.py

* Update functional.py

* Update ops.py

* Update ops.py

* Update ops.py

* Update ops.py

* Update functional.py

* Update ops.py

* Update ops.py

* Update ops.py

* Update ops.py

* Update functional.py

* Update functional.py

* Update functional.py

* Update functional.py

* Update ops.py

* Update ops.py

* Update ops.py

* Update ops.py

* Update ops.py

* Update ops.py

* Update ops.py

* Update ops.py

* Update ops.py

* Update functional.py

* Update functional.py

* Update functional.py

* Update test_ops.py

* Update test_functional.py

* Update test_ops.py

* Update test_functional.py

* Update test_functional.py

* Update functional.py

* Update functional.py

* Update ops.py

* Update ops.py

* Update test_functional.py

* Update test_functional.py

* Update cextension.py

* Update cuda_specs.py

* Update cuda_specs.py

* Update test_functional.py

* Update test_linear4bit.py

* Update test_cuda_setup_evaluator.py

* Update test_functional.py

* Update modules.py

* Update modules.py

* Update ops.py

* Update test_linear4bit.py

* Update ops.py

* Update ops.py

* Update test_linear4bit.py

* Update test_linear4bit.py

* Update python-package.yml

* Update python-package.yml

* Update python-package.yml

* Update python-package.yml

* Create build-rocm.sh

* Update cuda_specs.py

* Fix trailing whitespace

* Remove conflicts.diff

* update for hipblasVersionMajor >=3

* Update test_functional.py

* Update test_linear4bit.py

* Update test_ops.py

* Update main.py

* Update test_functional.py

* Update test_linear4bit.py

* Update test_ops.py

* Update test_linear4bit.py

* Lint

* Lint

* Update helpers.py

* Update test_functional.py

* Update test_linear4bit.py

* Update test_ops.py

* Lint

* Update pythonInterface.cpp

* lint fix

* lint

* Update pythonInterface.cpp

* revert permissions change

* Fix indentation

* Update kernels_hip.cuh

* Update kernels.hip

* Update ops.hip

* Update ops_hip.cuh

* Update kernels_hip.cuh

* Update kernels.hip

* Update kernels.hip

* Update ops.hip

* Update ops_hip.cuh

* Update ops.hip

* Update CMakeLists.txt

* Update functional.py

* Update cextension.py

* Update cextension.py

---------
Co-authored-by: MISHANMAURYA <118961433+MISHANMAURYA@users.noreply.github.com>
Co-authored-by: MISHANMAUYRA <mishanmaurya31081@gmail.com>
Co-authored-by: amcamd <andrew.chapman@amd.com>
Co-authored-by: Prasanth Nunna <root@banff-cyxtera-s78-1.amd.com>


19 Jun, 2025 1 commit
- Update README.md (#1684) · a1cd3f6e
  Matthew Douglas authored Jun 19, 2025
18 Jun, 2025 1 commit
- Update unit tests for HPU (#1682) · 31034b4f
  Chetan Kumar Verma authored Jun 18, 2025
17 Jun, 2025 1 commit

- CI: Setup HPU nightly tests (#1681) · 29564ad6
  Matthew Douglas authored Jun 17, 2025

* Setup XPU CI

* CI: expand XPU matrix

* test

* test

* test

* test

* test

* test

* test

* test

* test

* test

* skip some fp4 tests on hpu

* skip some fp4 tests on hpu

* skip gemv tests on hpu

* test

* Additional test patches for HPU

* HPU test update

* HPU test update

* HPU test update

* HPU test update

* Format


16 Jun, 2025 1 commit
- HPU support for unit tests (#1680) · 70bbbb92
  Chetan Kumar Verma authored Jun 16, 2025
13 Jun, 2025 3 commits
- Add clang-format (#1677) · d863adb2
  Matthew Douglas authored Jun 13, 2025
```
* Add clang-format rules

* Update clang-format
```
- Update .git-blame-ignore-revs · 6bd94c27
  Matthew Douglas authored Jun 13, 2025
- Apply clang-format rules (#1678) · 4955d136
  Matthew Douglas authored Jun 13, 2025
11 Jun, 2025 3 commits

- doc fix signature for 8-bit optim (#1660) · 61db0859
  वेदांत authored Jun 11, 2025
```
* doc fix signature for 8-bit optim

* required changes

* precommit
```
- Fixed a bug in test_fw_bit_quant (#1675) · df73d3e1
  Egor authored Jun 11, 2025

- [Triton/XPU] Support 4bit dequantization logic on Triton (#1629) · a23026c8
  Dmitrii Makarenko authored Jun 11, 2025

* [xpu/triton] Add Triton dequantization kernel

This PR adds an XPU backend and a Triton kernel for dequantizing the nf4 dtype.
Triton is an optional import.
Tests:
	tests/test_functional.py::TestQuantize4BitFunctional (supported nf4/fp4 cases)
	tests/test_functional.py::Test8BitBlockwiseQuantizeFunctional
		(quantize_blockwise is implemented with binary search, which is faster on XPU)
	tests/test_linear4bit.py
Signed-off-by: Dmitrii Makarenko <dmitrii.makarenko@intel.com>

* align with ipex code

* enable test for ipex

* test_kbit_backprop: skip no longer needed

* remove unused

---------
Signed-off-by: Dmitrii Makarenko <dmitrii.makarenko@intel.com>
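
The message above mentions a `quantize_blockwise` built on binary search. As a rough, pure-Python illustration of that idea (not the Triton or XPU code from this PR), the sketch below scales each block by its absolute maximum and then uses binary search over a sorted code book to find the nearest entry. The function names, the small uniform code book used in the usage note, and the list-based data layout are assumptions made for the example.

```python
from bisect import bisect_left


def nearest_code_index(code, x):
    """Return the index of the entry in the sorted list `code` closest to x."""
    pos = bisect_left(code, x)  # binary search, O(log len(code))
    if pos == 0:
        return 0
    if pos == len(code):
        return len(code) - 1
    # x lies between code[pos - 1] and code[pos]; pick the closer one
    # (ties go to the lower index).
    return pos if code[pos] - x < x - code[pos - 1] else pos - 1


def quantize_blockwise(values, code, blocksize):
    """Quantize `values` block by block; returns (indices, absmax per block)."""
    indices, absmax = [], []
    for start in range(0, len(values), blocksize):
        block = values[start:start + blocksize]
        scale = max(abs(v) for v in block)
        absmax.append(scale)
        for v in block:
            # Normalize into [-1, 1] so the value can be looked up in `code`.
            normalized = v / scale if scale else 0.0
            indices.append(nearest_code_index(code, normalized))
    return indices, absmax


def dequantize_blockwise(indices, absmax, code, blocksize):
    """Invert quantize_blockwise up to quantization error."""
    return [code[q] * absmax[i // blocksize] for i, q in enumerate(indices)]
```

With a hypothetical code book such as `[-1.0, -0.5, 0.0, 0.5, 1.0]`, calling `quantize_blockwise(values, code, 3)` yields one code index per input value and one scale per block of three values. The binary search replaces a linear scan over the code book, which matters more as the code book grows (an 8-bit code book has 256 entries).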


08 Jun, 2025 1 commit
- Improvement for torch.compile support on Params4bit (#1673) · d9333aa9
  Matthew Douglas authored Jun 08, 2025
06 Jun, 2025 2 commits
- Update README.md · 11df723f
  Matthew Douglas authored Jun 06, 2025
- Fix Linear4bit warnings/test for compute dtype · e9f3605f
  Matthew Douglas authored Jun 06, 2025
05 Jun, 2025 2 commits

- Add support for Intel Gaudi/HPU backend (#1662) · 812ef06a
  Ruheena Suhani Shaik authored Jun 05, 2025

* supports hpu backend in main branch

* Update bitsandbytes/backends/hpu/ops.py

updates the assertion message
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>

* Update bitsandbytes/backends/hpu/ops.py
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>

* Update ops.py

Fix lint issue

* Update ops.py

---------
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>


- CI workflow: bump torch 2.7.0 to 2.7.1 (#1670) · e9fc96a2
  Matthew Douglas authored Jun 05, 2025

04 Jun, 2025 1 commit

- Deprecation cleanup (#1669) · 849d9449
  Matthew Douglas authored Jun 04, 2025

* Deprecation cleanup: remove histogram_scatter_add_2d

* Deprecation cleanup: vectorwise_mm_dequant

* Deprecation cleanup: vectorwise_quant

* Remove unused test

* Optimizer test cleanup

* Deprecations: remove estimate_quantiles, create_quantile_map

* Move deprecated test


03 Jun, 2025 3 commits
- Update README.md · 76d3e2b1
  Matthew Douglas authored Jun 03, 2025
- pass current bnb_quantized when moving quantized Params4bit to different device (#1665) · cd8bd2d6
  mklabunde authored Jun 03, 2025
- Tests: don't require grad on weights for test_kbit_backprop · 55ebaac7
  Matthew Douglas authored Jun 03, 2025
02 Jun, 2025 3 commits

- Add CPU + IPEX to nightly CI (#1667) · 318a86e3
  Matthew Douglas authored Jun 02, 2025

* Tests: add linux x64 cpu+ipex to nightly CI workflow

* typo

* Tests: guard linear8bit compile test for ipex cpu issue


- Fix CI regression (#1666) · 945f7c1d
  Matthew Douglas authored Jun 02, 2025

* Tests: xfail opcheck for 4bit quantization with floating storage dtypes

* Tests: xfail opcheck for 4bit quantization with floating storage dtypes

* Tests: skip test_gemv_eye_4bit on CPU with bf16 when not supported by torch

* Tests: skip test_gemv_eye_4bit on CPU with bf16 when not supported by torch


- Bump dev version · a2a74ede
  Matthew Douglas authored Jun 02, 2025

28 May, 2025 1 commit

- Enable CPU/XPU native and ipex path (#1628) · aaa71d7e
  jiqing-feng authored May 29, 2025

* enable ipex
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix cpu 8bit quantization
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix int8 and nf4 cpu inference
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add cpu fp4 and rem
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix dequantize nf4 xpu
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix ipex op
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix dequantize nf4 name
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix dequantize nf4 ipex
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix matmul8bitfp
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* enable cpu tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix quantize blockwise output shape
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix quant_storage bf16 and gemv cpu
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix cpu tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix xpu tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix lib
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* skip xpu dequantize blockwise op check
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix matmul8bit
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* skip not used function tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix matmul8bit fp
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* check ipex before MatMul8bitFp
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update ipex install guide
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update install guide
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix error log
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix error log
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update comment
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* move torch op to default
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* revert ipex check
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix code table device
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix code table device
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix xpu ops
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
