  1. 19 May, 2025 1 commit
  2. 02 May, 2025 1 commit
  3. 28 Apr, 2025 1 commit
  4. 22 Apr, 2025 1 commit
  5. 25 Mar, 2025 1 commit
    • PyTorch Custom Operator Integration (#1544) · e82f72b3
      Matthew Douglas authored
      
      
      * Sketch out first custom op registration
      
      * Add note
      
      * Initial int8 op registration
      
      * Cleanup some deprecated functions.
      
      * Int8 ops updates; tests
      
      * Implement 4bit quant/dequant ops
      
      * Fix nested quant
      
      * cleanup
      
      * Test improvements
      
      * Clean up and improve tests
      
      * Add higher level custom op for int8 matmul + dequant + bias
      
      * Add gemv 4bit custom op
      
      * Cleanup
      
      * Implement out kwarg overloads for custom ops
      
      * Update PyTorch minimum to 2.1
      
      * Deprecation updates
      
      * Deprecation updates
      
      * Cleanup; rename int8_linear_dequant -> int8_scaled_mm
      
      * Bump min pytorch to 2.2
      
      * cleanup
      
      * Test reorganization
      
      * Remove deprecated supports_igemmlt
      
      * More cleanup
      
      * Cleanup obsolete C++/CUDA code
      
      * Cleanup
      
      * Create 'default' backend for fallback op implementations; initial CPU nf4 work
      
      * Stub out for multi-platform
      
      * Fix serialization tests for torch>=2.6.0
      
      * Add example for torch.compile e2e inference
      
      * Test update
      
      ---------
Co-authored-by: Titus von Koeller <9048635+Titus-von-Koeller@users.noreply.github.com>
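The 4-bit quant/dequant ops added in this commit follow bitsandbytes' blockwise absmax scheme. As a rough illustration only (not the library's kernels — a uniform signed 4-bit grid stands in for the NF4 code table, and the block size is arbitrary), the idea can be sketched in NumPy:

```python
import numpy as np

def quantize_blockwise_4bit(x, block_size=64):
    """Blockwise absmax quantization to signed 4-bit range; illustrative only."""
    flat = x.ravel()
    pad = (-len(flat)) % block_size          # pad so the data splits into blocks
    flat = np.pad(flat, (0, pad))
    blocks = flat.reshape(-1, block_size)
    absmax = np.abs(blocks).max(axis=1, keepdims=True)
    absmax[absmax == 0] = 1.0                # avoid div-by-zero for all-zero blocks
    q = np.clip(np.round(blocks / absmax * 7), -8, 7).astype(np.int8)
    return q, absmax, x.shape, pad

def dequantize_blockwise_4bit(q, absmax, shape, pad):
    out = (q.astype(np.float32) / 7) * absmax
    out = out.ravel()
    if pad:
        out = out[:-pad]
    return out.reshape(shape)

np.random.seed(0)
x = np.random.randn(3, 100).astype(np.float32)
q, scales, shape, pad = quantize_blockwise_4bit(x)
xr = dequantize_blockwise_4bit(q, scales, shape, pad)
# per-block reconstruction error is bounded by half the block's quantization step
```

Only the int8 codes and one float scale per block are stored, which is where the memory savings come from; the real NF4 grid is non-uniform and tuned for normally distributed weights.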
  6. 19 Mar, 2025 1 commit
  7. 19 Feb, 2025 1 commit
  8. 14 Jan, 2025 2 commits
  9. 10 Dec, 2024 1 commit
  10. 05 Dec, 2024 1 commit
    • LLM.int8() Refactoring: Part 1 (#1401) · 81e6345d
      Matthew Douglas authored
      
      
      * Start of int8 refactor: remove col32/col_ampere/col_turing transforms in new igemmlt implementation
      
      * Fix unintended change
      
      * New naive mm_dequant kernel for row-major; cleanup
      
      * fix
      
      * int8 refactor: initial sparse decomp, cleanup
      
      * Int8 refactoring: remove separate NO_CUBLASLT build; more cleanup
      
      * int8: inference optimizations, some cleanup
      
      * int8: more tests passing, cleanup
      
      * int8 - more cleanup, most tests passing
      
      * int8: specify CUDA stream for int8 ops
      
      * perf: reduce overhead from getting cudaStream ptr
      
      * Mark some functions for deprecation.
      
      * int8 sparse decomp: small perf improvement
      
      * update setup.py
      
      * Update bitsandbytes/autograd/_functions.py
Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Update bitsandbytes/functional.py
Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Update bitsandbytes/functional.py
Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Update bitsandbytes/research/autograd/_functions.py
Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * int8 - perf improvement for sparse decomposition inference; deprecate get_tensor_stream() in favor of new private fn
      
      * int8 cleanup
      
      * Ignore ruff rule ISC001 (incompatible with formatter)
      
      * add comment
      
      * int8 more cleanup
      
      * Update bitsandbytes/functional.py
Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * int8: rename / deprecate old fn signatures
      
      * Update bitsandbytes/functional.py
Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * type annotation
      
      * format update
      
      * Update bitsandbytes/research/autograd/_functions.py
Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * cleanup
      
      * Add comment to explain division optimization
      
      * more cleanup
      
      * Update bitsandbytes/functional.py
Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Update bitsandbytes/functional.py
Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Update bitsandbytes/functional.py
Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * cleanup
      
      * Type annotations, cleanup
      
      * remove unused kernels; improved type annotations
      
      * small perf optimization for single-GPU systems
      
      * small perf optimization for single-GPU systems
      
      * update docstrings
      
      * Improve docs and tests
      
      * Update docstring
      
      * Update test
      
      * add benchmarking script
      
      * test cleanup: add deprecated marker, move benchmarks out
      
      * Add int8 dequant function; misc improvements
      
      * int8 matmul fallback for inner dims not divisible by 4
      
      * improve register usage of kInt8VectorQuant - especially for A100/H100
      
      * disable fail-fast for package build
      
      * maxwell compat
      
      * ptxas verbose
      
      * docs update
      
      * doc update
      
      * backward fix
      
      * Bugfix sparse decomp
      
      * Int8 fix for PEFT OLoRA init
      
      * Fix test for deprecated spmm_coo
      
      * test improvement
      
      * doc update
      
      * typo
      
      * doc cleanup
      
      * docs
      
      * add inference benchmark script
      
      * Add benchmarks, doc update
      
      ---------
Co-authored-by: Aarni Koskela <akx@iki.fi>
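The sparse decomposition referenced in the commits above is LLM.int8()'s mixed-precision trick: activation columns whose magnitude crosses a threshold stay in full precision, while everything else goes through a scaled int8 matmul. A NumPy sketch of that split (the per-row absmax scaling and the threshold value here are illustrative, not the library's CUDA kernels):

```python
import numpy as np

def int8_absmax_quantize_rows(a):
    """Per-row absmax scaling into int8; returns codes and float scales."""
    scale = np.abs(a).max(axis=1, keepdims=True) / 127.0
    scale[scale == 0] = 1.0
    return np.round(a / scale).astype(np.int8), scale

def llm_int8_matmul(a, w, threshold=6.0):
    """a @ w with outlier columns of `a` kept in float (LLM.int8()-style sketch)."""
    outlier_cols = np.abs(a).max(axis=0) > threshold
    a_in = np.where(outlier_cols, 0.0, a)        # zero outliers on the int8 path
    qa, sa = int8_absmax_quantize_rows(a_in)
    qw, sw = int8_absmax_quantize_rows(w.T)      # per-output-column scales of w
    # int8 matmul accumulated in int32, then dequantized with both scale sets
    acc = qa.astype(np.int32) @ qw.astype(np.int32).T
    dense = acc * sa * sw.T
    # full-precision path for the outlier columns only
    sparse = a[:, outlier_cols] @ w[outlier_cols, :]
    return dense + sparse

np.random.seed(0)
a = np.random.randn(4, 16)
w = np.random.randn(16, 8)
out = llm_int8_matmul(a, w)   # close to the full-precision product a @ w
```

Since outliers are rare in practice, the sparse float path touches only a handful of columns while the bulk of the FLOPs run in int8.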
  11. 02 Dec, 2024 1 commit
  12. 16 Oct, 2024 1 commit
    • Remove depth option in installation steps (#1395) · c8f2769b
      pnunna93 authored
      
      
      * Add build job for rocm
      
      * Add rocm build script
      
      * Copy shared obj file into output_dir
      
      * upload build artifacts and enable wheels build
      
      * Remove cuda build temporarily
      
      * Add ROCm version to .so filename
      
      * Add rocm_version to whls build
      
      * Revert "Remove cuda build temporarily"
      
      This reverts commit 1413c5f3a2aed51140b86daa8ee9283c67cce738.
      
      * Add rocm_version env var
      
* Remove thrust header files
      
      * Print node info
      
      * print cuda node info
      
      * Revert "print cuda node info"
      
      This reverts commit cdb209a2eb896d9c4166f53e9b2aa580c10e42c0.
      
      * Revert "Print node info"
      
      This reverts commit 7e9a65c33f66fffcb14ee2438170718777c06022.
      
      * Add rocm arch to compile command
      
      * Rename .so files to rocm
      
      * Update default gpu arch
      
      * Skip cpu based igemmlt int tests on ROCm
      
      * Update Documentation
      
      * Update upstream repo name
      
      * Update docs
      
      * Update string format
Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Remove pre-release option for torch install
      
      * Update pytorch install path
Co-authored-by: Titus <9048635+Titus-von-Koeller@users.noreply.github.com>
      
      * Add messages for Heuristics error
      
      * Remove toolcache for disk space
      
      * print disk usage
      
      * Clean disk space for linux
      
      * Fix for ubuntu
      
      * Add sudo for apt clean
      
      * Update clean up disk list
      
      * remove disk usage print
      
      * Add BNB_BACKEND variable
      
      * Update diagnostic functions for ROCm
      
      * Fix tuple error
      
      * Fix library detection bug for recursive and symlink cases
      
      * fix pre-commit errors
      
      * Remove recursive path lib search
      
      * Create function for runtime lib patterns
      
      * Update logger format
Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Update error reporting
Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Remove commented code
Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Update error reporting
Co-authored-by: Aarni Koskela <akx@iki.fi>
      
      * Update error reporting
      
      * Create hip diagnostics functions
      
      * Fix Typo
      
      * Fix pre-commit checks
      
      * Enable 6.2 build
      
      * Skip gemv 4 bit cpu test
      
      * Update documentation for 6.2.0 pip install
      
      * Update README for default branch change
      
      * Fix typo
      
      * Sync README with upstream
      
      * Remove depth
      
      ---------
Co-authored-by: Aarni Koskela <akx@iki.fi>
Co-authored-by: Titus <9048635+Titus-von-Koeller@users.noreply.github.com>
Co-authored-by: Aswin John Mathews <81309834+amathews-amd@users.noreply.github.com>
Co-authored-by: root <root@banff-cyxtera-s78-4.ctr.dcgpu>
  13. 01 Oct, 2024 1 commit
  14. 30 Sep, 2024 1 commit
    • refine docs for multi-backend alpha release (#1380) · 485427f1
      Titus authored
      * refine docs for multi-backend alpha release
      
      * docs: further tweaks to multi-backend alpha docs
      
      * docs: further tweaks to multi-backend alpha docs
      
      * docs: further tweaks to multi-backend alpha docs
      
      * docs: add multi-backend feedback links
      
      * docs: add request for contributions
      
      * docs: small fixes
      
      * docs: small fixes
      
      * docs: add info about `main` continuous build
      
      * docs: further tweaks to multi-backend alpha docs
      
      * docs: further tweaks to multi-backend alpha docs
  15. 27 Sep, 2024 1 commit
  16. 21 Sep, 2024 1 commit
  17. 20 Sep, 2024 1 commit
    • Add AdEMAMix optimizer (#1360) · d9645465
      Matthew Douglas authored
      * Add AdEMAMix optimizer
      
      * Add PagedAdEMAMix32bit, AdEMAMix32bit
      
      * Add PagedAdEMAMix32bit, AdEMAMix32bit
      
      * AdEMAMix: add support for alpha/beta3 scheduling
      
      * Update paged AdEMAMix
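AdEMAMix keeps two gradient EMAs — a fast one (beta1) and a slow one (beta3), mixed in with weight alpha — on top of Adam's second moment. A plain-NumPy sketch of the update rule as described in the AdEMAMix paper (the 8-bit/paged state and the alpha/beta3 scheduling added in these commits are omitted, as is decoupled weight decay):

```python
import numpy as np

class AdEMAMix:
    """Illustrative NumPy sketch of the AdEMAMix update rule."""

    def __init__(self, lr=1e-3, betas=(0.9, 0.999, 0.9999), alpha=5.0, eps=1e-8):
        self.lr, self.eps, self.alpha = lr, eps, alpha
        self.b1, self.b2, self.b3 = betas
        self.m1 = self.m2 = self.nu = None
        self.t = 0

    def step(self, theta, grad):
        if self.m1 is None:
            self.m1 = np.zeros_like(theta)   # fast EMA of gradients
            self.m2 = np.zeros_like(theta)   # slow EMA of gradients
            self.nu = np.zeros_like(theta)   # EMA of squared gradients
        self.t += 1
        self.m1 = self.b1 * self.m1 + (1 - self.b1) * grad
        self.m2 = self.b3 * self.m2 + (1 - self.b3) * grad   # not bias-corrected
        self.nu = self.b2 * self.nu + (1 - self.b2) * grad**2
        m1_hat = self.m1 / (1 - self.b1**self.t)
        nu_hat = self.nu / (1 - self.b2**self.t)
        return theta - self.lr * (m1_hat + self.alpha * self.m2) / (np.sqrt(nu_hat) + self.eps)
```

For example, repeatedly calling `theta = opt.step(theta, grad)` on a simple quadratic drives the iterate toward the minimum; the slow EMA lets the optimizer exploit much older gradients than Adam's single momentum term.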
  18. 19 Sep, 2024 1 commit
  19. 29 Aug, 2024 1 commit
  20. 26 Aug, 2024 1 commit
  21. 21 Aug, 2024 1 commit
  22. 18 Aug, 2024 1 commit
  23. 27 Jul, 2024 1 commit
  24. 23 Jul, 2024 1 commit
  25. 21 Jul, 2024 1 commit
  26. 21 Jun, 2024 1 commit
  27. 16 May, 2024 1 commit
  28. 14 May, 2024 1 commit
  29. 13 May, 2024 1 commit
  30. 17 Apr, 2024 1 commit
  31. 09 Apr, 2024 1 commit
  32. 26 Mar, 2024 3 commits
  33. 15 Mar, 2024 1 commit
  34. 07 Mar, 2024 1 commit
  35. 06 Mar, 2024 1 commit
  36. 28 Feb, 2024 1 commit
  37. 27 Feb, 2024 1 commit