- 19 May, 2025 1 commit
-
-
Titus von Koeller authored
-
- 02 May, 2025 1 commit
-
-
Wing Lian authored
-
- 28 Apr, 2025 1 commit
-
-
jiqing-feng authored
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
-
- 22 Apr, 2025 1 commit
-
-
Matthew Douglas authored
* Stop building for CUDA toolkit < 11.8
* Simplify
* Drop sm70 from cu128 build targets to align with pytorch
-
- 25 Mar, 2025 1 commit
-
-
Matthew Douglas authored
* Sketch out first custom op registration
* Add note
* Initial int8 op registration
* Cleanup some deprecated functions.
* Int8 ops updates; tests
* Implement 4bit quant/dequant ops
* Fix nested quant
* cleanup
* Test improvements
* Clean up and improve tests
* Add higher level custom op for int8 matmul + dequant + bias
* Add gemv 4bit custom op
* Cleanup
* Implement out kwarg overloads for custom ops
* Update PyTorch minimum to 2.1
* Deprecation updates
* Deprecation updates
* Cleanup; rename int8_linear_dequant -> int8_scaled_mm
* Bump min pytorch to 2.2
* cleanup
* Test reorganization
* Remove deprecated supports_igemmlt
* More cleanup
* Cleanup obsolete C++/CUDA code
* Cleanup
* Create 'default' backend for fallback op implementations; initial CPU nf4 work
* Stub out for multi-platform
* Fix serialization tests for torch>=2.6.0
* Add example for torch.compile e2e inference
* Test update
---------
Co-authored-by: Titus von Koeller <9048635+Titus-von-Koeller@users.noreply.github.com>
-
- 19 Mar, 2025 1 commit
-
-
Titus authored
-
- 19 Feb, 2025 1 commit
-
-
Matthew Douglas authored
-
- 14 Jan, 2025 2 commits
-
-
Benjamin Badger authored
Co-authored-by: Titus <9048635+Titus-von-Koeller@users.noreply.github.com>
-
Matthew Douglas authored
-
- 10 Dec, 2024 1 commit
-
-
Huazhong Ji authored
-
- 05 Dec, 2024 1 commit
-
-
Matthew Douglas authored
* Start of int8 refactor: remove col32/col_ampere/col_turing transforms in new igemmlt implementation
* Fix unintended change
* New naive mm_dequant kernel for row-major; cleanup
* fix
* int8 refactor: initial sparse decomp, cleanup
* Int8 refactoring: remove separate NO_CUBLASLT build; more cleanup
* int8: inference optimizations, some cleanup
* int8: more tests passing, cleanup
* int8 - more cleanup, most tests passing
* int8: specify CUDA stream for int8 ops
* perf: reduce overhead from getting cudaStream ptr
* Mark some functions for deprecation.
* int8 sparse decomp: small perf improvement
* update setup.py
* Update bitsandbytes/autograd/_functions.py
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Update bitsandbytes/functional.py
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Update bitsandbytes/functional.py
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Update bitsandbytes/research/autograd/_functions.py
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* int8 - perf improvement for sparse decomposition inference; deprecate get_tensor_stream() in favor of new private fn
* int8 cleanup
* Ignore ruff rule ISC001 (incompatible with formatter)
* add comment
* int8 more cleanup
* Update bitsandbytes/functional.py
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* int8: rename / deprecate old fn signatures
* Update bitsandbytes/functional.py
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* type annotation
* format update
* Update bitsandbytes/research/autograd/_functions.py
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* cleanup
* Add comment to explain division optimization
* more cleanup
* Update bitsandbytes/functional.py
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Update bitsandbytes/functional.py
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Update bitsandbytes/functional.py
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* cleanup
* Type annotations, cleanup
* remove unused kernels; improved type annotations
* small perf optimization for single-GPU systems
* small perf optimization for single-GPU systems
* update docstrings
* Improve docs and tests
* Update docstring
* Update test
* add benchmarking script
* test cleanup: add deprecated marker, move benchmarks out
* Add int8 dequant function; misc improvements
* int8 matmul fallback for inner dims not divisible by 4
* improve register usage of kInt8VectorQuant - especially for A100/H100
* disable fail-fast for package build
* maxwell compat
* ptxas verbose
* docs update
* doc update
* backward fix
* Bugfix sparse decomp
* Int8 fix for PEFT OLoRA init
* Fix test for deprecated spmm_coo
* test improvement
* doc update
* typo
* doc cleanup
* docs
* add inference benchmark script
* Add benchmarks, doc update
---------
Co-authored-by: Aarni Koskela <akx@iki.fi>
-
- 02 Dec, 2024 1 commit
-
-
Matthew Douglas authored
* [Build] Add CUDA 12.6.2 build; update 12.5.0 to 12.5.1
* bump cuda-toolkit action version
* Update docs for cuda versions
-
- 16 Oct, 2024 1 commit
-
-
pnunna93 authored
* Add build job for rocm
* Add rocm build script
* Copy shared obj file into output_dir
* upload build artifacts and enable wheels build
* Remove cuda build temporarily
* Add ROCm version to .so filename
* Add rocm_version to whls build
* Revert "Remove cuda build temporarily"
  This reverts commit 1413c5f3a2aed51140b86daa8ee9283c67cce738.
* Add rocm_version env var
* Remove thrush header files
* Print node info
* print cuda node info
* Revert "print cuda node info"
  This reverts commit cdb209a2eb896d9c4166f53e9b2aa580c10e42c0.
* Revert "Print node info"
  This reverts commit 7e9a65c33f66fffcb14ee2438170718777c06022.
* Add rocm arch to compile command
* Rename .so files to rocm
* Update default gpu arch
* Skip cpu based igemmlt int tests on ROCm
* Update Documentation
* Update upstream repo name
* Update docs
* Update string format
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Remove pre-release option for torch install
* Update pytorch install path
  Co-authored-by: Titus <9048635+Titus-von-Koeller@users.noreply.github.com>
* Add messages for Heuristics error
* Remove toolcache for disk space
* print disk usage
* Clean disk space for linux
* Fix for ubuntu
* Add sudo for apt clean
* Update clean up disk list
* remove disk usage print
* Add BNB_BACKEND variable
* Update diagnostic functions for ROCm
* Fix tuple error
* Fix library detection bug for recursive and symlink cases
* fix pre-commit errors
* Remove recursive path lib search
* Create function for runtime lib patterns
* Update logger format
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Update error reporting
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Remove commented code
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Update error reporting
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Update error reporting
* Create hip diagnostics functions
* Fix Typo
* Fix pre-commit checks
* Enable 6.2 build
* Skip gemv 4 bit cpu test
* Update documentation for 6.2.0 pip install
* Update README for default branch change
* Fix typo
* Sync README with upstream
* Remove depth
---------
Co-authored-by: Aarni Koskela <akx@iki.fi>
Co-authored-by: Titus <9048635+Titus-von-Koeller@users.noreply.github.com>
Co-authored-by: Aswin John Mathews <81309834+amathews-amd@users.noreply.github.com>
Co-authored-by: root <root@banff-cyxtera-s78-4.ctr.dcgpu>
-
- 01 Oct, 2024 1 commit
-
-
Titus von Koeller authored
-
- 30 Sep, 2024 1 commit
-
-
Titus authored
* refine docs for multi-backend alpha release
* docs: further tweaks to multi-backend alpha docs
* docs: further tweaks to multi-backend alpha docs
* docs: further tweaks to multi-backend alpha docs
* docs: add multi-backend feedback links
* docs: add request for contributions
* docs: small fixes
* docs: small fixes
* docs: add info about `main` continuous build
* docs: further tweaks to multi-backend alpha docs
* docs: further tweaks to multi-backend alpha docs
-
- 27 Sep, 2024 1 commit
-
-
Titus von Koeller authored
-
- 21 Sep, 2024 1 commit
-
-
jiqing-feng authored
* cpu benchmark
* try to fix formatting
* cleanup
* cleanup
---------
Co-authored-by: Titus <9048635+Titus-von-Koeller@users.noreply.github.com>
-
- 20 Sep, 2024 1 commit
-
-
Matthew Douglas authored
* Add AdEMAMix optimizer
* Add PagedAdEMAMix32bit, AdEMAMix32bit
* Add PagedAdEMAMix32bit, AdEMAMix32bit
* AdEMAMix: add support for alpha/beta3 scheduling
* Update paged AdEMAMix
-
- 19 Sep, 2024 1 commit
-
-
Titus authored
-
- 29 Aug, 2024 1 commit
-
-
Titus von Koeller authored
-
- 26 Aug, 2024 1 commit
-
-
Titus von Koeller authored
-
- 21 Aug, 2024 1 commit
-
-
Titus von Koeller authored
-
- 18 Aug, 2024 1 commit
-
-
Titus von Koeller authored
-
- 27 Jul, 2024 1 commit
-
-
Titus von Koeller authored
-
- 23 Jul, 2024 1 commit
-
-
Titus von Koeller authored
-
- 21 Jul, 2024 1 commit
-
-
Matthew Douglas authored
* Add CUDA 12.5 builds and enable CUDA 12.4 on Windows
* Update install doc
-
- 21 Jun, 2024 1 commit
-
-
jiqing-feng authored
* cpu install guide
* update readme
* fix format
* fix format
* fix typo
* add windows guide
* fix readme to pip install . instead of building wheel
* Update docs/source/installation.mdx
  Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/installation.mdx
  Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/installation.mdx
  Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
- 16 May, 2024 1 commit
-
-
Steven Liu authored
-
- 14 May, 2024 1 commit
-
-
Steven Liu authored
-
- 13 May, 2024 1 commit
-
-
Steven Liu authored
-
- 17 Apr, 2024 1 commit
-
-
Titus authored
* (docs) integrations: fix omission in bf16 related warning
* (docs) integrations: further clarifications to prior fix
* (docs) integrations: fix punctuation
  Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* (docs) integrations: fix omitted code formatting
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
- 09 Apr, 2024 1 commit
-
-
Steven Liu authored
* split build from source off
* validated compilers
-
- 26 Mar, 2024 3 commits
-
-
Matthew Douglas authored
* Add CUDA 12.4 download to utility script, docs
* (ci) Add CUDA 12.4.0 build to workflow
* Apply ruff format to install_cuda.py
-
Steven Liu authored
-
Steven Liu authored
-
- 15 Mar, 2024 1 commit
-
-
Steven Liu authored
* optim, integration
* toctree
* feedback
-
- 07 Mar, 2024 1 commit
-
-
Steven Liu authored
* optims
* fix path
* fix path
* mdx
* fix path
* toctree
* fix
* optimizer, adagrad
* add init
* add
* more apis
* params
* clarify
* run pre-commit hooks
---------
Co-authored-by: Titus von Koeller <9048635+Titus-von-Koeller@users.noreply.github.com>
-
- 06 Mar, 2024 1 commit
-
-
MOHAMMAD ALBARHAM authored
-
- 28 Feb, 2024 1 commit
-
-
Titus authored
-
- 27 Feb, 2024 1 commit
-
-
Titus authored
* improve accelerate reference in docs
* Apply suggestions from code review
  Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* fix spelling
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
-