- 01 Jul, 2025 1 commit
-
-
Matthew Douglas authored
* Add torch 2.8 rc / 2.9 nightly to tests * Update tests.yml * Update tests.yml
-
- 30 Jun, 2025 1 commit
-
-
Matthew Douglas authored
-
- 27 Jun, 2025 1 commit
-
-
Matthew Douglas authored
* Add CUDA 12.9 to build/test workflows * Downgrade Jimver/cuda-toolkit to v0.2.24 * Update python-package.yml * Update python-package.yml * Update python-package.yml * Update tests.yml * Update tests.yml
-
- 20 Jun, 2025 1 commit
-
-
pnunna93 authored
* Port ROCm changes from multi-backend-refactor branch * Update ops.py * Update functional.py * Update ops.py * Update ops.py * Update ops.py * Update ops.py * Update functional.py * Update ops.py * Update ops.py * Update ops.py * Update ops.py * Update functional.py * Update functional.py * Update functional.py * Update functional.py * Update ops.py * Update ops.py * Update ops.py * Update ops.py * Update ops.py * Update ops.py * Update ops.py * Update ops.py * Update ops.py * Update functional.py * Update functional.py * Update functional.py * Update test_ops.py * Update test_functional.py * Update test_ops.py * Update test_functional.py * Update test_functional.py * Update functional.py * Update functional.py * Update ops.py * Update ops.py * Update test_functional.py * Update test_functional.py * Update cextension.py * Update cuda_specs.py * Update cuda_specs.py * Update test_functional.py * Update test_linear4bit.py * Update test_cuda_setup_evaluator.py * Update test_functional.py * Update modules.py * Update modules.py * Update ops.py * Update test_linear4bit.py * Update ops.py * Update ops.py * Update test_linear4bit.py * Update test_linear4bit.py * Update python-package.yml * Update python-package.yml * Update python-package.yml * Update python-package.yml * Create build-rocm.sh * Update cuda_specs.py * Fix trailing whitespace * Remove conflicts.diff * update for hipblasVersionMajor >=3 * Update test_functional.py * Update test_linear4bit.py * Update test_ops.py * Update main.py * Update test_functional.py * Update test_linear4bit.py * Update test_ops.py * Update test_linear4bit.py * Lint * Lint * Update helpers.py * Update test_functional.py * Update test_linear4bit.py * Update test_ops.py * Lint * Update pythonInterface.cpp * lint fix * lint * Update pythonInterface.cpp * revert permissions change * Fix indentation * Update kernels_hip.cuh * Update kernels.hip * Update ops.hip * Update ops_hip.cuh * Update kernels_hip.cuh * Update kernels.hip * Update 
kernels.hip * Update ops.hip * Update ops_hip.cuh * Update ops.hip * Update CMakeLists.txt * Update functional.py * Update cextension.py * Update cextension.py --------- Co-authored-by: MISHANMAURYA <118961433+MISHANMAURYA@users.noreply.github.com> Co-authored-by: MISHANMAUYRA <mishanmaurya31081@gmail.com> Co-authored-by: amcamd <andrew.chapman@amd.com> Co-authored-by: Prasanth Nunna <root@banff-cyxtera-s78-1.amd.com>
-
- 17 Jun, 2025 1 commit
-
-
Matthew Douglas authored
* Setup XPU CI * CI: expand XPU matrix * test * test * test * test * test * test * test * test * test * test * skip some fp4 tests on hpu * skip some fp4 tests on hpu * skip gemv tests on hpu * test * Additional test patches for HPU * HPU test update * HPU test update * HPU test update * HPU test update * Format
-
- 05 Jun, 2025 1 commit
-
-
Matthew Douglas authored
-
- 02 Jun, 2025 1 commit
-
-
Matthew Douglas authored
* Tests: add linux x64 cpu+ipex to nightly CI workflow * typo * Tests: guard linear8bit compile test for ipex cpu issue
-
- 24 May, 2025 2 commits
-
-
Matthew Douglas authored
* Add torch.compile tests * Tests: WA aarch64 CPU regressions for torch 2.6.0; add Windows torch==2.7.0+cu118 test config * Tests: skip torch.compile for cuda on windows
-
Matthew Douglas authored
* General cleanup & test improvements * Tests: WA numpy 2 compat issue for torch<2.3 * Tests: update aarch64 cpu min torch version * Tests: update aarch64 cpu min torch version * Tests: update aarch64 cpu min torch version
-
- 19 May, 2025 3 commits
-
-
Matthew Douglas authored
* Test g5g runner * Switch L4 to L40S runner; swap GitHub Linux T4 runner for AWS g4dn * Run tests on last 2 pytorch stable releases * Run tests on last 2 pytorch stable releases
-
Titus von Koeller authored
-
Titus von Koeller authored
-
- 16 May, 2025 5 commits
-
-
Titus von Koeller authored
-
Titus von Koeller authored
-
Titus von Koeller authored
-
Titus von Koeller authored
-
Titus von Koeller authored
-
- 15 May, 2025 1 commit
-
-
Titus von Koeller authored
-
- 14 May, 2025 1 commit
-
-
Matthew Douglas authored
* Improvements for testing suite * Add workflow for macOS arm64 CPU tests * Update tests.yml * Update tests.yml Use new L4 and CPU runners for testing. * Update tests.yml
-
- 13 May, 2025 1 commit
-
-
Matthew Douglas authored
* Improvements for testing suite * Add workflow for macOS arm64 CPU tests
-
- 08 May, 2025 4 commits
-
-
Matthew Douglas authored
Show slow test durations.
-
Matthew Douglas authored
Fix trailing whitespace.
-
Titus von Koeller authored
-
Titus von Koeller authored
-
- 05 May, 2025 2 commits
-
-
Matthew Douglas authored
-
Matthew Douglas authored
-
- 02 May, 2025 2 commits
-
-
Matthew Douglas authored
* Add aarch64 cpu tests and CUDA build to nightly workflow * aarch64: limit CUDA targets to sm75, sm80, sm90, sm100 * aarch64: limit CUDA targets to sm75, sm80, sm90, sm100 * Update build cpu script * fix * Update auditwheel for aarch64
-
Johnny authored
* Update python-package.yml * Update python-package.yml * Update python-package.yml * Cleanup * Matrix update --------- Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
-
- 29 Apr, 2025 1 commit
-
-
Matthew Douglas authored
* Run unit tests on GH Actions * fix * fix * trigger workflow * Update * Update * Update * Run tests nightly * Disable paged optimizer test on Windows * Skip unit tests on Windows for CUDA 12.x (driver on runner is too old)
-
- 22 Apr, 2025 1 commit
-
-
Matthew Douglas authored
* Stop building for CUDA toolkit < 11.8 * Simplify * Drop sm70 from cu128 build targets to align with pytorch
-
- 07 Apr, 2025 1 commit
-
-
Titus authored
-
- 27 Mar, 2025 2 commits
-
-
Matthew Douglas authored
* Drop Python 3.8 support. * Formatting
-
Matthew Douglas authored
-
- 25 Feb, 2025 1 commit
-
-
Matthew Douglas authored
-
- 28 Jan, 2025 1 commit
-
-
Johnny authored
* blackwell * blackwell * Update python-package.yml
-
- 17 Dec, 2024 1 commit
-
-
Saurav Maheshkar authored
* chore: move configs to pyproject.toml * fix: drop file from CI workflow * feat: reorder pytest markers * chore: retain comments * chore(build): migrate build data to pyproject Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Aarni Koskela <akx@iki.fi> * chore: move configs to pyproject.toml * Apply suggestions from code review Co-authored-by: Aarni Koskela <akx@iki.fi> * bump ruff --------- Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com> Co-authored-by: Aarni Koskela <akx@iki.fi>
-
- 05 Dec, 2024 1 commit
-
-
Matthew Douglas authored
* Start of int8 refactor: remove col32/col_ampere/col_turing transforms in new igemmlt implementation * Fix unintended change * New naive mm_dequant kernel for row-major; cleanup * fix * int8 refactor: initial sparse decomp, cleanup * Int8 refactoring: remove separate NO_CUBLASLT build; more cleanup * int8: inference optimizations, some cleanup * int8: more tests passing, cleanup * int8 - more cleanup, most tests passing * int8: specify CUDA stream for int8 ops * perf: reduce overhead from getting cudaStream ptr * Mark some functions for deprecation. * int8 sparse decomp: small perf improvement * update setup.py * Update bitsandbytes/autograd/_functions.py Co-authored-by: Aarni Koskela <akx@iki.fi> * Update bitsandbytes/functional.py Co-authored-by: Aarni Koskela <akx@iki.fi> * Update bitsandbytes/functional.py Co-authored-by: Aarni Koskela <akx@iki.fi> * Update bitsandbytes/research/autograd/_functions.py Co-authored-by: Aarni Koskela <akx@iki.fi> * int8 - perf improvement for sparse decomposition inference; deprecate get_tensor_stream() in favor of new private fn * int8 cleanup * Ignore ruff rule ISC001 (incompatible with formatter) * add comment * int8 more cleanup * Update bitsandbytes/functional.py Co-authored-by: Aarni Koskela <akx@iki.fi> * int8: rename / deprecate old fn signatures * Update bitsandbytes/functional.py Co-authored-by: Aarni Koskela <akx@iki.fi> * type annotation * format update * Update bitsandbytes/research/autograd/_functions.py Co-authored-by: Aarni Koskela <akx@iki.fi> * cleanup * Add comment to explain division optimization * more cleanup * Update bitsandbytes/functional.py Co-authored-by: Aarni Koskela <akx@iki.fi> * Update bitsandbytes/functional.py Co-authored-by: Aarni Koskela <akx@iki.fi> * Update bitsandbytes/functional.py Co-authored-by: Aarni Koskela <akx@iki.fi> * cleanup * Type annotations, cleanup * remove unused kernels; improved type annotations * small perf optimization for single-GPU systems * small perf optimization for single-GPU systems * update docstrings * Improve docs and tests * Update docstring * Update test * add benchmarking script * test cleanup: add deprecated marker, move benchmarks out * Add int8 dequant function; misc improvements * int8 matmul fallback for inner dims not divisible by 4 * improve register usage of kInt8VectorQuant - especially for A100/H100 * disable fail-fast for package build * maxwell compat * ptxas verbose * docs update * doc update * backward fix * Bugfix sparse decomp * Int8 fix for PEFT OLoRA init * Fix test for deprecated spmm_coo * test improvement * doc update * typo * doc cleanup * docs * add inference benchmark script * Add benchmarks, doc update --------- Co-authored-by: Aarni Koskela <akx@iki.fi>
-
- 02 Dec, 2024 1 commit
-
-
Matthew Douglas authored
* [Build] Add CUDA 12.6.2 build; update 12.5.0 to 12.5.1 * bump cuda-toolkit action version * Update docs for cuda versions
-
- 30 Sep, 2024 2 commits
-
-
Titus von Koeller authored
-
Titus von Koeller authored
-