- 02 May, 2025 1 commit
-
-
Johnny authored
* Update python-package.yml
* Update python-package.yml
* Update python-package.yml
* Cleanup
* Matrix update
---------
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
-
- 22 Apr, 2025 1 commit
-
-
Matthew Douglas authored
* Stop building for CUDA toolkit < 11.8
* Simplify
* Drop sm70 from cu128 build targets to align with pytorch
-
- 07 Apr, 2025 1 commit
-
-
Titus authored
-
- 27 Mar, 2025 2 commits
-
-
Matthew Douglas authored
* Drop Python 3.8 support.
* Formatting
-
Matthew Douglas authored
-
- 25 Feb, 2025 1 commit
-
-
Matthew Douglas authored
-
- 28 Jan, 2025 1 commit
-
-
Johnny authored
* blackwell
* blackwell
* Update python-package.yml
-
- 17 Dec, 2024 1 commit
-
-
Saurav Maheshkar authored
* chore: move configs to pyproject.toml
* fix: drop file from CI workflow
* feat: reorder pytest markers
* chore: retain comments
* chore(build): migrate build data to pyproject
  Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
* Apply suggestions from code review
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* chore: move configs to pyproject.toml
* Apply suggestions from code review
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* bump ruff
---------
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
Co-authored-by: Aarni Koskela <akx@iki.fi>
-
- 05 Dec, 2024 1 commit
-
-
Matthew Douglas authored
* Start of int8 refactor: remove col32/col_ampere/col_turing transforms in new igemmlt implementation
* Fix unintended change
* New naive mm_dequant kernel for row-major; cleanup
* fix
* int8 refactor: initial sparse decomp, cleanup
* Int8 refactoring: remove separate NO_CUBLASLT build; more cleanup
* int8: inference optimizations, some cleanup
* int8: more tests passing, cleanup
* int8 - more cleanup, most tests passing
* int8: specify CUDA stream for int8 ops
* perf: reduce overhead from getting cudaStream ptr
* Mark some functions for deprecation.
* int8 sparse decomp: small perf improvement
* update setup.py
* Update bitsandbytes/autograd/_functions.py
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Update bitsandbytes/functional.py
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Update bitsandbytes/functional.py
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Update bitsandbytes/research/autograd/_functions.py
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* int8 - perf improvement for sparse decomposition inference; deprecate get_tensor_stream() in favor of new private fn
* int8 cleanup
* Ignore ruff rule ISC001 (incompatible with formatter)
* add comment
* int8 more cleanup
* Update bitsandbytes/functional.py
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* int8: rename / deprecate old fn signatures
* Update bitsandbytes/functional.py
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* type annotation
* format update
* Update bitsandbytes/research/autograd/_functions.py
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* cleanup
* Add comment to explain division optimization
* more cleanup
* Update bitsandbytes/functional.py
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Update bitsandbytes/functional.py
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Update bitsandbytes/functional.py
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* cleanup
* Type annotations, cleanup
* remove unused kernels; improved type annotations
* small perf optimization for single-GPU systems
* small perf optimization for single-GPU systems
* update docstrings
* Improve docs and tests
* Update docstring
* Update test
* add benchmarking script
* test cleanup: add deprecated marker, move benchmarks out
* Add int8 dequant function; misc improvements
* int8 matmul fallback for inner dims not divisible by 4
* improve register usage of kInt8VectorQuant - especially for A100/H100
* disable fail-fast for package build
* maxwell compat
* ptxas verbose
* docs update
* doc update
* backward fix
* Bugfix sparse decomp
* Int8 fix for PEFT OLoRA init
* Fix test for deprecated spmm_coo
* test improvement
* doc update
* typo
* doc cleanup
* docs
* add inference benchmark script
* Add benchmarks, doc update
---------
Co-authored-by: Aarni Koskela <akx@iki.fi>
-
- 02 Dec, 2024 1 commit
-
-
Matthew Douglas authored
* [Build] Add CUDA 12.6.2 build; update 12.5.0 to 12.5.1
* bump cuda-toolkit action version
* Update docs for cuda versions
-
- 30 Sep, 2024 3 commits
-
-
Titus von Koeller authored
-
Titus von Koeller authored
-
Titus von Koeller authored
-
- 24 Sep, 2024 1 commit
-
-
Matthew Douglas authored
* CI/CD: Add step to publish wheels on tag creation
* Remove file
* Restrict pre-release workflow branches
* Update PyPI publishing
* Update PyPI publishing
* Update package workflow name
* continuous pre-release only on main
-
- 31 Jul, 2024 1 commit
-
-
Titus authored
-
- 29 Jul, 2024 1 commit
-
-
Titus authored
-
- 21 Jul, 2024 1 commit
-
-
Matthew Douglas authored
* Add CUDA 12.5 builds and enable CUDA 12.4 on Windows
* Update install doc
-
- 08 Apr, 2024 2 commits
-
-
Matthew Douglas authored
-
Matthew Douglas authored
-
- 11 Mar, 2024 4 commits
-
-
Aarni Koskela authored
-
Aarni Koskela authored
Closes #1114
Co-authored-by: Titus von Koeller <9048635+Titus-von-Koeller@users.noreply.github.com>
-
Aarni Koskela authored
-
Aarni Koskela authored
-
- 08 Mar, 2024 1 commit
-
-
Matthew Douglas authored
* (ci) build with wider CUDA version matrix
* (ci) build with wider CUDA version matrix
* (ci) skip sm_89 target on CUDA 11.7
* (ci) skip sm_90 target on CUDA 11.8
* modify workflow to publish to test.pypi
* (build) Test for manylinux_2_24 build on GH actions
* (build) got that backwards.
* try fixing manual triggering condition for testpypi
* try if Ubuntu 18.04 is an easy fix to allow for `manylinux_2_24` compatibility
* hardcode publish step to run to test publishing
* set ubuntu to newest supported version
* try statically linking libstdc++ to achieve manylinux_2_18
* last commit only brought us to manylinux_2_34, reverse
* add missing permission for publishing to pypi
* snake case deprecated in favor of kebab
* downgrade cuda ubuntu aiming for manylinux_2_24
* add step to upgrade cmake due to old Ubuntu for CUDA build
* adjust path to prefer pip installed cmake
* (cmake) set CMAKE_BUILD_TYPE=Release if unspecified
* default to CMAKE_BUILD_TYPE Release for optimized releases and better many_linux compatibility
* (build) back to ubuntu22.04 docker images
* verify Cmake in separate step
* add clarifying comment about Python version compatibility
* (build) we don't need cmake for wheel step
* fixup testpypi publish to run in PR for testing
* add pypi publishing when tagged on main
* add functionality to rewrite platform tags
* (ci) adjust platform tags for wheels
* fix for windows, get order right.
* fix for windows, get order right.
* (build) slim down those fatbins on windows cuda
* sloppy
* remove broken PyPi upload for now
---------
Co-authored-by: Titus von Koeller <9048635+Titus-von-Koeller@users.noreply.github.com>
-
- 28 Feb, 2024 1 commit
-
-
Matthew Douglas authored
-
- 27 Feb, 2024 2 commits
-
-
Rickard authored
Co-authored-by: wkpark <wkpark@gmail.com>
-
Matthew Douglas authored
* (cmake) Fix generation of targets for nvcc
* Typo
* (ci) linux + CUDA workflow: make sure we specify target architectures
* fix
* fix one more time
* (cmake) Default in CMAKE_CUDA_ARCHITECTURES_ALL when cmake<3.23, make sure we build only selected cubins and only ptx for latest capability
* Fix static lookup for CMAKE_CUDA_ARCHITECTURES_ALL on cmake<3.23
* Remove debug setting
* clarification
-
- 19 Feb, 2024 1 commit
-
-
Rickard authored
-
- 14 Feb, 2024 1 commit
-
-
Won-Kyu Park authored
* CI: fix cuda-toolkit speed issue
* CI: use MSVC instead msbuild to remove 'visual_stuido_integration' dependency
* use Ninja to compile without MS toolset
* use 'network', install 'ninja' only
  Co-authored-by: Rickard <rickardp@users.noreply.github.com>
---------
Co-authored-by: Rickard <rickardp@users.noreply.github.com>
-
- 07 Feb, 2024 1 commit
-
-
Rickard authored
-
- 05 Feb, 2024 1 commit
-
-
Rickard authored
* Make native code portable and add GitHub workflow for building
* Removed deprecated Python versions
* Update python-package.yml
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Update python-package.yml
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Update python-package.yml
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Update python-package.yml
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Update python-package.yml
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Update python-package.yml
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Update python-package.yml
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Update python-package.yml
* Do not test on Python 3.13 until released
* Update python-package.yml
* Update python-package.yml
* Update python-package.yml
* Update python-package.yml
* Refactor build stage
* Fixed breaking actions change
* Slim down Windows cuda
* Create dependabot.yml
* Bespoke local dev requirements.txt
* Enable VS integration
* Group Dependabot updates
* Cleanup
* Update python-package.yml
* Reinstate file that was wrongly merged
* Fixed regression caused by new version of download-artifact
* Update python-package.yml
* Update python-package.yml
* Fix matrix
* Update python-package.yml
* Merge
* Pipeline
* Fixed conflict
* Fixed conflict
* Update CMakeLists.txt
* Fixed merge error
* cleanup
* cleanup
* Find CUDA
* Fix
* Fixing merge error from latest merge from main
* Fix setup.py
* Fixed typo in artifact name
* Remove linker flags
* Build nocublaslt versions
* Fixed formatting
* Fixed VS Code format on save
* Ran format on save from VScode
* Re-saved the json files using the new settings
* Re-saved CMakeLists.txt to get formatting right
* Add path filter
* Formatting
---------
Co-authored-by: Aarni Koskela <akx@iki.fi>
-