- 21 Jul, 2025 1 commit
-
-
Matthew Douglas authored
-
- 14 Jul, 2025 11 commits
-
-
Matthew Douglas authored
-
Matthew Douglas authored
Add kernel registration for 8bit and 32bit optimizers
-
Egor Krivov authored
-
Egor Krivov authored
-
Egor Krivov authored
-
Egor Krivov authored
-
Egor Krivov authored
-
Egor Krivov authored
-
Egor Krivov authored
-
Egor Krivov authored
-
Egor Krivov authored
-
- 11 Jul, 2025 2 commits
-
-
Egor Krivov authored
-
Egor Krivov authored
-
- 08 Jul, 2025 2 commits
-
-
Matthew Douglas authored
[XPU] Add inference benchmark for XPU
-
Matthew Douglas authored
fix log
-
- 03 Jul, 2025 1 commit
-
-
jiqing-feng authored
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
-
- 02 Jul, 2025 1 commit
-
-
Egor Krivov authored
-
- 01 Jul, 2025 4 commits
-
-
Michał Górny authored
* Automatically call CMake as part of PEP 517 build

  Call CMake and build the CPU extension when invoking the build via a PEP 517 backend, to ensure that at least some extension is built when users are building from source. This improves consistency with other Python packages and reduces the risk of accidents.

  We are using the `scikit-build-core` setuptools plugin to take care of CMake dependencies and call into CMake. However, we need to modify the `build_py` command to ensure that CMake is called prior to the setuptools command, as otherwise the newly built shared library won't be picked up by `build_py`. Since setuptools is still responsible for collecting the Python package, it also collects all other shared libraries that were built earlier, for example via manual CMake calls as done in the CI pipeline.

  Furthermore, if the user does not have `scikit-build-core` installed and calls `setup.py` directly, we output a warning but continue working as before.

  The logic can be further extended in the future, for example to detect the best COMPUTE_BACKEND default.

  Fixes #1511

* Include C sources and build files in source distribution
* Fix formatting
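
The ordering constraint described in this commit message (CMake must run before `build_py` so that the freshly built shared library is collected) can be illustrated with a minimal sketch. This is not the code from the commit: the commit relies on the `scikit-build-core` setuptools plugin, whereas the sketch below swaps in a plain `subprocess` call to `cmake`. The names `cmake_build` and `with_cmake_first` are hypothetical and exist only for illustration.

```python
import subprocess


def cmake_build(source_dir=".", build_dir="build"):
    """Hypothetical helper: configure and build with the CMake CLI."""
    subprocess.run(["cmake", "-S", source_dir, "-B", build_dir], check=True)
    subprocess.run(["cmake", "--build", build_dir], check=True)


def with_cmake_first(base_cmd, runner=cmake_build):
    """Return a subclass of base_cmd whose run() calls runner() first.

    base_cmd is expected to be a command class with a run() method,
    such as setuptools' build_py. The runner is injectable so the
    ordering can be exercised without CMake installed.
    """

    class BuildPyWithCMake(base_cmd):
        def run(self):
            # Build the native library first so that it already exists
            # when the base command collects package files.
            runner()
            super().run()

    return BuildPyWithCMake
```

In a real `setup.py`, one would presumably pass `setuptools.command.build_py.build_py` as `base_cmd` and register the result through `cmdclass={"build_py": ...}`; how the actual project wires this up is not shown in the log.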
-
Matthew Douglas authored
* Add torch 2.8 rc / 2.9 nightly to tests * Update tests.yml * Update tests.yml
-
Matthew Douglas authored
-
jiqing-feng authored
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
-
- 30 Jun, 2025 1 commit
-
-
Matthew Douglas authored
-
- 27 Jun, 2025 1 commit
-
-
Matthew Douglas authored
* Add CUDA 12.9 to build/test workflows * Downgrade Jimver/cuda-toolkit to v0.2.24 * Update python-package.yml * Update python-package.yml * Update python-package.yml * Update tests.yml * Update tests.yml
-
- 24 Jun, 2025 1 commit
-
-
Aman Gupta authored
-
- 23 Jun, 2025 1 commit
-
-
Aman Gupta authored
-
- 20 Jun, 2025 1 commit
-
-
pnunna93 authored
* Port ROCm changes from multi-backend-refactor branch * Update ops.py * Update functional.py * Update ops.py * Update ops.py * Update ops.py * Update ops.py * Update functional.py * Update ops.py * Update ops.py * Update ops.py * Update ops.py * Update functional.py * Update functional.py * Update functional.py * Update functional.py * Update ops.py * Update ops.py * Update ops.py * Update ops.py * Update ops.py * Update ops.py * Update ops.py * Update ops.py * Update ops.py * Update functional.py * Update functional.py * Update functional.py * Update test_ops.py * Update test_functional.py * Update test_ops.py * Update test_functional.py * Update test_functional.py * Update functional.py * Update functional.py * Update ops.py * Update ops.py * Update test_functional.py * Update test_functional.py * Update cextension.py * Update cuda_specs.py * Update cuda_specs.py * Update test_functional.py * Update test_linear4bit.py * Update test_cuda_setup_evaluator.py * Update test_functional.py * Update modules.py * Update modules.py * Update ops.py * Update test_linear4bit.py * Update ops.py * Update ops.py * Update test_linear4bit.py * Update test_linear4bit.py * Update python-package.yml * Update python-package.yml * Update python-package.yml * Update python-package.yml * Create build-rocm.sh * Update cuda_specs.py * Fix trailing whitespace * Remove conflicts.diff * update for hipblasVersionMajor >=3 * Update test_functional.py * Update test_linear4bit.py * Update test_ops.py * Update main.py * Update test_functional.py * Update test_linear4bit.py * Update test_ops.py * Update test_linear4bit.py * Lint * Lint * Update helpers.py * Update test_functional.py * Update test_linear4bit.py * Update test_ops.py * Lint * Update pythonInterface.cpp * lint fix * lint * Update pythonInterface.cpp * revert permissions change * Fix indentation * Update kernels_hip.cuh * Update kernels.hip * Update ops.hip * Update ops_hip.cuh * Update kernels_hip.cuh * Update kernels.hip * Update 
kernels.hip * Update ops.hip * Update ops_hip.cuh * Update ops.hip * Update CMakeLists.txt * Update functional.py * Update cextension.py * Update cextension.py
---------
Co-authored-by: MISHANMAURYA <118961433+MISHANMAURYA@users.noreply.github.com>
Co-authored-by: MISHANMAUYRA <mishanmaurya31081@gmail.com>
Co-authored-by: amcamd <andrew.chapman@amd.com>
Co-authored-by: Prasanth Nunna <root@banff-cyxtera-s78-1.amd.com>
-
- 19 Jun, 2025 1 commit
-
-
Matthew Douglas authored
-
- 18 Jun, 2025 1 commit
-
-
Chetan Kumar Verma authored
-
- 17 Jun, 2025 1 commit
-
-
Matthew Douglas authored
* Setup XPU CI * CI: expand XPU matrix * test * test * test * test * test * test * test * test * test * test * skip some fp4 tests on hpu * skip some fp4 tests on hpu * skip gemv tests on hpu * test * Additional test patches for HPU * HPU test update * HPU test update * HPU test update * HPU test update * Format
-
- 16 Jun, 2025 1 commit
-
-
Chetan Kumar Verma authored
-
- 13 Jun, 2025 3 commits
-
-
Matthew Douglas authored
* Add clang-format rules * Update clang-format
-
Matthew Douglas authored
-
Matthew Douglas authored
-
- 11 Jun, 2025 3 commits
-
-
वेदांत authored
* doc fix signature for 8-bit optim * required changes * precommit
-
Egor authored
-
Dmitrii Makarenko authored
* [xpu/triton] Add triton dequantization kernel

  This PR adds an XPU backend and a Triton kernel for dequantizing the nf4 dtype. Triton is an optional import.

  Tests:
  - tests/test_functional.py::TestQuantize4BitFunctional: supported nf4/fp4 cases
  - tests/test_functional.py::Test8BitBlockwiseQuantizeFunctional: implemented quantize_blockwise with binary search that works faster for XPU
  - tests/test_linear4bit.py

  Signed-off-by: Dmitrii Makarenko <dmitrii.makarenko@intel.com>

* align with ipex code
* enable test for ipex
* test_kbit_backprop: skip no longer needed
* remove unused

---------
Signed-off-by: Dmitrii Makarenko <dmitrii.makarenko@intel.com>
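
The commit message mentions a `quantize_blockwise` built on binary search. The idea can be sketched in plain Python as follows. This is an assumption-laden illustration, not the Triton kernel from the PR: it assumes a sorted codebook of values in [-1, 1] and per-block absmax scaling, and the helper name `nearest_code_index` is hypothetical.

```python
from bisect import bisect_left


def nearest_code_index(code, x):
    """Index of the entry in the sorted list `code` closest to x (binary search)."""
    i = bisect_left(code, x)
    if i == 0:
        return 0
    if i == len(code):
        return len(code) - 1
    # x lies between code[i - 1] and code[i]; pick the nearer neighbour.
    return i if code[i] - x < x - code[i - 1] else i - 1


def quantize_blockwise(values, code, blocksize):
    """Quantize `values` block by block against a sorted codebook.

    Each block is scaled by its absolute maximum, then every scaled value
    is mapped to the index of the nearest codebook entry.
    Returns (indices, absmax), with one absmax entry per block.
    """
    indices, absmax = [], []
    for start in range(0, len(values), blocksize):
        block = values[start:start + blocksize]
        scale = max(abs(v) for v in block) or 1.0  # avoid dividing by zero
        absmax.append(scale)
        indices.extend(nearest_code_index(code, v / scale) for v in block)
    return indices, absmax
```

Binary search finds the nearest code in O(log n) comparisons per value rather than scanning the whole codebook, which is one plausible reason it would be faster; the log itself does not give the reasoning or any measurements.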
-
- 08 Jun, 2025 1 commit
-
-
Matthew Douglas authored
-
- 06 Jun, 2025 2 commits
-
-
Matthew Douglas authored
-
Matthew Douglas authored
-