[Triton/XPU] Support 4bit dequantization logic on Triton (#1629)
* [xpu/triton] Add Triton dequantization kernel
This PR adds an XPU backend and a Triton kernel for dequantizing the nf4 dtype.
Triton is an optional import.
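For readers unfamiliar with the format, here is a minimal pure-Python sketch of what NF4 dequantization computes, together with a guard for an optional Triton import. It is not the kernel added by this PR. The function name `dequantize_nf4_reference`, the `HAS_TRITON` flag, the high-nibble-first packing order, and the codebook values are assumptions for illustration and should be checked against the library source.

```python
import importlib.util

# Optional dependency check (hypothetical flag name): this reference path
# runs whether or not Triton is installed.
HAS_TRITON = importlib.util.find_spec("triton") is not None

# 16-entry NF4 codebook (assumed values; verify against bitsandbytes).
NF4_CODE = [
    -1.0, -0.6961928009986877, -0.5250730514526367, -0.39491748809814453,
    -0.28444138169288635, -0.18477343022823334, -0.09105003625154495, 0.0,
    0.07958029955625534, 0.16093020141124725, 0.24611230194568634,
    0.33791524171829224, 0.44070982933044434, 0.5626170039176941,
    0.7229568362236023, 1.0,
]


def dequantize_nf4_reference(packed, absmax, blocksize=64):
    """Unpack two 4-bit indices per byte and scale each by its block's absmax."""
    out = []
    for byte in packed:
        # Assumed packing order: high nibble holds the first element.
        for nibble in (byte >> 4, byte & 0x0F):
            block = len(out) // blocksize  # which absmax entry applies
            out.append(NF4_CODE[nibble] * absmax[block])
    return out
```

A Triton kernel would perform the same lookup and scale per element in parallel; the loop above only pins down the arithmetic.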
Tests:
tests/test_functional.py::TestQuantize4BitFunctional (supported nf4/fp4 cases)
tests/test_functional.py::Test8BitBlockwiseQuantizeFunctional
(quantize_blockwise is implemented with a binary search, which runs faster on XPU)
tests/test_linear4bit.py
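The binary-search approach to blockwise quantization mentioned in the test notes can be sketched in plain Python as follows. This is an illustrative reference under stated assumptions, not the PR's XPU implementation: the names `nearest_code_index` and `quantize_blockwise_reference` are hypothetical, the codebook is assumed to be sorted in ascending order, and each block is normalized by its absolute maximum.

```python
from bisect import bisect_left


def nearest_code_index(code, x):
    """Binary-search a sorted codebook for the index of the entry closest to x."""
    pos = bisect_left(code, x)
    if pos == 0:
        return 0
    if pos == len(code):
        return len(code) - 1
    # x lies between code[pos - 1] and code[pos]; pick the closer neighbour.
    return pos if code[pos] - x < x - code[pos - 1] else pos - 1


def quantize_blockwise_reference(values, code, blocksize=4):
    """Quantize values block by block; returns (indices, per-block absmax)."""
    indices, absmax = [], []
    for start in range(0, len(values), blocksize):
        block = values[start:start + blocksize]
        scale = max(abs(v) for v in block)
        absmax.append(scale)
        for v in block:
            # Normalize into [-1, 1]; an all-zero block maps to the code for 0.
            normalized = v / scale if scale else 0.0
            indices.append(nearest_code_index(code, normalized))
    return indices, absmax
```

The search takes O(log n) comparisons per element instead of scanning the whole codebook, which is the general reason a binary search can be cheaper; the speedup reported for XPU is the PR's claim and is not measured here.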
Signed-off-by: Dmitrii Makarenko <dmitrii.makarenko@intel.com>
* align with ipex code
* enable test for ipex
* test_kbit_backprop: skip no longer needed
* remove unused
---------
Signed-off-by: Dmitrii Makarenko <dmitrii.makarenko@intel.com>