- 15 Sep, 2025 2 commits
-
-
YangKai0616 authored
* Implemented 32bit optimizers in triton
* Modify Comments
* Optimizing pure torch implementation
* Restore the order of parameters and modify the position of pure pytorch implementation
* Restore files permissions
---------
Co-authored-by: Fanli Lin <fanli.lin@intel.com>
-
Matthew Douglas authored
-
- 09 Sep, 2025 1 commit
-
-
Matthew Douglas authored
* Test suite improvements for MPS/XPU/HPU
* Skip test on torch==2.8.0+cpu for Windows regression
-
- 08 Sep, 2025 2 commits
-
-
Matthew Douglas authored
-
Matthew Douglas authored
* Add parametrize util for targeting parameters outside of nn.Linear modules
* Parametrize 4bit: replace existing prequantized weight
* cleanup
* Add caching for parametrization
* Add tests
* Fix tests
* Guard for torch < 2.5
* Guard for torch < 2.5
* Another test guard for torch >= 2.5
-
- 03 Sep, 2025 2 commits
-
-
kaixuanliu authored
* for intel xpu case, use MatMul8bitFp even when ipex is not used
  Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
* fix lint issue
  Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
---------
Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
-
jiqing-feng authored
* add int mm for xpu after torch 2.9
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* add packaging on pyproject
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
-
- 02 Sep, 2025 1 commit
-
-
Yuanyuan Chen authored
* Fix unused variable warnings and other ruff warnings
  Signed-off-by: cyy <cyyever@outlook.com>
* Fix format
  Signed-off-by: cyy <cyyever@outlook.com>
---------
Signed-off-by: cyy <cyyever@outlook.com>
-
- 25 Aug, 2025 1 commit
-
-
Yuanyuan Chen authored
Signed-off-by: cyy <cyyever@outlook.com>
-
- 11 Aug, 2025 4 commits
-
-
Matthew Douglas authored
-
Matthew Douglas authored
-
Matthew Douglas authored
-
Matthew Douglas authored
-
- 06 Aug, 2025 2 commits
-
-
Matthew Douglas authored
[CUDA] Fixing quantization uint8 packing bug for NF4 and FP4
-
Matthew Douglas authored
Fix Params4bit tensor subclass handling
-
- 04 Aug, 2025 1 commit
-
-
ved1beta authored
-
- 02 Aug, 2025 2 commits
-
-
ved1beta authored
-
Mohamed Hisham authored
-
- 31 Jul, 2025 1 commit
-
-
ved1beta authored
-
- 21 Jul, 2025 4 commits
-
-
Matthew Douglas authored
Add Volta support in cu128/cu129 builds
-
Matthew Douglas authored
-
Matthew Douglas authored
Create FUNDING.yml
-
Matthew Douglas authored
-
- 14 Jul, 2025 11 commits
-
-
Matthew Douglas authored
-
Matthew Douglas authored
Add kernel registration for 8bit and 32bit optimizers
-
Egor Krivov authored
-
Egor Krivov authored
-
Egor Krivov authored
-
Egor Krivov authored
-
Egor Krivov authored
-
Egor Krivov authored
-
Egor Krivov authored
-
Egor Krivov authored
-
Egor Krivov authored
-
- 11 Jul, 2025 2 commits
-
-
Egor Krivov authored
-
Egor Krivov authored
-
- 08 Jul, 2025 2 commits
-
-
Matthew Douglas authored
[XPU] Add inference benchmark for XPU
-
Matthew Douglas authored
fix log
-
- 03 Jul, 2025 1 commit
-
-
jiqing-feng authored
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
-
- 02 Jul, 2025 1 commit
-
-
Egor Krivov authored
-