PyTorch Custom Operator Integration (#1544)
* Sketch out first custom op registration
* Add note
* Initial int8 op registration
* Clean up some deprecated functions
* Int8 ops updates; tests
* Implement 4bit quant/dequant ops
* Fix nested quant
* cleanup
* Test improvements
* Clean up and improve tests
* Add higher level custom op for int8 matmul + dequant + bias
* Add gemv 4bit custom op
* Cleanup
* Implement out kwarg overloads for custom ops
* Update PyTorch minimum to 2.1
* Deprecation updates
* Deprecation updates
* Cleanup; rename int8_linear_dequant -> int8_scaled_mm
* Bump min pytorch to 2.2
* cleanup
* Test reorganization
* Remove deprecated supports_igemmlt
* More cleanup
* Cleanup obsolete C++/CUDA code
* Cleanup
* Create 'default' backend for fallback op implementations; initial CPU nf4 work
* Stub out for multi-platform
* Fix serialization tests for torch>=2.6.0
* Add example for torch.compile e2e inference
* Test update
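
The 'default' backend entry above describes fallback op implementations. As a rough illustration of that idea only, here is a toy sketch in plain Python. It is not the bitsandbytes code or the PyTorch dispatcher; the names `register`, `dispatch`, `_IMPLS`, and the `scale` op are all hypothetical and were made up for this example.

```python
from typing import Callable, Dict, Tuple

# Hypothetical registry mapping (op name, backend) -> implementation.
_IMPLS: Dict[Tuple[str, str], Callable] = {}


def register(op: str, backend: str = "default") -> Callable:
    """Register the decorated function as the `op` kernel for `backend`."""
    def decorator(fn: Callable) -> Callable:
        _IMPLS[(op, backend)] = fn
        return fn
    return decorator


def dispatch(op: str, backend: str, *args, **kwargs):
    """Call the backend-specific kernel, or the 'default' one if none exists."""
    fn = _IMPLS.get((op, backend)) or _IMPLS.get((op, "default"))
    if fn is None:
        raise NotImplementedError(f"no implementation of {op!r} for {backend!r}")
    return fn(*args, **kwargs)


@register("scale")  # portable fallback, used by any backend without its own kernel
def _scale_default(values, factor):
    return ("default", [v * factor for v in values])


@register("scale", backend="cuda")  # stand-in for a specialized kernel
def _scale_cuda(values, factor):
    return ("cuda", [v * factor for v in values])
```

With this sketch, `dispatch("scale", "cpu", [1, 2], 3)` falls through to the default implementation because no `cpu` kernel is registered, while `dispatch("scale", "cuda", [1, 2], 3)` picks the backend-specific one.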
---------
Co-authored-by: Titus von Koeller <9048635+Titus-von-Koeller@users.noreply.github.com>
bitsandbytes/_ops.py (new file, mode 0 → 100644)