    PyTorch Custom Operator Integration (#1544) · e82f72b3
    Matthew Douglas authored
    
    
    * Sketch out first custom op registration
    
    * Add note
    
    * Initial int8 op registration
    
    * Clean up some deprecated functions.
    
    * Int8 ops updates; tests
    
    * Implement 4bit quant/dequant ops
    
    * Fix nested quant
    
    * cleanup
    
    * Test improvements
    
    * Clean up and improve tests
    
    * Add higher level custom op for int8 matmul + dequant + bias
    
    * Add gemv 4bit custom op
    
    * Cleanup
    
    * Implement out kwarg overloads for custom ops
    
    * Update PyTorch minimum to 2.1
    
    * Deprecation updates
    
    * Deprecation updates
    
    * Cleanup; rename int8_linear_dequant -> int8_scaled_mm
    
    * Bump min pytorch to 2.2
    
    * cleanup
    
    * Test reorganization
    
    * Remove deprecated supports_igemmlt
    
    * More cleanup
    
    * Cleanup obsolete C++/CUDA code
    
    * Cleanup
    
    * Create 'default' backend for fallback op implementations; initial CPU nf4 work
    
    * Stub out for multi-platform
    
    * Fix serialization tests for torch>=2.6.0
    
    * Add example for torch.compile e2e inference
    
    * Test update
    
    ---------
    Co-authored-by: Titus von Koeller <9048635+Titus-von-Koeller@users.noreply.github.com>
test_linear8bitlt.py 7.48 KB