    [core] Layerwise Upcasting (#10347) · beacaa55
    Aryan authored
    
    
    * update
    
    * update
    
    * make style
    
    * remove dynamo disable
    
    * add coauthor
    Co-Authored-By: Dhruv Nair <dhruv.nair@gmail.com>
    
    * update
    
    * update
    
    * update
    
    * update mixin
    
    * add some basic tests
    
    * update
    
    * update
    
    * non_blocking
    
    * improvements
    
    * update
    
    * norm.* -> norm
    
    * apply suggestions from review
    
    * add example
    
    * update hook implementation to the latest changes from pyramid attention broadcast
    
    * deinitialize should raise an error
    
    * update doc page
    
    * Apply suggestions from code review
    Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
    
    * update docs
    
    * update
    
    * refactor
    
    * fix _always_upcast_modules for asym ae and vq_model
    
    * fix lumina embedding forward to not depend on weight dtype
    
    * refactor tests
    
    * add simple lora inference tests
    
    * _always_upcast_modules -> _precision_sensitive_module_patterns
    
    * remove todo comments about review; revert changes to self.dtype in unets because .dtype on ModelMixin should be able to handle fp8 weight case
    
    * check layer dtypes in lora test
    
    * fix UNet1DModelTests::test_layerwise_upcasting_inference
    
    * _precision_sensitive_module_patterns -> _skip_layerwise_casting_patterns based on feedback
    
    * skip test in NCSNppModelTests
    
    * skip tests for AutoencoderTinyTests
    
    * skip tests for AutoencoderOobleckTests
    
    * skip tests for UNet1DModelTests - unsupported pytorch operations
    
    * layerwise_upcasting -> layerwise_casting
    
    * skip tests for UNetRLModelTests; needs next pytorch release for currently unimplemented operation support
    
    * add layerwise fp8 pipeline test
    
    * use xfail
    
    * Apply suggestions from code review
    Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
    
    * add assertion with fp32 comparison; add tolerance to fp8-fp32 vs fp32-fp32 comparison (required for a few models' test to pass)
    
    * add note about memory consumption on tesla CI runner for failing test
    
    ---------
    Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
    Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
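    
    The "add example" and "update doc page" commits above refer to a usage example for this feature. As a minimal sketch (assuming the `enable_layerwise_casting` method this PR adds to `ModelMixin`; the CogVideoX checkpoint is used only for illustration), enabling layerwise casting could look like this:
    
    ```python
    import torch
    from diffusers import CogVideoXPipeline, CogVideoXTransformer3DModel
    from diffusers.utils import export_to_video
    
    # Load the transformer in the compute dtype (bfloat16 here).
    transformer = CogVideoXTransformer3DModel.from_pretrained(
        "THUDM/CogVideoX-5b", subfolder="transformer", torch_dtype=torch.bfloat16
    )
    
    # Store weights in fp8 and upcast each layer to bfloat16 just before its forward pass.
    # Precision-sensitive modules matching the model's _skip_layerwise_casting_patterns
    # (e.g. norm layers) are skipped and keep their original dtype.
    transformer.enable_layerwise_casting(
        storage_dtype=torch.float8_e4m3fn, compute_dtype=torch.bfloat16
    )
    
    pipe = CogVideoXPipeline.from_pretrained(
        "THUDM/CogVideoX-5b", transformer=transformer, torch_dtype=torch.bfloat16
    ).to("cuda")
    
    video = pipe(prompt="A panda playing a guitar by a river").frames[0]
    export_to_video(video, "output.mp4", fps=8)
    ```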