    [feature] [experimental] Layerwise Gradient Scaler (#879) · 52d066a2
    Anupam Bhatnagar authored
    * [skip ci] first commit
    
    * [skip ci] gradient scaler example
    
    * [skip ci] adding feed forward toy example
    
    * [skip ci] adding types
    
    * [skip ci] adding backward hook
    
    * [skip ci] update
    
    * [skip ci] working feed forward example
    
    * [skip ci] working feed forward example
    
    * [skip ci] use named_modules instead of named_children
    
    * [skip ci] adding new file
    
    * [skip ci] clean up
    
    * [skip ci] implement unscale function
    
    * [skip ci] implement unscale function
    
    * [skip ci] removing old file
    
    * [skip ci] removing some more old files
    
    * [skip ci] making unscale function generic
    
    * [skip ci] adding test for vision model
    
    * [skip ci] adding identity layer
    
    * [skip ci] cleanup files
    
    * [skip ci] refactoring
    
    * [skip ci] more refactoring
    
    * [skip ci] added functionality to update scale
    
    * [skip ci] data loader clean up
    
    * [skip ci] implemented inf checks and update scale functions
    
    * [skip ci] code clean up; added test with autocast (does not work at the moment)
    
    * adding documentation
    
    * adding dependency in requirements-dev.txt
    
    * updating pytorch nightly version
    
    * updating changelog
    
    * adding is_cuda_available to test_vision_model
    
    * set same timeout on cpu and gpu
    
    * reverting cpu timeout, skip vision test on cpu
    
    * addressing comments, fixing vision test
    
    * unscale uses in-place matmul
    
    * some more cleanup
pyproject.toml 698 Bytes