    gradient accumulation fusion · 83b1e42f
    Sangkug Lym authored
    remove redundant linear layer class definition
    
    add fuse_gradient_accumulation attribute to weights for simple targeting
    
    address review feedback and clean up the code
    
    arg change
__init__.py 3.56 KB