    support a layer that saves outputs · 120b463c
    Matthew Yu authored
    Summary:
    Pull Request resolved: https://github.com/facebookresearch/d2go/pull/417
    
    This diff adds a layer, `CachedLayer`, which is meant to be used with dynamic mixin. The layer runs the original module and clones its output into a dictionary provided by the user.
    
    The main use case is distillation, where we dynamically mix these layers into the layers whose outputs the user needs in order to compute various losses.
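
The caching behavior described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this diff: it uses plain Python objects instead of torch modules, `copy.deepcopy` stands in for cloning a tensor, and the names `Scale`, `CachedLayerSketch`, and `mixin_cached_layer` are hypothetical.

```python
import copy
from typing import Any, Dict, List


class Scale:
    """Hypothetical stand-in for an existing layer (not part of d2go)."""

    def __init__(self, factor: float) -> None:
        self.factor = factor

    def forward(self, xs: List[float]) -> List[float]:
        return [self.factor * x for x in xs]


class CachedLayerSketch:
    """Mixin sketch: run the original forward, then store a copy of the output."""

    def forward(self, *args: Any, **kwargs: Any) -> Any:
        # Delegates to the original layer's forward via the MRO.
        output = super().forward(*args, **kwargs)
        # Assumption: deepcopy stands in for cloning a tensor output.
        self.cache[self.label] = copy.deepcopy(output)
        return output


def mixin_cached_layer(layer: Any, label: str, cache: Dict[str, Any]) -> Any:
    """Hypothetical helper: swap the instance's class for a subclass that
    places the mixin ahead of the original class in the MRO."""
    original_cls = type(layer)
    mixed_cls = type(
        f"Cached{original_cls.__name__}", (CachedLayerSketch, original_cls), {}
    )
    layer.__class__ = mixed_cls
    layer.label = label
    layer.cache = cache  # user-provided dictionary, shared by reference
    return layer


cache: Dict[str, Any] = {}
layer = mixin_cached_layer(Scale(2.0), "scale", cache)
out = layer.forward([1.0, 2.0])
```

After the call, `cache["scale"]` holds a copy of the layer's output under the chosen label, so a loss can later read it without changing what the layer returns to its caller.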
    
    See subsequent diffs for the integration with distillation.
    
    Reviewed By: Minione
    
    Differential Revision: D40285573
    
    fbshipit-source-id: 2058deff8b96f63aebd1e9b9933a5352b5197111
test_modeling_distillation.py 16.9 KB