1. 08 Mar, 2021 1 commit
      [fix]: handle inputs with containers in mixed precision (#486) · 2e9a14e7
      Min Xu authored
      * [fix]: handle inputs with containers
      
      - this is an issue surfaced by vissl as well
      - fix seems to be super simple
      - also cleaned up two tests so that multiple such tests can run
        back to back (they currently do not)
      
      * cleanup
      
      * fix
      
      * lint
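
      The gist of the fix: when casting inputs to fp16 for mixed precision,
      FSDP must recurse into containers (lists, tuples, dicts) rather than
      assume flat tensor arguments. A minimal sketch of that idea, using a
      hypothetical `cast_floats_to_fp16` helper (not the actual FairScale code):

      ```
      import torch

      def cast_floats_to_fp16(obj):
          # Hypothetical helper: recursively walk containers and cast
          # floating-point tensors to fp16; leave everything else untouched.
          if torch.is_tensor(obj) and obj.is_floating_point():
              return obj.half()
          if isinstance(obj, dict):
              return {k: cast_floats_to_fp16(v) for k, v in obj.items()}
          if isinstance(obj, (list, tuple)):
              return type(obj)(cast_floats_to_fp16(v) for v in obj)
          return obj

      # Nested float tensors get cast; the integer tensor is left as-is.
      batch = {"x": torch.randn(2, 8), "lengths": [torch.tensor([8, 8])]}
      batch = cast_floats_to_fp16(batch)
      ```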
  2. 05 Mar, 2021 1 commit
      [refactor] enhance wrap and auto_wrap (#467) · a05a79bc
      Min Xu authored
      
      
      * [refactor] enhance wrap and auto_wrap
      
      - Two things were done in this PR:
        1. We no longer need to import FSDP in wrap.py, since the wrapper class
           type is now stored in the context.
        2. An `auto_wrap_policy` function can be used to customize the wrapping
           policy for auto_wrap, including module size, a blacklist, and an
           exclude list (see the sketch after this commit message).
      - As a minor side effect, the auto_wrap function got simplified a bit.
      
      * Update fairscale/nn/wrap/auto_wrap.py
      Co-authored-by: Sean Naren <sean@grid.ai>
      
      * addressed comments
      
      * addressed more comments
      Co-authored-by: Sean Naren <sean@grid.ai>
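
      A sketch of what such a policy callable might look like, e.g. wrapping
      only modules above a parameter-count threshold. The exact signature
      fairscale expects may differ; this only illustrates a callable deciding,
      per module, whether auto_wrap should wrap it:

      ```
      import torch.nn as nn

      def size_based_auto_wrap_policy(
          module: nn.Module,
          recurse: bool,
          unwrapped_params: int,
          min_num_params: int = int(1e8),
      ) -> bool:
          # Assumed protocol: auto_wrap consults the policy while walking the
          # module tree, with recurse=True ("descend into children?") and
          # recurse=False ("wrap this module itself?").
          if recurse:
              return True  # keep exploring children
          return unwrapped_params >= min_num_params  # wrap only large modules
      ```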
  3. 02 Mar, 2021 1 commit
      [feat] Add context manager to FSDP for easier child module wrapping (#446) · f3359550
      Sean Naren authored
      This adds a context manager that assists in wrapping child modules with shared defaults.
      Usage:
      ```
      import torch
      from fairscale.nn.misc import enable_wrap, wrap

      with enable_wrap(**handful_of_important_params):
          layer_1 = wrap(torch.nn.Linear(5, 5))
          layer_2 = wrap(torch.nn.Linear(5, 5), flatten_parameters=True)  # override parameters if you'd like

      # without the context manager, wrap is a no-op and returns the plain Linear layer
      layer_1 = wrap(torch.nn.Linear(5, 5))
      ```
      If wrap is called outside the enable_wrap context, it is a no-op and returns the module unchanged. This makes it easier to annotate layers without having to copy parameter changes to every call site.
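
      A rough sketch of the mechanism (not the actual fairscale implementation):
      `enable_wrap` stashes the shared defaults, and `wrap` either applies FSDP
      with those defaults or passes the module through untouched:

      ```
      import contextlib
      from typing import Any, Dict, Optional

      import torch.nn as nn
      from fairscale.nn import FullyShardedDataParallel as FSDP

      _wrap_defaults: Optional[Dict[str, Any]] = None  # set while inside enable_wrap

      @contextlib.contextmanager
      def enable_wrap(**defaults):
          # Record the shared wrapper kwargs for the duration of the block.
          global _wrap_defaults
          _wrap_defaults = defaults
          try:
              yield
          finally:
              _wrap_defaults = None

      def wrap(module: nn.Module, **overrides) -> nn.Module:
          # Inside the context: wrap with FSDP, per-call overrides win.
          # Outside the context: no-op, return the module unchanged.
          if _wrap_defaults is None:
              return module
          return FSDP(module, **{**_wrap_defaults, **overrides})
      ```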