  1. 23 Sep, 2022 1 commit
    • [fix] better handling non-flatten in FSDP (#1072) · 429f3d31
      Min Xu authored
      
      
      * [fix] better handling non-flatten in FSDP
      
      - see the detailed comment about that backward firing case
      - also minor debugging help in FSDP
      - also minor fix in FPW's state dict
      
      * [feat] disallow reset_parameters by default
      
      * [feat] adding fsdp_instances API - useful for checking wrapping in user code
      
      * [fix] one line fix but more than a day of debugging
      
      * fixed the case of loading a combined checkpoint with empty fsdp instances
      
      * fixed another bug in state loading: root/nonroot module full param caching caused by not resharding after forward
      
      * [feat] support .half and .float better
      
      * fixed a bug where gathering optim state loses extra keys from the original state_dict
      
      * fixed a test failure in mixed precision
      
      * fixed another bug affecting no_sync grad acc
      
      * fixed a bug and a test in fsdp optim state
      
      * fixed another corner case
      
      * added a comment
      
      * skip ssd offload tests
      
      * skip fsdp one for ssd offload
      Co-authored-by: Min Xu <min.xu.public@gmail.com>
  2. 08 Aug, 2022 1 commit
  3. 26 May, 2022 1 commit
  4. 02 May, 2022 1 commit
    • [FSDP] ssd_offload fixing backward path (grad_fn) for SsdFlatParameter and SsdFlatParameterView (#974) · 51b53ddb
      Paul Johnson authored
      
      * [FSDP] fixing backward path for SsdFlatParameter and SsdFlatParameterView when overriding .data
      
      * Get ssd_offload unit tests passing
      
      * [FSDP] get all test_fsdp_offload tests passing w/ ssd_offload on
      
      * Update changelog
  5. 26 Apr, 2022 1 commit
  6. 06 Apr, 2022 1 commit
  7. 05 Jan, 2022 1 commit
    • Enabling ssd_offload training basic tests. (#887) · c5e471bc
      Paul Johnson authored
      * Enabling ssd_offload training and test via tests/nn/data_parallel/test_fsdp_offload.py.
      * Removed unused classes: SsdBuffer, SsdTensorHandleView, SsdParameter, SsdTensor
      * Enhance test coverage of test_ssd_offloading_train_flatten_params_wrapper
      * Modifications from PR #887 review comments.
      * Update Changelog
  8. 08 Nov, 2021 1 commit
  9. 01 Nov, 2021 1 commit
    • [feature] Add the low level SSD APIs (#829) · a9fcaa28
      anj-s authored
      * add doc strings
      
      * add lower level SSD APIs and tests
      
      * add the test to the list to be run
      
      * remove unused imports
      
      * more doc string changes
      
      * fix lint errors