1. 08 Apr, 2021 1 commit
    • [DeepSpeed] ZeRO Stage 3 (#10753) · c6d66484
      Stas Bekman authored
      
      
      * synced gpus
      
      * fix
      
      * fix
      
      * need to use t5-small for quality tests
      
      * notes
      
      * complete merge
      
      * fix a disappearing std stream problem
      
      * start zero3 tests
      
      * wip
      
      * tune params
      
      * sorting out the pre-trained model loading
      
      * reworking generate loop wip
      
      * wip
      
      * style
      
      * fix tests
      
      * split the tests
      
      * refactor tests
      
      * wip
      
      * parameterized
      
      * fix
      
      * workout the resume from non-ds checkpoint pass + test
      
      * cleanup
      
      * remove no longer needed code
      
      * split getter/setter functions
      
      * complete the docs
      
      * suggestions
      
      * gpus and their compute capabilities link
      
      * Apply suggestions from code review
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * style
      
      * remove invalid paramgd
      
      * automatically configure zero3 params that rely on hidden size
      
      * make _get_resized_embeddings zero3-aware
      
      * add test exercising resize_token_embeddings()
      
      * add docstring
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
  2. 17 Mar, 2021 1 commit
  3. 16 Mar, 2021 1 commit
  4. 15 Mar, 2021 1 commit
  5. 12 Mar, 2021 1 commit
  6. 24 Feb, 2021 1 commit
  7. 22 Feb, 2021 1 commit
  8. 18 Feb, 2021 1 commit
  9. 17 Feb, 2021 1 commit
  10. 15 Feb, 2021 1 commit
  11. 11 Feb, 2021 1 commit
  12. 10 Feb, 2021 1 commit
  13. 08 Feb, 2021 1 commit