1. 14 Sep, 2021 1 commit
    • Push to hub when saving checkpoints (#13503) · 3081d386
      Sylvain Gugger authored
      * Push to hub when saving checkpoints
      
      * Add model card
      
      * Revert partial model card
      
      * Small fix for checkpoint
      
      * Add tests
      
      * Add documentation
      
      * Fix tests
      
      * Bump huggingface_hub
      
      * Fix test
  2. 08 Sep, 2021 1 commit
  3. 30 Aug, 2021 1 commit
  4. 25 Jun, 2021 1 commit
  5. 22 Jun, 2021 2 commits
  6. 14 Jun, 2021 1 commit
  7. 02 Jun, 2021 1 commit
  8. 01 Jun, 2021 1 commit
  9. 04 May, 2021 1 commit
  10. 30 Apr, 2021 1 commit
    • [DeepSpeed] fp32 support (#11499) · 4e7bf94e
      Stas Bekman authored
      * prep for deepspeed==0.3.16
      
      * new version
      
      * too soon
      
      * support and test fp32 mode
      
      * troubleshooting doc start
      
      * workaround no longer needed
      
      * add fp32 doc
      
      * style
      
      * cleanup, add tf32 note
      
      * clarify
      
      * release was made
  11. 26 Apr, 2021 1 commit
  12. 21 Apr, 2021 1 commit
  13. 13 Apr, 2021 1 commit
  14. 09 Apr, 2021 1 commit
  15. 08 Apr, 2021 3 commits
  16. 16 Mar, 2021 1 commit
  17. 15 Mar, 2021 1 commit
  18. 12 Mar, 2021 2 commits
  19. 10 Mar, 2021 1 commit
  20. 05 Mar, 2021 1 commit
  21. 25 Feb, 2021 1 commit
  22. 22 Feb, 2021 1 commit
  23. 17 Feb, 2021 1 commit
  24. 11 Feb, 2021 1 commit
  25. 10 Feb, 2021 1 commit
  26. 14 Jan, 2021 1 commit
  27. 13 Jan, 2021 1 commit
    • [trainer] deepspeed integration (#9211) · 2df34f4a
      Stas Bekman authored
      
      
      * deepspeed integration
      
      * style
      
      * add test
      
      * ds wants to do its own backward
      
      * fp16 assert
      
      * Update src/transformers/training_args.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * style
      
      * for clarity extract what args are being passed to deepspeed
      
      * introduce the concept of self.wrapped_model
      
      * s/self.wrapped_model/self.model_wrapped/
      
      * complete transition to self.wrapped_model / self.model
      
      * fix
      
      * doc
      
      * give ds its own init
      
      * add custom overrides, handle bs correctly
      
      * fix test
      
      * clean up model_init logic, fix small bug
      
      * complete fix
      
      * collapse --deepspeed_config into --deepspeed
      
      * style
      
      * start adding doc notes
      
      * style
      
      * implement hf2ds optimizer and scheduler configuration remapping
      
      * oops
      
      * call get_num_training_steps absolutely when needed
      
      * workaround broken auto-formatter
      
      * deepspeed_config arg is no longer needed - fixed in deepspeed master
      
      * use hf's fp16 args in config
      
      * clean
      
      * start on the docs
      
      * rebase cleanup
      
      * finish up --fp16
      
      * clarify the supported stages
      
      * big refactor thanks to discovering deepspeed.init_distributed
      
      * cleanup
      
      * revert fp16 part
      
      * add checkpoint-support
      
      * more init ds into integrations
      
      * extend docs
      
      * cleanup
      
      * unfix docs
      
      * clean up old code
      
      * imports
      
      * move docs
      
      * fix logic
      
      * make it clear which file it's referring to
      
      * document nodes/gpus
      
      * style
      
      * wrong format
      
      * style
      
      * deepspeed handles gradient clipping
      
      * easier to read
      
      * major doc rewrite
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * docs
      
      * switch to AdamW optimizer
      
      * style
      
      * Apply suggestions from code review
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * clarify doc
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
  28. 22 Dec, 2020 1 commit
  29. 07 Dec, 2020 1 commit
  30. 12 Nov, 2020 1 commit
  31. 26 Oct, 2020 1 commit
    • Doc styling (#8067) · 08f534d2
      Sylvain Gugger authored
      * Important files
      
      * Styling them all
      
      * Revert "Styling them all"
      
      This reverts commit 7d029395fdae8513b8281cbc2a6c239f8093503e.
      
      * Styling them for realsies
      
      * Fix syntax error
      
      * Fix benchmark_utils
      
      * More fixes
      
      * Fix modeling auto and script
      
      * Remove new line
      
      * Fixes
      
      * More fixes
      
      * Fix more files
      
      * Style
      
      * Add FSMT
      
      * More fixes
      
      * More fixes
      
      * More fixes
      
      * More fixes
      
      * Fixes
      
      * More fixes
      
      * More fixes
      
      * Last fixes
      
      * Make sphinx happy
  32. 13 Oct, 2020 1 commit
  33. 07 Oct, 2020 1 commit
    • Trainer callbacks (#7596) · 08ba4b49
      Sylvain Gugger authored
      
      
      * Initial callback proposal
      
      * Finish various callbacks
      
      * Post-rebase conflicts
      
      * Fix tests
      
      * Don't use something that's not set
      
      * Documentation
      
      * Remove unwanted print.
      
      * Document all models can work
      
      * Add tests + small fixes
      
      * Update docs/source/internal/trainer_utils.rst
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * Address review comments
      
      * Fix TF tests
      
      * Real fix this time
      
      * This one should work
      
      * Fix typo
      
      * Really fix typo
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
  34. 23 Sep, 2020 1 commit
  35. 11 Sep, 2020 1 commit
  36. 31 Jul, 2020 1 commit