1. 09 Jan, 2024 1 commit
  2. 11 Dec, 2023 1 commit
    • Add deepspeed test to amd scheduled CI (#27633) · 39acfe84
      Ella Charlaix authored
      
      
      * add deepspeed scheduled test for amd
      
      * fix image
      
      * add dockerfile
      
      * add comment
      
      * enable tests
      
      * trigger
      
      * remove trigger for this branch
      
      * trigger
      
      * change runner env to trigger the docker build image test
      
      * use new docker image
      
      * remove test suffix from docker image tag
      
      * replace test docker image with original image
      
      * push new image
      
      * Trigger
      
      * add back amd tests
      
      * fix typo
      
      * add amd tests back
      
      * fix
      
      * comment until docker image build scheduled test fix
      
      * remove deprecated deepspeed build option
      
      * upgrade torch
      
      * update docker & make tests pass
      
      * Update docker/transformers-pytorch-deepspeed-amd-gpu/Dockerfile
      
      * fix
      
      * tmp disable test
      
      * precompile deepspeed to avoid timeout during tests
      
      * fix comment
      
      * trigger deepspeed tests with new image
      
      * comment tests
      
      * trigger
      
      * add sklearn dependency to fix slow tests
      
      * enable back other tests
      
      * final update
      
      ---------
      Co-authored-by: Felix Marty <felix@hf.co>
      Co-authored-by: Félix Marty <9808326+fxmarty@users.noreply.github.com>
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
  3. 09 Nov, 2023 2 commits
  4. 13 Sep, 2023 1 commit
  5. 05 Sep, 2023 1 commit
    • deepspeed resume from ckpt fixes and adding support for deepspeed optimizer... · 6bc517cc
      Sourab Mangrulkar authored
      deepspeed resume from ckpt fixes and adding support for deepspeed optimizer and HF scheduler (#25863)
      
      * Add support for deepspeed optimizer and HF scheduler
      
      * fix bug
      
      * fix the import
      
      * fix issue with deepspeed scheduler saving for hf optim + hf scheduler scenario
      
      * fix loading of hf scheduler when loading deepspeed checkpoint
      
      * fix import of `DeepSpeedSchedulerWrapper`
      
      * add tests
      
      * add the comment and skip the failing tests
      
      * address comment
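      A hedged sketch of the "DeepSpeed optimizer + HF scheduler" combination the commit above enables: the DeepSpeed config defines an optimizer but omits the scheduler block, so the Trainer falls back to its own (HF) scheduler. The `"auto"` placeholders follow the usual transformers/DeepSpeed convention of being filled from TrainingArguments; treat the exact values as illustrative, not a definitive config.

      ```python
      # Illustrative DeepSpeed config: optimizer defined, "scheduler" omitted,
      # so the HF-side scheduler (e.g. linear warmup) is used instead.
      ds_config = {
          "optimizer": {
              "type": "AdamW",
              "params": {"lr": "auto", "weight_decay": "auto"},
          },
          # no "scheduler" key here -> Trainer supplies the HF scheduler
          "zero_optimization": {"stage": 2},
          "bf16": {"enabled": "auto"},
      }

      assert "scheduler" not in ds_config
      ```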
  6. 25 Aug, 2023 1 commit
  7. 31 May, 2023 1 commit
    • accelerate deepspeed and gradient accumulation integrate (#23236) · a73b1d59
      Sourab Mangrulkar authored
      * mixed precision support via accelerate
      
      * fix issues
      
      * fix for the sharded ddp case
      
      * fix flax and tf failing tests
      
      * refactor the place to create `Accelerator` object
      
      * move ddp prep to accelerate
      
      * fix 😅
      
      * resolving comments
      
      * move fsdp handling to accelerate
      
      * fixes
      
      * fix saving
      
      * shift torch dynamo handling to accelerate
      
      * shift deepspeed integration and save & load utils to accelerate
      
      * fix accelerate launcher support
      
      * oops
      
      * fix 🐛
      
      * save ckpt fix
      
      * Trigger CI
      
      * nasty 🐛 😅
      
      * as deepspeed needs grad_acc fixes, transfer grad_acc to accelerate
      
      * make tests happy
      
      * quality 
      
      * loss tracked needs to account for grad_acc
      
      * fixing the deepspeed tests
      
      * quality 
      
      * 😅😅😅
      
      * tests 😡
      
      * quality 
      
      
      
      * Trigger CI
      
      * resolve comments and fix the issue with the previous merge from branch
      
      * Trigger CI
      
      * accelerate took over deepspeed integration
      
      ---------
      Co-authored-by: default avatarStas Bekman <stas@stason.org>
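      The "loss tracked needs to account for grad_acc" bullet above can be illustrated with a minimal, framework-free sketch: when accumulating over N micro-batches, each micro-batch loss is scaled by 1/N before the backward pass, so the value that gets logged must not double-apply that scaling. `accumulate` and its arguments are hypothetical names for illustration, not transformers or accelerate API.

      ```python
      # Pure-Python sketch of gradient-accumulation loss bookkeeping.
      def accumulate(losses, accumulation_steps):
          """Return (per-step scaled losses, tracked loss to report)."""
          # what each backward() call sees: loss scaled by 1/N
          scaled = [l / accumulation_steps for l in losses]
          # what should be logged: the mean of the unscaled losses
          tracked = sum(losses) / len(losses)
          return scaled, tracked

      scaled, tracked = accumulate([2.0, 4.0], accumulation_steps=2)
      # scaled == [1.0, 2.0]; tracked == 3.0
      ```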
  8. 11 Apr, 2023 1 commit
  9. 09 Mar, 2023 1 commit
  10. 23 Feb, 2023 1 commit
  11. 22 Feb, 2023 1 commit
  12. 08 Feb, 2023 1 commit
  13. 06 Feb, 2023 1 commit
    • Update quality tooling for formatting (#21480) · 6f79d264
      Sylvain Gugger authored
      * Result of black 23.1
      
      * Update target to Python 3.7
      
      * Switch flake8 to ruff
      
      * Configure isort
      
      * Configure isort
      
      * Apply isort with line limit
      
      * Put the right black version
      
      * adapt black in check copies
      
      * Fix copies
  14. 16 Jun, 2022 1 commit
  15. 06 Jun, 2022 1 commit
  16. 03 Jun, 2022 1 commit
  17. 02 Jun, 2022 1 commit
  18. 10 May, 2022 1 commit
    • [Deepspeed] add many more models to the model zoo test (#12695) · f8615044
      Stas Bekman authored
      * model zoo take 2
      
      * add deberta
      
      * new param for zero2
      
      * doc update
      
      * doc update
      
      * add layoutlm
      
      * bump deepspeed
      
      * add deberta-v2, funnel, longformer
      
      * new models
      
      * style
      
      * add t5_v1
      
      * update TAPAS status
      
      * reorg problematic models
      
      * move doc to another PR
      
      * style
      
      * fix checkpoint check test
      
      * making progress on more models running
      
      * cleanup
      
      * new version
      
      * cleanup
  19. 15 Apr, 2022 1 commit
  20. 23 Mar, 2022 1 commit
    • Reorganize file utils (#16264) · 4975002d
      Sylvain Gugger authored
      * Split file_utils in several submodules
      
      * Fixes
      
      * Add back more objects
      
      * More fixes
      
      * Who exactly decided to import that from there?
      
      * Second suggestion to code with code review
      
      * Revert wrong move
      
      * Fix imports
      
      * Adapt all imports
      
      * Adapt all imports everywhere
      
      * Revert this import, will fix in a separate commit
  21. 12 Mar, 2022 1 commit
    • [Deepspeed] add support for bf16 mode (#14569) · 580dd87c
      Stas Bekman authored
      
      
      * [WIP] add support for bf16 mode
      
      * prep for bf16
      
      * prep for bf16
      
      * fix; zero2/bf16 is ok
      
      * check bf16 is available
      
      * test fixes
      
      * enable zero3_bf16
      
      * config files
      
      * docs
      
      * split stage_dtype; merge back to non-dtype-specific config file
      
      * fix doc
      
      * cleanup
      
      * cleanup
      
      * bfloat16 => bf16 to match the PR changes
      
      * s/zero_gather_fp16_weights_on_model_save/zero_gather_16bit_weights_on_model_save/; s/save_fp16_model/save_16bit_model/
      
      * test fixes/skipping
      
      * move
      
      * fix
      
      * Update docs/source/main_classes/deepspeed.mdx
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * backticks
      
      * cleanup
      
      * cleanup
      
      * cleanup
      
      * new version
      
      * add note about grad accum in bf16
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
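      A minimal sketch of what the bf16 mode added above looks like on the DeepSpeed-config side, using the post-rename key names the commit mentions (`bf16`, and the 16bit weight-gathering option for ZeRO-3). The exact key `stage3_gather_16bit_weights_on_model_save` is my recollection of the DeepSpeed config schema; verify it against your DeepSpeed version.

      ```python
      # Illustrative DeepSpeed config fragment enabling bf16 with ZeRO-3.
      ds_bf16_config = {
          "bf16": {"enabled": True},  # replaces the fp16 block when training in bf16
          "zero_optimization": {
              "stage": 3,
              # renamed from the old *_fp16_* option in this era of DeepSpeed
              "stage3_gather_16bit_weights_on_model_save": True,
          },
      }
      ```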
  22. 02 Mar, 2022 1 commit
  23. 23 Feb, 2022 1 commit
  24. 03 Feb, 2022 1 commit
  25. 07 Dec, 2021 1 commit
  26. 23 Nov, 2021 1 commit
  27. 11 Nov, 2021 1 commit
  28. 08 Nov, 2021 1 commit
  29. 30 Aug, 2021 1 commit
  30. 23 Jul, 2021 1 commit
  31. 14 Jul, 2021 1 commit
  32. 13 Jul, 2021 1 commit
  33. 22 Jun, 2021 1 commit
  34. 08 Jun, 2021 2 commits
  35. 04 Jun, 2021 1 commit
  36. 02 Jun, 2021 2 commits
  37. 01 Jun, 2021 1 commit