1. 11 Dec, 2023 1 commit
    • Add deepspeed test to amd scheduled CI (#27633) · 39acfe84
      Ella Charlaix authored
      
      
      * add deepspeed scheduled test for amd
      
      * fix image
      
      * add dockerfile
      
      * add comment
      
      * enable tests
      
      * trigger
      
      * remove trigger for this branch
      
      * trigger
      
      * change runner env to trigger the docker build image test
      
      * use new docker image
      
      * remove test suffix from docker image tag
      
      * replace test docker image with original image
      
      * push new image
      
      * Trigger
      
      * add back amd tests
      
      * fix typo
      
      * add amd tests back
      
      * fix
      
      * comment until docker image build scheduled test fix
      
      * remove deprecated deepspeed build option
      
      * upgrade torch
      
      * update docker & make tests pass
      
      * Update docker/transformers-pytorch-deepspeed-amd-gpu/Dockerfile
      
      * fix
      
      * tmp disable test
      
      * precompile deepspeed to avoid timeout during tests
      
      * fix comment
      
      * trigger deepspeed tests with new image
      
      * comment tests
      
      * trigger
      
      * add sklearn dependency to fix slow tests
      
      * enable back other tests
      
      * final update
      
      ---------
      Co-authored-by: Felix Marty <felix@hf.co>
      Co-authored-by: Félix Marty <9808326+fxmarty@users.noreply.github.com>
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
  2. 05 Dec, 2023 2 commits
  3. 21 Nov, 2023 1 commit
  4. 13 Nov, 2023 1 commit
  5. 07 Nov, 2023 1 commit
  6. 06 Nov, 2023 1 commit
  7. 01 Nov, 2023 1 commit
  8. 11 Oct, 2023 1 commit
  9. 05 Oct, 2023 2 commits
  10. 20 Sep, 2023 1 commit
    • Integrate AMD GPU in CI/CD environment (#26007) · 2d71307d
      Funtowicz Morgan authored
      
      
      * Add a Dockerfile for PyTorch + ROCm based on official AMD released artifact
      
      * Add a new artifact single-amdgpu testing on main
      
      * Attempt to test the workflow without merging.
      
      * Changed BERT to check if things are triggered
      
      * Meet the dependencies graph on workflow
      
      * Revert BERT changes
      
      * Add check_runners_amdgpu to correctly mount and check availability
      
      * Rename setup to setup_gpu for CUDA and add setup_amdgpu for AMD
      
      * Fix all the needs.setup -> needs.setup_[gpu|amdgpu] dependencies
      
      * Fix setup dependency graph to use check_runner_amdgpu
      
      * Let's do the runner status check only on AMDGPU target
      
      * Update the Dockerfile.amd to put ourselves in / rather than /var/lib
      
      * Restore the whole setup for CUDA too.
      
      * Let's redisable them
      
      * Change BERT to trigger tests
      
      * Restore BERT
      
      * Add torchaudio with rocm 5.6 to AMD Dockerfile (#26050)
      
      fix dockerfile
      Co-authored-by: Felix Marty <felix@hf.co>
      
      * Place AMD GPU tests in a separate workflow (correct branch) (#26105)
      
      AMDGPU CI lives in another workflow
      
      * Fix invalid job name in dependencies.
      
      * Remove tests multi-amdgpu for now.
      
      * Use single-amdgpu
      
      * Use --net=host for now.
      
      * Remove host networking.
      
      * Removed duplicated check_runners_amdgpu step
      
      * Let's tag machine-types with mi210 for now.
      
      * Machine type should be only mi210
      
      * Remove unnecessary push.branches item
      
      * Apply review suggestions moving from `x-amdgpu` to `x-gpu` introducing `amd-gpu` and `miXXX` labels.
      
      * Remove amdgpu from step names.
      
      * finalize
      
      * delete
      
      ---------
      Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
      Co-authored-by: Felix Marty <felix@hf.co>
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
  11. 24 Aug, 2023 1 commit
  12. 18 Aug, 2023 1 commit
  13. 17 Aug, 2023 1 commit
  14. 10 Aug, 2023 1 commit
  15. 07 Aug, 2023 1 commit
  16. 31 Jul, 2023 1 commit
  17. 13 Jul, 2023 1 commit
  18. 11 Jul, 2023 1 commit
  19. 01 Jul, 2023 1 commit
  20. 30 Jun, 2023 1 commit
  21. 19 Jun, 2023 1 commit
  22. 16 Jun, 2023 1 commit
  23. 19 May, 2023 2 commits
  24. 17 May, 2023 1 commit
  25. 12 May, 2023 2 commits
  26. 11 May, 2023 1 commit
  27. 10 May, 2023 1 commit
  28. 27 Apr, 2023 1 commit
  29. 24 Apr, 2023 1 commit
  30. 19 Apr, 2023 1 commit
  31. 13 Apr, 2023 1 commit
  32. 30 Mar, 2023 1 commit
  33. 24 Mar, 2023 1 commit
  34. 17 Mar, 2023 2 commits
  35. 16 Mar, 2023 1 commit