1. 27 Sep, 2022 1 commit
  2. 03 Aug, 2022 1 commit
    • Fix torch version comparisons (#18460) · 02b176c4
      LSinev authored
      Comparisons like
      version.parse(torch.__version__) > version.parse("1.6")
      are True for torch==1.6.0+cu101 or torch==1.6.0+cpu.

      version.parse(version.parse(torch.__version__).base_version) is preferred (and available in pytorch_utils.py).
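      A minimal sketch contrasting the two comparison styles the message describes (variable names here are illustrative):

      ```python
      from packaging import version

      import torch

      # Naive: the local segment ("+cu101", "+cpu") makes "1.6.0+cu101"
      # sort after "1.6", so this is True even on torch 1.6.0 itself.
      naive = version.parse(torch.__version__) > version.parse("1.6")

      # Preferred: base_version drops the local segment
      # ("1.6.0+cu101" -> "1.6.0"), so only the release number is compared.
      torch_base = version.parse(version.parse(torch.__version__).base_version)
      robust = torch_base > version.parse("1.6")
      ```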
  3. 29 Jul, 2022 1 commit
  4. 19 May, 2022 1 commit
  5. 12 May, 2022 1 commit
  6. 30 Mar, 2022 1 commit
  7. 23 Mar, 2022 1 commit
  8. 12 Mar, 2022 1 commit
    • [Deepspeed] add support for bf16 mode (#14569) · 580dd87c
      Stas Bekman authored

      * [WIP] add support for bf16 mode
      
      * prep for bf16
      
      * prep for bf16
      
      * fix; zero2/bf16 is ok
      
      * check bf16 is available
      
      * test fixes
      
      * enable zero3_bf16
      
      * config files
      
      * docs
      
      * split stage_dtype; merge back to non-dtype-specific config file
      
      * fix doc
      
      * cleanup
      
      * cleanup
      
      * bfloat16 => bf16 to match the PR changes
      
      * s/zero_gather_fp16_weights_on_model_save/zero_gather_16bit_weights_on_model_save/; s/save_fp16_model/save_16bit_model/
      
      * test fixes/skipping
      
      * move
      
      * fix
      
      * Update docs/source/main_classes/deepspeed.mdx
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * backticks
      
      * cleanup
      
      * cleanup
      
      * cleanup
      
      * new version
      
      * add note about grad accum in bf16
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
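      A minimal sketch of a bf16-enabled ZeRO config in the family this PR adds, written as a Python dict. The `bf16` key follows the commit's "bfloat16 => bf16" rename; the remaining key names are assumptions based on DeepSpeed conventions of the time, not a verbatim config file from the PR.

      ```python
      # Sketch only: a ZeRO stage-2 setup training in bfloat16 instead of fp16.
      # "auto" values are placeholders filled in by the HF Trainer integration.
      ds_config = {
          "bf16": {
              "enabled": True  # replaces the "fp16" block when training in bf16
          },
          "zero_optimization": {
              "stage": 2,
          },
          "train_micro_batch_size_per_gpu": "auto",
          "gradient_accumulation_steps": "auto",
      }
      ```

      The docs note about gradient accumulation this commit adds presumably warns that, with bf16 enabled, gradients are accumulated in bf16 as well, which is lossier than fp32 accumulation.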
  9. 09 Feb, 2022 1 commit
  10. 10 Jan, 2022 1 commit
  11. 11 Nov, 2021 1 commit
  12. 22 Oct, 2021 1 commit
  13. 14 Oct, 2021 1 commit
  14. 08 Aug, 2021 1 commit
  15. 30 Jul, 2021 1 commit
    • fix typo in gradient_checkpointing arg (#12855) · 5c673efa
      21jun authored
      The help for `ModelArguments.gradient_checkpointing` should be
      "If True, use gradient checkpointing to save memory
      at the expense of slower backward pass."
      not "Whether to freeze the feature extractor layers of the model."
      (which was duplicated from the `freeze_feature_extractor` arg).
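      A minimal sketch of the corrected field, reconstructed from the help strings quoted above; the surrounding example-script class is elided and the default values are illustrative assumptions:

      ```python
      from dataclasses import dataclass, field

      @dataclass
      class ModelArguments:
          # Corrected help text; before the fix it read "Whether to freeze
          # the feature extractor layers of the model.", duplicated from
          # the `freeze_feature_extractor` arg below.
          gradient_checkpointing: bool = field(
              default=False,  # illustrative default
              metadata={
                  "help": "If True, use gradient checkpointing to save memory "
                          "at the expense of slower backward pass."
              },
          )
          freeze_feature_extractor: bool = field(
              default=True,  # illustrative default
              metadata={"help": "Whether to freeze the feature extractor layers of the model."},
          )
      ```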
  16. 23 Jul, 2021 1 commit
  17. 15 Jul, 2021 1 commit
  18. 25 Jun, 2021 1 commit
  19. 14 Jun, 2021 1 commit
  20. 09 Jun, 2021 2 commits
  21. 08 Jun, 2021 1 commit
  22. 12 May, 2021 1 commit
  23. 14 Apr, 2021 1 commit
  24. 30 Mar, 2021 1 commit
  25. 22 Mar, 2021 2 commits
  26. 21 Mar, 2021 4 commits
  27. 19 Mar, 2021 3 commits
  28. 18 Mar, 2021 6 commits