"examples/legacy/vscode:/vscode.git/clone" did not exist on "61e191987d8aa0778e0f44613deaf7ad99253cab"
  1. 07 Jun, 2023 3 commits
  2. 05 Jun, 2023 1 commit
  3. 02 Jun, 2023 1 commit
  4. 31 May, 2023 8 commits
    • remove the extra `accelerator.prepare` (#23914) · d13021e3
      Sourab Mangrulkar authored
      remove the extra `accelerator.prepare` that slipped in with the multiple updates from main 😅
    • Sylvain Gugger authored
    • accelerate deepspeed and gradient accumulation integrate (#23236) · a73b1d59
      Sourab Mangrulkar authored
      * mixed precision support via accelerate
      
      * fix issues
      
      * fix for the sharded ddp case
      
      * fix flax and tf failing tests
      
      * refactor the place to create `Accelerator` object
      
      * move ddp prep to accelerate
      
      * fix 😅
      
      * resolving comments
      
      * move fsdp handling to accelerate
      
      * fixes
      
      * fix saving
      
      * shift torch dynamo handling to accelerate
      
      * shift deepspeed integration and save & load utils to accelerate
      
      * fix accelerate launcher support
      
      * oops
      
      * fix 🐛
      
      * save ckpt fix
      
      * Trigger CI
      
      * nasty 🐛 😅
      
      * as deepspeed needs grad_acc fixes, transfer grad_acc to accelerate
      
      * make tests happy
      
      * quality 
      
      * loss tracked needs to account for grad_acc
      
      * fixing the deepspeed tests
      
      * quality 
      
      * 😅😅😅
      
      * tests 😡
      
      * quality 
      * Trigger CI
      
      * resolve comments and fix the issue with the previous merge from branch
      
      * Trigger CI
      
      * accelerate took over deepspeed integration
      
      ---------
      Co-authored-by: Stas Bekman <stas@stason.org>
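      Since this entry bundles several moves into accelerate, a minimal sketch of the gradient-accumulation pattern the Trainer now delegates may help. It is illustrative only, not code from the PR; the model, optimizer and dataloader are toy placeholders.
      ```python
      # Toy illustration of accelerate-managed gradient accumulation.
      import torch
      from accelerate import Accelerator

      accelerator = Accelerator(gradient_accumulation_steps=4)

      model = torch.nn.Linear(10, 2)
      optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
      dataset = [(torch.randn(10), torch.randint(0, 2, ())) for _ in range(32)]
      dataloader = torch.utils.data.DataLoader(dataset, batch_size=8)

      model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

      for inputs, labels in dataloader:
          # Inside accumulate(), gradients only sync and the prepared optimizer
          # only applies an update every `gradient_accumulation_steps` batches.
          with accelerator.accumulate(model):
              loss = torch.nn.functional.cross_entropy(model(inputs), labels)
              accelerator.backward(loss)
              optimizer.step()
              optimizer.zero_grad()
      ```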
    • 9fea71b4
      Sylvain Gugger authored
    • shift torch dynamo handling to accelerate (#23168) · 03db5910
      Sourab Mangrulkar authored
      * mixed precision support via accelerate
      
      * fix issues
      
      * fix for the sharded ddp case
      
      * fix flax and tf failing tests
      
      * refactor the place to create `Accelerator` object
      
      * move ddp prep to accelerate
      
      * fix 😅
      
      * resolving comments
      
      * move fsdp handling to accelerate
      
      * fixes
      
      * fix saving
      
      * shift torch dynamo handling to accelerate
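      For reference, a hedged sketch of the user-facing side: torch.compile/dynamo is still requested through `TrainingArguments`, and after this change the Trainer hands the actual compilation over to accelerate. Field names are those of recent transformers releases; `output_dir` is a placeholder.
      ```python
      from transformers import TrainingArguments

      # torch.compile is requested here; the wrapping itself is now done by accelerate.
      args = TrainingArguments(
          output_dir="out",                  # placeholder
          torch_compile=True,
          torch_compile_backend="inductor",  # optional; defaults are used if omitted
      )
      ```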
    • move fsdp handling to accelerate (#23158) · 0b774074
      Sourab Mangrulkar authored
      * mixed precision support via accelerate
      
      * fix issues
      
      * fix for the sharded ddp case
      
      * fix flax and tf failing tests
      
      * refactor the place to create `Accelerator` object
      
      * move ddp prep to accelerate
      
      * fix 😅
      
      * resolving comments
      
      * move fsdp handling to accelerate
      
      * fixes
      
      * fix saving
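      A hedged sketch of the configuration side (values illustrative, not from the PR): FSDP is still switched on through `TrainingArguments`, and the Trainer now builds accelerate's FSDP plugin from it rather than wrapping the model itself.
      ```python
      from transformers import TrainingArguments

      args = TrainingArguments(
          output_dir="out",             # placeholder
          fsdp="full_shard auto_wrap",  # shard params/grads/optimizer state, auto-wrap layers
          # finer-grained options (wrap policy, offload, ...) go through `fsdp_config`,
          # a dict or a path to a JSON file
      )
      ```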
    • Smangrul/accelerate ddp integrate (#23151) · 1cf148a6
      Sourab Mangrulkar authored
      * mixed precision support via accelerate
      
      * fix issues
      
      * fix for the sharded ddp case
      
      * fix flax and tf failing tests
      
      * refactor the place to create `Accelerator` object
      
      * move ddp prep to accelerate
      
      * fix 😅
      
      * resolving comments
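      The DDP preparation this entry moves out of the Trainer follows accelerate's standard pattern; a minimal sketch with a toy model, not the PR's code:
      ```python
      import torch
      from accelerate import Accelerator

      accelerator = Accelerator()
      model = torch.nn.Linear(16, 4)
      optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

      # On a multi-process launch (e.g. `accelerate launch train.py`) this returns the
      # model wrapped in DistributedDataParallel and placed on the local device;
      # on a single process it only handles device placement.
      model, optimizer = accelerator.prepare(model, optimizer)
      ```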
    • Smangrul/accelerate mp integrate (#23148) · 9f0646a5
      Sourab Mangrulkar authored
      * mixed precision support via accelerate
      
      * fix issues
      
      * fix for the sharded ddp case
      
      * fix flax and tf failing tests
      
      * refactor the place to create `Accelerator` object
      
      * address comments by removing debugging print statements
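      For context, a hedged sketch of how the mixed-precision request flows after this series: users still set `bf16`/`fp16` on `TrainingArguments`, which internally maps onto accelerate's `mixed_precision` setting. This assumes bf16-capable hardware; `output_dir` is a placeholder.
      ```python
      from accelerate import Accelerator
      from transformers import TrainingArguments

      # Trainer path: the user-facing flag is unchanged ...
      args = TrainingArguments(output_dir="out", bf16=True)  # needs bf16-capable hardware

      # ... and is roughly equivalent to asking accelerate for it directly.
      accelerator = Accelerator(mixed_precision="bf16")
      ```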
  5. 26 May, 2023 1 commit
  6. 25 May, 2023 1 commit
  7. 24 May, 2023 3 commits
    • Fix sagemaker DP/MP (#23681) · 75bbf20b
      Zachary Mueller authored
      * Check for use_sagemaker_dp
      
      * Add a check for is_sagemaker_mp when setting _n_gpu again. Should be last broken thing
      
      * Try explicit check?
      
      * Quality
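      A rough sketch of the kind of guard this fix is about (not the PR diff; the helper below is hypothetical, only the imported check is a real transformers utility): under SageMaker model parallelism each process drives a single partition, so the Trainer must not recount the machine's GPUs.
      ```python
      from transformers.utils import is_sagemaker_mp_enabled

      def n_gpu_for_process(detected_gpu_count: int) -> int:
          """Hypothetical helper mirroring the check: report one GPU per process
          when SageMaker model parallelism is enabled."""
          if is_sagemaker_mp_enabled():
              return 1
          return detected_gpu_count
      ```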
    • Paged Optimizer + Lion Optimizer for Trainer (#23217) · 796162c5
      Tim Dettmers authored
      * Added lion and paged optimizers and made original tests pass.
      
      * Added tests for paged and lion optimizers.
      
      * Added and fixed optimizer tests.
      
      * Style and quality checks.
      
      ---------
      Co-authored-by: younesbelkada <younesbelkada@gmail.com>
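      From the user side the new optimizers are picked by name; a hedged sketch (the optimizer names were added by this PR and need `bitsandbytes` installed at training time; `output_dir` is a placeholder):
      ```python
      from transformers import TrainingArguments

      args = TrainingArguments(
          output_dir="out",
          optim="paged_adamw_32bit",  # other new names include "paged_adamw_8bit",
                                      # "lion_32bit", "lion_8bit",
                                      # "paged_lion_32bit", "paged_lion_8bit"
      )
      ```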
    • 4-bit QLoRA via bitsandbytes (4-bit base model + LoRA) (#23479) · 9d73b922
      Tim Dettmers authored
      * Added lion and paged optimizers and made original tests pass.
      
      * Added tests for paged and lion optimizers.
      
      * Added and fixed optimizer tests.
      
      * Style and quality checks.
      
      * Initial draft. Some tests fail.
      
      * Fixed dtype bug.
      
      * Fixed bug caused by torch_dtype='auto'.
      
      * All test green for 8-bit and 4-bit layers.
      
      * Added fix for fp32 layer norms and bf16 compute in LLaMA.
      
      * Initial draft. Some tests fail.
      
      * Fixed dtype bug.
      
      * Fixed bug caused by torch_dtype='auto'.
      
      * All test green for 8-bit and 4-bit layers.
      
      * Added lion and paged optimizers and made original tests pass.
      
      * Added tests for paged and lion optimizers.
      
      * Added and fixed optimizer tests.
      
      * Style and quality checks.
      
      * Fixing issues for PR #23479.
      
      * Added fix for fp32 layer norms and bf16 compute in LLaMA.
      
      * Reverted variable name change.
      
      * Initial draft. Some tests fail.
      
      * Fixed dtype bug.
      
      * Fixed bug caused by torch_dtype='auto'.
      
      * All test green for 8-bit and 4-bit layers.
      
      * Added lion and paged optimizers and made original tests pass.
      
      * Added tests for paged and lion optimizers.
      
      * Added and fixed optimizer tests.
      
      * Style and quality checks.
      
      * Added missing tests.
      
      * Fixup changes.
      
      * Added fixup changes.
      
      * Missed some variables to rename.
      
      * revert trainer tests
      
      * revert test trainer
      
      * another revert
      
      * fix tests and safety checkers
      
      * protect import
      
      * simplify a bit
      
      * Update src/transformers/trainer.py
      
      * few fixes
      
      * add warning
      
      * replace with `load_in_kbit = load_in_4bit or load_in_8bit`
      
      * fix test
      
      * fix tests
      
      * this time fix tests
      
      * safety checker
      
      * add docs
      
      * revert torch_dtype
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * multiple fixes
      
      * update docs
      
      * version checks and multiple fixes
      
      * replace `is_loaded_in_kbit`
      
      * replace `load_in_kbit`
      
      * change methods names
      
      * better checks
      
      * oops
      
      * oops
      
      * address final comments
      
      ---------
      Co-authored-by: younesbelkada <younesbelkada@gmail.com>
      Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
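      To make the feature concrete, a hedged sketch of the 4-bit loading path plus the LoRA half of QLoRA. The checkpoint name and LoRA target modules are placeholders; it needs `bitsandbytes`, `peft` for the adapter part, and enough GPU memory.
      ```python
      import torch
      from transformers import AutoModelForCausalLM, BitsAndBytesConfig

      bnb_config = BitsAndBytesConfig(
          load_in_4bit=True,
          bnb_4bit_quant_type="nf4",              # NormalFloat4 data type
          bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
          bnb_4bit_compute_dtype=torch.bfloat16,  # matmuls run in bf16 on top of 4-bit weights
      )

      model = AutoModelForCausalLM.from_pretrained(
          "huggyllama/llama-7b",                  # placeholder checkpoint
          quantization_config=bnb_config,
          device_map="auto",
      )

      # The trainable half of QLoRA: LoRA adapters on top of the frozen 4-bit base.
      from peft import LoraConfig, get_peft_model

      lora_config = LoraConfig(
          r=16, lora_alpha=32, lora_dropout=0.05,
          target_modules=["q_proj", "v_proj"],    # placeholder names for a LLaMA-style model
          task_type="CAUSAL_LM",
      )
      model = get_peft_model(model, lora_config)
      ```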
  8. 23 May, 2023 1 commit
  9. 17 May, 2023 1 commit
  10. 16 May, 2023 1 commit
  11. 09 May, 2023 1 commit
  12. 04 May, 2023 1 commit
  13. 02 May, 2023 1 commit
  14. 28 Apr, 2023 2 commits
  15. 21 Apr, 2023 1 commit
  16. 19 Apr, 2023 1 commit
  17. 17 Apr, 2023 1 commit
  18. 07 Apr, 2023 1 commit
  19. 06 Apr, 2023 2 commits
  20. 05 Apr, 2023 1 commit
    • Add thousands separator in training summary (#22583) · 4861c258
      Quentin Meeus authored
      The logger prints a summary at the beginning of training that displays some info such as the number of examples, the number of parameters, the total number of steps, etc. Those numbers can be quite large and hard to read. I added a thousands separator to improve readability for the following:
      - num_examples
      - num_train_epochs
      - per_device_train_batch_size
      - total_train_batch_size
      - max_steps
      - num_trainable_params
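      The change itself boils down to Python's grouping format specifier; a toy illustration with made-up values:
      ```python
      num_examples = 1281167
      total_train_batch_size = 2048
      print(f"  Num examples = {num_examples:,}")                      # Num examples = 1,281,167
      print(f"  Total train batch size = {total_train_batch_size:,}")  # Total train batch size = 2,048
      ```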
  21. 04 Apr, 2023 1 commit
  22. 03 Apr, 2023 3 commits
  23. 29 Mar, 2023 1 commit
  24. 23 Mar, 2023 2 commits