- 07 Dec, 2021 1 commit
Stas Bekman authored
* [deepspeed] fix load_best_model_at_end
* try with pull_request_target
* revert: try with pull_request_target
* style
* add test
* cleanup

- 06 Dec, 2021 1 commit
Sylvain Gugger authored

- 01 Dec, 2021 2 commits
Stas Bekman authored

Jamie DeAntonis authored
* started bf16 integration
* minor changes
* code now runs
* style
* lay foundation for bf16 testing
* lay foundation for bf16 testing
* start the tests
* better bf16 check
* style
* 2 separate checkers - one for bf16 support, another for bf16+autocast
* Update src/transformers/training_args.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* a couple of comment resolutions
* more comment resolutions
* resolved a small bug
* just some print statements
* added todo marking
* added a todo
* adjust for API change s/fast_dtype/dtype/
* fix style
* merge 2 bf16 util functions
* bf16 now does scaling too
* Add support for bfloat16
* Revert T5 layernorm to float32
This is based on the comment at https://github.com/huggingface/transformers/pull/14448/files#r752660929 and the PyTorch PR https://github.com/pytorch/pytorch/pull/66920 .
* Add comment about conversion to float32 before returning the numpy data
* Add comment about AMP-bfloat16 incompatibility
* Fix formatting
* typo
* reformer / bf16
* cleanup
* require at least pt-1.10
* fix
* will deal with deepspeed separately
* cleanup
* revert
* cleanup
* fp16_full_eval and bf16_full_eval are separate modes
* proper deprecation
* cleanup
* test and fixes
* spelling
* cleanup
* add a note that this API is experimental
Co-authored-by: jamie <jamie@cortx.com>
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: suriya <suriya@cortx.com>
Co-authored-by: Manuel R. Ciosici <manuelrciosici@gmail.com>

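The flag logic this commit introduces (fp16 and bf16 as mutually exclusive modes) can be sketched in plain Python; the function name below is invented for illustration and is not the actual transformers code:

```python
def resolve_mixed_precision_dtype(fp16: bool, bf16: bool) -> str:
    """Sketch of the mutual-exclusion check between the fp16 and bf16 flags."""
    if fp16 and bf16:
        raise ValueError("fp16 and bf16 are mutually exclusive; pick one")
    if bf16:
        # bf16 has the same dynamic range as fp32, but needs recent hardware
        # and, per this commit, at least PyTorch 1.10
        return "bfloat16"
    if fp16:
        return "float16"
    return "float32"
```

The separate `fp16_full_eval` / `bf16_full_eval` switches mentioned in the log control full (non-autocast) half-precision evaluation and follow the same either-or pattern.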
- 23 Nov, 2021 1 commit
Stas Bekman authored
* [deepspeed] zero inference
* only z3 makes sense for inference
* fix and style
* docs
* rework
* fix test
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* responding to suggestions
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

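"only z3 makes sense for inference" because ZeRO stage 3 is the only stage that partitions (and can offload) the model parameters themselves; stages 1-2 only shard optimizer state and gradients, which do not exist at inference time. A hand-written sketch of such a config (not taken from the repo's docs) might look like:

```python
# Hand-written sketch of a ZeRO stage-3 DeepSpeed config for inference.
ds_inference_config = {
    "zero_optimization": {
        "stage": 3,  # only stage 3 partitions the parameters themselves
        "offload_param": {"device": "cpu", "pin_memory": True},
    },
    # batch size used by the engine; no optimizer/scheduler sections needed
    "train_micro_batch_size_per_gpu": 1,
}
```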
- 16 Nov, 2021 1 commit
Valentin authored
* stop training when a finite IterableDataset is exhausted
when using an iterable dataset num_epochs is set to sys.maxsize to make sure all data is consumed; likewise we want to set max_steps high enough but still stop when all data is consumed
(cherry picked from commit 6f0e1d6363153da9051e93acffe1cbab3a3f3b12)
* fix typo flase -> false
* add test for stopping training on exhausted finite iterable dataset
* remove redundant gradient_accumulation_steps
* run make style reformat training_args docstring

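The stopping rule described above can be sketched in plain Python (invented names, not the Trainer's actual loop): with an iterable dataset the epoch count is effectively unbounded, so the loop must also stop when the iterator itself runs dry.

```python
def train(batches, max_steps: int) -> int:
    """Consume up to max_steps batches, stopping early if the data runs out."""
    steps = 0
    it = iter(batches)
    while steps < max_steps:
        try:
            next(it)
        except StopIteration:
            break  # finite iterable exhausted: stop instead of spinning forever
        steps += 1  # stand-in for one optimization step
    return steps
```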
- 05 Nov, 2021 1 commit
Sylvain Gugger authored

- 01 Nov, 2021 1 commit
mathor authored

- 20 Oct, 2021 2 commits
Kwanghee Choi authored
Co-authored-by: jonas <jonas@hpcnt.com>

Robert Stone authored

- 11 Oct, 2021 2 commits
Sylvain Gugger authored

Patrick von Platen authored
[Gradient checkpointing] Correct disabling `find_unused_parameters` in Trainer when gradient checkpointing is enabled (#13961)
* up
* correct test

- 07 Oct, 2021 1 commit
Alex Hedges authored

- 06 Oct, 2021 3 commits
Anton Lozhkov authored

Sylvain Gugger authored

Yanming Wang authored
* Fix logging_nan_inf_filter in torch_xla mode
* Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Fix format
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

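The idea behind `logging_nan_inf_filter` can be sketched in plain Python (the helper name is invented for illustration): non-finite per-step losses are dropped from the running log average so that a single overflow step does not poison the reported loss.

```python
import math

def accumulate_loss_for_logging(step_loss: float, running_sum: float,
                                filter_nan_inf: bool = True) -> float:
    """Add step_loss to the logging accumulator, skipping nan/inf if filtering."""
    if filter_nan_inf and not math.isfinite(step_loss):
        return running_sum  # skip nan/inf instead of accumulating it
    return running_sum + step_loss
```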
- 05 Oct, 2021 1 commit
Zhaofeng Wu authored
* Allow dataset to be an optional argument for (Distributed)LengthGroupedSampler
* Fix

- 27 Sep, 2021 1 commit
Sylvain Gugger authored
Co-authored-by: quantitative-technologies <james.hirschorn@quantitative-technologies.com>

- 26 Sep, 2021 1 commit
Patrick von Platen authored
[Trainer] Make sure shown loss in distributed training is correctly averaged over all workers (#13681)
* push
* improve tr loss gather

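The fix above boils down to this (a plain-Python sketch, with a list standing in for a `torch.distributed` all-gather of each rank's accumulated loss): the displayed loss should be the mean over all workers, not rank 0's local value.

```python
def averaged_shown_loss(per_worker_losses: list[float]) -> float:
    """Mean of the locally accumulated losses gathered from every worker."""
    return sum(per_worker_losses) / len(per_worker_losses)
```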
- 23 Sep, 2021 2 commits
kding1 authored
* update trainer with cpu distributed fine-tuning support.
Signed-off-by: Ding, Ke <ke.ding@intel.com>
* Style.
* refinement on cpu dist training check.
Signed-off-by: Ding, Ke <ke.ding@intel.com>
* style.
Signed-off-by: Ding, Ke <ke.ding@intel.com>
* Test over private field not public one.
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Morgan Funtowicz <funtowiczmo@gmail.com>
Co-authored-by: Funtowicz Morgan <mfuntowicz@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

kding1 authored
* add sigopt hpo to transformers.
Signed-off-by: Ding, Ke <ke.ding@intel.com>
* extend sigopt changes to test code and others.
Signed-off-by: Ding, Ke <ke.ding@intel.com>
* Style.
* fix style for sigopt integration.
Signed-off-by: Ding, Ke <ke.ding@intel.com>
* Add necessary information to run unittests on SigOpt.
Co-authored-by: Morgan Funtowicz <funtowiczmo@gmail.com>

- 22 Sep, 2021 1 commit
Sylvain Gugger authored
* Make gradient_checkpointing a training argument
* Update src/transformers/modeling_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Update src/transformers/configuration_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Fix tests
* Style
* document Gradient Checkpointing as a performance feature
* Small rename
* PoC for not using the config
* Adapt BC to new PoC
* Forgot to save
* Rollout changes to all other models
* Fix typo
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>

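The shape of this change can be sketched with stub classes (not the real transformers code): gradient checkpointing moves from a model-config field to a training argument, with the trainer calling the model's `gradient_checkpointing_enable` method at setup time.

```python
class StubModel:
    """Minimal stand-in for a transformers model, for illustration only."""
    def __init__(self):
        self.is_gradient_checkpointing = False

    def gradient_checkpointing_enable(self):
        self.is_gradient_checkpointing = True

def setup_model(model, gradient_checkpointing: bool):
    # the training argument, not the model config, now drives the decision
    if gradient_checkpointing:
        model.gradient_checkpointing_enable()
    return model
```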
- 17 Sep, 2021 1 commit
Patrick von Platen authored
* finish
* add test
* push
* remove unnecessary code
* up
* correct test
* Update src/transformers/training_args.py

- 14 Sep, 2021 2 commits
elishowk authored
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Sylvain Gugger authored
* Push to hub when saving checkpoints
* Add model card
* Revert partial model card
* Small fix for checkpoint
* Add tests
* Add documentation
* Fix tests
* Bump huggingface_hub
* Fix test

- 09 Sep, 2021 1 commit
Sylvain Gugger authored

- 31 Aug, 2021 1 commit
Sylvain Gugger authored

- 30 Aug, 2021 3 commits
Olatunji Ruwase authored
* Use DS callable API to allow hf_scheduler + ds_optimizer
* Preserve backward-compatibility
* Restore backward compatibility
* Tweak arg positioning
* Tweak arg positioning
* bump the required version
* Undo indent
* Update src/transformers/trainer.py
* style
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

Maxwell Forbes authored

Li-Huai (Allan) Lin authored
* Check None before going through iteration
* Format

- 23 Aug, 2021 1 commit
Philipp Schmid authored
* Barrier -> barrier
* added logger for metrics
* removed stream handler in trainer
* moved handler
* removed streamhandler from trainer
* updated test image and instance type; added datasets version to test
* Update tests/sagemaker/scripts/pytorch/requirements.txt
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

- 19 Aug, 2021 1 commit
Allan Lin authored
* Update torch.utils.data namespaces to the latest.
* Format
* Update Dataloader.
* Style

- 06 Aug, 2021 2 commits
Sylvain Gugger authored
* Fix tied weights on TPU
* Manually tie weights in no trainer examples
* Fix for test
* One last missing
* Getting owned by my scripts
* Address review comments
* Fix test
* Fix tests
* Fix reformer tests

Sylvain Gugger authored
* Initial work
* All auto models
* All tf auto models
* All flax auto models
* Tokenizers
* Add feature extractors
* Fix typos
* Fix other typo
* Use the right config
* Remove old mapping names and update logic in AutoTokenizer
* Update check_table
* Fix copies and check_repo script
* Fix last test
* Add back name
* clean up
* Update template
* Update template
* Forgot a )
* Use alternative to fixup
* Fix TF model template
* Address review comments
* Address review comments
* Style

- 03 Aug, 2021 1 commit
Philip May authored
* fix #12970
* Update tests/test_trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update tests/test_trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update tests/test_trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* remove unnecessary issue link
* fix test formatting
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

- 30 Jul, 2021 1 commit
wulu473 authored
Co-authored-by: Lukas Wutschitz <lukas.wutschitz@microsoft.com>

- 26 Jul, 2021 1 commit
Sylvain Gugger authored

- 21 Jul, 2021 3 commits
Sylvain Gugger authored

Stas Bekman authored

Masatoshi TSUCHIYA authored
* Refer to warmup_ratio when setting warmup_num_steps.
* Add a method to get the number of warmup steps to the TrainingArguments class.
* Fix.
* Fix.

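The warmup-steps helper described above can be sketched as a free function (the real code is a method on TrainingArguments; the standalone form here is for illustration): an explicit `warmup_steps` wins, otherwise `warmup_ratio` is scaled by the total number of training steps.

```python
import math

def get_warmup_steps(warmup_steps: int, warmup_ratio: float,
                     num_training_steps: int) -> int:
    """Return the number of warmup steps from either the count or the ratio."""
    if warmup_steps > 0:
        return warmup_steps  # explicit step count takes precedence
    return math.ceil(num_training_steps * warmup_ratio)
```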