- 13 Apr, 2023 1 commit
Stas Bekman authored
* [trainer] update url
* style
-
- 12 Apr, 2023 1 commit
Michael Benayoun authored
`torch.distributed` group initialization for `torch_neuron` disabled when `optimum-neuron` is installed (#22728)
* Make the process group initialization not happen if optimum_neuron is installed
* Add warning
* Remove list and added warning
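For reference, a rough sketch of the guard described above, assuming `optimum-neuron` is importable as `optimum.neuron` (the module path and surrounding setup are illustrative, not the exact trainer code):

```python
import importlib.util
import warnings

import torch.distributed as dist

def maybe_init_process_group(backend: str = "xla") -> None:
    # Assumption: when optimum-neuron is installed it manages the process
    # group itself, so transformers should not initialize it again.
    try:
        has_optimum_neuron = importlib.util.find_spec("optimum.neuron") is not None
    except ModuleNotFoundError:
        has_optimum_neuron = False
    if has_optimum_neuron:
        warnings.warn("optimum-neuron detected; skipping torch.distributed group initialization.")
        return
    if dist.is_available() and not dist.is_initialized():
        dist.init_process_group(backend=backend)  # "xla" requires torch_xla / torch_neuron
```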
-
- 04 Apr, 2023 1 commit
Viktor Scherbakov authored
* implemented safetensors save/load
* remove duplicated file
* added tests
* more tests
* style fix
* fix tf tests
* change to list comprehension
* review fixes + safe load for sharded checkpoint
* style fix
* remove rogue import
* remove partial to avoid undefined exception
* use naming alias instead of safetensors.torch
* fix safe sharding in tests
* grammar
* update docs
* update docs
* minor corrections
* style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
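Assuming this landed as the `save_safetensors` flag on `TrainingArguments` (flag name per this PR as read here; older releases will not accept it), opting in looks roughly like:

```python
from transformers import TrainingArguments

# Sketch: have Trainer write checkpoints with safetensors serialization,
# including safe loading of sharded checkpoints per the bullets above.
args = TrainingArguments(
    output_dir="out",
    save_safetensors=True,  # model.safetensors instead of pytorch_model.bin
    save_strategy="epoch",
)
```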
-
- 24 Mar, 2023 1 commit
Stas Bekman authored
-
- 20 Mar, 2023 2 commits
heya5 authored
Update training_args.py
-
Pasquale Minervini authored
Update training_args.py: a nightly install is no longer required for `torch.compile`.
-
- 14 Mar, 2023 1 commit
Stas Bekman authored
* [trainer] add --optim adamw_torch_fused
* change optim default
* deal with non-torch
* revert default change; prep; add fp16/amp assert
* typo
* typo
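A minimal sketch of selecting the new fused optimizer (requires a PyTorch build that ships the fused AdamW kernels; per the commit it is also guarded against unsupported fp16/amp setups):

```python
from transformers import TrainingArguments

# Equivalent to passing --optim adamw_torch_fused on the command line.
args = TrainingArguments(
    output_dir="out",
    optim="adamw_torch_fused",
)
```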
-
- 13 Mar, 2023 1 commit
Sylvain Gugger authored
* Remove backend enforcement for torch.compile
* Update error
* Update src/transformers/training_args.py
* Apply suggestions from code review
* Style

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
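With the enforcement removed, any backend registered with `torch._dynamo` should be selectable; a hedged sketch using the `torch_compile*` argument names:

```python
from transformers import TrainingArguments

# Sketch: turn on torch.compile and choose a backend explicitly.
args = TrainingArguments(
    output_dir="out",
    torch_compile=True,
    torch_compile_backend="inductor",  # no longer restricted to a fixed list
)
```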
-
- 09 Mar, 2023 2 commits
aws-sangeetha authored
Co-authored-by: EC2 Default User <ec2-user@ip-172-31-42-72.us-west-2.compute.internal>
-
Sylvain Gugger authored
* Add setters by type of args to TrainingArguments
* Define more setters
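The grouped setters presumably let you reconfigure a `TrainingArguments` instance by topic after construction; a sketch assuming method names like `set_training` and `set_logging` from this PR (exact parameters may differ):

```python
from transformers import TrainingArguments

# Sketch only: method and parameter names are assumptions based on the PR title.
args = TrainingArguments(output_dir="out")
args.set_training(learning_rate=1e-4, batch_size=32)
args.set_logging(strategy="steps", steps=100)
```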
-
- 22 Feb, 2023 2 commits
Sylvain Gugger authored
* Respect documentation on passive log level
* Fix test and set log level in examples
* Add doc
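For context, `log_level="passive"` (the documented default being respected here) leaves the transformers logging verbosity untouched, while an explicit level overrides it on the main process:

```python
from transformers import TrainingArguments

passive_args = TrainingArguments(output_dir="out", log_level="passive")  # don't touch library verbosity
verbose_args = TrainingArguments(output_dir="out", log_level="info")     # force INFO on the main process
```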
-
Aaron Gokaslan authored
-
- 20 Feb, 2023 1 commit
AlexWertheim authored
* Reinserted import statement accidentally removed during rebasing.
* Added auto_wrap functionality, restructured XLA FSDP logic to more closely match PyTorch FSDP logic.
* Fixed flag descriptions; changed several instances of fsdp_ to xla_fsdp_; pass in auto_wrap_policy and auto_wrapper_callable directly to avoid lambda saving.
* Moved XLA FSDP logic to be adjacent to Fairscale FSDP logic in trainer.
* Formatted changes in accordance with HF style requirements.
* Added back in warning which was accidentally removed.
* Merged XLA FSDP training arguments into `fsdp_config`; added `xla` boolean flag to `fsdp_config` to specify XLA FSDP wrapping; merged XLA FSDP wrapping logic into the FSDP wrapping logic within the trainer class.
* Cleaned up errors, moved argument to fsdp_config:
  - Set `xla` and `xla_fsdp_grad_ckpt` flags by default in fsdp_config
  - Added missing colons following conditionals
  - Moved `fsdp_transformer_layer_cls_to_wrap` to `fsdp_config`
  - Modified `fsdp_transformer_layer_cls_to_wrap` to be a list of strings, not just one string
  - Changed Fairscale FSDP logic to allow for a set of layer classes to wrap
  - Removed unnecessary checks for `xla_fsdp`
* Corrected small errors, improved layer class flag:
  - Correctly set default values for `xla` and `xla_fsdp_grad_ckpt` arguments
  - Made `fsdp_transformer_layer_cls_to_wrap` a list of strings instead of a single string
  - Added processing to ensure that `fsdp_transformer_layer_cls_to_wrap` works as expected if passed as a single string
  - Updated PyTorch FSDP logic to accept a list of layers to wrap, as done with XLA FSDP
  - Replaced instances of `getattr()` with `.get()` for dictionary retrievals with default values, including when setting `fsdp_min_num_params`
  - Corrected `self.fsdp is not None` to `len(self.fsdp) > 0`
  - Removed extraneous `xla_fsdp` argument descriptions from outside `fsdp_config`
* Changed xla-fsdp-settings to be entered directly as a dictionary instead of loaded through a JSON file; made small style corrections.
* Reverted unintentional local_rank TPU check.
* Do not block XLA FSDP if local rank is -1.
* Rebased and applied automatic formatting changes via `make style`.
* Applied automatic formatting with latest version of black.
* Replaced expression with
* Reran black and ruff over examples, tests, src, and utils; ran `make autogenerate_code`.
* Additional automatic formatting changes.
* Remove unnecessary whitespace characters from src/transformers/training_args.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
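Putting the merged arguments together, enabling XLA FSDP through `fsdp_config` would look roughly like this (key names are taken from the commit message; the layer class and other values are illustrative):

```python
from transformers import TrainingArguments

# Sketch: XLA FSDP wrapping driven entirely by fsdp_config, as merged above.
args = TrainingArguments(
    output_dir="out",
    fsdp="full_shard",
    fsdp_config={
        "xla": True,                                          # use the XLA FSDP implementation
        "xla_fsdp_grad_ckpt": True,                           # gradient checkpointing under XLA FSDP
        "fsdp_transformer_layer_cls_to_wrap": ["GPT2Block"],  # now a list of class names
    },
)
```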
-
- 07 Feb, 2023 1 commit
raghavanone authored
* Add limit_all_gathers option to fsdp_config and fix forward_prefetch bug
* Fix black issue
* Fix ruff failure
* Incorporate PR feedbacks
* Incorporate PR feedbacks
* Incorporate PR feedbacks
-
- 06 Feb, 2023 1 commit
Sylvain Gugger authored
* Result of black 23.1
* Update target to Python 3.7
* Switch flake8 to ruff
* Configure isort
* Configure isort
* Apply isort with line limit
* Put the right black version
* adapt black in check copies
* Fix copies
-
- 31 Jan, 2023 1 commit
raghavanone authored
* Add support of backward_prefetch and forward_prefetch
* Fix format issue
* Fix isort issue
* Fix doc style issue
* Update src/transformers/trainer.py
* Update src/transformers/training_args.py
* Update src/transformers/training_args.py
* Update src/transformers/training_args.py
* Fix black issue
* Fix doc-style issue
* Make additional fsdp parameters into fsdp config
* Fix black issue
* Remove unused imports
* Fix doc style issues
* Incorporate PR feedbacks
* Remove unused imports
* Fix tests
* Fix tests
* Fix tests
* Fix tests
* Fix tests
* Update src/transformers/training_args.py
* Fix tests
* Incorporate PR feedbacks
* Incorporate PR feedbacks
* Fix black issues

Co-authored-by: Sourab Mangrulkar <13534540+pacman100@users.noreply.github.com>
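A hedged sketch of the PyTorch-native side of `fsdp_config` after this change, together with the `limit_all_gathers` option from the 07 Feb commit above (key names follow the commit text; check your transformers version's docs for the exact accepted keys and values):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    fsdp="full_shard auto_wrap",
    fsdp_config={
        "fsdp_min_num_params": 1_000_000,     # size-based auto-wrap threshold
        "backward_prefetch": "backward_pre",  # overlap gradient all-gathers
        "forward_prefetch": True,
        "limit_all_gathers": True,
    },
)
```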
-
- 24 Jan, 2023 1 commit
Frederico Tommasi Caroli authored
* Update TrainingArguments.label_names docs
* Change wording
* Change wording
-
- 18 Jan, 2023 1 commit
jeffhataws authored
* Add XLA torchrun support
* Clarify that currently DDP doesn't work with torch.distributed XLA backend yet
* Enable DDP with torchrun and XLA (now available in PT-XLA 1.13)
* Add check for AWS Neuron availability and AWS Neuron specific compiler flag
* Change the new test's name to TestTrainerDistributedNeuronCore
* Remove "assert" and replace raised exception
* Remove compiler flag as it is optional. If needed, will be another PR.
* Use TORCHELASTIC_RUN_ID to determine whether torchrun is used
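The torchrun detection mentioned in the last bullet can be illustrated with a tiny check (the trainer's actual logic may differ):

```python
import os

# torchrun (torch.distributed.elastic) sets TORCHELASTIC_RUN_ID for each worker,
# so its presence signals that the script was launched via torchrun.
launched_with_torchrun = "TORCHELASTIC_RUN_ID" in os.environ
print(f"launched via torchrun: {launched_with_torchrun}")
```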
-
- 29 Dec, 2022 1 commit
Alex Hedges authored
* Remove non-breaking space in comment (it was likely added unintentionally)
* Remove remaining non-breaking spaces
-
- 14 Dec, 2022 1 commit
amyeroberts authored
* Replaces xxx_required with requires_backends
* Fixup
-
- 08 Dec, 2022 2 commits
jeffhataws authored
-
Sylvain Gugger authored
* Migrate torchdynamo to torch.compile
* Add docstring and generic option
* Properly use the function...
* Reorg args
-
- 30 Nov, 2022 2 commits
Sylvain Gugger authored
-
Sylvain Gugger authored
* Repurpose torchdynamo training args towards torch._dynamo
* Add doc
-
- 28 Nov, 2022 2 commits
Henghui Zhu authored
-
Wang, Yi authored
With the PyTorch CPU-only version, using `--bf16` without `--no_cuda` triggers an error like "Your setup doesn't support bf16/gpu. You need torch>=1.10, using Ampere GPU with cuda>=11.0" (#20445)
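In other words, on a CPU-only PyTorch build bf16 training needs the CPU device selected explicitly; a sketch with the argument names of that era (`no_cuda` has since been superseded by `use_cpu`):

```python
from transformers import TrainingArguments

# Without no_cuda=True the bf16 check assumes a GPU and raises the error quoted above.
args = TrainingArguments(
    output_dir="out",
    no_cuda=True,
    bf16=True,
)
```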
-
- 18 Nov, 2022 1 commit
atturaioe authored
* Add AnyPrecisionAdamW optimizer
* Add optim_args argument to TrainingArgs
* Add tests for AnyPrecisionOptimizer
* Change AnyPrecisionAdam default params to float32
* Move default_anyprecision_kwargs in trainer test
* Rename AnyPrecisionAdamW
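A hedged sketch of selecting the new optimizer and passing extra options through `optim_args` (the optimizer key and the option string are assumptions based on the bullets above; requires `torchdistx` to be installed):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    optim="adamw_anyprecision",
    optim_args="use_kahan_summation=True,momentum_dtype=bfloat16",  # illustrative option names
)
```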
-
- 15 Nov, 2022 1 commit
Muhammad Sakib Khan Inan authored
* Init Update
* ClearML Callbacks integration
* update corrections
* args reporting updated
* {'tensorboard': False, 'pytorch': False}
* ClearML Tests added
* add clearml
* output_uri=True in Task.init
* reformatted integrations.py
* reformatted and fixed
* IF-ELSE statement issue on "has_clearml" resolved
* Add clearml in main callback docs
* Add additional clearml documentation
* Update src/transformers/integrations.py
* Accept suggestion
* Accept suggestion
* Small change in comments
* Make style clearml
* Accept suggestion

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Victor Sonck <victor.sonck@gmail.com>
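With the `clearml` package installed, enabling the new callback should just be a matter of adding it to `report_to` (the ClearML task itself is created by the callback, with `output_uri=True` per the bullets above):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    report_to=["clearml"],  # log args, metrics, and artifacts to ClearML
)
```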
-
- 14 Oct, 2022 1 commit
Wang, Yi authored
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
-
- 29 Sep, 2022 1 commit
atturaioe authored
-
- 22 Sep, 2022 1 commit
Sylvain Gugger authored
* Fix TrainingArguments documentation
* Fix TFTrainingArguments documentation
-
- 21 Sep, 2022 1 commit
Zhong Hui authored
-
- 09 Sep, 2022 1 commit
Rafał Jankowski authored
* NeptuneCallback improvements
* After review suggestions and deduplication of initial run
* Added volatile checkpoints support due to missing post-rebase commit
* Update README per review comments
  - Remove list formatting
  - Correct Neptune docs link

Co-authored-by: Sabine <sabine.nyholm@neptune.ai>
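For reference, the improved callback is still selected the usual way (requires the Neptune client plus `NEPTUNE_API_TOKEN`/`NEPTUNE_PROJECT` in the environment):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    report_to=["neptune"],
)
```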
-
- 07 Sep, 2022 1 commit
Yanming Wang authored
* Fix XLA fp16 and bf16 error checking
* Update src/transformers/training_args.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 01 Sep, 2022 1 commit
Gustavo de Rosa authored
* chore(training_args): Adds support for timeout argument.
* fix(training_args): Passes make style through changes.
* fix(training_args): Removes wrong docstring sentence.
* fix(training_args): Fixes timeout not being JSON serializable.
* fix(training_args_sm): Also updates timeout to timeout_delta.
* fix(training_args): Fixes PR according to suggestions.
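Assuming the argument landed as `ddp_timeout` (seconds for the `torch.distributed` process-group timeout), usage would be roughly:

```python
from transformers import TrainingArguments

# Sketch: raise the distributed timeout, e.g. when rank 0 does long preprocessing
# before the first collective call.
args = TrainingArguments(
    output_dir="out",
    ddp_timeout=7200,  # seconds; converted to a timedelta internally
)
```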
-
- 31 Aug, 2022 1 commit
Wang, Yi authored
* oob performance improvement for cpu DDP
* add is_psutil_available check

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
-
- 16 Aug, 2022 1 commit
Sourab Mangrulkar authored
* mac m1 `mps` integration
* Update docs/source/en/main_classes/trainer.mdx
* addressing comments
* Apply suggestions from code review
* resolve comment

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com>
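A minimal sketch of opting into Apple Silicon training as introduced here (the `use_mps_device` flag was later deprecated in favour of automatic `mps` detection):

```python
from transformers import TrainingArguments

# Requires a PyTorch build with MPS support (macOS 12.3+).
args = TrainingArguments(
    output_dir="out",
    use_mps_device=True,
)
```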
-
- 10 Aug, 2022 1 commit
Matt authored
* Finished QA example
* Dodge a merge conflict
* Update text classification and LM examples
* Update NER example
* New Keras metrics WIP, fix NER example
* Update NER example
* Update MC, summarization and translation examples
* Add XLA warnings when shapes are variable
* Make sure batch_size is consistently scaled by num_replicas
* Add PushToHubCallback to all models
* Add docs links for KerasMetricCallback
* Add docs links for prepare_tf_dataset and jit_compile
* Correct inferred model names
* Don't assume the dataset has 'lang'
* Don't assume the dataset has 'lang'
* Write metrics in text classification
* Add 'framework' to TrainingArguments and TFTrainingArguments
* Export metrics in all examples and add tests
* Fix training args for Flax
* Update command line args for translation test
* make fixup
* Fix accidentally running other tests in fp16
* Remove do_train/do_eval from run_clm.py
* Remove do_train/do_eval from run_mlm.py
* Add tensorflow tests to circleci
* Fix circleci
* Update examples/tensorflow/language-modeling/run_mlm.py
* Update examples/tensorflow/test_tensorflow_examples.py
* Update examples/tensorflow/translation/run_translation.py
* Update examples/tensorflow/token-classification/run_ner.py
* Fix save path for tests
* Fix some model card kwargs
* Explain the magical -1000
* Actually enable tests this time
* Skip text classification PR until we fix shape inference
* make fixup

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
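For the Keras side referenced above, the two callbacks are attached to `model.fit()`; a hedged sketch (the datasets and model are assumed to come from `prepare_tf_dataset(...)` and a TF model class, so those lines are left commented):

```python
from transformers.keras_callbacks import KerasMetricCallback, PushToHubCallback

# metric_fn receives (predictions, labels) gathered over the eval dataset.
def compute_metrics(eval_predictions):
    predictions, labels = eval_predictions
    return {"num_eval_examples": int(len(labels))}

# metric_cb = KerasMetricCallback(metric_fn=compute_metrics, eval_dataset=tf_eval_dataset)
# hub_cb = PushToHubCallback(output_dir="model_out")
# model.fit(tf_train_dataset, validation_data=tf_eval_dataset, callbacks=[metric_cb, hub_cb])
```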
-
- 27 Jul, 2022 1 commit
Wang, Yi authored
* start from 1.12, torch_ccl is renamed as oneccl_bindings_for_pytorch and should import it before use
* add doc for perf_train_cpu_many
* update doc

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
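The import-before-use requirement can be sketched as below, assuming the `oneccl_bindings_for_pytorch` wheel matching your PyTorch version is installed (single-process values are used for illustration):

```python
import os

import torch.distributed as dist
import oneccl_bindings_for_pytorch  # noqa: F401  # registers the "ccl" backend (>= 1.12 package name)

os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group(backend="ccl", rank=0, world_size=1)
```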
-
- 26 Jul, 2022 1 commit
Carolyn Wang authored
* add import
* format
-