"...research_projects/seq2seq-distillation/utils copy.py" did not exist on "de9e2979647338bc9617dae68c5e9dccc413fb9f"
- 18 Feb, 2021 2 commits
-
-
Stas Bekman authored
* memory tracker metrics
* go back to eval for somewhat consistency
* handle no-gpu case
* deal with stackable eval calls
* restore callback order
* style
* simplify the API
* add test
* docs
* consistently use eval_ prefix
* improve docs
* Update src/transformers/trainer_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* rename method
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
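A minimal sketch of the naming convention this commit standardizes: everything returned by Trainer.evaluate() carries the eval_ prefix, memory metrics included. The model and dataset below are placeholders, and the exact memory-metric keys vary by transformers version and available hardware.

```python
from transformers import Trainer, TrainingArguments

# Placeholders: substitute your own model and evaluation dataset.
trainer = Trainer(
    model=my_model,
    args=TrainingArguments(output_dir="out"),
    eval_dataset=my_eval_dataset,
)

metrics = trainer.evaluate()
# All keys share the "eval_" prefix, e.g. "eval_loss"; when memory tracking is
# active, the CPU/GPU memory deltas are reported under the same prefix.
print(sorted(metrics))
```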
-
Tanmay Garg authored
Introduce warmup_ratio training argument in both TrainingArguments and TFTrainingArguments classes (#6673)
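A short sketch of how the new argument would be used: warmup_ratio gives the fraction of total training steps spent linearly warming up the learning rate, as an alternative to an absolute warmup_steps count.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    num_train_epochs=3,
    learning_rate=5e-5,
    warmup_ratio=0.1,  # fraction of total steps used for linear LR warmup
)
```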
-
- 11 Feb, 2021 2 commits
-
-
Sylvain Gugger authored
* Refactor things out of main train
* Store signature
* Add SageMakerTrainer
* Init + Copyright
* Address review comments
-
Stas Bekman authored
* init devices/setup explicitly
* docs + test
* simplify
* cleanup
* cleanup
* cleanup
* correct the required dist setup
* derive local_rank from env LOCAL_RANK
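For context, a hedged sketch of the pattern the last bullet refers to: distributed launchers export a LOCAL_RANK environment variable for each process, which can be read and handed to TrainingArguments when it is not passed on the command line. The snippet is illustrative, not the commit's own code.

```python
import os
from transformers import TrainingArguments

# torch.distributed / deepspeed launchers set LOCAL_RANK per process;
# -1 means "not running under a distributed launcher".
local_rank = int(os.environ.get("LOCAL_RANK", -1))

args = TrainingArguments(output_dir="out", local_rank=local_rank)
```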
-
- 09 Feb, 2021 1 commit
-
-
Sylvain Gugger authored
-
- 31 Jan, 2021 1 commit
-
-
lewtun authored
* Clarify definition of seed argument in Trainer
* Update src/transformers/training_args.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/training_args_tf.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Fix style
* Update src/transformers/training_args.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
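The docstring being clarified concerns reproducibility: the seed is applied at the start of training. A small sketch, assuming the usual TrainingArguments / set_seed API:

```python
from transformers import TrainingArguments, set_seed

# The seed in TrainingArguments is set at the beginning of training so runs are
# reproducible; set_seed() can also be called explicitly, e.g. before model
# initialization, to make that part deterministic as well.
set_seed(42)
args = TrainingArguments(output_dir="out", seed=42)
```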
-
- 29 Jan, 2021 2 commits
-
-
Stas Bekman authored
-
Sylvain Gugger authored
* When on sagemaker use their env variables for saves
* Address review comments
* Quality
-
- 28 Jan, 2021 2 commits
-
-
abhishek thakur authored
-
abhishek thakur authored
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
-
- 27 Jan, 2021 1 commit
-
-
Sylvain Gugger authored
* Add a flag for find_unused_parameters
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Remove negation
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
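The flag surfaces PyTorch DDP's find_unused_parameters option through TrainingArguments. A hedged sketch, using the argument name as it appears in current releases (ddp_find_unused_parameters):

```python
from transformers import TrainingArguments

# Passed through to torch.nn.parallel.DistributedDataParallel; setting it to
# False skips the extra graph traversal when every parameter is guaranteed to
# receive a gradient (e.g. when gradient checkpointing is off).
args = TrainingArguments(output_dir="out", ddp_find_unused_parameters=False)
```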
-
- 26 Jan, 2021 1 commit
-
-
Sylvain Gugger authored
* Add a debug print
* Adapt Trainer to use smdistributed if available
* Forgotten parenthesis
* Real check for sagemaker
* Don't forget to define device...
* Woopsie, local_rank is defined differently
* Update since local_rank has the proper value
* Remove debug statement
* More robust check for smdistributed
* Quality
* Deal with key not present error
-
- 22 Jan, 2021 1 commit
-
-
Sylvain Gugger authored
-
- 20 Jan, 2021 2 commits
-
-
Sylvain Gugger authored
-
Gunjan Chhablani authored
* Fix Trainer and Args to mention AdamW, not Adam.
* Update the docs for Training Arguments.
* Change arguments adamw_* to adam_*
* Fixed links to AdamW in TrainerArguments docs
* Fix line length in Training Args docs.
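For reference, a small sketch of the hyperparameters this renaming touches; despite the adam_ prefix, the optimizer Trainer builds by default is AdamW:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    learning_rate=5e-5,
    adam_beta1=0.9,     # beta1 of the AdamW optimizer
    adam_beta2=0.999,   # beta2 of the AdamW optimizer
    adam_epsilon=1e-8,  # epsilon of the AdamW optimizer
    weight_decay=0.01,
)
```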
-
- 14 Jan, 2021 2 commits
-
-
Sylvain Gugger authored
* Upstream (and rename) sortish sampler
* Use proper sampler
* Update src/transformers/trainer_pt_utils.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
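The renamed sampler groups examples of similar length to cut padding; in the Trainer it is exposed (if memory serves) through the group_by_length flag. A hedged sketch:

```python
from transformers import TrainingArguments

# Group training samples of roughly equal length into the same batch to reduce
# padding; lengths are read from the dataset or computed on the fly.
args = TrainingArguments(output_dir="out", group_by_length=True)
```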
-
Sylvain Gugger authored
* Fix Trainer with a parallel model
* More clean up
-
- 13 Jan, 2021 2 commits
-
-
Sylvain Gugger authored
* Fix data parallelism in Trainer
* Update src/transformers/training_args.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
-
Stas Bekman authored
* deepspeed integration
* style
* add test
* ds wants to do its own backward
* fp16 assert
* Update src/transformers/training_args.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* style
* for clarity extract what args are being passed to deepspeed
* introduce the concept of self.wrapped_model
* s/self.wrapped_model/self.model_wrapped/
* complete transition to self.wrapped_model / self.model
* fix
* doc
* give ds its own init
* add custom overrides, handle bs correctly
* fix test
* clean up model_init logic, fix small bug
* complete fix
* collapse --deepspeed_config into --deepspeed
* style
* start adding doc notes
* style
* implement hf2ds optimizer and scheduler configuration remapping
* oops
* call get_num_training_steps absolutely when needed
* workaround broken auto-formatter
* deepspeed_config arg is no longer needed - fixed in deepspeed master
* use hf's fp16 args in config
* clean
* start on the docs
* rebase cleanup
* finish up --fp16
* clarify the supported stages
* big refactor thanks to discovering deepspeed.init_distributed
* cleanup
* revert fp16 part
* add checkpoint-support
* more init ds into integrations
* extend docs
* cleanup
* unfix docs
* clean up old code
* imports
* move docs
* fix logic
* make it clear which file it's referring to
* document nodes/gpus
* style
* wrong format
* style
* deepspeed handles gradient clipping
* easier to read
* major doc rewrite
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* docs
* switch to AdamW optimizer
* style
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* clarify doc
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
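As the user-facing surface of this integration, training is pointed at a DeepSpeed JSON config via --deepspeed (or the matching TrainingArguments field) and launched with the deepspeed launcher. A minimal, hedged sketch; the script name is a placeholder and the config contents are standard DeepSpeed options, not something this commit defines:

```python
from transformers import TrainingArguments

# Point the Trainer at a DeepSpeed config file; --fp16 on the HF side is kept
# consistent with the fp16 section of the DeepSpeed config.
args = TrainingArguments(
    output_dir="out",
    fp16=True,
    deepspeed="ds_config.json",  # path to a standard DeepSpeed JSON config
)

# Typically launched with something like:
#   deepspeed your_training_script.py --deepspeed ds_config.json --fp16 ...
```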
-
- 11 Jan, 2021 1 commit
-
-
Stas Bekman authored
* fix bad merge - dropped code
* remove --model_parallel
* Deal with TrainingArguments
* Use a private attr and fix batch sizes
* fix _n_gpu
* add is_parallel helper wrapper
* fix attribute
* introduce a new attribute is_model_parallel
* docs
* docs
* Put back init False and rearrange doc
* Ignore non-init args in HFArgumentParser
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
-
- 05 Jan, 2021 2 commits
-
-
Stas Bekman authored
* [t5 doc] typos - a few run away backticks @sgugger
* style
* [trainer] put fp16 args together - this PR proposes a purely cosmetic change that puts all the fp16 args together, so they are easier to manage/read @sgugger
* style
-
Stas Bekman authored
* --model_parallel hasn't been implemented for most models
* make the help clear as well
* implement is_parallelizable; use it
* oops
* remove property
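is_parallelizable is a class-level attribute on the models, so callers can check for support instead of failing at call time. A hedged sketch; parallelize() only exists on the few architectures that implement naive model parallelism (GPT-2 and T5 at the time) and needs more than one GPU:

```python
import torch
from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("gpt2")

# Probe for model-parallel support before trying to use it.
if model.is_parallelizable and torch.cuda.device_count() > 1:
    model.parallelize()  # spread the transformer blocks across the visible GPUs
```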
-
- 22 Dec, 2020 1 commit
-
-
Sylvain Gugger authored
* Add label smoothing in Trainer
* Add options for scheduler and Adafactor in Trainer
* Put Seq2SeqTrainer in the main lib
* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Address review comments and adapt scripts
* Documentation
* Move test not using script to tests folder
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
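The features named above all surface as TrainingArguments fields. A hedged sketch of how they combine, using the argument names as they appear in current releases:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    label_smoothing_factor=0.1,  # 0.0 disables label smoothing
    lr_scheduler_type="cosine",  # choose the learning-rate schedule
    adafactor=True,              # use Adafactor instead of AdamW
)
```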
-
- 18 Dec, 2020 1 commit
-
-
Sylvain Gugger authored
* Add timing inside Trainer
* Fix tests
* Add n_objs for train
* Sort logs
-
- 16 Dec, 2020 1 commit
-
-
Sylvain Gugger authored
* Experimental support for fairscale ShardedDDP
* Add import error if fairscale not available
* Address review comments
* Fix seq2seq trainer
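This is toggled from TrainingArguments; at the time it was a simple boolean flag (later releases accept option strings instead), and fairscale has to be installed separately. A hedged sketch:

```python
from transformers import TrainingArguments

# Requires `pip install fairscale`; wraps the model in fairscale's ShardedDDP so
# optimizer state and gradients are sharded across the data-parallel ranks.
args = TrainingArguments(
    output_dir="out",
    sharded_ddp=True,  # later releases take values such as "simple" instead of a bool
)
```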
-
- 15 Dec, 2020 3 commits
-
-
Sylvain Gugger authored
-
Sylvain Gugger authored
* Add possibility to switch between APEX and AMP in Trainer
* Update src/transformers/training_args.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Address review comments
* Update src/transformers/training_args.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
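The switch is exposed as a backend selector alongside the existing fp16 flags. A hedged sketch; the argument was introduced as fp16_backend (newer releases rename it half_precision_backend):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    fp16=True,
    fp16_backend="apex",  # "auto" (default), "amp" for native torch.cuda.amp, or "apex"
    fp16_opt_level="O2",  # only used by the apex backend
)
```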
-
lewtun authored
* Clarify impact of disable_tqdm on Jupyter Notebooks
* Add weblink to argparse
* Replace "dev set" with more common "validation set" in do_eval
* Tweak prediction_loss_only
* Tweak description of Adam hyperparameters
* Add weblink to TensorBoard
* Capitalise apex
* Tweak local_rank description
* Add weblink for wandb
* Replace nlp with datasets
* Tweak grammar in model_parallel
* Capitalise apex
* Update TensorFlow training args to match PyTorch ones
* Fix style
* Fix underscore in weblink
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Fix underscore in weblink
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Fix underscore in weblink
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Fix underscore in weblink
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Add obj to datasets.Dataset
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
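For the first bullet, a short hedged sketch: disable_tqdm suppresses the progress bars, which can clutter Jupyter output, without touching the regular logging.

```python
from transformers import TrainingArguments

# Silence the tqdm/notebook progress bars; logging via logging_steps and the
# configured reporting integrations is unaffected.
args = TrainingArguments(output_dir="out", disable_tqdm=True)
```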
-
- 14 Dec, 2020 1 commit
-
-
Navjot authored
-
- 07 Dec, 2020 1 commit
-
-
Sylvain Gugger authored
* Add copyright everywhere missing
* Style
-
- 01 Dec, 2020 2 commits
-
-
Sylvain Gugger authored
* Add a `distributed_env` property to TrainingArguments
* Change name
* Address comment
-
Sylvain Gugger authored
-
- 23 Nov, 2020 2 commits
-
-
Sylvain Gugger authored
-
alexorona authored
* gpt2 and t5 parallel modeling
* model_parallel utils update
* adding missing model_parallel_utils
  Adds missing model_parallel_utils and reverses the changes to code in modeling_gpt2 and modeling_t5
* training_args reformat
  Reformatted training_args
* style formatting
  Style formatting doc string length on training_args and model_parallel_utils
* style changes
  make style && make quality for training_args and model_parallel_utils.
* adding tests
* minor change in trainer
  reverts loss calculation
* Update training_args.py
* Update training_args.py
  added back docstring language for adam_beta1 and adam_beta2
* Update trainer.py
* Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Fix style & rebase
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
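This commit adds naive pipeline-style model parallelism to GPT-2 and T5 through a parallelize() method that accepts a device map. A hedged sketch; the layer assignments in the map are illustrative and running it requires at least two GPUs:

```python
from transformers import T5ForConditionalGeneration

model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Assign encoder/decoder blocks to specific GPUs; without an explicit map,
# parallelize() splits the blocks evenly across all visible devices.
device_map = {0: [0, 1, 2], 1: [3, 4, 5]}  # illustrative: t5-small has 6 blocks
model.parallelize(device_map)
# ... training / generation ...
model.deparallelize()  # move everything back to the CPU
```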
-
- 20 Nov, 2020 1 commit
-
-
Sylvain Gugger authored
-
- 17 Nov, 2020 1 commit
-
-
Sylvain Gugger authored
* Remove old deprecated arguments
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
* Remove needless imports
* Fix tests
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
-
- 03 Nov, 2020 1 commit
-
-
Philip May authored
* improve documentation of training_args.py
  - do_train
  - do_eval
  - do_predict
* fix line too long
* fix style with black on training_args.py
* Update src/transformers/training_args.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/training_args.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/training_args.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix line length with utils/style_doc
* black reformatting
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 30 Oct, 2020 1 commit
-
-
Abi See authored
* make sure that logging_first_step evaluates
* fix bug with incorrect loss on logging_first_step
* fix style
* logging_first_step only logs, not evals
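For context, a hedged sketch of the flag in question; as the last bullet notes, it only forces an early log entry, not an early evaluation:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    logging_first_step=True,  # emit a log (loss, learning rate) after the very first step
    logging_steps=500,        # then log every 500 steps as usual
)
```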
-
- 29 Oct, 2020 1 commit
-
-
Santiago Castro authored
* Fix doc errors and typos across the board
* Fix a typo
* Fix the CI
* Fix more typos
* Fix CI
* More fixes
* Fix CI
* More fixes
* More fixes
-
- 27 Oct, 2020 1 commit
-
-
Sylvain Gugger authored
* Fix a few docstrings
* More fixes
* Styling
-