- 24 Aug, 2020 4 commits

Sylvain Gugger authored

Sylvain Gugger authored
* Add optuna hyperparameter search to Trainer
* @julien-c suggestions
* Make compute_objective an arg function
* Formatting
* Rework to make it easier to add ray
* Formatting
* Initial support for Ray
* Formatting
* Polish and finalize
* Add trial id to checkpoint with Ray
* Smaller default
* Use GPU in ray if available
* Formatting
* Fix test
* Update install instruction
* Address review comments
* Formatting post-merge

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
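For reference, a minimal sketch of driving the search added here, assuming the `Trainer.hyperparameter_search` signature as it exists today (`hp_space`, `compute_objective`, `n_trials`, `backend`) and stand-in datasets:

```python
from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

def model_init():
    # A fresh model must be created for every trial.
    return AutoModelForSequenceClassification.from_pretrained("bert-base-cased")

trainer = Trainer(
    model_init=model_init,
    args=TrainingArguments(output_dir="hp_search"),
    train_dataset=train_dataset,  # assumed to exist
    eval_dataset=eval_dataset,    # assumed to exist
)

# hp_space receives an optuna trial; compute_objective maps eval metrics to a float.
best_run = trainer.hyperparameter_search(
    hp_space=lambda trial: {
        "learning_rate": trial.suggest_float("learning_rate", 1e-5, 5e-4, log=True),
        "num_train_epochs": trial.suggest_int("num_train_epochs", 1, 5),
    },
    compute_objective=lambda metrics: metrics["eval_loss"],
    direction="minimize",
    n_trials=10,
    backend="optuna",
)
print(best_run.hyperparameters)
```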

sgugger authored

Sylvain Gugger authored
* Don't reset the type of the dataset
* Formatting
* Update trainer.py

Co-authored-by: Teven <teven.lescao@gmail.com>

- 20 Aug, 2020 4 commits

Sylvain Gugger authored
* Add a classmethod to easily build a Trainer from nlp dataset and metric
* Fix docstrings
* Split train/eval
* Formatting
* Log dropped columns + docs
* Authorize callable activations
* PoC for auto activation
* Be framework-agnostic
* Formatting
* Remove class method
* Remove unnecessary code

Sylvain Gugger authored
* Add tests to Trainer
* Test if removing long breaks everything
* Remove ugly hack
* Fix distributed test
* Use float for number of epochs

sgugger authored

Prajjwal Bhargava authored
* Removed redundant arg in prepare_inputs
* Made the same change in prediction_loop

- 14 Aug, 2020 1 commit

Jin Young (Daniel) Sohn authored
With the previously introduced bug, we were taking two optimizer steps per batch: one global step, where `xm.optimizer_step` injects a CRS (cross-replica sum) across all cores in training, and a second local step without it. This has been hurting training accuracy (for example, XLNet on GLUE MNLI was not converging).
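For context, a minimal sketch of the intended single-step pattern, assuming the `torch_xla` API (`xm.optimizer_step` both reduces gradients across cores and calls `optimizer.step()` itself, so no separate step should follow):

```python
import torch_xla.core.xla_model as xm  # requires a TPU host

def training_step(model, inputs, optimizer):
    optimizer.zero_grad()
    loss = model(**inputs)[0]  # assumes inputs include labels, so [0] is the loss
    loss.backward()
    # xm.optimizer_step performs the cross-replica sum of gradients and then
    # steps the optimizer; calling optimizer.step() again afterwards is
    # exactly the double-step bug fixed here.
    xm.optimizer_step(optimizer)
    return loss
```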

- 12 Aug, 2020 1 commit

Sylvain Gugger authored

- 11 Aug, 2020 1 commit

David LaPalomento authored
* Warn if debug requested without TPU fixes (#6308)

  Check whether a PyTorch-compatible TPU is available before attempting to print TPU metrics after training has completed. This way, users who pass `--debug` without reading the documentation aren't surprised by a stack trace.

* Style

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
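A minimal sketch of that guard, assuming the `is_torch_tpu_available` helper transformers exposes and `torch_xla`'s metrics module (`args` and `logger` here are stand-ins):

```python
import logging
from transformers import is_torch_tpu_available

logger = logging.getLogger(__name__)

if args.debug:
    if is_torch_tpu_available():
        import torch_xla.debug.metrics as met
        print(met.metrics_report())  # per-core TPU metrics
    else:
        logger.warning("--debug was passed but no TPU is available; skipping TPU metrics.")
```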

- 06 Aug, 2020 1 commit

Doug Blank authored
* Support for Comet.ml
* Need to import comet first
* Log this model, not the one in the backprop step
* Log args as hyperparameters; use framework to allow fine control
* Log hyperparameters with context
* Apply black formatting
* isort fix integrations
* isort fix __init__
* Update src/transformers/trainer.py
* Update src/transformers/trainer.py
* Update src/transformers/trainer_tf.py
* Address review comments
* Style + Quality, remove Tensorboard import test

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>

- 05 Aug, 2020 1 commit

Teven authored
* Added a `name` argument for wandb logging; also log the model config with the trainer arguments
* Update src/transformers/training_args.py
* Added TF support, post-review changes

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

- 03 Aug, 2020 2 commits

Jay Mody authored
* Adds train_batch_size, eval_batch_size, and n_gpu to the to_sanitized_dict() output
* Update wandb config logging to use to_sanitized_dict
* Removed n_gpu from the sanitized dict
* Fix quality check errors

Teven authored
* Fixed empty asserts
* black-reformatted stragglers in templates
* More code quality checks
* Update src/transformers/convert_marian_to_pytorch.py
* Update src/transformers/convert_marian_to_pytorch.py
* Removed an unused line, as per @sshleifer

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

- 31 Jul, 2020 2 commits

Sylvain Gugger authored
* Harmonize both Trainers' APIs
* Fix test
* main_process -> process_zero

Prajjwal Bhargava authored
* Fixed typo; add PyTorch native CUDA AMP support
* Reverted commit on modeling_utils
* Conforming to HF black formatting rule
* Changed bool value of _use_apex
* Scaler support for gradient clipping
* Fix in-place operation of clip_grad_norm
* Removed `not` in version comparison
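A minimal sketch of the native-AMP pattern this commit wires in, including the scaler-aware gradient clipping it mentions (the function and names are illustrative, not the Trainer's internals):

```python
import torch
from torch.cuda.amp import GradScaler, autocast

scaler = GradScaler()

def training_step(model, inputs, optimizer, max_grad_norm=1.0):
    optimizer.zero_grad()
    with autocast():  # run the forward pass in mixed precision
        loss = model(**inputs)[0]
    scaler.scale(loss).backward()  # scale the loss to avoid fp16 underflow
    scaler.unscale_(optimizer)     # unscale first so clipping sees true gradients
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
    scaler.step(optimizer)         # skips the step if gradients overflowed
    scaler.update()
    return loss.item()
```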

- 30 Jul, 2020 1 commit

Sylvain Gugger authored
* Switch from return_tuple to return_dict
* Fix test
* [WIP] Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleC… (#5614)
* Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleChoice} models and tests
* AutoModels tiny tweaks
* Style
* Final changes before merge
* Re-order for simpler review
* Final fixes
* Addressing @sgugger's comments
* Test MultipleChoice
* Rework TF trainer (#6038)
* Fully rework training/prediction loops
* Fix method name
* Fix variable name
* Fix property name
* Fix scope
* Fix method name
* Fix tuple index
* Fix tuple index
* Fix indentation
* Fix variable name
* Fix eval before log
* Add drop remainder for test dataset
* Fix step number + fix logging datetime
* Fix eval loss value
* Use global_step instead of step + fix logging at step 0
* Fix logging datetime
* Fix global_step usage
* Fix breaking loop + logging datetime
* Fix step in prediction loop
* Fix step breaking
* Fix train/test loops
* Force TF at least 2.2 for the trainer
* Use assert_cardinality to facilitate the dataset size computation
* Log steps per epoch
* Make tfds compliant with TPU
* Make tfds compliant with TPU
* Use the TF dataset enumerate instead of the Python one
* Revert previous commit
* Fix data_dir
* Apply style
* Rebase on master
* Address Sylvain's comments
* Address Sylvain's and Lysandre's comments
* Trigger CI
* Remove unused import
* Switch from return_tuple to return_dict
* Fix test
* Add recent model

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Julien Plu <plu.julien@gmail.com>
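The return_tuple -> return_dict switch above moved model outputs from positional tuples to named fields; a minimal sketch of the difference, assuming a current transformers model:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased")
inputs = tokenizer("Trainer rework!", return_tensors="pt")

# Tuple-style: positional indexing, easy to get wrong.
logits = model(**inputs)[0]

# Dict-style: named fields, order-independent.
outputs = model(**inputs, return_dict=True)
logits = outputs.logits
```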

- 28 Jul, 2020 1 commit

Lysandre Debut authored

- 27 Jul, 2020 1 commit

Gong Linyuan authored
* Add Adam beta1, beta2 to the trainer
* Make style consistent
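A minimal sketch of the arguments side, assuming the `adam_beta1`/`adam_beta2` fields that `TrainingArguments` exposes today:

```python
from transformers import TrainingArguments

# Tune Adam's moment-decay rates from the arguments rather than editing
# the optimizer construction inside the trainer.
args = TrainingArguments(
    output_dir="out",
    learning_rate=3e-5,
    adam_beta1=0.9,   # decay rate for the first-moment (mean) estimate
    adam_beta2=0.98,  # decay rate for the second-moment estimate
)
```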

- 26 Jul, 2020 1 commit

Stas Bekman authored
* Don't complain about missing W&B when WANDB_DISABLED=true
* Reformat to elif
* Typo
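A minimal, illustrative sketch of the gating described here (not the exact trainer code):

```python
import logging
import os

logger = logging.getLogger(__name__)

try:
    import wandb
    _has_wandb = True
except ImportError:
    _has_wandb = False

if os.getenv("WANDB_DISABLED", "").lower() in {"1", "true"}:
    pass  # user opted out explicitly: stay silent about the missing package
elif not _has_wandb:
    logger.warning("W&B not installed; run `pip install wandb` to enable logging.")
```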

- 23 Jul, 2020 1 commit

Sylvain Gugger authored
* Clean up Trainer and expose customization points
* Formatting
* eval_step -> prediction_step

- 20 Jul, 2020 3 commits

Sylvain Gugger authored

Stas Bekman authored
* DataParallel fixes:
  1. Switched to a more precise check:
     - if self.args.n_gpu > 1:
     + if isinstance(model, nn.DataParallel):
  2. Fix tests: require the same fixup under DataParallel as the training module
* Another fix
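In context, a minimal sketch of why the isinstance check is more precise (illustrative helper, not the trainer itself): `n_gpu > 1` only says several GPUs are visible, while the isinstance test checks whether the model was actually wrapped.

```python
import torch
from torch import nn

def unwrap_model(model: nn.Module) -> nn.Module:
    # Multiple visible GPUs do not imply the model was wrapped (e.g. a
    # manual single-GPU run); testing the wrapper type answers the
    # actual question.
    if isinstance(model, nn.DataParallel):
        return model.module
    return model

base = nn.Linear(4, 2)
model = nn.DataParallel(base) if torch.cuda.device_count() > 1 else base
assert unwrap_model(model) is base
```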

Pradhy729 authored
* Don't pass a sampler for an iterable dataset
* Added check for test and eval dataloaders
* Formatting
* Cleaner if nesting
* Added test for trainer and iterable dataset
* Formatting for test
* Fixed import when torch is available only
* Added require-torch decorator to helper class
* Moved dataset class inside unittest
* Removed nested if and changed model in test
* Checking torch availability for IterableDataset
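A minimal sketch of the rule being enforced, using plain PyTorch data utilities: `DataLoader` raises a `ValueError` when given a sampler together with an `IterableDataset`, so the trainer has to branch on the dataset type.

```python
from torch.utils.data import DataLoader, IterableDataset, RandomSampler

def get_train_dataloader(dataset, batch_size):
    if isinstance(dataset, IterableDataset):
        # Iterable datasets define their own iteration order; passing
        # a sampler here would raise a ValueError.
        return DataLoader(dataset, batch_size=batch_size)
    return DataLoader(dataset, batch_size=batch_size, sampler=RandomSampler(dataset))
```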

- 13 Jul, 2020 1 commit

Sylvain Gugger authored
* Fix Trainer in a DataParallel setting
* Fix typo

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

- 01 Jul, 2020 2 commits

Sylvain Gugger authored
* Cleanup and unify Trainer/TFTrainer
* Forgot to adapt TFTrainingArgs
* In TF scripts, n_gpu -> n_replicas
* Update src/transformers/training_args.py
* Address review comments
* Formatting
* Fix typo

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Sylvain Gugger authored
* Add support for past states
* Style and forgotten self
* You mean, documenting is not enough? I have to actually add it too?
* Add memory support during evaluation
* Fix tests in eval and add TF support
* No need to change this line anymore

- 30 Jun, 2020 1 commit

Sylvain Gugger authored
* Documentation for the Trainer API
* Address review comments
* Address comments

- 23 Jun, 2020 1 commit

Sylvain Gugger authored
* Only put tensors on a device
* Type hint and unpack list comprehension
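A minimal sketch of what that guard looks like (a hypothetical helper, not the trainer's exact code):

```python
import torch

def prepare_inputs(inputs: dict, device: torch.device) -> dict:
    # Batches can mix tensors with plain Python values (strings, ints);
    # only tensors understand .to(device), so everything else passes through.
    return {
        name: value.to(device) if isinstance(value, torch.Tensor) else value
        for name, value in inputs.items()
    }
```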

- 22 Jun, 2020 1 commit

Patrick von Platen authored
* Finish benchmark
* Fix isort
* Fix setup.cfg
* Retab
* Fix time measuring of TF graph mode
* Fix TF CUDA
* Clean code
* Better error message

- 17 Jun, 2020 2 commits

Saurabh Misra authored

Sylvain Gugger authored
* Make default_data_collator more flexible
* Accept tensors for all features
* Document code
* Refactor
* Formatting

- 16 Jun, 2020 1 commit

Boris Dayma authored

- 15 Jun, 2020 2 commits

Boris Dayma authored
* feat(tftrainer): improve logging
* fix(trainer): consider case with evaluation only
* refactor(tftrainer): address comments
* refactor(tftrainer): move self.epoch_logging to __init__

Sylvain Gugger authored
* Make DataCollator a callable
* Update src/transformers/data/data_collator.py

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
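Once the collator is a plain callable, anything with the right call signature works; a minimal sketch under that assumption (the stacking logic is illustrative and assumes equal-length features):

```python
import torch

def stack_collator(features: list) -> dict:
    # Any callable that maps a list of features to a batch dict can now
    # serve as a data collator; no dedicated class is required.
    return {
        key: torch.stack([torch.as_tensor(f[key]) for f in features])
        for key in features[0]
    }

# e.g. Trainer(..., data_collator=stack_collator)
```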

- 11 Jun, 2020 1 commit

Setu Shah authored

- 10 Jun, 2020 2 commits

Matthew Goldey authored
* Check type before logging to ensure it's a scalar
* Log when Trainer attempts to add a non-scalar value using TensorboardX's writer.add_scalar, so we know what kinds of fixes are appropriate
* Black it
* Rephrase log message to clarify the attribute was dropped

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
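A minimal sketch of the guard (illustrative; `add_scalar(tag, value, global_step)` is tensorboardX's real signature):

```python
import logging
import numbers

logger = logging.getLogger(__name__)

def log_metrics(writer, metrics: dict, step: int):
    for key, value in metrics.items():
        if isinstance(value, numbers.Number):
            writer.add_scalar(key, value, global_step=step)
        else:
            # Surface the drop instead of letting add_scalar raise deep in
            # training, so users know which attribute was skipped.
            logger.warning("Dropping non-scalar metric %r of type %s", key, type(value))
```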

Lysandre Debut authored
* Run a single wandb instance per TPU run
* wandb: self.is_world_master
* Make style

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
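A minimal sketch of the gating, assuming a trainer that exposes an `is_world_master()`-style check (on TPU, every core runs the script, but only one process should create the run):

```python
import wandb

def setup_wandb(trainer, args):
    # Each TPU core executes a copy of the training script; initializing
    # wandb everywhere would create eight runs for a single job.
    if trainer.is_world_master():
        wandb.init(project="transformers", config=vars(args))
```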

- 09 Jun, 2020 1 commit

Lysandre authored