- 24 Aug, 2020 4 commits

Sylvain Gugger authored

Sylvain Gugger authored
* Add optuna hyperparameter search to Trainer
* @julien-c suggestions
* Make compute_objective an arg function
* Formatting
* Rework to make it easier to add ray
* Formatting
* Initial support for Ray
* Formatting
* Polish and finalize
* Add trial id to checkpoint with Ray
* Smaller default
* Use GPU in ray if available
* Formatting
* Fix test
* Update install instruction
* Address review comments
* Formatting post-merge

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
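For reference, a minimal sketch of driving the search added here, assuming the `Trainer.hyperparameter_search` signature as it exists today (`hp_space`, `compute_objective`, `n_trials`, `backend`) and stand-in datasets:

```python
from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

def model_init():
    # A fresh model must be created for every trial.
    return AutoModelForSequenceClassification.from_pretrained("bert-base-cased")

trainer = Trainer(
    model_init=model_init,
    args=TrainingArguments(output_dir="hp_search"),
    train_dataset=train_dataset,  # assumed to exist
    eval_dataset=eval_dataset,    # assumed to exist
)

# hp_space receives an optuna trial; compute_objective maps eval metrics to a float.
best_run = trainer.hyperparameter_search(
    hp_space=lambda trial: {
        "learning_rate": trial.suggest_float("learning_rate", 1e-5, 5e-4, log=True),
        "num_train_epochs": trial.suggest_int("num_train_epochs", 1, 5),
    },
    compute_objective=lambda metrics: metrics["eval_loss"],
    direction="minimize",
    n_trials=10,
    backend="optuna",
)
print(best_run.hyperparameters)
```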

sgugger authored

Sylvain Gugger authored
* Don't reset the type of the dataset
* Formatting
* Update trainer.py

Co-authored-by: Teven <teven.lescao@gmail.com>

- 20 Aug, 2020 4 commits

Sylvain Gugger authored
* Add a classmethod to easily build a Trainer from nlp dataset and metric
* Fix docstrings
* Split train/eval
* Formatting
* Log dropped columns + docs
* Authorize callable activations
* PoC for auto activation
* Be framework-agnostic
* Formatting
* Remove class method
* Remove unnecessary code

Sylvain Gugger authored
* Add tests to Trainer
* Test if removing long breaks everything
* Remove ugly hack
* Fix distributed test
* Use float for number of epochs

sgugger authored

Prajjwal Bhargava authored
* Removed redundant arg in prepare_inputs
* Made the same change in prediction_loop

- 14 Aug, 2020 1 commit

Jin Young (Daniel) Sohn authored
With the previously introduced bug, we were taking two optimizer steps per batch: one global step, where `xm.optimizer_step` injects a CRS (cross-replica sum) across all cores in training, and a second local step without it. This has been hurting training accuracy (for example, XLNet on GLUE MNLI was not converging).
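For context, a minimal sketch of the intended single-step pattern, assuming the `torch_xla` API (`xm.optimizer_step` both reduces gradients across cores and calls `optimizer.step()` itself, so no separate step should follow):

```python
import torch_xla.core.xla_model as xm  # requires a TPU host

def training_step(model, inputs, optimizer):
    optimizer.zero_grad()
    loss = model(**inputs)[0]  # assumes inputs include labels, so [0] is the loss
    loss.backward()
    # xm.optimizer_step performs the cross-replica sum of gradients and then
    # steps the optimizer; calling optimizer.step() again afterwards is
    # exactly the double-step bug fixed here.
    xm.optimizer_step(optimizer)
    return loss
```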

- 12 Aug, 2020 1 commit

Sylvain Gugger authored

- 11 Aug, 2020 1 commit

David LaPalomento authored
* Warn if debug requested without TPU fixes (#6308)

  Check whether a PyTorch-compatible TPU is available before attempting to print TPU metrics after training has completed. This way, users who pass `--debug` without reading the documentation aren't surprised by a stack trace.

* Style

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
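A minimal sketch of that guard, assuming the `is_torch_tpu_available` helper transformers exposes and `torch_xla`'s metrics module (`args` and `logger` here are stand-ins):

```python
import logging
from transformers import is_torch_tpu_available

logger = logging.getLogger(__name__)

if args.debug:
    if is_torch_tpu_available():
        import torch_xla.debug.metrics as met
        print(met.metrics_report())  # per-core TPU metrics
    else:
        logger.warning("--debug was passed but no TPU is available; skipping TPU metrics.")
```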

- 06 Aug, 2020 1 commit

Doug Blank authored
* Support for Comet.ml
* Need to import comet first
* Log this model, not the one in the backprop step
* Log args as hyperparameters; use framework to allow fine control
* Log hyperparameters with context
* Apply black formatting
* isort fix integrations
* isort fix __init__
* Update src/transformers/trainer.py
* Update src/transformers/trainer.py
* Update src/transformers/trainer_tf.py
* Address review comments
* Style + Quality, remove Tensorboard import test

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>

- 05 Aug, 2020 1 commit

Teven authored
* Added a `name` argument for wandb logging; also log the model config with the trainer arguments
* Update src/transformers/training_args.py
* Added TF support, post-review changes

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

- 03 Aug, 2020 2 commits

Jay Mody authored
* Adds train_batch_size, eval_batch_size, and n_gpu to the to_sanitized_dict() output
* Update wandb config logging to use to_sanitized_dict
* Removed n_gpu from the sanitized dict
* Fix quality check errors

Teven authored
* Fixed empty asserts
* black-reformatted stragglers in templates
* More code quality checks
* Update src/transformers/convert_marian_to_pytorch.py
* Update src/transformers/convert_marian_to_pytorch.py
* Removed an unused line, as per @sshleifer

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

- 31 Jul, 2020 2 commits

Sylvain Gugger authored
* Harmonize both Trainers' APIs
* Fix test
* main_process -> process_zero

Prajjwal Bhargava authored
* Fixed typo; add PyTorch native CUDA AMP support
* Reverted commit on modeling_utils
* Conforming to HF black formatting rule
* Changed bool value of _use_apex
* Scaler support for gradient clipping
* Fix in-place operation of clip_grad_norm
* Removed `not` in version comparison
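A minimal sketch of the native-AMP pattern this commit wires in, including the scaler-aware gradient clipping it mentions (the function and names are illustrative, not the Trainer's internals):

```python
import torch
from torch.cuda.amp import GradScaler, autocast

scaler = GradScaler()

def training_step(model, inputs, optimizer, max_grad_norm=1.0):
    optimizer.zero_grad()
    with autocast():  # run the forward pass in mixed precision
        loss = model(**inputs)[0]
    scaler.scale(loss).backward()  # scale the loss to avoid fp16 underflow
    scaler.unscale_(optimizer)     # unscale first so clipping sees true gradients
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
    scaler.step(optimizer)         # skips the step if gradients overflowed
    scaler.update()
    return loss.item()
```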

- 30 Jul, 2020 1 commit

Sylvain Gugger authored
* Switch from return_tuple to return_dict
* Fix test
* [WIP] Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleC… (#5614)
* Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleChoice} models and tests
* AutoModels tiny tweaks
* Style
* Final changes before merge
* Re-order for simpler review
* Final fixes
* Addressing @sgugger's comments
* Test MultipleChoice
* Rework TF trainer (#6038)
* Fully rework training/prediction loops
* Fix method name
* Fix variable name
* Fix property name
* Fix scope
* Fix method name
* Fix tuple index
* Fix tuple index
* Fix indentation
* Fix variable name
* Fix eval before log
* Add drop remainder for test dataset
* Fix step number + fix logging datetime
* Fix eval loss value
* Use global_step instead of step + fix logging at step 0
* Fix logging datetime
* Fix global_step usage
* Fix breaking loop + logging datetime
* Fix step in prediction loop
* Fix step breaking
* Fix train/test loops
* Force TF at least 2.2 for the trainer
* Use assert_cardinality to facilitate the dataset size computation
* Log steps per epoch
* Make tfds compliant with TPU
* Make tfds compliant with TPU
* Use the TF dataset enumerate instead of the Python one
* Revert previous commit
* Fix data_dir
* Apply style
* Rebase on master
* Address Sylvain's comments
* Address Sylvain's and Lysandre's comments
* Trigger CI
* Remove unused import
* Switch from return_tuple to return_dict
* Fix test
* Add recent model

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Julien Plu <plu.julien@gmail.com>
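The return_tuple -> return_dict switch above moved model outputs from positional tuples to named fields; a minimal sketch of the difference, assuming a current transformers model:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased")
inputs = tokenizer("Trainer rework!", return_tensors="pt")

# Tuple-style: positional indexing, easy to get wrong.
logits = model(**inputs)[0]

# Dict-style: named fields, order-independent.
outputs = model(**inputs, return_dict=True)
logits = outputs.logits
```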

- 28 Jul, 2020 1 commit

Lysandre Debut authored

- 27 Jul, 2020 1 commit

Gong Linyuan authored
* Add Adam beta1, beta2 to the trainer
* Make style consistent
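A minimal sketch of the arguments side, assuming the `adam_beta1`/`adam_beta2` fields that `TrainingArguments` exposes today:

```python
from transformers import TrainingArguments

# Tune Adam's moment-decay rates from the arguments rather than editing
# the optimizer construction inside the trainer.
args = TrainingArguments(
    output_dir="out",
    learning_rate=3e-5,
    adam_beta1=0.9,   # decay rate for the first-moment (mean) estimate
    adam_beta2=0.98,  # decay rate for the second-moment estimate
)
```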

- 26 Jul, 2020 1 commit

Stas Bekman authored
* Don't complain about missing W&B when WANDB_DISABLED=true
* Reformat to elif
* Typo
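A minimal, illustrative sketch of the gating described here (not the exact trainer code):

```python
import logging
import os

logger = logging.getLogger(__name__)

try:
    import wandb
    _has_wandb = True
except ImportError:
    _has_wandb = False

if os.getenv("WANDB_DISABLED", "").lower() in {"1", "true"}:
    pass  # user opted out explicitly: stay silent about the missing package
elif not _has_wandb:
    logger.warning("W&B not installed; run `pip install wandb` to enable logging.")
```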

- 23 Jul, 2020 1 commit

Sylvain Gugger authored
* Clean up Trainer and expose customization points
* Formatting
* eval_step -> prediction_step

- 20 Jul, 2020 3 commits

Sylvain Gugger authored

Stas Bekman authored
* DataParallel fixes:
  1. Switched to a more precise check:
     - if self.args.n_gpu > 1:
     + if isinstance(model, nn.DataParallel):
  2. Fix tests: require the same fixup under DataParallel as the training module
* Another fix
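In context, a minimal sketch of why the isinstance check is more precise (illustrative helper, not the trainer itself): `n_gpu > 1` only says several GPUs are visible, while the isinstance test checks whether the model was actually wrapped.

```python
import torch
from torch import nn

def unwrap_model(model: nn.Module) -> nn.Module:
    # Multiple visible GPUs do not imply the model was wrapped (e.g. a
    # manual single-GPU run); testing the wrapper type answers the
    # actual question.
    if isinstance(model, nn.DataParallel):
        return model.module
    return model

base = nn.Linear(4, 2)
model = nn.DataParallel(base) if torch.cuda.device_count() > 1 else base
assert unwrap_model(model) is base
```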

Pradhy729 authored
* Don't pass a sampler for an iterable dataset
* Added check for test and eval dataloaders
* Formatting
* Cleaner if nesting
* Added test for trainer and iterable dataset
* Formatting for test
* Fixed import when torch is available only
* Added require-torch decorator to helper class
* Moved dataset class inside unittest
* Removed nested if and changed model in test
* Checking torch availability for IterableDataset
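A minimal sketch of the rule being enforced, using plain PyTorch data utilities: `DataLoader` raises a `ValueError` when given a sampler together with an `IterableDataset`, so the trainer has to branch on the dataset type.

```python
from torch.utils.data import DataLoader, IterableDataset, RandomSampler

def get_train_dataloader(dataset, batch_size):
    if isinstance(dataset, IterableDataset):
        # Iterable datasets define their own iteration order; passing
        # a sampler here would raise a ValueError.
        return DataLoader(dataset, batch_size=batch_size)
    return DataLoader(dataset, batch_size=batch_size, sampler=RandomSampler(dataset))
```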

- 13 Jul, 2020 1 commit

Sylvain Gugger authored
* Fix Trainer in a DataParallel setting
* Fix typo

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

- 01 Jul, 2020 2 commits

Sylvain Gugger authored
* Cleanup and unify Trainer/TFTrainer
* Forgot to adapt TFTrainingArgs
* In TF scripts, n_gpu -> n_replicas
* Update src/transformers/training_args.py
* Address review comments
* Formatting
* Fix typo

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Sylvain Gugger authored
* Add support for past states
* Style and forgotten self
* You mean, documenting is not enough? I have to actually add it too?
* Add memory support during evaluation
* Fix tests in eval and add TF support
* No need to change this line anymore

- 30 Jun, 2020 1 commit

Sylvain Gugger authored
* Documentation for the Trainer API
* Address review comments
* Address comments

- 23 Jun, 2020 1 commit

Sylvain Gugger authored
* Only put tensors on a device
* Type hint and unpack list comprehension
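A minimal sketch of what that guard looks like (a hypothetical helper, not the trainer's exact code):

```python
import torch

def prepare_inputs(inputs: dict, device: torch.device) -> dict:
    # Batches can mix tensors with plain Python values (strings, ints);
    # only tensors understand .to(device), so everything else passes through.
    return {
        name: value.to(device) if isinstance(value, torch.Tensor) else value
        for name, value in inputs.items()
    }
```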

- 22 Jun, 2020 1 commit

Patrick von Platen authored
* Finish benchmark
* Fix isort
* Fix setup.cfg
* Retab
* Fix time measuring of TF graph mode
* Fix TF CUDA
* Clean code
* Better error message

- 17 Jun, 2020 2 commits

Saurabh Misra authored

Sylvain Gugger authored
* Make default_data_collator more flexible
* Accept tensors for all features
* Document code
* Refactor
* Formatting

- 16 Jun, 2020 1 commit

Boris Dayma authored

- 15 Jun, 2020 2 commits

Boris Dayma authored
* feat(tftrainer): improve logging
* fix(trainer): consider case with evaluation only
* refactor(tftrainer): address comments
* refactor(tftrainer): move self.epoch_logging to __init__

Sylvain Gugger authored
* Make DataCollator a callable
* Update src/transformers/data/data_collator.py

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
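Once the collator is a plain callable, anything with the right call signature works; a minimal sketch under that assumption (the stacking logic is illustrative and assumes equal-length features):

```python
import torch

def stack_collator(features: list) -> dict:
    # Any callable that maps a list of features to a batch dict can now
    # serve as a data collator; no dedicated class is required.
    return {
        key: torch.stack([torch.as_tensor(f[key]) for f in features])
        for key in features[0]
    }

# e.g. Trainer(..., data_collator=stack_collator)
```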

- 11 Jun, 2020 1 commit

Setu Shah authored

- 10 Jun, 2020 2 commits

Matthew Goldey authored
* Check type before logging to ensure it's a scalar
* Log when Trainer attempts to add a non-scalar value using TensorboardX's writer.add_scalar, so we know what kinds of fixes are appropriate
* Black it
* Rephrase log message to clarify the attribute was dropped

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
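A minimal sketch of the guard (illustrative; `add_scalar(tag, value, global_step)` is tensorboardX's real signature):

```python
import logging
import numbers

logger = logging.getLogger(__name__)

def log_metrics(writer, metrics: dict, step: int):
    for key, value in metrics.items():
        if isinstance(value, numbers.Number):
            writer.add_scalar(key, value, global_step=step)
        else:
            # Surface the drop instead of letting add_scalar raise deep in
            # training, so users know which attribute was skipped.
            logger.warning("Dropping non-scalar metric %r of type %s", key, type(value))
```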

Lysandre Debut authored
* Run a single wandb instance per TPU run
* wandb: self.is_world_master
* Make style

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
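A minimal sketch of the gating, assuming a trainer that exposes an `is_world_master()`-style check (on TPU, every core runs the script, but only one process should create the run):

```python
import wandb

def setup_wandb(trainer, args):
    # Each TPU core executes a copy of the training script; initializing
    # wandb everywhere would create eight runs for a single job.
    if trainer.is_world_master():
        wandb.init(project="transformers", config=vars(args))
```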

- 09 Jun, 2020 1 commit

Lysandre authored