Commits · 0106826a65cfecb7c21d4dc525d72799868e0380 · chenpangpang / transformers

20 Oct, 2021 2 commits
- Fix missing autocast() in Trainer.prediction_step() (#14075) · 0106826a
  Kwanghee Choi authored Oct 20, 2021
```
Co-authored-by: jonas <jonas@hpcnt.com>
```
  0106826a
- Trainer._load_rng_state() path fix (#14069) (#14071) · 3fefa292
  Robert Stone authored Oct 19, 2021
  
  3fefa292
11 Oct, 2021 2 commits
- Make username optional in hub_model_id (#13940) · 32634bce
  Sylvain Gugger authored Oct 11, 2021
  
  32634bce
- [Gradient checkpoining] Correct disabling `find_unused_parameters` in Trainer... · dca67968
  Patrick von Platen authored Oct 11, 2021
```
[Gradient checkpoining] Correct disabling `find_unused_parameters` in Trainer when gradient checkpointing is enabled (#13961)

* up

* correct test
```
  dca67968
07 Oct, 2021 1 commit
- Add missing whitespace to multiline strings (#13916) · 57420b10
  Alex Hedges authored Oct 07, 2021
  
  57420b10
06 Oct, 2021 3 commits
- Fix nan-loss condition (#13911) · 5d390e9e
  Anton Lozhkov authored Oct 06, 2021
  
  5d390e9e
- Fix hp search for non sigopt backends (#13897) · 8f2c07d3
  Sylvain Gugger authored Oct 06, 2021
  
  8f2c07d3
- Fix trainer logging_nan_inf_filter in torch_xla mode (#13896) · 77770ec7
  Yanming Wang authored Oct 06, 2021
```
* Fix logging_nan_inf_filter in torch_xla mode

* Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix format
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
```
  77770ec7
05 Oct, 2021 1 commit
- Allow dataset to be an optional argument for (Distributed)LengthGroupedSampler (#13820) · 1b74af76
  Zhaofeng Wu authored Oct 05, 2021
```
* Allow dataset to be an optional argument for (Distributed)LengthGroupedSampler

* Fix
```
  1b74af76
27 Sep, 2021 1 commit

- Fix loss computation in Trainer (#13760) · 3ffd18a6
  Sylvain Gugger authored Sep 27, 2021

Co-authored-by: quantitative-technologies <james.hirschorn@quantitative-technologies.com>
Co-authored-by: quantitative-technologies <james.hirschorn@quantitative-technologies.com>

3ffd18a6

26 Sep, 2021 1 commit

- [Trainer] Make sure shown loss in distributed training is correctly averaged... · 91df4551
  Patrick von Platen authored Sep 26, 2021

[Trainer] Make sure shown loss in distributed training is correctly averaged over all workers (#13681)

* push

* improve tr loss gather

91df4551

23 Sep, 2021 2 commits

- Add cpu distributed fine-tuning support for transformers Trainer API (#13574) · 8632a60d
  kding1 authored Sep 23, 2021

* update trainer with cpu distributed fine-tuning support.
Signed-off-by: Ding, Ke <ke.ding@intel.com>

* Style.

* refinement on cpu dist training check.
Signed-off-by: Ding, Ke <ke.ding@intel.com>

* style.
Signed-off-by: Ding, Ke <ke.ding@intel.com>

* Test over private field not public one.
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Morgan Funtowicz <funtowiczmo@gmail.com>
Co-authored-by: Funtowicz Morgan <mfuntowicz@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

8632a60d

- Add SigOpt HPO to transformers trainer api (#13572) · 6a3a197f
  kding1 authored Sep 23, 2021

* add sigopt hpo to transformers.
Signed-off-by: Ding, Ke <ke.ding@intel.com>

* extend sigopt changes to test code and others..
Signed-off-by: Ding, Ke <ke.ding@intel.com>

* Style.

* fix style for sigopt integration.
Signed-off-by: Ding, Ke <ke.ding@intel.com>

* Add necessary information to run unittests on SigOpt.
Co-authored-by: Morgan Funtowicz <funtowiczmo@gmail.com>

6a3a197f

22 Sep, 2021 1 commit

- Make gradient_checkpointing a training argument (#13657) · 27d46397
  Sylvain Gugger authored Sep 22, 2021

* Make gradient_checkpointing a training argument

* Update src/transformers/modeling_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update src/transformers/configuration_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Fix tests

* Style

* document Gradient Checkpointing as a performance feature

* Small rename

* PoC for not using the config

* Adapt BC to new PoC

* Forgot to save

* Rollout changes to all other models

* Fix typo
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>

27d46397

17 Sep, 2021 1 commit

- [Trainer] Add nan/inf logging filter (#13619) · 1f9dcfc1
  Patrick von Platen authored Sep 17, 2021

* finish

* add test

* push

* remove unnecessary code

* up

* correct test

* Update src/transformers/training_args.py

1f9dcfc1

14 Sep, 2021 2 commits

- separate model card git push from the rest (#13514) · 054b6013
  elishowk authored Sep 14, 2021
```
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
```
  054b6013

- Push to hub when saving checkpoints (#13503) · 3081d386
  Sylvain Gugger authored Sep 14, 2021

* Push to hub when saving checkpoints

* Add model card

* Revert partial model card

* Small fix for checkpoint

* Add tests

* Add documentation

* Fix tests

* Bump huggingface_hub

* Fix test

3081d386

09 Sep, 2021 1 commit
- Refactor internals for Trainer push_to_hub (#13486) · e59d4d01
  Sylvain Gugger authored Sep 09, 2021
  
  e59d4d01
31 Aug, 2021 1 commit
- Handle nested dict/lists of tensors as inputs in the Trainer (#13338) · 4d10474f
  Sylvain Gugger authored Aug 31, 2021
  
  4d10474f
30 Aug, 2021 3 commits

- Use DS callable API to allow hf_scheduler + ds_optimizer (#13216) · 42f359d0
  Olatunji Ruwase authored Aug 30, 2021

* Use DS callable API to allow hf_scheduler + ds_optimizer

* Preserve backward-compatibility

* Restore backward compatibility

* Tweak arg positioning

* Tweak arg positioning

* bump the required version

* Undo indent

* Update src/transformers/trainer.py

* style
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

42f359d0

- Fall back to `observed_batch_size` when the `dataloader` does not know the `batch_size`. (#13188) · 03056730
  Maxwell Forbes authored Aug 30, 2021
  
  03056730
- Check None before going through iteration (#13250) · d5064953
  Li-Huai (Allan) Lin authored Aug 30, 2021
```
* Check None before going through iteration

* Format
```
  d5064953

23 Aug, 2021 1 commit

- SageMaker: Fix sagemaker DDP & metric logs (#13181) · f689743e
  Philipp Schmid authored Aug 23, 2021

* Barrier -> barrier

* added logger for metrics

* removed stream handler in trainer

* moved handler

* removed streamhandler from trainer

* updated test image and instance type added datasets version to test

* Update tests/sagemaker/scripts/pytorch/requirements.txt
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

f689743e

19 Aug, 2021 1 commit
- Update namespaces inside torch.utils.data to the latest. (#13167) · 91ff480e
  Allan Lin authored Aug 19, 2021
```
* Update torch.utils.data namespaces to the latest.

* Format

* Update Dataloader.

* Style
```
  91ff480e
06 Aug, 2021 2 commits

- Tpu tie weights (#13030) · 7fcee113
  Sylvain Gugger authored Aug 06, 2021

* Fix tied weights on TPU

* Manually tie weights in no trainer examples

* Fix for test

* One last missing

* Gettning owned by my scripts

* Address review comments

* Fix test

* Fix tests

* Fix reformer tests

7fcee113

- [WIP] Disentangle auto modules from other modeling files (#13023) · 9870093f
  Sylvain Gugger authored Aug 06, 2021

* Initial work

* All auto models

* All tf auto models

* All flax auto models

* Tokenizers

* Add feature extractors

* Fix typos

* Fix other typo

* Use the right config

* Remove old mapping names and update logic in AutoTokenizer

* Update check_table

* Fix copies and check_repo script

* Fix last test

* Add back name

* clean up

* Update template

* Update template

* Forgot a )

* Use alternative to fixup

* Fix TF model template

* Address review comments

* Address review comments

* Style

9870093f

03 Aug, 2021 1 commit

- fix `Trainer.train(resume_from_checkpoint=False)` is causing an exception (#12981) · b7439675
  Philip May authored Aug 03, 2021

* fix #12970

* Update tests/test_trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update tests/test_trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update tests/test_trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* remove unnecessary issue link

* fix test formatting
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

b7439675

30 Jul, 2021 1 commit
- Add substep callbacks (#12951) · fe6ff4a9
  wulu473 authored Jul 30, 2021
```
Co-authored-by: Lukas Wutschitz <lukas.wutschitz@microsoft.com>
```
  fe6ff4a9
26 Jul, 2021 1 commit
- Fix push_to_hub for TPUs (#12895) · ba15fe79
  Sylvain Gugger authored Jul 26, 2021
  
  ba15fe79
21 Jul, 2021 3 commits
- Raise warning in HP search when hp is not in args (#12831) · 8c2384d8
  Sylvain Gugger authored Jul 21, 2021
  
  8c2384d8
- [debug] DebugUnderflowOverflow doesn't work with DP (#12816) · cf0755aa
  Stas Bekman authored Jul 21, 2021
  
  cf0755aa
- Refer warmup_ratio when setting warmup_num_steps. (#12818) · 037bdf82
  Masatoshi TSUCHIYA authored Jul 21, 2021
```
* Refer warmup_ratio when setting warmup_num_steps.

* Add a method to get number of warmup steps to TrainerArguments class.

* Fix.

* Fix.
```
  037bdf82
14 Jul, 2021 1 commit

- [trainer] release tmp memory in checkpoint load (#12718) · 1a3deae8
  Stas Bekman authored Jul 14, 2021

* [trainer] release tmp memory in checkpoint load

* Update src/transformers/trainer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

1a3deae8

09 Jul, 2021 1 commit
- Fix arg count for partial functions (#12609) · 18ca59e1
  Sylvain Gugger authored Jul 09, 2021
  
  18ca59e1
08 Jul, 2021 1 commit
- Don't stop at num_epochs when using IterableDataset (#12561) · 0085e712
  Sylvain Gugger authored Jul 08, 2021
  
  0085e712
07 Jul, 2021 2 commits
- Double check for attribute num_examples (#12562) · b8682609
  Sylvain Gugger authored Jul 07, 2021
```
* Double check for attribute

* Use right name
```
  b8682609
- [trainer] add option to ignore keys for the train function too (#11719) (#12551) · 3488ef5a
  shabie authored Jul 07, 2021
  
  3488ef5a
30 Jun, 2021 1 commit

- Add option to save on each training node (#12421) · 31a81109
  Sylvain Gugger authored Jun 30, 2021

* Add option to save on each training node

* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Address review comments
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

31a81109

23 Jun, 2021 1 commit

- Clean push to hub API (#12187) · 53c60bab
  Sylvain Gugger authored Jun 23, 2021

* Clean push to hub API

* Create working dir if it does not exist

* Different tweak

* New API + all models + test Flax

* Adds the Trainer clean up

* Update src/transformers/file_utils.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Address review comments

* (nit) output types

* No need to set clone_from when folder exists

* Update src/transformers/trainer.py
Co-authored-by: Julien Chaumond <julien@huggingface.co>

* Add generated_from_trainer tag

* Update to new version

* Fixes
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Julien Chaumond <julien@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>

53c60bab

22 Jun, 2021 1 commit
- [trainer] 2 bug fixes and a rename (#12309) · ebe54135
  Stas Bekman authored Jun 22, 2021
```
* bug fixes and a rename

* add extended DDP test
```
  ebe54135