- 26 Apr, 2021 1 commit

Patrick von Platen authored
- 23 Apr, 2021 2 commits

Sylvain Gugger authored
* Initial support for upload to hub
* push -> upload
* Fixes + examples
* Fix torchhub test
* Torchhub test I hate you
* push_model_to_hub -> push_to_hub
* Apply mixin to other pretrained models
* Remove ABC inheritance
* Add tests
* Typo
* Run tests
* Install git-lfs
* Change approach
* Add push_to_hub to all
* Staging test suite
* Typo
* Maybe like this?
* More deps
* Cache
* Adapt name
* Quality
* MOAR tests
* Put it in testing_utils
* Docs + torchhub last hope
* Styling
* Wrong method
* Typos
* Update src/transformers/file_utils.py
* Address review comments
* Apply suggestions from code review

Co-authored-by: Julien Chaumond <julien@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
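In practice the new mixin boils down to a one-liner; a minimal sketch, assuming you are logged in via `huggingface-cli login` and using an illustrative repo name (the same `push_to_hub` method is mixed into tokenizers and configs as well):
```
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-cased")
model.push_to_hub("my-finetuned-bert")  # creates/updates the repo on the Hub
```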
Teven authored
* Fixed trainer total_flos reloading in distributed mode
* Logging flos at the end of training
- 22 Apr, 2021 1 commit

Sylvain Gugger authored
* Fix Trainer with remove_unused_columns=False
* Typo
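For context, a hedged sketch of the flag this fix touches: with `remove_unused_columns=False` the Trainer passes every dataset column through instead of dropping the ones the model's `forward()` does not accept.
```
from transformers import TrainingArguments

# Keep all dataset columns (useful when a custom data collator needs extra
# fields that would otherwise be filtered out).
args = TrainingArguments(output_dir="out", remove_unused_columns=False)
```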
- 21 Apr, 2021 1 commit

Stas Bekman authored
This PR fixes a bug that was most likely exposed (not caused) by https://github.com/huggingface/transformers/pull/11318; surprisingly, the same test worked just fine before that other PR.
- 20 Apr, 2021 2 commits

Sylvain Gugger authored
* Update to use the datasets remove_columns method
* Quality
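The `datasets` method the Trainer now delegates to; dataset and column names below are illustrative:
```
from datasets import load_dataset

ds = load_dataset("glue", "mrpc", split="train")
ds = ds.remove_columns(["idx"])  # drops the column outright, rather than hiding it via set_format
```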
Sylvain Gugger authored
- 19 Apr, 2021 2 commits

Sylvain Gugger authored

Stas Bekman authored
* fix the placement on device with fp16_full_eval
* deepspeed never goes on device
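A hedged sketch of the flag involved: `fp16_full_eval` casts the model to fp16 for evaluation, and after this fix the Trainer leaves device placement to DeepSpeed when DeepSpeed is in charge.
```
from transformers import TrainingArguments

# Run evaluation entirely in fp16; under DeepSpeed the engine, not the Trainer,
# moves the model to the device.
args = TrainingArguments(output_dir="out", fp16_full_eval=True)
```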
- 16 Apr, 2021 1 commit

Sylvain Gugger authored
* Bulk of the work
* Polish and tests
* Update QA Trainer
* Avoid breaking the predict method
* Deprecation warnings
* Store real eval dataloader
* Get eval dataset reference before wrap
- 15 Apr, 2021 1 commit

Sylvain Gugger authored
- 14 Apr, 2021 1 commit

Sylvain Gugger authored
* IterableDatasetShard
* Test and integration in Trainer
* Update src/transformers/trainer_pt_utils.py
* Style

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
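A minimal sketch of how the shard wrapper is meant to be used (constructor arguments as of this PR; see `trainer_pt_utils` for the authoritative signature): each process consumes the same stream but keeps only its own slice of every global batch.
```
from torch.utils.data import IterableDataset
from transformers.trainer_pt_utils import IterableDatasetShard

class Stream(IterableDataset):
    def __iter__(self):
        yield from range(32)

# Process 0 of 2: keeps the first batch_size samples of every
# batch_size * num_processes chunk of the stream.
shard = IterableDatasetShard(Stream(), batch_size=4, num_processes=2, process_index=0)
print(list(shard))
```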
- 08 Apr, 2021 4 commits

Stas Bekman authored
* make fairscale and deepspeed setup extras
* fix default
* Apply suggestions from code review
* no reason not to ask for the good version
* update the CIs

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
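With these extras in place, the matching dependency versions can be pulled in directly, e.g. `pip install transformers[deepspeed]` or `pip install transformers[fairscale]` (extras names as added by this PR).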
Stas Bekman authored
* solve "scheduler before optimizer step" warning
* style
* correct the state evaluation test
Stas Bekman authored
* synced gpus
* fix
* need to use t5-small for quality tests
* notes
* complete merge
* fix a disappearing std stream problem
* start zero3 tests
* wip
* tune params
* sorting out the pre-trained model loading
* reworking generate loop wip
* style
* fix tests
* split the tests
* refactor tests
* parameterized
* work out the resume from non-ds checkpoint pass + test
* cleanup
* remove no longer needed code
* split getter/setter functions
* complete the docs
* suggestions
* gpus and their compute capabilities link
* Apply suggestions from code review
* remove invalid paramgd
* automatically configure zero3 params that rely on hidden size
* make _get_resized_embeddings zero3-aware
* add test exercising resize_token_embeddings()
* add docstring

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
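A hedged sketch of the hidden-size-derived values mentioned in the last bullets (formulas as documented for the integration around this time; treat them as illustrative, not the code itself):
```
h = 1024  # e.g. model.config.hidden_size

# ZeRO-3 settings the integration can now fill in automatically:
ds_zero3 = {
    "zero_optimization": {
        "stage": 3,
        "reduce_bucket_size": h * h,
        "stage3_prefetch_bucket_size": int(0.9 * h * h),
        "stage3_param_persistence_threshold": 10 * h,
    }
}
```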
Jannis Born authored
* fix: docstrings in prediction_step
* ci: satisfy line length requirements
* ci: character length requirements
- 31 Mar, 2021 2 commits

Sylvain Gugger authored
* Replace is_sagemaker_distributed_available
* Merge SageMakerTrainer into Trainer
* Test with shorter condition
* Put back deleted line
* Deprecate SageMakerTrainer and SageMakerTrainingArguments
* Apply suggestions from code review

Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com>
Sylvain Gugger authored
* First third
* Styling and fix mistake
* Quality
* All the rest
* Treat %s and %d
* Typo
* Missing )
* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
- 29 Mar, 2021 1 commit

pcuenca authored
A new argument `length_column_name` has been added to `TrainingArguments`, with default value `"length"`. If this column exists and `group_by_length` is `True`, the train sampler will use it for grouping rather than computing the lengths before training starts. This optimization lets the user prepare data for fast processing, avoiding sequential access to the dataset as described in issue #10909.
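A short sketch under the new argument (the column name shown is the default; the `map` calls are an illustrative way to precompute it):
```
from datasets import load_dataset
from transformers import AutoTokenizer, TrainingArguments

tok = AutoTokenizer.from_pretrained("bert-base-cased")
ds = load_dataset("glue", "mrpc", split="train")
ds = ds.map(lambda ex: tok(ex["sentence1"]))
ds = ds.map(lambda ex: {"length": len(ex["input_ids"])})  # precomputed once

args = TrainingArguments(
    output_dir="out",
    group_by_length=True,
    length_column_name="length",  # the default value, shown for clarity
)
```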
- 24 Mar, 2021 1 commit

imzhengzx authored
The original code in line 246 is:
```
tokenizer: Optional["PreTrainedTokenizerBase"] = None,
```
It should be:
```
tokenizer: Optional[PreTrainedTokenizerBase] = None,
```
- 23 Mar, 2021 1 commit

Bhadresh Savani authored
- 22 Mar, 2021 2 commits

Ruan Chaves authored
* Modify the _hp_search_setup method on the Trainer class to handle the wandb argument passed by Ray Tune to the model config.
* Reformat single quotes as double quotes.
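For context, the code path being fixed is reached through `Trainer.hyperparameter_search` with the Ray backend; a hedged sketch (trial budget and search space are illustrative, and `trainer` is assumed to have been built with `model_init=...`):
```
from ray import tune

# Ray Tune adds bookkeeping keys such as "wandb" to the trial config; after this
# fix, _hp_search_setup skips them instead of trying to set them on model.config.
best = trainer.hyperparameter_search(
    backend="ray",
    n_trials=4,
    hp_space=lambda _: {"learning_rate": tune.loguniform(1e-6, 1e-4)},
)
```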
Sidd Karamcheti authored
Add a simple one-character fix so that on_step_begin and on_step_end are called at the right times (#10839)
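The two hooks in question, shown on a minimal custom callback (a plain `TrainerCallback` subclass, not code from the PR itself):
```
from transformers import TrainerCallback

class StepLogger(TrainerCallback):
    def on_step_begin(self, args, state, control, **kwargs):
        # fired once per optimization step, before the forward pass
        pass

    def on_step_end(self, args, state, control, **kwargs):
        # fired after the optimizer update, so state.global_step is current
        if state.global_step % 100 == 0:
            print(f"reached step {state.global_step}")
```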
- 18 Mar, 2021 1 commit

Sylvain Gugger authored
* Fix distributed evaluation
* Use logger
- 17 Mar, 2021 3 commits

Mansi Mane authored
* Added debug prints and config
* Added extra samples to SequentialDistributedSampler and updated its call
* Made predictions and labels a multiple of batch size
* Updated number of microbatches
* Removed extra debug prints
* Made start_remainder similar to DistributedSamplerWithLoop
* Minor spacing update
* Test and styling
* Rename test

Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
Stas Bekman authored

Stas Bekman authored
* deepspeed checkpoint loading code plus tests
* style
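Usage-wise this surfaces through the standard resume path; a hedged example with an illustrative checkpoint directory, assuming `trainer` was configured with a DeepSpeed config:
```
# Resume a DeepSpeed run from a Trainer checkpoint folder; the engine state
# (optimizer, scheduler, fp16 state) is restored alongside the weights.
trainer.train(resume_from_checkpoint="output/checkpoint-500")
```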
- 16 Mar, 2021 2 commits

Cheng Li authored
* pass hf optimizer and scheduler to deepspeed if not specified in ds config
* update
* make init_deepspeed support config dict
* fix docstring formatting
* clean up trainer's comments
* add new tests
* fix type
* composite argparse doesn't work
* style
* add a new test, rename others
* document new functionality
* complete tests, add docs
* correct level
* Apply suggestions from code review
* add new methods to the doc
* must tell DS we are using a non-native optimizer
* add protection against cpu_offload + HF optimizer combo
* fix the cli overrides
* sync docs + tests
* restore AdamW
* better docs
* need new version
* no longer needed
* remove outdated information
* refactor duplicated code

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
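A hedged sketch of the new behaviour (config values illustrative): when the DeepSpeed config omits the `optimizer` and `scheduler` sections, the integration now hands DeepSpeed the Trainer's own ones, and per the bullets above the config may be supplied as a dict rather than only a JSON file path.
```
from transformers import TrainingArguments

ds_config = {
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},
    # no "optimizer"/"scheduler" keys: the Trainer's AdamW and LR schedule
    # are passed through to DeepSpeed instead.
}

args = TrainingArguments(output_dir="out", deepspeed=ds_config)
```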
Sylvain Gugger authored
* Add DistributedSamplerWithLoop
* Fix typo
* Test and small fix
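The idea behind the "loop", sketched outside the class itself: indices are topped up from the start of the (shuffled) index list until the count divides evenly, instead of padding with repeats of the final batch.
```
indices = list(range(10))                         # dataset indices for this epoch
per_step = 4 * 2                                  # batch_size * num_replicas
total = -(-len(indices) // per_step) * per_step   # round up to a multiple
indices += indices[: total - len(indices)]        # wrap around to the beginning
assert len(indices) % per_step == 0
```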
- 15 Mar, 2021 2 commits

Sylvain Gugger authored
* Handle save differently
* Missing imports
* Fix typo
* Adapt to recent changes in save_pretrained
* Forgotten brackets
* Optimizer load
* Fix world size
* Deal with None
* Remove needless self
Sylvain Gugger authored
- 12 Mar, 2021 1 commit

Sylvain Gugger authored
* Add auto_wrap option in fairscale integration
* Style
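Hedged usage (option spelling as documented for the fairscale integration around this release): `auto_wrap` is combined with one of the ZeRO-DP modes in `sharded_ddp`.
```
from transformers import TrainingArguments

# Wrap submodules automatically for fully sharded data parallelism.
args = TrainingArguments(output_dir="out", sharded_ddp="zero_dp_3 auto_wrap")
```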
- 11 Mar, 2021 1 commit

Sylvain Gugger authored
- 10 Mar, 2021 1 commit

Philipp Schmid authored
* renamed logging to hf_logging
* changed logging from hf_logging to logging and loggin to native_logging
* removed everything trying to fix the import Trainer error
* adding imports again
* added custom add_handler function to logging.py
* make style
* added remove_handler
* added another conditional to assert
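A minimal sketch of the helpers this adds, assuming the `add_handler`/`remove_handler` names from the bullets above act on the library's root logger:
```
import logging as py_logging
from transformers.utils import logging as hf_logging

handler = py_logging.FileHandler("transformers.log")
hf_logging.add_handler(handler)     # route library logs to a file as well
# ... run training ...
hf_logging.remove_handler(handler)
```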
- 09 Mar, 2021 1 commit

Sylvain Gugger authored
* Hotfix fairscale FSDP
* Evaluation works
* Save on process zero
- 08 Mar, 2021 4 commits

Sylvain Gugger authored
* Check layer types for Optimizer construction
* Duplicate class
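An illustration of the check (not the Trainer's exact code): parameters are grouped for weight decay by inspecting module types, so `LayerNorm` layers are exempt regardless of how their parameters are named.
```
import torch.nn as nn

def decay_groups(model: nn.Module):
    decay, no_decay = [], []
    for module in model.modules():
        for name, param in module.named_parameters(recurse=False):
            # skip decay for normalization layers and all biases
            if isinstance(module, nn.LayerNorm) or name.endswith("bias"):
                no_decay.append(param)
            else:
                decay.append(param)
    return [{"params": decay, "weight_decay": 0.01},
            {"params": no_decay, "weight_decay": 0.0}]
```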
Sylvain Gugger authored
This reverts commit b35e7b68.
Sylvain Gugger authored

Stas Bekman authored
- 05 Mar, 2021 1 commit

Joakim Warholm authored