Commits · dc193c906dfb3b9663f8963735c46e030a15b914 · chenpangpang / transformers

27 Sep, 2021 1 commit
- Release: v4.11.0 · dc193c90
  Lysandre authored Sep 27, 2021
  
24 Sep, 2021 1 commit
- Add model card creation snippet to example scripts (#13730) · 38580455
  Gunjan Chhablani authored Sep 24, 2021
```
* Update run_glue.py

* Update run_glue.py

* Add model creation snippet to other scripts

* Fix style
```
22 Sep, 2021 1 commit

Make gradient_checkpointing a training argument (#13657) · 27d46397

Sylvain Gugger authored Sep 22, 2021



* Make gradient_checkpointing a training argument

* Update src/transformers/modeling_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Update src/transformers/configuration_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Fix tests

* Style

* document Gradient Checkpointing as a performance feature

* Small rename

* PoC for not using the config

* Adapt BC to new PoC

* Forgot to save

* Rollout changes to all other models

* Fix typo
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>


21 Sep, 2021 1 commit

Add push_to_hub to no_trainer examples (#13659) · b7d264be

Sylvain Gugger authored Sep 21, 2021

* Add push_to_hub to no_trainer examples

* Quality

* Document integration

* Roll out to other examples


31 Aug, 2021 2 commits
- Docs for v4.10.0 · 5ee67a44
  Lysandre authored Aug 31, 2021
  
- Release: v4.10.0 · d12bbe49
  Lysandre authored Aug 31, 2021
  
28 Aug, 2021 1 commit

examples: only use keep_linebreaks when reading TXT files (#13320) · 4046e66e

Stefan Schweter authored Aug 28, 2021

* examples: only use keep_linebreaks when reading TXT files for all CLM examples

* examples: only use keep_linebreaks when reading TXT files for all CLM examples

* examples: only use keep_linebreaks when reading TXT files for all CLM examples


27 Aug, 2021 1 commit

examples: add keep_linebreaks option to CLM examples (#13150) · 319d840b

Stefan Schweter authored Aug 27, 2021

* examples: add keep_linebreaks option to text dataset loader for all CLM examples

* examples: introduce new keep_linebreaks option as data argument in CLM examples


19 Aug, 2021 1 commit
- Update namespaces inside torch.utils.data to the latest. (#13167) · 91ff480e
  Allan Lin authored Aug 19, 2021
```
* Update torch.utils.data namespaces to the latest.

* Format

* Update Dataloader.

* Style
```
06 Aug, 2021 1 commit

Tpu tie weights (#13030) · 7fcee113

Sylvain Gugger authored Aug 06, 2021

* Fix tied weights on TPU

* Manually tie weights in no trainer examples

* Fix for test

* One last missing

* Getting owned by my scripts

* Address review comments

* Fix test

* Fix tests

* Fix reformer tests


26 Jul, 2021 1 commit
- Add accelerate to examples requirements (#12888) · 303989de
  Sylvain Gugger authored Jul 26, 2021
  
22 Jul, 2021 2 commits
- Docs for v4.10.0dev0 · 40de2d5a
  Lysandre authored Jul 22, 2021
  
- Release: v4.9.0 · 72aee83c
  Lysandre authored Jul 22, 2021
  
08 Jul, 2021 1 commit
- Fix group_lengths for short datasets (#12558) · 6f1adc43
  Sylvain Gugger authored Jul 08, 2021
  
07 Jul, 2021 1 commit

MLM training fails with no validation file (same as #12406 for pytorch now) (#12517) · 1d6623c6

Souvic Chakraborty authored Jul 07, 2021

* Validation split percentage to be used for custom data files also

Same issue as https://github.com/huggingface/transformers/issues/12406, fixed here for the PyTorch run_mlm.py.

* Validation split added in the right place

* Update run_clm.py

* validation split added for custom files

* Validation split added for custom files

* Update run_plm.py

* fixed validation split for custom files as input for pytorch examples in lm

* Update run_clm_no_trainer.py

* args modified
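
The validation-split behaviour described in this commit can be sketched as follows. This is a minimal illustration, not the code from the commit: `split_expressions` is a hypothetical helper, and it assumes the percentage-slicing split syntax of the `datasets` library (`train[:5%]`, `train[5%:]`), where the first slice becomes the validation set and the remainder stays as training data.

```python
def split_expressions(validation_split_percentage: int) -> dict:
    """Build split strings that carve a validation set out of the train split.

    Hypothetical helper for illustration; the strings follow the
    percentage-slicing syntax accepted by `datasets.load_dataset(split=...)`.
    """
    if not 0 < validation_split_percentage < 100:
        raise ValueError("validation_split_percentage must be between 1 and 99")
    pct = validation_split_percentage
    return {
        # First pct% of the training data is held out for validation.
        "validation": f"train[:{pct}%]",
        # The remaining data is kept for training.
        "train": f"train[{pct}%:]",
    }


def needs_validation_split(data_files: dict) -> bool:
    """Return True when no validation file was supplied (hub dataset or custom files)."""
    return "validation" not in data_files


if __name__ == "__main__":
    files = {"train": "corpus.txt"}  # hypothetical custom file, no validation file
    if needs_validation_split(files):
        print(split_expressions(5))
```

The resulting strings would be passed as the `split` argument when loading each subset.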


28 Jun, 2021 2 commits

[Examples] Added context manager to datasets map (#12367) · 04dbea31

Bhadresh Savani authored Jun 28, 2021

* added context manager to datasets map

* fixed style and spaces

* fixed warning of deprecation

* changed desc


Update run_mlm.py (#12344) · 9490d668

Taha ValizadehAslani authored Jun 28, 2021

Previously the script could not be used for validation only, because this line:
extension = data_args.train_file.split(".")[-1]
assumed that the extension must be extracted from the training file. It ran regardless of whether the user asked for training or validation, so it raised an error when the user wanted to run evaluation only and supplied no training file. The extension is now extracted from the training file if the user wants to train, and from the validation file if the user wants to run eval. This way the script can be used for training and validation separately.
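
A minimal sketch of the selection logic described in this message, under the assumption that a `do_train` flag decides which file is consulted. `infer_extension` and its parameters are hypothetical names for illustration, not the code in run_mlm.py.

```python
from typing import Optional


def infer_extension(
    train_file: Optional[str],
    validation_file: Optional[str],
    do_train: bool,
) -> str:
    """Pick the data-file extension from whichever file the run actually uses."""
    # Use the training file when training, otherwise the validation file.
    source = train_file if do_train else validation_file
    if source is None:
        raise ValueError("no data file was provided for the requested mode")
    # Same extraction as the original line, applied to the selected file.
    return source.split(".")[-1]


if __name__ == "__main__":
    # Evaluation-only run: no training file is needed any more.
    print(infer_extension(None, "dev.json", do_train=False))
```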


25 Jun, 2021 2 commits
- [Examples] Replicates the new --log_level feature to all trainer-based pytorch (#12359) · 539ee456
  Bhadresh Savani authored Jun 25, 2021
```
* added log_level

* fix comment

* fixed log_level

* Trigger CI

* Unified logging

* simplified args for log_level
```
- remove extra white space from log format (#12360) · 4a872cae
  Stas Bekman authored Jun 25, 2021
  
23 Jun, 2021 2 commits
- v4.9.0.dev0 · 2150dfed
  Sylvain Gugger authored Jun 23, 2021
  
- Release: v4.8.0 · 9252a512
  Sylvain Gugger authored Jun 23, 2021
  
17 Jun, 2021 3 commits
- update desc for map in all examples (#12226) · e43e1126
  Bhavitvya Malik authored Jun 18, 2021
```
* update desc for map in all examples

* added plm

* suggestions
```
- Docs for v4.8.0 · 0daadc19
  Lysandre authored Jun 17, 2021
  
- Release: v4.7.0 · 7a6c9fab
  Lysandre authored Jun 17, 2021
  
15 Jun, 2021 1 commit

Model card defaults (#12122) · 7d7ceca3

Sylvain Gugger authored Jun 15, 2021



* [WIP] Model card defaults

* finetuned_from default value

* Add all mappings to the mapping file

* Be more defensive on finetuned_from arg

* Add default task tag

* Separate tags from tasks

* Edge case for dataset

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>


14 Jun, 2021 3 commits

[lm examples] Replicate --config_overrides addition to other LM examples (#12135) · 9de62cfb

Kumar Abhishek authored Jun 14, 2021



* [lm examples] Replicate --config_overrides addition to other LM examples

* Removing no trainer files changes

* Update README
Co-authored-by: Kumar Abhishek <kabhishek@expedia.com>


Use text_column_name variable instead of "text" (#12132) · cd7961b6

Nicholas Broad authored Jun 14, 2021



* Use text_column_name variable instead of "text"

`text_column_name` was already defined above where I made the changes and it was also used below where I made changes.

This is a very minor change. If a dataset does not use "text" as the column name, then the `tokenize_function` will now use whatever column is assigned to `text_column_name`. `text_column_name` is just the first column name if "text" is not a column name. It makes the function a little more robust, though I would assume that 90%+ of datasets use "text" anyway.

* black formatting

* make style
Co-authored-by: Nicholas Broad <nicholas@nmbroad.com>
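
The column fallback described above can be sketched as follows. The names `pick_text_column` and `whitespace_tokenizer` are hypothetical stand-ins for illustration; only `text_column_name` and `tokenize_function` are names taken from the commit message, and the toy tokenizer replaces a real tokenizer so the sketch runs on its own.

```python
from typing import Dict, List


def pick_text_column(column_names: List[str]) -> str:
    """Use "text" when present, otherwise fall back to the first column."""
    return "text" if "text" in column_names else column_names[0]


def whitespace_tokenizer(texts: List[str]) -> Dict[str, List[List[str]]]:
    """Toy stand-in for a real tokenizer: split each string on whitespace."""
    return {"tokens": [t.split() for t in texts]}


def tokenize_function(examples: Dict[str, List[str]], text_column_name: str):
    # Read from the selected column instead of a hard-coded "text" key.
    return whitespace_tokenizer(examples[text_column_name])


if __name__ == "__main__":
    batch = {"sentence": ["hello world", "one two three"], "label": ["a", "b"]}
    text_column_name = pick_text_column(list(batch.keys()))
    print(text_column_name, tokenize_function(batch, text_column_name))
```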


Don't log anything before logging is setup in examples (#12121) · b8ab5413
Sylvain Gugger authored Jun 14, 2021
```
* Don't log anything before logging is setup in examples

* Last example
```

08 Jun, 2021 2 commits

Properly indent block_size (#12070) · fd690283
Sylvain Gugger authored Jun 08, 2021


Add torch to requirements.txt in language-modeling (#12040) · 49bee0ae

cdleong authored Jun 08, 2021



* Add torch to requirements.txt in language-modeling

* Update examples/pytorch/language-modeling/requirements.txt
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>


25 May, 2021 3 commits

[Examples] create model with custom config on the fly (#11798) · 1b653010

Stas Bekman authored May 25, 2021



* create custom model on the fly

* better wording

* add update_from_string

* cleanup

* cleanup

* Update src/transformers/configuration_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* more bool options

* style

* fix logger

* add test

* add the doc

* assert on conflict of options
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>


[lm examples] fix overflow in perplexity calc (#11855) · 6287c929
Stas Bekman authored May 25, 2021
```
* fix overflow in perplexity calc

* use inf

* fix
```
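
The overflow fix named in this commit can be illustrated with a short sketch. It assumes perplexity is computed as the exponential of the evaluation loss and that an overflow is reported as infinity, as the "use inf" step suggests. `safe_perplexity` is a hypothetical name, not the function used in the example scripts.

```python
import math


def safe_perplexity(eval_loss: float) -> float:
    """Return exp(eval_loss), or infinity when the exponential overflows."""
    try:
        return math.exp(eval_loss)
    except OverflowError:
        # math.exp raises OverflowError for very large inputs.
        return float("inf")


if __name__ == "__main__":
    print(safe_perplexity(2.0))
    print(safe_perplexity(1000.0))
```

Returning infinity keeps the metrics dictionary loggable when the loss diverges, instead of crashing at the end of evaluation.
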
Add option to log only once in multinode training (#11819) · f086652b
Sylvain Gugger authored May 25, 2021
```
* Add option to log only once in multinode training

* Use an alternate property
```

12 May, 2021 2 commits
- Docs for v4.7.0.dev0 · d77eb0cf
  Lysandre authored May 12, 2021
  
- Release: v4.6.0 · 64e78564
  Lysandre authored May 12, 2021
  
11 May, 2021 1 commit

Auto modelcard (#11599) · a135f595

Sylvain Gugger authored May 11, 2021



* Autogenerate model cards from the Trainer

* ModelCard deprecated

* Fix test

* Style

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Address review comments

* Quality

* With all metadata

* Metadata

* Post-merge conflict mess

* Data args and all examples

* Default license and languages when possible
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>


07 May, 2021 1 commit
- Fix comment in run_clm_no_trainer.py (#11624) · 6f40e317
  Jonathan Chang authored May 07, 2021
  
29 Apr, 2021 1 commit
- Split checkpoint from model_name_or_path in examples (#11492) · b29eb247
  Sylvain Gugger authored Apr 29, 2021
```
* Split checkpoint from model_name_or_path in examples

* Address review comments

* Address review comments
```
26 Apr, 2021 1 commit

[Examples] Fixes inconsistency around eval vs val and predict vs test (#11380) · 1d30ec95

Bhadresh Savani authored Apr 26, 2021

* added changes for uniformity

* modified files

* corrected typo

* fixed qa scripts

* fix typos

* fixed predict typo in qa no trainer

* fixed test file

* reverted trainer changes

* reverted trainer changes in custom examples

* updated readme

* added changes in deepspeed test

* added changes for predict and eval


23 Apr, 2021 1 commit

Trainer push to hub (#11328) · bf2e0cf7

Sylvain Gugger authored Apr 23, 2021



* Initial support for upload to hub

* push -> upload

* Fixes + examples

* Fix torchhub test

* Torchhub test I hate you

* push_model_to_hub -> push_to_hub

* Apply mixin to other pretrained models

* Remove ABC inheritance

* Add tests

* Typo

* Run tests

* Install git-lfs

* Change approach

* Add push_to_hub to all

* Staging test suite

* Typo

* Maybe like this?

* More deps

* Cache

* Adapt name

* Quality

* MOAR tests

* Put it in testing_utils

* Docs + torchhub last hope

* Styling

* Wrong method

* Typos

* Update src/transformers/file_utils.py
Co-authored-by: Julien Chaumond <julien@huggingface.co>

* Address review comments

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Julien Chaumond <julien@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
