Commits · cb251ba619a471afa3c2ade481c7b8027d4b123f · chenpangpang / transformers

12 Apr, 2021 1 commit
- Fix typo (#11188) · cb251ba6
  Takuya Makino authored Apr 13, 2021
  
06 Apr, 2021 2 commits
- Development on v4.6.0dev0 · 9853c5dd
  Lysandre authored Apr 06, 2021
  
- Release v4.5.0 · 4906a29f
  Lysandre authored Apr 06, 2021
  
31 Mar, 2021 1 commit

Enforce string-formatting with f-strings (#10980) · acc3bd9d

Sylvain Gugger authored Mar 31, 2021



* First third

* Styling and fix mistake

* Quality

* All the rest

* Treat %s and %d

* typo

* Missing )

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>


29 Mar, 2021 2 commits

Remove duplicate code · 4002f95e
Sylvain Gugger authored Mar 29, 2021


Add `examples/run_ner_no_trainer.py` (#10902) · d7b50ce4

Daniel Stancl authored Mar 29, 2021

* Add NER example with accelerate library

* This commit contains the first (still quite unfinished)
version of a script showing how to train a HuggingFace model
with the new accelerate library.

* Fix metric calculation

* make style quality

* mv ner_no_trainer to token-classification dir

* Delete --debug flag from running script

* hf_datasets -> raw_datasets

* Make a few slight adjustments

* Add an informative comment + rewrite a help comment

* Change header

* Fix a few things

* Enforce use of fast tokenizers only

* DataCollatorWithPadding -> DataCollatorForTokenClassification

* Change bash script: python3 -> accelerate launch

* make style

* Add a few missing things (see below)

* Add max-length padding to predictions and labels to
enable accelerate gather functionality

* Add PyTorch no trainer example to the example README.md

* Remove --do-train from args as being redundant for now

* DataCollatorWithPadding -> DataCollatorForTokenClassification

* Remove some obsolete args.do_train conditions from the script

* Delete --do_train from bash running script

* Delete use_slow_tokenizer from args

* Add unintentionally removed flag --label_all_tokens

* Delete --debug flag from running script


19 Mar, 2021 1 commit

Expand a bit the presentation of examples (#10799) · 946400fb

Sylvain Gugger authored Mar 19, 2021



* Expand a bit the presentation of examples

* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Address review comments
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>


16 Mar, 2021 2 commits
- Development on v4.5.0dev0 · 1b5ce1e6
  Lysandre authored Mar 16, 2021
  
- Release v4.4.0 · c988db5a
  Lysandre authored Mar 16, 2021
  
15 Mar, 2021 1 commit

Add minimum version check in examples (#10724) · 4c379daf

Sylvain Gugger authored Mar 15, 2021

* Add minimum version check in examples

* Style

* No need for new line maybe?

* Add helpful comment


08 Mar, 2021 1 commit

Added max_sample_ arguments (#10551) · dfd16af8

Bhadresh Savani authored Mar 09, 2021

* reverted changes of logging and saving metrics

* added max_sample arguments

* fixed code

* white space diff

* reformatting code

* reformatted code


27 Feb, 2021 1 commit
- updated logging and saving metrics (#10436) · aca6288f
  Bhadresh Savani authored Feb 27, 2021
```
* updated logging and saving metrics

* space removal
```
19 Feb, 2021 1 commit
- Move the TF NER example (#10276) · 536aee99
  Julien Plu authored Feb 19, 2021
  
05 Feb, 2021 1 commit
- [examples] make run scripts executable (#10037) · 8ea412a8
  Stas Bekman authored Feb 05, 2021
```
* make executable

* make executable

* same for the template

* cleanup
```
28 Jan, 2021 1 commit
- Deprecate model_path in Trainer.train (#9854) · b4e559cf
  Sylvain Gugger authored Jan 28, 2021
  
27 Jan, 2021 1 commit
- Setup logging with a stdout handler (#9816) · f2fabedb
  Sylvain Gugger authored Jan 27, 2021
  
26 Jan, 2021 1 commit

Improve pytorch examples for fp16 (#9796) · 10e5f282

Andrea Cappelli authored Jan 26, 2021



* Pad to 8x for fp16 multiple choice example (#9752)

* Pad to 8x for fp16 squad trainer example (#9752)

* Pad to 8x for fp16 ner example (#9752)

* Pad to 8x for fp16 swag example (#9752)

* Pad to 8x for fp16 qa beam search example (#9752)

* Pad to 8x for fp16 qa example (#9752)

* Pad to 8x for fp16 seq2seq example (#9752)

* Pad to 8x for fp16 glue example (#9752)

* Pad to 8x for fp16 new ner example (#9752)

* update script template #9752

* Update examples/multiple-choice/run_swag.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update examples/question-answering/run_qa.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update examples/question-answering/run_qa_beam_search.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* improve code quality #9752
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
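
The commit titles above say only "Pad to 8x for fp16". As an illustration, and assuming this means padding sequence lengths up to a multiple of 8 (a common choice for fp16 kernels on tensor-core GPUs), the idea can be sketched in plain Python. The helper names `pad_to_multiple` and `pad_batch` are hypothetical and are not taken from the example scripts.

```python
from typing import List


def pad_to_multiple(ids: List[int], pad_id: int = 0, multiple: int = 8) -> List[int]:
    """Right-pad a single sequence so its length is a multiple of `multiple`."""
    remainder = len(ids) % multiple
    if remainder == 0:
        return list(ids)
    return list(ids) + [pad_id] * (multiple - remainder)


def pad_batch(batch: List[List[int]], pad_id: int = 0, multiple: int = 8) -> List[List[int]]:
    """Pad every sequence to one shared length: the longest length rounded up to `multiple`."""
    longest = max(len(seq) for seq in batch)
    target = -(-longest // multiple) * multiple  # ceiling division, then scale back up
    return [list(seq) + [pad_id] * (target - len(seq)) for seq in batch]


if __name__ == "__main__":
    batch = [[101, 7, 8, 102], [101, 7, 8, 9, 10, 11, 12, 13, 102]]
    for row in pad_batch(batch):
        print(len(row), row)
```

In the transformers library itself, tokenizers and data collators expose a `pad_to_multiple_of` argument for this purpose; whether each script in this commit uses it is not stated in the log.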


25 Jan, 2021 1 commit

Auto-resume training from checkpoint (#9776) · caf4abf7

Sylvain Gugger authored Jan 25, 2021



* Auto-resume training from checkpoint

* Update examples/text-classification/run_glue.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Roll out to other examples
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>


14 Jan, 2021 1 commit

Switch metrics in run_ner to datasets (#9567) · 46ed56cf

Sylvain Gugger authored Jan 14, 2021

* Switch metrics in run_ner to datasets

* Add flag to return all metrics

* Upstream (and rename) sortish_sampler

* Revert "Upstream (and rename) sortish_sampler"

This reverts commit e07d0dcf650c2bae36da011dd76c77a8bb4feb0d.


06 Jan, 2021 1 commit
- Allow example to use a revision and work with private models (#9407) · 453a70d4
  Sylvain Gugger authored Jan 06, 2021
```
* Allow example to use a revision and work with private models

* Copy to other examples and template

* Styling
```
22 Dec, 2020 1 commit
- Add speed metrics to all example scripts + template (#9260) · ab177588
  Sylvain Gugger authored Dec 22, 2020
  
18 Dec, 2020 1 commit
- Fix link to old NER fine-tuning script (#9182) · 66a14a2f
  Manuel Romero authored Dec 18, 2020
  
11 Dec, 2020 1 commit

Reorganize examples (#9010) · 783d7d26

Sylvain Gugger authored Dec 11, 2020



* Reorganize example folder

* Continue reorganization

* Change requirements for tests

* Final cleanup

* Finish regroup with tests all passing

* Copyright

* Requirements and readme

* Make a full link for the documentation

* Address review comments

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Add symlink

* Reorg again

* Apply suggestions from code review
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

* Adapt title

* Update to new structure

* Remove test

* Update READMEs
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>


07 Dec, 2020 1 commit
- Use word_ids to get labels in run_ner (#8962) · 7f9ccffc
  Sylvain Gugger authored Dec 07, 2020
```
* Use word_ids to get labels in run_ner

* Add sanity check
```
30 Nov, 2020 1 commit
- token-classification: use is_world_process_zero instead of deprecated is_world_master() (#8828) · 19fa01ce
  Stefan Schweter authored Nov 30, 2020
  
19 Nov, 2020 1 commit
- Fix run_ner script (#8664) · 20b65860
  Sylvain Gugger authored Nov 19, 2020
```
* Fix run_ner script

* Pin datasets
```
17 Nov, 2020 2 commits

Remove deprecated (#8604) · dd52804f

Sylvain Gugger authored Nov 17, 2020



* Remove old deprecated arguments
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>

* Remove needless imports

* Fix tests
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>


Tokenizers: ability to load from model subfolder (#8586) · 042a6aa7

Julien Chaumond authored Nov 17, 2020



* tiny typo

* Tokenizers: ability to load from model subfolder

* use subfolder for local files as well

* Uniformize model shortcut name => model id

* from s3 => from huggingface.co
Co-authored-by: Quentin Lhoest <lhoest.q@gmail.com>


12 Nov, 2020 1 commit
- Try to understand and apply Sylvain's comments (#8458) · 27b3ff31
  Julien Plu authored Nov 12, 2020
  
11 Nov, 2020 1 commit

Example NER script predicts on tokenized dataset (#8468) · a38d1c7c

sarnoult authored Nov 11, 2020

The new run_ner.py script tries to run prediction on the input
test set `datasets["test"]`, but it should be the tokenized set
`tokenized_datasets["test"]`


10 Nov, 2020 1 commit
- using multi_gpu consistently (#8446) · 02bdfc02
  Stas Bekman authored Nov 10, 2020
```
* s|multiple_gpu|multi_gpu|g; s|multigpu|multi_gpu|g

* doc
```
09 Nov, 2020 3 commits

[github CI] add a multi-gpu job for all example tests (#8341) · 190df585

Stas Bekman authored Nov 09, 2020



* add a multi-gpu job for all example tests

* run only ported tests

* rename

* explain why env is re-activated on each step

* mark all unported/checked tests with @require_torch_non_multigpu_but_fix_me

* style

* Apply suggestions from code review
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>


Fix typo · 5c766ecb
Sylvain Gugger authored Nov 09, 2020


Add new token classification example (#8340) · 908a2889

Sylvain Gugger authored Nov 09, 2020



* Add new token classification example

* Remove txt file

* Add test

* With actual testing done

* Less warmup is better

* Update examples/token-classification/run_ner_new.py
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

* Address review comments

* Fix test

* Make Lysandre happy

* Last touches and rename

* Rename in tests

* Address review comments

* More run_ner -> run_ner_old
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>


05 Nov, 2020 1 commit

change TokenClassificationTask class methods to static methods (#7902) · 52f44dd6

Bobby Donchev authored Nov 05, 2020



* change TokenClassificationTask class methods to static methods

Since the methods of TokenClassificationTask do not use self, we should probably switch them to static methods. The class also defines no constructor, so it is currently unusable as is. Switching to static methods removes the need to document the intent of the broken class.

The get_labels and read_examples_from_file methods are meant to be implemented by subclasses. Static methods are inherited unchanged, so they can still be overridden like other class methods.

* Trigger Build
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
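
The reasoning in this commit message can be illustrated with a minimal sketch. The classes `TaskBase` and `ToyNER` below are hypothetical stand-ins rather than the actual TokenClassificationTask code; only the method names get_labels and read_examples_from_file come from the message, and their signatures here are assumptions.

```python
from typing import List


class TaskBase:
    """Hypothetical base class: its methods need no instance state."""

    @staticmethod
    def get_labels(path: str) -> List[str]:
        # Subclasses are expected to provide the label set.
        raise NotImplementedError

    @staticmethod
    def read_examples_from_file(data_dir: str, mode: str) -> List[str]:
        # Subclasses are expected to provide the file reader.
        raise NotImplementedError


class ToyNER(TaskBase):
    """Hypothetical subclass overriding one static method."""

    @staticmethod
    def get_labels(path: str) -> List[str]:
        # The path is ignored in this toy version; a fixed label set is returned.
        return ["O", "B-PER", "I-PER"]


if __name__ == "__main__":
    # Callable on the class itself, with no instance required.
    print(ToyNER.get_labels("labels.txt"))
    # Also callable on an instance; the override is resolved through normal attribute lookup.
    print(ToyNER().get_labels("labels.txt"))
```

The subclass override replaces the base definition, while methods that are not overridden (here `read_examples_from_file`) are inherited as they are.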


28 Oct, 2020 1 commit
- Upgrade PyTorch Lightning to 1.0.2 (#7852) · 5e24982e
  Sean Naren authored Oct 28, 2020
```
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
```
18 Sep, 2020 1 commit
- token-classification: update url of GermEval 2014 dataset (#6571) · ee9eae4e
  Stefan Schweter authored Sep 18, 2020
  
27 Aug, 2020 1 commit

Fix the TF Trainer gradient accumulation and the TF NER example (#6713) · 6f289dc9

Julien Plu authored Aug 27, 2020

* Align TF NER example over the PT one

* Fix Dataset call

* Fix gradient accumulation training

* Apply style

* Address Sylvain's comments

* Address Sylvain's comments

* Apply style


26 Aug, 2020 1 commit
- Black 20 release · a75c64d8
  Lysandre authored Aug 26, 2020
  
24 Aug, 2020 1 commit
- Fix PL token classification examples (#6682) · dd522da0
  vblagoje authored Aug 24, 2020
  