Commits · 4eec5d0cf67116e98770c305640b5710571da4f6 · chenpangpang / transformers

05 Jan, 2021 1 commit
- [examples/text-classification] Fix a bug for using one's own dataset of a regression task (#9411) · 57a66269
  Yusuke Mori authored Jan 05, 2021
  
04 Jan, 2021 3 commits

Bump notebook from 6.1.4 to 6.1.5 in /examples/research_projects/lxmert (#9402) · 5dd389d1

dependabot[bot] authored Jan 04, 2021

Bumps [notebook](https://github.com/jupyter/jupyterhub) from 6.1.4 to 6.1.5.
- [Release notes](https://github.com/jupyter/jupyterhub/releases)
- [Changelog](https://github.com/jupyterhub/jupyterhub/blob/master/CHECKLIST-Release.md)
- [Commits](https://github.com/jupyter/jupyterhub/commits)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>


Put back LXMert example (#9401) · 23a71449
Sylvain Gugger authored Jan 04, 2021

simplify marian distillation script (#9394) · 8eb7f26d
Sam Shleifer authored Jan 04, 2021


03 Jan, 2021 1 commit

Fix typos in README and bugs in RAG example code for end-to-end evaluation and finetuning (#9355) · d944966b

Yoshitomo Matsubara authored Jan 03, 2021

* fix a bug in eval_batch_retrieval

* should return parser as well as other staticmethod

* remove duplicate argument

* these kwargs are no longer accepted (cause TypeError in self.generator.generate of modeling_rag.py)

* fixed file paths in README

* moved an arg to add_ray_specific_args


23 Dec, 2020 1 commit
- Adapt to new name of `label_smoothing_factor` training arg (#9282) · a1cb6e98
  Sylvain Gugger authored Dec 23, 2020
  
22 Dec, 2020 5 commits

Revert renaming in finetune_trainer (#9262) · e6c1f1ca
Sylvain Gugger authored Dec 22, 2020

Add speed metrics to all example scripts + template (#9260) · ab177588
Sylvain Gugger authored Dec 22, 2020

Fix link to bertabs/README.md (#9255) · 37d6fb5d
Manuel Romero authored Dec 22, 2020

Fix link to old language modeling script (#9254) · 189c1b91
Manuel Romero authored Dec 22, 2020


Seq2seq trainer (#9241) · 490b39e6

Sylvain Gugger authored Dec 22, 2020



* Add label smoothing in Trainer

* Add options for scheduler and Adafactor in Trainer

* Put Seq2SeqTrainer in the main lib

* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Address review comments and adapt scripts

* Documentation

* Move test not using script to tests folder
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>


21 Dec, 2020 3 commits

Update the README of the text classification example (#9237) · ec07da65

Sylvain Gugger authored Dec 21, 2020



* Update the README of the text classification example

* Update examples/README.md
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Adapt comment from review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>


Adding performer fine-tuning research example (#9239) · 4eef5889
Teven authored Dec 21, 2020
```
* added run_mlm_performer.py research example

* make style

* make style

* Added a README !
```

[RAG] Add Ray implementation for distributed retrieval (#9197) · a4b21cdd

Amog Kamsetty authored Dec 21, 2020



* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* uncomment

* uncomment

* wip

* updates

* add docstring

* updates

* fix arg

* fixes

* add unit tests

* update readme

* update readme

* update finetune script

* update test

* add test

* add ray to test dependencies

* separate ray and ray tune

* formatting

* shutdown ray at end of test

* fix tests

* formatting

* formatting

* even more formatting

* address comments

* formatting

* add files

* Update examples/research_projects/rag/test_distributed_retriever.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* address comments

* addressing comments
Co-authored-by: Ubuntu <ubuntu@ip-172-31-21-208.us-west-2.compute.internal>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>


20 Dec, 2020 1 commit
- better logging and help (#9203) · f38c4ad3
  Stas Bekman authored Dec 20, 2020
  
19 Dec, 2020 1 commit
- [run_glue] add speed metrics (#9198) · 6b850b67
  Stas Bekman authored Dec 18, 2020
```
* add speed metrics

* suggestions
```
18 Dec, 2020 7 commits
- GPT-model attention heads pruning example (#9189) · 291974c6
  Aleksey Tikhonov authored Dec 18, 2020
```
* Pruning for GPT attn heads

* The code formatted according to the transformers requirements

* Update run_prune_gpt.py

* Update run_prune_gpt.py
```
- Add timing inside Trainer (#9196) · 1198ba8f
  Sylvain Gugger authored Dec 18, 2020
```
* Add timing inside Trainer

* Fix tests

* Add n_objs for train

* Sort logs
```
- Add new run_swag example (#9175) · 9a25c5bd
  Sylvain Gugger authored Dec 18, 2020
```
* Add new run_swag example

* Add check

* Add sample

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Very important change to make Lysandre happy
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
```
- Fix link to old SQUAD fine-tuning script (#9181) · 077a5dce
  Manuel Romero authored Dec 18, 2020
  
- fixed JSON error in run_qa with fp16 (#9186) · fd7b6a52
  Wissam Antoun authored Dec 18, 2020
  
- Fix link to old NER fine-tuning script (#9182) · 66a14a2f
  Manuel Romero authored Dec 18, 2020
  
- [trainer] apex fixes and tests (#9180) · f06d0fad
  Stas Bekman authored Dec 17, 2020
  
17 Dec, 2020 1 commit
- add tests for the new sharded ddp fairscale integration (#9177) · 63841c55
  Stas Bekman authored Dec 17, 2020
  
16 Dec, 2020 3 commits

Experimental support for fairscale ShardedDDP (#9139) · 9a671853

Sylvain Gugger authored Dec 16, 2020

* Experimental support for fairscale ShardedDDP

* Add import error if fairscale not available

* Address review comments

* Fix seq2seq trainer


Update notebook table and transformers intro notebook (#9136) · 4d489735
Sylvain Gugger authored Dec 16, 2020


[Flax] Align FlaxBertForMaskedLM with BertForMaskedLM, implement from_pretrained, init (#9054) · 640e6fe1

Patrick von Platen authored Dec 16, 2020



* save intermediate

* save intermediate

* save intermediate

* correct flax bert model file

* new module / model naming

* make style

* almost finish BERT

* finish roberta

* make fix-copies

* delete keys file

* last refactor

* fixes in run_mlm_flax.py

* remove pooled from run_mlm_flax.py

* fix gelu | gelu_new

* remove Module from inits

* splits

* dirty print

* preventing warmup_steps == 0

* smaller splits

* make fix-copies

* dirty print

* dirty print

* initial_evaluation argument

* declaration order fix

* proper model initialization/loading

* proper initialization

* run_mlm_flax improvements: improper model inputs bugfix + automatic dataset splitting + tokenizers parallelism warning + avoiding warmup_steps=0 bug

* removed tokenizers warning hack, fixed model re-initialization

* reverted training_args.py changes

* fix flax from pretrained

* improve test in flax

* apply sylvains tips

* update init

* make 0.3.0 compatible

* revert tevens changes

* revert tevens changes 2

* finalize revert

* fix bug

* add docs

* add pretrained to init

* Update src/transformers/modeling_flax_utils.py

* fix copies

* final improvements
Co-authored-by: TevenLeScao <teven.lescao@gmail.com>


15 Dec, 2020 4 commits

[Examples] Add automatic dataset splitting in language-modeling examples (#9133) · 2a7e8e16

Teven authored Dec 15, 2020

* replaced jnp.split + removing textual model inputs + ensuring warmup_steps > 0

* Add automatic dataset splitting in language-modeling examples


native amp leak fix landed in 1.7.1 (#9115) · 14c79c3e
Stas Bekman authored Dec 15, 2020
```
update README with good news that the leak fix has been applied to pytorch-1.7.1.
```
fix a bug in eval_batch_retrieval (#9089) · 44c340f4
Yoshitomo Matsubara authored Dec 15, 2020


[finetune_trainer] enhancements and fixes (#9042) · c19d0462

Stas Bekman authored Dec 14, 2020



* trainer and finetune_trainer enhancements and fixes

* add fallback default

* move the fixing of incorrect keys back into finetune trainer

* s/eval/val/ to match the split

* trainer can now use a different prefix than eval_ for metrics

* document new arg

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* use 'eval' as the default for metric_key_prefix

* complete adjust var names + disambiguate

* fix logger

* add clarifying comment

* add clarifying comment

* style

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/trainer.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* complete removal of optional for metric_key_prefix

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>


11 Dec, 2020 3 commits

Fix min_null_pred in the run_qa script (#9067) · 29e45979
Sylvain Gugger authored Dec 11, 2020


Bump notebook in /examples/research_projects/movement-pruning/lxmert (#9062) · 24f6cdea

dependabot[bot] authored Dec 11, 2020

Bumps [notebook](https://github.com/jupyter/jupyterhub) from 6.1.4 to 6.1.5.
- [Release notes](https://github.com/jupyter/jupyterhub/releases)
- [Changelog](https://github.com/jupyterhub/jupyterhub/blob/master/CHECKLIST-Release.md)
- [Commits](https://github.com/jupyter/jupyterhub/commits)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>


Reorganize examples (#9010) · 783d7d26

Sylvain Gugger authored Dec 11, 2020



* Reorganize example folder

* Continue reorganization

* Change requirements for tests

* Final cleanup

* Finish regroup with tests all passing

* Copyright

* Requirements and readme

* Make a full link for the documentation

* Address review comments

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Add symlink

* Reorg again

* Apply suggestions from code review
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

* Adapt title

* Update to new structure

* Remove test

* Update READMEs
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>


10 Dec, 2020 1 commit

Fix typo #9012 (#1) (#9038) · 91ab02af

NatLun137 authored Dec 10, 2020

There is a tiny typo in the code "transformers/examples/language-modeling/run_mlm_wwm.py" at line 284. [Details.](https://github.com/huggingface/transformers/issues/9012)


09 Dec, 2020 1 commit

Flax Masked Language Modeling training example (#8728) · 75627148

Funtowicz Morgan authored Dec 09, 2020



* Remove "Model" suffix from Flax models to look more :hugs:
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Initial working (forward + backward) for Flax MLM training example.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Simplify code
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Addressing comments, using module and moving to LM task.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Restore parameter name "module" wrongly renamed model.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Restore correct output ordering...
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Actually commit the example 😅
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Add FlaxBertModelForMaskedLM after rebasing.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Make it possible to initialize the training from scratch
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Reuse flax linen example of cross entropy loss
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added specific data collator for flax
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Remove todo for data collator
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added evaluation step
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added ability to provide dtype to support bfloat16 on TPU
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Enable flax tensorboard output
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Enable jax.pmap support.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Ensure batches are correctly sized to be dispatched with jax.pmap
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Enable bfloat16 with --fp16 cmdline args
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Correctly export metrics to tensorboard
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added dropout and ability to use it.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Effectively enable & disable during training and evaluation steps.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Oops.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Enable specifying kernel initializer scale
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Style.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added warmup step to the learning rate scheduler.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix typo.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Print training loss
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Make style
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* fix linter issue (flake8)
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix model matching
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix dummies
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix non default dtype on Flax models
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Use the same create_position_ids_from_input_ids for FlaxRoberta
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Make Roberta attention as Bert
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* fix copy
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Wording.
Co-authored-by: Marc van Zee <marcvanzee@gmail.com>
Co-authored-by: Marc van Zee <marcvanzee@gmail.com>


08 Dec, 2020 1 commit

New squad example (#8992) · 447808c8

Sylvain Gugger authored Dec 08, 2020



* Add new SQUAD example

* Same with a task-specific Trainer

* Address review comment.

* Small fixes

* Initial work for XLNet

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Final clean up and working XLNet script

* Test and debug

* Final working version

* Add new SQUAD example

* Same with a task-specific Trainer

* Address review comment.

* Small fixes

* Initial work for XLNet

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Final clean up and working XLNet script

* Test and debug

* Final working version

* Add tick

* Update README

* Address review comments
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>


07 Dec, 2020 3 commits
- Copyright (#8970) · 00aa9dbc
  Sylvain Gugger authored Dec 07, 2020
```
* Add copyright everywhere missing

* Style
```
- Small fix to the run clm script (#8973) · 62d30e05
  Sylvain Gugger authored Dec 07, 2020
  
- Use word_ids to get labels in run_ner (#8962) · 7f9ccffc
  Sylvain Gugger authored Dec 07, 2020
```
* Use word_ids to get labels in run_ner

* Add sanity check
```