Commits · 86896de064f166b9ea347f139c9698c248c7cc4a · chenpangpang / transformers

10 Dec, 2020 1 commit

NatLun137 authored Dec 10, 2020

There is a tiny typo in the code "transformers/examples/language-modeling/run_mlm_wwm.py" at line 284. [Details.](https://github.com/huggingface/transformers/issues/9012)

91ab02af

09 Dec, 2020 1 commit

Flax Masked Language Modeling training example (#8728) · 75627148

Funtowicz Morgan authored Dec 09, 2020



* Remove "Model" suffix from Flax models to look more :hugs:
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Initial working (forward + backward) for Flax MLM training example.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Simplify code
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Addressing comments, using module and moving to LM task.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Restore parameter name "module" wrongly renamed model.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Restore correct output ordering...
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Actually commit the example 😅

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Add FlaxBertModelForMaskedLM after rebasing.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Make it possible to initialize the training from scratch
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Reuse flax linen example of cross entropy loss
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added specific data collator for flax
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Remove todo for data collator
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added evaluation step
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added ability to provide dtype to support bfloat16 on TPU
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Enable flax tensorboard output
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Enable jax.pmap support.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Ensure batches are correctly sized to be dispatched with jax.pmap
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Enable bfloat16 with --fp16 cmdline args
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Correctly export metrics to tensorboard
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added dropout and ability to use it.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Effectively enable & disable during training and evaluation steps.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Oops.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Enable specifying kernel initializer scale
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Style.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added warmup step to the learning rate scheduler.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix typo.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Print training loss
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Make style
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* fix linter issue (flake8)
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix model matching
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix dummies
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix non default dtype on Flax models
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Use the same create_position_ids_from_input_ids for FlaxRoberta
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Make Roberta attention as Bert
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* fix copy
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Wording.
Co-authored-by: Marc van Zee <marcvanzee@gmail.com>

75627148

08 Dec, 2020 1 commit

New squad example (#8992) · 447808c8

Sylvain Gugger authored Dec 08, 2020



* Add new SQUAD example

* Same with a task-specific Trainer

* Address review comment.

* Small fixes

* Initial work for XLNet

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Final clean up and working XLNet script

* Test and debug

* Final working version

* Add tick

* Update README

* Address review comments
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

447808c8

07 Dec, 2020 3 commits
- Copyright (#8970) · 00aa9dbc
  Sylvain Gugger authored Dec 07, 2020
```
* Add copyright everywhere missing

* Style
```
  00aa9dbc
- Small fix to the run clm script (#8973) · 62d30e05
  Sylvain Gugger authored Dec 07, 2020
  
  62d30e05
- Use word_ids to get labels in run_ner (#8962) · 7f9ccffc
  Sylvain Gugger authored Dec 07, 2020
```
* Use word_ids to get labels in run_ner

* Add sanity check
```
  7f9ccffc
05 Dec, 2020 1 commit

Don't pass in token_type_ids to BART for GLUE (#8929) · 8dfc8c72

Ethan Perez authored Dec 05, 2020

Without this fix, training a `BartForSequenceClassification` model with `run_pl_glue.py` fails with `TypeError: forward() got an unexpected keyword argument 'token_type_ids'`, because BART does not accept `token_type_ids`. I've solved this the same way it is solved for the "distilbert" model, and I can now train BART models on SNLI without errors.

8dfc8c72
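
A minimal sketch of the kind of guard the message for 8dfc8c72 describes, assuming a batch laid out as `(input_ids, attention_mask, token_type_ids, labels)`. The helper name `build_inputs`, the constant `MODEL_TYPES_WITHOUT_TOKEN_TYPE_IDS`, and the batch layout are hypothetical illustrations and are not taken from `run_pl_glue.py`:

```python
# Hypothetical names for illustration only; the real script may be organized differently.
MODEL_TYPES_WITHOUT_TOKEN_TYPE_IDS = {"distilbert", "bart"}


def build_inputs(batch, model_type):
    """Build forward() keyword arguments from an assumed
    (input_ids, attention_mask, token_type_ids, labels) batch tuple."""
    inputs = {
        "input_ids": batch[0],
        "attention_mask": batch[1],
        "labels": batch[3],
    }
    # Only pass token_type_ids to model types whose forward() accepts them.
    if model_type not in MODEL_TYPES_WITHOUT_TOKEN_TYPE_IDS:
        inputs["token_type_ids"] = batch[2]
    return inputs
```

With this guard, a call such as `build_inputs(batch, "bart")` omits `token_type_ids`, while `build_inputs(batch, "bert")` still includes it.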

04 Dec, 2020 2 commits

[seq2seq] document the caveat of leaky native amp (#8930) · df311a5c

Stas Bekman authored Dec 04, 2020



* document the caveat of leaky native amp

* Update examples/seq2seq/README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

df311a5c

[s2s finetune_trainer] add instructions for distributed training (#8884) · 4c3d98dd
Stas Bekman authored Dec 03, 2020

4c3d98dd

01 Dec, 2020 1 commit
- start using training_args.parallel_mode (#8882) · 379005c9
  Stas Bekman authored Dec 01, 2020
  
  379005c9
30 Nov, 2020 3 commits

[s2s trainer] fix DP mode (#8823) · 7f34d757

Stas Bekman authored Nov 30, 2020

* fix DP case on multi-gpu

* make executable

* test all 3 modes

* use the correct check for distributed

* dp doesn't need a special case

* restore original name

* cleanup

7f34d757

Remove deprecated `evalutate_during_training` (#8852) · 55302990

Sylvain Gugger authored Nov 30, 2020



* Remove deprecated `evalutate_during_training`

* Update src/transformers/training_args_tf.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

55302990

token-classification: use is_world_process_zero instead of deprecated is_world_master() (#8828) · 19fa01ce
Stefan Schweter authored Nov 30, 2020

19fa01ce

26 Nov, 2020 4 commits
- potpourri of small fixes (#8807) · ddf3c646
  Stas Bekman authored Nov 26, 2020
  
  ddf3c646
- Fix PPLM (#8779) · 52708d26
  chutaklee authored Nov 27, 2020
```
* Fix pplm

* fix style

* make style
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
```
  52708d26
- Revert "finetune.py: specifying generation min_length (#8478)" (#8805) · 8f07f5c4
  Patrick von Platen authored Nov 26, 2020
```
This reverts commit 5aa361f3.
```
  8f07f5c4
- finetune.py: specifying generation min_length (#8478) · 5aa361f3
  Daniel Khashabi authored Nov 25, 2020
  
  5aa361f3
24 Nov, 2020 3 commits

[core] implement support for run-time dependency version checking (#8645) · 82d443a7

Stas Bekman authored Nov 24, 2020



* implement support for run-time dependency version checking

* try not escaping !

* use findall that works on py36

* small tweaks

* autoformatter worship

* simplify

* shorter names

* add support for non-versioned checks

* add deps

* revert

* tokenizers not required, check version only if installed

* make a proper distutils cmd and add make target

* tqdm must be checked before tokenizers

* workaround the DistributionNotFound peculiar setup

* handle the rest of packages in setup.py

* fully sync setup.py's install_requires - to check them all

* nit

* make install_requires more readable

* typo

* Update setup.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* restyle

* add types

* simplify

* simplify2
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

82d443a7

fix rag index names in eval_rag.py example (#8730) · a7d73cfd
Quentin Lhoest authored Nov 24, 2020

a7d73cfd

Support various BERT relative position embeddings (2nd) (#8276) · 2c83b3c3

zhiheng-huang authored Nov 24, 2020



* Support BERT relative position embeddings

* Fix typo in README.md

* Address review comment

* Fix failing tests

* [tiny] Fix style_doc.py check by adding an empty line to configuration_bert.py

* make fix copies

* fix configs of electra and albert and fix longformer

* remove copy statement from longformer

* fix albert

* fix electra

* Add bert variants forward tests for various position embeddings

* [tiny] Fix style for test_modeling_bert.py

* improve docstring

* [tiny] improve docstring and remove unnecessary dependency

* [tiny] Remove unused import

* re-add to ALBERT

* make embeddings work for ALBERT

* add test for albert
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

2c83b3c3

23 Nov, 2020 2 commits
- Fix max length in run_plm script (#8738) · 367f497d
  Sylvain Gugger authored Nov 23, 2020
  
  367f497d
- [trainer] make generate work with multigpu (#8716) · 1e45bef0
  Stas Bekman authored Nov 23, 2020
```
* make generate work with multigpu

* better fix - thanks @sgugger
```
  1e45bef0
22 Nov, 2020 1 commit
- Fix many typos (#8708) · e1f3156b
  Santiago Castro authored Nov 22, 2020
  
  e1f3156b
20 Nov, 2020 1 commit

Fix rag finetuning + add finetuning test (#8585) · 8062fa63

Quentin Lhoest authored Nov 20, 2020

* replace init_ddp_connection for index init

* style

* add finetune test

* add test data

* move generate tensors to device

* add test on EM metric

* style

* allow multi process test

* keep gloo process group for retrieval

* add multi-gpu test

* use custom accelerator

* clean test finetune

* minor

* style

* style

* typo

* use python call instead of imported main function

* return_dict fix in modeling_rag

* use float32 in retrieval

* store as float32 as well in the custom knowledge dataset example

* style

* rename to finetune_rag

* style

* update readme

* rename utils and callbacks to utils_rag and callbacks_rag

* fix test

* patrick's comments

* generate dummy data in the finetune test script

* remove dummy data files

* style

8062fa63

19 Nov, 2020 6 commits
- [examples/seq2seq] fix PL deprecation warning (#8577) · 0ad45e10
  Stas Bekman authored Nov 19, 2020
```
* fix deprecation warning

* fix
```
  0ad45e10
- Fix run_ner script (#8664) · 20b65860
  Sylvain Gugger authored Nov 19, 2020
```
* Fix run_ner script

* Pin datasets
```
  20b65860
- Fix a few last paths for the new repo org (#8666) · cb3e5c33
  Sylvain Gugger authored Nov 19, 2020
  
  cb3e5c33
- fix small typo (#8644) · a79a96dd
  Matthias authored Nov 19, 2020
```
Fixed a small typo on the XLNet and permutation language modelling section
```
  a79a96dd
- Better filtering of the model outputs in Trainer (#8633) · 4208f496
  Sylvain Gugger authored Nov 19, 2020
```
* Better filtering of the model outputs in Trainer

* Fix examples tests

* Add test for Lysandre
```
  4208f496
- fix missing return dict (#8653) · 62cd9ce9
  Quentin Lhoest authored Nov 19, 2020
  
  62cd9ce9
18 Nov, 2020 5 commits
- Update README.md (#8635) · 28d16e7a
  Tim Isbister authored Nov 19, 2020
  
  28d16e7a
- [s2s] distillation apex breaks return_dict obj (#8631) · d86d57fa
  Stas Bekman authored Nov 18, 2020
```
* apex breaks return_dict obj

* style
```
  d86d57fa
- Fix training from scratch in new scripts (#8623) · a0c62d24
  Sylvain Gugger authored Nov 18, 2020
  
  a0c62d24
- fix to adjust for #8530 changes (#8612) · cdf1b7ae
  Stas Bekman authored Nov 18, 2020
  
  cdf1b7ae
- [s2s] broken test (#8613) · 2819da02
  Stas Bekman authored Nov 18, 2020
  
  2819da02
17 Nov, 2020 4 commits

Remove deprecated (#8604) · dd52804f

Sylvain Gugger authored Nov 17, 2020



* Remove old deprecated arguments
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>

* Remove needless imports

* Fix tests
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>

dd52804f

these should run fine on multi-gpu (#8582) · f0435f5a
Stas Bekman authored Nov 17, 2020

f0435f5a

Tokenizers: ability to load from model subfolder (#8586) · 042a6aa7

Julien Chaumond authored Nov 17, 2020



* <small>tiny typo</small>

* Tokenizers: ability to load from model subfolder

* use subfolder for local files as well

* Uniformize model shortcut name => model id

* from s3 => from huggingface.co
Co-authored-by: Quentin Lhoest <lhoest.q@gmail.com>

042a6aa7

Reorganize repo (#8580) · c89bdfbe

Sylvain Gugger authored Nov 16, 2020

* Put models in subfolders

* Styling

* Fix imports in tests

* More fixes in test imports

* Sneaky hidden imports

* Fix imports in doc files

* More sneaky imports

* Finish fixing tests

* Fix examples

* Fix path for copies

* More fixes for examples

* Fix dummy files

* More fixes for example

* More model import fixes

* Is this why you're unhappy GitHub?

* Fix imports in convert command

c89bdfbe

16 Nov, 2020 1 commit

Switch `return_dict` to `True` by default. (#8530) · 1073a2bd

Sylvain Gugger authored Nov 16, 2020

* Use the CI to identify failing tests

* Remove from all examples and tests

* More default switch

* Fixes

* More test fixes

* More fixes

* Last fixes hopefully

* Run on the real suite

* Fix slow tests

1073a2bd