- 22 Dec, 2020 2 commits
-
-
Sylvain Gugger authored
-
Manuel Romero authored
-
- 16 Dec, 2020 1 commit
-
-
Patrick von Platen authored
* save intermediate
* save intermediate
* save intermediate
* correct flax bert model file
* new module / model naming
* make style
* almost finish BERT
* finish roberta
* make fix-copies
* delete keys file
* last refactor
* fixes in run_mlm_flax.py
* remove pooled from run_mlm_flax.py
* fix gelu | gelu_new
* remove Module from inits
* splits
* dirty print
* preventing warmup_steps == 0
* smaller splits
* make fix-copies
* dirty print
* dirty print
* initial_evaluation argument
* declaration order fix
* proper model initialization/loading
* proper initialization
* run_mlm_flax improvements: improper model inputs bugfix + automatic dataset splitting + tokenizers parallelism warning + avoiding warmup_steps=0 bug
* removed tokenizers warning hack, fixed model re-initialization
* reverted training_args.py changes
* fix flax from pretrained
* improve test in flax
* apply Sylvain's tips
* update init
* make 0.3.0 compatible
* revert Teven's changes
* revert Teven's changes 2
* finalize revert
* fix bug
* add docs
* add pretrained to init
* Update src/transformers/modeling_flax_utils.py
* fix copies
* final improvements

Co-authored-by: TevenLeScao <teven.lescao@gmail.com>
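The refactor above ends with `from_pretrained` support for the Flax models. A minimal usage sketch, assuming the standard Transformers API; the checkpoint name is only illustrative and none of this is quoted from the commit:

```python
# Minimal usage sketch (not the commit's code): loading the refactored Flax BERT
# via from_pretrained. The checkpoint name is only an illustrative example.
from transformers import BertTokenizerFast, FlaxBertModel

tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")
model = FlaxBertModel.from_pretrained("bert-base-cased")

inputs = tokenizer("Flax models can now be loaded with from_pretrained.", return_tensors="np")
outputs = model(**inputs)
print(outputs[0].shape)  # (batch_size, sequence_length, hidden_size)
```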
-
- 15 Dec, 2020 1 commit
-
-
Teven authored
* replaced jnp.split + removing textual model inputs + ensuring warmup_steps > 0
* Add automatic dataset splitting in language-modeling examples
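A rough sketch of what automatic dataset splitting can look like with Hugging Face Datasets when a corpus ships without a validation split; the dataset name and the 5% ratio are assumptions for illustration, not taken from the commit:

```python
# Illustrative sketch (not the example script itself): creating a validation split
# automatically when the dataset has none.
from datasets import load_dataset

raw = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")
# train_test_split is a built-in Datasets method; 5% held out here is an arbitrary choice.
splits = raw.train_test_split(test_size=0.05, seed=42)
train_dataset, eval_dataset = splits["train"], splits["test"]
print(len(train_dataset), len(eval_dataset))
```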
-
- 11 Dec, 2020 1 commit
-
-
Sylvain Gugger authored
* Reorganize example folder
* Continue reorganization
* Change requirements for tests
* Final cleanup
* Finish regroup with tests all passing
* Copyright
* Requirements and readme
* Make a full link for the documentation
* Address review comments
* Apply suggestions from code review (Co-authored-by: Lysandre Debut <lysandre@huggingface.co>)
* Add symlink
* Reorg again
* Apply suggestions from code review (Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>)
* Adapt title
* Update to new structure
* Remove test
* Update READMEs

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
-
- 10 Dec, 2020 1 commit
-
-
NatLun137 authored
There is a tiny typo in "transformers/examples/language-modeling/run_mlm_wwm.py" at line 284. [Details.](https://github.com/huggingface/transformers/issues/9012)
-
- 09 Dec, 2020 1 commit
-
-
Funtowicz Morgan authored
* Remove "Model" suffix from Flax models to look more :hugs:
* Initial working (forward + backward) for Flax MLM training example.
* Simply code
* Addressing comments, using module and moving to LM task.
* Restore parameter name "module" wrongly renamed model.
* Restore correct output ordering...
* Actually commit the example 😅
* Add FlaxBertModelForMaskedLM after rebasing.
* Make it possible to initialize the training from scratch
* Reuse flax linen example of cross entropy loss
* Added specific data collator for flax
* Remove todo for data collator
* Added evaluation step
* Added ability to provide dtype to support bfloat16 on TPU
* Enable flax tensorboard output
* Enable jax.pmap support.
* Ensure batches are correctly sized to be dispatched with jax.pmap
* Enable bfloat16 with --fp16 cmdline args
* Correctly export metrics to tensorboard
* Added dropout and ability to use it.
* Effectively enable & disable during training and evaluation steps.
* Oops.
* Enable specifying kernel initializer scale
* Style.
* Added warmup step to the learning rate scheduler.
* Fix typo.
* Print training loss
* Make style
* fix linter issue (flake8)
* Fix model matching
* Fix dummies
* Fix non default dtype on Flax models
* Use the same create_position_ids_from_input_ids for FlaxRoberta
* Make Roberta attention as Bert
* fix copy
* Wording.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
Co-authored-by: Marc van Zee <marcvanzee@gmail.com>
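Several bullets above (bfloat16 dtype support, jax.pmap, and sizing batches for dispatch) describe one device-parallel pattern. A schematic, self-contained sketch of that pattern; names and shapes are illustrative, not the training script's actual code:

```python
# Schematic sketch of the bfloat16 + jax.pmap + batch-sharding ideas listed above.
import jax
import jax.numpy as jnp
import numpy as np

n_devices = jax.local_device_count()

def shard(batch):
    # Reshape the leading batch axis to (n_devices, per_device_batch, ...) so the
    # batch can be dispatched with jax.pmap; the global batch size must be a
    # multiple of the device count.
    return jax.tree_util.tree_map(
        lambda x: x.reshape((n_devices, -1) + x.shape[1:]), batch
    )

@jax.pmap
def eval_step(batch):
    # Stand-in for the model's forward pass, cast to bfloat16 as on TPU.
    return jnp.mean(batch["input_ids"].astype(jnp.bfloat16), axis=-1)

batch = {"input_ids": np.ones((n_devices * 8, 128), dtype=np.int32)}
print(eval_step(shard(batch)).shape)  # (n_devices, 8)
```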
-
- 07 Dec, 2020 1 commit
-
-
Sylvain Gugger authored
-
- 23 Nov, 2020 1 commit
-
-
Sylvain Gugger authored
-
- 19 Nov, 2020 1 commit
-
-
Matthias authored
Fixed a small typo in the XLNet and permutation language modelling section
-
- 18 Nov, 2020 2 commits
-
-
Tim Isbister authored
-
Sylvain Gugger authored
-
- 17 Nov, 2020 1 commit
-
-
Julien Chaumond authored
* tiny typo
* Tokenizers: ability to load from model subfolder
* use subfolder for local files as well
* Uniformize model shortcut name => model id
* from s3 => from huggingface.co

Co-authored-by: Quentin Lhoest <lhoest.q@gmail.com>
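A hypothetical sketch of the subfolder loading described above; the repo id and subfolder name are made up for illustration, and it assumes `from_pretrained` accepts the `subfolder` argument this change introduces:

```python
# Hypothetical sketch: loading tokenizer files stored under a subdirectory of a
# model repo on huggingface.co. The repo id and subfolder name do not exist and
# are placeholders for illustration only.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("some-org/some-model", subfolder="tokenizer")
# Per the commit, the same subfolder argument also applies to local directories.
```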
-
- 12 Nov, 2020 2 commits
-
-
Julien Plu authored
-
zeyuyun1 authored
-
- 04 Nov, 2020 4 commits
-
-
Sylvain Gugger authored
* Clean up data collators and datasets
* Apply suggestions from code review (Co-authored-by: Lysandre Debut <lysandre@huggingface.co>)
* Remove needless clone

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
-
Manuel Romero authored
-
Sylvain Gugger authored
-
Pengzhi Gao authored
-
- 02 Nov, 2020 1 commit
-
-
Sylvain Gugger authored
* Make line by line optional in run_mlm
* Add option to disable dynamic padding
* Add option to plm too and update README
* Typos
* More typos
* Even more typos
* Apply suggestions from code review (Co-authored-by: Lysandre Debut <lysandre@huggingface.co>)

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
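A small sketch of the two padding strategies these new options toggle, using the standard collators rather than quoting run_mlm.py itself; the boolean flag below is an illustrative local variable, not the script's actual argument handling:

```python
# Sketch of static vs. dynamic padding for masked language modeling.
from transformers import AutoTokenizer, DataCollatorForLanguageModeling, default_data_collator

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
pad_to_max_length = False  # illustrative stand-in for the command-line option

if pad_to_max_length:
    # Examples are already padded to max length at tokenization time,
    # so a simple collator is enough.
    data_collator = default_data_collator
else:
    # Dynamic padding: each batch is padded only up to its longest example.
    data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
```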
-
- 30 Oct, 2020 2 commits
-
-
Sylvain Gugger authored
-
Sylvain Gugger authored
* Finish the cleanup of the language-modeling examples
* Update main README
* Apply suggestions from code review (Co-authored-by: Lysandre Debut <lysandre@huggingface.co>)
* Apply suggestions from code review (Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>)
* Propagate changes

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
-
- 29 Oct, 2020 2 commits
-
-
wlhgtc authored
* ADD: add whole word mask proxy for both eng and chinese
* MOD: adjust format
* MOD: reformat code
* MOD: update import
* MOD: fix bug
* MOD: add import
* MOD: fix bug
* MOD: decouple code and update readme
* MOD: reformat code
* Update examples/language-modeling/README.md
* Update examples/language-modeling/README.md
* Update examples/language-modeling/run_language_modeling.py
* Update examples/language-modeling/run_language_modeling.py
* Update examples/language-modeling/run_language_modeling.py
* Update examples/language-modeling/run_language_modeling.py
* change wwm to whole_word_mask
* reformat code
* reformat
* format
* Code quality
* ADD: update chinese ref readme
* MOD: small changes
* MOD: small changes2
* update readme
* fix eval ref file miss bug
* format file
* MOD: move ref code to contrib
* MOD: add delimiter check
* reformat code
* reformat code
* Update examples/language-modeling/README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
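For context, a minimal sketch of the whole-word-mask collator this work centers on; the checkpoint name is illustrative and the Chinese reference-file plumbing from the commit is not shown:

```python
# Minimal sketch of whole word masking with the standard collator.
from transformers import AutoTokenizer, DataCollatorForWholeWordMask

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
# Masks every sub-token of a selected word together, rather than sub-tokens independently.
data_collator = DataCollatorForWholeWordMask(tokenizer=tokenizer, mlm_probability=0.15)
```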
-
Sylvain Gugger authored
* Add a template for example scripts and apply it to mlm
* Formatting
* Fix test
* Add plm script
* Styling
-
- 28 Oct, 2020 1 commit
-
-
Sylvain Gugger authored
* New run_clm script
* Formatting
* More comments
* Remove unused imports
* Apply suggestions from code review (Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>)
* Address review comments
* Change link to the hub

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
-
- 26 Oct, 2020 1 commit
-
-
mohammadreza-Banaei73 authored
`--wwm` cannot be used as an argument with run_language_modeling.py and should be changed to `--whole_word_mask`
-
- 22 Oct, 2020 1 commit
-
-
wlhgtc authored
* ADD: add whole word mask proxy for both eng and chinese
* MOD: adjust format
* MOD: reformat code
* MOD: update import
* MOD: fix bug
* MOD: add import
* MOD: fix bug
* MOD: decouple code and update readme
* MOD: reformat code
* Update examples/language-modeling/README.md
* Update examples/language-modeling/README.md
* Update examples/language-modeling/run_language_modeling.py
* Update examples/language-modeling/run_language_modeling.py
* Update examples/language-modeling/run_language_modeling.py
* Update examples/language-modeling/run_language_modeling.py
* change wwm to whole_word_mask
* reformat code
* reformat
* format
* Code quality
* ADD: update chinese ref readme
* MOD: small changes
* MOD: small changes2
* update readme

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
-
- 12 Oct, 2020 2 commits
- 01 Sep, 2020 1 commit
-
-
Jin Young (Daniel) Sohn authored
* Add cache_dir to save TextDataset features. This is for the case where the dataset is on a read-only filesystem, which is the case in tests (GKE TPU tests).
* style
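A short sketch of the new argument, assuming the `TextDataset` signature with an optional `cache_dir`; both paths below are illustrative placeholders:

```python
# Sketch: cached features can be written to a writable location even when the
# corpus lives on a read-only filesystem (as in the GKE TPU test setup mentioned above).
from transformers import AutoTokenizer, TextDataset

tokenizer = AutoTokenizer.from_pretrained("gpt2")
dataset = TextDataset(
    tokenizer=tokenizer,
    file_path="/readonly/corpus.txt",  # illustrative read-only input
    block_size=128,
    cache_dir="/tmp/lm_features",      # illustrative writable location for cached features
)
```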
-
- 26 Aug, 2020 1 commit
-
-
Lysandre authored
-
- 29 Jul, 2020 1 commit
-
-
Lysandre Debut authored
-
- 07 Jul, 2020 1 commit
-
-
Shashank Gupta authored
* Added data collator for XLNet language modeling and related calls. Added DataCollatorForXLNetLanguageModeling in data/data_collator.py to generate the necessary inputs for language modeling training with XLNetLMHeadModel, and added the related arguments, logic and calls in examples/language-modeling/run_language_modeling.py. Resolves: #4739, #2008 (partially)
* Changed name to `DataCollatorForPermutationLanguageModeling`. Changed the name of `DataCollatorForXLNetLanguageModeling` to the more general `DataCollatorForPermutationLanguageModeling`. Removed the `--mlm` flag requirement for the new collator and defined a separate `--plm_probability` flag for its use. CTRL uses a CLM loss just like GPT and GPT-2, so it should work out of the box with this script (provided `past` is taken care of similarly to `mems` for XLNet). Changed calls and imports appropriately.
* Added detailed comments, changed variable names. Added more detailed comments to `DataCollatorForPermutationLanguageModeling` in `data/data_collator.py` to explain how it works. Also cleaned up the variable names and made them more informative.
* Added tests for the new data collator. Added tests in `tests/test_trainer.py` for DataCollatorForPermutationLanguageModeling based on those in DataCollatorForLanguageModeling. A specific test has been added to check for odd-length sequences.
* Fixed styling issues
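A minimal sketch of the collator described above; the values shown are the documented defaults for permutation language modeling and the checkpoint name is illustrative:

```python
# Sketch: constructing the permutation language modeling collator for XLNet-style training.
from transformers import AutoTokenizer, DataCollatorForPermutationLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("xlnet-base-cased")
data_collator = DataCollatorForPermutationLanguageModeling(
    tokenizer=tokenizer,
    plm_probability=1 / 6,  # fraction of tokens to predict within each context span
    max_span_length=5,      # maximum length of a span of masked tokens
)
```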
-
- 25 May, 2020 1 commit
-
-
Antonis Maronikolakis authored
-
- 19 May, 2020 1 commit
-
-
Julien Chaumond authored
* Distributed eval: SequentialDistributedSampler + gather all results
* For consistency only write to disk from world_master (closes https://github.com/huggingface/transformers/issues/4272)
* Working distributed eval
* Hook into scripts
* Fix #3721 again
* TPU.mesh_reduce: stay in tensor space (thanks @jysohn23)
* Just a small comment
* whitespace
* torch.hub: pip install packaging
* Add test scenarii
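The "gather all results" step can be illustrated with plain torch.distributed. This is not the Trainer's actual implementation, only a sketch of the idea; it assumes an initialized process group and same-shaped prediction tensors on every process:

```python
# Illustrative sketch of gathering per-process evaluation outputs onto every process.
import torch
import torch.distributed as dist

def gather_eval_outputs(local_preds: torch.Tensor) -> torch.Tensor:
    # Each process contributes its shard of predictions; all processes receive the full set.
    gathered = [torch.zeros_like(local_preds) for _ in range(dist.get_world_size())]
    dist.all_gather(gathered, local_preds)
    return torch.cat(gathered, dim=0)
```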
-
- 18 May, 2020 1 commit
-
-
Boris Dayma authored
-
- 15 May, 2020 1 commit
-
-
Julien Chaumond authored
-
- 14 May, 2020 1 commit
-
-
Julien Chaumond authored
see context in https://github.com/huggingface/transformers/pull/4223
-
- 13 May, 2020 1 commit
-
-
Julien Chaumond authored
* Improvements to the wandb integration
* small reorg + no global necessary
* feat(trainer): log epoch and final metrics
* Simplify logging a bit
* Fixup
* Fix crash when just running eval

Co-authored-by: Chris Van Pelt <vanpelt@gmail.com>
Co-authored-by: Boris Dayma <boris.dayma@gmail.com>
-
- 08 May, 2020 1 commit
-
-
Julien Chaumond authored
* [TPU] Doc, fix xla_spawn.py, only preprocess dataset once
* Update examples/README.md
* [xla_spawn] Add `_mp_fn` to other Trainer scripts
* [TPU] Fix: eval dataloader was None
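A sketch of the `_mp_fn` hook that xla_spawn.py expects each example script to expose; the body is schematic and `main` stands in for the script's own entry point:

```python
# Schematic sketch of the per-script TPU entry point used with xla_spawn.py.
def main():
    ...  # the example script's argument parsing, training and evaluation


def _mp_fn(index):
    # Invoked once per TPU core by torch_xla's xmp.spawn; `index` is the process index.
    main()
```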
-