Commits · e3fce2f868ed3cc444b89e351f1a173d3a560724 · chenpangpang / transformers

02 Jul, 2021 4 commits
- Update README.md · e3fce2f8
  Patrick von Platen authored Jul 02, 2021
```
Thanks a lot @BirgerMoell
```
  e3fce2f8
- Fix TAPAS test uncovered by #12446 (#12480) · b889d3f6
  Lysandre Debut authored Jul 02, 2021
  
  b889d3f6
- fixed typo in flax-projects readme (#12466) · b4ecc6be
  Matthew LeMay authored Jul 02, 2021
  
  b4ecc6be
- Rework notebooks and move them to the Notebooks repo (#12471) · e52288a1
  Sylvain Gugger authored Jul 02, 2021
  
  e52288a1
01 Jul, 2021 12 commits

[roberta] fix lm_head.decoder.weight ignore_key handling (#12446) · 2d1d9218

Stas Bekman authored Jul 01, 2021



* fix lm_head.decoder.weight ignore_key handling

* fix the mutable class variable

* Update src/transformers/models/roberta/modeling_roberta.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* replicate the comment

* make deterministic
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

2d1d9218
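
The "fix the mutable class variable" item above refers to a common Python pitfall. The commit message does not show the change, so the following is only a generic sketch of the problem and one usual remedy; the class names `SharedKeys` and `OwnKeys` and the attribute `ignore_keys` are hypothetical and are not taken from `modeling_roberta.py`.

```python
class SharedKeys:
    # Class-level list: a single object shared by every instance.
    ignore_keys = ["position_ids"]

    def add_key(self, key):
        # Mutates the shared class attribute, so all instances see the change.
        self.ignore_keys.append(key)


class OwnKeys:
    # Immutable default stored on the class.
    _default_ignore_keys = ("position_ids",)

    def __init__(self):
        # Each instance gets its own list built from the default.
        self.ignore_keys = list(self._default_ignore_keys)

    def add_key(self, key):
        self.ignore_keys.append(key)


a, b = SharedKeys(), SharedKeys()
a.add_key("lm_head.decoder.weight")
shared_leak = "lm_head.decoder.weight" in b.ignore_keys  # b sees a's change

c, d = OwnKeys(), OwnKeys()
c.add_key("lm_head.decoder.weight")
own_leak = "lm_head.decoder.weight" in d.ignore_keys  # d is unaffected
```

Copying the default in `__init__` keeps per-instance edits from leaking into other instances.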

Fixing bug with param count without embeddings (#12461) · 7f0027db

Teven authored Jul 01, 2021



* fixing bug with param count without embeddings

* Update src/transformers/modeling_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

7f0027db

Validation split added: custom data files @sgugger, @patil-suraj (#12407) · d5b8fe3b

Souvic Chakraborty authored Jul 01, 2021



* Validation split added: custom data files

Validation split added in case of no validation file and loading custom data

* Updated documentation with custom file usage

Updated documentation with custom file usage

* Update README.md

* Update README.md

* Update README.md

* Made some suggested stylistic changes

* Used logger instead of print.
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Made similar changes to add validation split

In case of a missing validation file, a validation split will be used now.

* max_train_samples to be used for training only

max_train_samples got misplaced, now corrected so that it is applied on training data only, not whole data.

* styled

* changed ordering

* Improved language of documentation
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Improved language of documentation
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fixed styling issue

* Update run_mlm.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

d5b8fe3b
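
The two behaviours described in this commit (carving a validation split out of the training data when no validation file is given, and applying `max_train_samples` to the training portion only) can be sketched in plain Python. This is an illustration under assumptions, not the code in `run_mlm.py`: the function `split_then_truncate` and the parameter name `validation_split_percentage` are chosen here for the example.

```python
def split_then_truncate(records, validation_split_percentage=5, max_train_samples=None):
    """Split records into (train, validation), then cap the training part only."""
    # Reserve the first N percent of the data for validation.
    n_val = len(records) * validation_split_percentage // 100
    validation = records[:n_val]
    train = records[n_val:]
    # The cap is applied after the split, so it never shrinks the validation set.
    if max_train_samples is not None:
        train = train[:max_train_samples]
    return train, validation


records = list(range(100))
train, validation = split_then_truncate(
    records, validation_split_percentage=10, max_train_samples=20
)
```

Truncating before the split would instead shrink the whole dataset, which is the misplacement the commit message says was corrected.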

Import check_inits handling of duplicate definitions. (#12467) · f929462b
Thibault FEVRY authored Jul 01, 2021
```
* Import fix_inits handling of duplicate definitions.

* Style fix
```
f929462b

Add TPU README (#12463) · 7f87bfc9

Patrick von Platen authored Jul 01, 2021



* Add TPU README

* Apply suggestions from code review

* Update examples/research_projects/jax-projects/README.md

* Update examples/research_projects/jax-projects/README.md
Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: Stefan Schweter <stefan@schweter.it>

7f87bfc9

Update README.md · 1457839f
Patrick von Platen authored Jul 01, 2021

1457839f
Added talk details (#12465) · c18af5d4
Suzana Ilić authored Jul 01, 2021

c18af5d4
Fix training_args.py barrier for torch_xla (#12464) · 6c5b20aa
Jin Young (Daniel) Sohn authored Jul 01, 2021
```
torch_xla currently has its own synchronization primitives, so use
xm.rendezvous(tag) instead.
```
6c5b20aa
Comment fast GPU TF tests (#12452) · 2a501ac9
Lysandre Debut authored Jul 01, 2021

2a501ac9
[Wav2Vec2, Hubert] Fix ctc loss test (#12458) · 27d348f2
Patrick von Platen authored Jul 01, 2021
```
* fix_torch_device_generate_test

* remove @

* fix test
```
27d348f2

[Flax community event] How to use hub during training (#12447) · b655f16d

Patrick von Platen authored Jul 01, 2021



* fix_torch_device_generate_test

* remove @

* upload

* finish doc

* Apply suggestions from code review
Co-authored-by: Omar Sanseviero <osanseviero@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Julien Chaumond <chaumond@gmail.com>

* finish
Co-authored-by: Omar Sanseviero <osanseviero@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Julien Chaumond <chaumond@gmail.com>

b655f16d

Add test for a WordLevel tokenizer model (#12437) · 3aa37b94
SaulLu authored Jul 01, 2021
```
* add a test for a WordLevel tokenizer

* adapt common test to new tokenizer
```
3aa37b94

30 Jun, 2021 10 commits

[Flax] Add wav2vec2 (#12271) · 0d1f67e6

Patrick von Platen authored Jun 30, 2021



* fix_torch_device_generate_test

* remove @

* start flax wav2vec2

* save intermediate

* forward pass has correct shape

* add weight norm

* add files

* finish ctc

* make style

* finish gumbel quantizer

* correct docstrings

* correct some more files

* fix vit

* finish quality

* correct tests

* correct docstring

* correct tests

* start wav2vec2 pretraining script

* save intermediate

* start pretraining script

* finalize pretraining script

* finish

* finish

* small typo

* finish

* correct

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* make style

* push
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>

0d1f67e6

[JAX/Flax readme] add philosophy doc (#12419) · 3f36a2c0

Suraj Patil authored Jun 30, 2021



* add philosophy doc

* fix typos

* update doc

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* address Patrick's suggestions

* add a training example and fix typos

* jit the training step

* jit train step

* fix example code

* typo

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

3f36a2c0

Add to talks section (#12442) · 1ad1c4a8
Suzana Ilić authored Jun 30, 2021

1ad1c4a8
fix typo in mt5 configuration docstring (#12432) · 42477d68
fcakyon authored Jun 30, 2021

42477d68
Document patch release v4.8.2 · 89073a95
Lysandre authored Jun 30, 2021

89073a95

Add CANINE (#12024) · 6e685978

NielsRogge authored Jun 30, 2021



* First pass

* More progress

* Add support for local attention

* More improvements

* More improvements

* Conversion script working

* Add CanineTokenizer

* Make style & quality

* First draft of integration test

* Remove decoder test

* Improve tests

* Add documentation

* Mostly docs improvements

* Add CanineTokenizer tests

* Fix most tests on GPU, improve upsampling projection

* Address most comments by @dhgarrette

* Remove decoder logic

* Improve Canine tests, improve docs of CanineConfig

* All tokenizer tests passing

* Make fix-copies and fix tokenizer tests

* Fix test_model_outputs_equivalence test

* Apply suggestions from @sgugger's review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Address some more comments

* Add support for hidden_states and attentions of shallow encoders

* Define custom CanineModelOutputWithPooling, tests pass

* Make conversion script work for Canine-c too

* Fix tokenizer tests

* Remove file
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

6e685978

Add default bos_token and eos_token for tokenizer of deberta_v2 (#12429) · 69f57015

Jabin Huang authored Jun 30, 2021



* fix ids_to_tokens naming error in tokenizer of deberta v2

* Update tokenization_deberta_v2.py

Add bos_token and eos_token.

* format code
Co-authored-by: Jipeng Huang <jihuan@microsoft.com>

69f57015

Fix default bool in argparser (#12424) · c9486fd0
Sylvain Gugger authored Jun 30, 2021
```
* Fix default bool in argparser

* Add more to test
```
c9486fd0
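
The commit message does not describe the fix itself. As background, here is a minimal sketch of one way to give a boolean command-line option a well-defined default with the standard `argparse` module; the helper `string_to_bool` and the flag `--do_eval` are assumptions for illustration and are not presented as the library's implementation.

```python
import argparse


def string_to_bool(value):
    """Convert common textual spellings of a boolean into a bool."""
    if isinstance(value, bool):
        return value
    lowered = value.lower()
    if lowered in ("yes", "true", "t", "y", "1"):
        return True
    if lowered in ("no", "false", "f", "n", "0"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean value, got {value!r}")


parser = argparse.ArgumentParser()
# nargs="?" lets the flag appear bare (uses const) or with an explicit value.
parser.add_argument(
    "--do_eval", type=string_to_bool, nargs="?", const=True, default=False
)

args_default = parser.parse_args([])
args_flag = parser.parse_args(["--do_eval"])
args_explicit = parser.parse_args(["--do_eval", "false"])
```

With this setup the option is `False` when omitted, `True` when passed bare, and follows the given value when one is supplied.
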
Added to talks section (#12433) · 90d69456
Suzana Ilić authored Jun 30, 2021
```
Added one more confirmed speaker, zoom links and gcal event links
```
90d69456

Add option to save on each training node (#12421) · 31a81109

Sylvain Gugger authored Jun 30, 2021



* Add option to save on each training node

* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Address review comments
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

31a81109

29 Jun, 2021 11 commits

[modelcard] fix (#12422) · 990540b7

Stas Bekman authored Jun 29, 2021

this PR is fixing an incorrect attribute - probably some tests are needed?

990540b7

Easily train a new fast tokenizer from a given one (#12361) · dc42e770

Sylvain Gugger authored Jun 29, 2021



* [WIP] Easily train a new fast tokenizer from a given one

* Fix test

* Roll out to other tokenizers and add tests

* Fix bug with unk id and add emoji to test

* Really use something different in test

* Implement special tokens map

* Map special tokens in the Transformers tokenizers

* Fix test

* Make test more robust

* Fix test for BPE

* More robust map and test

Co-authored-by SaulLu

* Test file

* Stronger tests
Co-authored-by: SaulLu <lucilesaul.com@gmail.com>

* Map unk token for Wordpiece and address review comment

* Fix lowercase test and address review comment

* Fix all tests

* Simplify test

* Fix tests for realsies

* Easily train a new fast tokenizer from a given one - tackle the special tokens format (str or AddedToken) (#12420)

* Propose change in tests regarding lower case

* add new test for special tokens types

* put back the test part about decoding

* add feature: the AddedToken is re-build with the different mapped content

* Address review comment: simplify AddedToken building
Co-authored-by: sgugger <sylvain.gugger@gmail.com>

* Update src/transformers/tokenization_utils_fast.py
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: SaulLu <lucilesaul.com@gmail.com>
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>

dc42e770

Added talks (#12415) · b440b8d1
Suzana Ilić authored Jun 29, 2021

b440b8d1
minor fixes in original RAG training (#12395) · 5257818e
Shamane Siri authored Jun 30, 2021

5257818e
fix ids_to_tokens naming error in tokenizer of deberta v2 (#12412) · e3f39a29
Jabin Huang authored Jun 29, 2021
```
Co-authored-by: Jipeng Huang <jihuan@microsoft.com>
```
e3f39a29
[Flax] Example scripts - correct weight decay (#12409) · 81332868
Patrick von Platen authored Jun 29, 2021
```
* fix_torch_device_generate_test

* remove @

* finish

* finish

* correct style
```
81332868

[example/flax] add summarization readme (#12393) · aecae533

Suraj Patil authored Jun 29, 2021



* add readme

* update readme and add requirements

* Update examples/flax/summarization/README.md
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

aecae533

Fix TFWav2Vec2 SpecAugment (#12289) · 38861045
Will Rice authored Jun 29, 2021
```
* Fix TFWav2Vec2 SpecAugment

* Invert masks

* Feedback changes
```
38861045
Add out of vocabulary error to ASR models (#12288) · bc084938
Will Rice authored Jun 29, 2021
```
* Add OOV error to ASR models

* Feedback changes
```
bc084938
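
An out-of-vocabulary check of the kind this commit title describes can be sketched without any framework. The function name `check_labels_in_vocab` and the error wording are hypothetical; the actual models operate on tensors and may phrase the check differently.

```python
def check_labels_in_vocab(labels, vocab_size):
    """Raise ValueError if the largest label id is not below vocab_size."""
    highest = max(labels)
    if highest >= vocab_size:
        raise ValueError(
            f"Label id {highest} is out of vocabulary; "
            f"ids must be < vocab_size ({vocab_size})"
        )
    return labels


# All ids are below 32, so the labels pass through unchanged.
ok = check_labels_in_vocab([3, 7, 31], vocab_size=32)

# Id 32 is not a valid index into a vocabulary of size 32.
try:
    check_labels_in_vocab([3, 7, 32], vocab_size=32)
    oov_message = None
except ValueError as err:
    oov_message = str(err)
```

Failing early with a clear message is easier to debug than an indexing error raised later inside the loss computation.
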

Rename detr targets to labels (#12280) · 1fc6817a

NielsRogge authored Jun 29, 2021

* Rename target to labels in DetrFeatureExtractor

* Update DetrFeatureExtractor tests accordingly

* Improve docs of DetrFeatureExtractor

* Improve docs

* Make style

1fc6817a

[models] respect dtype of the model when instantiating it (#12316) · 7682e977

Stas Bekman authored Jun 28, 2021



* [models] respect dtype of the model when instantiating it

* cleanup

* cleanup

* rework to handle non-float dtype

* fix

* switch to fp32 tiny model

* improve

* use dtype.is_floating_point

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix the doc

* recode to use explicit torch_dtype_auto_detect, torch_dtype args

* docs and tweaks

* docs and tweaks

* docs and tweaks

* merge 2 args, add docs

* fix

* fix

* better doc

* better doc
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

7682e977

28 Jun, 2021 3 commits

[Flax] Add T5 pretraining script (#12355) · 31c3e7e7

Patrick von Platen authored Jun 28, 2021



* fix_torch_device_generate_test

* remove @

* add length computation

* finish masking

* finish

* upload

* fix some bugs

* finish

* fix dependency table

* correct tensorboard

* Apply suggestions from code review

* correct processing

* slight change init

* correct some more mistakes

* apply suggestions

* improve readme

* fix indent

* Apply suggestions from code review
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>

* correct tokenizer

* finish

* finish

* finish

* finish
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>

31c3e7e7

pass the matching trainer log level to deepspeed (#12401) · e2770748
Stas Bekman authored Jun 28, 2021

e2770748

Tensorflow LM examples (#12358) · 7e22609e

Matt authored Jun 28, 2021

* Tensorflow MLM example

* Add CLM example

* Style fixes, adding missing checkpoint code from the CLM example

* Fix TPU training, avoid massive dataset warnings

* Fix incorrect training length calculation for multi-GPU training

* Fix incorrect training length calculation for multi-GPU training

* Refactors and nitpicks from the review

* Style pass

* Adding README

7e22609e