Commits · 251eb70c979d74d3823e999236ff3621b07510a1 · chenpangpang / transformers

14 Dec, 2020 10 commits

Also pin TF CPU · 251eb70c
Sylvain Gugger authored Dec 14, 2020

Pin TF to < 2.4 · e4ef57a9
Sylvain Gugger authored Dec 14, 2020


Fix T5 and BART for TF (#9063) · df3f4d2a

Julien Plu authored Dec 14, 2020

* Fix T5 for graph compilation+execution

* Fix BART

* Fix import

* Fix naming

* fix attribute name

* Oops

* fix import

* fix tests

* fix tests

* Update test

* Add missing import

* Address Patrick's comments

* Style

* Address Patrick's comment


Add parallelization support for T5EncoderModel (#9082) · a9c8bff7

Ahmed Elnaggar authored Dec 14, 2020



* add model parallelism to T5EncoderModel


* remove decoder from T5EncoderModel parallelize

* update T5EncoderModel docs

* Extend T5ModelTest for T5EncoderModel

* fix T5Stack using range for get_device_map

* fix style
Co-authored-by: Ahmed Elnaggar <elnaggar@rostlab.informatik.tu-muenchen.de>


Testing Experimental CI Features (#9070) · b00eb4fb
Stas Bekman authored Dec 14, 2020

Fixed a broken link in documentation (#9101) · 74daf1f9
Simon Brandeis authored Dec 14, 2020

correct var name in TrainingArguments docstring (#9096) · d6af344c
Navjot authored Dec 14, 2020

[RAG, Bart] Align RAG, Bart cache with T5 and other models of transformers (#9098) · fa1ddced
Patrick von Platen authored Dec 14, 2020
```
* fix rag

* fix slow test

* fix past in bart
```
Patch *ForCausalLM model (#9092) · 6587cf9f
Lysandre Debut authored Dec 14, 2020


Fix embeddings resizing in TF models (#8657) · 51d9c569

Julien Plu authored Dec 14, 2020

* Resize the biases at the same time as the embeddings

* Trigger CI

* Biases are not reset anymore

* Remove get_output_embeddings + better LM model detection in generation utils

* Apply style

* First test on BERT

* Update docstring + new name

* Apply the new resizing logic to all the models

* fix tests

* Apply style

* Update the template

* Fix naming

* Fix naming

* Apply style

* Apply style

* Remove unused import

* Revert get_output_embeddings

* Trigger CI

* Update num parameters

* Restore get_output_embeddings in TFPretrainedModel and add comments

* Style

* Add decoder resizing

* Style

* Fix tests

* Separate bias and decoder resize

* Fix tests

* Fix tests

* Apply style

* Add bias resizing in MPNet

* Trigger CI

* Apply style

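The bullets above describe resizing the LM-head bias together with the token embeddings. A minimal sketch of that idea in plain Python follows. `resize_rows` and `resize_bias` are hypothetical helpers written for illustration, not functions from transformers, and filling new entries with a constant is an assumption.

```python
from typing import List


def resize_rows(matrix: List[List[float]], new_size: int, fill: float = 0.0) -> List[List[float]]:
    """Return a copy of `matrix` with exactly `new_size` rows.

    Existing rows are kept (and dropped from the end if the vocabulary
    shrinks); new rows are filled with `fill`. The constant fill is an
    assumption made for this sketch.
    """
    width = len(matrix[0]) if matrix else 0
    kept = [row[:] for row in matrix[:new_size]]
    added = [[fill] * width for _ in range(new_size - len(kept))]
    return kept + added


def resize_bias(bias: List[float], new_size: int, fill: float = 0.0) -> List[float]:
    """Resize a per-token bias vector so it stays aligned with the embeddings."""
    kept = bias[:new_size]
    return kept + [fill] * (new_size - len(kept))


# Toy vocabulary of 3 tokens with 2-dimensional embeddings.
embeddings = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
bias = [1.0, 2.0, 3.0]

# Grow the vocabulary to 5 tokens. Both structures change together,
# otherwise the output layer and its bias disagree on the vocabulary size.
new_embeddings = resize_rows(embeddings, 5)
new_bias = resize_bias(bias, 5)
```

Keeping the two resizes as separate helpers mirrors the "Separate bias and decoder resize" step in the list, though the actual split in the library may differ.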

11 Dec, 2020 16 commits

[model_cards] Migrate cards from this repo to model repos on huggingface.co (#9013) · 3552d0e0

Julien Chaumond authored Dec 12, 2020



* rm all model cards

* Update the .rst

@sgugger it is still not super crystal clear/streamlined, so let me know if you have any ideas to make it simpler

* Add a rootlevel README.md with simple instructions/context

* Update docs/source/model_sharing.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* make style

* rm all model cards
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>


Fix min_null_pred in the run_qa script (#9067) · 29e45979
Sylvain Gugger authored Dec 11, 2020

Make ProphetNetModel really compatible with EncoderDecoder (#9033) · 9cc9f412
Patrick von Platen authored Dec 11, 2020
```
* improve

* finish

* upload model

* fix lm head

* fix test
```

Bump notebook in /examples/research_projects/movement-pruning/lxmert (#9062) · 24f6cdea

dependabot[bot] authored Dec 11, 2020

Bumps [notebook](https://github.com/jupyter/jupyterhub) from 6.1.4 to 6.1.5.
- [Release notes](https://github.com/jupyter/jupyterhub/releases)
- [Changelog](https://github.com/jupyterhub/jupyterhub/blob/master/CHECKLIST-Release.md)
- [Commits](https://github.com/jupyter/jupyterhub/commits)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>


Remove docs only check (#9065) · 91fa7072
Lysandre Debut authored Dec 11, 2020

Fix PreTrainedTokenizer.pad when first inputs are empty (#9018) · 70527ba6
Sylvain Gugger authored Dec 11, 2020
```
* Fix PreTrainedTokenizer.pad when first inputs are empty

* Handle empty inputs case
```
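The fix above concerns padding a batch whose first inputs are empty. The sketch below shows one way to handle that case in plain Python. `first_non_empty` and `pad_batch` are hypothetical helpers for illustration and are not the `PreTrainedTokenizer.pad` implementation.

```python
from typing import List, Optional, Sequence


def first_non_empty(batch: Sequence[Sequence[int]]) -> Optional[int]:
    """Return the first token found in the batch, skipping empty sequences.

    Inspecting only batch[0] fails when the leading inputs are empty,
    so we scan until a non-empty sequence is found.
    """
    for seq in batch:
        if len(seq) > 0:
            return seq[0]
    return None  # every sequence is empty


def pad_batch(batch: Sequence[Sequence[int]], pad_id: int = 0) -> List[List[int]]:
    """Right-pad every sequence to the length of the longest one."""
    max_len = max((len(seq) for seq in batch), default=0)
    return [list(seq) + [pad_id] * (max_len - len(seq)) for seq in batch]


batch = [[], [], [101, 7592, 102]]
probe = first_non_empty(batch)  # element a caller could use to inspect the input type
padded = pad_batch(batch)
```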

Reorganize examples (#9010) · 783d7d26

Sylvain Gugger authored Dec 11, 2020



* Reorganize example folder

* Continue reorganization

* Change requirements for tests

* Final cleanup

* Finish regroup with tests all passing

* Copyright

* Requirements and readme

* Make a full link for the documentation

* Address review comments

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Add symlink

* Reorg again

* Apply suggestions from code review
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

* Adapt title

* Update to new structure

* Remove test

* Update READMEs
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>


update tatoeba workflow (#9051) · 86896de0
Suraj Patil authored Dec 11, 2020


Create README.md (#8096) · 7c8f5f64

Ganesh Kharad authored Dec 11, 2020



* Create README.md

* Fix model card
Co-authored-by: Julien Chaumond <julien@huggingface.co>


Create README.md (#8281) · 5527f787

RamonMamon authored Dec 11, 2020



* Create README.md

* Update model_cards/kiri-ai/distiluse-base-multilingual-cased-et/README.md
Co-authored-by: Julien Chaumond <chaumond@gmail.com>


Create README.md (#8751) · c615df74

joangines authored Dec 11, 2020



* Create README.md

* Update model_cards/Cinnamon/electra-small-japanese-generator/README.md
Co-authored-by: Julien Chaumond <chaumond@gmail.com>


QARiB Arabic and dialects models (#8796) · 76df5593

Ahmed Abdelali authored Dec 11, 2020



* Add QARiB models

* fix README.md

* Fix README.md

* Fix README.md

* Fix README.md

* Fix QARiB files

* add models card for QARiB models 860k, 1790k, and 1970k

* try to fix PR

* re-add files

* links aren't allowed here :)
Co-authored-by: Ahmed Abdelali <aabdelali@hbku.edu.qa>
Co-authored-by: Julien Chaumond <julien@huggingface.co>


Update README.md (#8820) · b161f1ae
moniquebm authored Dec 11, 2020


Initial README for `t5-base-indonesian-summarization-cased` model (#9028) · 649d389d

Panggi Libersa Jasri Akadol authored Dec 11, 2020

* Create README.md

Initial README for `t5-base-indonesian-summarization-cased` model

* Update README for t5-base-indonesian-summarization-cased

Typo in README, change from `small` to `base`


Create README.md (#9030) · 5e794b66
Panggi Libersa Jasri Akadol authored Dec 11, 2020
```
Initial README for `t5-small-indonesian-summarization-cased` model
```
🎨 Change nn.dropout to layer.Dropout (#9047) · 935e3469
Cola authored Dec 11, 2020


10 Dec, 2020 7 commits
- Remove value error (#8985) · b01ddc95
  Julien Plu authored Dec 10, 2020
```
* Remove value error

* Try a fix for parameter ordering

* Restore previous behavior

* Add documentation

* Review the comment
```
- Fix typo #9012 (#1) (#9038) · 91ab02af
  NatLun137 authored Dec 10, 2020
```
There is a tiny typo in the code "transformers/examples/language-modeling/run_mlm_wwm.py" at line 284. [Details.](https://github.com/huggingface/transformers/issues/9012)
```
- Refactor FLAX tests (#9034) · 8d4bb020
  Sylvain Gugger authored Dec 10, 2020
  
- Enforce all objects in the main init are documented (#9014) · 1310e1a7
  Sylvain Gugger authored Dec 10, 2020
  
- MPNet copyright files (#9015) · 51e81e58
  Sylvain Gugger authored Dec 10, 2020
  
- Fix documentation of book in LayoutLM (#9017) · 35bffd70
  Sylvain Gugger authored Dec 10, 2020
  
- ✏ Fix typo (#9020) · c95de29e
  Cola authored Dec 10, 2020
  
09 Dec, 2020 7 commits

[wip] [ci] doc-job-skip take #4 dry-run (#8980) · 5e637e6c

Stas Bekman authored Dec 09, 2020

* ci-doc-job-skip-take-4

* wip

* wip

* wip

* wip

* skip yaml

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* ready to test

* yet another way

* trying with HEAD

* trying with head.sha

* trying with head.sha fix

* trying with head.sha fix wip

* undo

* try to switch to sha

* current branch

* current branch

* PR number check

* joy ride

* joy ride

* joy ride

* joy ride

* joy ride

* joy ride

* joy ride

* joy ride

* joy ride

* joy ride

* joy ride

* joy ride


[Bart] Refactor - fix issues, consistency with the library, naming (#8900) · 06971ac4

Patrick von Platen authored Dec 09, 2020

* remove make on the fly linear embedding

* start refactor

* big first refactor

* save intermediate

* save intermediate

* correct mask issue

* save tests

* refactor padding masks

* make all tests pass

* further refactor

* make pegasus test pass

* fix bool if

* fix leftover tests

* continue

* bart renaming

* delete torchscript test hack

* fix imports in tests

* correct shift

* fix docs and repo cons

* re-add fix for FSTM

* typo in test

* fix typo

* fix another typo

* continue

* hot fix 2 for tf

* small fixes

* refactor types linting

* continue

* finish refactor

* fix import in tests

* better bart names

* further refactor and add test

* delete hack

* apply Sylvain's and Lysandre's comments

* small perf improv

* further perf improv

* improv perf

* fix typo

* make style

* small perf improv


Flax Masked Language Modeling training example (#8728) · 75627148

Funtowicz Morgan authored Dec 09, 2020



* Remove "Model" suffix from Flax models to look more :hugs:
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Initial working (forward + backward) for Flax MLM training example.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Simplify code
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Addressing comments, using module and moving to LM task.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Restore parameter name "module" wrongly renamed model.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Restore correct output ordering...
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Actually commit the example 😅
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Add FlaxBertModelForMaskedLM after rebasing.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Make it possible to initialize the training from scratch
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Reuse flax linen example of cross entropy loss
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added specific data collator for flax
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Remove todo for data collator
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added evaluation step
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added ability to provide dtype to support bfloat16 on TPU
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Enable flax tensorboard output
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Enable jax.pmap support.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Ensure batches are correctly sized to be dispatched with jax.pmap
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Enable bfloat16 with --fp16 cmdline args
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Correctly export metrics to tensorboard
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added dropout and ability to use it.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Effectively enable & disable during training and evaluation steps.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Oops.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Enable specifying kernel initializer scale
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Style.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Added warmup step to the learning rate scheduler.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix typo.
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Print training loss
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Make style
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* fix linter issue (flake8)
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix model matching
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix dummies
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Fix non default dtype on Flax models
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Use the same create_position_ids_from_input_ids for FlaxRoberta
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Make Roberta attention as Bert
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* fix copy
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>

* Wording.
Co-authored-by: Marc van Zee <marcvanzee@gmail.com>

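One of the steps above adds a warmup phase to the learning rate scheduler. A minimal sketch follows, assuming a linear warmup followed by a constant rate. `warmup_lr` and its parameters are hypothetical and are not taken from the example script.

```python
def warmup_lr(step: int, base_lr: float, warmup_steps: int) -> float:
    """Linearly ramp the learning rate from 0 to `base_lr`, then hold it.

    The linear shape and the constant plateau are assumptions made for
    illustration; real schedules often decay after the warmup.
    """
    if warmup_steps <= 0:
        return base_lr  # no warmup requested
    scale = min(1.0, step / warmup_steps)
    return base_lr * scale


# Learning rate for the first six steps with a 4-step warmup.
schedule = [warmup_lr(s, base_lr=1.0, warmup_steps=4) for s in range(6)]
```

The early steps use a reduced rate so that large initial gradients do not destabilise training before the optimizer statistics settle.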

Add MP Net 2 (#9004) · df2af6d8
StillKeepTry authored Dec 09, 2020

fixes #8968 (#9009) · 87291098
cronoik authored Dec 09, 2020

Add the code_search_net dataset tag to CodeBERTa model cards (#9005) · e977ed21
Simon Brandeis authored Dec 09, 2020

push (#9008) · da37a21c
Patrick von Platen authored Dec 09, 2020
