Commits · ca33278fdbb83426dff7415f78cbf4070e873db0 · chenpangpang / transformers

18 May, 2021 1 commit

Suraj Patil authored May 19, 2021



* flax gpt2

* combine masks

* handle shared embeds

* add causal LM sample

* style

* add tests

* style

* fix imports, docs, quality

* don't use cache

* add cache

* add cache 1st version

* make use cache work

* start adding test for generation

* finish generation loop compilation

* rewrite test

* finish

* update

* update

* apply sylvains suggestions

* update

* refactor

* fix typo
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

ca33278f

04 May, 2021 1 commit

[FlaxRoberta] Add FlaxRobertaModels & adapt run_mlm_flax.py (#11470) · 084a187d

Patrick von Platen authored May 04, 2021



* add flax roberta

* make style

* correct initialiazation

* modify model to save weights

* fix copied from

* fix copied from

* correct some more code

* add more roberta models

* Apply suggestions from code review

* merge from master

* finish

* finish docs
Co-authored-by: Patrick von Platen <patrick@huggingface.co>

084a187d

29 Apr, 2021 1 commit

[Flax] Add docstrings & model outputs (#11498) · f748bd42

Patrick von Platen authored Apr 29, 2021



* add attentions & hidden states

* add model outputs + docs

* finish docs

* finish tests

* finish impl

* del @

* finish

* finish

* correct test

* apply sylvains suggestions

* Update src/transformers/models/bert/modeling_flax_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* simplify more
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

f748bd42

23 Apr, 2021 1 commit

[Flax] Big FlaxBert Refactor (#11364) · 8c9b5fcb

Patrick von Platen authored Apr 23, 2021

* improve flax

* refactor

* typos

* Update src/transformers/modeling_flax_utils.py

* Apply suggestions from code review

* Update src/transformers/modeling_flax_utils.py

* fix typo

* improve error tolerance

* typo

* correct nasty saving bug

* fix from pretrained

* correct tree map

* add note

* correct weight tying

8c9b5fcb

31 Mar, 2021 1 commit

[Flax] Add other BERT classes (#10977) · e87505f3

Patrick von Platen authored Mar 31, 2021

* add first code structures

* add all bert models

* add to init and docs

* correct docs

* make style

e87505f3

30 Mar, 2021 1 commit

[WIP][Flax] Add general conversion script (#10809) · 8780caa3

Patrick von Platen authored Mar 30, 2021



* save intermediate

* finish first version

* delete some more

* improve import

* fix roberta

* Update src/transformers/modeling_flax_pytorch_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_flax_pytorch_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* small corrections

* apply all comments

* fix deterministic

* make fix-copies
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

8780caa3

18 Mar, 2021 1 commit

[Flax] Adapt Flax models to new structure (#9484) · 0b98ca36

Patrick von Platen authored Mar 18, 2021



* Create modeling_flax_eletra with code copied from modeling_flax_bert

* Add ElectraForMaskedLM and ElectraForPretraining

* Add modeling test for Flax electra and fix naming and arg in Flax Electra model

* Add documentation

* Fix code style

* Create modeling_flax_eletra with code copied from modeling_flax_bert

* Add ElectraForMaskedLM and ElectraForPretraining

* Add modeling test for Flax electra and fix naming and arg in Flax Electra model

* Add documentation

* Fix code style

* Fix code quality

* Adjust tol in assert_almost_equal due to very small difference between model output, ranging 0.0010 - 0.0016

* Remove redundant ElectraPooler

* save intermediate

* adapt

* correct bert flax design

* adapt roberta as well

* finish roberta flax

* finish

* apply suggestions

* apply suggestions
Co-authored-by: Chris Nguyen <anhtu2687@gmail.com>

0b98ca36

16 Mar, 2021 1 commit

Flax testing should not run the full torch test suite (#10725) · 9f8619c6

Patrick von Platen authored Mar 16, 2021

* make flax tests pytorch independent

* fix typo

* finish

* improve circle ci

* fix return tensors

* correct flax test

* re-add sentencepiece

* last tokenizer fixes

* finish maybe now

9f8619c6

16 Dec, 2020 1 commit

[Flax] Align FlaxBertForMaskedLM with BertForMaskedLM, implement from_pretrained, init (#9054) · 640e6fe1

Patrick von Platen authored Dec 16, 2020



* save intermediate

* save intermediate

* save intermediate

* correct flax bert model file

* new module / model naming

* make style

* almost finish BERT

* finish roberta

* make fix-copies

* delete keys file

* last refactor

* fixes in run_mlm_flax.py

* remove pooled from run_mlm_flax.py`

* fix gelu | gelu_new

* remove Module from inits

* splits

* dirty print

* preventing warmup_steps == 0

* smaller splits

* make fix-copies

* dirty print

* dirty print

* initial_evaluation argument

* declaration order fix

* proper model initialization/loading

* proper initialization

* run_mlm_flax improvements: improper model inputs bugfix + automatic dataset splitting + tokenizers parallelism warning + avoiding warmup_steps=0 bug

* removed tokenizers warning hack, fixed model re-initialization

* reverted training_args.py changes

* fix flax from pretrained

* improve test in flax

* apply sylvains tips

* update init

* make 0.3.0 compatible

* revert tevens changes

* revert tevens changes 2

* finalize revert

* fix bug

* add docs

* add pretrained to init

* Update src/transformers/modeling_flax_utils.py

* fix copies

* final improvements
Co-authored-by: TevenLeScao <teven.lescao@gmail.com>

640e6fe1

10 Dec, 2020 1 commit
- Refactor FLAX tests (#9034) · 8d4bb020
  Sylvain Gugger authored Dec 10, 2020
  
  8d4bb020