1. 18 Sep, 2023 1 commit
    • 🚨🚨 🚨🚨 [`Tokenizer`] attempt to fix add_token issues 🚨🚨 🚨🚨 (#23909) · 2da88537
      Arthur authored
      
      
      * fix test for bart. Order is correct now let's skip BPEs
      
      * phew
      
      * styling
      
      * fix bert....
      
      * slow refactoring
      
      * current updates
      
      * massive refactoring
      
      * update
      
      * NICE!
      
      * update to see where I am at
      
      * updates
      
      * update
      
      * update
      
      * revert
      
      * updates
      
      * updates
      
      * start supporting legacy_save
      
      * styling
      
      * big update
      
      * revert some changes
      
      * nits
      
      * nniiiiiice
      
      * small fixes
      
      * kinda fix t5 with new behaviour
      
      * major update
      
      * fixup
      
      * fix copies
      
      * today's updates
      
      * fix byt5
      
      * update
      
      * update
      
      * update
      
      * updates
      
      * update vocab size test
      
      * Barthez does not need the fairseq offset ids
      
      * super call must be after
      
      * call super
      
      * move all super init
      
      * move other super init
      
      * fixup
      
      * nits
      
      * more fixes
      
      * nits
      
      * more fixes
      
      * nits
      
      * more fix
      
      * remove useless files
      
      * ouch all of them are affected
      
      * and more!
      
      * small improvements
      
      * no more sanitize token
      
      * more changes around unique no split tokens
      
      * partially fix more things
      
      * keep legacy save but add warning
      
      * so... more fixes
      
      * updates
      
      * guess deberta tokenizer could be nuked
      
      * fixup
      
      * fixup did some bad things
      
      * nuke it if it breaks
      
      * remove prints and pretrain fast from slow with new format.
      
      * fixups
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * phew
      
      * nit
      
      * by default specials should not be normalized?
      
      * update
      
      * remove breakpoint
      
      * updates
      
      * a lot of updates
      
      * fixup
      
      * fixes: revert some changes to match fast
      
      * small nits
      
      * that makes it cleaner
      
      * fix camembert accordingly
      
      * update
      
      * some less breaking changes
      
      * update
      
      * fixup
      
      * fix byt5 and whisper mostly
      
      * some more fixes, canine's byte vocab
      
      * fix gpt2
      
      * fix most of the perceiver tests (4 left)
      
      * fix layout lmv3
      
      * fixup
      
      * fix copies for gpt2 style
      
      * make sure to only warn once
      
      * fix perceiver and gpt2 tests
      
      * some more backward compatibility: also read the special tokens map because some people use it
      
      * fixup
      
      * add else when reading
      
      * nits
      
      * fresh updates
      
      * fix copies
      
      * will this make everything faster?
      
      * fixes
      
      * more fixes
      
      * update
      
      * more fixes
      
      * fixup
      
      * is the source of truth right?
      
      * sorry camembert for the troubles
      
      * current updates
      
      * fixup
      
      * update led
      
      * update
      
      * fix regression
      
      * fix single word
      
      * more model specific fixes
      
      * fix t5 tests
      
      * fixup
      
      * more comments
      
      * update
      
      * fix nllb
      
      * rstrip removed
      
      * small fixes
      
      * better handle additional_special_tokens and vocab sizes
      
      * fixing
      
      * styling
      
      * fix 4 / 21
      
      * fixup
      
      * fix nllb's tests
      
      * some fixes
      
      * fix t5
      
      * fixes
      
      * style
      
      * fix canine tests
      
      * damn this is nice
      
      * nits
      
      * m2m100 nit
      
      * fixups
      
      * fixes!
      
      * fixup
      
      * stash
      
      * fix merge
      
      * revert bad change
      
      * fixup
      
      * correct order for code Llama
      
      * fix speecht5 post merge
      
      * styling
      
      * revert source of 11 fails
      
      * small nits
      
      * all changes in one go
      
      * fnet hack
      
      * fix 2 more tests
      
      * update based on main branch of tokenizers
      
      * fixup
      
      * fix VITS issues
      
      * more fixes
      
      * fix mgp test
      
      * fix camembert issues
      
      * oops, camembert still has 2 failing tests
      
      * mluke fixes
      
      * decode fixes
      
      * small nits
      
      * nits
      
      * fix llama and vits
      
      * fix camembert
      
      * small nits
      
      * more fixes when initialising a fast tokenizer from a slow one, etc.
      
      * fix one of the last test
      
      * fix CPM tokenizer test
      
      * fixups
      
      * fix pop2piano
      
      * fixup
      
      * Change tokenizers required version
      
      * Change tokenizers required version
      
      * "tokenizers>=0.14,<0.15", don't forget smaller than
      
      * fix musicgen tests and PreTrainedTokenizerFast
      
      * fix owlvit and all
      
      * update t5
      
      * fix 800 red
      
      * fix tests
      
      * fix the fix of the fix of t5
      
      * styling
      
      * documentation nits
      
      * cache _added_tokens_encoder
      
      * fixups
      
      * Nit
      
      * fix red tests
      
      * one last nit!
      
      * make everything a lot simpler
      
      * Now it's over 😉
      
      
      
      * few small nits
      
      * Apply suggestions from code review
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * updates that work for now
      
      * tests that should not be skipped / changed and fixed next
      
      * fixup
      
      * i am ashamed
      
      * push the fix
      
      * update
      
      * fixups
      
      * nits
      
      * fix added_tokens_encoder
      
      * fix canine test
      
      * fix pegasus vocab
      
      * fix transfoXL
      
      * fixup
      
      * whisper needs to be fixed for train new
      
      * pegasus nits
      
      * more pegasus fixes
      
      * minor update
      
      * better error message in failed test
      
      * fix whisper failing test
      
      * fix whisper failing test
      
      * fix pegasus
      
      * fixup
      
      * fix **** pegasus
      
      * reset things
      
      * remove another file
      
      * attempts to fix the strange custom encoder and offset
      
      * nits here and there
      
      * update
      
      * fixup
      
      * nit
      
      * fix the whisper test
      
      * nits nits
      
      * Apply suggestions from code review
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * updates based on review
      
      * some small update to potentially remove
      
      * nits
      
      * import lru_cache
      
      * Update src/transformers/tokenization_utils_base.py
      Co-authored-by: Lysandre Debut <hi@lysand.re>
      
      * move warning to `from_pretrained`
      
      * update tests results now that the special tokens are always added
      
      ---------
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      Co-authored-by: Lysandre Debut <hi@lysand.re>
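
For context on what #23909 changes at the user level: added tokens in slow tokenizers are now tracked as `AddedToken` entries with explicit stripping/normalization options, and special tokens are always added. A minimal sketch of that surface, assuming a transformers release that includes this PR (with the `tokenizers>=0.14,<0.15` pin mentioned in the log); the checkpoint and token strings below are illustrative, not taken from the PR:

```python
from transformers import AddedToken, AutoTokenizer

# use_fast=False loads the slow (Python) tokenizer whose add_token logic the PR reworks.
tokenizer = AutoTokenizer.from_pretrained("t5-small", use_fast=False)

# Added tokens carry their own options instead of being bare strings.
tokenizer.add_tokens([AddedToken("<ent>", lstrip=False, rstrip=False, normalized=False)])
tokenizer.add_special_tokens({"additional_special_tokens": [AddedToken("<sep_x>", normalized=False)]})

# Inspect what was registered through the added-token maps.
print(tokenizer.get_added_vocab())   # {'<ent>': ..., '<sep_x>': ..., ...}
print(len(tokenizer))                # base vocab size + number of added tokens
print(tokenizer.tokenize("a <ent> b <sep_x>"))
```
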
  2. 29 Aug, 2023 1 commit
  3. 17 Aug, 2023 2 commits
  4. 02 Aug, 2023 1 commit
  5. 31 Jul, 2023 1 commit
  6. 25 Jul, 2023 1 commit
    • [`T5`, `MT5`, `UMT5`] Add [T5, MT5, UMT5]ForSequenceClassification (#24726) · 8f36ab3e
      Sebastian Husch Lee authored
      * Initial addition of t5forsequenceclassification
      
      * Adding imports and adding tests
      
      * Formatting
      
      * Running make fix-copies
      
      * Adding mt5forseq
      
      * Formatting
      
      * run make fix-copies
      
      * Adding to docs
      
      * Add model_parallel
      
      * Fix bug
      
      * Fix
      
      * Remove TODO
      
      * Fixing tests for T5ForSequenceClassification
      
      * Undo changes to dependency_versions_table.py
      
      * Change classification head to work with T5Config directly
      
      * Change seq length to let tests pass
      
      * PR comments for formatting
      
      * Formatting
      
      * Initial addition of UMT5ForSequenceClassification
      
      * Adding to inits and formatting
      
      * run make fix-copies
      
      * Add doc for UMT5ForSeqClass
      
      * Update UMT5 config
      
      * Fix docs
      
      * Skip torch fx test for SequenceClassification
      
      * Formatting
      
      * Add skip to UMT5 tests as well
      
      * Fix umt5 tests
      
      * Running make fix-copies
      
      * PR comments
      
      * Fix for change to sentence_representation
      
      * Rename seq_len to hidden_size since that's what it is
      
      * Use base_model to follow format of the rest of the library
      
      * Update docs
      
      * Extract the decoder_input_ids changes and make one liner
      
      * Make one-liner
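
A minimal usage sketch of the class this PR introduces. The checkpoint and label count are illustrative, and the classification head is freshly initialized, so the logits are only meaningful after fine-tuning:

```python
import torch
from transformers import AutoTokenizer, T5ForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = T5ForSequenceClassification.from_pretrained("t5-small", num_labels=2)

inputs = tokenizer("transformers keeps getting better", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # decoder inputs are derived internally
print(logits.shape)  # torch.Size([1, 2])
```
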
  7. 11 Jul, 2023 1 commit
  8. 30 Jun, 2023 2 commits
  9. 27 Jun, 2023 1 commit
  10. 16 Jun, 2023 1 commit
    • Big TF test cleanup (#24282) · 34037129
      Matt authored
      * Fix one BLIP arg not being optional, remove misspelled arg
      
      * Remove the lxmert test overrides and just use the base test_saved_model_creation
      
      * saved_model_creation fixes and re-enabling tests across the board
      
      * Remove unnecessary skip
      
      * Stop caching sinusoidal embeddings in speech_to_text
      
      * Fix transfo_xl compilation
      
      * Fix transfo_xl compilation
      
      * Fix the conditionals in xglm
      
      * Set the save spec only when building
      
      * Clarify comment
      
      * Move comment correctly
      
      * Correct embeddings generation for speech2text
      
      * Mark RAG generation tests as @slow
      
      * Remove redundant else:
      
      * Add comment to clarify the save_spec line in build()
      
      * Fix size tests for XGLM at last!
      
      * make fixup
      
      * Remove one band_part operation
      
      * Mark test_keras_fit as @slow
  11. 13 Jun, 2023 1 commit
  12. 24 May, 2023 1 commit
    • Better TF docstring types (#23477) · f8b25744
      Matt authored
      * Rework TF type hints to use | None instead of Optional[] for tf.Tensor
      
      * Rework TF type hints to use | None instead of Optional[] for tf.Tensor
      
      * Don't forget the imports
      
      * Add the imports to tests too
      
      * make fixup
      
      * Refactor tests that depended on get_type_hints
      
      * Better test refactor
      
      * Fix an old hidden bug in the test_keras_fit input creation code
      
      * Fix for the Deit tests
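
The hint style this PR converts to, sketched on a hypothetical function (not code from the PR). The `__future__` import is what the "Don't forget the imports" commits refer to, since `X | None` is otherwise Python 3.10+ syntax:

```python
from __future__ import annotations  # makes `tf.Tensor | None` legal on Python 3.7-3.9

import tensorflow as tf


# Before: attention_mask: Optional[tf.Tensor] = None
# After:  attention_mask: tf.Tensor | None = None
def make_mask(input_ids: tf.Tensor, attention_mask: tf.Tensor | None = None) -> tf.Tensor:
    if attention_mask is None:
        attention_mask = tf.ones_like(input_ids)
    return attention_mask


print(make_mask(tf.constant([[1, 2, 3]])))
```
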
  13. 06 Apr, 2023 1 commit
  14. 28 Feb, 2023 2 commits
    • 🔥Rework pipeline testing by removing `PipelineTestCaseMeta` 🚀 (#21516) · 871c31a6
      Yih-Dar authored
      
      
      * Add PipelineTesterMixin
      
      * remove class PipelineTestCaseMeta
      
      * move validate_test_components
      
      * Add for ViT
      
      * Add to SPECIAL_MODULE_TO_TEST_MAP
      
      * style and quality
      
      * Add feature-extraction
      
      * update
      
      * raise instead of skip
      
      * add tiny_model_summary.json
      
      * more explicit
      
      * skip tasks not in mapping
      
      * add availability check
      
      * Add Copyright
      
      * A way to disable irrelevant tests
      
      * update with main
      
      * remove disable_irrelevant_tests
      
      * skip tests
      
      * better skip message
      
      * better skip message
      
      * Add all pipeline task tests
      
      * revert
      
      * Import PipelineTesterMixin
      
      * subclass test classes with PipelineTesterMixin
      
      * Add pipeline_model_mapping
      
      * Fix import after adding pipeline_model_mapping
      
      * Fix style and quality after adding pipeline_model_mapping
      
      * Fix one more import after adding pipeline_model_mapping
      
      * Fix style and quality after adding pipeline_model_mapping
      
      * Fix test issues
      
      * Fix import requirements
      
      * Fix mapping for MobileViTModelTest
      
      * Update
      
      * Better skip message
      
      * pipeline_model_mapping could not be None
      
      * Remove some PipelineTesterMixin
      
      * Fix typo
      
      * revert tests_fetcher.py
      
      * update
      
      * rename
      
      * revert
      
      * Remove PipelineTestCaseMeta from ZeroShotAudioClassificationPipelineTests
      
      * style and quality
      
      * test fetcher for all pipeline/model tests
      
      ---------
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
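
The shape of the new setup, as a hedged sketch modeled on the ViT tests mentioned in the log: instead of the `PipelineTestCaseMeta` metaclass, each model test class declares a `pipeline_model_mapping` that the mixin reads to decide which pipeline tasks to exercise. The class name below is illustrative; in the repository the mixin lives in the test suite rather than the installed package:

```python
import unittest

from transformers import ViTForImageClassification, ViTModel

# In the transformers test suite the class would also subclass PipelineTesterMixin,
# imported from tests/test_pipeline_mixin.py.


class ViTModelTestSketch(unittest.TestCase):
    # The mixin iterates over this mapping and runs the matching pipeline tasks
    # against the tiny checkpoints listed in tiny_model_summary.json.
    pipeline_model_mapping = {
        "feature-extraction": ViTModel,
        "image-classification": ViTForImageClassification,
    }
```
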
    • [`T5`] Fix torchquant issue (#21843) · ae9230af
      Younes Belkada authored
      * fix torchquant issue
      
      * add tests
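
Roughly the scenario this fix targets: post-training dynamic quantization of T5's linear layers, after which the quantized weights are no longer plain tensors the modeling code can blindly inspect. A hedged sketch with an illustrative checkpoint and prompt:

```python
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Dynamically quantize the Linear layers to int8.
quantized = torch.quantization.quantize_dynamic(model, {torch.nn.Linear}, dtype=torch.qint8)

inputs = tokenizer("translate English to German: Hello", return_tensors="pt")
out = quantized.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```
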
  15. 22 Feb, 2023 1 commit
  16. 06 Feb, 2023 1 commit
    • Update quality tooling for formatting (#21480) · 6f79d264
      Sylvain Gugger authored
      * Result of black 23.1
      
      * Update target to Python 3.7
      
      * Switch flake8 to ruff
      
      * Configure isort
      
      * Configure isort
      
      * Apply isort with line limit
      
      * Put the right black version
      
      * adapt black in check copies
      
      * Fix copies
  17. 03 Feb, 2023 1 commit
  18. 18 Jan, 2023 1 commit
  19. 13 Dec, 2022 1 commit
  20. 28 Nov, 2022 1 commit
  21. 23 Nov, 2022 1 commit
    • change the way sentinel tokens can be retrieved (#20373) · 03ae1f06
      raghavanone authored
      * change the way sentinel tokens can be retrieved
      
      * Fix line length for doc string
      
      * Fix line length for doc string
      
      * Add a stronger test for t5 tokenization
      
      * Format file changes
      
      * Make a stronger test for filtering sentinel tokens
      
      * fix file format issues
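
For context, T5's sentinel tokens are the `<extra_id_0>` … `<extra_id_99>` mask placeholders, and this change retrieves them by filtering the tokenizer's special tokens rather than assuming they sit in a fixed id range. A hedged sketch of equivalent filtering (not the PR's exact helper):

```python
import re

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")

# Pick the sentinel tokens out of the additional special tokens by pattern,
# instead of relying on their position in the vocabulary.
sentinels = sorted(
    tok for tok in tokenizer.additional_special_tokens
    if re.fullmatch(r"<extra_id_\d+>", tok)
)
sentinel_ids = tokenizer.convert_tokens_to_ids(sentinels)
print(sentinels[:3], sentinel_ids[:3])
```
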
  22. 09 Nov, 2022 1 commit
  23. 01 Nov, 2022 1 commit
  24. 11 Oct, 2022 1 commit
    • 🚨🚨🚨 TF: Remove `TFWrappedEmbeddings` (breaking: TF embedding initialization updated for encoder-decoder models) (#19263) · 462cd641
      Joao Gante authored
      
      * added test
      
      * correct embedding init
      
      * some changes in blenderbot (incomplete)
      
      * update blenderbot (diff to be used as reference)
      
      * update blenderbot_small
      
      * update LED
      
      * update marian
      
      * update T5 and remove TFWrappedEmbeddings
      
      * nullcontext() -> ContextManagers()
      
      * fix embedding init
  25. 29 Jul, 2022 1 commit
  26. 22 Jul, 2022 1 commit
    • Update serving code to enable `saved_model=True` (#18153) · 8e838466
      amyeroberts authored
      
      
      * Add serving_output and serving methods to some vision models
      
      * Add serving outputs for DeiT
      
      * Don't convert hidden states - differing shapes
      
      * Make saveable
      
      * Fix up
      
      * Make swin saveable
      
      * Add in tests
      
      * Fix funnel tests (can't convert to tensor)
      
      * Fix numpy call
      
      * Tidy up a bit
      
      * Add in hidden states - resnet
      
      * Remove numpy
      
      * Fix failing tests - tensor shape and skipping tests
      
      * Remove duplicated function
      
      * PR comments - formatting and var names
      
      * PR comments
      Add suggestions made by Joao Gante:
      * Use tf.shape instead of shape_list
      * Use @tooslow decorator on tests
      * Simplify some of the logic
      
      * PR comments
      Address Yih-Dar Shieh's comments - make tensor names consistent and make types float
      
      * Types consistent with docs; disable test on swin (slow)
      
      * CI trigger
      
      * Change input_features to float32
      
      * Add serving_output for segformer
      
      * Fixup
      Co-authored-by: Amy Roberts <amyeroberts@users.noreply.github.com>
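
What the PR enables, in one line: TF vision models can be exported with a SavedModel serving signature via `save_pretrained(..., saved_model=True)`, which is what the added `serving_output`/`serving` methods make well-defined. A short sketch; the checkpoint and output directory are illustrative:

```python
from transformers import TFViTModel

model = TFViTModel.from_pretrained("google/vit-base-patch16-224")

# Besides the usual weights/config, also write a TensorFlow SavedModel
# (a saved_model/ subfolder) that can be loaded by TF Serving.
model.save_pretrained("exported_vit", saved_model=True)
```
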
  27. 18 Jul, 2022 1 commit
  28. 05 Jul, 2022 1 commit
  29. 04 Jul, 2022 1 commit
  30. 29 Jun, 2022 2 commits
    • Flax t5 Encoder (#17784) · 692e61e9
      Crystina authored
      
      
      * first draft adding Flax-t5-encoder and Flax-mt5-encoder
      
      * imports
      
      * after make fixup
      
      * flax t5 encoder test
      
      * black on test
      
      * make fix-copies
      
      * clean
      
      * all_model_classes -> tuple
      
      * clean test
      
      * is_encoder_decoder=False in t5-enc tester
      
      * remove file docstring before FlaxT5Encoder
      
      * black
      
      * isort
      
      * commit suggestions on src/transformers/models/t5/modeling_flax_t5.py
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * commit suggestions on src/transformers/models/t5/modeling_flax_t5.py
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * Apply suggestions from code review
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * remove _get_encoder_module
      
      * self.decoder_seq_length -> self.encoder_seq_length as t5-enc does not have decoder
      
      * bugfix - self.module_class is class itself, not instance;
      
      * docs for mt5 and t5
      
      * call -> __call__ in t5 doc
      
      * FlaxMT5EncoderModel to TYPE_HINT
      
      * run doc-builder to allow change the files
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
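
A quick usage sketch of the encoder-only Flax class added here (checkpoint is illustrative):

```python
from transformers import AutoTokenizer, FlaxT5EncoderModel

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = FlaxT5EncoderModel.from_pretrained("t5-small")

inputs = tokenizer("A sentence to encode.", return_tensors="np")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, sequence_length, d_model)
```
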
    • TF: XLA beam search + most generation-compatible models are now also XLA-generate-compatible (#17857) · e6d27ca5
      Joao Gante authored
      
      * working beam search 🎉
      
      * XLA generation compatible with ALL classes
      
      * add xla generation slow test
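
The usage pattern this unlocks, sketched with illustrative checkpoint and prompt: wrap `generate` in a jit-compiled `tf.function`, pad to fixed shapes so XLA does not retrace, and beam search now runs under XLA as well:

```python
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = TFAutoModelForSeq2SeqLM.from_pretrained("t5-small")

# jit_compile=True compiles generate (including beam search) as an XLA program.
xla_generate = tf.function(model.generate, jit_compile=True)

inputs = tokenizer(
    ["translate English to German: I love cats."],
    return_tensors="tf",
    padding="max_length",  # fixed input shapes avoid XLA recompilation
    max_length=32,
)
out = xla_generate(**inputs, num_beams=4, max_new_tokens=32)
print(tokenizer.batch_decode(out, skip_special_tokens=True))
```
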
  31. 20 Jun, 2022 1 commit
  32. 07 Jun, 2022 1 commit
  33. 03 Jun, 2022 2 commits
  34. 31 May, 2022 1 commit
  35. 18 May, 2022 1 commit