Unverified Commit 2da88537 authored by Arthur's avatar Arthur Committed by GitHub
Browse files

🚨🚨 🚨🚨 [`Tokenizer`] attemp to fix add_token issues🚨🚨 🚨🚨 (#23909)

* fix test for bart. Order is correct now let's skip BPEs

* ouf

* styling

* fix bert....

* slow refactoring

* current updates

* massive refactoring

* update

* NICE!

* update to see where I am at

* updates

* update

* update

* revert

* updates

* updates

* start supporting legacy_save

* styling

* big update

* revert some changes

* nits

* nniiiiiice

* small fixes

* kinda fix t5 with new behaviour

* major update

* fixup

* fix copies

* today's updates

* fix byt5

* upfate

* update

* update

* updates

* update vocab size test

* Barthez does not use not need the fairseq offset ids

* super calll must be after

* calll super

* move all super init

* move other super init

* fixup

* nits

* more fixes

* nits

* more fixes

* nits

* more fix

* remove useless files

* ouch all of them are affected
...
parent 835b0a05
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment