1. 16 Nov, 2023 1 commit
  2. 15 Nov, 2023 1 commit
  3. 26 Oct, 2023 1 commit
  4. 23 Oct, 2023 1 commit
  5. 19 Oct, 2023 1 commit
    • Pin Keras for now (#26904) · cbd278f0
      Matt authored
      * Pin Keras for now out of paranoia
      
      * Add the keras pin to _tests_requirements.txt too
      
      * Make sure the Keras version matches the TF one
      
      * make fixup
  6. 06 Oct, 2023 1 commit
  7. 21 Sep, 2023 1 commit
  8. 18 Sep, 2023 1 commit
    • 🚨🚨 🚨🚨 [`Tokenizer`] attempt to fix add_token issues 🚨🚨 🚨🚨 (#23909) · 2da88537
      Arthur authored
      
      
      * fix test for bart. Order is correct now let's skip BPEs
      
      * ouf
      
      * styling
      
      * fix bert....
      
      * slow refactoring
      
      * current updates
      
      * massive refactoring
      
      * update
      
      * NICE!
      
      * update to see where I am at
      
      * updates
      
      * update
      
      * update
      
      * revert
      
      * updates
      
      * updates
      
      * start supporting legacy_save
      
      * styling
      
      * big update
      
      * revert some changes
      
      * nits
      
      * nniiiiiice
      
      * small fixes
      
      * kinda fix t5 with new behaviour
      
      * major update
      
      * fixup
      
      * fix copies
      
      * today's updates
      
      * fix byt5
      
      * update
      
      * update
      
      * update
      
      * updates
      
      * update vocab size test
      
      * Barthez does not need the fairseq offset ids
      
      * super call must be after
      
      * call super
      
      * move all super init
      
      * move other super init
      
      * fixup
      
      * nits
      
      * more fixes
      
      * nits
      
      * more fixes
      
      * nits
      
      * more fix
      
      * remove useless files
      
      * ouch all of them are affected
      
      * and more!
      
      * small improvements
      
      * no more sanitize token
      
      * more changes around unique no split tokens
      
      * partially fix more things
      
      * keep legacy save but add warning
      
      * so... more fixes
      
      * updates
      
      * guess deberta tokenizer could be nuked
      
      * fixup
      
      * fixup did some bad things
      
      * nuke it if it breaks
      
      * remove prints and pretrain fast from slow with new format.
      
      * fixups
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * fiou
      
      * nit
      
      * by default specials should not be normalized?
      
      * update
      
      * remove breakpoint
      
      * updates
      
      * a lot of updates
      
      * fixup
      
      * fixes revert some changes to match fast
      
      * small nits
      
      * that makes it cleaner
      
      * fix camembert accordingly
      
      * update
      
      * some lest breaking changes
      
      * update
      
      * fixup
      
      * fix byt5 and whisper mostly
      
      * some more fixes, canine's byte vocab
      
      * fix gpt2
      
      * fix most of the perceiver tests (4 left)
      
      * fix layoutlmv3
      
      * fixup
      
      * fix copies for gpt2 style
      
      * make sure to only warn once
      
      * fix perceiver and gpt2 tests
      
      * some more backward compatibility: also read special tokens map because some people use it
      
      * fixup
      
      * add else when reading
      
      * nits
      
      * fresh updates
      
      * fix copies
      
      * will this make everything faster?
      
      * fixes
      
      * more fixes
      
      * update
      
      * more fixes
      
      * fixup
      
      * is the source of truth right?
      
      * sorry camembert for the troubles
      
      * current updates
      
      * fixup
      
      * update led
      
      * update
      
      * fix regression
      
      * fix single word
      
      * more model specific fixes
      
      * fix t5 tests
      
      * fixup
      
      * more comments
      
      * update
      
      * fix nllb
      
      * rstrip removed
      
      * small fixes
      
      * better handle additional_special_tokens and vocab sizes
      
      * fixing
      
      * styling
      
      * fix 4 / 21
      
      * fixup
      
      * fix nllb's tests
      
      * some fixes
      
      * fix t5
      
      * fixes
      
      * style
      
      * fix canine tests
      
      * damn this is nice
      
      * nits
      
      * m2m100 nit
      
      * fixups
      
      * fixes!
      
      * fixup
      
      * stash
      
      * fix merge
      
      * revert bad change
      
      * fixup
      
      * correct order for code Llama
      
      * fix speecht5 post merge
      
      * styling
      
      * revert source of 11 fails
      
      * small nits
      
      * all changes in one go
      
      * fnet hack
      
      * fix 2 more tests
      
      * update based on main branch of tokenizers
      
      * fixup
      
      * fix VITS issues
      
      * more fixes
      
      * fix mgp test
      
      * fix camembert issues
      
      * oups camembert still has 2 failing tests
      
      * mluke fixes
      
      * decode fixes
      
      * small nits
      
      * nits
      
      * fix llama and vits
      
      * fix camembert
      
      * small nits
      
      * more fixes when initialising a fast from a slow and etc
      
      * fix one of the last test
      
      * fix CPM tokenizer test
      
      * fixups
      
      * fix pop2piano
      
      * fixup
      
      * Change tokenizers required version
      
      * Change tokenizers required version
      
      * "tokenizers>=0.14,<0.15", don't forget smaller than
      
      * fix musicgen tests and PreTrainedTokenizerFast
      
      * fix owlvit and all
      
      * update t5
      
      * fix 800 red
      
      * fix tests
      
      * fix the fix of the fix of t5
      
      * styling
      
      * documentation nits
      
      * cache _added_tokens_encoder
      
      * fixups
      
      * Nit
      
      * fix red tests
      
      * one last nit!
      
      * make everything a lot simpler
      
      * Now it's over 😉
      
      
      
      * few small nits
      
      * Apply suggestions from code review
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * updates that work for now
      
      * tests that should not be skipped / changed and fixed next
      
      * fixup
      
      * i am ashamed
      
      * push the fix
      
      * update
      
      * fixups
      
      * nits
      
      * fix added_tokens_encoder
      
      * fix canine test
      
      * fix pegasus vocab
      
      * fix transfoXL
      
      * fixup
      
      * whisper needs to be fixed for train new
      
      * pegasus nits
      
      * more pegasus fixes
      
      * minor update
      
      * better error message in failed test
      
      * fix whisper failing test
      
      * fix whisper failing test
      
      * fix pegasus
      
      * fixup
      
      * fix **** pegasus
      
      * reset things
      
      * remove another file
      
      * attempts to fix the strange custom encoder and offset
      
      * nits here and there
      
      * update
      
      * fixup
      
      * nit
      
      * fix the whisper test
      
      * nits nits
      
      * Apply suggestions from code review
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * updates based on review
      
      * some small update to potentially remove
      
      * nits
      
      * import lru cache
      
      * Update src/transformers/tokenization_utils_base.py
      Co-authored-by: Lysandre Debut <hi@lysand.re>
      
      * move warning to `from_pretrained`
      
      * update tests results now that the special tokens are always added
      
      ---------
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      Co-authored-by: Lysandre Debut <hi@lysand.re>
  9. 31 Aug, 2023 1 commit
  10. 22 Aug, 2023 1 commit
  11. 07 Aug, 2023 1 commit
  12. 03 Aug, 2023 1 commit
  13. 13 Jul, 2023 1 commit
  14. 03 Jul, 2023 1 commit
  15. 30 Jun, 2023 2 commits
  16. 28 Jun, 2023 2 commits
  17. 23 Jun, 2023 1 commit
    • Improved keras imports (#24448) · 8e164c54
      Matt authored
      * An end to accursed version-specific imports
      
      * No more K.is_keras_tensor() either
      
      * Update dependency tables
      
      * Use a cleaner call context function getter
      
      * Add a cap to <2.14
      
      * Add cap to examples requirements too
  18. 14 Jun, 2023 1 commit
  19. 08 Jun, 2023 1 commit
  20. 07 Jun, 2023 1 commit
  21. 01 Jun, 2023 1 commit
  22. 31 May, 2023 2 commits
    • Upgrade safetensors version (#23911) · 55451c66
      Zachary Mueller authored
      * Upgrade safetensors
      
      * Second table
    • Unpin numba (#23162) · 8f915c45
      Sanchit Gandhi authored
      * fix for ragged list
      
      * unpin numba
      
      * make style
      
      * np.object -> object
      
      * propagate changes to tokenizer as well
      
      * np.long -> "long"
      
      * revert tokenization changes
      
      * check with tokenization changes
      
      * list/tuple logic
      
      * catch numpy
      
      * catch else case
      
      * clean up
      
      * up
      
      * better check
      
      * trigger ci
      
      * Empty commit to trigger CI
  23. 12 May, 2023 1 commit
  24. 11 May, 2023 1 commit
  25. 10 May, 2023 1 commit
  26. 08 May, 2023 1 commit
  27. 04 May, 2023 1 commit
  28. 03 May, 2023 1 commit
  29. 20 Apr, 2023 1 commit
  30. 18 Apr, 2023 1 commit
  31. 29 Mar, 2023 1 commit
  32. 24 Mar, 2023 2 commits
  33. 22 Mar, 2023 1 commit
  34. 21 Mar, 2023 2 commits
  35. 17 Mar, 2023 1 commit
    • Fix natten (#22229) · 3028b20a
      Ali Hassani authored
      * Add kernel size to NATTEN's QK arguments.
      
      The new NATTEN 0.14.5 supports PyTorch 2.0, but also adds an additional
      argument to the QK operation to allow optional RPBs.
      
      This ends up failing NATTEN tests.
      
      This commit adds NATTEN back to circleci and adds the arguments to get
      it working again.
      
      * Force NATTEN >= 0.14.5
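
Several commits in this log adjust dependency bounds (Keras capped at <2.14, tokenizers held to >=0.14,<0.15, NATTEN forced to >= 0.14.5). As a rough illustration of what such a bound means, the sketch below checks a plain dotted version string against a comma-separated specifier. The helper names parse_version and satisfies are hypothetical and are not taken from the repository. The sketch handles only purely numeric versions such as 0.14.5; real projects would more likely rely on the packaging library for full version-specifier support.

```python
import operator

# Comparison operators supported by this sketch (a small subset of real specifiers).
_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "<": operator.lt,
    ">": operator.gt,
}


def parse_version(text):
    """Turn a numeric dotted version such as '0.14.5' into a tuple (0, 14, 5)."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version, spec):
    """Return True if `version` meets every clause of `spec`, e.g. '>=0.14,<0.15'."""
    installed = parse_version(version)
    for clause in spec.split(","):
        clause = clause.strip()
        # Two-character operators are tried first so '>=' is not read as '>'.
        symbol = next(s for s in (">=", "<=", "==", "<", ">") if clause.startswith(s))
        bound = parse_version(clause[len(symbol):])
        # Pad with zeros so that '0.14' and '0.14.0' compare as equal.
        width = max(len(installed), len(bound))
        left = installed + (0,) * (width - len(installed))
        right = bound + (0,) * (width - len(bound))
        if not _OPS[symbol](left, right):
            return False
    return True


if __name__ == "__main__":
    print(satisfies("0.14.1", ">=0.14,<0.15"))  # tokenizers-style range
    print(satisfies("2.14.0", "<2.14"))         # Keras-style cap
    print(satisfies("0.14.5", ">=0.14.5"))      # NATTEN-style minimum
```

The upper bound in a range such as >=0.14,<0.15 is what keeps a later, possibly incompatible release from being picked up, which is the point of the "don't forget smaller than" note above.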