1. 20 Oct, 2023 1 commit
    • Pedro Cuenca's avatar
      Fix Fuyu image scaling bug (#26918) · c030fc89
      Pedro Cuenca authored
      * Fix Fuyu image scaling bug
      
      It could produce negative padding and hence inference errors for certain
      image sizes.
      
      * Fix aspect ratio scaling test
      c030fc89
  2. 18 Oct, 2023 2 commits
    • Pablo Montalvo's avatar
      Add fuyu model (#26911) · caa0ff0b
      Pablo Montalvo authored
      
      
      * initial commit
      
      * add processor, add fuyu naming
      
      * add draft processor
      
      * fix processor
      
      * remove dropout to fix loading of weights
      
      * add image processing fixes from Pedro
      
      * fix
      
      * fix processor
      
      * add basic processing fuyu test
      
      * add documentation and TODO
      
      * address comments, add tests, add doc
      
      * replace assert with torch asserts
      
      * add Mixins and fix tests
      
      * clean imports
      
      * add model tester, clean imports
      
      * fix embedding test
      
      * add updated tests from pre-release model
      
      * Processor: return input_ids used for inference
      
      * separate processing and model tests
      
      * relax test tolerance for embeddings
      
      * add test for logit comparison
      
      * make sure fuyu image processor is imported in the init
      
      * fix formattingh
      
      * more formatting issues
      
      * and more
      
      * fixups
      
      * remove some stuff
      
      * nits
      
      * update init
      
      * remove the fuyu file
      
      * Update integration test with release model
      
      * Update conversion script.
      
      The projection is not used, as confirmed by the authors.
      
      * improve geenration
      
      * Remove duplicate function
      
      * Trickle down patches to model call
      
      * processing fuyu updates
      
      * remove things
      
      * fix prepare_inputs_for_generation to fix generate()
      
      * remove model_input
      
      * update
      
      * add generation tests
      
      * nits
      
      * draft leverage automodel and autoconfig
      
      * nits
      
      * fix dtype patch
      
      * address comments, update READMEs and doc, include tests
      
      * add working processing test, remove refs to subsequences
      
      * add tests, remove Sequence classification
      
      * processing
      
      * update
      
      * update the conversion script
      
      * more processing cleanup
      
      * safe import
      
      * take out ModelTesterMixin for early release
      
      * more cl;eanup
      
      * more cleanup
      
      * more cleanup
      
      * and more
      
      * register a buffer
      
      * nits
      
      * add postprocessing of generate output
      
      * nits
      
      * updates
      
      * add one working test
      
      * fix test
      
      * make fixup works
      
      * fixup
      
      * Arthur's updates
      
      * nits
      
      * update
      
      * update
      
      * fix processor
      
      * update tests
      
      * passe more fixups
      
      * fix
      
      * nits
      
      * don't import torch
      
      * skip fuyu config for now
      
      * fixup done
      
      * fixup
      
      * update
      
      * oups
      
      * nits
      
      * Use input embeddings
      
      * no buffer
      
      * update
      
      * styling processing fuyu
      
      * fix test
      
      * update licence
      
      * protect torch import
      
      * fixup and update not doctested
      
      * kwargs should be passed
      
      * udpates
      
      * update the impofixuprts in the test
      
      * protect import
      
      * protecting imports
      
      * protect imports in type checking
      
      * add testing decorators
      
      * protect top level import structure
      
      * fix typo
      
      * fix check init
      
      * move requires_backend to functions
      
      * Imports
      
      * Protect types
      
      ---------
      Co-authored-by: default avatarPedro Cuenca <pedro@huggingface.co>
      Co-authored-by: default avatarArthurZucker <arthur.zucker@gmail.com>
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      Co-authored-by: default avatarLysandre <lysandre@huggingface.co>
      caa0ff0b
    • Arthur's avatar
      [`Tokenizer`] Fix slow and fast serialization (#26570) · ef7e9369
      Arthur authored
      * fix
      
      * last attempt
      
      * current work
      
      * fix forward compatibility
      
      * save all special tokens
      
      * current state
      
      * revert additional changes
      
      * updates
      
      * remove tokenizer.model
      
      * add a test and the fix
      
      * nit
      
      * revert one more break
      
      * fix typefield issue
      
      * quality
      
      * more tests
      
      * fix fields for FC
      
      * more nits?
      
      * new additional changes
      
      * how
      
      * some updates
      
      * simplify all
      
      * more nits
      
      * revert some things to original
      
      * nice
      
      * nits
      
      * a small hack
      
      * more nits
      
      * ahhaha
      
      * fixup
      
      * update
      
      * make test run on ci
      
      * use subtesting
      
      * update
      
      * Update .circleci/create_circleci_config.py
      
      * updates
      
      * fixup
      
      * nits
      
      * replace typo
      
      * fix the test
      
      * nits
      
      * update
      
      * None max dif pls
      
      * a partial fix
      
      * had to revert one thing
      
      * test the fast
      
      * updates
      
      * fixup
      
      * and more nits
      
      * more fixes
      
      * update
      
      * Oupsy 馃憗
      
      
      
      * nits
      
      * fix marian
      
      * on our way to heaven
      
      * Update src/transformers/models/t5/tokenization_t5.py
      Co-authored-by: default avatarLysandre Debut <hi@lysand.re>
      
      * fixup
      
      * Update src/transformers/tokenization_utils_fast.py
      Co-authored-by: default avatarLeo Tronchon <leo.tronchon@gmail.com>
      
      * Update src/transformers/tokenization_utils_base.py
      Co-authored-by: default avatarLeo Tronchon <leo.tronchon@gmail.com>
      
      * fix phobert
      
      * skip some things, test more
      
      * nits
      
      * fixup
      
      * fix deberta
      
      * update
      
      * update
      
      * more updates
      
      * skip one test
      
      * more updates
      
      * fix camembert
      
      * can't test this one
      
      * more good fixes
      
      * kind of a major update
      
      - seperate what is only done in fast in fast init and refactor
      - add_token(AddedToken(..., speicla = True)) ignores it in fast
      - better loading
      
      * fixup
      
      * more fixups
      
      * fix pegasus and mpnet
      
      * remove skipped tests
      
      * fix phoneme tokenizer if self.verbose
      
      * fix individual models
      
      * update common tests
      
      * update testing files
      
      * all over again
      
      * nits
      
      * skip test for markup lm
      
      * fixups
      
      * fix order of addition in fast by sorting the added tokens decoder
      
      * proper defaults for deberta
      
      * correct default for fnet
      
      * nits on add tokens, string initialized to special if special
      
      * skip irrelevant herbert tests
      
      * main fixes
      
      * update test added_tokens_serialization
      
      * the fix for bart like models and class instanciating
      
      * update bart
      
      * nit!
      
      * update idefix test
      
      * fix whisper!
      
      * some fixup
      
      * fixups
      
      * revert some of the wrong chanegs
      
      * fixup
      
      * fixup
      
      * skip marian
      
      * skip the correct tests
      
      * skip for tf and flax as well
      
      ---------
      Co-authored-by: default avatarLysandre Debut <hi@lysand.re>
      Co-authored-by: default avatarLeo Tronchon <leo.tronchon@gmail.com>
      ef7e9369
  3. 17 Oct, 2023 1 commit
  4. 16 Oct, 2023 2 commits
  5. 13 Oct, 2023 4 commits
    • NielsRogge's avatar
      Add OWLv2, bis (#26668) · 762af3e3
      NielsRogge authored
      * First draft
      
      * Update conversion script
      
      * Update copied from statements
      
      * Fix style
      
      * Add copied from to config
      
      * Add copied from to processor
      
      * Run make fixup
      
      * Add docstring
      
      * Update docstrings
      
      * Add method
      
      * Improve docstrings
      
      * Fix docstrings
      
      * Improve docstrings
      
      * Remove onnx
      
      * Add flag
      
      * Address comments
      
      * Add copied from to model tests
      
      * Add flag to conversion script
      
      * Add code snippet
      
      * Address more comments
      
      * Address comment
      
      * Improve conversion script
      
      * More improvements
      
      * Add expected objectness logits
      
      * Skip test
      
      * Improve conversion script
      
      * Extend conversion script
      
      * Convert large checkpoint
      
      * Fix doc tests
      
      * Convert all checkpoints, update integration tests
      
      * Add checkpoint_path arg
      
      * Fix repo_id
      762af3e3
    • Matt's avatar
      Fix Falcon generation test (#26770) · bdb391e9
      Matt authored
      bdb391e9
    • Matt's avatar
      Disable default system prompt for LLaMA (#26765) · c9785d95
      Matt authored
      * Disable default system prompt for LLaMA
      
      * Update test to not expect default prompt
      c9785d95
    • Yih-Dar's avatar
  6. 12 Oct, 2023 3 commits
  7. 11 Oct, 2023 2 commits
  8. 06 Oct, 2023 3 commits
  9. 05 Oct, 2023 3 commits
  10. 04 Oct, 2023 1 commit
  11. 03 Oct, 2023 5 commits
  12. 02 Oct, 2023 3 commits
    • Arthur's avatar
      Code-llama-nit (#26300) · bab33319
      Arthur authored
      * fix encoding when the fill token is None
      
      * add tests and edge cases
      
      * fiuxp
      
      * Update tests/models/code_llama/test_tokenization_code_llama.py
      bab33319
    • Arthur's avatar
      Fix model integration ci (#26322) · 63864e05
      Arthur authored
      * fix wav2vec2
      
      * nit
      
      * stash
      
      * one more file to update
      
      * fix byt5
      
      * vocab size is 256, don't change that!
      
      * use other revision
      
      * test persimon in smaller size
      
      * style
      
      * tests
      
      * nits
      
      * update add tokens from pretrained
      
      * test tokenization
      
      * nits
      
      * potential fnet fix?
      
      * more nits
      
      * nits
      
      * correct test
      
      * assert close
      
      * udpate
      
      * ouch
      
      * fix it
      
      * some more nits
      
      * FINALLU
      
      * use `adept` checkpoints
      
      * more adept checkpoints
      
      * that was invlved!
      63864e05
    • Lysandre Debut's avatar
      Revert falcon exception (#26472) · 67239f73
      Lysandre Debut authored
      * Revert "Falcon: fix revision propagation (#26006)"
      
      This reverts commit 118c676ef3124423e5d062b665f05cde55bc9a90.
      
      * Revert "Put Falcon back (#25960)"
      
      This reverts commit 22a69f1d.
      67239f73
  13. 29 Sep, 2023 1 commit
  14. 28 Sep, 2023 1 commit
  15. 27 Sep, 2023 2 commits
  16. 26 Sep, 2023 2 commits
    • sanjeevk-os's avatar
    • NielsRogge's avatar
      Add Nougat (#25942) · ace74d16
      NielsRogge authored
      
      
      * Add conversion script
      
      * Add NougatImageProcessor
      
      * Add crop margin
      
      * More improvements
      
      * Add docs, READMEs
      
      * Remove print statements
      
      * Include model_max_length
      
      * Add NougatTokenizerFast
      
      * Fix imports
      
      * Improve postprocessing
      
      * Improve image processor
      
      * Fix image processor
      
      * Improve normalize method
      
      * More improvements
      
      * More improvements
      
      * Add processor, improve docs
      
      * Simplify fast tokenizer
      
      * Remove test file
      
      * Fix docstrings
      
      * Use NougatProcessor in conversion script
      
      * Add is_levensthein_available
      
      * Add tokenizer tests
      
      * More improvements
      
      * Use numpy instead of opencv
      
      * Add is_cv2_available
      
      * Fix cv2_available
      
      * Add is_nltk_available
      
      * Add image processor tests, improve crop_margin
      
      * Add integration tests
      
      * Improve integration test
      
      * Use do_rescale instead of hacks, thanks Amy
      
      * Remove random_padding
      
      * Address comments
      
      * Address more comments
      
      * Add import
      
      * Address more comments
      
      * Address more comments
      
      * Address comment
      
      * Address comment
      
      * Set max_model_input_sizes
      
      * Add tests
      
      * Add requires_backends
      
      * Add Nougat to exotic tests
      
      * Use to_pil_image
      
      * Address comment regarding nltk
      
      * Add NLTK
      
      * Improve variable names, integration test
      
      * Add test
      
      * refactor, document, and test regexes
      
      * remove named capture groups, add comments
      
      * format
      
      * add non-markdown fixed tokenization
      
      * format
      
      * correct flakyness of args parse
      
      * add regex comments
      
      * test functionalities for crop_image, align long axis and expected output
      
      * add regex tests
      
      * remove cv2 dependency
      
      * test crop_margin equality between cv2 and python
      
      * refactor table regexes to markdown
      
      add newline
      
      * change print to log, improve doc
      
      * fix high count tables correction
      
      * address PR comments: naming, linting, asserts
      
      * Address comments
      
      * Add copied from
      
      * Update conversion script
      
      * Update conversion script to convert both small and base versions
      
      * Add inference example
      
      * Add more info
      
      * Fix style
      
      * Add require annotators to test
      
      * Define all keyword arguments explicitly
      
      * Move cv2 annotator
      
      * Add tokenizer init method
      
      * Transfer checkpoints
      
      * Add reference to Donut
      
      * Address comments
      
      * Skip test
      
      * Remove cv2 method
      
      * Add copied from statements
      
      * Use cached_property
      
      * Fix docstring
      
      * Add file to not doctested
      
      ---------
      Co-authored-by: default avatarPablo Montalvo <pablo.montalvo.leroux@gmail.com>
      ace74d16
  17. 25 Sep, 2023 1 commit
  18. 22 Sep, 2023 1 commit
  19. 21 Sep, 2023 2 commits