1. 19 May, 2023 1 commit
    • TF port of the Segment Anything Model (SAM) (#22970) · 1c460a52
      Matt authored
      
      
      * First commit
      
      * Add auto-translation with GPT-4
      
      * make fixup
      
      * Add a functional layernorm for TF
      
      * Add all the auxiliary imports etc.
      
      * Add the extra processor and tests
      
      * rebase to main
      
      * Add all the needed fixes to the GPT code
      
      * make fixup
      
      * Make convolutions channels-last so they run on CPU
      
      * make fixup
      
      * Fix final issues
      
      * Fix other models affected by test change
      
      * Clarify comment on the sparse_prompt_embeddings check
      
      * Refactor functional_layernorm, use shape_list in place of .shape in some places
      
      * Remove deprecated torch-alike code
      
      * Update tests/models/sam/test_modeling_tf_sam.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update tests/models/sam/test_modeling_tf_sam.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Refactor processor with common methods and separated private methods
      
      * make fixup
      
      * Quietly delete the file that didn't do anything (sorry Sylvain)
      
      * Refactor the processor tests into one file
      
      * make fixup
      
      * Clean up some unnecessary indirection
      
      * Fix TF mask postprocessing
      
      * Add more processor equivalence tests
      
      * Refactor generate_crop_boxes to use framework-neutral np code
      
      * Make the serving output correctly conditional
      
      * Fix error message line length
      
      * Use dict keys rather than indices internally in both TF and PT SAM call/forward
      
      * Return dicts internally in the call/forward methods
      
      * Revert changes to common tests and just override check_pt_tf_outputs
      
      * Revert changes to other model tests
      
      * Clarify comments for functional layernorm
      
      * Add missing transpose from PT code
      
      * Remove unused "Copied from" comment in PT code
      
      * Remove overrides for tests that don't exist in TF
      
      * Fix transpose and update tests for PT and TF to check pred_masks
      
      * Add training flag
      
      * Update tests to use TF checkpoints
      
      * Update index.mdx
      
      * Add missing cross-test decorator
      
      * Remove optional extra asterisks
      
      * Revert return_dict changes in PT code
      
      * Update src/transformers/models/sam/modeling_tf_sam.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Remove None return annotations on init methods
      
      * Update tests/models/sam/test_processor_sam.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Fix input_boxes shapes
      
      * make fixup
      
      ---------
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
  2. 16 May, 2023 1 commit
  3. 15 May, 2023 1 commit
  4. 12 May, 2023 1 commit
  5. 09 May, 2023 2 commits
  6. 07 May, 2023 1 commit
  7. 05 May, 2023 2 commits
  8. 04 May, 2023 3 commits
  9. 03 May, 2023 3 commits
  10. 02 May, 2023 1 commit
  11. 01 May, 2023 1 commit
  12. 28 Apr, 2023 1 commit
  13. 27 Apr, 2023 2 commits
  14. 26 Apr, 2023 1 commit
    • Add TensorFlow Wav2Vec2 for sequence classification (#22073) · 20ac86c6
      Ritik Nandwal authored
      * Add initial changes for TF wav2vec2 for sequence classification
      
      * Add suggested changes
      
      * Add serving and serving output methods
      
      * Add serving_output implementation and fix layer_weights
      
      * Add fixes
      
      * Fixed test cases
      
      * Fixing test and adding suggested changes
  15. 25 Apr, 2023 1 commit
  16. 24 Apr, 2023 1 commit
  17. 23 Apr, 2023 1 commit
  18. 21 Apr, 2023 2 commits
  19. 20 Apr, 2023 3 commits
  20. 19 Apr, 2023 1 commit
    • Add Segment Anything Model (SAM) (#22654) · 474bf508
      Arthur authored
      
      
      * initial commit
      
      * keys match
      
      * update, fix conversion
      
      * fixes, inference working
      
      * fix
      
      * more fixes
      
      * more fixes
      
      * clean up
      
      * more clean up
      
      * fix copies and add convnext copied layer norm
      
      * stash
      
      * pretty big update
      
      * cleaning
      
      * more cleaning
      
      * fixup stuffs
      
      * fix copies
      
      * fix init
      
      * update test removing tokenizer
      
      * nits
      
      * add pretrained
      
      * more nits
      
      * remove tracking of pipeline
      
      * few fixes
      
      * update sam and conversion script
      
      * fix mask decoder and prompt encoder conversion
      
      * fixes
      
      * small update
      
      * fix order
      
      * fix
      
      * fix image embeddings
      
      * nits
      
      * few fixes
      
      * fix logits
      
      * clean up
      
      * fixes boxes inference
      
      * v1 AMG
      
      * clean up
      
      * some clean up
      
      * multi points support
      
      * amg working
      
      * fixup
      
      * clean up
      
      * readme
      
      * update toctree
      
      * fix type hint
      
      * multiple fixes
      
      * fixup
      
      * fixes
      
      * updates
      
      * updates
      
      * more tests
      
      * few fixes
      
      * change to `SamForMaskGeneration`
      
      * doc
      
      * fixup
      
      * fix more tests
      
      * multiple fixes
      
      * fix CI tests
      
      * refactor processor
      
      * renamings
      
      * draft the pipeline
      
      * refactor
      
      * fix tests
      
      * fix test
      
      * few cleanings
      
      * fix test
      
      * edit pipeline to support chunking
      
      * update
      
      * add slow tests
      
      * fix nit
      
      * fixup
      
      * fix nit
      
      * current chunk pipeline
      
      * cast boxes in fp32
      
      * nit
      
      * current updates
      
      * pipeline works
      
      * fixup
      
      * clean up config
      
      * fix slow tests
      
      * fix slow tests
      
      * clean up
      
      * update doc and pipeline
      
      * adds more slow tests
      
      * fix slow tests
      
      * cleaning
      
      * tests pass
      
      * add docstring
      
      * fix copies
      
      * clean up
      
      * support batch of images
      
      * style
      
      * dummy is needed, add tests
      
      * fix slow tests
      
      * fix CI
      
      * update
      
      * adds more tests
      
      * fixes
      
      * fixes
      
      * fixup
      
      * fixes
      
      * few fixes
      
      * filter
      
      * few fixes
      
      * some refactor
      
      * finishing touches
      
      * fix
      
      * style
      
      * remove pipeline files
      
      * fixes nits
      
      * revert pipeline changes
      
      * fix test
      
      * fixup
      
      * remove automodel for automatic mask generation
      
      * fix failing torch tests
      
      * update mdx
      
      * revert removal of `MODEL_FOR_AUTOMATIC_MASK_GENERATION_MAPPING`
      
      * update sam config based on review
      Co-authored-by: amyeroberts <aeroberts4444@gmail.com>
      Co-authored-by: sgugger <sylvain.gugger@gmail.com>
      
      * update low_resolution_masks -> pred_masks
      init ln with layer_norm_eps
      add_decomposed_rel_pos doc
      forward doc of SamForMaskGeneration
      
      * update processor docstring
      
      * remove image processor import empty
      
      * update for testing
      
      * output vision hidden states + clean recomm
      also test all iou values
      
      * fixup
      
      * fixup
      
      * remove unused
      
      * Update src/transformers/models/sam/modeling_sam.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/sam/image_processing_sam.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * nits
      
      * fix
      
      * fix CI tests and slow tests
      
      * replace with Amy's processor
      
      * clearer docstring
      
      * add `SamVisionNeck`
      
      * refactor - all CI tests should pass
      
      * fix broken import on Gcolab
      
      * few fixes here and there
      
      * fix another bug
      
      * fix more bugs
      
      * update and merge
      
      * correct ckpt
      
      * address comments
      
      * add tips
      
      * revert
      
      * fix docstring
      
      * replace with `SamModel`
      
      * make fixup
      
      * add support for batched images and batched points
      
      * make fixup this time, really
      
      * make fixup again and again
      
      * few fixes here and there, this should be the touche finale
      
      * Update docs/source/en/model_doc/sam.mdx
      
      * fixup
      
      * correct checkpoints
      
      * correct name
      
      * rm unneeded file
      
      * add notebook
      
      ---------
      Co-authored-by: younesbelkada <younesbelkada@gmail.com>
      Co-authored-by: amyeroberts <aeroberts4444@gmail.com>
      Co-authored-by: sgugger <sylvain.gugger@gmail.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
  21. 13 Apr, 2023 2 commits
  22. 12 Apr, 2023 2 commits
    • add model resources for CPMAnt (new) (#20906) · 523ca4e0
      pioliverse authored
      
      
      * resolve conflicts
      
      * rebase and make style
      
      * test
      
      * test
      
      * test
      
      * rebase and make style
      
      * rebase and make style
      
      * tests
      
      * tests
      
      * rewrite some functions
      
      * rebase and make style
      
      * fix load_tf_weights_in_cpmant
      
      * reformat some unrelated files
      
      * upgrade quality
      
      * fix some bugs & docstring
      
      * add models and tests
      
      * solve conflicts
      
      * resolve conflicts
      
      * resolve conflicts
      
      * resolve conflicts
      
      * resolve conflicts
      
      * tests
      
      * resolve conflicts
      
      * resolve conflicts
      
      * fix load_tf_weights_in_cpmant
      
      * reformat some unrelated files
      
      * upgrade quality
      
      * fix some bugs & docstring
      
      * save resolution
      
      * make style
      
      * delete redefinition code
      
      * reformat function
      
      * reformat
      
      * resolve conflicts
      
      * resolve conflicts
      
      * resolve conflicts
      
      * resolve conflicts
      
      * resolve conflicts
      
      * tests
      
      * resolve conflicts
      
      * resolve conflicts
      
      * fix load_tf_weights_in_cpmant
      
      * reformat some unrelated files
      
      * upgrade quality
      
      * resolve conflicts
      
      * resolve conflicts
      
      * resolve conflicts
      
      * resolve conflicts
      
      * resolve conflicts
      
      * fix load_tf_weights_in_cpmant
      
      * reformat some unrelated files
      
      * upgrade quality
      
      * resolve conflicts
      
      * make style
      
      * fix bugs and refactor
      
      * modify docstrings and make style
      
      * unify import format in __init__.py
      
      * fix import-altclp bug
      
      * fix copies to update index.md
      
      * fix unused config parameters
      
      * fix unused config parameters
      
      * fix unused config parameters
      
      * update README_ja.md
      
      * dummy commit for unit test
      
      * fix attention mask
      
      * add CPMAntTokenizer&-Fast to auto-mapping
      
      * drop redundant changes in README_ko
      
      * fix defaults in docstring
      
      * fix use_cache and some docstring
      
      * add missing args in tokenizer
      
      * modify tester inheritance
      
      * add is_jieba_available
      
      * fix some bugs
      
      * make style and fix-copies
      
      * add doctests
      
      * skip integration tests
      
      * add is_jieba_available
      
      * fix bugs in common tests
      
      * adjust docstrings and make style
      
      * add argument docstring
      
      * adjust code to some specifications
      
      * make style and fix-copies
      
      * add fast tokenization test
      
      * dummy commit for unit test
      
      * dummy commit for unit test
      
      * dummy commit for unit test
      
      * normalize some comments and names
      
      * Bert->CPMAnt
      
      * camel names and drop redundant codes
      
      * make style and fix-copies
      
      * add CpmTokenizerFast _import_structure
      
      * drop cpmanttokenizerfast in model_doc
      
      * fix some problems
      
      * fix CPMAnt tokenization for common test
      
      * make style and fixup
      
      * fix copies and fixup
      
      * fix bugs in tokenization test
      
      * dummy commit for connection failure in unittest
      
      * fix copies
      
      * drop trailing comma
      
      * fix decorator in tests
      
      * dummy commit for connection failure in unittest
      
      ---------
      Co-authored-by: Gong Baitao <gongbaitao11@gmail.com>
    • remove wrong doc in readme (#22723) · b76e6ebd
      Arthur authored
  23. 10 Apr, 2023 2 commits
    • add GPTNeoXForSequenceClassification (#22671) · 6daa9cb5
      Sugawara authored
      * add GPTNeoXForSequenceClassification
      
      * move the labels to logits.device (ref: #22561)
      
      * fix
    • Add GPTBigCode model (Optimized GPT2 with MQA from Santacoder & BigCode) (#22575) · e0921c6b
      Joel Lamy-Poirier authored
      
      
      * Add model with cli tool
      
      * Remove unwanted stuff
      
      * Add new code
      
      * Remove inference runner
      
      * Style
      
      * Fix checks
      
      * Test updates
      
      * make fixup
      
      * fix docs
      
      * fix doc
      
      * fix test
      
      * hopefully fix pipeline tests
      
      * refactor
      
      * fix CIs
      
      * add comment
      
      * rename to `GPTBigCodeForCausalLM`
      
      * correct readme
      
      * make fixup + docs
      
      * make fixup
      
      * fixes
      
      * fixes
      
      * Remove pruning
      
      * Remove import
      
      * Doc updates
      
      * More pruning removal
      
      * Combine copies
      
      * Single MQA implementation, remove kv cache pre-allocation and padding
      
      * Update doc
      
      * Revert refactor to match gpt2 style
      
      * Merge back key and value caches, fix some type hints
      
      * Update doc
      
      * Fix position ids with padding (PR 21080)
      
      * Add conversion script temporarily
      
      * Update conversion script
      
      * Remove checkpoint conversion
      
      * New model
      
      * Fix MQA test
      
      * Fix copies
      
      * try fix tests
      
      * FIX TEST!!
      
      * remove `DoubleHeadsModel`
      
      * add MQA tests
      
      * add slow tests
      
      * clean up
      
      * add CPU checker
      
      * final fixes
      
      * fixes
      
      - fix GPU issue
      - fixed slow tests
      - skip disk offload
      
      * fix final issue
      
      * Simplify and comment baddbmm fix
      
      * Remove unnecessary code
      
      * Transpose tweaks
      
      * Use beta=1 on cpu, improve tests
      
      ---------
      Co-authored-by: younesbelkada <younesbelkada@gmail.com>
  24. 06 Apr, 2023 1 commit
    • Adding Llama FastTokenizer support. (#22264) · 1670be4b
      Nicolas Patry authored
      * Adding Llama FastTokenizer support.
      
      - Requires a tokenizers version that includes https://github.com/huggingface/tokenizers/pull/1183
      - Only support byte_fallback for llama, raise otherwise (safety net).
      - Lots of open questions about special tokens
      
      How to test:
      
      ```python
      from tokenizers import Tokenizer
      from transformers import AutoTokenizer
      from transformers.convert_slow_tokenizer import convert_slow_tokenizer

      # Reference tokenizer; convert_slow_tokenizer below expects the slow implementation.
      tokenizer = AutoTokenizer.from_pretrained("huggingface/llama-7b")

      # Set to True to reuse a previously saved "tok.json" instead of converting again.
      LOAD_FROM_FILE = False

      if LOAD_FROM_FILE:
          new_tokenizer = Tokenizer.from_file("tok.json")
      else:
          new_tokenizer = convert_slow_tokenizer(tokenizer)
          new_tokenizer.save("tok.json")

      strings = [
          "This is a test",
          "生活的真谛是",
          "生活的真谛是[MASK]。",
          # XXX: This one is problematic because of special tokens
          # "<s> Something something",
      ]

      for string in strings:
          # Both tokenizers must produce the same ids for the same input.
          encoded = tokenizer(string)["input_ids"]
          encoded2 = new_tokenizer.encode(string).ids

          assert encoded == encoded2, f"{encoded} != {encoded2}"

          # Decoding must round-trip to the same text (modulo surrounding whitespace).
          decoded = tokenizer.decode(encoded)
          decoded2 = new_tokenizer.decode(encoded2)

          assert decoded.strip() == decoded2, f"{repr(decoded)} != {repr(decoded2)}"
      ```
      
      The converter + some test script.
      
      The test script.
      
      Tmp save.
      
      Adding Fast tokenizer + tests.
      
      Adding the tokenization tests.
      
      Correct combination.
      
      Small fix.
      
      Fixing tests.
      
      Fixing with latest update.
      
      Rebased.
      
      fix copies + normalized added tokens  + copies.
      
      Adding doc.
      
      TMP.
      
      Doc + split files.
      
      Doc.
      
      Versions + try import.
      
      Fix Camembert + warnings -> Error.
      
      Fix by ArthurZucker.
      
      Not a decorator.
      
      * Fixing comments.
      
      * Adding more to docstring.
      
      * Doc rewriting.
  25. 05 Apr, 2023 1 commit
  26. 04 Apr, 2023 2 commits