1. 01 Feb, 2024 2 commits
    • Matt's avatar
      Add tip on setting tokenizer attributes (#28764) · 7bc6d763
      Matt authored
      * Add tip on setting tokenizer attributes
      
      * Grammar
      
      * Remove the bit that was causing doc builds to fail
      7bc6d763
    • JB (Don)'s avatar
      Adding [T5/MT5/UMT5]ForTokenClassification (#28443) · 0d26abdd
      JB (Don) authored
      * Adding [T5/MT5/UMT5]ForTokenClassification
      
      * Add auto mappings for T5ForTokenClassification and variants
      
      * Adding ForTokenClassification to the list of models
      
      * Adding attention_mask param to the T5ForTokenClassification test
      
      * Remove outdated comment in test
      
      * Adding EncoderOnly and Token Classification tests for MT5 and UMT5
      
      * Fix typo in umt5 string
      
      * Add tests for all the existing MT5 models
      
      * Fix wrong comment in dependency_versions_table
      
      * Reverting change to common test for _keys_to_ignore_on_load_missing
      
      The test is correctly picking up redundant keys in _keys_to_ignore_on_load_missing.
      
      * Removing _keys_to_ignore_on_missing from MT5 since the key is not used in the model
      
      * Add fix-copies to MT5ModelTest
      0d26abdd
  2. 31 Jan, 2024 1 commit
    • Kian Sierra McGettigan's avatar
      Flax mistral (#26943) · f7076cd3
      Kian Sierra McGettigan authored
      * direct copy from llama work
      
      * mistral modules forward pass working
      
      * flax mistral forward pass with sliding window
      
      * added tests
      
      * added layer collection approach
      
      * Revert "added layer collection approach"
      
      This reverts commit 0e2905bf2236ec323163fc1a9f0c016b21aa8b8f.
      
      * Revert "Revert "added layer collection approach""
      
      This reverts commit fb17b6187ac5d16da7c461e1130514dc3d137a43.
      
      * fixed attention outputs
      
      * added mistral to init and auto
      
      * fixed import name
      
      * fixed layernorm weight dtype
      
      * freeze initialized weights
      
      * make sure conversion consideres bfloat16
      
      * added backend
      
      * added docstrings
      
      * added cache
      
      * fixed sliding window causal mask
      
      * passes cache tests
      
      * passed all tests
      
      * applied make style
      
      * removed commented out code
      
      * applied fix-copies ignored other model changes
      
      * applied make fix-copies
      
      * removed unused functions
      
      * passed generation integration test
      
      * slow tests pass
      
      * fixed slow tests
      
      * changed default dtype from jax.numpy.float32 to float32 for docstring check
      
      * skip cache test  for FlaxMistralForSequenceClassification since if pad_token_id in input_ids it doesn't score previous input_ids
      
      * updated checkpoint since from_pt not included
      
      * applied black style
      
      * removed unused args
      
      * Applied styling and fixup
      
      * changed checkpoint for doc back
      
      * fixed rf after adding it to hf hub
      
      * Add dummy ckpt
      
      * applied styling
      
      * added tokenizer to new ckpt
      
      * fixed slice format
      
      * fix init and slice
      
      * changed ref for placeholder TODO
      
      * added copies from Llama
      
      * applied styling
      
      * applied fix-copies
      
      * fixed docs
      
      * update weight dtype reconversion for sharded weights
      
      * removed Nullable input ids
      
      * Removed unnecessary output attentions in Module
      
      * added embedding weight initialziation
      
      * removed unused past_key_values
      
      * fixed deterministic
      
      * Fixed RMS Norm and added copied from
      
      * removed input_embeds
      
      * applied make style
      
      * removed nullable input ids from sequence classification model
      
      * added copied from GPTJ
      
      * added copied from Llama on FlaxMistralDecoderLayer
      
      * added copied from to FlaxMistralPreTrainedModel methods
      
      * fix test deprecation warning
      
      * freeze gpt neox random_params and fix copies
      
      * applied make style
      
      * fixed doc issue
      
      * skipped docstring test to allign # copied from
      
      * applied make style
      
      * removed FlaxMistralForSequenceClassification
      
      * removed unused padding_idx
      
      * removed more sequence classification
      
      * removed sequence classification
      
      * applied styling and consistency
      
      * added copied from in tests
      
      * removed sequence classification test logic
      
      * applied styling
      
      * applied make style
      
      * removed freeze and fixed copies
      
      * undo test change
      
      * changed repeat_kv to tile
      
      * fixed to key value groups
      
      * updated copyright year
      
      * split casual_mask
      
      * empty to rerun failed pt_flax_equivalence test FlaxWav2Vec2ModelTest
      
      * went back to 2023 for tests_pr_documentation_tests
      
      * went back to 2024
      
      * changed tile to repeat
      
      * applied make style
      
      * empty for retry on Wav2Vec2
      f7076cd3
  3. 30 Jan, 2024 2 commits
  4. 29 Jan, 2024 3 commits
  5. 26 Jan, 2024 2 commits
  6. 25 Jan, 2024 4 commits
    • Peter Götz's avatar
      [`docs`] Improve visualization for vertical parallelism (#28583) · 28751958
      Peter Götz authored
      The documentation says "We refer to this Model parallelism as “Vertical” because of how models are typically visualized.", but then visualizes the model horizontally. This change visualizes the model indeed vertically.
      28751958
    • Yusuf's avatar
      Update question_answering.md (#28694) · 24f1a00e
      Yusuf authored
      fix typo:
      
      from:
      
       "model = TFAutoModelForQuestionAnswering("distilbert-base-uncased")"
      
      to:
      model = TFAutoModelForQuestionAnswering.from_pretrained("distilbert-base-uncased")
      24f1a00e
    • Merve Noyan's avatar
      Improve Backbone API docs (#28666) · 20000956
      Merve Noyan authored
      Update backbones.md
      20000956
    • NielsRogge's avatar
      Add Depth Anything (#28654) · 963db81a
      NielsRogge authored
      * First draft
      
      * More improvements
      
      * More improvements
      
      * More improvements
      
      * More improvements
      
      * Add docs
      
      * Remove file
      
      * Add copied from
      
      * Address comments
      
      * Address comments
      
      * Address comments
      
      * Fix style
      
      * Update docs
      
      * Convert all checkpoints, add integration test
      
      * Rename checkpoints
      
      * Add pretrained backbone attributes
      
      * Fix default config
      
      * Address comment
      
      * Add figure to docs
      
      * Fix bug thanks to @xenova
      
      * Update conversion script
      
      * Fix integration test
      963db81a
  7. 24 Jan, 2024 3 commits
  8. 22 Jan, 2024 2 commits
  9. 19 Jan, 2024 1 commit
  10. 18 Jan, 2024 1 commit
    • Yoach Lacombe's avatar
      Add new meta w2v2-conformer BERT-like model (#28165) · d2cdefb9
      Yoach Lacombe authored
      
      
      * first commit
      
      * correct default value non causal
      
      * update config and modeling code
      
      * update converting checkpoint
      
      * clean modeling and fix tests
      
      * make style
      
      * add new config parameters to docstring
      
      * fix copied from statements
      
      * Apply suggestions from code review
      Co-authored-by: default avatarSanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * make position_embeddings_type docstrings clearer
      
      * clean converting script
      
      * remove function not used
      
      * clean modeling file
      
      * apply suggestion for test file + add convert script to not_doctested
      
      * modify tests according to review - cleaner logic and more tests
      
      * Apply nit suggestions from code review
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * add checker of valid position embeddings type
      
      * instantiate new layer norm layer with the right eps
      
      * fix freeze_feature_encoder since it can be None in some cases
      
      * add test same output in convert script
      
      * restore wav2vec2conformer and add new model
      
      * create processor and FE + clean
      
      * add new model code
      
      * fix convert script and set default config parameters
      
      * correct model id paths
      
      * make style
      
      * make fix-copies and cleaning files
      
      * fix copied from statements
      
      * complete .md and fixe copies
      
      * clean convert script argument defaults
      
      * fix config parameters docstrings
      
      * fix config docstring
      
      * add copied from and enrich FE tests
      
      * fix copied from and repo-consistency
      
      * add autotokenizer
      
      * make test input length shorter and change docstring code
      
      * fix docstrings and copied from
      
      * add add_adapter to ASR training example
      
      * make testing of adapters more robust
      
      * adapt to multi adapter layers
      
      * refactor input_values->input_features and remove w2v2-bert feature extractor
      
      * remove pretraining model
      
      * remove depreciated features and useless lines
      
      * add copied from and ignore statements to modeling tests
      
      * remove pretraining model #2
      
      * change import in convert script
      
      * change default in convert script
      
      * update readme and remove useless line
      
      * Update tests/models/wav2vec2_bert/test_processor_wav2vec2_bert.py
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * refactor BERT to Bert for consistency
      
      * remove useless ignore copy statement
      
      * add persistent to buffer in rotary
      
      * add eps in LayerNorm init and remove copied from
      
      * add adapter activation parameters and add copied from statements
      
      * Fix copied statements and add unitest.skip reasons
      
      * add copied statement in test_processor
      
      * refactor processor
      
      * make style
      
      * replace numpy random by torch rand
      
      * remove expected output CTC
      
      * improve converting script with processor class
      
      * Apply suggestions from code review
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * remove gumbel class
      
      * remove tests related to previously deleted class
      
      * Update src/transformers/models/wav2vec2_bert/configuration_wav2vec2_bert.py
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * correct typos
      
      * remove uused parameters
      
      * update processor to takes both text and audio
      
      * update checkpoints
      
      * update expected output and add ctc expected output
      
      * add label_attention_mask
      
      * replace pt with np in processor tests
      
      * fix typo
      
      * revert to behaviour with labels_attention_mask
      
      ---------
      Co-authored-by: default avatarSanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      d2cdefb9
  11. 17 Jan, 2024 2 commits
    • Junyang Lin's avatar
      Add qwen2 (#28436) · d6ffe74d
      Junyang Lin authored
      
      
      * add config, modeling, and tokenization
      
      * add auto and init
      
      * update readme
      
      * update readme
      
      * update team name
      
      * fixup
      
      * fixup
      
      * update config
      
      * update code style
      
      * update for fixup
      
      * update for fixup
      
      * update for fixup
      
      * update for testing
      
      * update for testing
      
      * fix bug for config and tokenization
      
      * fix bug for bos token
      
      * not doctest
      
      * debug tokenizer
      
      * not doctest
      
      * debug tokenization
      
      * debug init for tokenizer
      
      * fix style
      
      * update init
      
      * delete if in token auto
      
      * add tokenizer doc
      
      * add tokenizer in init
      
      * Update dummy_tokenizers_objects.py
      
      * update
      
      * update
      
      * debug
      
      * Update tokenization_qwen2.py
      
      * debug
      
      * Update convert_slow_tokenizer.py
      
      * add copies
      
      * add copied from and make style
      
      * update files map
      
      * update test
      
      * fix style
      
      * fix merge reading and update tests
      
      * fix tests
      
      * fix tests
      
      * fix style
      
      * debug a variable in readme
      
      * Update src/transformers/models/qwen2/configuration_qwen2.py
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * update test and copied from
      
      * fix style
      
      * update qwen2 tokenization  and tests
      
      * Update tokenization_qwen2.py
      
      * delete the copied from after property
      
      * fix style
      
      * update tests
      
      * update tests
      
      * add copied from
      
      * fix bugs
      
      * update doc
      
      * add warning for sliding window attention
      
      * update qwen2 tokenization
      
      * fix style
      
      * Update src/transformers/models/qwen2/modeling_qwen2.py
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * fix tokenizer fast
      
      ---------
      Co-authored-by: default avatarRen Xuancheng <jklj077@users.noreply.github.com>
      Co-authored-by: default avatarrenxuancheng.rxc <renxuancheng.rxc@alibaba-inc.com>
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      d6ffe74d
    • Gustavo de Rosa's avatar
      Fixes default value of `softmax_scale` in `PhiFlashAttention2`. (#28537) · d93ef7d7
      Gustavo de Rosa authored
      * fix(phi): Phi does not use softmax_scale in Flash-Attention.
      
      * chore(docs): Update Phi docs.
      d93ef7d7
  12. 16 Jan, 2024 1 commit
  13. 15 Jan, 2024 3 commits
  14. 12 Jan, 2024 1 commit
  15. 11 Jan, 2024 2 commits
  16. 10 Jan, 2024 2 commits
  17. 08 Jan, 2024 2 commits
    • NielsRogge's avatar
      Add SigLIP (#26522) · 3b742ea8
      NielsRogge authored
      
      
      * Add first draft
      
      * Use appropriate gelu function
      
      * More improvements
      
      * More improvements
      
      * More improvements
      
      * Convert checkpoint
      
      * More improvements
      
      * Improve docs, remove print statements
      
      * More improvements
      
      * Add link
      
      * remove unused masking function
      
      * begin tokenizer
      
      * do_lower_case
      
      * debug
      
      * set split_special_tokens=True
      
      * Remove script
      
      * Fix style
      
      * Fix rebase
      
      * Use same design as CLIP
      
      * Add fast tokenizer
      
      * Add SiglipTokenizer to init, remove extra_ids
      
      * Improve conversion script
      
      * Use smaller inputs in conversion script
      
      * Update conversion script
      
      * More improvements
      
      * Add processor to conversion script
      
      * Add tests
      
      * Remove print statements
      
      * Add tokenizer tests
      
      * Fix more tests
      
      * More improvements related to weight initialization
      
      * More improvements
      
      * Make more tests pass
      
      * More improvements
      
      * More improvements
      
      * Add copied from
      
      * Add canonicalize_text
      
      * Enable fast tokenizer tests
      
      * More improvements
      
      * Fix most slow tokenizer tests
      
      * Address comments
      
      * Fix style
      
      * Remove script
      
      * Address some comments
      
      * Add copied from to tests
      
      * Add more copied from
      
      * Add more copied from
      
      * Add more copied from
      
      * Remove is_flax_available
      
      * More updates
      
      * Address comment
      
      * Remove SiglipTokenizerFast for now
      
      * Add caching
      
      * Remove umt5 test
      
      * Add canonicalize_text inside _tokenize, thanks Arthur
      
      * Fix image processor tests
      
      * Skip tests which are not applicable
      
      * Skip test_initialization
      
      * More improvements
      
      * Compare pixel values
      
      * Fix doc tests, add integration test
      
      * Add do_normalize
      
      * Remove causal mask and leverage ignore copy
      
      * Fix attention_mask
      
      * Fix remaining tests
      
      * Fix dummies
      
      * Rename temperature and bias
      
      * Address comments
      
      * Add copied from to tokenizer tests
      
      * Add SiglipVisionModel to auto mapping
      
      * Add copied from to image processor tests
      
      * Improve doc
      
      * Remove SiglipVisionModel from index
      
      * Address comments
      
      * Improve docs
      
      * Simplify config
      
      * Add first draft
      
      * Make it like mistral
      
      * More improvements
      
      * Fix attention_mask
      
      * Fix output_attentions
      
      * Add note in docs
      
      * Convert multilingual model
      
      * Convert large checkpoint
      
      * Convert more checkpoints
      
      * Add pipeline support, correct image_mean and image_std
      
      * Use padding=max_length by default
      
      * Make processor like llava
      
      * Add code snippet
      
      * Convert more checkpoints
      
      * Set keep_punctuation_string=None as in OpenCLIP
      
      * Set normalized=False for special tokens
      
      * Fix doc test
      
      * Update integration test
      
      * Add figure
      
      * Update organization
      
      * Happy new year
      
      * Use AutoModel everywhere
      
      ---------
      Co-authored-by: default avatarpatil-suraj <surajp815@gmail.com>
      3b742ea8
    • Rosie Wood's avatar
      Add segmentation map processing to SAM Image Processor (#27463) · 73c88012
      Rosie Wood authored
      
      
      * add segmentation map processing to sam image processor
      
      * fixup
      
      * add tests
      
      * reshaped_input_size is shape before padding
      
      * update tests for size/shape outputs
      
      * fixup
      
      * add code snippet to docs
      
      * Update docs/source/en/model_doc/sam.md
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Add missing backticks
      
      * add `segmentation_maps` as arg for SamProcessor.__call__()
      
      ---------
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      73c88012
  18. 05 Jan, 2024 1 commit
  19. 04 Jan, 2024 1 commit
  20. 03 Jan, 2024 2 commits
    • Connor Henderson's avatar
      Add FastSpeech2Conformer (#23439) · d83ff5ee
      Connor Henderson authored
      * start - docs, SpeechT5 copy and rename
      
      * add relevant code from FastSpeech2 draft, have tests pass
      
      * make it an actual conformer, demo ex.
      
      * matching inference with original repo, includes debug code
      
      * refactor nn.Sequentials, start more desc. var names
      
      * more renaming
      
      * more renaming
      
      * vocoder scratchwork
      
      * matching vocoder outputs
      
      * hifigan vocoder conversion script
      
      * convert model script, rename some config vars
      
      * replace postnet with speecht5's implementation
      
      * passing common tests, file cleanup
      
      * expand testing, add output hidden states and attention
      
      * tokenizer + passing tokenizer tests
      
      * variety of updates and tests
      
      * g2p_en pckg setup
      
      * import structure edits
      
      * docstrings and cleanup
      
      * repo consistency
      
      * deps
      
      * small cleanup
      
      * forward signature param order
      
      * address comments except for masks and labels
      
      * address comments on attention_mask and labels
      
      * address second round of comments
      
      * remove old unneeded line
      
      * address comments part 1
      
      * address comments pt 2
      
      * rename auto mapping
      
      * fixes for failing tests
      
      * address comments part 3 (bart-like, train loss)
      
      * make style
      
      * pass config where possible
      
      * add forward method + tests to WithHifiGan model
      
      * make style
      
      * address arg passing and generate_speech comments
      
      * address Arthur comments
      
      * address Arthur comments pt2
      
      * lint  changes
      
      * Sanchit comment
      
      * add g2p-en to doctest deps
      
      * move up self.encoder
      
      * onnx compatible tensor method
      
      * fix is symbolic
      
      * fix paper url
      
      * move models to espnet org
      
      * make style
      
      * make fix-copies
      
      * update docstring
      
      * Arthur comments
      
      * update docstring w/ new updates
      
      * add model architecture images
      
      * header size
      
      * md wording update
      
      * make style
      d83ff5ee
    • lain's avatar
      fix documentation for zero_shot_object_detection (#28267) · 6eba901d
      lain authored
      remove broken space
      6eba901d
  21. 02 Jan, 2024 1 commit
  22. 22 Dec, 2023 1 commit