1. 10 May, 2024 2 commits
  2. 09 May, 2024 3 commits
  3. 08 May, 2024 2 commits
  4. 07 May, 2024 1 commit
  5. 06 May, 2024 1 commit
  6. 02 May, 2024 4 commits
    • mobicham's avatar
      Add HQQ quantization support (#29637) · 59952994
      mobicham authored
      
      
      * update HQQ transformers integration
      
      * push import_utils.py
      
      * add force_hooks check in modeling_utils.py
      
      * fix | with Optional
      
      * force bias as param
      
      * check bias is Tensor
      
      * force forward for multi-gpu
      
      * review fixes pass
      
      * remove torch grad()
      
      * if any key in linear_tags fix
      
      * add cpu/disk check
      
      * isinstance return
      
      * add multigpu test + refactor tests
      
      * clean hqq_utils imports in hqq.py
      
      * clean hqq_utils imports in quantizer_hqq.py
      
      * delete hqq_utils.py
      
      * Delete src/transformers/utils/hqq_utils.py
      
      * ruff init
      
      * remove torch.float16 from __init__ in test
      
      * refactor test
      
      * isinstance -> type in quantizer_hqq.py
      
      * cpu/disk device_map check in quantizer_hqq.py
      
      * remove type(module) nn.linear check in quantizer_hqq.py
      
      * add BaseQuantizeConfig import inside HqqConfig init
      
      * remove hqq import in hqq.py
      
      * remove accelerate import from test_hqq.py
      
      * quant config.py doc update
      
      * add hqqconfig to main_classes doc
      
      * make style
      
      * __init__ fix
      
      * ruff __init__
      
      * skip_modules list
      
      * hqqconfig format fix
      
      * hqqconfig doc fix
      
      * hqqconfig doc fix
      
      * hqqconfig doc fix
      
      * hqqconfig doc fix
      
      * hqqconfig doc fix
      
      * hqqconfig doc fix
      
      * hqqconfig doc fix
      
      * hqqconfig doc fix
      
      * hqqconfig doc fix
      
      * test_hqq.py remove mistral comment
      
      * remove self.using_multi_gpu is False
      
      * torch_dtype default val set and logger.info
      
      * hqq.py isinstance fix
      
      * remove torch=None
      
      * torch_device test_hqq
      
      * rename test_hqq
      
      * MODEL_ID in test_hqq
      
      * quantizer_hqq setattr fix
      
      * quantizer_hqq typo fix
      
      * imports quantizer_hqq.py
      
      * isinstance quantizer_hqq
      
      * hqq_layer.bias reformat quantizer_hqq
      
      * Step 2 as comment in quantizer_hqq
      
      * prepare_for_hqq_linear() comment
      
      * keep_in_fp32_modules fix
      
      * HqqHfQuantizer reformat
      
      * quantization.md hqqconfig
      
      * quantization.md model example reformat
      
      * quantization.md # space
      
      * quantization.md space   })
      
      * quantization.md space   })
      
      * quantization_config fix doc
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * axis value check in quantization_config
      
      * format
      
      * dynamic config explanation
      
      * quant config method in quantization.md
      
      * remove shard-level progress
      
      * .cuda fix modeling_utils
      
      * test_hqq fixes
      
      * make fix-copies
      
      ---------
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      59952994
    • Joao Gante's avatar
      Docs: add missing `StoppingCriteria` autodocs (#30617) · 66abe139
      Joao Gante authored
      
      
      * add missing docstrings to docs
      
      * Update src/transformers/generation/stopping_criteria.py
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      ---------
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      66abe139
    • Joao Gante's avatar
      Docs: fix `generate`-related rendering issues (#30600) · aa55ff44
      Joao Gante authored
      * does this work?
      
      * like this?
      
      * fix the other generate links
      
      * missing these
      aa55ff44
    • amitportnoy's avatar
      phi3 chat_template does not support system role (#30606) · 801894e0
      amitportnoy authored
      * phi3 chat_template does not support system role
      
      * fix doc test error
      801894e0
  7. 01 May, 2024 3 commits
  8. 30 Apr, 2024 2 commits
  9. 29 Apr, 2024 1 commit
  10. 26 Apr, 2024 4 commits
    • Eitan Turok's avatar
      Fix link in dbrx.md (#30509) · 73014b56
      Eitan Turok authored
      73014b56
    • Eduardo Pacheco's avatar
      [SegGPT] Fix seggpt image processor (#29550) · 6d4cabda
      Eduardo Pacheco authored
      * Fixed SegGptImageProcessor to handle 2D and 3D prompt mask inputs
      
      * Added new test to check prompt mask equivalence
      
      * New proposal
      
      * Better proposal
      
      * Removed unnecessary method
      
      * Updated seggpt docs
      
      * Introduced do_convert_rgb
      
      * nits
      6d4cabda
    • amyeroberts's avatar
      Fix GroundingDINO, DPR after BERT SDPA update (#30506) · e7d52a10
      amyeroberts authored
      Fix GroundingDINO, DPR after BET SDPA update
      e7d52a10
    • JB (Don)'s avatar
      [`BERT`] Add support for sdpa (#28802) · dfa7b580
      JB (Don) authored
      * Adding SDPA support for BERT
      
      * Using the proper input name for testing model input in inference()
      
      * Adding documentation for SDPA in BERT model page
      
      * Use the stable link for the documentation
      
      * Adding a gate to only call .contiguous() for torch < 2.2.0
      
      * Additions and fixes to the documentation
      
      * Minor updates to documentation
      
      * Adding extra requirements needed for the contiguous() bug
      
      * Adding "Adapted from" in plcae of the "Copied from"
      
      * Add benchmark speedup tables to the documentation
      
      * Minor fixes to the documentation
      
      * Use ClapText as a replacemenet for Bert in the Copied-From
      
      * Some more fixes for the fix-copies references
      
      * Overriding the test_eager_matches_sdpa_generate in bert tests to not load with low_cpu_mem_usage
      
      [test all]
      
      * Undo changes to separate test
      
      * Refactored SDPA self attention code for KV projections
      
      * Change use_sdpa to attn_implementation
      
      * Fix test_sdpa_can_dispatch_on_flash by preparing input (required for MultipleChoice models)
      dfa7b580
  11. 25 Apr, 2024 4 commits
  12. 24 Apr, 2024 4 commits
    • Gustavo de Rosa's avatar
      Phi-3 (#30423) · c9693db2
      Gustavo de Rosa authored
      * chore(root): Initial commit of Phi-3 files.
      
      * fix(root): Fixes Phi-3 missing on readme.
      
      * fix(root): Ensures files are consistent.
      
      * fix(phi3): Fixes unit tests.
      
      * fix(tests): Fixes style of phi-3 test file.
      
      * chore(tests): Adds integration tests for Phi-3.
      
      * fix(phi3): Removes additional flash-attention usage, .e.g, swiglu and rmsnorm.
      
      * fix(phi3): Fixes incorrect docstrings.
      
      * fix(phi3): Fixes docstring typos.
      
      * fix(phi3): Adds support for Su and Yarn embeddings.
      
      * fix(phi3): Improves according first batch of reviews.
      
      * fix(phi3): Uses up_states instead of y in Phi3MLP.
      
      * fix(phi3): Uses gemma rotary embedding to support torch.compile.
      
      * fix(phi3): Improves how rotary embedding classes are defined.
      
      * fix(phi3): Fixes inv_freq not being re-computed for extended RoPE.
      
      * fix(phi3): Adds last suggestions to modeling file.
      
      * fix(phi3): Splits inv_freq calculation in two lines.
      c9693db2
    • Arthur's avatar
      Add llama3 (#30334) · 89c510d8
      Arthur authored
      
      
      * nuke
      
      * add co-author
      
      * add co-author
      
      * update card
      
      * fixup and fix copies to please our ci
      
      * nit fixup
      
      * super small nits
      
      * remove tokenizer_path from call to `write_model`
      
      * always safe serialize by default
      
      ---------
      Co-authored-by: default avatarpcuenca <pcuenca@users.noreply.github.com>
      Co-authored-by: default avatarxenova <xenova@users.noreply.github.com>
      89c510d8
    • Lysandre Debut's avatar
      Remove add-new-model in favor of add-new-model-like (#30424) · d4e92f1a
      Lysandre Debut authored
      * Remove add-new-model in favor of add-new-model-like
      
      * nits
      d4e92f1a
    • Lysandre Debut's avatar
  13. 23 Apr, 2024 2 commits
  14. 22 Apr, 2024 4 commits
  15. 19 Apr, 2024 3 commits
    • NielsRogge's avatar
      [Grounding DINO] Add resources (#30232) · 8c12690c
      NielsRogge authored
      
      
      * Add resources
      
      * Address comments
      
      * Apply suggestions from code review
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      ---------
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      8c12690c
    • Jo茫o David's avatar
      Add TF swiftformer (#23342) · d2cec09b
      Jo茫o David authored
      
      
      * Duplicate swiftformer
      
      * Convert SwiftFormerPatchEmbedding
      
      * Convert SwiftFormerEmbeddings
      
      * Convert TFSwiftFormerMlp
      
      * Convert TFSwiftFormerConvEncoder
      
      * Convert TFSwiftFormerLocalRepresentation
      
      * convert TFSwiftFormerEncoderBlock
      
      * Convert SwiftFormerStage
      
      * Convert SwiftFormerEncoder
      
      * Add TFSWiftFormerPreTrainedModel
      
      * Convert SwiftFormerForImageClassification
      
      * Add kwargs and start drop path
      
      * Fix syntax
      
      * Change Model class name
      
      * Add TFSwiftFormer to __init__
      
      * Duplicate test_modeling_swiftformer
      
      * First test conversions
      
      * Change require_torch to require_tf
      
      * Add exports to swiftformer __init__
      
      * Add TFSwiftFormerModel wrapper
      
      * Fix __init__ and run black
      
      * Remove docstring from MainLayer, fix padding
      
      * Use keras.layers.Activation on keras.Sequential
      
      * Fix swiftformer exports
      
      * Fix activation layer from config
      
      * Remove post_inits
      
      * Use tf.keras.layers.ZeroPadding2D
      
      * Convert torch normalize
      
      * Change tf test input shape
      
      * Fix softmax and reduce_sum
      
      * Convert expand_dims and repeat
      
      * Add missing reshape and tranpose
      
      * Simplify TFSwiftFormerEncoderBlock.call
      
      * Fix mismatch in patch embeddings
      
      * Fix expected output shape to match channels last
      
      * Fix swiftformer typo
      
      * Disable test_onnx
      
      * Fix TFSwiftFormerForImageClassification call
      
      * Add unpack inputs
      
      * Convert flatten(2).mean(-1)
      
      * Change vision dummy inputs (to be reviewed)
      
      * Change test_forward_signature to use .call
      
      * Fix @unpack_inputs
      
      * Set return_tensors="tf" and rename class
      
      * Rename wrongly named patch_embeddings layer
      
      * Add serving_output and change dummy_input shape
      
      * Make dimensions BCHW and transpose inside embedding layer
      
      * Change SwiftFormerEncoderBlock
      
      * Fix ruff problems
      
      * Add image size to swiftformer config
      
      * Change tranpose to MainLayer and use -1 for reshape
      
      * Remove serving_outputs and dummy_inputs
      
      * Remove test_initialization test from tf model
      
      * Make Sequential component a separate layer
      
      * Fix layers' names
      
      * Tranpose encoder outputs
      
      * Fix tests and check if hidden states is not None
      
      * Fix TFSwiftFormerForImageClassification
      
      * Run make fixup
      
      * Run make fix-copies
      
      * Update modeling_tf_auto
      
      * Update docs
      
      * Fix modeling auto mapping
      
      * Update modelint_tf_swiftformer docs
      
      * Fill image_size doc and type
      
      * Add reduction=None to loss computation
      
      * Update docs
      
      * make style
      
      * Debug: Delete the tip to see if that changes anything
      
      * Re-add tip
      
      * Remove add_code_sample_docstrings
      
      * Remove unused import
      
      * Get the debug to actually tell us the problem it has with the docs
      
      * Try a substitution to match the PyTorch file?
      
      * Add swiftformer to ignore list
      
      * Add build() methods
      
      * Update copyright year
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Remove FIXME comment
      
      * Remove from_pt
      
      * Update copyright year
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Rename one-letter variables
      
      * Remove FIXMEs related to momentum
      
      * Remove old TODO comment
      
      * Remove outstanding FIXME comments
      
      * Get dropout rate from config
      
      * Add specific dropout config for MLP
      
      * Add convencoder dropout to config
      
      * Pass config to SwiftFormerDropPath layer
      
      * Fix drop_path variable name and add Adapted from comment
      
      * Run ruff
      
      * Removed copied from comment
      
      * Run fix copies
      
      * Change drop_path to identity to match pt
      
      * Cleanup build() methods and move to new keras imports
      
      * Update docs/source/en/model_doc/swiftformer.md
      Co-authored-by: default avatarMatt <Rocketknight1@users.noreply.github.com>
      
      * Raise error if drop_path_rate > 0.0
      
      * Apply suggestions from code review
      
      Replace (self.dim), with self.dim,
      Co-authored-by: default avatarMatt <Rocketknight1@users.noreply.github.com>
      
      * Remove drop_path function
      
      * Add training to TFSwiftFormerEncoder
      
      * Set self.built = True last
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Should have been added to previous commit
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Apply suggestions from code review
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Change default_feature_extractor to default_image_processor
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Import Keras from modeling_tf_utils
      
      * Remove relative import
      
      * Run ruff --fix
      
      * Move import keras to tf_available
      
      * Add copied from comment to test_forward_signature
      
      * Reduce batch size and num_labels
      
      * Extract loss logic to hf_compute_loss
      
      * Run ruff format
      
      ---------
      Co-authored-by: default avatarMatt <rocketknight1@gmail.com>
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      Co-authored-by: default avatarMatt <Rocketknight1@users.noreply.github.com>
      d2cec09b
    • Matt's avatar
      Deprecate default chat templates (#30346) · 0927bfd0
      Matt authored
      * initial commit, remove warnings on default chat templates
      
      * stash commit
      
      * Raise a much sterner warning for default chat templates, and prepare for depreciation
      
      * Update the docs
      0927bfd0