- 14 Jul, 2024 2 commits
-
Joao Gante authored
* tmp commit * shorter * nit * explicit kwargs * propagate changes * mass propagation with a few manual touches (let's see how CI behaves) * fix cacheless case * Update src/transformers/generation/utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * make fixup --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
fxmarty authored
use torch.compiler.is_compiling() when possible
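A minimal sketch of the kind of guard this commit describes, assuming a helper similar to the one transformers keeps in its utils; `torch.compiler.is_compiling` only exists on recent torch releases, so older builds fall back to the private `torch._dynamo` hook:

```python
import torch

def is_torchdynamo_compiling() -> bool:
    # Prefer the public API when available (newer torch releases)...
    try:
        return torch.compiler.is_compiling()
    except AttributeError:
        # ...and fall back to the private hook on older versions.
        try:
            return torch._dynamo.is_compiling()
        except Exception:
            return False
```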
-
- 12 Jul, 2024 1 commit
-
Aviv Shamsian authored
* fix prompt strip to support tensors and np arrays * framework agnostic * change logic check before converting prompt into list Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * adding _convert_to_list to tokenization_whisper_fast * adding tests for prompt decoding * adding comment Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * adding comment Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * revert minor * make style formatting * style formatting after update * Update src/transformers/models/whisper/tokenization_whisper_fast.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * fixing _strip_prompt to handle _decode_with_timestamps * fix copies --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
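A rough sketch of the framework-agnostic conversion this PR describes (the helper name `_convert_to_list` comes from the message; the exact branching here is an assumption):

```python
import numpy as np

def _convert_to_list(token_ids):
    # Tensors from torch/TF expose .numpy(); move torch tensors off-device first.
    if hasattr(token_ids, "numpy"):
        if "torch" in str(type(token_ids)):
            token_ids = token_ids.cpu().numpy()
        else:
            token_ids = token_ids.numpy()
    # NumPy arrays (including converted tensors) become plain Python lists.
    if isinstance(token_ids, np.ndarray):
        token_ids = token_ids.tolist()
    return token_ids
```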
-
- 11 Jul, 2024 16 commits
-
jiqing-feng authored
* fix qa pipeline * fix tensor to numpy
-
Naman Garg authored
* initialized Structure * Updated variable names * Added Config class, basic HF setup, convert_to_hf * Fixed Convert function, added hiera to HF files, Initialized test files * better naming for x in forward pass * Moved utils to hiera * Change hiera -> hiera_model * Fixed integration into transformers * Fix: Convert Checkpoint * added documentation for hiera * added Docstrings to models, Transformers based changes * make style and quality * Integration & Block tests running * Fixed bugs * Removed timm dependency * added HieraBlock * fixed: Model name * added tests for HieraModel, HieraBlock * fixed imports * fixed quality & copies * Fixes * Update docs/source/en/model_doc/hiera.md Fix name Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/hiera.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/model_doc/hiera.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/hiera/configuration_hiera.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/hiera/configuration_hiera.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/hiera/modeling_hiera.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update src/transformers/models/hiera/modeling_hiera.py Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Fixed formatting * Code quality & Import differences * quality and repo-consistency fix * fixed no torch error * Docstring fix * doc string fix * fixed example usage * Resolved issues in modeling_hiera * Removed Hiera MAE * Added test and resolved bug * fixed doc string * First commit * Finished conversion script and model forward working * Resolved all issues * nits * Improving tests * Nits * More nits * Improving HieraForMaskedImageModeling * More improvements and nits * Fixed docstrings of outputs * More fixes * More improvements * Updated conversion script * Fixed docstrings * Improved tests * Fixed attention outputs test * All tests green * Removed unnecessary file * contribution attribution * Resolved a few issues * Resolved Comments * Updated model repo id and fixed bugs * Removed loss print * Make tests green * Updated docstrings * Fix style * Fixed num_heads in config * Removed unnecessary video checkpoint related code in the conversion script * Fix style * Changed atol in conversion script * HieraConfig * Fix copies * Fixed typo * Resolved a few issues * make * converted conv_nd -> nn.Module * Removed video complexities * fix style * Addressing comments * Update src/transformers/models/hiera/modeling_hiera.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/hiera/modeling_hiera.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/hiera/modeling_hiera.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fix style * Fixed tests * Fixed typo * Fixed interpolate test * Made torch fx compatible * Made sure image processor is correct * Addressed comments * Noise directly as torch * Remove unnecessary attr * Added return_dict * Update src/transformers/models/hiera/__init__.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Updated checkpoints * [run_slow] hiera * Fixed device mismatch * [run_slow] hiera * Fixed GPU tests * [run_slow] hiera --------- Co-authored-by: Ubuntu <ubuntu@ip-172-31-29-50.us-east-2.compute.internal> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Eduardo Pacheco <eduardo.pach@hotmail.com> Co-authored-by: Eduardo Pacheco <69953243+EduardoPach@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
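As a quick sanity check for the newly added architecture, a hedged usage sketch; the checkpoint id below is an assumption based on the conversion script and may differ from the published one:

```python
import requests
from PIL import Image
from transformers import AutoImageProcessor, HieraModel

checkpoint = "facebook/hiera-base-224-hf"  # assumed checkpoint id
processor = AutoImageProcessor.from_pretrained(checkpoint)
model = HieraModel.from_pretrained(checkpoint)

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
inputs = processor(images=image, return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)
```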
-
Apoorv Khandelwal authored
* Change `Trainer.get_optimizer_cls_and_kwargs` to `self.` * Make `get_optimizer_cls_and_kwargs` an instance method * Fixing typo * Revert `get_optimizer_cls_and_kwargs` to staticmethod * restore newline to trainer.py eof
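Since the method ends up as a staticmethod again, it stays callable without constructing a Trainer; a small usage sketch:

```python
from transformers import Trainer, TrainingArguments

args = TrainingArguments(output_dir="out", optim="adamw_torch", learning_rate=3e-4)
# Static method: no Trainer instance needed to inspect the optimizer choice.
optimizer_cls, optimizer_kwargs = Trainer.get_optimizer_cls_and_kwargs(args)
print(optimizer_cls, optimizer_kwargs)
```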
-
t11s authored
fix(SigLip): remove spurious exclusion of first vision output token in classifier
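The gist of the fix as a sketch: SigLIP has no class token, so the classifier head should pool over every vision token rather than dropping the first one (the function and tensor names here are illustrative):

```python
import torch

def pool_vision_tokens(last_hidden_state: torch.Tensor) -> torch.Tensor:
    # Buggy variant skipped token 0: last_hidden_state[:, 1:, :].mean(dim=1).
    # SigLIP has no CLS token, so every vision token belongs in the pool.
    return last_hidden_state.mean(dim=1)
```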
-
Joao Gante authored
fix sliding cache
-
Arthur authored
* dumb commit * nit * update * something like this * unpack in modeling utils * safe import * oups * update * nits * diff convert gemma * update * start propagating * update other modeling code as well * update for sliding window models * nits * more init cleanups * styling * fixup * noice * pass fixup * typo typing_extension -> typing_extensions * torch.nn.functionnal -> torch.nn.functional * add to import structure * unpack * simplify a bit more for this first version * nut * update * update * nit * ease the import of `Unpack` * remove useless `use_sliding_window` * no qua please * protect import? * style * [run-slow] * [run slow] llama,gemma,mistral,mixtral * remove extra kwargs * fix llama * address review comments * apply diff_model_converter to modeling_gemma.py * remove cache_position 1 * remove cache_position 2 * some cleaning * refactor gemma2 as well * apply review comments * rename file to modeling_flash_attention_utils.py * siglip refactor * remove dead code * is the hub down? * still down? * fix siglip * fix gemma2 * fatal: Could not read from remote repository. * fix typo in softcap implem * flaky * Failed: Timeout >120.0s --------- Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
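A hedged sketch of the typed-kwargs pattern this refactor introduces around the shared flash-attention helper; the TypedDict fields below are assumptions, not the exact set used in modeling_flash_attention_utils.py:

```python
from typing import Optional
from typing_extensions import TypedDict, Unpack

class FlashAttentionKwargs(TypedDict, total=False):
    # Illustrative fields only; the real bundle may differ.
    sliding_window: Optional[int]
    softcap: Optional[float]

def _flash_attention_forward(query, key, value, **kwargs: Unpack[FlashAttentionKwargs]):
    # Callers pass the optional knobs as keywords while type checkers
    # still see a closed, explicitly typed set of arguments.
    ...
```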
-
fxmarty authored
* fix tests * [test_all] check * address review comments
-
Omar Salman authored
* Add warning message for and parameters * Fix when the warning is raised * Formatting changes * Improve testing and remove duplicated warning from _fix_key
-
Sangbum Daniel Choi authored
* add gather_use_object arguments * fix name and pass the CI test for Seq2SeqTrainer * make style * make it to functools * fix typo * add accelerate version: * adding warning * Update src/transformers/trainer.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * make style * Update src/transformers/training_args.py * check function move to initial part * add test for eval_use_gather_object * fix minor --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
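A hedged usage sketch of the flag this PR adds (it requires a recent accelerate, per the version check mentioned in the message):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    # Gather arbitrary Python objects (not just tensors) during distributed eval.
    eval_use_gather_object=True,
)
```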
-
Sai-Suraj-27 authored
Fixed the first argument name in a few classmethods.
-
Isotr0py authored
* add missing methods for FuyuForCausalLM * fix a typo * format code * add missing tie_weights * format code
-
Arthur authored
* Support softcapping * strictly greater than * update
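For context, logit softcapping (as used by Gemma2-style models) squashes attention or final logits through tanh so they stay strictly inside (-cap, cap); a minimal sketch:

```python
import torch

def softcap(logits: torch.Tensor, cap: float) -> torch.Tensor:
    # Rescale, squash with tanh, scale back: output is smoothly
    # bounded by (-cap, cap) instead of growing without limit.
    return cap * torch.tanh(logits / cap)
```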
-
Arthur authored
* preserve the order * oups * oups * nit * trick * fix issues
-
Raushan Turganbay authored
* accept kwargs in processors * return unused kwargs * fix tests * typo * update the other way
-
turboderp authored
* HybridCache: Flip order of alternating global-attn/sliding-attn layers * HybridCache: Read sliding_window argument from cache_kwargs * Gemma2Model: Flip order of alternating global-attn/sliding-attn layers * Code formatting
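A sketch of the alternation rule after the flip, assuming (as in the released Gemma2 modeling code) that even-indexed layers use sliding-window attention and odd-indexed layers use global attention:

```python
def uses_sliding_window(layer_idx: int) -> bool:
    # Assumed convention after the flip: even layers are sliding-window,
    # odd layers are global-attention.
    return layer_idx % 2 == 0
```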
-
Raushan Turganbay authored
* update docs * one more change
-
- 10 Jul, 2024 6 commits
-
haikuoxin authored
fix bug: https://github.com/huggingface/transformers/issues/31852
-
Marc Sun authored
Save sharded checkpoint in Trainer
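The Trainer delegates to the model's sharded saving; a sketch of the underlying call (the shard size here is illustrative):

```python
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased")
# Weights beyond max_shard_size are split across multiple safetensors files,
# together with an index mapping each parameter to its shard.
model.save_pretrained("checkpoint-500", max_shard_size="5GB", safe_serialization=True)
```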
-
Sai-Suraj-27 authored
Removed duplicate field definitions in classes.
-
Yih-Dar authored
* Revert "Revert "Fix `_init_weights` for `ResNetPreTrainedModel`" (#31868)" This reverts commit b45dd5de . * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check --------- Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com>
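For reference, a sketch of the kind of `_init_weights` the revert restores for ResNetPreTrainedModel (Kaiming init for convolutions, constant init for norms); treat the exact branches as an assumption:

```python
import torch.nn as nn

def _init_weights(self, module):
    if isinstance(module, nn.Conv2d):
        # He initialization suits the ReLU nonlinearity used throughout ResNet.
        nn.init.kaiming_normal_(module.weight, mode="fan_out", nonlinearity="relu")
    elif isinstance(module, (nn.BatchNorm2d, nn.GroupNorm)):
        nn.init.constant_(module.weight, 1)
        nn.init.constant_(module.bias, 0)
```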
-
yukionfire authored
-
Raushan Turganbay authored
* add conversion for interleave llava * remove debug lines * remove unused imports * Update src/transformers/models/llava/convert_llava_weights_to_hf.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * small changes + docs --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 09 Jul, 2024 8 commits
-
Yun Dai authored
* add warning when using with FSDP full shard * fix style * Update src/transformers/training_args.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/training_args.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add hybrid shard warn * fix style --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Mauricio Villegas authored
Update modeling_utils.py: add return type annotation to PreTrainedModel.from_pretrained
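The change amounts to annotating the classmethod's return value; a sketch of the shape of the signature (argument list abbreviated):

```python
import os
from typing import Optional, Union

class PreTrainedModel:
    @classmethod
    def from_pretrained(
        cls,
        pretrained_model_name_or_path: Optional[Union[str, os.PathLike]],
        *model_args,
        **kwargs,
    ) -> "PreTrainedModel":
        # Annotated return type, so type checkers and IDEs know
        # from_pretrained yields a model instance.
        ...
```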
-
Yih-Dar authored
* init * test --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yung-Sung Chuang authored
Co-authored-by: Joao Gante <joao@huggingface.co>
-
kallewoof authored
-
NielsRogge authored
Add model
-
Raushan Turganbay authored
* deprecate `vocab_size` in other two VLMs * Update src/transformers/models/fuyu/configuration_fuyu.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * deprecate until 4.44 --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
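A hedged sketch of a deprecation shim of the sort described; the class name, warning wording, and backing attribute here are all assumptions for illustration:

```python
import warnings

class VLMConfig:  # stand-in for the affected configs (e.g. FuyuConfig)
    def __init__(self, vocab_size=262144):
        self._vocab_size = vocab_size

    @property
    def vocab_size(self):
        # Deprecated accessor kept for backward compatibility until v4.44.
        warnings.warn(
            "`vocab_size` is deprecated and will be removed in v4.44.",
            FutureWarning,
        )
        return self._vocab_size
```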
-
- 08 Jul, 2024 6 commits
-
Joao Gante authored
* enable strict signature * this should not have been deleted * recurrent_gemma too
-
André Storhaug authored
* Fix wrong accelerator device setup when using MPS * More robust TrainingArguments MPS handling * Update training_args.py * Cleanup
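A minimal sketch of robust MPS device selection of the kind this fix aims at, assuming the torch.backends.mps API available since torch 1.12:

```python
import torch

# Check both that MPS is usable at runtime and that torch was built with it.
if torch.backends.mps.is_available() and torch.backends.mps.is_built():
    device = torch.device("mps")
elif torch.cuda.is_available():
    device = torch.device("cuda")
else:
    device = torch.device("cpu")
```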
-
fxmarty authored
* symbolic trace supports inputs_embeds * fix test? * Update tests/test_modeling_common.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
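A hedged usage sketch of tracing with inputs_embeds via transformers.utils.fx (the model id is illustrative):

```python
from transformers import AutoModel
from transformers.utils.fx import symbolic_trace

model = AutoModel.from_pretrained("bert-base-uncased")
# inputs_embeds can now be named as a trace input instead of input_ids.
traced = symbolic_trace(model, input_names=["inputs_embeds", "attention_mask"])
```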
-
Pavel Iakubovskii authored
* Rebase to main * Fix attention implementation autoset for text and vision configs * Fixup * Minor fixes * Fix copies * Fix attention_mask for FA2 * Add equivalence tests for siglip * Remove right padding test * Uncomment flaky * Fix import * Add to docs * Fix test message * Add sdpa * Add sdpa equivalence test * Add siglip sdpa to docs * Fix typing for attention output * Add sdpa tests * Fix signature of FA2 * Autoset attn_implementation in config * Rename bsz -> batch_size * Move back autoset attn method * Mark as flaky * Correct attention mask padding * [run-slow] siglip * Add FA2 and sdpa docs * Style fix * Remove flaky for FA2 test * Change attention implementation set * Change attn_implementation propagation * Fix typos * Add modality to assert message * Add more sdpa backends in test * [run slow] siglip * Add math sdpa backend for all options * [run slow] siglip
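A usage sketch of the new backends; attn_implementation is the standard loading knob, and the checkpoint id is the public SigLIP base model:

```python
import torch
from transformers import SiglipModel

model = SiglipModel.from_pretrained(
    "google/siglip-base-patch16-224",
    attn_implementation="sdpa",  # or "flash_attention_2" on supported GPUs
    torch_dtype=torch.float16,
)
```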
-
Dingli Yang authored
Avoid a crash when BatchEncoding data is None
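The fix presumably reduces to a None guard; a sketch under that assumption (the class name and default here are illustrative, not the actual patch):

```python
from collections import UserDict

class BatchEncodingSketch(UserDict):
    def __init__(self, data=None):
        # Treat a missing payload as an empty mapping so later key
        # lookups and iteration don't crash on None.
        super().__init__(data if data is not None else {})
```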
-
NielsRogge authored
* First draft * Add docs * Clean up code * Convert model * Add image processor * Convert Zoe_K * More improvements * Improve variable names and docstrings * Improve variable names * Improve variable names * Replace nn.Sequential * More improvements * Convert ZoeD_NK * Fix most tests * Verify pixel values * Verify pixel values * Add squeeze * Update beit to support arbitrary window sizes * Improve image processor * Improve docstring * Improve beit * Improve model outputs * Add figure * Fix beit * Update checkpoint * Fix repo id * Add _keys_to_ignore_on_load_unexpected * More improvements * Address comments * Address comments * Address comments * Address comments * Rename variable name * Add backbone_hidden_size * Vectorize * Vectorize more * Address comments * Clarify docstring * Remove backbone_hidden_size * Fix image processor * Remove print statements * Remove print statement * Add integration test * Address comments * Address comments * Address comments * Address comments * Add requires_backends * Clean up * Simplify conversion script * Simplify more * Simplify more * Simplify more * Clean up * Make sure beit is loaded correctly * Address comment * Address bin_configurations * Use bin_configurations * Convert models, add integration tests * Fix doc test * Address comments * Unify regressor classes * Clarify arguments * Improve resize_image * Add num_relative_features * Address comment * [run-slow]beit,data2vec,zoedepth * [run-slow]beit,data2vec,zoedepth * Address comments * Address comment * Address comment * Replace nn.TransformerEncoderLayer and nn.TransformerEncoder * Replace nn.MultiheadAttention * Add attributes for patch transformer to config * Add tests for ensure_multiple_of * Update organization * Add tests * [run-slow] beit data2vec * Update ruff * [run-slow] beit data2vec * Add comment * Improve docstrings, add test * Fix interpolate_pos_encoding * Fix slow tests * Add docstring * Update src/transformers/models/zoedepth/image_processing_zoedepth.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/zoedepth/image_processing_zoedepth.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Improve tests and docstrings * Use run_common_tests * Improve docstrings * Improve docstrings * Improve tests * Improve tests * Remove print statements --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
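A quick, hedged smoke test of the new model via the pipeline API; the checkpoint id is assumed from the conversion targets mentioned above and may differ:

```python
import requests
from PIL import Image
from transformers import pipeline

pipe = pipeline("depth-estimation", model="Intel/zoedepth-nyu-kitti")
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
depth = pipe(image)["depth"]  # a PIL image holding the predicted depth map
```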
-
- 05 Jul, 2024 1 commit
-
Pedro Cuenca authored
* Depth Anything: update conversion script for V2 * Update docs * Style * Revert "Update docs" This reverts commit be0ca47ea1be4f3cd9aa2113bdd8efcc9959119e. * Add docs for depth anything v2 * Add depth_anything_v2 to MODEL_NAMES_MAPPING Done similarly to Flan-T5: https://github.com/huggingface/transformers/pull/19892/files * Add tip in original docs
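Similarly hedged, a one-liner to try the converted V2 weights; the checkpoint id is an assumption:

```python
from transformers import pipeline

pipe = pipeline(
    "depth-estimation",
    model="depth-anything/Depth-Anything-V2-Small-hf",  # assumed converted checkpoint
)
```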
-