- 19 Mar, 2024 2 commits
-
-
StevenBucaille authored
* Added SuperPoint docs * Added tests * Removed commented part * Commit to create and fix add_superpoint branch with a new branch * Fixed dummy_pt_objects * Committed missing files * Fixed README.md * Apply suggestions from code review Fixed small changes Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Moved ImagePointDescriptionOutput from modeling_outputs.py to modeling_superpoint.py * Removed AutoModelForKeypointDetection and related stuff * Fixed inconsistencies in image_processing_superpoint.py * Moved infer_on_model logic simply in test_inference * Fixed bugs, added labels to forward method with checks whether it is properly a None value, also added tests about this logic in test_modeling_superpoint.py * Added tests to SuperPointImageProcessor to ensure that images are properly converted to grayscale * Removed remaining mentions of MODEL_FOR_KEYPOINT_DETECTION_MAPPING * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fixed from (w, h) to (h, w) as input for tests * Removed unnecessary condition * Moved last_hidden_state to be the first returned * Moved last_hidden_state to be the first returned (bis) * Moved last_hidden_state to be the first returned (ter) * Switched image_width and image_height in tests to match recent changes * Added config as first SuperPointConvBlock init argument * Reordered README's after merge * Added missing first config argument to SuperPointConvBlock instantiations * Removed formatting error * Added SuperPoint to README's de, pt-br, ru, te and vi * Checked out README_fr.md * Fixed README_fr.md * Test fix README_fr.md * Test fix README_fr.md * Last make fix-copies ! * Updated checkpoint path * Removed unused SuperPoint doc * Added missing image * Update src/transformers/models/superpoint/modeling_superpoint.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Removed unnecessary import * Update src/transformers/models/superpoint/modeling_superpoint.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Added SuperPoint to _toctree.yml --------- Co-authored-by:
steven <steven.bucaillle@gmail.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by:
Steven Bucaille <steven.bucaille@buawei.com>
-
Younes Belkada authored
* add galore v1 * add import * add tests and doc * fix doctest * forward contrib credits from discussions * forward contrib credits from discussions * Apply suggestions from code review Co-authored-by:
Zach Mueller <muellerzr@gmail.com> * fix failing tests * switch to `optim_target_modules` and clarify docs * more clarification * enhance lookup logic * update a test to add peak memory * add regex, all-linear and single string support * add layer-wise optimization through DummyOptimizers and LRSchedulers * forward contrib credits from discussions and original idea * add a section about DDP not supported in layerwise * Update src/transformers/trainer.py Co-authored-by:
Zach Mueller <muellerzr@gmail.com> * fix self * check only if layer_wise * Update src/transformers/training_args.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * oops * make use of intervals * clarify comment * add matching tests * GaLoRe -> GaLore * move to `get_scheduler` * add note on docs * add a warning * adapt a bit the docs * update docstring * support original API * Update docs/source/en/trainer.md * slightly refactor * Update docs/source/en/trainer.md Co-authored-by:
Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com> * Update src/transformers/training_args.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix args parsing and add tests * remove warning for regex * fix type hint * add note about extra args * make `is_regex` return optional --------- Co-authored-by: Maxime <maximegmd@users.noreply.github.com> Co-authored-by: Wing Lian <winglian@users.noreply.github.com> Co-authored-by:
Zach Mueller <muellerzr@gmail.com> Co-authored-by:
hiyouga <hiyouga@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by:
Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
-
- 18 Mar, 2024 2 commits
-
-
Abubakar Abid authored
* Update pipeline_tutorial.md to include gradio * Update pipeline_tutorial.md * Update docs/source/en/pipeline_tutorial.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/pipeline_tutorial.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/pipeline_tutorial.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/pipeline_tutorial.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update pipeline_tutorial.md * Update docs/source/en/pipeline_tutorial.md Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Yoach Lacombe authored
* first modeling code * make repository * still WIP * update model * add tests * add latest change * clean docstrings and copied from * update docstrings md and readme * correct chroma function * correct copied from and remove unreleated test * add doc to toctree * correct imports * add convert script to notdoctested * Add suggestion from Sanchit Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * correct get_uncoditional_inputs docstrings * modify README according to SANCHIT feedback * add chroma to audio utils * clean librosa and torchaudio hard dependencies * fix FE * refactor audio decoder -> audio encoder for consistency with previous musicgen * refactor conditional -> encoder * modify sampling rate logics * modify license at the beginning * refactor all_self_attns->all_attentions * remove ignore copy from causallm generate * add copied from for from_sub_models * fix make copies * add warning if audio is truncated * add copied from where relevant * remove artefact * fix convert script * fix torchaudio and FE * modify chroma method according to feedback-> better naming * refactor input_values->input_features * refactor input_values->input_features and fix import fe * add input_features to docstrigs * correct inputs_embeds logics * remove dtype conversion * refactor _prepare_conditional_hidden_states_kwargs_for_generation ->_prepare_encoder_hidden_states_kwargs_for_generation * change warning for chroma length * Update src/transformers/models/musicgen_melody/convert_musicgen_melody_transformers.py Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * change way to save wav, using soundfile * correct docs and change to soundfile * fix import * fix init proj layers * remove line breaks from md * fix issue with docstrings * add FE suggestions * improve is in logics and remove useless imports * remove custom from_pretrained * simplify docstring code * add suggestions for modeling tests * make style * update converting script with sanity check * remove encoder attention mask from conditional generation * replace musicgen melody checkpoints with official orga * rename ylacombe->facebook in checkpoints * fix copies * remove unecessary warning * add shape in code docstrings * add files to slow doc tests * fix md bug and add md to not_tested * make fix-copies * fix hidden states test and batching --------- Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
-
- 15 Mar, 2024 3 commits
-
-
Aaron Jimenez authored
* add attention to es/ and edit es/_toctree.yml * translate attention.md * fix transformers * fix transformers
-
Marc Sun authored
* start integration * fix * add and debug tests * update tests * make pytorch serialization works * compatible with device_map and offload * fix tests * make style * add ref * guard against safetensors * add float8 and style * fix is_serializable * Fix shard_checkpoint compatibility with quanto * more tests * docs * adjust memory * better * style * pass tests * Update src/transformers/modeling_utils.py Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * add is_safe_serialization instead * Update src/transformers/quantizers/quantizer_quanto.py Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * add QbitsTensor tests * fix tests * simplify activation list * Update docs/source/en/quantization.md Co-authored-by:
David Corvoysier <david.corvoysier@gmail.com> * better comment * Update tests/quantization/quanto_integration/test_quanto.py Co-authored-by:
David Corvoysier <david.corvoysier@gmail.com> * Update tests/quantization/quanto_integration/test_quanto.py Co-authored-by:
David Corvoysier <david.corvoysier@gmail.com> * find and fix edge case * Update docs/source/en/quantization.md Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * pass weights_only_kwarg instead * fix shard_checkpoint loading * simplify update_missing_keys * Update tests/quantization/quanto_integration/test_quanto.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * recursion to get all tensors * block serialization * skip serialization tests * fix * change by cuda:0 for now * fix regression * update device_map * fix doc * add noteboon * update torch_dtype * update doc * typo * typo * remove comm --------- Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by:
David Corvoysier <david.corvoysier@gmail.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by:
Younes Belkada <younesbelkada@gmail.com>
-
Saurabh Dash authored
* Cohere Model Release (#1) Cohere Model Release * Remove unnecessary files and code (#2) Some cleanup * Delete cohere-model directory (#3) * Make Fix (#5) * Pr fixes (#6) * fixes for pr * pr fixes for the format * pr fixes for the format * src/transformers/models/auto/tokenization_auto.py * Tokenizer test (#8) * tokenizer test * format fix * Adding Docs and other minor changes (#7) * Add modeling tests (#9) * Smol Fix (#11) * tokenization tests are fixed * format fixes * fix pr doc tests * fix pr doc tests * fix pr doc tests * fix pr style check * small changes in cohere.md * FIX: Address final comments for transformers integration (#13) * fix modeling final nits and add proper test file * for now leave empty tests * add integration test * push new test * fix modeling cohere (#14) * Update chat templates to use the new API (#15) --------- Co-authored-by:
ahmetustun <ahmetustun89@gmail.com> Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by:
Matt <Rocketknight1@users.noreply.github.com>
-
- 13 Mar, 2024 6 commits
-
-
Aaron Jimenez authored
* remove ChatML link from en/ * remove ChatML link in ja/ * remove ChatML link in zh/
-
Nate Cibik authored
* Added pytests for pvt-v2, all passed * Added pvt_v2 to docs/source/end/model_doc * Ran fix-copies and fixup. All checks passed * Added additional ReLU for linear attention mode * pvt_v2_b2_linear converted and working * copied models/pvt to adapt to pvt_v2 * First commit of pvt_v2 * PvT-v2 now works in AutoModel * Reverted batch eval changes for PR * Expanded type support for Pvt-v2 config * Fixed config docstring. Added channels property * Fixed model names in tests * Fixed config backbone compat. Added additional type support for image size in config * Fixed config backbone compat * Allowed for batching of eval metrics * Set key and value layers to use separate linear modules. Fixed pruning function * Set AvgPool to 7 * Fixed issue in init * Successful conversion of pretrained weights for PVT-v2 * Successful conversion of pretrained weights for PVT-v2 models * Updated index.md * Ran fix-copies * Fixed PvtV2Backbone tests * Added TFRegNet to OBJECTS_TO_IGNORE in check_docstrings.py * Fixed backbone stuff and fixed tests: all passing * Ran make fixup * Made modifications for code checks * Remove ONNX config from configuration_pvt_v2.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Use explicit image size dict in test_modeling_pvt_v2.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Make image_size optional in test_modeling_pvt_v2.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Remove _ntuple use in modeling_pvt_v2.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Remove reference to fp16_enabled * Model modules now take config as first argument even when not used * Replaced abbreviations for "SR" and "AP" with explicit "spatialreduction" and "averagepooling" * All LayerNorm now instantiates with config.layer_norm_eps * Added docstring for depth-wise conv layer * PvtV2Config now only takes Union[int, Tuple[int, int]] for image size * Refactored PVTv2 in prep for gradient checkpointing * Gradient checkpointing ready to test * Removed override of _set_gradient_checkpointing * Cleaned out old code * Applied code fixup * Applied code fixup * Began debug of pvt_v2 tests * Leave handling of num_labels to base pretrained config class * Deactivated gradient checkpointing tests until it is fixed * Removed PvtV2ImageProcessor which duped PvtImageProcessor * Fixed issue from rebase * Fixed issue from rebase * Set tests for gradient checkpointing to skip those using reentrant since it isn't supported * Fixed issue from rebase * Fixed issue from rebase * Changed model name in docs * Removed duplicate PvtV2Backbone * Work around type switching issue in tests * Fix model name in config comments * Update docs/source/en/model_doc/pvt_v2.md Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Changed name of variable from 'attn_reduce' to 'sr_type' * Changed name of variable from 'attn_reduce' to 'sr_type' * Changed from using 'sr_type' to 'linear_attention' for clarity * Update src/transformers/models/pvt_v2/modeling_pvt_v2.py Removed old code * Changed from using 'sr_type' to 'linear_attention' for clarity * Fixed Class names to be more descriptive * Update src/transformers/models/pvt_v2/modeling_pvt_v2.py Removed outdated code * Moved paper abstract to single line in pvt_v2.md * Added usage tips to pvt_v2.md * Simplified module inits by passing layer_idx * Fixed typing for hidden_act in PvtV2Config * Removed unusued import * Add pvt_v2 to docs/source/en/_toctree.yml * Updated documentation in docs/source/en/model_doc/pvt_v2.md to be more comprehensive. * Updated documentation in docs/source/en/model_doc/pvt_v2.md to be more comprehensive. * Update src/transformers/models/pvt_v2/modeling_pvt_v2.py Move function parameters to single line Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/pvt_v2/modeling_pvt_v2.py Update year of copyright to 2024 Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/pvt_v2/modeling_pvt_v2.py Make code more explicit Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Updated sr_ratio to be more explicit spatial_reduction_ratio * Removed excess type hints in modeling_pvt_v2.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Move params to single line in modeling_pvt_v2.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Removed needless comment in modeling_pvt_v2.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update copyright date in pvt_v2.md Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Moved params to single line in modeling_pvt_v2.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Updated copyright date in configuration_pvt_v2.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Cleaned comments in modeling_pvt_v2.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Renamed spatial_reduction Conv2D operation * Revert "Update src/transformers/models/pvt_v2/modeling_pvt_v2.py " This reverts commit c4a04416dde8f3475ab405d1feb368600e0f8538. * Updated conversion script to reflect module name change * Deprecated reshape_last_stage option in config * Removed unused imports * Code formatting * Fixed outdated decorators on test_inference_fp16 * Added "Copied from" comments in test_modeling_pvt_v2.py * Fixed import listing * Updated model name * Force empty commit for PR refresh * Fixed linting issue * Removed # Copied from comments * Added PVTv2 to README_fr.md * Ran make fix-copies * Replace all FoamoftheSea hub references with OpenGVLab * Fixed out_indices and out_features logic in configuration_pvt_v2.py * Made ImageNet weight conversion verification optional in convert_pvt_v2_to_pytorch.py * Ran code fixup * Fixed order of parent classes in PvtV2Config to fix the to_dict method override --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
njackman-2344 authored
* torchscript and trainer md es translation * corrected md es files and even corrected spelling in en md * made es corrections to trainer.md * deleted entrenamiento... title on yml * placed entrenamiento in right place * translated es chat_templating.md w/ yml addition * requested es changes to md and yml * last es changes to md
-
Dries Verachtert authored
-
Lysandre Debut authored
* Warn against remote tool use * Additional disclaimer * Update docs/source/en/custom_tools.md Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
bytebarde authored
* initial implementation of flash attention for gptj * modify flash attention and overwrite test_flash_attn_2_generate_padding_right * update flash attention support list * remove the copy line in the `CodeGenBlock` * address copy mechanism * Update src/transformers/models/gptj/modeling_gptj.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Add GPTJ attention classes * add expected outputs in the gptj test * Ensure repo consistency with 'make fix-copies' --------- Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
- 12 Mar, 2024 3 commits
-
-
Furkan Akkurt authored
Update quantization.md
-
Dries Verachtert authored
-
Raushan Turganbay authored
fix fuyu docs
-
- 11 Mar, 2024 5 commits
-
-
fzyzcjy authored
* Update add_new_model.md * Update docs/source/en/add_new_model.md Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Amrit Gupta authored
Fixed broken link for Resources -> Token Classification -> Finetuning BERT for named-entity
-
Yitong Huang authored
* add USE_TORCH_XLA env * rename torch_tpu to torch_xla * better is_torch_xla_available; fix some fsdp and performance issues * fix format * fix bug when pjrt_device is cpu * fix bug * fix the deprecation handling --------- Co-authored-by:
anw90 <ang868@gmail.com> Co-authored-by:
wangang.wa <wangang.wa@alibaba-inc.com>
-
j-gc authored
-
Arthur authored
* post merge update * nit * oups
-
- 07 Mar, 2024 1 commit
-
-
Joao Gante authored
-
- 06 Mar, 2024 3 commits
-
-
Joao Gante authored
-
Younes Belkada authored
* add accelerate docs * Apply suggestions from code review Co-authored-by:
Loubna Ben Allal <44069155+loubnabnl@users.noreply.github.com> * Update starcoder2.md * add correct generation --------- Co-authored-by:
Loubna Ben Allal <44069155+loubnabnl@users.noreply.github.com>
-
Younes Belkada authored
* add docs on exllamav2 + AWQ * Update docs/source/en/quantization.md
-
- 05 Mar, 2024 4 commits
-
-
AI4Harmony authored
* Update ko _toctree.yml * Create ko: generation_strategies.md * Apply suggestions from code review Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Apply suggestions from code review Co-authored-by:
Jungnerd <46880056+jungnerd@users.noreply.github.com> * Apply suggestions from code review Co-authored-by:
Jungnerd <46880056+jungnerd@users.noreply.github.com> --------- Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by:
Jungnerd <46880056+jungnerd@users.noreply.github.com>
-
Michael authored
* [i18n-zh] Translate add_new_pipeline.md into Chinese * apply suggestions from Fan-Lin
-
Arthur authored
* initial-commit * start cleaning * small nits * small nits * current updates * add kernels * small refactoring little step * add comments * styling * nit * nits * Style * Small changes * Push dummy mambda simple slow * nit * Use original names * Use original names and remove norm * Updates for inference params * Style nd updates * nits * Match logits * Add a test * Add expected generated text * nits doc, imports and styling * style * oups * dont install kernels, invite users to install the required kernels * let use use the original packages * styling * nits * fix some copieds * update doc * fix-copies * styling done * nits * fix import check * run but wrong cuda ress * mamba CUDA works :) * fix the fast path * config naming nits * conversion script is not required at this stage * finish fixing the fast path: generation make sense now! * nit * Let's start working on the CIs * style * better style * more nits * test nit * quick fix for now * nits * nit * nit * nit * nits * update test rest * fixup * update test * nit * some fixes * nits * update test values * fix styling * nit * support peft * integrations tests require torchg * also add slow markers * styling * chose forward wisely * nits * update tests * fix gradient checkpointing * fixup * nit * fix doc * check copies * fix the docstring * fix some more tests * style * fix beam search * add init schene * update * nit * fix * fixup the doc * fix the doc * fixup * tentative update but slow is no longer good * nit * should we always use float32? * nits * revert wrong changes * res in float32 * cleanup * skip fmt for now * update generation values * update test values running original model * fixup * update tests + rename inference_params to cache_params + make sure training does not use cache_params * small nits * more nits * fix final CIs * style * nit doc * I hope final doc nits * nit * 🫠 * final touch! * fix torch import * Apply suggestions from code review Co-authored-by:
Lysandre Debut <hi@lysand.re> * Apply suggestions from code review * fix fix and fix * fix base model prefix! * nit * Update src/transformers/models/mamba/__init__.py * Update docs/source/en/model_doc/mamba.md Co-authored-by:
Lysandre Debut <hi@lysand.re> * nit --------- Co-authored-by:
Lysandre Debut <hi@lysand.re>
-
Joao Gante authored
-
- 04 Mar, 2024 2 commits
-
-
njackman-2344 authored
* torchscript and trainer md es translation * corrected md es files and even corrected spelling in en md * made es corrections to trainer.md * deleted entrenamiento... title on yml * placed entrenamiento in right place
-
NielsRogge authored
* First draft * More improvements * More improvements * More fixes * Fix copies * More improvements * More fixes * More improvements * Convert checkpoint * More improvements, set up tests * Fix more tests * Add UdopModel * More improvements * Fix equivalence test * More fixes * Redesign model * Extend conversion script * Use real inputs for conversion script * Add image processor * Improve conversion script * Add UdopTokenizer * Add fast tokenizer * Add converter * Update README's * Add processor * Add fully fledged tokenizer * Add fast tokenizer * Use processor in conversion script * Add tokenizer tests * Fix one more test * Fix more tests * Fix tokenizer tests * Enable fast tokenizer tests * Fix more tests * Fix additional_special_tokens of fast tokenizer * Fix tokenizer tests * Fix more tests * Fix equivalence test * Rename image to pixel_values * Rename seg_data to bbox * More renamings * Remove vis_special_token * More improvements * Add docs * Fix copied from * Update slow tokenizer * Update fast tokenizer design * Make text input optional * Add first draft of processor tests * Fix more processor tests * Fix decoder_start_token_id * Fix test_initialization * Add integration test * More improvements * Improve processor, add test * Add more copied from * Add more copied from * Add more copied from * Add more copied from * Remove print statement * Update README and auto mapping * Delete files * Delete another file * Remove code * Fix test * Fix docs * Remove asserts * Add doc tests * Include UDOP in exotic model tests * Add expected tesseract decodings * Add sentencepiece * Use same design as T5 * Add UdopEncoderModel * Add UdopEncoderModel to tests * More fixes * Fix fast tokenizer * Fix one more test * Remove parallelisable attribute * Fix copies * Remove legacy file * Copy from T5Tokenizer * Fix rebase * More fixes, copy from T5 * More fixes * Fix init * Use ArthurZ/udop for tests * Make all model tests pass * Remove UdopForConditionalGeneration from auto 
mapping * Fix more tests * fixups * more fixups * fix the tokenizers * remove un-necessary changes * nits * nits * replace truncate_sequences_boxes with truncate_sequences for fix-copies * nit current path * add a test for input ids * ids that we should get taken from c9f7a32f57440d90ff79890270d376a1cc0acb68 * nits converting * nits * apply ruff * nits * nits * style * fix slow order of addition * fix udop fast range as well * fixup * nits * Add docstrings * Fix gradient checkpointing * Update code examples * Skip tests * Update integration test * Address comment * Make fixup * Remove extra ids from tokenizer * Skip test * Apply suggestions from code review Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update year * Address comment * Address more comments * Address comments * Add copied from * Update CI * Rename script * Update model id * Add AddedToken, skip tests * Update CI * Fix doc tests * Do not use Tesseract for the doc tests * Remove kwargs * Add original inputs * Update casting * Fix doc test * Update question * Update question * Use LayoutLMv3ImageProcessor * Update organization * Improve docs * Update forward signature * Make images optional * Remove deprecated device argument * Add comment, add add_prefix_space * More improvements * Remove kwargs --------- Co-authored-by:
ArthurZucker <arthur.zucker@gmail.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
- 28 Feb, 2024 3 commits
-
-
Michael authored
* [i18n-zh] Sync source/zh/index.md * apply review comments
-
Marc Sun authored
* [CI] Quantization workflow * build dockerfile * fix dockerfile * update self-scheduled.yml * test build dockerfile on push * fix torch install * update to python 3.10 * update aqlm version * uncomment build dockerfile * tests if the scheduler works * fix docker * do not trigger on push again * add additional runs * test again * all good * style * Update .github/workflows/self-scheduled.yml Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * test build dockerfile with torch 2.2.0 * fix extra * clean * revert changes * Revert "revert changes" This reverts commit 4cb52b8822da9d1786a821a33e867e4fcc00d8fd. * revert correct change --------- Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
-
RaymondLi0 authored
* Copy model * changes * misc * fixes * add embed and residual dropout (#30) * misc * remove rms norm and gated MLP * remove copied mentions where its not a copy anymore * remove unused _shape * copied from mistral instead * fix copies * fix copies * add not doctested * fix * fix copyright * Update docs/source/en/model_doc/starcoder2.md Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/starcoder2/configuration_starcoder2.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/starcoder2/configuration_starcoder2.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix doc * revert some changes * add fa2 tests * fix styling nit * fix * push dummy docs --------- Co-authored-by:
Joel Lamy-Poirier <joel.lamy-poirier@servicenow.com> Co-authored-by:
younesbelkada <younesbelkada@gmail.com> Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
- 27 Feb, 2024 2 commits
-
-
Michael authored
* [i18n-zh] Translate fsdp.md into Chinese Signed-off-by:
windsonsea <haifeng.yao@daocloud.io> * apply suggestions from Fan-Lin --------- Signed-off-by:
windsonsea <haifeng.yao@daocloud.io>
-
Merve Noyan authored
* Image Feature Extraction docs * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update image_feature_extraction.md * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Address comments * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by:
Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by:
Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by:
Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by:
Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by:
Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by:
Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by:
Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by:
Maria Khalusova <kafooster@gmail.com> * Update image_feature_extraction.md * Update image_feature_extraction.md Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by:
Maria Khalusova <kafooster@gmail.com>
-
- 26 Feb, 2024 4 commits
-
-
Eduardo Pacheco authored
* First commit * Improvements * More improvements * Converted original checkpoint to HF checkpoint * Fix style * Fixed forward * More improvements * More improvements * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Remove asserts * Remove unnecessary attributes * Changed model name to camel case * Improve forward doc * Improve tests * More improvements * Fix copies * Fix doc * Make SegGptImageProcessor more flexible * Added few-shot test * Fix style * Update READMEs and docs * Update READMEs * Make inputs required * Add SegGptForImageSegmentation * Make tests pass * Rename to out_indicies * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Fixed naming convention * Copying SegGptMlp from modeling_sam.py * Some minor improvements * Remove mlp_ratio * Fix docstrings * Fixed docstring match * Objects defined before use * Storing only patch_size and beta for SegGptLoss * removed _prepare_inputs method * Removed modified from headers * Renamed to output_indicies * Removed unnecessary einsums * Update tests/models/seggpt/test_modeling_seggpt.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/seggpt/test_modeling_seggpt.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/seggpt/test_modeling_seggpt.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fixing issues * Raise error as soon as possible * More fixes * Fix merge * Added palette to SegGptImageProcessor * Fixed typo * Fixed shape typo * Added permute before doing palette to class mapping * Fixed style * Fixed and added tests * Fixed docstrings * Matching SegFormer API for post_processing_semantic_segmentation * Fixed copies * Fixed SegGptImageProcessor to handle both binary and RGB masks * Updated docstrings of SegGptImageProcessor * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/seggpt.md Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/configuration_seggpt.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/convert_seggpt_to_hf.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/seggpt/test_image_processing_seggpt.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/seggpt/test_modeling_seggpt.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Object definitions above & fix style * Renamed output_indices to intermediate_feature_indices * Removed unnecessary check on bool_masked_pos * Loss first in the outputs * Added validation for do_normalize * Improved SegGptImageProcessor and added new tests * Added comment * Added docstrings to SegGptLoss * Reimplemented ensemble condition logic in SegGptEncoder * Update src/transformers/models/seggpt/__init__.py Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/seggpt/convert_seggpt_to_hf.py Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/seggpt/configuration_seggpt.py Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Updated docstrings to use post_process_semantic_segmentation * Fixed typo on docstrings * moved pixel values test to test_image_processing_seggpt * Addressed comments * Update src/transformers/models/seggpt/configuration_seggpt.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/configuration_seggpt.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Updated docstrings for SegGptLoss * Address comments * Added SegGpt example to model docs * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * moved patchify and unpatchify * Rename checkpoint * Renamed intermediate_features to intermediate_hidden_states for consistency * Update src/transformers/models/seggpt/configuration_seggpt.py Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Replaced post_process_masks for post_process_semantic_segmentation in the docs --------- Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by:
Niels <niels.rogge1@gmail.com> Co-authored-by:
Eduardo Pacheco <eduardo.pacheco@limehome.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Michael authored
* [zh] Translate a task: asr.md Signed-off-by:
windsonsea <haifeng.yao@daocloud.io> * apply suggestions from Fan-Lin --------- Signed-off-by:
windsonsea <haifeng.yao@daocloud.io>
-
Ming Xu (徐明) authored
* [Pix2struct] Simplify generation (#22527) * Add model to doc tests * Remove generate and replace by prepare_inputs_for_generation * More fixes * Remove print statements * Update integration tests * Fix generate * Remove model from auto mapping * Use auto processor * Fix integration tests * Fix test * Add inference code snippet * Remove is_encoder_decoder * Update docs * Remove notebook link * Release: v4.28.0 * Revert (for now) the change on `Deta` in #22437 (#22750) fix Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com> * Patch release: v4.28.1 * update zh chat template. * Update docs/source/zh/chat_templating.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/zh/_toctree.yml Co-authored-by:
Michael <haifeng.yao@daocloud.io> * Update docs/source/zh/chat_templating.md Co-authored-by:
Michael <haifeng.yao@daocloud.io> * Update docs/source/zh/chat_templating.md Co-authored-by:
Michael <haifeng.yao@daocloud.io> * Update docs/source/zh/chat_templating.md Co-authored-by:
Michael <haifeng.yao@daocloud.io> * Update docs/source/zh/chat_templating.md Co-authored-by:
Michael <haifeng.yao@daocloud.io> * Update docs/source/zh/chat_templating.md Co-authored-by:
Michael <haifeng.yao@daocloud.io> * Update docs/source/zh/chat_templating.md Co-authored-by:
Michael <haifeng.yao@daocloud.io> --------- Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <Sylvain.gugger@gmail.com> Co-authored-by:
Yih-Dar <2521628+ydshieh@users.noreply.github.com> Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com> Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by:
Michael <haifeng.yao@daocloud.io>
-
Michael authored
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
-