- 05 Feb, 2024 3 commits
-
-
amyeroberts authored
* Draft pipeline * Fixup * Fix docstrings * Update doctest * Update pipeline_model_mapping * Update docstring * Update tests * Update src/transformers/pipelines/image_feature_extraction.py Co-authored-by:
Omar Sanseviero <osanseviero@gmail.com> * Fix docstrings - review comments * Remove pipeline mapping for composite vision models * Add to pipeline tests * Remove for flava (multimodal) * safe pil import * Add requirements for pipeline run * Account for super slow efficientnet * Review comments * Fix tests * Swap order of kwargs * Use build_pipeline_init_args * Add back FE pipeline for Vilt * Include image_processor_kwargs in docstring * Mark test as flaky * Update TODO * Update tests/pipelines/test_pipelines_image_feature_extraction.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Add license header --------- Co-authored-by:
Omar Sanseviero <osanseviero@gmail.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Yoach Lacombe authored
* Correct wav2vec2-bert inputs_to_logits_ratio * correct ratio * correct ratio, clean asr pipeline * refactor on one line
-
Nicolas Patry authored
* [WIP] Hard error when ignoring tensors. * Better selection/error when saving a checkpoint. - Find all names we should normally drop (those are in the transformers config) - Find all disjoint tensors (for those we can safely trigger a copy to get rid of the sharing before saving) - Clone those disjoint tensors getting rid of the issue - Find all identical names (those should be declared in the config but we try to find them all anyway.) - For all identical names: - If they are in the config, just ignore them everything is fine - If they are not, warn about them. - For all remainder tensors which are shared yet neither identical NOR disjoint. raise a hard error. * Adding a failing test on `main` that passes here. * We don't need to keep the subfolder logic in this test. * Apply suggestions from code review Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
- 02 Feb, 2024 4 commits
-
-
amyeroberts authored
Mark test as flaky
-
Yih-Dar authored
* fix / skip some tests before we can switch to torch 2.2 * style --------- Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
Juri Ganitkevitch authored
* Add missing None check for hf_quantizer * Add test, fix logic. * make style * Switch test model to Mistral * Comment * Update tests/test_modeling_utils.py --------- Co-authored-by:Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
-
Klaus Hipp authored
* Fix typos and grammar mistakes in docs and examples * Fix typos in docstrings and comments * Fix spelling of `tokenizer` in model tests * Remove erroneous spaces in decorators * Remove extra spaces in Markdown link texts
-
- 01 Feb, 2024 2 commits
-
-
fxmarty authored
* fix symbolic_trace with kv cache * comment & better test
-
JB (Don) authored
* Adding [T5/MT5/UMT5]ForTokenClassification * Add auto mappings for T5ForTokenClassification and variants * Adding ForTokenClassification to the list of models * Adding attention_mask param to the T5ForTokenClassification test * Remove outdated comment in test * Adding EncoderOnly and Token Classification tests for MT5 and UMT5 * Fix typo in umt5 string * Add tests for all the existing MT5 models * Fix wrong comment in dependency_versions_table * Reverting change to common test for _keys_to_ignore_on_load_missing The test is correctly picking up redundant keys in _keys_to_ignore_on_load_missing. * Removing _keys_to_ignore_on_missing from MT5 since the key is not used in the model * Add fix-copies to MT5ModelTest
-
- 31 Jan, 2024 4 commits
-
-
Joao Gante authored
DeepSpeed: hardcode `torch.arange` dtype on `float` usage to avoid incorrect initialization (#28760)
-
Kian Sierra McGettigan authored
* direct copy from llama work * mistral modules forward pass working * flax mistral forward pass with sliding window * added tests * added layer collection approach * Revert "added layer collection approach" This reverts commit 0e2905bf2236ec323163fc1a9f0c016b21aa8b8f. * Revert "Revert "added layer collection approach"" This reverts commit fb17b6187ac5d16da7c461e1130514dc3d137a43. * fixed attention outputs * added mistral to init and auto * fixed import name * fixed layernorm weight dtype * freeze initialized weights * make sure conversion consideres bfloat16 * added backend * added docstrings * added cache * fixed sliding window causal mask * passes cache tests * passed all tests * applied make style * removed commented out code * applied fix-copies ignored other model changes * applied make fix-copies * removed unused functions * passed generation integration test * slow tests pass * fixed slow tests * changed default dtype from jax.numpy.float32 to float32 for docstring check * skip cache test for FlaxMistralForSequenceClassification since if pad_token_id in input_ids it doesn't score previous input_ids * updated checkpoint since from_pt not included * applied black style * removed unused args * Applied styling and fixup * changed checkpoint for doc back * fixed rf after adding it to hf hub * Add dummy ckpt * applied styling * added tokenizer to new ckpt * fixed slice format * fix init and slice * changed ref for placeholder TODO * added copies from Llama * applied styling * applied fix-copies * fixed docs * update weight dtype reconversion for sharded weights * removed Nullable input ids * Removed unnecessary output attentions in Module * added embedding weight initialziation * removed unused past_key_values * fixed deterministic * Fixed RMS Norm and added copied from * removed input_embeds * applied make style * removed nullable input ids from sequence classification model * added copied from GPTJ * added copied from Llama on FlaxMistralDecoderLayer * added copied from to FlaxMistralPreTrainedModel methods * fix test deprecation warning * freeze gpt neox random_params and fix copies * applied make style * fixed doc issue * skipped docstring test to allign # copied from * applied make style * removed FlaxMistralForSequenceClassification * removed unused padding_idx * removed more sequence classification * removed sequence classification * applied styling and consistency * added copied from in tests * removed sequence classification test logic * applied styling * applied make style * removed freeze and fixed copies * undo test change * changed repeat_kv to tile * fixed to key value groups * updated copyright year * split casual_mask * empty to rerun failed pt_flax_equivalence test FlaxWav2Vec2ModelTest * went back to 2023 for tests_pr_documentation_tests * went back to 2024 * changed tile to repeat * applied make style * empty for retry on Wav2Vec2
-
Patrick von Platen authored
* up * Fix more * Correct more * Fix more tests * fix fast tests * Fix more * fix more * push all files * finish all * make style * Fix timestamp wrap * make style * make style * up * up * up * Fix lang detection behavior * Fix lang detection behavior * Add lang detection test * Fix lang detection behavior * make style * Update src/transformers/models/whisper/generation_whisper.py Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * better error message * make style tests * add warning --------- Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
-
tom-p-reichel authored
* test that tied output embeddings aren't initialized on load * don't initialize the output embeddings if we're going to tie them to the input embeddings
-
- 30 Jan, 2024 4 commits
-
-
Matt authored
* Port core files + ESM (because ESM code is odd) * Search-replace in modelling code * Fix up transfo_xl as well * Fix other core files + tests (still need to add correct import to tests) * Fix cookiecutter * make fixup, fix imports in some more core files * Auto-add imports to tests * Cleanup, add imports to sagemaker tests * Use correct exception for importing tf_keras * Fixes in modeling_tf_utils * make fixup * Correct version parsing code * Ensure the pipeline tests correctly revert to float32 after each test * Ensure the pipeline tests correctly revert to float32 after each test * More tf.keras -> keras * Add dtype cast * Better imports of tf_keras * Add a cast for tf.assign, just in case * Fix callback imports
-
amyeroberts authored
* Enable instantiating model with pretrained backbone weights * Remove doc updates until changes made in modeling code * Use load_backbone instead * Add use_timm_backbone to the model configs * Add missing imports and arguments * Update docstrings * Make sure test is properly configured * Include recent DPT updates
-
Poedator authored
* squashed earlier commits for easier rebase * rm rebase leftovers * 4bit save enabled @quantizers * TMP gptq test use exllama * fix AwqConfigTest::test_wrong_backend for A100 * quantizers AWQ fixes * _load_pretrained_model low_cpu_mem_usage branch * quantizers style * remove require_low_cpu_mem_usage attr * rm dtype arg from process_model_before_weight_loading * rm config_origin from Q-config * rm inspect from q_config * fixed docstrings in QuantizationConfigParser * logger.warning fix * mv is_loaded_in_4(8)bit to BnbHFQuantizer * is_accelerate_available error msg fix in quantizer * split is_model_trainable in bnb quantizer class * rm llm_int8_skip_modules as separate var in Q * Q rm todo * fwd ref to HFQuantizer in type hint * rm note re optimum.gptq.GPTQQuantizer * quantization_config in __init__ simplified * replaced NonImplemented with create_quantized_param * rm load_in_4/8_bit deprecation warning * QuantizationConfigParser refactoring * awq-related minor changes * awq-related changes * awq config.modules_to_not_convert * raise error if no q-method in q-config in args * minor cleanup * awq quantizer docstring * combine common parts in bnb process_model_before_weight_loading * revert test_gptq * .process_model_ cleanup * restore dict config warning * removed typevars in quantizers.py * cleanup post-rebase 16 jan * QuantizationConfigParser classmethod refactor * rework of handling of unexpected aux elements of bnb weights * moved q-related stuff from save_pretrained to quantizers * refactor v1 * more changes * fix some tests * remove it from main init * ooops * Apply suggestions from code review Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> * fix awq issues * fix * fix * fix * fix * fix * fix * add docs * Apply suggestions from code review Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Apply suggestions from code review Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/en/hf_quantizer.md * address comments * fix * fixup * Update src/transformers/modeling_utils.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/modeling_utils.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * address final comment * update * Update src/transformers/quantizers/base.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/quantizers/auto.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix * add kwargs update * fixup * add `optimum_quantizer` attribute * oops * rm unneeded file * fix doctests --------- Co-authored-by:
younesbelkada <younesbelkada@gmail.com> Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Omar Sanseviero authored
* Update quantization_config.py * Style * Protect from setting directly * add tests * Update tests/quantization/bnb/test_4bit.py Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
-
- 29 Jan, 2024 3 commits
-
-
amyeroberts authored
* Make test_constrained_beam_search_generate as flaky * Update tests/generation/test_utils.py
-
Wesley Gifford authored
*
馃悰 fix .max bug * remove prediction_length from regression output dimensions * fix parameter names, fix output names, update tests * ensure shape for PatchTST * ensure output shape for PatchTSMixer * update model, batch, and expected for regression distribution test * update test expected Signed-off-by:Wesley M. Gifford <wmgifford@us.ibm.com> * Update tests/models/patchtst/test_modeling_patchtst.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/patchtst/test_modeling_patchtst.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/patchtst/test_modeling_patchtst.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/patchtsmixer/modeling_patchtsmixer.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * standardize on patch_length Signed-off-by:
Wesley M. Gifford <wmgifford@us.ibm.com> * Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Make arguments more explicit Signed-off-by:
Wesley M. Gifford <wmgifford@us.ibm.com> * adjust prepared inputs Signed-off-by:
Wesley M. Gifford <wmgifford@us.ibm.com> --------- Signed-off-by:
Wesley M. Gifford <wmgifford@us.ibm.com> Co-authored-by:
Wesley M. Gifford <wmgifford@us.ibm.com> Co-authored-by:
Kashif Rasul <kashif.rasul@gmail.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Angela Yi authored
* Add serialized type name to pytrees * Modify context * add serde test
-
- 26 Jan, 2024 1 commit
-
-
Sanchit Gandhi authored
-
- 25 Jan, 2024 1 commit
-
-
NielsRogge authored
* First draft * More improvements * More improvements * More improvements * More improvements * Add docs * Remove file * Add copied from * Address comments * Address comments * Address comments * Fix style * Update docs * Convert all checkpoints, add integration test * Rename checkpoints * Add pretrained backbone attributes * Fix default config * Address comment * Add figure to docs * Fix bug thanks to @xenova * Update conversion script * Fix integration test
-
- 24 Jan, 2024 1 commit
-
-
Khai Mai authored
* fix the function load_balancing_loss_func in Mixtral_Moe to include attention_mask * format code using black and ruff * skip computing mask if attention_mask=None * add tests for load balancing loss Mixtral-Moe * fix assert loss is different in mixtral_test * fix pad_leng * use assertNotAlmostEqual and print to debug * remove print for debug * minor updates * reduce rtol and atol
-
- 23 Jan, 2024 2 commits
-
-
amyeroberts authored
* Enable instantiating model with pretrained backbone weights * Update tests so backbone checkpoint isn't passed in * Remove doc updates until changes made in modeling code * Clarify pretrained import * Update configs - docs and validation check * Update src/transformers/utils/backbone_utils.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Clarify exception message * Update config init in tests * Add test for when use_timm_backbone=True * Small test updates --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Lysandre Debut authored
Enable safetensors conversion from PyTorch to other frameworks without the torch requirement (#27599) * Initial commit * Requirements & tests * Tests * Tests * Rogue import * Rogue torch import * Cleanup * Apply suggestions from code review Co-authored-by:
Nicolas Patry <patry.nicolas@protonmail.com> * bfloat16 management * Sanchit's comments * Import shield * apply suggestions from code review * correct bf16 * rebase --------- Co-authored-by:
Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by:
sanchit-gandhi <sanchit@huggingface.co>
-
- 22 Jan, 2024 1 commit
-
-
Yih-Dar authored
* avoid root logger's level being changed --------- Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
- 21 Jan, 2024 1 commit
-
-
Arthur authored
* fix dtype issue * add a test * update copied from mentions * nits * fixup * fix copies * Apply suggestions from code review
-
- 19 Jan, 2024 7 commits
-
-
Sangbum Daniel Choi authored
* [DETA] fix freeze/unfreeze function * Update src/transformers/models/deta/modeling_deta.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/deta/modeling_deta.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * add freeze/unfreeze test case in DETA * fix type * fix typo 2 * fix : enable aux and enc loss in training pipeline * Add unsynced variables from original DETA for training * modification for passing CI test * make style * make fix * manual make fix * change deta_modeling_test of configuration 'two_stage' default to TRUE and minor change of dist checking * remove print * divide configuration in DetaModel and DetaForObjectDetection * image smaller size than 224 will give topk error * pred_boxes and logits should be equivalent to two_stage_num_proposals * add missing part in DetaConfig * Update src/transformers/models/deta/modeling_deta.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add docstring in configure and prettify TO DO part * change distribute related code to accelerate * Update src/transformers/models/deta/configuration_deta.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/deta/test_modeling_deta.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * protect importing accelerate * change variable name to specific value * wrong import * fix aux_loss in conditional_detr * add test aux_loss * add aux_loss test in deta and table_transformer * fix yolos since it doesn't have auxiliary function * fix maskformer auxiliary_loss related code * make style * change param 'auxiliary_loss' to 'use_auxiliary_loss' * change param 'auxiliary_loss' to 'use_auxiliary_loss' in tests * make style & fix-copies, also revert yolos related parameter * revert variable name 'use_auxiliary_loss' to 'auxiliary_loss' due to DetrConfig * revert variable name in yolos * revert maskformer * add aux_loss test in maskformer * make style * Update src/transformers/models/yolos/configuration_yolos.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Ofir Zafrir authored
-
Matt authored
* Allow non-special tokens to be added * Add test, fix token adding code * Revert changes to id_to_token and token_to_id * Update the ESM tokenizer to be a bit more standardized * Update src/transformers/models/esm/tokenization_esm.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Patrick von Platen authored
* finalize * make fix copies whisper * [Tests] Make sure that we don't run tests mulitple times * Update src/transformers/models/whisper/modeling_whisper.py * [Tests] Make sure that we don't run tests mulitple times * fix more * improve * improve * improve further * improve more * improve * fix more * git commit and git push * fix more * fix more * fix more * New try * Fix more whisper stuff * Improve * correct more * correct more * correct more * Fix some tests * Add more tests * correct more * correct more * correct more * push * correct more * Fix more * Better * without dec mask * correct more * clean * save intermediate * Fix more * Fix VAD for large-v2 * Save new * Correct more * make cleaner * correct tests * correct src * Finish * Fix more * Fix more * finish * Fix edge cases * fix return_dict_in_generate * fix all tests * make style * add docstrings * add docstrings * Fix logit processor * make style * fix pipeline test * fix more style * Apply suggestions from code review * apply feedback Sanchit * correct more * Apply suggestions from code review Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Apply suggestions from code review Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * correct more * correct more * correct more * Fix staticmethod * correct more * fix * fix slow tests * make style * fix tokenizer test * fix tokenizer test * Apply suggestions from code review Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * finish * finish * revert kwargs change --------- Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Saibo-creator authored
-
Yoach Lacombe authored
* generalize asr pipeline to fbank models * change w2v2 pipeline output * Update test_pipelines_automatic_speech_recognition.py
-
Yih-Dar authored
* not save if empty * fix * fix * fix * fix * fix --------- Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
- 18 Jan, 2024 5 commits
-
-
Sanchit Gandhi authored
* fix * tests * fix test
-
Yih-Dar authored
* inside with LoggingLevel * remove is_flaky --------- Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
Yoach Lacombe authored
* first commit * correct default value non causal * update config and modeling code * update converting checkpoint * clean modeling and fix tests * make style * add new config parameters to docstring * fix copied from statements * Apply suggestions from code review Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * make position_embeddings_type docstrings clearer * clean converting script * remove function not used * clean modeling file * apply suggestion for test file + add convert script to not_doctested * modify tests according to review - cleaner logic and more tests * Apply nit suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add checker of valid position embeddings type * instantiate new layer norm layer with the right eps * fix freeze_feature_encoder since it can be None in some cases * add test same output in convert script * restore wav2vec2conformer and add new model * create processor and FE + clean * add new model code * fix convert script and set default config parameters * correct model id paths * make style * make fix-copies and cleaning files * fix copied from statements * complete .md and fixe copies * clean convert script argument defaults * fix config parameters docstrings * fix config docstring * add copied from and enrich FE tests * fix copied from and repo-consistency * add autotokenizer * make test input length shorter and change docstring code * fix docstrings and copied from * add add_adapter to ASR training example * make testing of adapters more robust * adapt to multi adapter layers * refactor input_values->input_features and remove w2v2-bert feature extractor * remove pretraining model * remove depreciated features and useless lines * add copied from and ignore statements to modeling tests * remove pretraining model #2 * change import in convert script * change default in convert script * update readme and remove useless line * Update tests/models/wav2vec2_bert/test_processor_wav2vec2_bert.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * refactor BERT to Bert for consistency * remove useless ignore copy statement * add persistent to buffer in rotary * add eps in LayerNorm init and remove copied from * add adapter activation parameters and add copied from statements * Fix copied statements and add unitest.skip reasons * add copied statement in test_processor * refactor processor * make style * replace numpy random by torch rand * remove expected output CTC * improve converting script with processor class * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * remove gumbel class * remove tests related to previously deleted class * Update src/transformers/models/wav2vec2_bert/configuration_wav2vec2_bert.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * correct typos * remove uused parameters * update processor to takes both text and audio * update checkpoints * update expected output and add ctc expected output * add label_attention_mask * replace pt with np in processor tests * fix typo * revert to behaviour with labels_attention_mask --------- Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Arthur authored
* fix * last attempt * current work * fix forward compatibility * save all special tokens * current state * revert additional changes * updates * remove tokenizer.model * add a test and the fix * nit * revert one more break * fix typefield issue * quality * more tests * fix fields for FC * more nits? * new additional changes * how * some updates * the fix * where do we stand * nits * nits * revert unrelated changes * nits nits nits * styling * don't break llama just yet * revert llama changes * safe arg check * fixup * Add a test for T5 * Necessary changes * Tests passing, added tokens need to not be normalized. If the added tokens are normalized, it will the stripping which seems to be unwanted for a normal functioning * Add even more tests, when normalization is set to True (which does not work
馃槗 ) * Add even more tests, when normalization is set to True (which does not work馃槗 ) * Update to main * nits * fmt * more and more test * comments * revert change as tests are failing * make the test more readble * nits * refactor the test * nit * updates * simplify * style * style * style convert slow * Update src/transformers/convert_slow_tokenizer.py -
Yih-Dar authored
* save processor * Update tests/models/auto/test_processor_auto.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/test_processing_common.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix --------- Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
- 17 Jan, 2024 1 commit
-
-
fxmarty authored
* skip bf16 test if not supported by device * fix * fix bis * use is_torch_bf16_available_on_device * use is_torch_fp16_available_on_device * fix & use public llama * use 1b model * fix flacky test --------- Co-authored-by:Your Name <you@example.com>
-