- 04 Nov, 2022 1 commit
-
-
Matt authored
* Fix esm lm head test * make fixup
-
- 03 Nov, 2022 1 commit
-
-
Sanchit Gandhi authored
* [Whisper Tokenizer] Make more user-friendly * use property * make indexing rigorous * small clean-up * tests * skip seq2seq tests * remove multilingual arg * reorder args * collapse to one function Co-authored-by:
ArthurZucker <arthur@huggingface.co> * option to override attributes Co-authored-by:
ArthurZucker <arthur@huggingface.co> * add to docs * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * make comment more clear Co-authored-by:
sgugger <sylvain@huggingface.co> * don't add special tokens in get_decoder_prompt_ids * add test for set_prefix_tokens Co-authored-by:
ArthurZucker <arthur@huggingface.co> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
sgugger <sylvain@huggingface.co>
-
- 02 Nov, 2022 5 commits
-
-
Ben Eyal authored
🚨 🚨 🚨 Fix Issue 15003: SentencePiece Tokenizers Not Adding Special Tokens in `convert_tokens_to_string` (#15775) * Add test for SentencePiece not adding special tokens to strings * Add SentencePieceStringConversionMixin to fix issue 15003 * Fix conversion from tokens to string for most SentencePiece tokenizers Tokenizers fixed: - AlbertTokenizer - BarthezTokenizer - CamembertTokenizer - FNetTokenizer - M2M100Tokenizer - MBart50Tokenizer - PegasusTokenizer - Speech2TextTokenizer * Fix MarianTokenizer, adjust SentencePiece test to accommodate vocab * Fix DebertaV2Tokenizer * Ignore LayoutXLMTokenizer in SentencePiece string conversion test * Run 'make style' and 'make quality' * Clean convert_tokens_to_string test Instead of explicitly ignoring LayoutXLMTokenizer in the test, override the test in LayoutLMTokenizationTest and do nothing in it. * Remove commented out code * Improve robustness of convert_tokens_to_string test Instead of comparing lengths of re-tokenized text and input_ids, check that converting all special tokens to string yields a string with all special tokens. * Inline and remove SentencePieceStringConversionMixin The convert_tokens_to_string method is now implemented in each relevant SentencePiece tokenizer.
* Run 'make style' and 'make quality' * Revert removal of space in convert_tokens_to_string * Remove redundant import * Revert test text to original * Uncomment the lowercasing of the reverse_text variable * Mimic Rust tokenizer behavior for tokenizers - Albert - Barthez - Camembert - MBart50 - T5 * Fix accidentally skipping test in wrong tokenizer * Add test for equivalent Rust and slow tokenizer behavior * Override _decode in BigBirdTokenizer to mimic Rust behavior * Override _decode in FNetTokenizer to mimic Rust behavior * Override _decode in XLNetTokenizer to mimic Rust behavior * Remove unused 're' import * Update DebertaV2Tokenizer to mimic Rust tokenizer * Deberta tokenizer now behaves like Albert and its `convert_tokens_to_string` is not tested. * Ignore problematic tests in Deberta V2 * Add comment on why the Deberta V2 tests are skipped
-
Yih-Dar authored
* part 1 * part 2 * part 3 * fix * For CANINE * For ESMFold Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
amyeroberts authored
-
Yih-Dar authored
Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
amyeroberts authored
* Add CLIP image processor * Crop size as dict too * Update warning * Actually use logger this time * Normalize doesn't change dtype of input * Add perceiver image processor * Tidy up * Add DPT image processor * Add Vilt image processor * Tidy up * Add poolformer image processor * Tidy up * Add LayoutLM v2 and v3 image processors * Tidy up * Add Flava image processor * Tidy up * Add deit image processor * Tidy up * Add ConvNext image processor * Tidy up * Add levit image processor * Add segformer image processor * Add in post processing * Fix up * Add ImageGPT image processor * Fixup * Add mobilevit image processor * Tidy up * Add postprocessing * Fixup * Add VideoMAE image processor * Tidy up * Add ImageGPT image processor * Fixup * Add ViT image processor * Tidy up * Add beit image processor * Add mobilevit image processor * Tidy up * Add postprocessing * Fixup * Fix up * Fix flava and remove tree module * Fix image classification pipeline failing tests * Update feature extractor in trainer scripts * Update pad_if_smaller to accept tuple and int size * Update for image segmentation pipeline * Update src/transformers/models/perceiver/image_processing_perceiver.py Co-authored-by:
Alara Dirik <8944735+alaradirik@users.noreply.github.com> * Update src/transformers/image_processing_utils.py Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/beit/image_processing_beit.py Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * PR comments - docstrings; remove accidentally added resize; var names * Update docstrings * Add exception if size is not in the right format * Fix exception check * Fix up * Use shortest_edge in tuple in script Co-authored-by:
Alara Dirik <8944735+alaradirik@users.noreply.github.com> Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com>
-
- 01 Nov, 2022 2 commits
-
-
Joao Gante authored
* Use beam search functionality; Add extra outputs and test * Add full tests for contrastive search * Add error message on unconventional cache format
-
Matt authored
* initial commit * First draft that gets outputs without crashing! * Add all the ported openfold dependencies * testing * Restructure config files for ESMFold * Debugging to find output discrepancies * Mainly style * Make model runnable without extra deps * Remove utils and merge them to the modeling file * Use correct gelu and remove some debug prints * More cleanup * Update esm docs * Update conversion script to support ESMFold properly * Port some top-level changes from ESMFold repo * Expand EsmFold docstrings * Make attention_mask optional (default to all 1s) * Add inference test for ESMFold * Use config and not n kwargs * Add modeling output class * Remove einops * Remove chunking in ESM FFN * Update tests for ESMFold * Quality * REpo consistency * Remove tree dependency from ESMFold * make fixup * Add an error in case my structure map function breaks later * Remove needless code * Stop auto-casting the LM to float16 so CPU tests pass * Stop auto-casting the LM to float16 so CPU tests pass * Final test updates * Split test file * Copyright and quality * Unpin PyTorch to see built doc * Fix config file to_dict() method * Add some docstrings to the output * Skip TF checkpoint tests for ESM until we reupload those * make fixup * More docstrings * Unpin to get even with main * Flag example to write Co-authored-by:Sylvain Gugger <Sylvain.gugger@gmail.com>
-
- 31 Oct, 2022 2 commits
-
-
NielsRogge authored
Co-authored-by:Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
-
NielsRogge authored
* Add postprocessing methods * Update docs * Add fix * Add test * Add test for deformable detr postprocessing * Add post processing methods for segmentation * Update code examples * Add post_process to make the pipeline work * Apply updates Co-authored-by:Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
-
- 28 Oct, 2022 1 commit
-
-
donguk.lim authored
* Support segformer fx * Add fx_compatible attribute to test_modeling_segformer.py * Update glpn model (fx support) glpn model was copied from segformer. * Update utils/fx.py | add semantic-segmentation for SegformerForSemanticSegmentation model * Fix minor import order(isort) * Add random input generation for segformer fx Co-authored-by:noelbird <lduldu00228@gmail.com>
-
- 27 Oct, 2022 2 commits
-
-
Antonio Carlos Falcão Petri authored
* Fix tests when running on GPU * Fix tests that require mp.set_start_method
-
Yih-Dar authored
* Add pegasus_x * ViTMSN * ESM Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
- 25 Oct, 2022 2 commits
-
-
Lysandre Debut authored
* Support for Vilt in v1.9 * Skip if not higher or equal than 1.10 * Move test :) * I am bad at python
-
Guillaume Klein authored
-
- 24 Oct, 2022 1 commit
-
-
Yih-Dar authored
* Update expected values * fix style Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
- 21 Oct, 2022 4 commits
-
-
Yih-Dar authored
* Run some TF Whisper tests in subprocesses to avoid GPU OOM Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
Alara Dirik authored
* Fix panoptic segmentation and pipeline * Update ImageSegmentationPipeline tests and reenable test_small_model_pt * Resolve backward compatibility issues
-
Hao Wang authored
* support sentencepiece for bertjapanesetokenizer * add test vocab file for sentencepiece, bertjapanesetokenizer * make BasicTokenizer be identical to transformers.models.bert.tokenization_bert.BasicTokenizer * fix missing of \n in comment * fix init argument missing in tests * make spm_file be optional, exclude spiece.model from tests/fixtures, and add description comments * make comment length less than 119 * apply doc style check
-
Yih-Dar authored
* First step of PT->TF for composite models * Update the tests * For VisionEncoderDecoderModel * Fix * Fix * Add comment * Fix * clean up import * Save memory * For (TF)EncoderDecoderModel * For (TF)EncoderDecoderModel Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
- 18 Oct, 2022 5 commits
-
-
David Yang authored
* Clean up deprecation warnings Notes: Changed some strings in tests to raw strings, which will change the literal content of the strings as they are fed into whatever machine handles them. Test cases for past in the past/past_key_values switch changed/removed due to warning of impending removal * Add PILImageResampling abstraction for PIL.Image.Resampling
-
NielsRogge authored
* First draft * Add conversion script * Make conversion work * Upload checkpoints * Add final fixes * Revert changes of conditional and deformable detr * Fix toctree, add and remove copied from * Use model type * Improve docs * Improve code example * Update copies * Add copied formt * Don't update conditional detr * Don't update deformable detr
-
Antonio Carlos Falcão Petri authored
* [Wav2Vec2] Allow user-managed Pool in Wav2Vec2ProcessorWithLM.batch_decode * [Wav2Vec2] Add user-managed LM's pool tests and usage examples * Improve styling Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * [Wav2Vec2] Fix hyperlink references Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
NielsRogge authored
* Improve DETR models * Fix Deformable DETR loss and matcher * Fixup * Fix integration tests * Improve variable names * Apply suggestion * Fix copies * Fix DeformableDetrLoss * Make Conditional DETR copy from Deformable DETR * Copy from deformable detr's hungarian matcher * Fix bug
-
Arthur authored
-
- 17 Oct, 2022 1 commit
-
-
Matt authored
* Partial TF port for ESM model * Add ESM-TF tests * Add the various imports for TF-ESM * TF weight conversion almost ready * Stop ignoring the decoder weights in PT * Add tests and lots of fixes * fix-copies * Fix imports, add model docs * Add get_vocab() to tokenizer * Fix vocab links for pretrained files * Allow multiple inputs with a sep * Use EOS as SEP token because ESM vocab lacks SEP * Correctly return special tokens mask from ESM tokenizer * make fixup * Stop testing unsupported embedding resizing * Handle TF bias correctly * Skip all models with slow tokenizers in the token classification test * Fixing the batch/unbatcher of pipelines to accommodate the `None` being passed around. * Fixing pipeline bug caused by slow tokenizer being different. * Update src/transformers/models/esm/modeling_tf_esm.py Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/esm/modeling_tf_esm.py Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/esm/modeling_tf_esm.py Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> * Update set_input_embeddings and the copyright notices Co-authored-by:
Your Name <you@example.com> Co-authored-by:
Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com>
-
- 14 Oct, 2022 3 commits
-
-
Pi Esposito authored
* add support for non fast tf bert tokenizer * add tests for non fast tf bert tokenizer * fix fast bert tf tokenizer flag * double tokenizers list on tf tokenizers test to avoid breaking zip on test output equivalence * reformat code with black to comply with code quality checks * trigger ci
-
Yih-Dar authored
Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
Sanchit Gandhi authored
* [Whisper] Don't return attention mask in feat extractor * remove attention mask from test * fix failing tests * quality
-
- 13 Oct, 2022 1 commit
-
-
Sanchit Gandhi authored
* [Whisper] Freeze params of encoder * add tests
-
- 12 Oct, 2022 4 commits
-
-
Yih-Dar authored
* return None to avoid recursive call * Give error * Give error * Add test * More tests * Quality Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
Sylvain Gugger authored
* Add a decorator for flaky tests * Quality * Don't break the rest * Address review comments * Fix test name * Fix typo and print to stderr
-
NielsRogge authored
* Fix XCLIP doc tests * Add model to doc test list * Fix tests
-
NielsRogge authored
* First draft * Fix more things * Improve more things * Remove some head models * Fix more things * Add missing layers * Remove tokenizer * Fix more things * Fix copied from statements * Make all tests pass * Remove print statements * Remove files * Fix README and docs * Add integration test and fix organization * Add tips * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Make tests faster, improve docs * Fix doc tests * Add model to toctree * Add docs * Add note about creating new checkpoint * Remove is_decoder * Make tests smaller, add docs Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 11 Oct, 2022 5 commits
-
-
Mathieu Jouffroy authored
* implemented TFCvtModel and TFCvtForImageClassification and modified relevant files, added an exception in convert_tf_weight_name_to_pt_weight_name, added quick testing file to compare with pytorch model * added docstring + testing file in transformers testing suite * added test in testing file, modified docs to pass repo-consistency, passed formatting test * refactoring + passing all test * small refacto, removing unwanted comments * improved testing config * corrected import error * modified acces to pretrained model archive list, to pass tf_test * corrected import structure in init files * modified testing for keras_fit with cpu * correcting PR issues + Refactoring * Refactoring : improving readability and reducing the number of permutations * corrected momentum value + cls_token initialization * removed from_pt as weights were added to the hub * Update tests/models/cvt/test_modeling_tf_cvt.py Co-authored-by:Joao Gante <joaofranciscocardosogante@gmail.com>
-
David Yang authored
* Make cpm tokenization independent of xlnet * Make bert japanese tokenization independent of bert
-
Joao Gante authored
🚨 🚨 🚨 TF: Remove `TFWrappedEmbeddings` (breaking: TF embedding initialization updated for encoder-decoder models) (#19263) * added test * correct embedding init * some changes in blenderbot (incomplete) * update blenderbot (diff to be used as reference) * update blenderbot_small * update LED * update marian * update T5 and remove TFWrappedEmbeddings * nullcontext() -> ContextManagers() * fix embedding init
-
Yih-Dar authored
Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
* Fix TFGroupViT CI Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-