- 07 Nov, 2022 11 commits
-
-
Tom Aarsen authored
-
Steven Liu authored
* add new terms
* apply review
-
Tom Aarsen authored
* docs: Fixed variables in f-strings
* Replace unknown `block` with known `block_type` in ValueError
* Add missing torch import in docs code block

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
TAGAMI Yukihiro authored
-
Tom Aarsen authored
* docs: Fix typo in ONNX parser help: 'tolerence' => 'tolerance'
* docs: Resolve many typos in the English docs (typos found via 'codespell ./docs/source/en')
-
Tom Aarsen authored
With https://github.com/TimDettmers/bitsandbytes, which is by the same author and is still being updated
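As context for this pointer, a minimal sketch of the 8-bit loading that bitsandbytes enables in transformers; the checkpoint name is illustrative, and `bitsandbytes`, `accelerate`, and a CUDA GPU are assumed to be available:

```python
# Illustrative only: load a causal LM with bitsandbytes 8-bit quantization.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-560m",  # example checkpoint, any causal LM works
    device_map="auto",        # let accelerate place weights on the GPU
    load_in_8bit=True,        # quantize linear layers via bitsandbytes
)
```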
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Saad Mahmud authored
* swap RobertaConfig with PretrainedConfig
* Add camembert specific attributes
* Add PretrainedConfig docstring
* Add arguments docstring
* Change CamembertConfig docstring definition
* Fix typo CamembertConfig -> CamembertModel
* Fix typo BertModel -> CamembertModel
* Fix style of CamembertConfig
-
Saad Mahmud authored
* Add example docstring for DPRConfig
* Add DPRConfig to documentation_tests
-
Joao Gante authored
* Add contrastive search
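A short sketch of how contrastive search is reached through `generate`: passing `penalty_alpha > 0` together with `top_k > 1` selects it. The checkpoint and hyperparameter values below are illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("DeepMind Company is", return_tensors="pt")
# penalty_alpha trades model confidence against a degeneration penalty
outputs = model.generate(**inputs, penalty_alpha=0.6, top_k=4, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```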
-
- 04 Nov, 2022 14 commits
-
-
Christopher Akiki authored
-
Christopher Akiki authored
-
amyeroberts authored
* Update defaults and logic to match old FE
* Use docker run rest values
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
* POC
* For more CLIP-like models

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Jordan Clive authored
Update documentation on seq2seq models with absolute positional embeddings, to be in line with the Tips section for BERT and GPT2 (#20068)

Co-authored-by: jordiclive <jordiclive19@imperial.ac.uk>
-
Matt authored
* Update READMEs for ESMFold and add notebooks
* Fix PyCharm formatting
* make fix-copies
-
H. Jhoo authored
-
NielsRogge authored
* Fix Swin
* Remove file
* Update code snippet
* Add copied from to maskformer
* Fix docstring
* Add whole name to replace

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
-
amyeroberts authored
* Poolformer image processor defaults to previous FE
* Remove unnecessary math.floor
-
Sanchit Gandhi authored
-
Sourab Mangrulkar authored
-
bhuang authored
-
Matt authored
* Fix esm lm head test
* make fixup
-
- 03 Nov, 2022 11 commits
-
-
Patrick Deutschmann authored
* Speed up TF postprocessing by converting to numpy beforehand
* Fix bug that was triggered when offset_mapping was None

Co-authored-by: Patrick Deutschmann <patrick.deutschmann@dedalus.com>
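The gist of the speed-up, as a hedged sketch with hypothetical names (the real change lives in the pipeline's TF postprocessing code):

```python
import numpy as np

def postprocess(logits_tf):
    # One bulk EagerTensor -> ndarray conversion up front is much cheaper
    # than indexing into a TF tensor element by element in a Python loop.
    logits = logits_tf.numpy()
    return np.argmax(logits, axis=-1)
```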
-
Sylvain Gugger authored
* Only resize embeddings when necessary
* Add comment
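The guard described above, sketched against the public API (this mirrors the pattern used in the example scripts; the checkpoint is illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Only reallocate the embedding matrix when the sizes actually differ.
embedding_size = model.get_input_embeddings().weight.shape[0]
if len(tokenizer) > embedding_size:
    model.resize_token_embeddings(len(tokenizer))
```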
-
Michael Benayoun authored
-
Matt authored
* Update ESM conversion script for ESMFold
* Fix bug in ESMFold example
* make fixup and move restypes to one line
-
Wang, Yi authored
Fix jit trace error when the model forward input sequence is not aligned with the jit.trace tuple input sequence; update related doc (#19891)
* fix jit trace error for classification use case, update related doc
* add implementation in torch 1.14.0
* update doc

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
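A hedged sketch of the tracing pattern this fix concerns: `torch.jit.trace` takes positional example inputs, so they must line up with the order of the model's `forward()` arguments. The checkpoint is illustrative:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, torchscript=True)
model.eval()

enc = tokenizer("Traced models need positional inputs", return_tensors="pt")
# Tuple order must match forward(input_ids, attention_mask, ...) exactly.
traced = torch.jit.trace(model, (enc["input_ids"], enc["attention_mask"]))
```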
-
Arthur authored
* fix led eos_mask
* add FutureWarning
* revert useless changes
* Update src/transformers/models/led/modeling_led.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Sanchit Gandhi authored
* [Whisper Tokenizer] Make more user-friendly
* use property
* make indexing rigorous
* small clean-up
* tests
* skip seq2seq tests
* remove multilingual arg
* reorder args
* collapse to one function
* option to override attributes
* add to docs
* Apply suggestions from code review
* make comment more clear
* don't add special tokens in get_decoder_prompt_ids
* add test for set_prefix_tokens

Co-authored-by: ArthurZucker <arthur@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: sgugger <sylvain@huggingface.co>
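A sketch of the user-facing pieces named above; the checkpoint and language/task values are illustrative:

```python
from transformers import WhisperTokenizer

tokenizer = WhisperTokenizer.from_pretrained("openai/whisper-tiny")

# Override the language/task prefix tokens after loading.
tokenizer.set_prefix_tokens(language="french", task="transcribe")

# Forced decoder ids for generation; no special tokens are added here.
prompt_ids = tokenizer.get_decoder_prompt_ids(language="french", task="transcribe")
```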
-
Saad Mahmud authored
* Add example docstring for CamembertConfig
* Add configuration_camembert to documentation_tests
-
Yih-Dar authored
* Add skip_special_tokens=True in some doctest
* For T5
* Fix for speech_to_text.mdx

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
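A minimal illustration of the doctest change (checkpoint illustrative): decoding with `skip_special_tokens=True` keeps markers like `</s>` and `<pad>` out of the expected output string.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")
ids = tokenizer("translate English to German: Hello", return_tensors="pt").input_ids
# Without skip_special_tokens=True the decoded string would end in '</s>'.
print(tokenizer.decode(ids[0], skip_special_tokens=True))
```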
-
amyeroberts authored
-
Nicolas Patry authored
-
- 02 Nov, 2022 4 commits
-
-
Steven Liu authored
-
Yih-Dar authored
* Show versions
* check
* store outputs
* revert

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Ben Eyal authored
🚨 🚨 🚨 Fix Issue 15003: SentencePiece Tokenizers Not Adding Special Tokens in `convert_tokens_to_string` (#15775)
* Add test for SentencePiece not adding special tokens to strings
* Add SentencePieceStringConversionMixin to fix issue 15003
* Fix conversion from tokens to string for most SentencePiece tokenizers. Tokenizers fixed: AlbertTokenizer, BarthezTokenizer, CamembertTokenizer, FNetTokenizer, M2M100Tokenizer, MBart50Tokenizer, PegasusTokenizer, Speech2TextTokenizer
* Fix MarianTokenizer, adjust SentencePiece test to accommodate vocab
* Fix DebertaV2Tokenizer
* Ignore LayoutXLMTokenizer in SentencePiece string conversion test
* Run 'make style' and 'make quality'
* Clean convert_tokens_to_string test: instead of explicitly ignoring LayoutXLMTokenizer in the test, override the test in LayoutLMTokenizationTest and do nothing in it
* Remove commented out code
* Improve robustness of convert_tokens_to_string test: instead of comparing lengths of re-tokenized text and input_ids, check that converting all special tokens to string yields a string with all special tokens
* Inline and remove SentencePieceStringConversionMixin; the convert_tokens_to_string method is now implemented in each relevant SentencePiece tokenizer
* Run 'make style' and 'make quality'
* Revert removal of space in convert_tokens_to_string
* Remove redundant import
* Revert test text to original
* Uncomment the lowercasing of the reverse_text variable
* Mimic Rust tokenizer behavior for the Albert, Barthez, Camembert, MBart50 and T5 tokenizers
* Fix accidentally skipping test in wrong tokenizer
* Add test for equivalent Rust and slow tokenizer behavior
* Override _decode in BigBirdTokenizer to mimic Rust behavior
* Override _decode in FNetTokenizer to mimic Rust behavior
* Override _decode in XLNetTokenizer to mimic Rust behavior
* Remove unused 're' import
* Update DebertaV2Tokenizer to mimic Rust tokenizer; the Deberta tokenizer now behaves like Albert and its `convert_tokens_to_string` is not tested
* Ignore problematic tests in Deberta V2
* Add comment on why the Deberta V2 tests are skipped
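A hedged sketch of the behavior the new test pins down, using AlbertTokenizer as one of the fixed tokenizers (checkpoint illustrative): a special token fed through `convert_tokens_to_string` should survive in the output.

```python
from transformers import AlbertTokenizer

tokenizer = AlbertTokenizer.from_pretrained("albert-base-v2")

# Mix regular tokens with a special token; before the fix, SentencePiece
# decoding dropped the special token from the reconstructed string.
tokens = tokenizer.tokenize("Hello world") + [tokenizer.eos_token]
print(tokenizer.convert_tokens_to_string(tokens))
```
-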
Yih-Dar authored
* Fix doctest

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-