- 21 Sep, 2022 1 commit
-
-
Sylvain Gugger authored
-
- 27 Jun, 2022 1 commit
-
-
Matt authored
* Add a TF in-graph tokenizer for BERT * Add from_pretrained * Add proper truncation, option handling to match other tokenizers * Add proper imports and guards * Add test, fix all the bugs exposed by said test * Fix truncation of paired texts in graph mode, more test updates * Small fixes, add a (very careful) test for savedmodel * Add tensorflow-text dependency, make fixup * Update documentation * Update documentation * make fixup * Slight changes to tests * Add some docstring examples * Update tests * Update tests and add proper lowercasing/normalization * make fixup * Add docstring for padding! * Mark slow tests * make fixup * Fall back to BertTokenizerFast if BertTokenizer is unavailable * Fall back to BertTokenizerFast if BertTokenizer is unavailable * make fixup * Properly handle tensorflow-text dummies
-
- 17 May, 2022 1 commit
-
-
Sylvain Gugger authored
-
- 23 Mar, 2022 1 commit
-
-
Sylvain Gugger authored
* Split file_utils in several submodules * Fixes * Add back more objects * More fixes * Who exactly decided to import that from there? * Second suggestion to code with code review * Revert wrong move * Fix imports * Adapt all imports * Adapt all imports everywhere * Revert this import, will fix in a separate commit
-
- 14 Jan, 2022 1 commit
-
-
Sylvain Gugger authored
* Better dummies * See if this fixes the issue * Fix quality * Style * Add doc for DummyObject
-
- 21 Nov, 2021 1 commit
-
-
Sylvain Gugger authored
* Fix dummy objects for quantization * Add more models
-
- 16 Nov, 2021 1 commit
-
-
Sylvain Gugger authored
* Add forward method to dummy models * Fix quality
-
- 15 Jun, 2021 1 commit
-
-
Lysandre Debut authored
-
- 11 Jun, 2021 1 commit
-
-
Lysandre Debut authored
* Add from_pretrained to dummy timm * Fix at the source * Update utils/check_dummies.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Missing pretrained dummies * Style Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 26 Apr, 2021 1 commit
-
-
Patrick von Platen authored
-
- 07 Apr, 2021 1 commit
-
-
Sylvain Gugger authored
* Replaces requires_xxx by one generic method * Quality and update check_dummies * Fix inits check * Post-merge cleanup
-
- 06 Apr, 2021 1 commit
-
-
Sylvain Gugger authored
* AutoFeatureExtractor * Init and first tests * Tests * Damn you gitignore * Quality * Defensive test for when not all backends are here * Use pattern for Speech2Text models
-
- 26 Mar, 2021 1 commit
-
-
Sylvain Gugger authored
* Add ImageFeatureExtractionMixin * Add dummy vision objects * Add require_vision * Add tests * Fix test
-
- 07 Jan, 2021 1 commit
-
-
Sylvain Gugger authored
* Main init work * Add version * Change from absolute to relative imports * Fix imports * One more typo * More typos * Styling * Make quality script pass * Add necessary replace in template * Fix typos * Spaces are ignored in replace for some reason * Forgot one models. * Fixes for import Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr> * Add documentation * Styling Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
-
- 12 Nov, 2020 1 commit
-
-
Julien Plu authored
-
- 10 Nov, 2020 1 commit
-
-
Julien Chaumond authored
* fix typo * rm use_cdn & references, and implement new hf_bucket_url * I'm pretty sure we don't need to `read` this file * same here * [BIG] file_utils.networking: do not gobble up errors anymore * Fix CI 😇 * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Tiny doc tweak * Add doc + pass kwarg everywhere * Add more tests and explain cc @sshleifer let me know if better Co-Authored-By: Sam Shleifer <sshleifer@gmail.com> * Also implement revision in pipelines In the case where we're passing a task name or a string model identifier * Fix CI 😇 * Fix CI * [hf_api] new methods + command line implem * make style * Final endpoints post-migration * Fix post-migration * Py3.6 compat cc @stefan-it Thank you @stas00 Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
-
- 20 Oct, 2020 1 commit
-
-
Sylvain Gugger authored
-
- 18 Oct, 2020 1 commit
-
-
Thomas Wolf authored
* splitting fast and slow tokenizers [WIP] * [WIP] splitting sentencepiece and tokenizers dependencies * update dummy objects * add name_or_path to models and tokenizers * prefix added to file names * prefix * styling + quality * splitting all the tokenizer files - sorting sentencepiece based ones * update tokenizer version up to 0.9.0 * remove hard dependency on sentencepiece 🎉 * and removed hard dependency on tokenizers 🎉 * update conversion script * update missing models * fixing tests * move test_tokenization_fast to main tokenization tests - fix bugs * bump up tokenizers * fix bert_generation * update and fix several tokenizers * keep sentencepiece in deps for now * fix funnel and deberta tests * fix fsmt * fix marian tests * fix layoutlm * fix squeezebert and gpt2 * fix T5 tokenization * fix xlnet tests * style * fix mbart * bump up tokenizers to 0.9.2 * fix model tests * fix tf models * fix seq2seq examples * fix tests without sentencepiece * fix slow => fast conversion without sentencepiece * update auto and bert generation tests * fix mbart tests * fix auto and common test without tokenizers * fix tests without tokenizers * clean up tests lighten up when tokenizers + sentencepiece are both off * style quality and tests fixing * add sentencepiece to doc/examples reqs * leave sentencepiece on for now * style quality split hebert and fix pegasus * WIP Herbert fast * add sample_text_no_unicode and fix hebert tokenization * skip FSMT example test for now * fix style * fix fsmt in example tests * update following Lysandre and Sylvain's comments * Update src/transformers/testing_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/testing_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 05 Oct, 2020 1 commit
-
-
Sylvain Gugger authored
* PoC on RAG * Format class name/obj name * Better name in message * PoC on one TF model * Add PyTorch and TF dummy objects + script * Treat scikit-learn * Bad copy pastes * Typo
-