- 11 Dec, 2020 1 commit
-
-
Sylvain Gugger authored
* Fix PreTrainedTokenizer.pad when first inputs are empty * Handle empty inputs case
-
- 10 Dec, 2020 1 commit
-
-
Sylvain Gugger authored
-
- 08 Dec, 2020 1 commit
-
-
Lysandre Debut authored
-
- 03 Dec, 2020 1 commit
-
-
Sylvain Gugger authored
-
- 02 Dec, 2020 1 commit
-
-
Nicolas Patry authored
* Warning about too long input for fast tokenizers too If truncation is not set in tokenizers, but the tokenization is too long for the model (`model_max_length`), we used to trigger a warning that The input would probably fail (which it most likely will). This PR re-enables the warning for fast tokenizers too and uses common code for the trigger to make sure it's consistent across. * Checking for pair of inputs too. * Making the function private and adding it's doc. * Remove formatting ?? in odd place. * Missed uppercase.
-
- 01 Dec, 2020 1 commit
-
-
Adam Pocock authored
Prevent BatchEncoding from blindly passing casts down to the tensors it contains. Fixes #6582. (#8860) Update src/transformers/tokenization_utils_base.py with review fix Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
- 30 Nov, 2020 1 commit
-
-
Fraser Greenlee authored
Related issue: https://github.com/huggingface/transformers/issues/8837
-
- 27 Nov, 2020 1 commit
-
-
Giovanni Compagnoni authored
* update configuration_utils.py typing to allow pathlike objects when sensible * update modeling_utils.py typing to allow pathlike objects when sensible * black * update tokenization_utils_base.py typing to allow pathlike objects when sensible * update tokenization_utils_fast.py typing to allow pathlike objects when sensible * update configuration_auto.py typing to allow pathlike objects when sensible * update configuration_auto.py docstring to allow pathlike objects when sensible * update tokenization_auto.py docstring to allow pathlike objects when sensible * black
-
- 19 Nov, 2020 1 commit
-
-
Stas Bekman authored
* don't reconvert when the type is already right * better name * adjust logic as suggested * merge
-
- 17 Nov, 2020 3 commits
-
-
Sylvain Gugger authored
* Remove old deprecated arguments Co-authored-by:
LysandreJik <lysandre.debut@reseau.eseo.fr> * Remove needless imports * Fix tests Co-authored-by:
LysandreJik <lysandre.debut@reseau.eseo.fr>
-
Lysandre Debut authored
* Tokenizers should be framework agnostic * Run the slow tests * Not testing * Fix documentation * Apply suggestions from code review Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com>
-
Julien Chaumond authored
* <small>tiny typo</small> * Tokenizers: ability to load from model subfolder * use subfolder for local files as well * Uniformize model shortcut name => model id * from s3 => from huggingface.co Co-authored-by:Quentin Lhoest <lhoest.q@gmail.com>
-
- 15 Nov, 2020 1 commit
-
-
Thomas Wolf authored
[breaking|pipelines|tokenizers] Adding slow-fast tokenizers equivalence tests pipelines - Removing sentencepiece as a required dependency (#8073) * Fixing roberta for slow-fast tests * WIP getting equivalence on pipelines * slow-to-fast equivalence - working on question-answering pipeline * optional FAISS tests * Pipeline Q&A * Move pipeline tests to their own test job again * update tokenizer to add sequence id methods * update to tokenizers 0.9.4 * set sentencepiecce as optional * clean up squad * clean up pipelines to use sequence_ids * style/quality * wording * Switch to use_fast = True by default * update tests for use_fast at True by default * fix rag tokenizer test * removing protobuf from required dependencies * fix NER test for use_fast = True by default * fixing example tests (Q&A examples use slow tokenizers for now) * protobuf in main deps extras["sentencepiece"] and example deps * fix protobug install test * try to fix seq2seq by switching to slow tokenizers for now * Update src/transformers/tokenization_utils_base.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/tokenization_utils_base.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
- 10 Nov, 2020 1 commit
-
-
Julien Chaumond authored
* fix typo * rm use_cdn & references, and implement new hf_bucket_url * I'm pretty sure we don't need to `read` this file * same here * [BIG] file_utils.networking: do not gobble up errors anymore * Fix CI
馃槆 * Apply suggestions from code review Co-authored-by:Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Tiny doc tweak * Add doc + pass kwarg everywhere * Add more tests and explain cc @sshleifer let me know if better Co-Authored-By:
Sam Shleifer <sshleifer@gmail.com> * Also implement revision in pipelines In the case where we're passing a task name or a string model identifier * Fix CI
馃槆 * Fix CI * [hf_api] new methods + command line implem * make style * Final endpoints post-migration * Fix post-migration * Py3.6 compat cc @stefan-it Thank you @stas00 Co-authored-by:Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Sam Shleifer <sshleifer@gmail.com>
-
- 29 Oct, 2020 1 commit
-
-
Santiago Castro authored
* Fix doc errors and typos across the board * Fix a typo * Fix the CI * Fix more typos * Fix CI * More fixes * Fix CI * More fixes * More fixes
-
- 26 Oct, 2020 2 commits
-
-
Sylvain Gugger authored
* Important files * Styling them all * Revert "Styling them all" This reverts commit 7d029395fdae8513b8281cbc2a6c239f8093503e. * Syling them for realsies * Fix syntax error * Fix benchmark_utils * More fixes * Fix modeling auto and script * Remove new line * Fixes * More fixes * Fix more files * Style * Add FSMT * More fixes * More fixes * More fixes * More fixes * Fixes * More fixes * More fixes * Last fixes * Make sphinx happy
-
Thomas Wolf authored
* fixing #8001 * make T5 tokenizer serialization more robust - style
-
- 24 Oct, 2020 1 commit
-
-
Suraj Patil authored
-
- 23 Oct, 2020 2 commits
-
-
Anthony MOI authored
-
Thomas Wolf authored
[tests|tokenizers] Refactoring pipelines test backbone - Small tokenizers improvements - General tests speedups (#7970) * WIP refactoring pipeline tests - switching to fast tokenizers * fix dialog pipeline and fill-mask * refactoring pipeline tests backbone * make large tests slow * fix tests (tf Bart inactive for now) * fix doc... * clean up for merge * fixing tests - remove bart from summarization until there is TF * fix quality and RAG * Add new translation pipeline tests - fix JAX tests * only slow for dialog * Fixing the missing TF-BART imports in modeling_tf_auto * spin out pipeline tests in separate CI job * adding pipeline test to CI YAML * add slow pipeline tests * speed up tf and pt join test to avoid redoing all the standalone pt and tf tests * Update src/transformers/tokenization_utils_base.py Co-authored-by:
Sam Shleifer <sshleifer@gmail.com> * Update src/transformers/pipelines.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/pipelines.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/testing_utils.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add require_torch and require_tf in is_pt_tf_cross_test Co-authored-by:
Sam Shleifer <sshleifer@gmail.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
- 21 Oct, 2020 1 commit
-
-
Evan Pete Walsh authored
* fix docstring for 'special_tokens_mask' * revert auto formatter changes * revert another auto format * revert another auto format
-
- 19 Oct, 2020 2 commits
-
-
Funtowicz Morgan authored
* WIP flax bert * Initial commit Bert Jax/Flax implementation. * Embeddings working and equivalent to PyTorch. * Move embeddings in its own module BertEmbeddings * Added jax.jit annotation on forward call * BertEncoder on par with PyTorch ! :D * Add BertPooler on par with PyTorch !! * Working Jax+Flax implementation of BertModel with < 1e-5 differences on the last layer. * Fix pooled output to take only the first token of the sequence. * Refactoring to use BertConfig from transformers. * Renamed FXBertModel to FlaxBertModel * Model is now initialized in FlaxBertModel constructor and reused. * WIP JaxPreTrainedModel * Cleaning up the code of FlaxBertModel * Added ability to load Flax model saved through save_pretrained() * Added ability to convert Pytorch Bert model to FlaxBert * FlaxBert can now load every Pytorch Bert model with on-the-fly conversion * Fix hardcoded shape values in conversion scripts. * Improve the way we handle LayerNorm conversion from PyTorch to Flax. * Added positional embeddings as parameter of BertModel with default to np.arange. * Let's roll FlaxRoberta ! * Fix missing position_ids parameters on predict for Bert * Flax backend now supports batched inputs Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Make it possible to load msgpacked model on convert from pytorch in last resort. Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Moved save_pretrained to Jax base class along with more constructor parameters. * Use specialized, model dependent conversion functio. * Expose `is_flax_available` in file_utils. * Added unittest for Flax models. * Added run_tests_flax to the CI. * Introduce FlaxAutoModel * Added more unittests * Flax model reference the _MODEL_ARCHIVE_MAP from PyTorch model. * Addressing review comments. * Expose seed in both Bert and Roberta * Fix typo suggested by @stefan-it Co-Authored-By:
Stefan Schweter <stefan@schweter.it> * Attempt to make style * Attempt to make style in tests too * Added jax & jaxlib to the flax optional dependencies. * Attempt to fix flake8 warnings ... * Redo black again and again * When black and flake8 fight each other for a space ...
馃挜 馃挜 馃挜 * Try removing trailing comma to make both black and flake happy! * Fix invalid is_<framework>_available call, thanks @LysandreJik馃帀 * Fix another invalid import in flax_roberta test * Bump and pin flax release to 0.1.0. * Make flake8 happy, remove unused jax import * Change the type of the catch for msgpack. * Remove unused import. * Put seed as optional constructor parameter. * trigger ci again * Fix too much parameters in BertAttention. * Formatting. * Simplify Flax unittests to avoid machine crashes. * Fix invalid number of arguments when raising issue for an unknown model. * Address @bastings comment in PR, moving jax.jit decorated outside of __call__ * Fix incorrect path to require_flax/require_pytorch functions. Signed-off-by:Morgan Funtowicz <funtowiczmo@gmail.com> * Attempt to make style. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Correct rebasing of circle-ci dependencies Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Fix import sorting. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Fix unused imports. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Again import sorting... Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Installing missing nlp dependency for flax unittests. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Fix laoding of model for Flax implementations. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * jit the inner function call to make JAX-compatible Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Format ! Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Flake one more time
馃幎 Signed-off-by:Morgan Funtowicz <funtowiczmo@gmail.com> * Rewrites BERT in Flax to the new Linen API (#7211) * Rewrite Flax HuggingFace PR to Linen * Some fixes * Fix tests * Fix CI with change of name of nlp (#7054) * nlp -> datasets * More nlp -> datasets * Woopsie * More nlp -> datasets * One last * Expose `is_flax_available` in file_utils. * Added run_tests_flax to the CI. * Attempt to make style * trigger ci again * Fix import sorting. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Revert "Rewrites BERT in Flax to the new Linen API (#7211)" This reverts commit 23703a5eb3364e26a1cbc3ee34b4710d86a674b0. * Remove jnp.lax references Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Make style. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Reintroduce Linen changes ... Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Make style. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Use jax native's gelu function. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Renaming BertModel to BertModule to highlight the fact this is the Flax Module object. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Rewrite FlaxAutoModel test to not rely on pretrained_model_archive_map Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Remove unused variable in BertModule. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Remove unused variable in BertModule again Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Attempt to have is_flax_available working again. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Introduce JAX TensorType Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Improve ImportError message when trying to convert to various TensorType format. Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Makes Flax model jittable. Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Ensure flax models are jittable in unittests. Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Remove unused imports. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Ensure jax imports are guarded behind is_flax_available. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Make style. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Make style again Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Make style again again Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Make style again again again Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Update src/transformers/file_utils.py Co-authored-by:
Marc van Zee <marcvanzee@gmail.com> * Bump flax to it's latest version Co-authored-by:
Marc van Zee <marcvanzee@gmail.com> * Bump jax version to at least 0.2.0 Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Style. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Update the unittest to use TensorType.JAX Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * isort import in tests. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Match new flax parameters name "params" Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Remove unused imports. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Add flax models to transformers __init__ Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Attempt to address all CI related comments. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Correct circle.yml indent. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Correct circle.yml indent (2) Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Remove coverage from flax tests Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Addressing many naming suggestions from comments Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Simplify for loop logic to interate over layers in FlaxBertLayerCollection Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * use f-string syntax for formatting logs. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Use config property from FlaxPreTrainedModel. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * use "cls_token" instead of "first_token" variable name. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * use "hidden_state" instead of "h" variable name. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Correct class reference in docstring to link to Flax related modules. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Added HF + Google Flax team copyright. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Make Roberta independent from Bert Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Move activation functions to flax_utils. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Move activation functions to flax_utils for bert. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Added docstring for BERT Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Update import for Bert and Roberta tokenizers Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Make style. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * fix-copies Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Correct FlaxRobertaLayer to match PyTorch. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Use the same store_artifact for flax unittest Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Style. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Make sure gradient are disabled only locally for flax unittest using torch equivalence. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Use relative imports Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> Co-authored-by:
Stefan Schweter <stefan@schweter.it> Co-authored-by:
Marc van Zee <marcvanzee@gmail.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
AndreaSottana authored
* Fix small type hinting error * Update tokenization_utils_base.py * Update src/transformers/tokenization_utils_base.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
- 18 Oct, 2020 1 commit
-
-
Thomas Wolf authored
* splitting fast and slow tokenizers [WIP] * [WIP] splitting sentencepiece and tokenizers dependencies * update dummy objects * add name_or_path to models and tokenizers * prefix added to file names * prefix * styling + quality * spliting all the tokenizer files - sorting sentencepiece based ones * update tokenizer version up to 0.9.0 * remove hard dependency on sentencepiece
馃帀 * and removed hard dependency on tokenizers馃帀 * update conversion script * update missing models * fixing tests * move test_tokenization_fast to main tokenization tests - fix bugs * bump up tokenizers * fix bert_generation * update ad fix several tokenizers * keep sentencepiece in deps for now * fix funnel and deberta tests * fix fsmt * fix marian tests * fix layoutlm * fix squeezebert and gpt2 * fix T5 tokenization * fix xlnet tests * style * fix mbart * bump up tokenizers to 0.9.2 * fix model tests * fix tf models * fix seq2seq examples * fix tests without sentencepiece * fix slow => fast conversion without sentencepiece * update auto and bert generation tests * fix mbart tests * fix auto and common test without tokenizers * fix tests without tokenizers * clean up tests lighten up when tokenizers + sentencepiece are both off * style quality and tests fixing * add sentencepiece to doc/examples reqs * leave sentencepiece on for now * style quality split hebert and fix pegasus * WIP Herbert fast * add sample_text_no_unicode and fix hebert tokenization * skip FSMT example test for now * fix style * fix fsmt in example tests * update following Lysandre and Sylvain's comments * Update src/transformers/testing_utils.py Co-authored-by:Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/testing_utils.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 13 Oct, 2020 2 commits
-
-
Tiger authored
-
Patrick von Platen authored
* fix rag * Update tokenizer save_pretrained Co-authored-by:Thomas Wolf <thomwolf@users.noreply.github.com>
-
- 12 Oct, 2020 1 commit
-
-
AndreaSottana authored
Minor spelling corrections in docstrings. "information" is uncountable in English and has no plural.
-
- 08 Oct, 2020 1 commit
-
-
Thomas Wolf authored
Adding Fast tokenizers for SentencePiece based tokenizers - Breaking: remove Transfo-XL fast tokenizer (#7141) * [WIP] SP tokenizers * fixing tests for T5 * WIP tokenizers * serialization * update T5 * WIP T5 tokenization * slow to fast conversion script * Refactoring to move tokenzier implementations inside transformers * Adding gpt - refactoring - quality * WIP adding several tokenizers to the fast world * WIP Roberta - moving implementations * update to dev4 switch file loading to in-memory loading * Updating and fixing * advancing on the tokenizers - updating do_lower_case * style and quality * moving forward with tokenizers conversion and tests * MBart, T5 * dumping the fast version of transformer XL * Adding to autotokenizers + style/quality * update init and space_between_special_tokens * style and quality * bump up tokenizers version * add protobuf * fix pickle Bert JP with Mecab * fix newly added tokenizers * style and quality * fix bert japanese * fix funnel * limite tokenizer warning to one occurence * clean up file * fix new tokenizers * fast tokenizers deep tests * WIP adding all the special fast tests on the new fast tokenizers * quick fix * adding more fast tokenizers in the fast tests * all tokenizers in fast version tested * Adding BertGenerationFast * bump up setup.py for CI * remove BertGenerationFast (too early) * bump up tokenizers version * Clean old docstrings * Typo * Update following Lysandre comments Co-authored-by:Sylvain Gugger <sylvain.gugger@gmail.com>
-
- 06 Oct, 2020 1 commit
-
-
Gabriele Picco authored
* Fix UnboundLocalError when PaddingStrategy is MAX_LENGTH * Fix UnboundLocalError for TruncationStrategy
-
- 23 Sep, 2020 1 commit
-
-
Sylvain Gugger authored
* Clean up model documentation * Formatting * Preparation work * Long lines * Main work on rst files * Cleanup all config files * Syntax fix * Clean all tokenizers * Work on first models * Models beginning * FaluBERT * All PyTorch models * All models * Long lines again * Fixes * More fixes * Update docs/source/model_doc/bert.rst Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update docs/source/model_doc/electra.rst Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Last fixes Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
- 22 Sep, 2020 1 commit
-
-
Sylvain Gugger authored
* is_pretokenized -> is_split_into_words * Fix tests
-
- 16 Sep, 2020 1 commit
-
-
Xi Ye authored
-
- 14 Sep, 2020 1 commit
-
-
Kevin Canwen Xu authored
* Add Tuna Mirror for Downloads from China * format fix * Use preset instead of hardcoding URL * Fix * make style * update the mirror option doc * update the mirror
-
- 09 Sep, 2020 1 commit
-
-
Lysandre Debut authored
Batch encore plus and overflowing tokens fails when non existing overflowing tokens for a sequence (#6677) * Patch and test * Fix tests
-
- 08 Sep, 2020 1 commit
-
-
Stas Bekman authored
apologies for the tiny PRs, just sending those as I find them.
-
- 04 Sep, 2020 1 commit
-
-
Stas Bekman authored
-
- 26 Aug, 2020 1 commit
-
-
Lysandre Debut authored
* Logging * Style * hf_logging > utils.logging * Address @thomwolf's comments * Update test * Update src/transformers/benchmark/benchmark_utils.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Revert bad change Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 24 Aug, 2020 1 commit
-
-
Sylvain Gugger authored
* Run new isort * More changes * Update CI, CONTRIBUTING and benchmarks
-
- 19 Aug, 2020 1 commit
-
-
Sylvain Gugger authored
-
- 12 Aug, 2020 1 commit
-
-
Sylvain Gugger authored
* allow using tokenizer.pad as a collate_fn in pytorch * allow using tokenizer.pad as a collate_fn in pytorch * Add documentation and tests * Make attention mask the right shape * Better test Co-authored-by:Thomas Wolf <thomwolf@users.noreply.github.com>
-