- 04 Feb, 2021 1 commit
-
-
Sylvain Gugger authored
* Authorize last version of tokenizer * Update version table * Fix conversion of spm tokenizers and fix some hub links * Bump tokenizers version to 0.10.1rc1 * Add script to check tokenizers conversion with XNLI * Add some more mask_token lstrip support * Must modify mask_token in slow tokenizers too * Keep using the old method for Pegasus * add missing import Co-authored-by: Anthony MOI <m.anthony.moi@gmail.com>
-
- 02 Feb, 2021 2 commits
-
-
Patrick von Platen authored
* add raw scaffold * implement feat extract layers * make style * remove + * correctly convert weights * make feat extractor work * make feature extraction proj work * run forward pass * finish forward pass * Successful decoding example * remove unused files * more changes * add wav2vec tokenizer * add new structure * fix run forward * add other layer norm architecture * finish 2nd structure * add model tests * finish tests for tok and model * clean-up * make style * finish docstring for model and config * make style * correct docstring * correct tests * change checkpoints to fairseq * fix examples * finish wav2vec2 * make style * apply sylvains suggestions * apply lysandres suggestions * change print to log.info * re-add assert statement * add input_values as required input name * finish wav2vec2 tokenizer * Update tests/test_tokenization_wav2vec2.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * apply sylvains suggestions Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
-
Sylvain Gugger authored
-
- 27 Jan, 2021 1 commit
-
-
Patrick von Platen authored
* update jaxlib * Update setup.py * update table
-
- 18 Jan, 2021 1 commit
-
-
Anthony MOI authored
-
- 14 Jan, 2021 1 commit
-
-
Stas Bekman authored
* note on how to get to deps from shell * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix text Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 13 Jan, 2021 3 commits
-
-
Lysandre authored
-
Lysandre authored
-
Stas Bekman authored
-
- 12 Jan, 2021 2 commits
-
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
- 06 Jan, 2021 1 commit
-
-
Sylvain Gugger authored
* Don't import libs to check they are available * Don't import integrations at init * Add importlib_metadata to deps * Remove old vars references * Avoid syntax error * Adapt testing utils * Try to appease torchhub * Add dependency * Remove more private variables * Fix typo * Another typo * Refine the tf availability test
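The first bullet above (checking that a library is available without importing it) can be illustrated with a minimal sketch. It is not the code from this commit: the helper name `is_module_available` is hypothetical, and it only assumes the standard-library `importlib.util.find_spec`, which locates a top-level module without executing it.

```python
import importlib.util


def is_module_available(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it.

    Hypothetical helper for illustration only; find_spec returns None
    when no module with that name can be found on the import path.
    """
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "json" ships with Python; the second name is deliberately bogus.
    print(is_module_available("json"))
    print(is_module_available("surely_not_installed_xyz_12345"))
```

A check like this avoids paying the import cost (and side effects) of a heavy optional dependency just to learn whether it is installed.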
-
- 21 Dec, 2020 1 commit
-
-
Julien Plu authored
* Improve BERT-like models attention layers * Apply style * Put back error raising instead of assert * Update template * Fix copies * Apply raising valueerror in MPNet * Restore the copy check for the Intermediate layer in Longformer * Update longformer
-
- 18 Dec, 2020 1 commit
-
-
Stas Bekman authored
setuptools has a pretty fixed expectation of version numbers. This PR fixes the dev version number and adds a comment with correct formats for the future editors.
This fix removes this warning on `make fixup|style|etc` or any other time `setup.py` is being run:
```
setuptools/dist.py:452: UserWarning: Normalizing '4.2.0dev0' to '4.2.0.dev0'
  warnings.warn(tmpl.format(**locals()))
```
and the alternative:
```
/setuptools/dist.py:452: UserWarning: Normalizing '4.0.0-rc-1' to '4.0.0rc1
```
Fixes: #8749 @LysandreJik, @sgugger
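The two normalizations quoted in the warnings above can be reproduced with a small sketch. This is not setuptools' implementation: `normalize_version` is a hypothetical helper that covers only the `dev` and `rc` spellings shown in this commit message, which is a tiny subset of the PEP 440 normalization rules.

```python
import re


def normalize_version(version: str) -> str:
    """Normalize only the 'rc' and 'dev' spellings shown above.

    Hypothetical helper for illustration; real tools implement the
    full PEP 440 rules rather than these two substitutions.
    """
    # Release candidate: optional separators around 'rc' are dropped,
    # e.g. '4.0.0-rc-1' becomes '4.0.0rc1'.
    version = re.sub(r"[-_.]?rc[-_.]?(\d+)", r"rc\1", version)
    # Dev release: the segment is always introduced by a single dot,
    # e.g. '4.2.0dev0' becomes '4.2.0.dev0'.
    version = re.sub(r"[-_.]?dev[-_.]?(\d+)", r".dev\1", version)
    return version


if __name__ == "__main__":
    for raw in ("4.2.0dev0", "4.0.0-rc-1", "4.2.0.dev0"):
        print(raw, "->", normalize_version(raw))
```

Writing the version in its already-normalized form (for example `4.2.0.dev0`) is what keeps the warning from being emitted.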
-
- 17 Dec, 2020 3 commits
- 16 Dec, 2020 1 commit
-
-
Patrick von Platen authored
* save intermediate * save intermediate * save intermediate * correct flax bert model file * new module / model naming * make style * almost finish BERT * finish roberta * make fix-copies * delete keys file * last refactor * fixes in run_mlm_flax.py * remove pooled from run_mlm_flax.py * fix gelu | gelu_new * remove Module from inits * splits * dirty print * preventing warmup_steps == 0 * smaller splits * make fix-copies * dirty print * dirty print * initial_evaluation argument * declaration order fix * proper model initialization/loading * proper initialization * run_mlm_flax improvements: improper model inputs bugfix + automatic dataset splitting + tokenizers parallelism warning + avoiding warmup_steps=0 bug * removed tokenizers warning hack, fixed model re-initialization * reverted training_args.py changes * fix flax from pretrained * improve test in flax * apply sylvains tips * update init * make 0.3.0 compatible * revert tevens changes * revert tevens changes 2 * finalize revert * fix bug * add docs * add pretrained to init * Update src/transformers/modeling_flax_utils.py * fix copies * final improvements Co-authored-by: TevenLeScao <teven.lescao@gmail.com>
-
- 15 Dec, 2020 1 commit
-
-
Julien Plu authored
* Fix tests for TF 2.4 * Remove <2.4 limitation * Add version condition * Update tests/test_optimization_tf.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/test_optimization_tf.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/test_optimization_tf.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 14 Dec, 2020 2 commits
-
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
- 07 Dec, 2020 1 commit
-
-
Sylvain Gugger authored
* Add copyright everywhere missing * Style
-
- 30 Nov, 2020 2 commits
-
-
LysandreJik authored
-
LysandreJik authored
-
- 27 Nov, 2020 1 commit
-
-
Julien Plu authored
enforce unix newline encoding regardless of OS creating the file
-
- 24 Nov, 2020 1 commit
-
-
Stas Bekman authored
* implement support for run-time dependency version checking * try not escaping ! * use findall that works on py36 * small tweaks * autoformatter worship * simplify * shorter names * add support for non-versioned checks * add deps * revert * tokenizers not required, check version only if installed * make a proper distutils cmd and add make target * tqdm must be checked before tokenizers * workaround the DistributionNotFound peculiar setup * handle the rest of packages in setup.py * fully sync setup.py's install_requires - to check them all * nit * make install_requires more readable * typo * Update setup.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * restyle * add types * simplify * simplify2 Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
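The idea of run-time dependency version checking, including the non-versioned checks mentioned above, can be sketched as follows. This is an illustration under stated assumptions, not the code from this commit: the names `requirement_satisfied` and `check_installed` are hypothetical, only plain `name`, `name==x.y`, `name>=x.y` style requirements are handled, and pre-release suffixes such as `rc1` are ignored when comparing.

```python
import operator
import re
from importlib.metadata import PackageNotFoundError, version  # Python 3.8+

OPS = {"<": operator.lt, "<=": operator.le, "==": operator.eq,
       "!=": operator.ne, ">=": operator.ge, ">": operator.gt}

# name, then an optional (operator, dotted numeric version) pair
REQ_RE = re.compile(
    r"^\s*([A-Za-z0-9_.-]+)\s*(?:(<=|>=|==|!=|<|>)\s*(\d+(?:\.\d+)*))?\s*$"
)


def release_numbers(version_string: str) -> list:
    """Leading numeric release segment, e.g. '0.10.1rc1' -> [0, 10, 1]."""
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        raise ValueError(f"unsupported version: {version_string!r}")
    return [int(part) for part in match.group(0).split(".")]


def requirement_satisfied(requirement: str, installed_version: str) -> bool:
    """Check one requirement string against a known installed version."""
    match = REQ_RE.match(requirement)
    if match is None:
        raise ValueError(f"unsupported requirement: {requirement!r}")
    _, op, wanted = match.groups()
    if op is None:
        return True  # non-versioned check: being installed is enough
    have, want = release_numbers(installed_version), release_numbers(wanted)
    width = max(len(have), len(want))  # pad so that 4.27 == 4.27.0
    have += [0] * (width - len(have))
    want += [0] * (width - len(want))
    return OPS[op](have, want)


def check_installed(requirement: str) -> bool:
    """Look up the installed version, then apply the check above."""
    name = REQ_RE.match(requirement).group(1)
    try:
        return requirement_satisfied(requirement, version(name))
    except PackageNotFoundError:
        return False
```

Keeping the comparison in a pure function (`requirement_satisfied`) separate from the metadata lookup makes the logic easy to test without depending on what happens to be installed.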
-
- 19 Nov, 2020 1 commit
-
-
LysandreJik authored
-
- 16 Nov, 2020 1 commit
-
-
Sylvain Gugger authored
-
- 15 Nov, 2020 1 commit
-
-
Thomas Wolf authored
[breaking|pipelines|tokenizers] Adding slow-fast tokenizers equivalence tests pipelines - Removing sentencepiece as a required dependency (#8073) * Fixing roberta for slow-fast tests * WIP getting equivalence on pipelines * slow-to-fast equivalence - working on question-answering pipeline * optional FAISS tests * Pipeline Q&A * Move pipeline tests to their own test job again * update tokenizer to add sequence id methods * update to tokenizers 0.9.4 * set sentencepiece as optional * clean up squad * clean up pipelines to use sequence_ids * style/quality * wording * Switch to use_fast = True by default * update tests for use_fast at True by default * fix rag tokenizer test * removing protobuf from required dependencies * fix NER test for use_fast = True by default * fixing example tests (Q&A examples use slow tokenizers for now) * protobuf in main deps extras["sentencepiece"] and example deps * fix protobuf install test * try to fix seq2seq by switching to slow tokenizers for now * Update src/transformers/tokenization_utils_base.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/tokenization_utils_base.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
-
- 13 Nov, 2020 1 commit
-
-
Lysandre Debut authored
* Model templates * TensorFlow * Remove pooler * CI * Tokenizer + Refactoring * Encoder-Decoder * Let's go testing * Encoder-Decoder in TF * Let's go testing in TF * Documentation * README * Fixes * Better names * Style * Update docs * Choose to skip either TF or PT * Code quality fixes * Add to testing suite * Update file path * Cookiecutter path * Update `transformers` path * Handle rebasing * Remove seq2seq from model templates * Remove s2s config * Apply Sylvain and Patrick comments * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Last fixes from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 10 Nov, 2020 1 commit
-
-
Lysandre authored
-
- 09 Nov, 2020 1 commit
-
-
Sylvain Gugger authored
-
- 04 Nov, 2020 1 commit
-
-
Sylvain Gugger authored
* Try -j option * Try other thing * Bigger machine * Test lower sphinx version * Remove trailing space
-
- 27 Oct, 2020 3 commits
-
-
Sylvain Gugger authored
-
Jason Wolosonovich authored
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
-
Stas Bekman authored
-
- 20 Oct, 2020 1 commit
-
-
Lysandre authored
-
- 19 Oct, 2020 1 commit
-
-
Funtowicz Morgan authored
* WIP flax bert * Initial commit Bert Jax/Flax implementation. * Embeddings working and equivalent to PyTorch. * Move embeddings in its own module BertEmbeddings * Added jax.jit annotation on forward call * BertEncoder on par with PyTorch ! :D * Add BertPooler on par with PyTorch !! * Working Jax+Flax implementation of BertModel with < 1e-5 differences on the last layer. * Fix pooled output to take only the first token of the sequence. * Refactoring to use BertConfig from transformers. * Renamed FXBertModel to FlaxBertModel * Model is now initialized in FlaxBertModel constructor and reused. * WIP JaxPreTrainedModel * Cleaning up the code of FlaxBertModel * Added ability to load Flax model saved through save_pretrained() * Added ability to convert Pytorch Bert model to FlaxBert * FlaxBert can now load every Pytorch Bert model with on-the-fly conversion * Fix hardcoded shape values in conversion scripts. * Improve the way we handle LayerNorm conversion from PyTorch to Flax. * Added positional embeddings as parameter of BertModel with default to np.arange. * Let's roll FlaxRoberta ! * Fix missing position_ids parameters on predict for Bert * Flax backend now supports batched inputs Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Make it possible to load msgpacked model on convert from pytorch in last resort. Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Moved save_pretrained to Jax base class along with more constructor parameters. * Use specialized, model dependent conversion function. * Expose `is_flax_available` in file_utils. * Added unittest for Flax models. * Added run_tests_flax to the CI. * Introduce FlaxAutoModel * Added more unittests * Flax model reference the _MODEL_ARCHIVE_MAP from PyTorch model. * Addressing review comments. * Expose seed in both Bert and Roberta * Fix typo suggested by @stefan-it Co-Authored-By:
Stefan Schweter <stefan@schweter.it> * Attempt to make style * Attempt to make style in tests too * Added jax & jaxlib to the flax optional dependencies. * Attempt to fix flake8 warnings ... * Redo black again and again * When black and flake8 fight each other for a space ...
💥 💥 💥 * Try removing trailing comma to make both black and flake happy! * Fix invalid is_<framework>_available call, thanks @LysandreJik 🎉 * Fix another invalid import in flax_roberta test * Bump and pin flax release to 0.1.0. * Make flake8 happy, remove unused jax import * Change the type of the catch for msgpack. * Remove unused import. * Put seed as optional constructor parameter. * trigger ci again * Fix too much parameters in BertAttention. * Formatting. * Simplify Flax unittests to avoid machine crashes. * Fix invalid number of arguments when raising issue for an unknown model. * Address @bastings comment in PR, moving jax.jit decorated outside of __call__ * Fix incorrect path to require_flax/require_pytorch functions. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Attempt to make style. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Correct rebasing of circle-ci dependencies Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Fix import sorting. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Fix unused imports. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Again import sorting... Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Installing missing nlp dependency for flax unittests. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Fix loading of model for Flax implementations. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * jit the inner function call to make JAX-compatible Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Format ! Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Flake one more time
🎶 Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com> * Rewrites BERT in Flax to the new Linen API (#7211) * Rewrite Flax HuggingFace PR to Linen * Some fixes * Fix tests * Fix CI with change of name of nlp (#7054) * nlp -> datasets * More nlp -> datasets * Woopsie * More nlp -> datasets * One last * Expose `is_flax_available` in file_utils. * Added run_tests_flax to the CI. * Attempt to make style * trigger ci again * Fix import sorting. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Revert "Rewrites BERT in Flax to the new Linen API (#7211)" This reverts commit 23703a5eb3364e26a1cbc3ee34b4710d86a674b0. * Remove jnp.lax references Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Make style. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Reintroduce Linen changes ... Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Make style. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Use jax native's gelu function. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Renaming BertModel to BertModule to highlight the fact this is the Flax Module object. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Rewrite FlaxAutoModel test to not rely on pretrained_model_archive_map Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Remove unused variable in BertModule. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Remove unused variable in BertModule again Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Attempt to have is_flax_available working again. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Introduce JAX TensorType Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Improve ImportError message when trying to convert to various TensorType format. Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Makes Flax model jittable. Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Ensure flax models are jittable in unittests. Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Remove unused imports. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Ensure jax imports are guarded behind is_flax_available. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Make style. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Make style again Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Make style again again Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Make style again again again Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Update src/transformers/file_utils.py Co-authored-by:
Marc van Zee <marcvanzee@gmail.com> * Bump flax to its latest version Co-authored-by:
Marc van Zee <marcvanzee@gmail.com> * Bump jax version to at least 0.2.0 Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Style. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Update the unittest to use TensorType.JAX Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * isort import in tests. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Match new flax parameters name "params" Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Remove unused imports. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Add flax models to transformers __init__ Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Attempt to address all CI related comments. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Correct circle.yml indent. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Correct circle.yml indent (2) Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Remove coverage from flax tests Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Addressing many naming suggestions from comments Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Simplify for loop logic to iterate over layers in FlaxBertLayerCollection Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * use f-string syntax for formatting logs. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Use config property from FlaxPreTrainedModel. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * use "cls_token" instead of "first_token" variable name. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * use "hidden_state" instead of "h" variable name. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Correct class reference in docstring to link to Flax related modules. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Added HF + Google Flax team copyright. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Make Roberta independent from Bert Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Move activation functions to flax_utils. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Move activation functions to flax_utils for bert. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Added docstring for BERT Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Update import for Bert and Roberta tokenizers Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Make style. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * fix-copies Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Correct FlaxRobertaLayer to match PyTorch. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Use the same store_artifact for flax unittest Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Style. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Make sure gradient are disabled only locally for flax unittest using torch equivalence. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Use relative imports Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> Co-authored-by:
Stefan Schweter <stefan@schweter.it> Co-authored-by:
Marc van Zee <marcvanzee@gmail.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 18 Oct, 2020 1 commit
-
-
Thomas Wolf authored
* splitting fast and slow tokenizers [WIP] * [WIP] splitting sentencepiece and tokenizers dependencies * update dummy objects * add name_or_path to models and tokenizers * prefix added to file names * prefix * styling + quality * splitting all the tokenizer files - sorting sentencepiece based ones * update tokenizer version up to 0.9.0 * remove hard dependency on sentencepiece 🎉 * and removed hard dependency on tokenizers 🎉 * update conversion script * update missing models * fixing tests * move test_tokenization_fast to main tokenization tests - fix bugs * bump up tokenizers * fix bert_generation * update and fix several tokenizers * keep sentencepiece in deps for now * fix funnel and deberta tests * fix fsmt * fix marian tests * fix layoutlm * fix squeezebert and gpt2 * fix T5 tokenization * fix xlnet tests * style * fix mbart * bump up tokenizers to 0.9.2 * fix model tests * fix tf models * fix seq2seq examples * fix tests without sentencepiece * fix slow => fast conversion without sentencepiece * update auto and bert generation tests * fix mbart tests * fix auto and common test without tokenizers * fix tests without tokenizers * clean up tests lighten up when tokenizers + sentencepiece are both off * style quality and tests fixing * add sentencepiece to doc/examples reqs * leave sentencepiece on for now * style quality split hebert and fix pegasus * WIP Herbert fast * add sample_text_no_unicode and fix hebert tokenization * skip FSMT example test for now * fix style * fix fsmt in example tests * update following Lysandre and Sylvain's comments * Update src/transformers/testing_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/testing_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 09 Oct, 2020 1 commit
-
-
Doug Blank authored
* Import integration libraries first * isort and black happiness * flake8 happiness * Add a test * Black reformat * Ignore import order in tests * A heavy-handed method of disabling comet for tests * Remove comet_ml tests * Run black on setup.py
-