- 18 Oct, 2020 1 commit
-
-
Thomas Wolf authored
* splitting fast and slow tokenizers [WIP] * [WIP] splitting sentencepiece and tokenizers dependencies * update dummy objects * add name_or_path to models and tokenizers * prefix added to file names * prefix * styling + quality * spliting all the tokenizer files - sorting sentencepiece based ones * update tokenizer version up to 0.9.0 * remove hard dependency on sentencepiece
馃帀 * and removed hard dependency on tokenizers馃帀 * update conversion script * update missing models * fixing tests * move test_tokenization_fast to main tokenization tests - fix bugs * bump up tokenizers * fix bert_generation * update ad fix several tokenizers * keep sentencepiece in deps for now * fix funnel and deberta tests * fix fsmt * fix marian tests * fix layoutlm * fix squeezebert and gpt2 * fix T5 tokenization * fix xlnet tests * style * fix mbart * bump up tokenizers to 0.9.2 * fix model tests * fix tf models * fix seq2seq examples * fix tests without sentencepiece * fix slow => fast conversion without sentencepiece * update auto and bert generation tests * fix mbart tests * fix auto and common test without tokenizers * fix tests without tokenizers * clean up tests lighten up when tokenizers + sentencepiece are both off * style quality and tests fixing * add sentencepiece to doc/examples reqs * leave sentencepiece on for now * style quality split hebert and fix pegasus * WIP Herbert fast * add sample_text_no_unicode and fix hebert tokenization * skip FSMT example test for now * fix style * fix fsmt in example tests * update following Lysandre and Sylvain's comments * Update src/transformers/testing_utils.py Co-authored-by:Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/testing_utils.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 15 Oct, 2020 1 commit
-
-
Stas Bekman authored
in `tests/test_utils_check_copies.py` I was getting intermittently: ``` utils/check_copies.py:52 /mnt/nvme1/code/transformers-comet/utils/check_copies.py:52: DeprecationWarning: invalid escape sequence \s while line_index < len(lines) and re.search(f"^{indent}(class|def)\s+{name}", lines[line_index]) is None: ``` So this should fix it.
-
- 09 Oct, 2020 3 commits
-
-
Sylvain Gugger authored
-
sgugger authored
-
- 05 Oct, 2020 2 commits
-
-
Sylvain Gugger authored
* Check and update model list in index.rst automatically * Check and update model list in index.rst automatically * Adapt template
-
Sylvain Gugger authored
* PoC on RAG * Format class name/obj name * Better name in message * PoC on one TF model * Add PyTorch and TF dummy objects + script * Treat scikit-learn * Bad copy pastes * Typo
-
- 30 Sep, 2020 1 commit
-
-
Sylvain Gugger authored
* Get a better error when check_copies fails * Fix tests
-
- 28 Sep, 2020 1 commit
-
-
Sylvain Gugger authored
Do not merge before Monday
-
- 25 Sep, 2020 1 commit
-
-
Sylvain Gugger authored
-
- 24 Sep, 2020 2 commits
-
-
Sylvain Gugger authored
* Remove mentions of RAG from the docs * Deactivate check
-
Sylvain Gugger authored
* Check decorator order * Adapt for parametrized decorators * Fix typos
-
- 22 Sep, 2020 2 commits
-
-
Sylvain Gugger authored
-
Sylvain Gugger authored
* Copy code from Bert to Roberta and add safeguard script * Fix docstring * Comment code * Formatting * Update src/transformers/modeling_roberta.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Add test and fix bugs * Fix style and make new comand Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
- 10 Sep, 2020 1 commit
-
-
Patrick von Platen authored
* add conversion script * improve conversion script * make style * add tryout files * fix * update * add causal bert * better names * add tokenizer file as well * finish causal_bert * fix small bugs * improve generate * change naming * renaming * renaming * renaming * remove leftover files * clean files * add fix tokenizer * finalize * correct slow test * update docs * small fixes * fix link * adapt check repo * apply sams and sylvains recommendations * fix import * implement Lysandres recommendations * fix logger warn
-
- 08 Sep, 2020 1 commit
-
-
Sylvain Gugger authored
* Initial model * Fix upsampling * Add special cls token id and test * Formatting * Test and fist FunnelTokenizerFast * Common tests * Fix the check_repo script and document Funnel * Doc fixes * Add all models * Write doc * Fix test * Initial model * Fix upsampling * Add special cls token id and test * Formatting * Test and fist FunnelTokenizerFast * Common tests * Fix the check_repo script and document Funnel * Doc fixes * Add all models * Write doc * Fix test * Fix copyright * Forgot some layers can be repeated * Apply suggestions from code review Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/modeling_funnel.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Address review comments * Update src/transformers/modeling_funnel.py Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * Address review comments * Update src/transformers/modeling_funnel.py Co-authored-by:
Sam Shleifer <sshleifer@gmail.com> * Slow integration test * Make small integration test * Formatting * Add checkpoint and separate classification head * Formatting * Expand list, fix link and add in pretrained models * Styling * Add the model in all summaries * Typo fixes Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
Sam Shleifer <sshleifer@gmail.com>
-
- 26 Aug, 2020 1 commit
-
-
Lysandre authored
-
- 14 Aug, 2020 1 commit
-
-
Suraj Patil authored
* add MBartForConditionalGeneration * style * rebase and fixes * add mbart test in TEST_FILES_WITH_NO_COMMON_TESTS * fix docs * don't ignore mbart * doc * fix mbart fairseq link * put mbart before bart * apply doc suggestions
-
- 12 Aug, 2020 1 commit
-
-
Sylvain Gugger authored
-
- 10 Aug, 2020 1 commit
-
-
Lysandre Debut authored
* TFAlbertFor{TokenClassification, MultipleChoice} * Patch models * BERT and TF BERT info s * Update check_repo
-
- 07 Aug, 2020 1 commit
-
-
Sylvain Gugger authored
* Add a script to check all models are tested and documented * Apply suggestions from code review Co-authored-by:
Kevin Canwen Xu <canwenxu@126.com> * Address comments Co-authored-by:
Kevin Canwen Xu <canwenxu@126.com>
-
- 02 Mar, 2020 1 commit
-
-
Julien Chaumond authored
* debug env * Restrict TF GPU memory * Fixup * One more test * rm debug logs * Fixup
-
- 06 Jan, 2020 2 commits
-
-
alberduris authored
-
alberduris authored
-
- 22 Dec, 2019 3 commits
-
-
Aymeric Augustin authored
This change is mostly autogenerated with: $ python -m autoflake --in-place --recursive examples templates transformers utils hubconf.py setup.py I made minor changes in the generated diff. -
Aymeric Augustin authored
Fixes flake8 warning W291 (x224).
-
Aymeric Augustin authored
This is the result of: $ isort --recursive examples templates transformers utils hubconf.py setup.py
-
- 21 Dec, 2019 1 commit
-
-
Aymeric Augustin authored
This is the result of: $ black --line-length 119 examples templates transformers utils hubconf.py setup.py There's a lot of fairly long lines in the project. As a consequence, I'm picking the longest widely accepted line length, 119 characters. This is also Thomas' preference, because it allows for explicit variable names, to make the code easier to understand.
-
- 06 Dec, 2019 1 commit
-
-
R茅mi Louf authored
We add a script and a CI workflow to check that all download links present in the source code are valid.
-
- 03 Dec, 2019 1 commit
-
-
Juha Kiili authored
-
- 29 Nov, 2019 1 commit
-
-
Juha Kiili authored
-
- 21 Nov, 2019 1 commit
-
-
Aarni Koskela authored
Original source: https://github.com/kamalkraj/ALBERT-TF2.0/blob/fa90194e5fe729dbb19f32ac29c8d6d6372c0f93/download_glue_data.py Original license: https://github.com/kamalkraj/ALBERT-TF2.0/blob/fa90194e5fe729dbb19f32ac29c8d6d6372c0f93/LICENSE (Apache-2.0)
-