Commits · ba8c4d0ac04acfcdbdeaed954f698d6d5ec3e532 · chenpangpang / transformers

18 Oct, 2020 1 commit

[Dependencies|tokenizers] Make both SentencePiece and Tokenizers optional dependencies (#7659) · ba8c4d0a

Thomas Wolf authored Oct 18, 2020

* splitting fast and slow tokenizers [WIP]

* [WIP] splitting sentencepiece and tokenizers dependencies

* update dummy objects

* add name_or_path to models and tokenizers

* prefix added to file names

* prefix

* styling + quality

* splitting all the tokenizer files - sorting sentencepiece based ones

* update tokenizer version up to 0.9.0

* remove hard dependency on sentencepiece 🎉

* and removed hard dependency on tokenizers 🎉



* update conversion script

* update missing models

* fixing tests

* move test_tokenization_fast to main tokenization tests - fix bugs

* bump up tokenizers

* fix bert_generation

* update and fix several tokenizers

* keep sentencepiece in deps for now

* fix funnel and deberta tests

* fix fsmt

* fix marian tests

* fix layoutlm

* fix squeezebert and gpt2

* fix T5 tokenization

* fix xlnet tests

* style

* fix mbart

* bump up tokenizers to 0.9.2

* fix model tests

* fix tf models

* fix seq2seq examples

* fix tests without sentencepiece

* fix slow => fast conversion without sentencepiece

* update auto and bert generation tests

* fix mbart tests

* fix auto and common test without tokenizers

* fix tests without tokenizers

* clean up tests and lighten them when tokenizers + sentencepiece are both off

* style quality and tests fixing

* add sentencepiece to doc/examples reqs

* leave sentencepiece on for now

* style quality, split herbert and fix pegasus

* WIP Herbert fast

* add sample_text_no_unicode and fix herbert tokenization

* skip FSMT example test for now

* fix style

* fix fsmt in example tests

* update following Lysandre and Sylvain's comments

* Update src/transformers/testing_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/testing_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

ba8c4d0a
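
The commit above makes SentencePiece and Tokenizers optional and mentions updating "dummy objects". A minimal sketch of that general pattern follows, assuming a top-level package check and a stand-in class that only fails when it is instantiated. The helper names `is_available` and `make_dummy` and the class name `ExampleSentencePieceTokenizer` are hypothetical and are not taken from the repository.

```python
import importlib.util


def is_available(package: str) -> bool:
    """Return True if a top-level package can be found, without importing it."""
    return importlib.util.find_spec(package) is not None


def make_dummy(name: str, package: str):
    """Build a stand-in class that fails at instantiation, not at import time."""

    def __init__(self, *args, **kwargs):
        raise ImportError(f"{name} requires the `{package}` package, which is not installed.")

    return type(name, (), {"__init__": __init__})


# Hypothetical class name, for illustration only.
if is_available("sentencepiece"):
    backend_status = "sentencepiece found: the real tokenizer could be imported here"
else:
    ExampleSentencePieceTokenizer = make_dummy("ExampleSentencePieceTokenizer", "sentencepiece")
    backend_status = "sentencepiece missing: a dummy object stands in for the tokenizer"

print(backend_status)
```

With this shape, importing the top-level package still succeeds when the optional dependency is absent, and the error only surfaces if the user actually tries to use the missing feature.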

17 Aug, 2020 1 commit

Fix flaky ONNX tests (#6531) · b41cc0b8

Funtowicz Morgan authored Aug 17, 2020

b41cc0b8

29 Jul, 2020 1 commit

Added capability to quantize a model while exporting through ONNX. (#6089) · 6c002853

Funtowicz Morgan authored Jul 29, 2020



* Added capability to quantize a model while exporting through ONNX.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

We do not support multiple extensions
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Reformat files
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* More quality
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Ensure test_generate_identified_name compares the same object types
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Added documentation everywhere on ONNX exporter
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Use pathlib.Path instead of plain-old string
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Use f-string everywhere
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Use the correct parameters for black formatting
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Use Python 3 super() style.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Use packaging.version to ensure the installed onnxruntime version matches requirements
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Fixing imports sorting order.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Missing raise(s)
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Added quantization documentation
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Fix some spelling.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Fix bad list header format
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

6c002853

01 Jul, 2020 1 commit

Move tests/utils.py -> transformers/testing_utils.py (#5350) · 13deb95a

Sam Shleifer authored Jul 01, 2020

13deb95a

01 Jun, 2020 1 commit

Fix onnx export input names order (#4641) · ec62b7d9

Rens authored Jun 01, 2020

* pass on tokenizer to pipeline

* order input names when converting to onnx

* update style

* remove unused imports

* ordered inputs list needs to be mutable

* add test custom bert model

* remove unused imports

ec62b7d9
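
The commit above orders input names before ONNX export. As a rough sketch of why that matters: positional export arguments have to follow the parameter order of the model's forward function, not the order in which the tokenizer happens to return them. The function `order_inputs` and the toy `example_forward` below are hypothetical illustrations, not the code from the conversion script.

```python
import inspect


def order_inputs(forward, tokens: dict):
    """Order tokenizer outputs to match the parameter order of `forward`.

    Stops at the first parameter with no matching input, since inputs
    after a gap could not be passed positionally.
    """
    ordered_names, ordered_values = [], []
    for name in inspect.signature(forward).parameters:
        if name not in tokens:
            break
        ordered_names.append(name)
        ordered_values.append(tokens[name])
    # Names stay a mutable list; values are frozen into a tuple for the call.
    return ordered_names, tuple(ordered_values)


def example_forward(input_ids, attention_mask=None, token_type_ids=None, position_ids=None):
    """Toy stand-in for a model's forward method (assumed signature)."""
    return input_ids


tokens = {"token_type_ids": [0, 0], "input_ids": [101, 102], "attention_mask": [1, 1]}
names, values = order_inputs(example_forward, tokens)
print(names)  # ['input_ids', 'attention_mask', 'token_type_ids']
```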

18 May, 2020 1 commit

Tag onnx export tests as slow (#4432) · 31c799a0

Funtowicz Morgan authored May 18, 2020

31c799a0

14 May, 2020 1 commit

Conversion script to export transformers models to ONNX IR. (#4253) · db0076a9

Funtowicz Morgan authored May 14, 2020

* Added generic ONNX conversion script for PyTorch model.

* WIP initial TF support.

* TensorFlow/Keras ONNX export working.

* Print framework version info

* Add the option to check that the model loads correctly in ONNX Runtime.

* Remove quantization option.

* Specify ONNX opset version when exporting.

* Formatting.

* Remove unused imports.

* Make functions more generally reusable from other parts of the code.

* isort happy.

* flake happy

* Export only feature-extraction for now

* Correctly check inputs order / filter before export.

* Removed task variable

* Fix invalid args call in load_graph_from_args.

* Fix invalid args call in convert.

* Fix invalid args call in infer_shapes.

* Raise exception and catch in caller function instead of exit.

* Add 04-onnx-export.ipynb notebook

* More WIP on the notebook

* Remove unused imports

* Simplify & remove unused constants.

* Export with constant_folding in PyTorch

* Let's try to put function args in the right order this time ...

* Disable external_data_format temporarily

* ONNX notebook draft ready.

* Updated notebooks charts + wording

* Correct error while exporting last chart in notebook.

* Addressing @LysandreJik's comment.

* Set ONNX opset to 11 as default value.

* Set opset param mandatory

* Added ONNX export unittests

* Quality.

* flake8 happy

* Add keras2onnx dependency on extras["tf"]

* Pin keras2onnx on github master to v1.6.5

* Second attempt.

* Third attempt.

* Use the right repo URL this time ...

* Do the same for onnxconverter-common

* Updated keras2onnx and onnxconverter-common to 1.7.0 to support TF2.2

* Correct commit hash.

* Addressing PR review: Optimizations are enabled by default.

* Addressing PR review: small changes in the notebook

* setup.py comment about keras2onnx versioning.

db0076a9