"vscode:/vscode.git/clone" did not exist on "986ac03e374a00a52cee98c8ac14fb1ba6b66610"
  1. 09 Jul, 2021 2 commits
    • This will reduce "Already borrowed error": (#12550) · cc12e1db
      Nicolas Patry authored
      * This will reduce "Already borrowed error":
      
      Original issue https://github.com/huggingface/tokenizers/issues/537
      
      
      
      The original issue is caused by transformers calling mutable functions
      on the Rust tokenizers many times.
      Rust needs to guarantee that only one agent has a mutable reference
      to memory at a given time (for many reasons which don't need explaining
      here). Usually, the Rust compiler can guarantee that this property
      holds at compile time.
      
      Unfortunately, Python cannot provide that guarantee, so PyO3, the
      bridge between Rust and Python used by `tokenizers`, replaces the
      compile-time guarantee with a dynamic one: if multiple agents try
      to take mutable borrows at the same time, the runtime raises
      "Already borrowed".
      
      The proposed fix here in transformers is simply to reduce the number
      of calls that actually need a mutable borrow. By reducing them,
      we reduce the risk of running into the "Already borrowed" error.
      The caveat is that we now add a call to read the current configuration of the
      `_tokenizer`, so in the worst case we make 2 calls instead of 1, and in the best case
      we make 1 call plus a Python comparison of a dict (which should be negligible).
      A minimal sketch of this pattern follows the commit entry below.
      
      * Adding a test.
      
      * trivial error :(.
      
      * Update tests/test_tokenization_fast.py
      Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>
      
      * Adding reference to original issues in the tests.
      
      * Update the tests with fast tokenizer.
      Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>
      cc12e1db
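
      To make the pattern above concrete, here is a minimal Python sketch. It is
      illustrative only, not the actual transformers code: the `truncation` getter and
      `enable_truncation` setter are stand-ins for whatever read/write accessors the
      fast tokenizer exposes.

      ```python
      # Minimal sketch of "read before you mutate" (illustrative, not the real code).
      def apply_truncation(fast_tokenizer, desired: dict) -> None:
          """Assumed interface: `fast_tokenizer.truncation` returns the current
          truncation settings as a dict (immutable borrow on the Rust side) and
          `enable_truncation(**kwargs)` changes them (mutable borrow)."""
          current = fast_tokenizer.truncation   # cheap read, no mutable borrow
          if current != desired:                # plain Python dict comparison
              # Only reached when something actually changes, so the mutable
              # borrow (and the chance of "Already borrowed") happens less often.
              fast_tokenizer.enable_truncation(**desired)
      ```

      Worst case this is one read plus one write instead of an unconditional write;
      best case it is a single read plus a dict comparison.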
    • Omar Sanseviero · 8fe836af
  2. 08 Jul, 2021 9 commits
    • [doc] fix broken ref (#12597) · ce111fee
      Stas Bekman authored
      ce111fee
    • [model.from_pretrained] raise exception early on failed load (#12574) · f0dde601
      Stas Bekman authored
      
      
      
      * [model.from_pretrained] raise exception early on failed load
      
      Currently, if loading pretrained weights fails in `from_pretrained`, we first print a whole bunch of success messages and then fail; this PR raises the exception first so the failure is not preceded by misleading messages (a sketch of the pattern follows this entry).
      
      * style
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      f0dde601
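
      A generic illustration of the reordering described above (assumed names and
      messages, not the actual `from_pretrained` implementation): collect the loading
      errors first and raise before any informational output is emitted.

      ```python
      import logging

      logger = logging.getLogger(__name__)

      # Sketch only: fail fast on load errors *before* printing the usual
      # missing/unexpected-keys messages, so a failed load is not preceded by
      # output that suggests everything went fine.
      def report_load_result(model_name, error_msgs, missing_keys, unexpected_keys):
          if error_msgs:
              raise RuntimeError(
                  f"Error(s) in loading state_dict for {model_name}:\n\t"
                  + "\n\t".join(error_msgs)
              )
          if missing_keys:
              logger.info(f"Some weights of {model_name} were newly initialized: {missing_keys}")
          if unexpected_keys:
              logger.info(f"Some checkpoint weights were not used by {model_name}: {unexpected_keys}")
      ```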
    • Fix MT5 init (#12591) · 75e63dbf
      Sylvain Gugger authored
      75e63dbf
    • Fixing the pipeline optimization by reindexing targets (V2) (#12330) · 4da568c1
      Nicolas Patry authored
      
      
      * Fixing the pipeline optimization by rescaling the logits first.
      
      * Add test for target equivalence
      Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
      4da568c1
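
      Reading the commit message, "rescaling the logits first" appears to mean
      normalizing over the full vocabulary before indexing the requested `targets`, so
      the scores returned for a restricted target set match the unrestricted ones. A
      small NumPy sketch of that idea (toy numbers, not the pipeline's actual code):

      ```python
      import numpy as np

      def softmax(x):
          e = np.exp(x - x.max())
          return e / e.sum()

      logits = np.array([2.0, 0.5, -1.0, 3.0])  # toy logits over a 4-token vocabulary
      target_ids = np.array([0, 3])             # ids of the requested `targets`

      # Normalize over the full vocabulary first, then index the targets:
      # these scores are identical to the unrestricted pipeline's scores.
      full_then_index = softmax(logits)[target_ids]

      # Indexing first and then normalizing re-scales over just the targets,
      # which gives different (inflated) values.
      index_then_softmax = softmax(logits[target_ids])

      print(full_then_index)     # approx. [0.25, 0.68] -- consistent with the full distribution
      print(index_then_softmax)  # sums to 1 over only the two targets
      ```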
    • [RFC] Laying down building stone for more flexible ONNX export capabilities (#11786) · 2aa3cd93
      Funtowicz Morgan authored
      
      
      * Laying down building stone for more flexible ONNX export capabilities
      
      * Ability to provide a map of config key to override before exporting.
      
      * Makes it possible to export BART with/without past keys.
      
      * Supports simple mathematical syntax for OnnxVariable.repeated
      
      * Effectively apply value override from onnx config for model
      
      * Supports export with additional features such as with-past for seq2seq
      
      * Store the output path directly in the args for uniform usage across.
      
      * Make BART_ONNX_CONFIG_* constants and fix imports.
      
      * Support BERT model.
      
      * Use tokenizer for more flexibility in defining the inputs of a model.
      
      * Add TODO as reminder to provide the batch/sequence_length as CLI args
      
      * Enable optimizations to be done on the model.
      
      * Enable GPT2 + past
      
      * Improve model validation with outputs containing nested structures
      
      * Enable Roberta
      
      * Enable Albert
      
      * Albert requires opset >= 12
      
      * BERT-like models require opset >= 12
      
      * Remove double printing.
      
      * Enable XLM-Roberta
      
      * Enable DistilBERT
      
      * Disable optimization by default
      
      * Fix missing setattr when applying optimizer_features
      
      * Add value field to OnnxVariable to define constant input (not from tokenizers)
      
      * Add T5 support.
      
      * Simplify model type retrieval
      
      * Example exporting token_classification pipeline for DistilBERT.
      
      * Refactoring to package `transformers.onnx`
      
      * Solve circular dependency & __main__
      
      * Remove unnecessary imports in `__init__`
      
      * Licences
      
      * Use @Narsil's suggestion to forward the model's configuration to the ONNXConfig to avoid interpolation.
      
      * Onnx export v2 fixes (#12388)
      
      * Tiny fixes
      Remove `convert_pytorch` from onnxruntime-less runtimes
      Correct reference to model
      
      * Style
      
      * Fix Copied from
      
      * LongFormer ONNX config.
      
      * Removed optimizations
      
      * Remove bad merge relics.
      
      * Remove unused constants.
      
      * Remove some deleted constants from imports.
      
      * Fix unittest to remove usage of PyTorch model for onnx.utils.
      
      * Fix distilbert export
      
      * Enable ONNX export test for supported model.
      
      * Style.
      
      * Fix lint.
      
      * Enable all supported default models.
      
      * GPT2 only has one output
      
      * Fix bad property name when overriding config.
      
      * Added unittests and docstrings.
      
      * Disable with_past tests for now.
      
      * Enable outputs validation for default export.
      
      * Remove graph opt lvls.
      
      * Last commit with on-going past commented.
      
      * Style.
      
      * Disabled `with_past` for now
      
      * Remove unused imports.
      
      * Remove framework argument
      
      * Remove TFPreTrainedModel reference
      
      * Add documentation
      
      * Add onnxruntime tests to CircleCI
      
      * Add test
      
      * Rename `convert_pytorch` to `export`
      
      * Use OrderedDict for dummy inputs
      
      * WIP Wav2Vec2
      
      * Revert "WIP Wav2Vec2"
      
      This reverts commit f665efb04c92525c3530e589029f0ae7afdf603e.
      
      * Style
      
      * Use OrderedDict for I/O
      
      * Style.
      
      * Specify OrderedDict documentation.
      
      * Style :)
      Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      2aa3cd93
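
      Since this commit introduces the `transformers.onnx` export path and mentions
      validating exported outputs, here is a hedged sketch of that kind of check,
      comparing ONNX Runtime results against the PyTorch model. The checkpoint name,
      the "model.onnx" path, and the tolerance are illustrative assumptions rather than
      values from the PR, and the export itself is assumed to have been run beforehand
      (the PR documents a `python -m transformers.onnx` entry point; check the shipped
      docs for the exact flags).

      ```python
      import numpy as np
      import onnxruntime
      import torch
      from transformers import AutoModel, AutoTokenizer

      # Illustrative validation of an exported graph against the source model.
      checkpoint = "distilbert-base-uncased"   # assumed checkpoint
      tokenizer = AutoTokenizer.from_pretrained(checkpoint)
      model = AutoModel.from_pretrained(checkpoint)

      inputs = tokenizer("Validating the ONNX export", return_tensors="pt")

      with torch.no_grad():
          reference = model(**inputs).last_hidden_state.numpy()

      session = onnxruntime.InferenceSession("model.onnx")  # assumed export location
      onnx_inputs = {name: tensor.numpy() for name, tensor in inputs.items()}
      onnx_output = session.run(None, onnx_inputs)[0]

      # Exported and original outputs will not match bit-for-bit; compare with a tolerance.
      assert np.allclose(reference, onnx_output, atol=1e-4), "ONNX and PyTorch outputs diverge"
      ```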
    • Sylvain Gugger · 6f1adc43
    • Init pickle (#12567) · 0a6b9048
      Sylvain Gugger authored
      * Try to pickle transformers
      
      * Deal with special objs better
      
      * Make picklable
      0a6b9048
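
      As a minimal illustration of the property this commit is after, a simple pickle
      round-trip on an object imported through the package's init; the specific class
      and values are arbitrary choices for the example, not the PR's test cases.

      ```python
      import pickle

      from transformers import BertConfig

      # Round-trip an object reachable through transformers' (lazy) __init__.
      config = BertConfig(hidden_size=128, num_attention_heads=4, num_hidden_layers=2)
      restored = pickle.loads(pickle.dumps(config))

      assert type(restored) is BertConfig
      assert restored.hidden_size == config.hidden_size
      ```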
    • raise exception when arguments to pipeline are incomplete (#12548) · b29c3945
      Hwijeen Ahn authored
      * raise exception when arguments are incomplete
      
      * change exception to runtime error
      b29c3945
  3. 07 Jul, 2021 12 commits
  4. 06 Jul, 2021 9 commits
  5. 05 Jul, 2021 8 commits