1. 30 Dec, 2021 3 commits
  2. 28 Dec, 2021 4 commits
  3. 27 Dec, 2021 3 commits
    • [doc] :obj: hunt (#14954) · e13f72fb
      Stas Bekman authored
      * redo sans examples
      
      * style
      e13f72fb
    • Add `ElectraForCausalLM` -> Enable Electra encoder-decoder model (#14729) · 501307b5
      Daniel Stancl authored
      * Add ElectraForCausalLM and cover some basic tests & need to fix a few tests
      
      * Fix bugs
      
      * make style
      
      * make fix-copies
      
      * Update doc
      
      * Change docstring to markdown format
      
      * Remove redundant update_keys_to_ignore
      501307b5
    • ChunkPipeline (batch_size enabled on `zero-cls` and `qa` pipelines) (#14225) · b058490c
      Nicolas Patry authored
      
      
      * Pipeline chunks.
      
      * Batching for Chunking pipelines?
      
      * Batching for `question-answering` and `zero-shot-cls`.
      
      * Fixing for FNet.
      
      * Making ASR a chunk pipeline.
      
      * Chunking ASR API.
      
      * doc style.
      
      * Fixing ASR test.
      
      * Fixing QA error (p_mask, padding is 1, not 0).
      
      * Enable both vad and simple chunking.
      
      * Max length for vad.
      
      * remove inference mode, crashing on s2t.
      
      * Revert ChunkPipeline for ASR pipeline.
      
      Too many knobs for simple integration within the pipeline; better to stick to external convenience functions instead: more control to be had, a simpler pipeline, and easier to replace with other things later.
      
      * Drop necessity for PT for these.
      
      * Enabling generators.
      
      * Add mic + cleanup.
      
      * Typo.
      
      * Typo2.
      
      * Remove ASR work, it does not belong in this PR anymore.
      
      * Update src/transformers/pipelines/pt_utils.py
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * Update src/transformers/pipelines/zero_shot_classification.py
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * Adding many comments.
      
      * Doc quality.
      
      * `hidden_states` handling.
      
      * Adding doc.
      
      * Bad rebase.
      
      * Autofixing docs.
      
      * Fixing CRITICAL bug in the new Zerocls pipeline.
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      b058490c
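      The chunk-pipeline idea above — one input expands into several model passes whose results are merged back into one answer — can be sketched as follows. This is a toy illustration, not the transformers API; all function names here are hypothetical.

```python
def chunk_pipeline(inputs, preprocess, forward, postprocess):
    """Run a pipeline where preprocess may yield several chunks per input.

    Each chunk goes through the model separately; postprocess merges the
    per-chunk results back into one answer per original input.
    """
    results = []
    for item in inputs:
        chunk_outputs = [forward(chunk) for chunk in preprocess(item)]
        results.append(postprocess(chunk_outputs))
    return results

# Toy zero-shot-style example: one (text, labels) input becomes one
# chunk per candidate label, scored independently and merged at the end.
def preprocess(item):
    text, labels = item
    for label in labels:
        yield (text, label)

def forward(chunk):
    text, label = chunk
    # Stand-in "model": score is 1.0 if the label appears in the text.
    return {"label": label, "score": 1.0 if label in text else 0.0}

def postprocess(chunk_outputs):
    return max(chunk_outputs, key=lambda o: o["score"])["label"]

print(chunk_pipeline([("the movie was great", ["movie", "sports"])],
                     preprocess, forward, postprocess))
# → ['movie']
```

      Because preprocess is a generator, chunks from several inputs can also be flattened into batches of size batch_size before forward — which is what enabling batching on the `zero-cls` and `qa` pipelines amounts to.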
  4. 23 Dec, 2021 7 commits
    • Better logic for getting tokenizer config in AutoTokenizer (#14906) · 676643c6
      Sylvain Gugger authored
      * Better logic for getting tokenizer config in AutoTokenizer
      
      * Remove needless import
      
      * Remove debug statement
      
      * Address review comments
      676643c6
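      The resolution order this commit title refers to can be sketched roughly as below — a simplified stand-in with plain dicts, not the actual transformers code: prefer an explicit class in tokenizer_config.json, then one in the model config, then fall back to a model_type mapping.

```python
def resolve_tokenizer_class(tokenizer_config, model_config, mapping):
    """Pick a tokenizer class name by priority (all inputs are plain dicts):
    1. tokenizer_config.json's own "tokenizer_class" entry,
    2. the model config's "tokenizer_class" entry,
    3. a model_type -> tokenizer-class mapping."""
    if tokenizer_config.get("tokenizer_class"):
        return tokenizer_config["tokenizer_class"]
    if model_config.get("tokenizer_class"):
        return model_config["tokenizer_class"]
    return mapping.get(model_config.get("model_type"))

print(resolve_tokenizer_class({}, {"model_type": "bert"},
                              {"bert": "BertTokenizer"}))
# → BertTokenizer
```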
    • Fix failing GPU trainer tests (#14903) · f566c6e3
      Sylvain Gugger authored
      * Fix failing GPU trainer tests
      
      * Remove print statements
      f566c6e3
    • [Generate] Remove attention_mask and integrate model_main_input_name (#14856) · fe4197ab
      Patrick von Platen authored
      * up
      
      * save
      
      * correct
      
      * up
      
      * correct more
      
      * up
      
      * up
      
      * up
      
      * up
      
      * up
      
      * correct
      
      * fix tf
      
      * fix
      
      * remove tokenizer
      fe4197ab
    • Anton Lozhkov · ee55ea69
    • Add TFCLIPModel (#13967) · 8f2cc1c3
      Yih-Dar authored
      
      
      * Start the work for TFCLIPModel
      
      * Convert to TF code (TODO: loss + doc)
      
      * Clean up
      
      * Fix pooled_output for TFCLIPTextTransformer - using tf.gather_nd
      
      * assert -> raise error
      
      * Expose TFCLIPModel
      
      * Deal with dummy_inputs
      
      * Add tests
      
      * Fix all tests. TODO: manual check weight loading + add more comments
      
      * Fix pt tf equivalence test
      
      * fixes
      
      * update TFCLIPVisionEmbeddings's Conv2D
      
      * Fix loss + overwrite test_pt_tf_model_equivalence from common
      
      * Add a comment about the change about MainLayer in test_keras_save_load
      
      * Set return_loss=True in TFCLIPModelTester + make tests pass
      
      * overwrite test_pt_tf_model_equivalence from tf common
      
      * fix base_model_prefix
      
      * Fix examples
      
      * remove unused
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * apply review suggestions
      
      * change self.pre_layrnorm to self.pre_layernorm
      
      * apply more review suggestions
      
      * return attention probs before dropout (to align with PT)
      
      * fix weight init
      
      * fix
      
      * build doc
      
      * fix missing doc
      
      * fix for test
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      8f2cc1c3
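      The pooled_output fix above mentions tf.gather_nd; the underlying operation — picking, for each sequence in the batch, the hidden state at one per-example position (e.g. the EOS token) — can be illustrated in plain Python. This is a hedged sketch of the gather semantics, not the TFCLIP code.

```python
def pool_at_positions(hidden_states, positions):
    """hidden_states: [batch][seq_len][dim] nested lists;
    positions: one token index per example (e.g. its EOS position).
    Returns one vector per example, as tf.gather_nd would with
    indices [[i, positions[i]] for each i]."""
    return [hidden_states[i][pos] for i, pos in enumerate(positions)]

states = [[[1, 1], [2, 2], [3, 3]],   # example 0, EOS at index 2
          [[4, 4], [5, 5], [6, 6]]]   # example 1, EOS at index 1
print(pool_at_positions(states, [2, 1]))
# → [[3, 3], [5, 5]]
```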
    • Add ONNX support for MarianMT models (#14586) · 6b655cc6
      lewtun authored
      * First commit to add MarianMT to ONNX
      
      * Now MarianModel.forward() automatically generates decoder_input_ids, like BartModel.forward()
      
      * Adjusted MarianOnnxConfig.inputs and outputs to work with seq2seq-lm feature
      
      * Style fix
      
      * Added support for other features for already supported models
      
      * Partial support for causal and seq2seq models
      
      * Partial support for causal and seq2seq models
      
      * Add default task for MarianMT ONNX
      
      * Remove automatic creation of decoder_input_ids
      
      * Extend inputs and outputs for MarianMT ONNX config
      
      * Add MarianMT to ONNX unit tests
      
      * Refactor
      
      * OnnxSeq2SeqConfigWithPast to support seq2seq models
      
      * Parameterized the onnx tests
      
      * Restored run_mlm.py
      
      * Restored run_mlm.py
      
      * [WIP] BART update
      
      * BART and MBART
      
      * Add past_key_values and fix dummy decoder inputs
      
      Using a sequence length of 1 in generate_dummy_outputs() produces large discrepancies, presumably due to some hidden optimisations.
      
      * Refactor MarianOnnxConfig to remove custom past_key_values logic
      
      * Fix quality
      
      * Revert "Revert "Added support for other features for already supported models (#14358)" (#14679)"
      
      This reverts commit 0f4e39c5.
      
      * is_torch_available test to avoid failing imports
      
      * sorting parameterize parameters to solve ERROR gw0 gw1
      
      * tests fix
      
      * tests fix
      
      * GPT2 with past fix
      
      * Fixed stateful class attribute change that was breaking things when converting multiple models sequentially
      
      * Removed onnx file
      
      * Refactor Marian export to account for base changes
      
      * Fix copies
      
      * Implemented suggestions
      
      * Extend support for causal LM
      
      * Revert "Revert "Added support for other features for already supported models (#14358)" (#14679)"
      
      This reverts commit 0f4e39c5.
      
      * is_torch_available test to avoid failing imports
      
      * sorting parameterize parameters to solve ERROR gw0 gw1
      
      * tests fix
      
      * tests fix
      
      * GPT2 with past fix
      
      * Fixed stateful class attribute change that was breaking things when converting multiple models sequentially
      
      * Removed onnx file
      
      * Implemented suggestions
      
      * Fixed __init__ to resolve conflict with master
      
      * Revert "Revert "Added support for other features for already supported models (#14358)" (#14679)"
      
      This reverts commit 0f4e39c5.
      
      * is_torch_available test to avoid failing imports
      
      * sorting parameterize parameters to solve ERROR gw0 gw1
      
      * tests fix
      
      * tests fix
      
      * GPT2 with past fix
      
      * Fixed stateful class attribute change that was breaking things when converting multiple models sequentially
      
      * Removed onnx file
      
      * Implemented suggestions
      
      * Fixed __init__ to resolve conflict with master
      
      * Remove commented import
      
      * Remove ONNX model
      
      * Remove redundant class method
      
      * Tidy up imports
      
      * Fix quality
      
      * Refactor dummy input function
      
      * Add copied from statements to Marian config functions
      
      * Remove false copied from comments
      
      * Fix copy from comment
      Co-authored-by: Massimiliano Bruni <massimiliano.bruni@hcl.com>
      Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
      6b655cc6
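      The automatic decoder_input_ids generation mentioned early in this commit amounts to shifting the target tokens one position to the right and prepending the decoder start token. A minimal sketch of that idea (mirroring the concept, not the exact transformers helper, which also handles padding of ignored label positions):

```python
def shift_tokens_right(input_ids, decoder_start_token_id):
    """Build decoder_input_ids from labels: drop the last token of each
    sequence and prepend the decoder start token, so the decoder predicts
    token t from tokens < t (teacher forcing)."""
    return [[decoder_start_token_id] + row[:-1] for row in input_ids]

# Labels [10, 11, 12, eos] with start token 2 become [2, 10, 11, 12].
print(shift_tokens_right([[10, 11, 12, 2]], decoder_start_token_id=2))
# → [[2, 10, 11, 12]]
```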
    • Henrik Holm
  5. 22 Dec, 2021 3 commits
    • Onnx enable tasks for supported models (part 2) (#14700) · 13504dcb
      Michael Benayoun authored
      * Revert "Revert "Added support for other features for already supported models (#14358)" (#14679)"
      
      This reverts commit 0f4e39c5.
      
      * is_torch_available test to avoid failing imports
      
      * sorting parameterize parameters to solve ERROR gw0 gw1
      
      * tests fix
      
      * tests fix
      
      * GPT2 with past fix
      
      * Fixed stateful class attribute change that was breaking things when converting multiple models sequentially
      
      * Removed onnx file
      
      * Implemented suggestions
      
      * Fixed __init__ to resolve conflict with master
      
      * Remove commented import
      13504dcb
    • Feature/fix slow test in mluke (#14749) · 824fd44f
      Ryokan RI authored
      * make MLukeTokenizerTest fast
      
      * make LukeTokenizerTest fast
      
      * add entry to _toctree.yaml
      824fd44f
    • update the arguments `add_prefix_space` and `trim_offsets` in `backend_tokenizer.post_processor` of `RobertaTokenizerFast` (#14752) · c94c1b89
      SaulLu authored
      
      * add tests
      
      * change post-processor, pre-tokenizer and decoder (can't update decoder)
      
      * update test (remove decoder which doesn't depend on trim and add_prefix)
      
      * just update the post_processor
      
      * fix change
      
      * `trim_offsets` has no influence on `pre_tokenizer`
      
      * remove a test that need some input from the `tokenizers` lib maintainers
      
      * format
      
      * add new test offsets roberta
      
      * polish comments
      c94c1b89
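      For context on `trim_offsets`: byte-level pre-tokenization attaches the leading space to the token that follows it, and `trim_offsets=True` shrinks the reported character offsets so they exclude that whitespace. The effect can be illustrated in plain Python — an illustration of the behaviour, not the tokenizers implementation:

```python
def trim_offset(text, start, end):
    """Shrink the (start, end) character span so it no longer covers
    leading/trailing whitespace -- what trim_offsets=True does to
    byte-level token offsets."""
    while start < end and text[start].isspace():
        start += 1
    while end > start and text[end - 1].isspace():
        end -= 1
    return start, end

text = "hello world"
# The byte-level token "Ġworld" covers chars 5..11, including the space.
print(trim_offset(text, 5, 11))
# → (6, 11)
```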
  6. 21 Dec, 2021 2 commits
  7. 20 Dec, 2021 6 commits
  8. 17 Dec, 2021 5 commits
  9. 16 Dec, 2021 4 commits
  10. 15 Dec, 2021 1 commit
    • TF model cards (#14720) · 48d48276
      Matt authored
      * Initial commit for Keras model cards
      
      * Revert accidental change
      
      * make style
      
      * make style
      
      * make style
      
      * Fix PR comments
      
      * Move repo creation to __init__
      
      * Fixes to README.md creation
      
      * Partial progress for proper card creation on `push_to_hub`
      
      * Proper card creation from `push_to_hub` plus fixes for malformed model cards
      
      * Fixes for model card creation outside the callback
      
      * Adding a model card creation test
      
      * Putting the model card creation test in the right file.
      Good job, Matt.
      
      * make style
      
      * Fix model card test temp dir usage
      
      * Fix model card creation when no optimizer present
      
      * Fixes for when training history not present
      
      * Fix accidental edit to test_modeling_common
      48d48276
  11. 14 Dec, 2021 2 commits
    • Adding support for multiple mask tokens. (#14716) · e7ed7ffd
      Nicolas Patry authored
      * Adding support for multiple mask tokens.
      
      - Original implem: https://github.com/huggingface/transformers/pull/10222
      
      Co-authored-by: njafer <naveen.jafer@oracle.com>
      
      * In order to accommodate optionally multimodal models like Perceiver
      
      we add information to the tasks to specify tasks where we know for sure
      if we need the tokenizer/feature_extractor or not.
      
      * Adding info in the documentation about multi masks.
      
      + marked as experimental.
      
      * Add a copy() to prevent overriding the same tensor over and over.
      
      * Fixup.
      
      * Adding small test for multi mask with real values.
      Co-authored-by: njafer <naveen.jafer@oracle.com>
      e7ed7ffd
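      The multi-mask support (marked experimental in the docs) fills each mask position from that position's own scores rather than decoding the masks jointly. A toy sketch of that per-position top-1 decoding — hypothetical helper, not the pipeline's actual code:

```python
def fill_masks(tokens, mask_token, scores_per_mask, vocab):
    """Replace each mask token with the argmax of that position's score
    vector. Positions are filled independently -- joint consistency
    between the masks is not enforced."""
    scores = iter(scores_per_mask)
    out = []
    for tok in tokens:
        if tok == mask_token:
            s = next(scores)
            out.append(vocab[max(range(len(s)), key=s.__getitem__)])
        else:
            out.append(tok)
    return out

vocab = ["paris", "france", "london"]
print(fill_masks(["[MASK]", "is", "in", "[MASK]"], "[MASK]",
                 [[0.9, 0.05, 0.05], [0.1, 0.8, 0.1]], vocab))
# → ['paris', 'is', 'in', 'france']
```

      The copy() mentioned above matters here: without copying, filling the second mask would mutate the same tensor already used for the first mask's candidates.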
    • Fixing tests for Perceiver (#14739) · 546a91ab
      Nicolas Patry authored
      * Adding some slow test to check for perceiver at least from a high level.
      
      * Re-enabling fast tests for Perceiver ImageClassification.
      
      * Perceiver might try to run some text-only pipelines without a Tokenizer (no Fast version exists) but with a FeatureExtractor.
      
      * Oops.
      
      * Adding a comment for `update_config_with_model_class`.
      
      * Remove `model_architecture` to get `tiny_config`.
      
      * Finalize rebase.
      
      * Smarter way to handle undefined FastTokenizer.
      
      * Remove old code.
      
      * Addressing some nits.
      
      * Don't instantiate `None`.
      546a91ab