- 02 Nov, 2020 1 commit
-
-
Nicolas Patry authored
* Some work to fix the behaviour of DefaultArgumentHandler by removing it. * Fixing specific pipelines argument checking.
-
- 29 Oct, 2020 1 commit
-
-
Santiago Castro authored
* Fix typo: indinces -> indices * Fix some more * Fix some more * Fix some more * Fix CI
-
- 28 Oct, 2020 1 commit
-
-
Bram Vanroy authored
* Improve pipeline() docstrings * make style * Update wording for config
-
- 27 Oct, 2020 1 commit
-
-
Joe Davison authored
* add entailment dim argument * rename dim -> id * fix last name change, style * rm arg, auto-infer only * typo * rm superfluous import
-
- 26 Oct, 2020 1 commit
-
-
Sylvain Gugger authored
* Important files * Styling them all * Revert "Styling them all" This reverts commit 7d029395fdae8513b8281cbc2a6c239f8093503e. * Styling them for realsies * Fix syntax error * Fix benchmark_utils * More fixes * Fix modeling auto and script * Remove new line * Fixes * More fixes * Fix more files * Style * Add FSMT * More fixes * More fixes * More fixes * More fixes * Fixes * More fixes * More fixes * Last fixes * Make sphinx happy
-
- 23 Oct, 2020 1 commit
-
-
Thomas Wolf authored
[tests|tokenizers] Refactoring pipelines test backbone - Small tokenizers improvements - General tests speedups (#7970) * WIP refactoring pipeline tests - switching to fast tokenizers * fix dialog pipeline and fill-mask * refactoring pipeline tests backbone * make large tests slow * fix tests (tf Bart inactive for now) * fix doc... * clean up for merge * fixing tests - remove bart from summarization until there is TF * fix quality and RAG * Add new translation pipeline tests - fix JAX tests * only slow for dialog * Fixing the missing TF-BART imports in modeling_tf_auto * spin out pipeline tests in separate CI job * adding pipeline test to CI YAML * add slow pipeline tests * speed up tf and pt join test to avoid redoing all the standalone pt and tf tests * Update src/transformers/tokenization_utils_base.py Co-authored-by:
Sam Shleifer <sshleifer@gmail.com> * Update src/transformers/pipelines.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/pipelines.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/testing_utils.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add require_torch and require_tf in is_pt_tf_cross_test Co-authored-by:
Sam Shleifer <sshleifer@gmail.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
- 22 Oct, 2020 2 commits
-
-
Julien Chaumond authored
* FillMaskPipeline: support passing top_k on __call__ Also move from topk to top_k * migrate to new param name in tests * Review from @sgugger
-
Nicolas Patry authored
* Actually make the "translation", "translation_XX_to_YY" task behave correctly.
Background:
- Currently "translation_cn_to_ar" does not work (only 3 pairs are supported).
- Some models contain in their config the correct values for the (src, tgt) pair they can translate. It's usually just one pair, and we can infer it automatically from `model.config.task_specific_params`. If it's not defined, we can probably still load the TranslationPipeline.
Proposed fix:
- A simplified version of what could become a more general `parametrized` task. "translation" + (src, tgt) is what we need in the general case. For now we simply parse "translation_XX_to_YY". If more cases of parametrized tasks arise, we should preferably move to something closer to what `datasets` proposes, possibly a secondary argument `task_options` that matches what the task requires.
- Should be backward compatible in all cases; for instance, `pipeline(task="translation_en_to_de")` should work out of the box.
- Should provide a warning when a specific translation pair has been selected on behalf of the user using `model.config.task_specific_params`.
* Update src/transformers/pipelines.py
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
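The parsing step described in this commit ("translation_XX_to_YY" carrying a (src, tgt) pair) can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `parse_translation_task` and the regular expression are hypothetical names introduced here, and falling back to `model.config.task_specific_params` is left out.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern and helper: they only illustrate splitting a
# "translation_XX_to_YY" task string into its (src, tgt) language pair.
_TRANSLATION_TASK = re.compile(r"^translation_([a-zA-Z]+)_to_([a-zA-Z]+)$")


def parse_translation_task(task: str) -> Optional[Tuple[str, str]]:
    """Return (src, tgt) for 'translation_XX_to_YY'.

    Returns None for the bare 'translation' task, where the pair would
    have to be inferred elsewhere (for example from the model config).
    """
    if task == "translation":
        return None
    match = _TRANSLATION_TASK.match(task)
    if match is None:
        raise ValueError(f"Not a translation task: {task!r}")
    return match.group(1), match.group(2)
```

With this sketch, "translation_en_to_de" yields ("en", "de"), and a pair outside any fixed list, such as "translation_cn_to_ar", parses the same way.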
-
- 21 Oct, 2020 1 commit
-
-
Sam Shleifer authored
* half done * doc improvement * Cp test file * broken * broken test * undo some mess * ckpt * borked * Halfway * 6 passing * boom boom * Much progress but still 6 * boom boom * merged master * 10 passing * boom boom * Style * no t5 changes * 13 passing * Integration test failing, but not gibberish * Frustrated * Merged master * 4 fail * 4 fail * fix return_dict * boom boom * Still only 4 * prepare method * prepare method * before delete classif * Skip tests to avoid adding boilerplate * boom boom * fast tests passing * style * boom boom * Switch to supporting many input types * remove FIXMENORM * working * Fixed past_key_values/decoder_cached_states confusion * new broken test * Fix attention mask kwarg name * undo accidental * Style and reviewers * style * Docs and common tests * Cleaner assert messages * copy docs * style issues * Sphinx fix * Simplify caching logic * test does not require torch * copy _NoLayerEmbedTokens * Update src/transformers/modeling_tf_bart.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update tests/test_modeling_tf_bart.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/modeling_tf_bart.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/modeling_tf_bart.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/modeling_tf_bart.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Line length and dont document None * Add pipeline test coverage * assert msg * At parity * Assert messages * mark slow * Update compile test * back in init * Merge master * Fix tests Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
- 16 Oct, 2020 1 commit
-
-
Julien Chaumond authored
-
- 15 Oct, 2020 2 commits
-
-
Nicolas Patry authored
- TFAutoModelForCausalLM - TFAutoModelForMaskedLM - TFAutoModelForSeq2SeqLM as per deprecation warning. No tests as it simply removes current warnings from tests.
-
Nicolas Patry authored
* Improving Pipelines by defaulting to framework='tf' when pytorch seems unavailable. * Actually changing the default resolution order to account for model defaults. Adding new tests for each pipeline to check that pipeline(task) also works without manually specifying the framework.
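The fallback described above might look roughly like the sketch below. It is an assumption-laden illustration: `resolve_framework` and its boolean parameters are hypothetical, and the second part of the commit (accounting for model defaults in the resolution order) is not modelled. Only the "pt" and "tf" framework strings come from the library.

```python
from typing import Optional


def resolve_framework(
    torch_available: bool,
    tf_available: bool,
    requested: Optional[str] = None,
) -> str:
    """Pick a framework string, preferring an explicit request.

    Hypothetical helper: availability is passed in as flags so the
    decision logic can be read (and tested) in isolation.
    """
    if requested is not None:
        return requested  # the caller chose explicitly
    if torch_available:
        return "pt"
    if tf_available:
        return "tf"  # fall back when PyTorch seems unavailable
    raise RuntimeError("Neither PyTorch nor TensorFlow appears to be installed.")
```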
-
- 13 Oct, 2020 2 commits
-
-
Tiger authored
-
Lysandre Debut authored
* Do not softmax when num_labels==1 * Update src/transformers/pipelines.py Co-authored-by:
Funtowicz Morgan <mfuntowicz@users.noreply.github.com> Co-authored-by:
Funtowicz Morgan <mfuntowicz@users.noreply.github.com>
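For context on why softmax is skipped when num_labels==1: a softmax over a single logit always returns 1.0, so the score carries no information. The sketch below shows one way to handle that case. Using a sigmoid for the single-label branch is an assumption made here for illustration, and `scores_from_logits` is a hypothetical name, not a function from the commit.

```python
import math
from typing import List


def scores_from_logits(logits: List[float]) -> List[float]:
    """Turn raw classifier logits into scores.

    Assumption: with exactly one label, a sigmoid is applied instead of
    softmax, because softmax over one value is always 1.0.
    """
    if len(logits) == 1:
        return [1.0 / (1.0 + math.exp(-logits[0]))]
    # Standard softmax, shifted by the max for numerical stability.
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]
```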
-
- 18 Sep, 2020 1 commit
-
-
Yuta Hayashibe authored
-
- 17 Sep, 2020 1 commit
-
-
Sohee Yang authored
* Move 'from transformers' statements to relative imports in some files * Add python prompt symbols in front of the example codes * Reformat the code * Add one missing space Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 07 Sep, 2020 1 commit
-
-
Boris Dayma authored
* feat: allow padding_text for any generative model * docs(pipelines.py): correct typo * Update src/transformers/pipelines.py Co-authored-by:
Sam Shleifer <sshleifer@gmail.com> * feat: rename padding_text to prefix * fix: cannot tokenize empty text * fix: pass prefix arg to pipeline * test: add prefix to text-generation pipeline * style: fix style * style: clean code and variable name more explicit * set arg docstring to optional Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Sam Shleifer <sshleifer@gmail.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 02 Sep, 2020 1 commit
-
-
Suraj Patil authored
* add Text2TextGenerationPipeline * remove max length warning * remove comments * remove input_length * fix typo * add tests * use TFAutoModelForSeq2SeqLM * doc * typo * add the doc below TextGenerationPipeline * doc nit * style * delete comment
-
- 01 Sep, 2020 1 commit
-
-
Funtowicz Morgan authored
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
-
- 26 Aug, 2020 2 commits
-
-
Lysandre authored
-
Lysandre Debut authored
* Logging * Style * hf_logging > utils.logging * Address @thomwolf's comments * Update test * Update src/transformers/benchmark/benchmark_utils.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Revert bad change Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 25 Aug, 2020 1 commit
-
-
Funtowicz Morgan authored
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
-
- 24 Aug, 2020 1 commit
-
-
Sylvain Gugger authored
* Run new isort * More changes * Update CI, CONTRIBUTING and benchmarks
-
- 12 Aug, 2020 2 commits
-
-
Joe Davison authored
* add targets arg to fill-mask pipeline * add tests and more error handling * quality * update docstring
-
Sylvain Gugger authored
* allow using tokenizer.pad as a collate_fn in pytorch * allow using tokenizer.pad as a collate_fn in pytorch * Add documentation and tests * Make attention mask the right shape * Better test Co-authored-by:Thomas Wolf <thomwolf@users.noreply.github.com>
-
- 04 Aug, 2020 1 commit
-
-
Joe Davison authored
-
- 03 Aug, 2020 1 commit
-
-
Sylvain Gugger authored
* Init work on pipelines doc * Work in progress * Work in progress * Doc pipelines * Rm unwanted default * Apply suggestions from code review Lysandre comments Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
- 30 Jul, 2020 2 commits
-
-
guillaume-be authored
* initial commit for pipeline implementation Addition of input processing and history concatenation * Conversation pipeline tested and working for single & multiple conversation inputs * Added docstrings for dialogue pipeline * Addition of dialogue pipeline integration tests * Delete test_t5.py * Fixed max code length * Updated styling * Fixed test broken by formatting tools * Removed unused import * Added unit test for DialoguePipeline * Fixed Tensorflow compatibility * Fixed multi-framework support using framework flag * - Fixed docstring - Added `min_length_for_response` as an initialization parameter - Renamed `*args` to `conversations`, `conversations` being a `Conversation` or a `List[Conversation]` - Updated truncation to truncate entire segments of conversations, instead of cutting in the middle of a user/bot input * - renamed pipeline name from dialogue to conversational - removed hardcoded default value of 1000 and use config.max_length instead - added `append_response` and `set_history` methods to the Conversation class to avoid direct fields mutation - fixed bug in history truncation method * - Updated ConversationalPipeline to accept only active conversations (otherwise a ValueError is raised) * - Simplified input tensor conversion * - Updated attention_mask value for Tensorflow compatibility * - Updated last dialogue reference to conversational & fixed integration tests * Fixed conflict with master * Updates following review comments * Updated formatting * Added Conversation and ConversationalPipeline to the library __init__, addition of docstrings for Conversation, added both to the docs * Update src/transformers/pipelines.py Updated docstring following review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Sylvain Gugger authored
* Switch from return_tuple to return_dict * Fix test * [WIP] Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleC… (#5614) * Test TF Flaubert + Add {XLM, Flaubert}{TokenClassification, MultipleChoice} models and tests * AutoModels Tiny tweaks * Style * Final changes before merge * Re-order for simpler review * Final fixes * Addressing @sgugger's comments * Test MultipleChoice * Rework TF trainer (#6038) * Fully rework training/prediction loops * fix method name * Fix variable name * Fix property name * Fix scope * Fix method name * Fix tuple index * Fix tuple index * Fix indentation * Fix variable name * fix eval before log * Add drop remainder for test dataset * Fix step number + fix logging datetime * fix eval loss value * use global step instead of step + fix logging at step 0 * Fix logging datetime * Fix global_step usage * Fix breaking loop + logging datetime * Fix step in prediction loop * Fix step breaking * Fix train/test loops * Force TF at least 2.2 for the trainer * Use assert_cardinality to facilitate the dataset size computation * Log steps per epoch * Make tfds compliant with TPU * Make tfds compliant with TPU * Use TF dataset enumerate instead of the Python one * revert previous commit * Fix data_dir * Apply style * rebase on master * Address Sylvain's comments * Address Sylvain's and Lysandre comments * Trigger CI * Remove unused import * Switch from return_tuple to return_dict * Fix test * Add recent model Co-authored-by:Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Julien Plu <plu.julien@gmail.com>
-
- 28 Jul, 2020 2 commits
-
-
Joe Davison authored
-
Lysandre Debut authored
-
- 27 Jul, 2020 2 commits
-
-
Suraj Patil authored
* use new AutoModel classes * make style and quality
-
Joe Davison authored
* add initial zero-shot pipeline * change default args * update default template * add label string splitting * add str labels support, remove nli from name * style * add input validation and working tf defaults * tests * quality check * add docstring to __call__ * add slow tests * Change truncation to only_first; also lower precision on tests for readability * style
-
- 22 Jul, 2020 1 commit
-
-
Funtowicz Morgan authored
* Attempt to fix the way squad_convert_examples_to_features pads the elements for the QA pipeline. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Quality Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Make the code easier to read and avoid having multiple tests check the same thing. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * missing enum value on truncation_strategy. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Rethinking for the easiest fix: expose the padding strategy on squad_convert_examples_to_features. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Remove unused imports. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com>
-
- 12 Jul, 2020 1 commit
-
-
Kevin Canwen Xu authored
* Add model type check for pipelines * Add model type check for pipelines * rename func * Fix the init parameters * Fix format * rollback unnecessary refactor
-
- 10 Jul, 2020 1 commit
-
-
Teven authored
Fixed use of memories in XLNet (caching for language generation + warning when loading improper memoryless model) (#5632) * Pytorch gpu => cpu proper device * Memoryless XLNet warning + fixed memories during generation * Revert "Pytorch gpu => cpu proper device" This reverts commit 93489b36 * made black happy * TF generation with memories * dim => axis * added padding_text to TF XL models * Added comment, added TF
-
- 09 Jul, 2020 2 commits
-
-
Teven authored
* Pytorch gpu => cpu proper device * Memoryless XLNet warning + fixed memories during generation * Revert "Memoryless XLNet warning + fixed memories during generation" This reverts commit 3d3251ff * Took the operations on the generated_sequence out of the ensure_device scope
-
Funtowicz Morgan authored
* Ensure padding and question cannot have higher probs than context. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Add bart to the list of tokenizers adding two <sep> tokens for squad_convert_example_to_feature Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Format. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Addressing @patrickvonplaten comments. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Addressing @patrickvonplaten comments about masking non-context element when generating the answer. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Addressing @sshleifer comments. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Make sure we mask CLS after handling impossible answers Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Mask in the correct vectors ... Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com>
-
- 08 Jul, 2020 1 commit
-
-
Lorenzo Ampil authored
* Add B I handling to grouping * Add fix to include separate entity as last token * move last_idx definition outside loop * Use first entity in entity group as reference for entity type * Add test cases * Take out extra class accidentally added * Return tf ner grouped test to original * Take out redundant last entity * Get last_idx safely Co-authored-by:
ColleterVi <36503688+ColleterVi@users.noreply.github.com> * Fix first entity comment * Create separate functions for group_sub_entities and group_entities (splitting call method to testable functions) * Take out unnecessary last_idx * Remove additional forward pass test * Move token classification basic tests to separate class * Move token classification basic tests back to monocolumninputtestcase * Move base ner tests to nerpipelinetests * Take out unused kwargs * Add back mandatory_keys argument * Add unitary tests for group_entities in _test_ner_pipeline * Fix last entity handling * Fix grouping function used * Add typing to group_sub_entities and group_entities Co-authored-by:
ColleterVi <36503688+ColleterVi@users.noreply.github.com>
-
- 03 Jul, 2020 1 commit
-
-
Funtowicz Morgan authored
* Make QA pipeline support models with more than 2 outputs such as BART assuming start/end are the two first outputs. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * When using the new padding/truncation paradigm setting padding="max_length" + max_length=X actually pads the input up to max_length. This results in every sample going through QA pipelines being of size 384, whatever the actual input size, making the overall pipeline very slow. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Mask padding & question before applying softmax. Softmax has been refactored to operate in log space for speed and stability. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Format. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Use PaddingStrategy.LONGEST instead of DO_NOT_PAD Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Revert "When using the new padding/truncation paradigm setting padding="max_length" + max_length=X actually pads the input up to max_length." This reverts commit 1b00a9a2 Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Trigger CI after unattended failure * Trigger CI
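The "mask padding & question before applying softmax" step, computed in log space, could be sketched as follows. This is an illustrative reading of the commit message rather than the pipeline's actual code: `masked_softmax` and the `keep` flag list are hypothetical, and plain Python lists stand in for the arrays the pipeline would use.

```python
import math
from typing import List


def masked_softmax(logits: List[float], keep: List[bool]) -> List[float]:
    """Softmax restricted to positions where keep is True.

    Masked positions (for example padding or question tokens) get
    probability 0, so they can never outscore context tokens.
    """
    kept = [x for x, k in zip(logits, keep) if k]
    if not kept:
        raise ValueError("At least one position must be kept.")
    # Log-sum-exp with a max shift: the normalizer is built in log space
    # so large logits do not overflow.
    shift = max(kept)
    log_norm = shift + math.log(sum(math.exp(x - shift) for x in kept))
    return [math.exp(x - log_norm) if k else 0.0 for x, k in zip(logits, keep)]
```

In this sketch a masked position with a very large logit still ends up at probability 0, and the remaining probabilities sum to 1 over the kept positions only.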
-