1. 01 Sep, 2021 1 commit
  2. 13 May, 2021 1 commit
  3. 07 May, 2021 1 commit
  4. 30 Apr, 2021 1 commit
    • Adding `AutomaticSpeechRecognitionPipeline`. (#11337) · db9dd09c
      Nicolas Patry authored
      
      
      * Adding `AutomaticSpeechRecognitionPipeline`.
      
      - Because we added everything to enable this pipeline, we probably
      should add it to `transformers`.
      - This PR tries to limit the scope and focuses only on the pipeline part
      (what should go in, and out).
      - The tests are very specific for S2T and Wav2vec2 to make sure both
      architectures are supported by the pipeline. We don't use the mixin for
      tests right now, because that requires more work in the `pipeline`
      function (will be done in a follow up PR).
      - Unsure about the "helper" function `ffmpeg_read`. It makes a lot of
      sense from a user perspective and does not add any hard dependency
      (users can always use their own loading mechanism). Meanwhile, it
      feels slightly clunky to have so much optional preprocessing.
      - The pipeline does not support streaming audio right now.
      
      Future work:
      
      - Add `automatic-speech-recognition` as a `task`, and call
      `FeatureExtractor.from_pretrained` within the `pipeline` function.
      - Add small models within tests
      - Add the Mixin to tests.
      - Make the logic between ForCTC vs ForConditionalGeneration better.
      
      * Update tests/test_pipelines_automatic_speech_recognition.py
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * Adding docs + main import + type checking + LICENSE.
      
      * Doc style !.
      
      * Fixing TYPE_HINT.
      
      * Specifying waveform shape in the docs.
      
      * Adding asserts + specify in the documentation the shape of the input
      np.ndarray.
      
      * Update src/transformers/pipelines/automatic_speech_recognition.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Adding require to tests + move the `feature_extractor` doc.
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      db9dd09c
  5. 16 Dec, 2020 1 commit
  6. 10 Dec, 2020 1 commit
  7. 07 Dec, 2020 1 commit
  8. 26 Oct, 2020 1 commit
    • Doc styling (#8067) · 08f534d2
      Sylvain Gugger authored
      * Important files
      
      * Styling them all
      
      * Revert "Styling them all"
      
      This reverts commit 7d029395fdae8513b8281cbc2a6c239f8093503e.
      
      * Styling them for realsies
      
      * Fix syntax error
      
      * Fix benchmark_utils
      
      * More fixes
      
      * Fix modeling auto and script
      
      * Remove new line
      
      * Fixes
      
      * More fixes
      
      * Fix more files
      
      * Style
      
      * Add FSMT
      
      * More fixes
      
      * More fixes
      
      * More fixes
      
      * More fixes
      
      * Fixes
      
      * More fixes
      
      * More fixes
      
      * Last fixes
      
      * Make sphinx happy
      08f534d2
  9. 23 Sep, 2020 1 commit
  10. 02 Sep, 2020 1 commit
    • [pipelines] Text2TextGenerationPipeline (#6744) · 4230d30f
      Suraj Patil authored
      * add Text2TextGenerationPipeline
      
      * remove max length warning
      
      * remove comments
      
      * remove input_length
      
      * fix typo
      
      * add tests
      
      * use TFAutoModelForSeq2SeqLM
      
      * doc
      
      * typo
      
      * add the doc below TextGenerationPipeline
      
      * doc nit
      
      * style
      
      * delete comment
      4230d30f
  11. 04 Aug, 2020 1 commit
  12. 03 Aug, 2020 1 commit
  13. 30 Jul, 2020 1 commit
    • Addition of a DialoguePipeline (#5516) · e642c789
      guillaume-be authored
      
      
      * initial commit for pipeline implementation
      
      Addition of input processing and history concatenation
      
      * Conversation pipeline tested and working for single & multiple conversation inputs
      
      * Added docstrings for dialogue pipeline
      
      * Addition of dialogue pipeline integration tests
      
      * Delete test_t5.py
      
      * Fixed max code length
      
      * Updated styling
      
      * Fixed test broken by formatting tools
      
      * Removed unused import
      
      * Added unit test for DialoguePipeline
      
      * Fixed Tensorflow compatibility
      
      * Fixed multi-framework support using framework flag
      
      * - Fixed docstring
      - Added `min_length_for_response` as an initialization parameter
      - Renamed `*args` to `conversations`, `conversations` being a `Conversation` or a `List[Conversation]`
      - Updated truncation to truncate entire segments of conversations, instead of cutting in the middle of a user/bot input
      
      * - renamed pipeline name from dialogue to conversational
      - removed hardcoded default value of 1000 and use config.max_length instead
      - added `append_response` and `set_history` method to the Conversation class to avoid direct fields mutation
      - fixed bug in history truncation method
      
      * - Updated ConversationalPipeline to accept only active conversations (otherwise a ValueError is raised)
      
      * - Simplified input tensor conversion
      
      * - Updated attention_mask value for Tensorflow compatibility
      
      * - Updated last dialogue reference to conversational & fixed integration tests
      
      * Fixed conflict with master
      
      * Updates following review comments
      
      * Updated formatting
      
      * Added Conversation and ConversationalPipeline to the library __init__, addition of docstrings for Conversation, added both to the docs
      
      * Update src/transformers/pipelines.py
      
      Updated docstring following review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      e642c789
  14. 22 Jun, 2020 1 commit
  15. 16 Jun, 2020 1 commit
  16. 03 Jun, 2020 1 commit
    • Pipelines: miscellanea of QoL improvements and small features... (#4632) · 99207bd1
      Julien Chaumond authored
      * [hf_api] Attach all unknown attributes for future-proof compatibility
      
      * [Pipeline] NerPipeline is really a TokenClassificationPipeline
      
      * modelcard.py: I don't think we need to force the download
      
      * Remove config, tokenizer from SUPPORTED_TASKS as we're moving to one model = one weight + one tokenizer
      
      * FillMaskPipeline: also output token in string form
      
      * TextClassificationPipeline: option to return all scores, not just the argmax
      
      * Update docs/source/main_classes/pipelines.rst
      99207bd1
  17. 22 Apr, 2020 1 commit
    • Pipeline for Text Generation: GenerationPipeline (#3758) · f16540fc
      Lorenzo Ampil authored
      
      
      * Add GenerationPipeline
      
      * Fix parameter names
      
      * Correct __call__ parameters
      
      * Add model type attribute and correct function calls for prepare_input
      
      * Take out trailing commas from init attributes
      
      * Remove unnecessary tokenization line
      
      * Implement support for multiple text inputs
      
      * Apply generation support for multiple input text prompts
      
      * Take out tensor coercion
      
      * Take out batch index
      
      * Add text prompt to return sequence
      
      * Squeeze token tensor before decoding
      
      * Return only a single list of sequences if only one prompt was used
      
      * Correct results variable name
      
      * Add GenerationPipeline to SUPPORTED_TASKS with the alias , initialized w GPT2
      
      * Registered AutoModelWithLMHead for both pt and t
      
      * Update docstring for GenerationPipeline
      
      * Add kwargs parameter to model.generate
      
      * Take out kwargs parameter after all
      
      * Add generation pipeline example in pipeline docstring
      
      * Fix max length by squeezing tokens tensor
      
      * Apply ensure_tensor_on_device to pytorch tensor
      
      * Include generation step in torch.no_grad
      
      * Take out input from prepare_xlm_input and set 'en' as default xlm_language
      
      * Apply framework specific encoding during prepare_input
      
      * Format w make style
      
      * Move GenerationPipeline import to follow proper import sorting
      
      * Take out trailing comma from generation dict
      
      * Apply requested changes
      
      * Change name to TextGenerationPipeline
      
      * Apply TextGenerationPipeline rename to __init__
      
      * Changing alias to
      
      * Set input mapping as input to ensure_tensor_on_device
      
      * Fix assertion placement
      
      * Add test_text_generation
      
      * Add TextGenerationPipeline to PipelineCommonTests
      
      * Take out whitespace
      
      * Format __init__ w black
      
      * Fix __init__ style
      
      * Format __init__
      
      * Add line to end of __init__
      
      * Correct model tokenizer set for test_text_generation
      
      * Ensure to return list of list, not list of string (to pass test)
      
      * Limit test models to only 3 to limit runtime to address circleCI timeout error
      
      * Update src/transformers/pipelines.py
      Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/transformers/pipelines.py
      Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/transformers/pipelines.py
      Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/transformers/pipelines.py
      Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/transformers/pipelines.py
      Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update tests/test_pipelines.py
      Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/transformers/pipelines.py
      Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/transformers/pipelines.py
      Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/transformers/pipelines.py
      Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Remove argument docstring, __init__, add additional __call__ arguments, and reformat results to list of dict
      
      * Fix blank result list
      
      * Add TextGenerationPipeline to pipelines.rst
      
      * Update src/transformers/pipelines.py
      Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/transformers/pipelines.py
      Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Fix typos from adding PADDING_TEXT_TOKEN_LENGTH
      
      * Fix incorrectly moved result list
      
      * Update src/transformers/pipelines.py
      Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/transformers/pipelines.py
      
      * Update src/transformers/pipelines.py
      
      * Update src/transformers/pipelines.py
      
      * Update src/transformers/pipelines.py
      
      * Update src/transformers/pipelines.py
      
      * Update src/transformers/pipelines.py
      
      * Update src/transformers/pipelines.py
      
      * Update src/transformers/pipelines.py
      
      * Update src/transformers/pipelines.py
      
      * Update src/transformers/pipelines.py
      
      * Update src/transformers/pipelines.py
      
      * Update src/transformers/pipelines.py
      Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Add back generation line and make style
      
      * Take out blank whitespace
      
      * Apply new alias, text-generation, to test_pipelines
      
      * Fix text generation alias in test
      
      * Update src/transformers/pipelines.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Julien Chaumond <chaumond@gmail.com>
      f16540fc
  18. 17 Mar, 2020 1 commit
    • Add Summarization to Pipelines (#3128) · 38a555a8
      Sam Shleifer authored
      * passing
      
      * Undo stupid chg
      
      * docs
      
      * undo rename
      
      * delete-cruft
      
      * only import if you have torch
      
      * Don't rely on dict ordering
      
      * Fix dict ordering upstream
      
      * docstring link
      
      * docstring link
      
      * remove trailing comma for 3.5 compat
      
      * new name
      
      * delegate kwarging
      
      * Update kwargs
      38a555a8
  19. 02 Mar, 2020 1 commit
    • Pipeline doc (#3055) · d3eb7d23
      Lysandre Debut authored
      * Pipeline doc initial commit
      
      * pipeline abstraction
      
      * Remove modelcard argument from pipeline
      
      * Task-specific pipelines can be instantiated with no model or tokenizer
      
      * All pipelines doc
      d3eb7d23