  1. 22 Feb, 2023 1 commit
  2. 16 Feb, 2023 1 commit
  3. 15 Feb, 2023 1 commit
  4. 06 Feb, 2023 1 commit
      Update quality tooling for formatting (#21480) · 6f79d264
      Sylvain Gugger authored
      * Result of black 23.1
      
      * Update target to Python 3.7
      
      * Switch flake8 to ruff
      
      * Configure isort
      
      * Configure isort
      
      * Apply isort with line limit
      
      * Put the right black version
      
      * adapt black in check copies
      
      * Fix copies
  5. 02 Feb, 2023 1 commit
  6. 30 Jan, 2023 1 commit
  7. 25 Jan, 2023 1 commit
  8. 18 Jan, 2023 1 commit
  9. 16 Jan, 2023 1 commit
      Fixing batching pipelines on single items for ChunkPipeline (#21132) · 488a179c
      Nicolas Patry authored
      * Fixing #20783
      
      * Update src/transformers/pipelines/base.py
      
      * Fixing some tests.
      
      * Fixup.
      
      * Remove ffmpeg dep + a bit more relaxed for bigbird QA precision.
      
      * Better dataset.
      
      * Prevent failing on TF.
      
      * Better condition. We can't use `can_use_iterator` since we cannot use it
      directly.
  10. 19 Dec, 2022 1 commit
  11. 21 Nov, 2022 1 commit
      Add Audio Spectrogram Transformer (#19981) · 4973d2a0
      NielsRogge authored
      
      
      * First draft
      
      * Make conversion script work
      
      * Add id2label mapping, run code quality
      
      * Fix copies
      
      * Add first draft of feature extractor
      
      * Update conversion script to use feature extractor
      
      * Make more tests pass
      
      * Add docs
      
      * update input_features to input_values + pad by default to max length
      
      * Fix doc tests
      
      * Add feature extractor tests
      
      * Add proper padding/truncation to feature extractor
      
      * Add support for conversion of all audioset checkpoints
      
      * Improve docs and extend conversion script
      
      * Fix README
      
      * Rename spectogram to spectrogram
      
      * Fix copies
      
      * Add integration test
      
      * Remove dummy conv
      
      * Update to ast
      
      * Update organization
      
      * Fix init
      
      * Rename model to AST
      
      * Add require_torchaudio annotator
      
      * Move import of ASTFeatureExtractor under a is_speech_available
      
      * Fix rebase
      
      * Add pipeline config
      
      * Update name of classifier head
      
      * Rename time_dimension and frequency_dimension for clarity
      
      * Remove print statement
      
      * Fix pipeline test
      
      * Fix pipeline test
      
      * Fix index table
      
      * Fix init
      
      * Fix conversion script
      
      * Rename to ForAudioClassification
      
      * Fix index table
      Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
  12. 14 Nov, 2022 1 commit
  13. 03 Nov, 2022 1 commit
  14. 17 Oct, 2022 1 commit
      Fix pipeline predict transform methods (#19657) · 8aad4363
      Sivaudha authored
      * Remove key word argument X from pipeline predict and transform methods
      
      As `__call__` of the pipeline classes requires one positional argument,
      passing the input as a keyword argument inside the predict and transform
      methods causes `__call__` to fail. This commit therefore changes the
      keyword argument into a positional argument.
      
      * Implement basic tests for scikitcompat pipeline interface
      
      * Separate tests instead of running with parameterized based on framework, as both frameworks will not be active at the same time
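
The failure mode described in this commit message can be illustrated with a minimal sketch. `ToyPipeline` and its method names are hypothetical stand-ins, not the actual `transformers` classes; the only assumption carried over from the message is that `__call__` requires one positional argument.

```python
class ToyPipeline:
    """Hypothetical stand-in for a pipeline class; not the transformers API."""

    def __call__(self, inputs, **kwargs):
        # `inputs` is a required positional parameter, as described above.
        return [f"processed:{item}" for item in inputs]

    def predict_with_keyword(self, X):
        # Forwarding as `X=X` leaves `inputs` unfilled, so Python raises
        # TypeError (missing required positional argument).
        return self(X=X)

    def predict(self, X):
        # Forwarding positionally binds X to `inputs`.
        return self(X)


pipe = ToyPipeline()
print(pipe.predict(["a", "b"]))

try:
    pipe.predict_with_keyword(["a", "b"])
except TypeError as err:
    print("keyword forwarding failed:", err)
```

The positional form works regardless of what the first parameter of `__call__` is named, which is why it is the more robust choice for a scikit-learn style wrapper.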
  15. 11 Oct, 2022 1 commit
      Fix whisper for `pipeline` (#19482) · b722a6be
      Arthur authored
      * update feature extractor params
      
      * update attention mask handling
      
      * fix doc and pipeline test
      
      * add warning when skipping test
      
      * add whisper translation and transcription test
      
      * fix build doc test
  16. 07 Oct, 2022 1 commit
      Rework pipeline tests (#19366) · 9ac586b3
      Sylvain Gugger authored
      * Rework pipeline tests
      
      * Try to fix Flax tests
      
      * Try to put it before
      
      * Use a new decorator instead
      
      * Remove ignore marker since it doesn't work
      
      * Filter pipeline tests
      
      * Woopsie
      
      * Use the filtered list
      
      * Clean up and fake modif
      
      * Remove init
      
      * Revert fake modif
  17. 05 Oct, 2022 1 commit
  18. 06 Sep, 2022 1 commit
  19. 10 Aug, 2022 1 commit
  20. 05 Aug, 2022 1 commit
  21. 19 Jul, 2022 1 commit
  22. 01 Jul, 2022 1 commit
  23. 30 Jun, 2022 2 commits
  24. 12 May, 2022 1 commit
  25. 05 May, 2022 1 commit
  26. 04 Mar, 2022 1 commit
  27. 23 Feb, 2022 2 commits
      [Test refactor 1/5] Per-folder tests reorganization (#15725) · 29c10a41
      Lysandre Debut authored
      
      
      * Per-folder tests reorganization
      Co-authored-by: sgugger <sylvain.gugger@gmail.com>
      Co-authored-by: Stas Bekman <stas@stason.org>
      Enable `image-segmentation` on `AutoModelForSemanticSegmentation` (#15647) · 9e71d464
      Nicolas Patry authored
      * Enabling Beit SegFormer to `image-segmentation`.
      
      * Fixing the score.
      
      * Fix import ?
      
      * Missing in type hint.
      
      * Multiple test fixes:
      
      - Add `raw_image` support. It should be the default IMHO since in Python
        world it doesn't make any sense to base64 encode the image (Sorry
        @mishig, didn't catch that in my review). I really think we should
        consider breaking BC here.
      - Add support for Segformer tiny test (needed
        `SegformerModelTester.get_config` to enable TinyConfig
        @NielsRogge)
      - Add the check that `batch_size` works correctly on that pipeline.
        Uncovered that it doesn't for Detr, which IMO is OK since images
        after `feature_extractor` don't have the same size. Comment should
        explain.
      
      * Type hint as a string.
      
      * Make fixup + update black.
      
      * torch+vision protections.
      
      * Don't use torchvision, use F.interpolate instead (no new dep).
      
      * Last fixes for Segformer.
      
      * Update test to reflect new image (which was broken)
      
      * Update tests.
      
      * Major BC modification:
      
      - Removed the compressed PNG string; that's a job for users,
        `transformers` stays in Python land.
      - Removed the `score` for semantic segmentation. It has hardly a meaning
        on its own in this context.
      - Don't include the grayscale with logits for now (which could enable
        users to get a sense of confidence). Might be done later.
      - Don't include the surface of the mask (could be used for sorting by
        users, to filter out small masks). It's already calculable, and
        it's easier to add later, than to add now and break later if we need.
      
      * `make fixup`.
      
      * Small changes.
      
      * Rebase + doc fixup.
  28. 05 Jan, 2022 1 commit
  29. 04 Jan, 2022 1 commit
      Hotfix `chunk_length_s` instead of `_ms`. (#15029) · 19d37c2d
      Nicolas Patry authored
      * Hotfix `chunk_length_s` instead of `_ms`.
      
      * Adding fix of `pad_token` which should be last/previous token for CTC
        proper decoding
      
      * Fixing ChunkPipeline unwrapping.
      
      * Adding a PackIterator specific test.
  30. 27 Dec, 2021 1 commit
      ChunkPipeline (batch_size enabled on `zero-cls` and `qa` pipelines). (#14225) · b058490c
      Nicolas Patry authored
      
      
      * Pipeline chunks.
      
      * Batching for Chunking pipelines ?
      
      * Batching for `question-answering` and `zero-shot-cls`.
      
      * Fixing for FNet.
      
      * Making ASR a chunk pipeline.
      
      * Chunking ASR API.
      
      * doc style.
      
      * Fixing ASR test.
      
      * Fixing QA error (p_mask, padding is 1, not 0).
      
      * Enable both vad and simple chunking.
      
      * Max length for vad.
      
      * remove inference mode, crashing on s2t.
      
      * Revert ChunkPipeline for ASRpipeline.
      
      Too many knobs for simple integration within the pipeline, better stick
      to external convenience functions instead, more control to be had,
      simpler pipeline and also easier to replace with other things later.
      
      * Drop necessity for PT for these.
      
      * Enabling generators.
      
      * Add mic + cleanup.
      
      * Typo.
      
      * Typo2.
      
      * Remove ASR work, it does not belong in this PR anymore.
      
      * Update src/transformers/pipelines/pt_utils.py
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * Update src/transformers/pipelines/zero_shot_classification.py
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * Adding many comments.
      
      * Doc quality.
      
      * `hidden_states` handling.
      
      * Adding doc.
      
      * Bad rebase.
      
      * Autofixing docs.
      
      * Fixing CRITICAL bug in the new Zerocls pipeline.
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
  31. 14 Dec, 2021 1 commit
      Fixing tests for Perceiver (#14739) · 546a91ab
      Nicolas Patry authored
      * Adding some slow test to check for perceiver at least from a high level.
      
      * Re-enabling fast tests for Perceiver ImageClassification.
      
      * Perceiver might try to run some text-only pipelines without a Tokenizer
      (Fast doesn't exist) and with a FeatureExtractor.
      
      * Oops.
      
      * Adding a comment for `update_config_with_model_class`.
      
      * Remove `model_architecture` to get `tiny_config`.
      
      * Finalize rebase.
      
      * Smarter way to handle undefined FastTokenizer.
      
      * Remove old code.
      
      * Addressing some nits.
      
      * Don't instantiate `None`.
  32. 13 Dec, 2021 1 commit
      Fixing tests for Perceiver (#14745) · 3d66146a
      Lysandre Debut authored
      
      
      - Do not run image-classification pipeline (_CHECKPOINT_FOR_DOC uses the checkpoint for
      language, which cannot load a FeatureExtractor so current logic fails).
      - Add a safeguard to not run tests when `tokenizer_class` or
      `feature_extractor_class` **are** defined, but cannot be loaded.
      This happens for Perceiver for the "FastTokenizer" (which doesn't exist
      so None) and FeatureExtractor (which does exist but cannot be loaded
      because the checkpoint doesn't define one, which is reasonable for the
      said checkpoint).
      - Added `get_vocab` function to `PerceiverTokenizer` since it is used by
      `fill-mask` pipeline when the argument `targets` is used to narrow a
      subset of possible values.
      Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
  33. 08 Dec, 2021 1 commit
  34. 22 Nov, 2021 1 commit
  35. 19 Nov, 2021 1 commit
  36. 12 Nov, 2021 1 commit
      Adding support for raw python `generator` in addition to `Dataset` for pipelines (#14352) · ed5d1551
      Nicolas Patry authored
      * Adding support for raw python `generator` in addition to `Dataset`
      
      The main goal is to ease the creation of streaming data for the pipe.
      
      `Dataset` is more involved and PyTorch-specific.
      
      This PR provides a way to use a Python iterator too.
      This enabled #14250 but can be proposed as a standalone PR.
      
      ```python
      from transformers import pipeline
      
      def read_data(filename):
          with open(filename, 'r') as f:
              for line in f:
                  yield line
      
      pipe = pipeline("text-classification")
      for classified in pipe(read_data("large_file.txt")):
          print("Success ! ", classified)
      ```
      
      The main caveat is the interaction with `DataLoader` when
      `num_workers>1`. With multiple workers, each one receives a copy
      of the generator (as with `IterableDataset`). The naive iterator
      will therefore fail, since every worker iterates over all items of
      the generator.
      
      There are ways to do clever "skipping", but they can still be costly
      because every worker has to pass through all items of the
      generator (ignoring the ones it doesn't handle).
      
      Using `num_workers=1` is the simplest fix and if the cost of loading
      your data is small enough should be good enough. In the above example
      trying to do smart tricks to skip some lines is unlikely to be a net
      positive for instance.
      
      If there are better ways to do "jumps" on some data, then using
      `Dataset` is advised (since different workers can then jump by
      themselves).
      
      * Adding iterator support for `tf` too.
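
The multi-worker caveat and the "skipping" workaround described in this commit message can be simulated in plain Python. This is a sketch under stated assumptions, not the `DataLoader` or `transformers` implementation: `read_items` and `shard` are hypothetical helpers, and workers are modelled as sequential loops rather than real processes.

```python
def read_items():
    # Hypothetical data source standing in for a generator over a large file.
    for i in range(10):
        yield f"line-{i}"


def shard(make_generator, worker_id, num_workers):
    # "Skipping": every worker still walks the whole stream, but keeps only
    # the items whose index belongs to it (round-robin assignment).
    for index, item in enumerate(make_generator()):
        if index % num_workers == worker_id:
            yield item


num_workers = 2

# Naive case: each worker gets its own copy of the generator and yields
# everything, so every item is produced once per worker.
naive = [item for _ in range(num_workers) for item in read_items()]

# Sharded case: each item is produced exactly once across workers, at the
# cost of each worker still reading the full stream.
sharded = [
    item
    for worker_id in range(num_workers)
    for item in shard(read_items, worker_id, num_workers)
]

print(len(naive), len(sharded))
```

Note that the sharded output is grouped by worker, so the original ordering is not preserved; whether that matters depends on the use case.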
  37. 10 Nov, 2021 1 commit
  38. 03 Nov, 2021 1 commit