  1. 09 May, 2023 1 commit
    • audio_utils improvements (#21998) · 7f919509
      Matthijs Hollemans authored
      * silly change to allow making a PR
      
      * clean up doc comments
      
      * simplify hertz_to_mel and mel_to_hertz
      
      * fixup
      
      * clean up power_to_db
      
      * also add amplitude_to_db
      
      * move functions
      
      * clean up mel_filter_bank
      
      * fixup
      
      * credit librosa & torchaudio authors
      
      * add unit tests
      
      * tests for power_to_db and amplitude_to_db
      
      * add mel_filter_bank tests
      
      * rewrite STFT
      
      * add convenience spectrogram function
      
      * missing transpose
      
      * fewer transposes
      
      * add integration test to M-CTC-T
      
      * frame length can be either window or FFT length
      
      * rewrite stft API
      
      * add preemphasis coefficient
      
      * move argument
      
      * add log option to spectrogram
      
      * replace M-CTC-T feature extractor
      
      * fix api thing
      
      * replace whisper STFT
      
      * replace whisper mel filters
      
      * replace tvlt's stft
      
      * allow alternate window names
      
      * replace speecht5 stft
      
      * fixup
      
      * fix integration tests
      
      * fix doc comments
      
      * remove manual FFT length calculation
      
      * fix docs
      
      * go away, deprecation warnings
      
      * combine everything into spectrogram function
      
      * add deprecated functions back
      
      * fixup
  2. 04 May, 2023 1 commit
    • Add methods to update and verify out_features out_indices (#23031) · 90e8263d
      amyeroberts authored
      * Add methods to update and verify out_features out_indices
      
      * Safe update for config attributes
      
      * Fix function names
      
      * Save config correctly
      
      * PR comments - use property setters
      
      * PR comment - directly set attributes
      
      * Update test
      
      * Add updates to recently merged focalnet backbone
  3. 03 May, 2023 1 commit
  4. 27 Apr, 2023 1 commit
  5. 25 Apr, 2023 1 commit
  6. 24 Apr, 2023 2 commits
  7. 10 Apr, 2023 1 commit
  8. 06 Apr, 2023 3 commits
    • Update tiny model summary file for recent models (#22637) · c7ec71ba
      Yih-Dar authored
      
      
      * Update tiny model summary file for recent models
      
      ---------
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    • Adding Llama FastTokenizer support. (#22264) · 1670be4b
      Nicolas Patry authored
      * Adding Llama FastTokenizer support.
      
      - Requires https://github.com/huggingface/tokenizers/pull/1183 version
      - Only support byte_fallback for llama, raise otherwise (safety net).
      - Lots of questions are special tokens
      
      How to test:
      
      ```python
      from tokenizers import Tokenizer
      from transformers import AutoTokenizer
      from transformers.convert_slow_tokenizer import convert_slow_tokenizer

      # Reference tokenizer that the converted one is compared against.
      tokenizer = AutoTokenizer.from_pretrained("huggingface/llama-7b")

      # Set to True to reuse a previously saved "tok.json" instead of converting again.
      LOAD_FROM_FILE = False

      if LOAD_FROM_FILE:
          new_tokenizer = Tokenizer.from_file("tok.json")
      else:
          new_tokenizer = convert_slow_tokenizer(tokenizer)
          new_tokenizer.save("tok.json")

      strings = [
          "This is a test",
          "生活的真谛是",
          "生活的真谛是[MASK]。",
          # XXX: This one is problematic because of special tokens
          # "<s> Something something",
      ]

      for string in strings:
          # Both tokenizers must produce identical token ids.
          encoded = tokenizer(string)["input_ids"]
          encoded2 = new_tokenizer.encode(string).ids

          assert encoded == encoded2, f"{encoded} != {encoded2}"

          # Decoding must round-trip to the same text (ignoring surrounding whitespace).
          decoded = tokenizer.decode(encoded)
          decoded2 = new_tokenizer.decode(encoded2)

          assert decoded.strip() == decoded2, f"{decoded!r} != {decoded2!r}"
      ```
      
      The converter + some test script.
      
      The test script.
      
      Tmp save.
      
      Adding Fast tokenizer + tests.
      
      Adding the tokenization tests.
      
      Correct combination.
      
      Small fix.
      
      Fixing tests.
      
      Fixing with latest update.
      
      Rebased.
      
      fix copies + normalized added tokens  + copies.
      
      Adding doc.
      
      TMP.
      
      Doc + split files.
      
      Doc.
      
      Versions + try import.
      
      Fix Camembert + warnings -> Error.
      
      Fix by ArthurZucker.
      
      Not a decorator.
      
      * Fixing comments.
      
      * Adding more to docstring.
      
      * Doc rewriting.
  9. 29 Mar, 2023 1 commit
  10. 23 Mar, 2023 1 commit
  11. 02 Mar, 2023 1 commit
  12. 28 Feb, 2023 2 commits
    • Fix flaky test for log level (#21776) · b29e2dca
      Sylvain Gugger authored
      * Fix flaky test for log level
      
      * Fix other flaky test
    • 🔥Rework pipeline testing by removing `PipelineTestCaseMeta` 🚀 (#21516) · 871c31a6
      Yih-Dar authored
      
      
      * Add PipelineTesterMixin
      
      * remove class PipelineTestCaseMeta
      
      * move validate_test_components
      
      * Add for ViT
      
      * Add to SPECIAL_MODULE_TO_TEST_MAP
      
      * style and quality
      
      * Add feature-extraction
      
      * update
      
      * raise instead of skip
      
      * add tiny_model_summary.json
      
      * more explicit
      
      * skip tasks not in mapping
      
      * add availability check
      
      * Add Copyright
      
      * A way to disable irrelevant tests
      
      * update with main
      
      * remove disable_irrelevant_tests
      
      * skip tests
      
      * better skip message
      
      * better skip message
      
      * Add all pipeline task tests
      
      * revert
      
      * Import PipelineTesterMixin
      
      * subclass test classes with PipelineTesterMixin
      
      * Add pipieline_model_mapping
      
      * Fix import after adding pipieline_model_mapping
      
      * Fix style and quality after adding pipieline_model_mapping
      
      * Fix one more import after adding pipieline_model_mapping
      
      * Fix style and quality after adding pipieline_model_mapping
      
      * Fix test issues
      
      * Fix import requirements
      
      * Fix mapping for MobileViTModelTest
      
      * Update
      
      * Better skip message
      
      * pipieline_model_mapping could not be None
      
      * Remove some PipelineTesterMixin
      
      * Fix typo
      
      * revert tests_fetcher.py
      
      * update
      
      * rename
      
      * revert
      
      * Remove PipelineTestCaseMeta from ZeroShotAudioClassificationPipelineTests
      
      * style and quality
      
      * test fetcher for all pipeline/model tests
      
      ---------
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
  13. 27 Feb, 2023 1 commit
  14. 22 Feb, 2023 1 commit
  15. 06 Feb, 2023 1 commit
    • Update quality tooling for formatting (#21480) · 6f79d264
      Sylvain Gugger authored
      * Result of black 23.1
      
      * Update target to Python 3.7
      
      * Switch flake8 to ruff
      
      * Configure isort
      
      * Configure isort
      
      * Apply isort with line limit
      
      * Put the right black version
      
      * adapt black in check copies
      
      * Fix copies
  16. 02 Feb, 2023 1 commit
  17. 26 Jan, 2023 1 commit
  18. 17 Jan, 2023 2 commits
  19. 30 Nov, 2022 1 commit
  20. 28 Nov, 2022 2 commits
    • 321ef388
      amyeroberts authored
    • More TF int dtype fixes (#20384) · de4159a3
      Matt authored
      * Add a test to ensure int dummy inputs are int64
      
      * Move the test into the existing int64 test and update a lot of existing dummies
      
      * Fix remaining dummies
      
      * Fix remaining dummies
      
      * Test for int64 serving sigs as well
      
      * Update core tests to use tf.int64
      
      * Add better messages to the assertions
      
      * Update all serving sigs to int64
      
      * More sneaky hiding tf.int32s
      
      * Add an optional int32 signature in save_pretrained
      
      * make fixup
      
      * Add Amy's suggestions
      
      * Switch all serving sigs back to tf.int32
      
      * Switch all dummies to tf.int32
      
      * Adjust tests to check for tf.int32 instead of tf.int64
      
      * Fix base dummy_inputs dtype
      
      * Start casting to tf.int32 in input_processing
      
      * Change dtype for unpack_inputs test
      
      * Add proper tf.int32 test
      
      * Make the alternate serving signature int64
  21. 23 Nov, 2022 1 commit
  22. 21 Nov, 2022 1 commit
  23. 02 Nov, 2022 1 commit
    • Add Image Processors (#19796) · a6b77598
      amyeroberts authored
      
      
      * Add CLIP image processor
      
      * Crop size as dict too
      
      * Update warning
      
      * Actually use logger this time
      
      * Normalize doesn't change dtype of input
      
      * Add perceiver image processor
      
      * Tidy up
      
      * Add DPT image processor
      
      * Add Vilt image processor
      
      * Tidy up
      
      * Add poolformer image processor
      
      * Tidy up
      
      * Add LayoutLM v2 and v3 image processors
      
      * Tidy up
      
      * Add Flava image processor
      
      * Tidy up
      
      * Add deit image processor
      
      * Tidy up
      
      * Add ConvNext image processor
      
      * Tidy up
      
      * Add levit image processor
      
      * Add segformer image processor
      
      * Add in post processing
      
      * Fix up
      
      * Add ImageGPT image processor
      
      * Fixup
      
      * Add mobilevit image processor
      
      * Tidy up
      
      * Add postprocessing
      
      * Fixup
      
      * Add VideoMAE image processor
      
      * Tidy up
      
      * Add ImageGPT image processor
      
      * Fixup
      
      * Add ViT image processor
      
      * Tidy up
      
      * Add beit image processor
      
      * Add mobilevit image processor
      
      * Tidy up
      
      * Add postprocessing
      
      * Fixup
      
      * Fix up
      
      * Fix flava and remove tree module
      
      * Fix image classification pipeline failing tests
      
      * Update feature extractor in trainer scripts
      
      * Update pad_if_smaller to accept tuple and int size
      
      * Update for image segmentation pipeline
      
      * Update src/transformers/models/perceiver/image_processing_perceiver.py
      Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
      
      * Update src/transformers/image_processing_utils.py
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update src/transformers/models/beit/image_processing_beit.py
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * PR comments - docstrings; remove accidentally added resize; var names
      
      * Update docstrings
      
      * Add exception if size is not in the right format
      
      * Fix exception check
      
      * Fix up
      
      * Use shortest_edge in tuple in script
      Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
  24. 24 Oct, 2022 1 commit
  25. 18 Oct, 2022 1 commit
  26. 17 Oct, 2022 1 commit
  27. 12 Oct, 2022 1 commit
  28. 27 Sep, 2022 2 commits
  29. 26 Sep, 2022 1 commit
  30. 21 Sep, 2022 1 commit
  31. 02 Sep, 2022 1 commit
  32. 31 Aug, 2022 1 commit
  33. 17 Aug, 2022 1 commit
    • Update feature extractor methods to enable type cast before normalize (#18499) · 49e44b21
      amyeroberts authored
      * Update methods to optionally rescale
      This is necessary to allow for casting our images / videos to numpy arrays within the feature extractors' call. We want to do this to make sure the behaviour is as expected when flags like  are False. If some transformations aren't applied, then the output type can't be unexpected e.g. a list of PIL images instead of numpy arrays.
      
      * Cast images to numpy arrays in call to enable consistent behaviour with different configs
      
      * Remove accidental clip changes
      
      * Update tests to reflect the scaling logic
      We write a generic function to handle rescaling of our arrays. For the API to be intuitive, it takes a factor c and rescales the image values by it. As a result, the rescaling in normalize and to_numpy_array is now done with array * (1/255) instead of array / 255. This leads to small differences in the resulting image; in testing these were on the order of 1e-8, and so deemed OK.
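
The difference between the two rescaling forms can be illustrated with a small NumPy sketch. It is not the code from this commit: the helper name `rescale`, the random test image, and the float32 dtype are all assumptions made for illustration.

```python
import numpy as np


def rescale(image: np.ndarray, scale: float) -> np.ndarray:
    """Hypothetical helper: multiply every value in `image` by `scale`."""
    return image * scale


# Assumed input: a random 3x4x4 "image" with 8-bit pixel values stored as float32.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(3, 4, 4)).astype(np.float32)

# Old form: divide by 255 directly.
by_division = image / 255

# New form: multiply by the precomputed factor c = 1/255.
by_multiplication = rescale(image, 1 / 255)

# The two results can differ by floating-point rounding only.
max_abs_diff = float(np.abs(by_division - by_multiplication).max())
print(f"largest absolute difference: {max_abs_diff:.3e}")
```

Comparing the two arrays with a tolerance (for example `np.allclose`) rather than exact equality is the usual way to keep tests stable across such a change.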