  1. 08 May, 2023 1 commit
  2. 04 May, 2023 1 commit
  3. 03 May, 2023 1 commit
  4. 20 Apr, 2023 1 commit
  5. 18 Apr, 2023 1 commit
  6. 17 Apr, 2023 1 commit
  7. 13 Apr, 2023 1 commit
  8. 07 Apr, 2023 1 commit
  9. 06 Apr, 2023 1 commit
    • Adding Llama FastTokenizer support. (#22264) · 1670be4b
      Nicolas Patry authored
      * Adding Llama FastTokenizer support.
      
      - Requires a tokenizers version that includes https://github.com/huggingface/tokenizers/pull/1183
      - Only supports byte_fallback for Llama; raises otherwise (safety net).
      - Lots of open questions remain around special tokens
      
      How to test:
      
      ```python
      from tokenizers import Tokenizer
      from transformers import AutoTokenizer
      from transformers.convert_slow_tokenizer import convert_slow_tokenizer

      # Reference tokenizer to convert and compare against.
      tokenizer = AutoTokenizer.from_pretrained("huggingface/llama-7b")

      # Set to True to reuse a previously saved "tok.json" instead of converting again.
      LOAD_FROM_FILE = False

      if LOAD_FROM_FILE:
          new_tokenizer = Tokenizer.from_file("tok.json")
      else:
          new_tokenizer = convert_slow_tokenizer(tokenizer)
          new_tokenizer.save("tok.json")

      strings = [
          "This is a test",
          "生活的真谛是",
          "生活的真谛是[MASK]。",
          # XXX: This one is problematic because of special tokens
          # "<s> Something something",
      ]

      for string in strings:
          # Both tokenizers must produce identical token ids.
          encoded = tokenizer(string)["input_ids"]
          encoded2 = new_tokenizer.encode(string).ids

          assert encoded == encoded2, f"{encoded} != {encoded2}"

          # Decoded text must match too (the reference output is stripped before comparing).
          decoded = tokenizer.decode(encoded)
          decoded2 = new_tokenizer.decode(encoded2)

          assert decoded.strip() == decoded2, f"{decoded!r} != {decoded2!r}"
      ```
      
      The converter + some test script.
      
      The test script.
      
      Tmp save.
      
      Adding Fast tokenizer + tests.
      
      Adding the tokenization tests.
      
      Correct combination.
      
      Small fix.
      
      Fixing tests.
      
      Fixing with latest update.
      
      Rebased.
      
      fix copies + normalized added tokens + copies.
      
      Adding doc.
      
      TMP.
      
      Doc + split files.
      
      Doc.
      
      Versions + try import.
      
      Fix Camembert + warnings -> Error.
      
      Fix by ArthurZucker.
      
      Not a decorator.
      
      * Fixing comments.
      
      * Adding more to docstring.
      
      * Doc rewriting.
  10. 03 Apr, 2023 2 commits
  11. 29 Mar, 2023 2 commits
  12. 24 Mar, 2023 2 commits
  13. 22 Mar, 2023 1 commit
  14. 21 Mar, 2023 2 commits
  15. 17 Mar, 2023 1 commit
    • Fix natten (#22229) · 3028b20a
      Ali Hassani authored
      * Add kernel size to NATTEN's QK arguments.
      
      The new NATTEN 0.14.5 supports PyTorch 2.0, but also adds an additional
      argument to the QK operation to allow optional RPBs.
      
      This ends up failing NATTEN tests.
      
      This commit adds NATTEN back to circleci and adds the arguments to get
      it working again.
      
      * Force NATTEN >= 0.14.5
  16. 14 Mar, 2023 1 commit
  17. 02 Mar, 2023 1 commit
    • Use PyAV instead of Decord in examples (#21572) · 3412f597
      amyeroberts authored
      * Use PyAV instead of Decord
      
      * Get frame indices
      
      * Fix number of frames
      
      * Update src/transformers/models/videomae/image_processing_videomae.py
      
      * Fix up
      
      * Fix copies
      
      * Update timesformer doctests
      
      * Update docstrings
  18. 16 Feb, 2023 1 commit
  19. 13 Feb, 2023 1 commit
  20. 09 Feb, 2023 1 commit
  21. 06 Feb, 2023 1 commit
    • Update quality tooling for formatting (#21480) · 6f79d264
      Sylvain Gugger authored
      * Result of black 23.1
      
      * Update target to Python 3.7
      
      * Switch flake8 to ruff
      
      * Configure isort
      
      * Configure isort
      
      * Apply isort with line limit
      
      * Put the right black version
      
      * adapt black in check copies
      
      * Fix copies
  22. 31 Jan, 2023 1 commit
    • Add DETA (#20983) · 5451f889
      NielsRogge authored
      * First draft
      
      * Add initial draft of conversion script
      
      * Convert all weights
      
      * Fix config
      
      * Add image processor
      
      * Fix DetaImageProcessor
      
      * Run make fix copies
      
      * Remove timm dependency
      
      * Fix dummy objects
      
      * Improve loss function
      
      * Remove conv_encoder attribute
      
      * Update conversion scripts
      
      * Improve postprocessing + docs
      
      * Fix copied from statements
      
      * Add tests
      
      * Improve postprocessing
      
      * Improve postprocessing
      
      * Update READMEs
      
      * More improvements
      
      * Fix rebase
      
      * Add is_torchvision_available
      
      * Add torchvision dependency
      
      * Fix typo and README
      
      * Fix bug
      
      * Add copied from
      
      * Fix style
      
      * Apply suggestions
      
      * Fix thanks to @ydshieh
      
      * Fix another dependency check
      
      * Simplify image processor
      
      * Add scipy
      
      * Improve code
      
      * Add threshold argument
      
      * Fix bug
      
      * Set default threshold
      
      * Improve integration test
      
      * Add another integration test
      
      * Update setup.py
      
      * Address review
      
      * Improve deformable attention function
      
      * Improve copied from
      
      * Use relative imports
      
      * Address review
      
      * Replace assertions
      
      * Address review
      
      * Update dummies
      
      * Remove dummies
      
      * Address comments, update READMEs
      
      * Remove custom kernel code
      
      * Add image processor tests
      
      * Add requires_backends
      
      * Add minor comment
      
      * Update scripts
      
      * Update organization name
      
      * Fix defaults, add doc tests
      
      * Add id2label for object 365
      
      * Fix tests
      
      * Update task guide
  23. 30 Jan, 2023 1 commit
  24. 23 Jan, 2023 1 commit
  25. 18 Jan, 2023 1 commit
  26. 31 Dec, 2022 1 commit
    • update pyknp to rhoknp (#20890) · 375801d5
      Hao Wang authored
      * update pyknp to rhoknp
      
      * fix linter
      
      * fix linter
      
      * fix linter
      
      * fix linter
      
      * fix linter
      
      * support rhoknp==1.1.0, fix testcase
  27. 16 Dec, 2022 1 commit
  28. 08 Dec, 2022 1 commit
    • Add video classification pipeline (#20151) · 9e56aff5
      Nathan Raw authored
      * 🚧 wip video classification pipeline
      
      * 🚧 wip - add is_decord_available check
      
      * 🐛 add missing import
      
      * add tests
      
      * 🔧 add decord to setup extras
      
      * 🚧 add is_decord_available
      
      * add video-classification pipeline
      
      * 📝 add video classification pipe to docs
      
      * 🐛 add missing VideoClassificationPipeline import
      
      * 📌 add decord install in test runner
      
      * fix url inputs to video-classification pipeline
      
      * updates from review
      
      * 📝 add video cls pipeline to docs
      
      * 📝 add docstring
      
      * 🔥 remove unused import
      
      * 🔥 remove some code
      
      * 📝 docfix
  29. 06 Dec, 2022 1 commit
  30. 01 Dec, 2022 1 commit
  31. 29 Nov, 2022 1 commit
    • add in layer gpt2 tokenizer (#20421) · fb2b45e5
      Pi Esposito authored
      * add minimal working gpt2 tokenizer
      
      * graph mode and output equivalence tests working
      
      * not today tensorflow. serialization test passing!
      
      * fix style, documentation, docstrings and all that jazz
      
      * passing consistency checks
      
      * move keras nlp to tf dependencies
      
      * fix tf modeling utils and gpt2 attention to enable compiling
      
      * fix (I hope) keras nlp dependencies
      
      * revert changes on generation
      
      * remove debug prints
      
      * remove redundant tf dummy objects
      
      * add from config, get config and max length settings to address review
      
      * let flake ignore the error on distillation you are welcome
      
      * test from config
      
      * add padding test
      
      * address sgugger review
  32. 18 Nov, 2022 4 commits
  33. 15 Nov, 2022 1 commit
    • Enable PyTorch 1.13 (#20168) · 9643ecf8
      Sylvain Gugger authored
      * Try PT1.13 by removing torch scatter
      
      * Skip failing tests
      
      * Style
      
      * Remove testing extras for repo utils
      
      * Try with all decorators
      
      * Try to wipe the cache
      
      * Fix all tests?
      
      * Try this way
      
      * Fix comma
      
      * Update to main
      
      * Try with less deps
      
      * Quality