1. 07 Oct, 2022 1 commit
    • [WIP] Add ZeroShotObjectDetectionPipeline (#18445) (#18930) · e9a49bab
      Amrit Sahu authored
      * Add ZeroShotObjectDetectionPipeline (#18445)
      
      * Add AutoModelForZeroShotObjectDetection task
      
      This commit also adds the following
      
      - Add explicit _processor method for ZeroShotObjectDetectionPipeline.
        This is necessary as pipelines don't auto infer processors yet and
        `OwlVitProcessor` wraps tokenizer and feature_extractor together, to
        process multiple images at once
      
      - Add auto tests and other tests for ZeroShotObjectDetectionPipeline
      
      * Add batching for ZeroShotObjectDetectionPipeline
      
      * Fix doc-string ZeroShotObjectDetectionPipeline
      
      * Fix output format: ZeroShotObjectDetectionPipeline
  2. 07 Sep, 2022 1 commit
    • Add DocumentQuestionAnswering pipeline (#18414) · 2ef77421
      Ankur Goyal authored
      
      
      * [WIP] Skeleton of VisualQuestionAnsweringPipeline extended to support LayoutLM-like models
      
      * Fixup
      
      * Use the full encoding
      
      * Basic refactoring to DocumentQuestionAnsweringPipeline
      
      * Cleanup
      
      * Improve args, docs, and implement preprocessing
      
      * Integrate OCR
      
      * Refactor question_answering pipeline
      
      * Use refactored QA code in the document qa pipeline
      
      * Fix tests
      
      * Some small cleanups
      
      * Use a string type annotation for Image.Image
      
      * Update encoding with image features
      
      * Wire through the basic docs
      
      * Handle invalid response
      
      * Handle empty word_boxes properly
      
      * Docstring fix
      
      * Integrate Donut model
      
      * Fixup
      
      * Incorporate comments
      
      * Address comments
      
      * Initial incorporation of tests
      
      * Address Comments
      
      * Change assert to ValueError
      
      * Comments
      
      * Wrap `score` in float to make it JSON serializable
      
      * Incorporate AutoModelForDocumentQuestionAnswering changes
      
      * Fixup
      
      * Rename postprocess function
      
      * Fix auto import
      
      * Applying comments
      
      * Improve docs
      
      * Remove extra assets and add copyright
      
      * Address comments
      Co-authored-by: Ankur Goyal <ankur@impira.com>
  3. 12 Aug, 2022 1 commit
  4. 04 Aug, 2022 1 commit
    • Add VideoMAE (#17821) · f9a0008d
      NielsRogge authored
      
      
      * First draft
      
      * Add VideoMAEForVideoClassification
      
      * Improve conversion script
      
      * Add VideoMAEForPreTraining
      
      * Add VideoMAEFeatureExtractor
      
      * Improve VideoMAEFeatureExtractor
      
      * Improve docs
      
      * Add first draft of model tests
      
      * Improve VideoMAEForPreTraining
      
      * Fix base_model_prefix
      
      * Make model take pixel_values of shape (B, T, C, H, W)
      
      * Add loss computation of VideoMAEForPreTraining
      
      * Improve tests
      
      * Improve model tests
      
      * Make all tests pass
      
      * Add VideoMAE to main README
      
      * Add tests for VideoMAEFeatureExtractor
      
      * Add integration test
      
      * Improve conversion script
      
      * Rename patch embedding class
      
      * Remove VideoMAELayer from init
      
      * Update design of patch embeddings
      
      * Improve comments
      
      * Improve conversion script
      
      * Improve conversion script
      
      * Add conversion of pretrained model
      
      * Add loss verification of pretrained model
      
      * Add loss verification of unnormalized targets
      
      * Add integration test for pretraining model
      
      * Apply suggestions from code review
      
      * Fix bug to make feature extractor resize only shorter edge
      
      * Address more comments
      
      * Improve normalization of videos
      
      * Add doc examples
      
      * Move constants to dedicated script
      
      * Remove scripts
      
      * Transfer checkpoints, fix docs
      
      * Update script
      
      * Update image mean and std
      
      * Fix doc tests
      
      * Set return_tensors to NumPy by default
      
      * Revert the previous change
      Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
  5. 13 Jun, 2022 1 commit
  6. 09 May, 2022 1 commit
    • add `mobilebert` onnx configs (#17029) · dc3645dc
      Manan Dey authored
      * update docs of length_penalty
      
      * Revert "update docs of length_penalty"
      
      This reverts commit 466bf4800b75ec29bd2ff75bad8e8973bd98d01c.
      
      * add mobilebert onnx config
      
      * address suggestions
      
      * Update auto.mdx
      
      * Update __init__.py
      
      * Update features.py
  7. 04 Apr, 2022 1 commit
  8. 04 Mar, 2022 1 commit
  9. 17 Feb, 2022 1 commit
    • Add SimMIM (#15586) · 57882177
      NielsRogge authored
      
      
      * Add first draft
      
      * Make model importable
      
      * Make SwinForMaskedImageModeling importable
      
      * Fix imports
      
      * Add missing inits
      
      * Add support for Swin
      
      * Fix bug
      
      * Fix bug
      
      * Fix another bug
      
      * Fix Swin MIM implementation
      
      * Fix default encoder stride
      
      * Fix Swin
      
      * Add print statements for debugging
      
      * Add image_size data argument
      
      * Fix Swin
      
      * Fix image_size
      
      * Add print statements for debugging
      
      * Fix print statement
      
      * Remove print statements
      
      * Improve reshaping of bool_masked_pos
      
      * Add support for DeiT, fix tests
      
      * Improve docstrings
      
      * Apply new black version
      
      * Improve script
      
      * Fix bug
      
      * Improve README
      
      * Apply suggestions from code review
      
      * Remove DS_Store and add to gitignore
      
      * Apply suggestions from code review + fix BEiT Flax
      
      * Revert BEiT changes
      
      * Improve README
      
      * Fix code quality
      
      * Improve README
      Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>
      Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
  10. 08 Feb, 2022 1 commit
    • Add TFSpeech2Text (#15113) · 8406fa6d
      Joao Gante authored
      * Add wrapper classes
      
      * convert inner layers to tf
      
      * Add TF Encoder and Decoder layers
      
      * TFSpeech2Text models
      
      * Loadable model
      
      * TF model with same outputs as PT model
      
      * test skeleton
      
      * correct tests and run the fixup
      
      * correct attention expansion
      
      * TFSpeech2Text past_key_values with TF format
  11. 04 Feb, 2022 1 commit
  12. 10 Jan, 2022 1 commit
    • Add TFVisionEncoderDecoderModel (#14148) · b67fd797
      Yih-Dar authored
      
      
      * Start the work on TFVisionEncoderDecoderModel
      
      * Expose TFVisionEncoderDecoderModel
      
      * fix import
      
      * Add modeling_tf_vision_encoder_decoder to _ignore_modules in get_model_modules()
      
      * reorder
      
      * Apply the fix for checkpoint loading as in #14016
      
      * remove attention_mask + fix VISION_DUMMY_INPUTS
      
      * A minimal change to make TF generate() work for vision models as encoder in encoder-decoder setting
      
      * fix wrong condition: shape_list(input_ids) == 2
      
      * add tests
      
      * use personal TFViTModel checkpoint (for now)
      
      * Add equivalence tests + projection layer
      
      * style
      
      * make sure projection layer can run
      
      * Add examples
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Clean comments (need to work on TODOs for PyTorch models)
      
      * Remove TF -> PT in check_pt_tf_equivalence for TFVisionEncoderDecoderModel
      
      * fixes
      
      * Revert changes in PT code.
      
      * Update tests/test_modeling_tf_vision_encoder_decoder.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Add test_inference_coco_en for TF test
      
      * fix quality
      
      * fix name
      
      * build doc
      
      * add main_input_name
      
      * Fix ckpt name in test
      
      * fix diff between master and this PR
      
      * fix doc
      
      * fix style and quality
      
      * fix missing doc
      
      * fix labels handling
      
      * Delete auto.rst
      
      * Add the changes done in #14016
      
      * fix prefix
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * make style
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
  13. 28 Dec, 2021 1 commit
  14. 27 Dec, 2021 1 commit