    Add OWL-ViT model for zero-shot object detection (#17938) · 12d66b47
    Alara Dirik authored
    * add owlvit model skeleton
    
    * add class and box predictor heads
    
    * convert modified flax clip to pytorch
    
    * fix box and class predictors
    
    * add OwlViTImageTextEmbedder
    
    * convert class and box head checkpoints
    
    * convert image text embedder checkpoints
    
    * add object detection head
    
    * fix bugs
    
    * update conversion script
    
    * update conversion script
    
    * fix q,v,k,out weight conversion
    
    * add owlvit object detection output
    
    * fix bug in image embedder
    
    * fix bugs in text embedder
    
    * fix positional embeddings
    
    * fix bug in inference mode vision pooling
    
    * update docs, init tokenizer and processor files
    
    * support batch processing
    
    * add OwlViTProcessor
    
    * remove merge conflicts
    
    * re-add owlvit imports
    
    * fix bug in OwlViTProcessor imports
    
    * fix bugs in processor
    
    * update docs
    
    * fix bugs in processor
    
    * update owlvit docs
    
    * add OwlViTFeatureExtractor
    
    * style changes, add postprocess method to feature extractor
    
    * add feature extractor and processor tests
    
    * add object detection tests
    
    * update conversion script
    
    * update config paths
    
    * update config paths
    
    * fix configuration paths and bugs
    
    * fix bugs in OwlViT tests
    
    * add import checks to processor
    
    * fix docs and minor issues
    
    * fix docs and minor issues
    
    * fix bugs and issues
    
    * fix bugs and issues
    
    * fix bugs and issues
    
    * fix bugs and issues
    
    * update docs and examples
    
    * fix bugs and issues
    
    * update conversion script, fix positional embeddings
    
    * process 2D input ids, update tests
    
    * fix style and quality issues
    
    * update docs
    
    * update docs and imports
    
    * update OWL-ViT index.md
    
    * fix bug in OwlViT feature extractor tests
    
    * fix code examples, return_dict by default
    
    * return_dict by default
    
    * minor fixes, add tests to processor
    
    * small fixes
    
    * add output_attentions arg to main model
    
    * fix bugs
    
    * remove output_hidden_states arg from main model
    
    * update self.config variables
    
    * add option to return last_hidden_states
    
    * fix bug in config variables
    
    * fix copied from statements
    
    * fix small issues and bugs
    
    * fix bugs
    
    * fix bugs, support greyscale images
    
    * run fixup
    
    * update repo name
    
    * merge OwlViTImageTextEmbedder with obj detection head
    
    * fix merge conflict
    
    * fix merge conflict
    
    * make fixup
    
    * fix bugs
    
    * fix bugs
    
    * add additional processor test
README.md 61.7 KB