    Add ViLT (#14895) · ac227093
    NielsRogge authored
    
    
    * First commit
    
    * Add conversion script
    
    * Make conversion script work for base model
    
    * More improvements
    
    * Update conversion script, works for vqa
    
    * Add indexing argument to meshgrid
    
    * Make conversion script work for ViltForPreTraining
    
    * Add ViltForPreTraining to docs
    
    * Fix device issue
    
    * Add processor
    
    * Add MinMaxResize to feature extractor
    
    * Implement call method of ViltProcessor
    
    * Fix tests
    
    * Add integration test
    
    * Add loss calculation for VQA
    
    * Improve tests
    
    * Improve some more tests
    
    * Debug tests
    
    * Small improvements
    
    * Add support for attention_mask
    
    * Remove mask_it
    
    * Add pixel_mask
    
    * Add tests for ViltFeatureExtractor
    
    * Improve tests
    
    * Add ViltForNaturalLanguageVisualReasoning
    
    * Add ViltForNaturalLanguageVisualReasoning to conversion script
    
    * Minor fixes
    
    * Add support for image_embeds, update docstrings to markdown
    
    * Update docs to markdown
    
    * Improve conversion script
    
    * Rename ViltForPreTraining to ViltForMaskedLM
    
    * Improve conversion script
    
    * Convert docstrings to markdown
    
    * Fix code example of retrieval model
    
    * Properly convert masked language model
    
    * Add integration test for nlvr
    
    * Fix code quality
    
    * Apply suggestions from code review
    
    * Add copied from statements
    
    * Fix pretrained_config_archive_map
    
    * Fix docs
    
    * Add model to README
    
    * Apply suggestions from code review
    Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
    
    * Apply more suggestions from code review
    
    * Make code more readable
    
    * Add ViltForNaturalLanguageVisualReasoning to the tests
    
    * Rename ViltForVisualQuestionAnswering to ViltForQuestionAnswering
    
    * Replace pixel_values_2 by single tensor
    
    * Add hidden_states and attentions
    
    * Fix one more test
    
    * Fix all tests
    
    * Update year
    
    * Fix rebase issues
    
    * Fix another rebase issue
    
    * Remove ViltForPreTraining from auto mapping
    
    * Rename ViltForImageRetrievalTextRetrieval to ViltForImageAndTextRetrieval
    
    * Make it possible to use BertTokenizerFast in the processor
    
    * Use BertTokenizerFast by default
    
    * Rename ViltForNaturalLanguageVisualReasoning, define custom model output
    Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>