1. 08 Dec, 2021 1 commit
    • Add Perceiver IO (#14487) · 65b20b73
      NielsRogge authored
      * First draft
      
      * Style and remove mlm
      
      * Make forward pass work
      
      * More improvements
      
      * More improvements
      
      * Fix bug
      
      * More improvements
      
      * More improvements
      
      * Add PerceiverTokenizer first draft
      
      * Improve conversion script
      
      * More improvements
      
      * Make conversion script work for the encoder
      
      * Make conversion script work with local pickle files
      
      * Style & quality, fix-copies
      
      * Add dummy input to conversion script
      
      * Add absolute position embeddings to TextPreProcessor
      
      * Make forward pass of encoder work
      
      * More improvements
      
      * Move text preprocessor to separate script
      
      * More improvements
      
      * More improvements
      
      * Add post processor
      
      * Make MLM model work
      
      * Style
      
      * Add PerceiverForMaskedLM
      
      * Add PerceiverImagePreprocessor
      
      * Make style
      
      * Make PerceiverForImageClassification work
      
      * More improvements
      
      * More improvements
      
      * Use tokenizer in conversion script
      
      * Use PerceiverForMaskedLM in conversion script
      
      * Define custom PerceiverModelOutput
      
      * Improve PerceiverAttention to make it work for both MLM and image classification
      
      * More improvements
      
      * More improvements
      
      * More improvements to the conversion script
      
      * Make conversion script work for both MLM and image classification
      
      * Add PerceiverFeatureExtractor
      
      * More improvements
      
      * Style and quality
      
      * Add center cropping
      
      * Fix bug
      
      * Small fix
      
      * Add print statement
      
      * Fix bug in image preprocessor
      
      * Fix bug with conversion script
      
      * Make output position embeddings an nn.Parameter layer instead of nn.Embedding
      
      * Comment out print statements
      
      * Add position encoding classes
      
      * More improvements
      
      * Use position_encoding_kwargs
      
      * Add PerceiverForImageClassificationFourier
      
      * Make style & quality
      
      * Add PerceiverForImageClassificationConvProcessing
      
      * Style & quality
      
      * Add flow model
      
      * Move processors to modeling file
      
      * Make position encodings modular
      
      * Make basic decoder use modular position encodings
      
      * Add PerceiverForOpticalFlow to conversion script
      
      * Add AudioPreprocessor
      
      * Make it possible for the basic decoder to use Fourier position embeddings
      
      * Add PerceiverForMultimodalAutoencoding
      
      * Improve model for optical flow
      
      * Improve _build_network_inputs method
      
      * Add print statement
      
      * Fix device issue
      
      * Fix device of Fourier embeddings
      
      * Add print statements for debugging
      
      * Add another print statement
      
      * Add another print statement
      
      * Add another print statement
      
      * Add another print statement
      
      * Improve PerceiverAudioPreprocessor
      
      * Improve conversion script for multimodal model
      
      * More improvements
      
      * More improvements
      
      * Improve multimodal model
      
      * Make forward pass multimodal model work
      
      * More improvements
      
      * Improve tests
      
      * Fix some more tests
      
      * Add output dataclasses
      
      * Make more tests pass
      
      * Add print statements for debugging
      
      * Add tests for image classification
      
      * Add PerceiverClassifierOutput
      
      * More improvements
      
      * Make more tests pass for the optical flow model
      
      * Make style & quality
      
      * Small improvements
      
      * Don't support training for optical flow model for now
      
      * Fix _prepare_for_class for tests
      
      * Make more tests pass, add some docs
      
      * Add multimodal model to tests
      
      * Minor fixes
      
      * Fix tests
      
      * Improve conversion script
      
      * Make fixup
      
      * Remove pos_dim argument
      
      * Fix device issue
      
      * Potential fix for OOM
      
      * Revert previous commit
      
      * Fix test_initialization
      
      * Add print statements for debugging
      
      * Fix print statement
      
      * Add print statement
      
      * Add print statement
      
      * Add print statement
      
      * Add print statement
      
      * Add print statement
      
      * Add print statement
      
      * Remove need for output_shape
      
      * Comment out output_shape
      
      * Remove unnecessary code
      
      * Improve docs
      
      * Fix make fixup
      
      * Remove PerceiverTextProcessor from init
      
      * Improve docs
      
      * Small improvement
      
      * Apply first batch of suggestions from code review
      
      * Apply more suggestions from code review
      
      * Update docstrings
      
      * Define dicts beforehand for readability
      
      * Rename task to architecture in conversion script, include PerceiverModel in tests
      
      * Add print statements for debugging
      
      * Fix tests on GPU
      
      * Remove preprocessors, postprocessors and decoders from main init
      
      * Add integration test
      
      * Fix docs
      
      * Replace einops by torch
      
      * Update for new docs frontend
      
      * Rename PerceiverForImageClassification
      
      * Improve docs
      
      * Improve docs
      
      * Improve docs of PerceiverModel
      
      * Fix some more tests
      
      * Improve center_crop
      
      * Add PerceiverForSequenceClassification
      
      * Small improvements
      
      * Fix tests
      
      * Add integration test for optical flow model
      
      * Clean up
      
      * Add tests for tokenizer
      
      * Fix tokenizer by adding special tokens properly
      
      * Fix CI