    Add LUKE (#11223) · f3cf8ae7
    NielsRogge authored
    
    
    * Rebase with master
    
    * Minor bug fix in docs
    
    * Copy files from adding_luke_v2 and improve docs
    
    * change the default value of use_entity_aware_attention to True
    
    * remove word_hidden_states
    
    * fix head models
    
    * fix tests
    
    * fix the conversion script
    
    * add integration tests for the pretrained large model
    
    * improve docstring
    
    * Improve docs, make style
    
    * fix _init_weights for pytorch 1.8
    
    * improve docs
    
    * fix tokenizer to construct entity sequence with [MASK] entity when entities=None
    
    * Make fix-copies
    
    * Make style & quality
    
    * Bug fixes
    
    * Add LukeTokenizer to init
    
    * Address most comments by @patil-suraj and @LysandreJik
    
    * rename _compute_extended_attention_mask to get_extended_attention_mask
    
    * add comments to LukeSelfAttention
    
    * fix the documentation of the tokenizer
    
    * address comments by @patil-suraj, @LysandreJik, and @sgugger
    
    * improve docs
    
    * Make style, quality and fix-copies
    
    * Improve docs
    
    * fix docs
    
    * add "entity_span_classification" task
    
    * update example code for LukeForEntitySpanClassification
    
    * improve docs
    
    * improve docs
    
    * improve the code example in luke.rst
    
    * rename the classification layer in LukeForEntityClassification from typing to classifier
    
    * add bias to the classifier in LukeForEntitySpanClassification
    
    * update docs to use fine-tuned hub models in code examples of the head models
    
    * update the example sentences
    
    * Make style & quality
    
    * Add require_torch to tokenizer tests
    
    * Add require_torch to tokenizer tests
    
    * Address comments by @sgugger and add community notebooks
    
    * Make fix-copies
    Co-authored-by: Ikuya Yamada <ikuya@ikuya.net>
tokenization_auto.py 20.5 KB