• Feed forward chunking (#6024) · b25cec13
    Pradhy729 authored
    
    
    * Chunked feed forward for Bert
    
    This is an initial implementation to test feed forward chunking for BERT.
    It will need further modifications depending on output and benchmark results.
    
    * Black and cleanup
    
    * Feed forward chunking in BertLayer class.
    
    * Isort
    
    * add chunking for all models
    
    * fix docs
    
    * Fix typo
    Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
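For readers unfamiliar with the technique named in the commit title, here is a minimal sketch of feed forward chunking. It is not the code from this commit. It uses plain Python lists instead of tensors, and every name in it (`W1`, `W2`, `linear`, `feed_forward`, `chunked_feed_forward`, and the convention that a chunk size of 0 disables chunking) is a hypothetical choice made for illustration. The idea relies on the feed forward layer acting on each sequence position independently, so the sequence can be processed in slices and concatenated, which keeps only one slice's intermediate activations alive at a time.

```python
# Illustrative sketch only; names, weights, and sizes are hypothetical.

# Toy weights: each inner list holds the input weights of one output unit.
W1 = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.2]]  # hidden size 2 -> intermediate size 3
W2 = [[0.3, -0.1, 0.2], [0.2, 0.4, -0.5]]    # intermediate size 3 -> hidden size 2


def linear(row, weights):
    """Project one token vector through a weight matrix (no bias)."""
    return [sum(x * w for x, w in zip(row, unit)) for unit in weights]


def feed_forward(rows):
    """Position-wise feed forward: each token (row) is transformed independently."""
    out = []
    for row in rows:
        intermediate = [max(0.0, h) for h in linear(row, W1)]  # ReLU activation
        out.append(linear(intermediate, W2))
    return out


def chunked_feed_forward(rows, chunk_size):
    """Apply feed_forward to slices along the sequence axis and concatenate."""
    if chunk_size <= 0:
        # Assumption for this sketch: a non-positive size means "no chunking".
        return feed_forward(rows)
    out = []
    for start in range(0, len(rows), chunk_size):
        out.extend(feed_forward(rows[start:start + chunk_size]))
    return out


# A sequence of 6 tokens, each with hidden size 2.
hidden_states = [[0.1 * i, 0.2 * i - 0.3] for i in range(6)]

full = feed_forward(hidden_states)
chunked = chunked_feed_forward(hidden_states, chunk_size=2)

# Chunking changes how the work is scheduled, not the result.
assert chunked == full
```

Because the output is identical either way, the chunk size in a sketch like this is purely a memory/speed trade-off: smaller chunks lower peak memory for the intermediate activations at the cost of more, smaller calls.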
test_modeling_bert.py 17.4 KB