    [Flax] Adapt Flax models to new structure (#9484) · 0b98ca36
    Patrick von Platen authored
    
    
    * Create modeling_flax_eletra with code copied from modeling_flax_bert
    
    * Add ElectraForMaskedLM and ElectraForPretraining
    
    * Add modeling test for Flax electra and fix naming and arg in Flax Electra model
    
    * Add documentation
    
    * Fix code style
    
    * Fix code quality
    
    * Adjust tol in assert_almost_equal due to very small difference between model output, ranging 0.0010 - 0.0016
    
    * Remove redundant ElectraPooler
    
    * save intermediate
    
    * adapt
    
    * correct bert flax design
    
    * adapt roberta as well
    
    * finish roberta flax
    
    * finish
    
    * apply suggestions
    
    * apply suggestions
    Co-authored-by: Chris Nguyen <anhtu2687@gmail.com>
test_modeling_flax_common.py 6 KB