    I-BERT model support (#10153) · 63645b3b
    Sehoon Kim authored
    
    
    * IBertConfig, IBertTokenizer added
    
    * IBert model names modified
    
    * tokenizer bugfix
    
    * embedding -> QuantEmbedding
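
A minimal sketch of the idea behind a quantized embedding, with illustrative names (the class and parameters below are assumptions, not the committed API): the weight table is symmetrically quantized to a fixed bit width before the lookup.

```python
import torch
import torch.nn as nn

class QuantEmbeddingSketch(nn.Module):
    """Embedding whose weight table is fake-quantized to `weight_bit` bits."""

    def __init__(self, num_embeddings, embedding_dim, weight_bit=8):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(num_embeddings, embedding_dim))
        self.weight_bit = weight_bit

    def forward(self, input_ids):
        # Symmetric per-tensor quantization: map the largest |w| onto the
        # largest representable integer, then round weights to that grid.
        n = 2 ** (self.weight_bit - 1) - 1
        scale = self.weight.abs().max() / n
        w_q = (self.weight / scale).round().clamp(-n - 1, n) * scale
        # Dequantized here for simulation; an integer-only runtime would
        # keep the integer table and carry the scale separately.
        return nn.functional.embedding(input_ids, w_q)
```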
    
    * quant utils added
    
    * quant_mode added to configuration
    
    * QuantAct added, Embedding layer + QuantAct addition
    
    * QuantAct added
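
QuantAct is the activation quantizer inserted between layers. A hedged sketch of the mechanism (names hypothetical): track a running range of the activations during training, then fake-quantize with a scale derived from that range. The tiny non-zero initialization of x_min/x_max, mentioned further down in this log, guards against a zero scale.

```python
import torch
import torch.nn as nn

class QuantActSketch(nn.Module):
    """Fake-quantizes activations to `activation_bit` bits using a running
    min/max of the values seen during training."""

    def __init__(self, activation_bit=8, momentum=0.95):
        super().__init__()
        self.activation_bit = activation_bit
        self.momentum = momentum
        # Small non-zero init so the derived scale can never be zero.
        self.register_buffer("x_min", torch.tensor(-1e-5))
        self.register_buffer("x_max", torch.tensor(1e-5))

    def forward(self, x):
        if self.training:
            # Exponential moving average of the observed activation range.
            m = self.momentum
            self.x_min = m * self.x_min + (1 - m) * x.min()
            self.x_max = m * self.x_max + (1 - m) * x.max()
        n = 2 ** (self.activation_bit - 1) - 1
        scale = torch.max(self.x_min.abs(), self.x_max.abs()) / n
        return (x / scale).round().clamp(-n - 1, n) * scale
```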
    
    * unused path removed, QKV quantized
    
    * self attention layer all quantized, except softmax
    
    * temporary commit
    
    * all linear layers quantized
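
For the linear layers (including the QKV projections above), the usual recipe is symmetric weight quantization with one scale per output channel. A sketch under that assumption, with illustrative names:

```python
import torch
import torch.nn as nn

class QuantLinearSketch(nn.Module):
    """Linear layer with per-output-channel fake-quantized weights."""

    def __init__(self, in_features, out_features, weight_bit=8):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)
        self.bias = nn.Parameter(torch.zeros(out_features))
        self.weight_bit = weight_bit

    def forward(self, x):
        n = 2 ** (self.weight_bit - 1) - 1
        # One scaling factor per output row; the clamp avoids a zero scale
        # for an all-zero row.
        scale = self.weight.abs().max(dim=1, keepdim=True).values / n
        scale = scale.clamp(min=1e-8)
        w_q = (self.weight / scale).round().clamp(-n - 1, n) * scale
        return nn.functional.linear(x, w_q, self.bias)
```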
    
    * quant_utils bugfix
    
    * bugfix: requantization missing
    
    * IntGELU added
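
IntGELU follows the I-BERT paper's approach of replacing erf with a second-order polynomial so GELU can be evaluated with integer arithmetic. A float-domain sketch of that approximation (polynomial coefficients are the ones reported in the paper; the committed kernel performs the same computation on integers with explicit scaling factors):

```python
import torch

def gelu_poly_sketch(x):
    # GELU(x) = x/2 * (1 + erf(x / sqrt(2))); erf is approximated on a
    # bounded range by sign(u) * (a * (min(|u|, -b) + b)**2 + 1).
    a, b = -0.2888, -1.769
    u = x / 1.4142135623730951  # x / sqrt(2)
    erf_approx = torch.sign(u) * (a * (torch.clamp(u.abs(), max=-b) + b) ** 2 + 1)
    return x * 0.5 * (1.0 + erf_approx)
```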
    
    * IntSoftmax added
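
IntSoftmax likewise bounds the input of exp and approximates it there with a second-order polynomial, restoring the full value with a power-of-two shift. A float-domain sketch (coefficients again from the paper; the committed version stays in integers):

```python
import torch

def softmax_poly_sketch(x, dim=-1):
    ln2 = 0.6931471805599453
    x = x - x.max(dim=dim, keepdim=True).values  # standard max subtraction
    z = torch.floor(-x / ln2)                    # whole ln2 chunks
    p = x + z * ln2                              # remainder in (-ln2, 0]
    exp_p = 0.3585 * (p + 1.353) ** 2 + 0.344    # polynomial approx of exp(p)
    exp_x = exp_p * 2.0 ** (-z)                  # exp(x) = exp(p) * 2**(-z)
    return exp_x / exp_x.sum(dim=dim, keepdim=True)
```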
    
    * LayerNorm implemented
    
    * LayerNorm implemented all
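
The awkward piece of an integer-only LayerNorm is sqrt(variance). The standard trick, which I-BERT uses, is Newton's iteration on integers; a rough sketch (iteration count and stopping behavior are simplified):

```python
import torch

def int_sqrt_sketch(n, iters=30):
    # Newton's iteration x <- (x + n // x) // 2 converges to floor(sqrt(n)).
    n = torch.clamp(n, min=1.0)  # guard the zero-variance case
    x = n.clone()                # coarse initial guess
    for _ in range(iters):
        x = torch.floor((x + torch.floor(n / x)) / 2)
    # Note: plain Newton can oscillate by one near convergence; a real
    # implementation would add a stopping check instead of fixed iterations.
    return x
```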
    
    * names changed: roberta->ibert
    
    * config no longer inherits from RoBERTa
    
    * No support for CausalLM
    
    * static quantization added, quantize_model.py removed
    
    * import modules uncommented
    
    * copyrights fixed
    
    * minor bugfix
    
    * quant_modules, quant_utils merged as one file
    
    * import * fixed
    
    * unused runfile removed
    
    * make style run
    
    * configuration.py docstring fixed
    
    * refactoring: comments removed, function name fixed
    
    * unused dependency removed
    
    * typo fixed
    
    * comments (Copied from) and assertion strings added
    
    * refactoring: super(..) -> super(), etc.
    
    * refactoring
    
    * refactoring
    
    * make style
    
    * refactoring
    
    * cuda -> to(x.device)
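
This is the usual device-agnostic fix: tensors created inside the modules now follow the device of the incoming activations rather than hard-coding `.cuda()`, which fails on CPU-only machines. Schematically (function and variable names hypothetical):

```python
import torch

def requantize_sketch(x, scale):
    # Before: torch.tensor(scale).cuda(). After: create the tensor on
    # whatever device x already lives on.
    s = torch.tensor(scale).to(x.device)
    return torch.round(x / s) * s
```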
    
    * weight initialization removed
    
    * QuantLinear set_param removed
    
    * QuantEmbedding set_param removed
    
    * IntLayerNorm set_param removed
    
    * assert string added
    
    * assertion error message fixed
    
    * is_decoder removed
    
    * enc-dec arguments/functions removed
    
    * Converter removed
    
    * quant_modules docstring fixed
    
    * convert_slow_tokenizer rolled back
    
    * quant_utils docstring fixed
    
    * unused arguments, e.g. use_cache, removed from config
    
    * weight initialization condition fixed
    
    * x_min, x_max initialized with small values to avoid div-zero exceptions
    
    * testing code for ibert
    
    * test emb, linear, gelu, softmax added
    
    * test ln and act added
    
    * style reformatted
    
    * force_dequant added
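
force_dequant landed as an IBertConfig option that pulls selected nonlinearities back to full precision while the rest of the model stays quantized, which is useful for ablations. Roughly:

```python
from transformers import IBertConfig, IBertModel

# quant_mode=True enables the integer-only path; force_dequant="gelu" runs
# just the GELU in full precision ("softmax", "layernorm", and "nonlinear"
# are the other supported values besides the default "none").
config = IBertConfig(quant_mode=True, force_dequant="gelu")
model = IBertModel(config)
```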
    
    * error tests overridden
    
    * make style
    
    * Style + Docs
    
    * force dequant tests added
    
    * Fix fast tokenizer in init
    
    * Fix doc
    
    * Remove space
    
    * docstring, IBertConfig, chunk_size
    
    * test_modeling_ibert refactoring
    
    * quant_modules.py refactoring
    
    * e2e integration test added
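
The end-to-end test exercises the full stack; the same flow works as a smoke test from user code, assuming the kssteven/ibert-roberta-base checkpoint that the I-BERT docs reference:

```python
import torch
from transformers import AutoTokenizer, IBertModel

tokenizer = AutoTokenizer.from_pretrained("kssteven/ibert-roberta-base")
model = IBertModel.from_pretrained("kssteven/ibert-roberta-base")

inputs = tokenizer("Integer-only inference with I-BERT", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, sequence_length, 768)
```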
    
    * tokenizers removed
    
    * IBertConfig added to tokenization_auto.py
    
    * bugfix
    
    * fix docs & test
    
    * fix style num 2
    
    * final fixes
    Co-authored-by: Sehoon Kim <sehoonkim@berkeley.edu>
    Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
    Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
    Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>