    Refactor the baseclass related to transformer (#978) · e05fb560
    Shilong Zhang authored
    
    
    * minor changes
    
    * change to ModuleList
    
    * change to Sequential
    
    * replace dropout with attn_drop and proj_drop in MultiheadAttention
    
    * add operation_name for attn
    
    * add drop path and move all ffn args to ffncfgs
    
    * fix typo
    
    * fix a bug when use default value of ffn_cfgs
    
    * fix FFNs
    
    * add deprecation warning
    
    * fix deprecation warning
    
    * change to pop kwargs
    
    * support register FFN of transformer
    
    * support batch first
    
    * fix batch first wrapper
    
    * fix forward wrapper
    
    * fix typo
    
    * fix lint
    
    * add unit test for transformer
    
    * fix unit test
    
    * fix equal
    
    * use allclose
    
    * fix comments
    
    * fix comments
    
    * change configdict to dict
    
    * move drop to a file
    
    * add comments for drop path
    
    * add noqa 501
    
    * move bnc wrapper to MultiheadAttention
    
    * move bnc wrapper to MultiheadAttention
    
    * use deprecation warning
    
    * resolve comments
    
    * add unit test
    
    * rename residual to identity
    
    * revert runner
    
    * msda residual to identity
    
    * rename inp_identity to identity
    
    * fix name
    
    * fix transformer
    
    * remove key in msda
    
    * remove assert for key
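
    The "add drop path" / "add comments for drop path" entries refer to stochastic depth regularization, which this refactor moved into its own module. As a rough illustration of the technique (a minimal standalone NumPy sketch, not mmcv's actual torch-based implementation):
    
    ```python
    import numpy as np
    
    def drop_path(x, drop_prob=0.0, training=True, rng=None):
        """Stochastic depth: randomly zero out whole residual branches per sample.
    
        Survivors are rescaled by 1 / keep_prob so the expected value of the
        output matches the input. (Hypothetical sketch; mmcv's version operates
        on torch tensors.)
        """
        if drop_prob == 0.0 or not training:
            return x
        rng = rng if rng is not None else np.random.default_rng()
        keep_prob = 1.0 - drop_prob
        # One Bernoulli draw per sample, broadcast over all remaining dims.
        shape = (x.shape[0],) + (1,) * (x.ndim - 1)
        mask = (rng.random(shape) < keep_prob).astype(x.dtype)
        return x / keep_prob * mask
    ```
    
    At inference time (`training=False`) the branch is returned unchanged, so no rescaling is needed.
    
    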
    Co-authored-by: HIT-cwh <2892770585@qq.com>
    Co-authored-by: bkhuang <congee524@gmail.com>
    Co-authored-by: Wenwei Zhang <40779233+ZwwWayne@users.noreply.github.com>
test_ms_deformable_attn.py 5.79 KB