1. 10 May, 2021 1 commit
    • Big Bird Fast Tokenizer implementation (#11075) · f7f87295
      Tanmay Laud authored
      
      
      * Added Big Bird Fast Tokenizer initial file
      
      * style fixes
      
      * flake fixes
      
      * Added big bird fast tokenizer to init files
      
      * Added big bird fast to Auto tokenization
      
      * fix styles
      
      * minor quality fixes
      
      * Added initial test code
      
      * Fix SpmConverter when precompiled_charsmap doesn't exist
      
      * fixed post processor
      
      * minor style fix
      
      * minor fix input names
      
      * Actually fix identity normalization
      
      * style
      
      * Added token type ids to fast tokenizer
      
      * style
      
      * flake fix
      
      * fix copies
      Co-authored-by: Anthony MOI <m.anthony.moi@gmail.com>
  2. 21 Apr, 2021 1 commit
  3. 31 Mar, 2021 1 commit
  4. 30 Mar, 2021 1 commit
    • BigBird (#10183) · 6dfd0272
      Vasudev Gupta authored
      
      
      * init bigbird
      
      * model.__init__ working, conversion script ready, config updated
      
      * add conversion script
      
      * BigBirdEmbeddings working :)
      
      * slightly update conversion script
      
      * BigBirdAttention working :) ; some bug in layer.output.dense
      
      * add debugger-notebook
      
      * forward() working for BigBirdModel :) ; replaced gelu with gelu_fast
      
      * tf code adapted to torch till rand_attn in bigbird_block_sparse_attention ; till now everything working :)
      
      * BigBirdModel working in block-sparse attention mode :)
      
      * add BigBirdForPreTraining
      
      * small fix
      
      * add tokenizer for BigBirdModel
      
      * fix config & hence modeling
      
      * fix base prefix
      
      * init testing
      
      * init tokenizer test
      
      * pos_embed must be absolute, attn_type=original_full when add_cross_attn=True , nsp loss is optional in BigBirdForPreTraining, add assert statements
      
      * remove position_embedding_type arg
      
      * complete normal tests
      
      * add comments to block sparse attention
      
      * add attn_probs for sliding & global tokens
      
      * create fn for block sparse attn mask creation
      
      * add special tests
      
      * restore pos embed arg
      
      * minor fix
      
      * attn probs update
      
      * make big bird fully gpu friendly
      
      * fix tests
      
      * remove pruning
      
      * correct tokenizer & minor fixes
      
      * update conversion script , remove norm_type
      
      * tokenizer-inference test add
      
      * remove extra comments
      
      * add docs
      
      * save intermediate
      
      * finish trivia_qa conversion
      
      * small update to forward
      
      * correct qa and layer
      
      * better error message
      
      * BigBird QA ready
      
      * fix rebased
      
      * add trivia-qa debugger notebook
      
      * qa setup
      
      * fixed till embeddings
      
      * some issue in q/k/v_layer
      
      * fix bug in conversion-script
      
      * fixed till self-attn
      
      * qa fixed except layer norm
      
      * add qa end2end test
      
      * fix gradient ckpting ; other qa test
      
      * speed-up big bird a bit
      
      * hub_id=google
      
      * clean up
      
      * make quality
      
      * speed up einsum with bmm
      
      * finish perf improvements for big bird
      
      * remove wav2vec2 tok
      
      * fix tokenizer
      
      * include docs
      
      * correct docs
      
      * add helper to auto pad block size
      
      * make style
      
      * remove fast tokenizer for now
      
      * fix some
      
      * add pad test
      
      * finish
      
      * fix some bugs
      
      * fix another bug
      
      * fix buffer tokens
      
      * fix comment and merge from master
      
      * add comments
      
      * make style
      
      * commit some suggestions
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Fix typos
      
      * fix some more suggestions
      
      * add another patch
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * fix copies
      
      * another path
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * update
      
      * update nit suggestions
      
      * make style
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>