1. 30 Jun, 2021 2 commits
    • [Flax] Add wav2vec2 (#12271) · 0d1f67e6
      Patrick von Platen authored
      
      
      * fix_torch_device_generate_test
      
      * remove @
      
      * start flax wav2vec2
      
      * save intermediate
      
      * forward pass has correct shape
      
      * add weight norm
      
      * add files
      
      * finish ctc
      
      * make style
      
      * finish gumbel quantizer
      
      * correct docstrings
      
      * correct some more files
      
      * fix vit
      
      * finish quality
      
      * correct tests
      
      * correct docstring
      
      * correct tests
      
      * start wav2vec2 pretraining script
      
      * save intermediate
      
      * start pretraining script
      
      * finalize pretraining script
      
      * finish
      
      * finish
      
      * small typo
      
      * finish
      
      * correct
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * make style
      
      * push
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      0d1f67e6
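
For orientation, a minimal usage sketch of the Flax wav2vec2 model added in this commit. The checkpoint name and the dummy waveform are illustrative assumptions, not taken from the commit; pass from_pt=True if the checkpoint only ships PyTorch weights.

```python
import numpy as np
from transformers import Wav2Vec2FeatureExtractor, FlaxWav2Vec2Model

# Assumed checkpoint for illustration; any wav2vec2 checkpoint should work.
ckpt = "facebook/wav2vec2-base-960h"
feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(ckpt)
model = FlaxWav2Vec2Model.from_pretrained(ckpt)

# One second of dummy 16 kHz audio standing in for a real recording.
waveform = np.random.randn(16000).astype(np.float32)
inputs = feature_extractor(waveform, sampling_rate=16000, return_tensors="np")

outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, frames, hidden_size)
```
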
    • Add CANINE (#12024) · 6e685978
      NielsRogge authored
      
      
      * First pass
      
      * More progress
      
      * Add support for local attention
      
      * More improvements
      
      * More improvements
      
      * Conversion script working
      
      * Add CanineTokenizer
      
      * Make style & quality
      
      * First draft of integration test
      
      * Remove decoder test
      
      * Improve tests
      
      * Add documentation
      
      * Mostly docs improvements
      
      * Add CanineTokenizer tests
      
      * Fix most tests on GPU, improve upsampling projection
      
      * Address most comments by @dhgarrette
      
      * Remove decoder logic
      
      * Improve Canine tests, improve docs of CanineConfig
      
      * All tokenizer tests passing
      
      * Make fix-copies and fix tokenizer tests
      
      * Fix test_model_outputs_equivalence test
      
      * Apply suggestions from @sgugger's review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Address some more comments
      
      * Add support for hidden_states and attentions of shallow encoders
      
      * Define custom CanineModelOutputWithPooling, tests pass
      
      * Make conversion script work for Canine-c too
      
      * Fix tokenizer tests
      
      * Remove file
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      6e685978
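
A minimal sketch of the CANINE model and its character-level tokenizer added here; the google/canine-s checkpoint name is an assumption for illustration.

```python
from transformers import CanineTokenizer, CanineModel

tokenizer = CanineTokenizer.from_pretrained("google/canine-s")  # assumed checkpoint
model = CanineModel.from_pretrained("google/canine-s")

# CANINE works directly on Unicode code points, so there is no wordpiece vocabulary.
inputs = tokenizer(["Life is like a box of chocolates."], padding="longest", return_tensors="pt")
outputs = model(**inputs)

print(outputs.last_hidden_state.shape)  # (batch, char_seq_len, hidden_size)
print(outputs.pooler_output.shape)      # from the custom CanineModelOutputWithPooling
```
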
  2. 23 Jun, 2021 1 commit
  3. 22 Jun, 2021 1 commit
  4. 17 Jun, 2021 1 commit
  5. 16 Jun, 2021 1 commit
  6. 14 Jun, 2021 3 commits
    • Flax Big Bird (#11967) · d9c0d08f
      Vasudev Gupta authored
      
      
      * add flax bert
      
      * bert -> bigbird
      
      * original_full ported
      
      * add debugger
      
      * init block sparse
      
      * fix copies ; gelu_fast -> gelu_new
      
      * block sparse port
      
      * fix block sparse
      
      * block sparse working
      
      * all ckpts working
      
      * fix-copies
      
      * make quality
      
      * init tests
      
      * temporary fix for FlaxBigBirdForMultipleChoice
      
      * skip test_attention_outputs
      
      * fix
      
      * gelu_fast -> gelu_new ; fix multiple choice model
      
      * remove nsp
      
      * fix sequence classifier
      
      * fix
      
      * make quality
      
      * make fix-copies
      
      * finish
      
      * Delete debugger.ipynb
      
      * Update src/transformers/models/big_bird/modeling_flax_big_bird.py
      
      * make style
      
      * finish
      
      * bye bye jit flax tests
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      d9c0d08f
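
A minimal sketch of the Flax BigBird port; the checkpoint name is an assumption, and full attention is selected because block-sparse attention targets long sequences.

```python
from transformers import BigBirdTokenizer, FlaxBigBirdModel

# Assumed checkpoint; "original_full" avoids block-sparse constraints on short inputs.
tokenizer = BigBirdTokenizer.from_pretrained("google/bigbird-roberta-base")
model = FlaxBigBirdModel.from_pretrained(
    "google/bigbird-roberta-base", attention_type="original_full"
)

inputs = tokenizer("Flax BigBird ported from the PyTorch implementation.", return_tensors="np")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)
```
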
    • Adding TFWav2Vec2Model (#11617) · d438eee0
      Will Rice authored
      
      
      * [WIP] Add TFWav2Vec2Model
      
      Work in progress for adding a TensorFlow version of Wav2Vec2
      
      * feedback changes
      
      * small fix
      
      * Test Feedback Round 1
      
      * Add SpecAugment and CTC Loss
      
      * correct spec augment mask creation
      
      * docstring and correct copyright
      
      * correct bugs
      
      * remove bogus file
      
      * finish tests correction
      
      * del unnecessary layers
      
      * Update src/transformers/models/wav2vec2/modeling_tf_wav2vec2.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * make style
      
      * correct final bug
      
      * Feedback Changes
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      d438eee0
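
A minimal sketch of the new TensorFlow wav2vec2 model; the checkpoint and the from_pt conversion are assumptions for illustration.

```python
import numpy as np
from transformers import Wav2Vec2FeatureExtractor, TFWav2Vec2Model

ckpt = "facebook/wav2vec2-base-960h"  # assumed checkpoint
feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(ckpt)
# from_pt=True converts the PyTorch weights on the fly if no TF weights are hosted.
model = TFWav2Vec2Model.from_pretrained(ckpt, from_pt=True)

waveform = np.random.randn(16000).astype(np.float32)  # dummy 16 kHz audio
inputs = feature_extractor(waveform, sampling_rate=16000, return_tensors="tf")

outputs = model(**inputs)
print(outputs.last_hidden_state.shape)
```
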
    • FlaxBart (#11537) · 4a51b1dd
      Daniel Stancl authored
      
      
      * Start working on FlaxBart
      
      * Create modeling_flax_bart.py
      
      * Write FlaxBartAttention
      
      * Add FlaxBartEncoderLayer
      
      * Add FlaxBartDecoderLayer and some typing
      
      * Add helper function for FlaxBart
      
      * shift_tokens_right
      
      * _make_causal_mask
      
      * _expand_mask
      
      * Add PositionalEmbedding and fix init_std naming
      
      * Add FlaxBartPretrainedModel
      
      * Add FlaxBartEncoder
      
      * Add FlaxBartEncoder
      
      * Add FlaxBartEncoder among modules to be imported
      
      * YET WE CANNOT INITIALIZE THAT!! :(
      
      * Make BartEncoder working
      
      Change BartEncoder to instance of nn.Module so far
      
      * Add FlaxBartDecoder
      
      * Add FlaxBartModel
      
      * TODO to make model run -> Prepare model inputs
      
      * Resolve padding
      
      * Add FlaxBartModel
      
      * Add FlaxBartModel into importable modules
      
      * Remove FlaxBartEncoder and FlaxBartDecoder from importable modules
      
      * make style; not properly working
      
      * make style; make quality not pass due to some import I left
      
      * Remove TODO for padding_idx in nn.Embed so far
      
      * Add FlaxBartForConditionalGeneration
      
      * Incorporate Flax model output classes, i.e. return_dict
      
      * Add another models and incorporate use_cache arg
      
      * Add FlaxBartForSequenceClassification and FlaxBartForQuestionAnswering
      
      * Incorporate use_cache arg from PyTorch implementation
      
      * Add all necessary Flax output utils
      
      * Add FlaxBartForCausalLM; not working yet
      
      * Add minor improvements; still lacks some functionality
      
      * Update docs, src and tests
      
      * Add support of FlaxBart to docs/source
      
      * Fix some bugs in FlaxBart source code
      
      * Add some necessary tests for FlaxBart models - jit_compilation not passing
      
      * Fix tests and add test_head_masking
      
      * Fix tests for @jax.jit computation
      
      * Add test_head_masking
      
      * Migrate FlaxBart tests from jax.numpy to numpy
      
      * Remove FlaxBartForCausalLM
      
      * Clean repo
      
      * fix bart model weight structure
      
      * Fix FlaxBartForSequenceClassification
      
      Slicing cannot be used under jit, so the way the sentence representation is selected from hidden_states had to be changed.
      
      * Allow FlaxBartForSequenceClassification for testing pt_flax equivalence
      
      * Allow testing for FlaxBartForQA for pt_flax equivalence
      
      * Add a comment to FlaxBartForSequenceClassification + change noise from 1e-3 to 1e-6
      
      * remove past_key_values
      
      * remove inputs_embeds and make input_ids required
      
      * add position ids
      
      * re-write attention layer
      
      * fix dataclass
      
      * fix pos embeds and attention output
      
      * fix pos embeds
      
      * expose encode method
      
      * expose decode method
      
      * move docstring to top
      
      * add cache for causal attn layer
      
      * remove head masking for now
      
      * s2s greedy search first pass
      
      * boom boom
      
      * fix typos
      
      * fix greedy generate for bart
      
      * use encoder, decoder layers instead of num_hidden_layers
      
      * handle encoder_outputs
      
      * cleanup
      
      * simplify decoding
      
      * more clean-up
      
      * typos
      
      * Change header + add {decoder_,}position_ids into 2 models
      
      * add BartConfig
      
      * fix existing tests
      
      * add encode, decode methods
      
      * Fix shift_tokens_right for JIT compilation + clarify one condition
      
      * fix decode
      
      * encoder => encode
      
      * simplify generate
      
      * add tests for encode and decode
      
      * style
      
      * add tests for cache
      
      * fix equivalence tests
      
      * sample generate now works with seq2seq
      
      * generation tests
      
      * initialize dense layers
      
      * docstring and cleanup
      
      * quality
      
      * remove get/set input_embeddings
      
      * address Patrick's suggestions
      
      * decode for every model, remove encoder_outputs from call
      
      * update tests accordingly
      
      * decode returns only decoder outputs and logits
      
      * fix arguments
      
      * doc encode, decode methods
      
      * correct base_model_prefix
      
      * fix test for seq classif model
      
      * fix docs
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      4a51b1dd
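
A minimal generation sketch for the new Flax BART model; the checkpoint name is an assumption. The commit also exposes encode()/decode() methods, but generate() covers the common seq2seq case.

```python
import numpy as np
from transformers import BartTokenizer, FlaxBartForConditionalGeneration

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")  # assumed checkpoint
model = FlaxBartForConditionalGeneration.from_pretrained("facebook/bart-base")

inputs = tokenizer("My friends are cool but they eat too many carbs.", return_tensors="np")
generated = model.generate(inputs["input_ids"], max_length=20)  # greedy search

print(tokenizer.batch_decode(np.asarray(generated.sequences), skip_special_tokens=True))
```
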
  7. 10 Jun, 2021 1 commit
  8. 09 Jun, 2021 1 commit
    • Add DETR (#11653) · d3eacbb8
      NielsRogge authored
      
      
      * Squash all commits of modeling_detr_v7 branch into one
      
      * Improve docs
      
      * Fix tests
      
      * Style
      
      * Improve docs some more and fix most tests
      
      * Fix slow tests of ViT, DeiT and DETR
      
      * Improve replacement of batch norm
      
      * Restructure timm backbone forward
      
      * Make DetrForSegmentation support any timm backbone
      
      * Fix name of output
      
      * Address most comments by @LysandreJik
      
      * Give better names for variables
      
      * Conditional imports + timm in setup.py
      
      * Address additional comments by @sgugger
      
      * Make style, add require_timm and require_vision to tests
      
      * Remove train_backbone attribute of DetrConfig, add methods to freeze/unfreeze backbone
      
      * Add png files to fixtures
      
      * Fix type hint
      
      * Add timm to workflows
      
      * Add `BatchNorm2d` to the weight initialization
      
      * Fix retain_grad test
      
      * Replace model checkpoints by Facebook namespace
      
      * Fix name of checkpoint in test
      
      * Add user-friendly message when scipy is not available
      
      * Address most comments by @patrickvonplaten
      
      * Remove return_intermediate_layers attribute of DetrConfig and simplify Joiner
      
      * Better initialization
      
      * Scipy is necessary to get sklearn metrics
      
      * Rename TimmBackbone to DetrTimmConvEncoder and rename DetrJoiner to DetrConvModel
      
      * Make style
      
      * Improve docs and add 2 community notebooks
      Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
      d3eacbb8
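
A minimal sketch of DETR object detection as added here; the sample image URL is an assumption for illustration.

```python
import requests
from PIL import Image
from transformers import DetrFeatureExtractor, DetrForObjectDetection

url = "http://images.cocodataset.org/val2017/000000039769.jpg"  # assumed sample image
image = Image.open(requests.get(url, stream=True).raw)

feature_extractor = DetrFeatureExtractor.from_pretrained("facebook/detr-resnet-50")
model = DetrForObjectDetection.from_pretrained("facebook/detr-resnet-50")

inputs = feature_extractor(images=image, return_tensors="pt")
outputs = model(**inputs)

# Per object query: class logits (COCO classes + "no object") and normalized boxes.
print(outputs.logits.shape, outputs.pred_boxes.shape)
```
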
  9. 02 Jun, 2021 1 commit
  10. 01 Jun, 2021 3 commits
  11. 20 May, 2021 1 commit
  12. 18 May, 2021 2 commits
    • FlaxGPT2 (#11556) · ca33278f
      Suraj Patil authored
      
      
      * flax gpt2
      
      * combine masks
      
      * handle shared embeds
      
      * add causal LM sample
      
      * style
      
      * add tests
      
      * style
      
      * fix imports, docs, quality
      
      * don't use cache
      
      * add cache
      
      * add cache 1st version
      
      * make use cache work
      
      * start adding test for generation
      
      * finish generation loop compilation
      
      * rewrite test
      
      * finish
      
      * update
      
      * update
      
      * apply Sylvain's suggestions
      
      * update
      
      * refactor
      
      * fix typo
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      ca33278f
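
A minimal sketch of Flax GPT-2 generation, which exercises the key/value cache added in this PR; the prompt is illustrative.

```python
import numpy as np
from transformers import GPT2Tokenizer, FlaxGPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = FlaxGPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("Hello, my dog is", return_tensors="np")
# Greedy decoding; the cache makes the jitted generation loop fast.
generated = model.generate(inputs["input_ids"], max_length=20, do_sample=False)

print(tokenizer.decode(np.asarray(generated.sequences)[0], skip_special_tokens=True))
```
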
    • Add more subsections to main doc (#11758) · cebb96f5
      Patrick von Platen authored
      * add headers to main doc
      
      * Apply suggestions from code review
      
      * update
      
      * upload
      cebb96f5
  13. 17 May, 2021 1 commit
  14. 12 May, 2021 2 commits
    • Fix clip docs (#11694) · f063c56d
      Suraj Patil authored
      * fix doc url
      
      * fix example
      f063c56d
    • CLIP (#11445) · 8719afa1
      Suraj Patil authored
      
      
      * begin second draft
      
      * fix import, style
      
      * add loss
      
      * fix embeds, logits_scale, and projection
      
      * fix imports
      
      * add conversion script
      
      * add feature_extractor and processor
      
      * style
      
      * add tests for tokenizer, extractor and processor
      
      * add vision model tests
      
      * add weight init
      
      * add more tests
      
      * fix save_load  test
      
      * model output, docstrings, causal mask
      
      * config doc
      
      * add clip model tests
      
      * return dict
      
      * begin integration test
      
      * add integration tests
      
      * fix-copies
      
      * fix init
      
      * Clip => CLIP
      
      * fix module name
      
      * docs
      
      * fix doc
      
      * output_dim => projection_dim
      
      * fix checkpoint names
      
      * remove fast tokenizer file
      
      * fix conversion script
      
      * fix tests, quality
      
      * put causal mask on device
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * fix attribute test
      
      * style
      
      * address Sylvain's comments
      
      * style
      
      * fix docstrings
      
      * add quick_gelu in activations, docstrings
      
      * clean-up attention test
      
      * fix act fun
      
      * fix config
      
      * fix torchscript tests
      
      * even batch_size
      
      * remove comment
      
      * fix output to_tuple
      
      * fix save load tests
      
      * fix add tokens test
      
      * add fast tokenizer
      
      * update copyright
      
      * new processor API
      
      * fix docs
      
      * docstrings
      
      * docs
      
      * fix doc
      
      * fix doc
      
      * fix tokenizer
      
      * fix import in doc example
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * check types of config
      
      * valhalla => openai
      
      * load image using url
      
      * fix test
      
      * typo
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      8719afa1
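
A minimal zero-shot sketch of the CLIP model and processor added here; the sample image URL is an assumption.

```python
import requests
from PIL import Image
from transformers import CLIPProcessor, CLIPModel

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"  # assumed sample image
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(text=["a photo of a cat", "a photo of a dog"],
                   images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)

# Image-text similarity, scaled by the learned logit_scale, as probabilities.
print(outputs.logits_per_image.softmax(dim=-1))
```
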
  15. 10 May, 2021 1 commit
    • Big Bird Fast Tokenizer implementation (#11075) · f7f87295
      Tanmay Laud authored
      
      
      * Added Big Bird Fast Tokenizer initial file
      
      * style fixes
      
      * flake fixes
      
      * Added big bird fast tokenizer to init files
      
      * Added big bird fast to Auto tokenization
      
      * fix styles
      
      * minor quality fixes
      
      * Added initial test code
      
      * Fix SpmConverter when precompiled_charsmap doesn't exist
      
      * fixed post processor
      
      * minor style fix
      
      * minor fix input names
      
      * Actually fix identity normalization
      
      * style
      
      * Added token type ids to fast tokenizer
      
      * style
      
      * flake fix
      
      * fix copies
      Co-authored-by: Anthony MOI <m.anthony.moi@gmail.com>
      f7f87295
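
A minimal sketch of the new Rust-backed BigBird tokenizer; the checkpoint name is an assumption.

```python
from transformers import BigBirdTokenizerFast

tokenizer = BigBirdTokenizerFast.from_pretrained("google/bigbird-roberta-base")  # assumed

# The fast tokenizer mirrors the slow SentencePiece one, including the
# post-processor and the token type ids added in this PR.
enc = tokenizer("Big Bird now has a fast tokenizer.", return_token_type_ids=True)
print(enc["input_ids"])
print(enc["token_type_ids"])
print(tokenizer.decode(enc["input_ids"]))
```
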
  16. 07 May, 2021 1 commit
    • Add BigBirdPegasus (#10991) · dc3f6758
      Vasudev Gupta authored
      
      
      * init bigbird pegasus
      
      * add debugging nb ; update config
      
      * init conversion
      
      * update conversion script
      
      * complete conversion script
      
      * init forward()
      
      * complete forward()
      
      * add tokenizer
      
      * add some slow tests
      
      * commit current
      
      * fix copies
      
      * add docs
      
      * add conversion script for bigbird-roberta-summarization
      
      * remove TODO
      
      * small fixups
      
      * correct tokenizer
      
      * add bigbird core for now
      
      * fix config
      
      * fix more
      
      * revert pegasus-tokenizer back
      
      * make style
      
      * everything working for pubmed; yayy
      
      * complete tests finally
      
      * remove bigbird pegasus tok
      
      * correct tokenizer
      
      * correct tests
      
      * add tokenizer files
      
      * finish make style
      
      * fix test
      
      * update
      
      * make style
      
      * fix tok utils base file
      
      * make fix-copies
      
      * clean a bit
      
      * small update
      
      * fix some suggestions
      
      * add to readme
      
      * fix a bit, clean tests
      
      * fix more tests
      
      * Update src/transformers/__init__.py
      
      * Update src/transformers/__init__.py
      
      * make fix-copies
      
      * complete attn switching, auto-padding left
      
      * make style
      
      * fix auto-padding test
      
      * make style
      
      * fix batched attention tests
      
      * put tolerance at 1e-1 for stand-alone decoder test
      
      * fix docs
      
      * fix tests
      
      * correct slow tokenizer conversion
      
      * Apply suggestions from code review
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * complete remaining suggestions
      
      * fix test
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      dc3f6758
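
A minimal summarization sketch for BigBirdPegasus; the PubMed checkpoint name and the toy article are assumptions (the commit only notes that PubMed summarization works end to end).

```python
from transformers import AutoTokenizer, BigBirdPegasusForConditionalGeneration

ckpt = "google/bigbird-pegasus-large-pubmed"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = BigBirdPegasusForConditionalGeneration.from_pretrained(ckpt)

article = "Sparse attention lets BigBirdPegasus summarize much longer documents than Pegasus."
inputs = tokenizer(article, return_tensors="pt")

summary_ids = model.generate(**inputs, max_length=32, num_beams=2)
print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True))
```
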
  17. 04 May, 2021 1 commit
    • [Flax] Add Electra models (#11426) · 0afe4a90
      Patrick Fernandes authored
      
      
      * add electra model to flax
      
      * Remove Electra Next Sentence Prediction model added by mistake
      
      * fix parameter sharing and loosen equality threshold
      
      * fix styling issues
      
      * add mistakenly removed imports
      
      * fix electra table
      
      * Add FlaxElectra to automodels and fixe docs
      
      * fix issues pointed out the PR
      
      * fix flax electra to comply with latest changes
      
      * remove stale class
      
      * add copied from
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      0afe4a90
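
A minimal sketch of the Flax ELECTRA model; the checkpoint and the from_pt conversion are assumptions for illustration.

```python
from transformers import ElectraTokenizerFast, FlaxElectraModel

ckpt = "google/electra-small-discriminator"  # assumed checkpoint
tokenizer = ElectraTokenizerFast.from_pretrained(ckpt)
# from_pt=True converts PyTorch weights in case no Flax weights are hosted.
model = FlaxElectraModel.from_pretrained(ckpt, from_pt=True)

inputs = tokenizer("ELECTRA, now also in Flax.", return_tensors="np")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)
```
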
  18. 03 May, 2021 1 commit
    • Add LUKE (#11223) · f3cf8ae7
      NielsRogge authored
      
      
      * Rebase with master
      
      * Minor bug fix in docs
      
      * Copy files from adding_luke_v2 and improve docs
      
      * change the default value of use_entity_aware_attention to True
      
      * remove word_hidden_states
      
      * fix head models
      
      * fix tests
      
      * fix the conversion script
      
      * add integration tests for the pretrained large model
      
      * improve docstring
      
      * Improve docs, make style
      
      * fix _init_weights for pytorch 1.8
      
      * improve docs
      
      * fix tokenizer to construct entity sequence with [MASK] entity when entities=None
      
      * Make fix-copies
      
      * Make style & quality
      
      * Bug fixes
      
      * Add LukeTokenizer to init
      
      * Address most comments by @patil-suraj and @LysandreJik
      
      * rename _compute_extended_attention_mask to get_extended_attention_mask
      
      * add comments to LukeSelfAttention
      
      * fix the documentation of the tokenizer
      
      * address comments by @patil-suraj, @LysandreJik, and @sgugger
      
      * improve docs
      
      * Make style, quality and fix-copies
      
      * Improve docs
      
      * fix docs
      
      * add "entity_span_classification" task
      
      * update example code for LukeForEntitySpanClassification
      
      * improve docs
      
      * improve docs
      
      * improve the code example in luke.rst
      
      * rename the classification layer in LukeForEntityClassification from typing to classifier
      
      * add bias to the classifier in LukeForEntitySpanClassification
      
      * update docs to use fine-tuned hub models in code examples of the head models
      
      * update the example sentences
      
      * Make style & quality
      
      * Add require_torch to tokenizer tests
      
      * Add require_torch to tokenizer tests
      
      * Address comments by @sgugger and add community notebooks
      
      * Make fix-copies
      Co-authored-by: Ikuya Yamada <ikuya@ikuya.net>
      f3cf8ae7
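
A minimal sketch of LUKE's entity-aware inputs; the studio-ousia/luke-base checkpoint and the example sentence are assumptions.

```python
from transformers import LukeTokenizer, LukeModel

tokenizer = LukeTokenizer.from_pretrained("studio-ousia/luke-base")  # assumed checkpoint
model = LukeModel.from_pretrained("studio-ousia/luke-base")

text = "Beyoncé lives in Los Angeles."
entity_spans = [(0, 7), (17, 28)]  # character spans of "Beyoncé" and "Los Angeles"

inputs = tokenizer(text, entity_spans=entity_spans, return_tensors="pt")
outputs = model(**inputs)

# Word-level and entity-level contextualized representations.
print(outputs.last_hidden_state.shape, outputs.entity_last_hidden_state.shape)
```
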
  19. 30 Apr, 2021 2 commits
  20. 28 Apr, 2021 1 commit
  21. 14 Apr, 2021 1 commit
  22. 13 Apr, 2021 1 commit
  23. 12 Apr, 2021 2 commits
    • Add DeiT (PyTorch) (#11056) · 9f126097
      NielsRogge authored
      * First draft of deit
      
      * More improvements
      
      * Remove DeiTTokenizerFast from init
      
      * Conversion script works
      
      * Add DeiT to ViT conversion script
      
      * Add tests, add head model, add support for deit in vit conversion script
      
      * Update model checkpoint names
      
      * Update image_mean and image_std, set resample to bicubic
      
      * Improve docs
      
      * Docs improvements
      
      * Add DeiTForImageClassificationWithTeacher to init
      
      * Address comments by @sgugger
      
      * Improve feature extractors
      
      * Make fix-copies
      
      * Minor fixes
      
      * Address comments by @patil-suraj
      
      * All models uploaded
      
      * Fix tests
      
      * Remove labels argument from DeiTForImageClassificationWithTeacher
      
      * Fix-copies, style and quality
      
      * Fix tests
      
      * Fix typo
      
      * Multiple docs improvements
      
      * More docs fixes
      9f126097
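
A minimal classification sketch for DeiT's distilled head; the checkpoint and sample image URL are assumptions.

```python
import requests
from PIL import Image
from transformers import DeiTFeatureExtractor, DeiTForImageClassificationWithTeacher

url = "http://images.cocodataset.org/val2017/000000039769.jpg"  # assumed sample image
image = Image.open(requests.get(url, stream=True).raw)

ckpt = "facebook/deit-base-distilled-patch16-224"  # assumed checkpoint
feature_extractor = DeiTFeatureExtractor.from_pretrained(ckpt)
model = DeiTForImageClassificationWithTeacher.from_pretrained(ckpt)

inputs = feature_extractor(images=image, return_tensors="pt")
outputs = model(**inputs)

# Logits average the class-token and distillation-token heads.
print(model.config.id2label[outputs.logits.argmax(-1).item()])
```
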
    • Added documentation for data collator. (#10941) · 0c6fcd30
      fghuman authored
      
      
      * Added documentation for data collator.
      
      * Update docs/source/data_collator.rst
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Added documentation for data collator.
      
      * Added documentation for the data collator.
      
      * Merge branch 'doc_DataCollator' of C:\Users\mahii\PycharmProjects\transformers with conflicts.
      
      * Update documentation for the data collator.
      
      * Update documentation for the data collator.
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Amna <A.A.Ahmad@student.tudelft.nl>
      0c6fcd30
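
The documented data collators pad individually tokenized examples into rectangular batches; a minimal sketch, with the tokenizer checkpoint assumed:

```python
from transformers import AutoTokenizer, DataCollatorWithPadding

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # assumed checkpoint
collator = DataCollatorWithPadding(tokenizer=tokenizer)

# Examples of different lengths ...
features = [tokenizer("short"), tokenizer("a noticeably longer example sentence")]
# ... become one padded batch of tensors.
batch = collator(features)
print(batch["input_ids"].shape, batch["attention_mask"].shape)
```
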
  24. 09 Apr, 2021 1 commit
  25. 08 Apr, 2021 1 commit
  26. 05 Apr, 2021 1 commit
  27. 01 Apr, 2021 1 commit
    • Add Vision Transformer and ViTFeatureExtractor (#10950) · 30677dc7
      NielsRogge authored
      
      
      * Squash all commits into one
      
      * Update ViTFeatureExtractor to use image_utils instead of torchvision
      
      * Remove torchvision and add Pillow
      
      * Small docs improvement
      
      * Address most comments by @sgugger
      
      * Fix tests
      
      * Clean up conversion script
      
      * Pooler first draft
      
      * Fix quality
      
      * Improve conversion script
      
      * Make style and quality
      
      * Make fix-copies
      
      * Minor docs improvements
      
      * Should use fix-copies instead of manual handling
      
      * Revert "Should use fix-copies instead of manual handling"
      
      This reverts commit fd4e591bce4496d41406425c82606a8fdaf8a50b.
      
      * Place ViT in alphabetical order
      Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      30677dc7
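
A minimal classification sketch for ViT and its feature extractor; the checkpoint and sample image URL are assumptions.

```python
import requests
from PIL import Image
from transformers import ViTFeatureExtractor, ViTForImageClassification

url = "http://images.cocodataset.org/val2017/000000039769.jpg"  # assumed sample image
image = Image.open(requests.get(url, stream=True).raw)

feature_extractor = ViTFeatureExtractor.from_pretrained("google/vit-base-patch16-224")
model = ViTForImageClassification.from_pretrained("google/vit-base-patch16-224")

inputs = feature_extractor(images=image, return_tensors="pt")
outputs = model(**inputs)
print(model.config.id2label[outputs.logits.argmax(-1).item()])
```
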
  28. 30 Mar, 2021 2 commits
    • GPT Neo (#10848) · 86026437
      Suraj Patil authored
      
      
      * lets begin
      
      * boom boom
      
      * fix out proj in attn
      
      * fix attention
      
      * fix local attention
      
      * add tokenizer
      
      * fix imports
      
      * autotokenizer
      
      * fix checkpoint name
      
      * cleanup
      
      * more clean-up
      
      * more cleanup
      
      * output attentions
      
      * fix attn mask creation
      
      * fix imports
      
      * config doc
      
      * add tests
      
      * add slow tests
      
      * quality
      
      * add conversion script
      
      * copyright
      
      * typo
      
      * another bites the dust
      
      * fix attention tests
      
      * doc
      
      * add embed init in convert function
      
      * fix copies
      
      * remove tokenizer
      
      * enable caching
      
      * address review comments
      
      * improve config and create attn layer list internally
      
      * more consistent naming
      
      * init hf config from mesh-tf config json file
      
      * remove neo tokenizer from doc
      
      * handle attention_mask in local attn layer
      
      * attn_layers => attention_layers
      
      * add tokenizer_class in config
      
      * fix docstring
      
      * raise if len of attention_layers is not same as num_layers
      
      * remove tokenizer_class from config
      
      * more consistent naming
      
      * fix doc
      
      * fix checkpoint names
      
      * fp16 compat
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      86026437
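
A minimal generation sketch for GPT Neo; the 1.3B checkpoint and the sampling settings are assumptions. As the commit notes, the dedicated Neo tokenizer was dropped in favour of the GPT-2 tokenizer.

```python
from transformers import GPT2Tokenizer, GPTNeoForCausalLM

ckpt = "EleutherAI/gpt-neo-1.3B"  # assumed checkpoint
tokenizer = GPT2Tokenizer.from_pretrained(ckpt)
model = GPTNeoForCausalLM.from_pretrained(ckpt)

inputs = tokenizer("In a shocking finding, scientists discovered", return_tensors="pt")
output_ids = model.generate(**inputs, max_length=40, do_sample=True, temperature=0.9)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```
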
    • BigBird (#10183) · 6dfd0272
      Vasudev Gupta authored
      
      
      * init bigbird
      
      * model.__init__ working, conversion script ready, config updated
      
      * add conversion script
      
      * BigBirdEmbeddings working :)
      
      * slightly update conversion script
      
      * BigBirdAttention working :) ; some bug in layer.output.dense
      
      * add debugger-notebook
      
      * forward() working for BigBirdModel :) ; replaced gelu with gelu_fast
      
      * tf code adapted to torch till rand_attn in bigbird_block_sparse_attention ; till now everything working :)
      
      * BigBirdModel working in block-sparse attention mode :)
      
      * add BigBirdForPreTraining
      
      * small fix
      
      * add tokenizer for BigBirdModel
      
      * fix config & hence modeling
      
      * fix base prefix
      
      * init testing
      
      * init tokenizer test
      
      * pos_embed must be absolute, attn_type=original_full when add_cross_attn=True , nsp loss is optional in BigBirdForPreTraining, add assert statements
      
      * remove position_embedding_type arg
      
      * complete normal tests
      
      * add comments to block sparse attention
      
      * add attn_probs for sliding & global tokens
      
      * create fn for block sparse attn mask creation
      
      * add special tests
      
      * restore pos embed arg
      
      * minor fix
      
      * attn probs update
      
      * make big bird fully gpu friendly
      
      * fix tests
      
      * remove pruning
      
      * correct tokenizer & minor fixes
      
      * update conversion script , remove norm_type
      
      * tokenizer-inference test add
      
      * remove extra comments
      
      * add docs
      
      * save intermediate
      
      * finish trivia_qa conversion
      
      * small update to forward
      
      * correct qa and layer
      
      * better error message
      
      * BigBird QA ready
      
      * fix rebased
      
      * add triva-qa debugger notebook
      
      * qa setup
      
      * fixed till embeddings
      
      * some issue in q/k/v_layer
      
      * fix bug in conversion-script
      
      * fixed till self-attn
      
      * qa fixed except layer norm
      
      * add qa end2end test
      
      * fix gradient ckpting ; other qa test
      
      * speed-up big bird a bit
      
      * hub_id=google
      
      * clean up
      
      * make quality
      
      * speed up einsum with bmm
      
      * finish perf improvements for big bird
      
      * remove wav2vec2 tok
      
      * fix tokenizer
      
      * include docs
      
      * correct docs
      
      * add helper to auto pad block size
      
      * make style
      
      * remove fast tokenizer for now
      
      * fix some
      
      * add pad test
      
      * finish
      
      * fix some bugs
      
      * fix another bug
      
      * fix buffer tokens
      
      * fix comment and merge from master
      
      * add comments
      
      * make style
      
      * commit some suggestions
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Fix typos
      
      * fix some more suggestions
      
      * add another patch
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * fix copies
      
      * another path
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * update
      
      * update nit suggestions
      
      * make style
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      6dfd0272
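
A minimal sketch of the PyTorch BigBird model; the checkpoint is an assumption, and full attention is chosen because the block-sparse mode targets long inputs.

```python
from transformers import BigBirdTokenizer, BigBirdModel

tokenizer = BigBirdTokenizer.from_pretrained("google/bigbird-roberta-base")  # assumed
# attention_type can be "block_sparse" (default, for long sequences) or "original_full".
model = BigBirdModel.from_pretrained("google/bigbird-roberta-base",
                                     attention_type="original_full")

inputs = tokenizer("BigBird brings sparse attention to sequences of up to 4096 tokens.",
                   return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)
```
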
  29. 25 Mar, 2021 1 commit
    • Layout lm tf 2 (#10636) · 4684bfc7
      Amir Tahmasbi authored
      
      
      * Added embeddings layer
      
      * Added layoutlm layers, main model, maskedlm and token classification classes
      
      * Added model classes to tf auto models
      
      * Added model to PT to TF conversion script
      
      * Added model to doc README
      
      * Added tests
      
      * Removed unused imports
      
      * Added layoutlm model, test, and doc for sequence classification, and fix imports in __init__.py
      
      * Made tests pass!
      
      * Fixed typos in imports and docs
      
      * Fixed a typo in embeddings layer
      
      * Removed imports
      
      * Fixed formatting issues, imports, tests
      
      * Fixed small formatting issues
      
      * Removed duplicates import from main __init__.py
      
      * Changed default arg to true for adding pooling layer to tf layoutlm
      
      * Fixed formatting issues
      
      * Style
      
      * Added copied from to classes copied from bert
      
      * Fixed doc strings examples to work with layoutlm inputs
      
      * Removed PyTorch reference in doc strings example
      
      * Added integration tests
      
      * Cleaned up initialization file
      
      * Updated model checkpoint identifiers
      
      * Fixed imports
      Co-authored-by: Amir Tahmasbi <amir@ehsai.ca>
      Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
      4684bfc7
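
A minimal sketch of the new TensorFlow LayoutLM model; the checkpoint, the toy words, and their bounding boxes are assumptions (pass from_pt=True if only PyTorch weights are hosted).

```python
import tensorflow as tf
from transformers import LayoutLMTokenizer, TFLayoutLMModel

ckpt = "microsoft/layoutlm-base-uncased"  # assumed checkpoint
tokenizer = LayoutLMTokenizer.from_pretrained(ckpt)
model = TFLayoutLMModel.from_pretrained(ckpt)

words = ["Hello", "world"]
word_boxes = [[637, 773, 693, 782], [698, 773, 733, 782]]  # boxes on a 0-1000 grid

# LayoutLM needs one bounding box per token, including [CLS] and [SEP].
token_boxes = []
for word, box in zip(words, word_boxes):
    token_boxes.extend([box] * len(tokenizer.tokenize(word)))
token_boxes = [[0, 0, 0, 0]] + token_boxes + [[1000, 1000, 1000, 1000]]

encoding = tokenizer(" ".join(words), return_tensors="tf")
outputs = model(input_ids=encoding["input_ids"],
                bbox=tf.convert_to_tensor([token_boxes]),
                attention_mask=encoding["attention_mask"],
                token_type_ids=encoding["token_type_ids"])
print(outputs.last_hidden_state.shape)
```
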
  30. 23 Mar, 2021 1 commit