1. 18 May, 2021 2 commits
    • FlaxGPT2 (#11556) · ca33278f
      Suraj Patil authored
      
      
      * flax gpt2
      
      * combine masks
      
      * handle shared embeds
      
      * add causal LM sample
      
      * style
      
      * add tests
      
      * style
      
      * fix imports, docs, quality
      
      * don't use cache
      
      * add cache
      
      * add cache 1st version
      
      * make use cache work
      
      * start adding test for generation
      
      * finish generation loop compilation
      
      * rewrite test
      
      * finish
      
      * update
      
      * update
      
      * apply Sylvain's suggestions
      
      * update
      
      * refactor
      
      * fix typo
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
    • Add more subsections to main doc (#11758) · cebb96f5
      Patrick von Platen authored
      * add headers to main doc
      
      * Apply suggestions from code review
      
      * update
      
      * upload
  2. 17 May, 2021 1 commit
  3. 12 May, 2021 2 commits
    • Fix clip docs (#11694) · f063c56d
      Suraj Patil authored
      * fix doc url
      
      * fix example
    • CLIP (#11445) · 8719afa1
      Suraj Patil authored
      
      
      * begin second draft
      
      * fix import, style
      
      * add loss
      
      * fix embeds, logits_scale, and projection
      
      * fix imports
      
      * add conversion script
      
      * add feature_extractor and processor
      
      * style
      
      * add tests for tokenizer, extractor and processor
      
      * add vision model tests
      
      * add weight init
      
      * add more tests
      
      * fix save_load  test
      
      * model output, docstrings, causal mask
      
      * config doc
      
      * add clip model tests
      
      * return dict
      
      * begin integration test
      
      * add integration tests
      
      * fix-copies
      
      * fix init
      
      * Clip => CLIP
      
      * fix module name
      
      * docs
      
      * fix doc
      
      * output_dim => projection_dim
      
      * fix checkpoint names
      
      * remove fast tokenizer file
      
      * fix conversion script
      
      * fix tests, quality
      
      * put causal mask on device
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * fix attribute test
      
      * style
      
      * address Sylvain's comments
      
      * style
      
      * fix docstrings
      
      * add quick_gelu in activations, docstrings
      
      * clean-up attention test
      
      * fix act fun
      
      * fix config
      
      * fix torchscript tests
      
      * even batch_size
      
      * remove comment
      
      * fix output to_tuple
      
      * fix save load tests
      
      * fix add tokens test
      
      * add fast tokenizer
      
      * update copyright
      
      * new processor API
      
      * fix docs
      
      * docstrings
      
      * docs
      
      * fix doc
      
      * fix doc
      
      * fix tokenizer
      
      * fix import in doc example
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * check types of config
      
      * valhalla => openai
      
      * load image using url
      
      * fix test
      
      * typo
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
  4. 10 May, 2021 1 commit
    • Big Bird Fast Tokenizer implementation (#11075) · f7f87295
      Tanmay Laud authored
      
      
      * Added Big Bird Fast Tokenizer initial file
      
      * style fixes
      
      * flake fixes
      
      * Added big bird fast tokenizer to init files
      
      * Added big bird fast to Auto tokenization
      
      * fix styles
      
      * minor quality fixes
      
      * Added initial test code
      
      * Fix SpmConverter when precompiled_charsmap doesn't exist
      
      * fixed post processor
      
      * minor style fix
      
      * minor fix input names
      
      * Actually fix identity normalization
      
      * style
      
      * Added token type ids to fast tokenizer
      
      * style
      
      * flake fix
      
      * fix copies
      Co-authored-by: Anthony MOI <m.anthony.moi@gmail.com>
  5. 07 May, 2021 1 commit
    • Add BigBirdPegasus (#10991) · dc3f6758
      Vasudev Gupta authored
      
      
      * init bigbird pegasus
      
      * add debugging nb ; update config
      
      * init conversion
      
      * update conversion script
      
      * complete conversion script
      
      * init forward()
      
      * complete forward()
      
      * add tokenizer
      
      * add some slow tests
      
      * commit current
      
      * fix copies
      
      * add docs
      
      * add conversion script for bigbird-roberta-summarization
      
      * remove TODO
      
      * small fixups
      
      * correct tokenizer
      
      * add bigbird core for now
      
      * fix config
      
      * fix more
      
      * revert pegasus-tokenizer back
      
      * make style
      
      * everything working for pubmed; yayy
      
      * complete tests finally
      
      * remove bigbird pegasus tok
      
      * correct tokenizer
      
      * correct tests
      
      * add tokenizer files
      
      * finish make style
      
      * fix test
      
      * update
      
      * make style
      
      * fix tok utils base file
      
      * make fix-copies
      
      * clean a bit
      
      * small update
      
      * fix some suggestions
      
      * add to readme
      
      * fix a bit, clean tests
      
      * fix more tests
      
      * Update src/transformers/__init__.py
      
      * Update src/transformers/__init__.py
      
      * make fix-copies
      
      * complete attn switching, auto-padding left
      
      * make style
      
      * fix auto-padding test
      
      * make style
      
      * fix batched attention tests
      
      * put tolerance at 1e-1 for stand-alone decoder test
      
      * fix docs
      
      * fix tests
      
      * correct slow tokenizer conversion
      
      * Apply suggestions from code review
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * complete remaining suggestions
      
      * fix test
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
  6. 04 May, 2021 1 commit
    • [Flax] Add Electra models (#11426) · 0afe4a90
      Patrick Fernandes authored
      
      
      * add electra model to flax
      
      * Remove Electra Next Sentence Prediction model added by mistake
      
      * fix parameter sharing and loosen equality threshold
      
      * fix styling issues
      
      * add mistakenly removed imports
      
      * fix electra table
      
      * Add FlaxElectra to automodels and fix docs

      * fix issues pointed out in the PR
      
      * fix flax electra to comply with latest changes
      
      * remove stale class
      
      * add copied from
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
  7. 03 May, 2021 1 commit
    • Add LUKE (#11223) · f3cf8ae7
      NielsRogge authored
      
      
      * Rebase with master
      
      * Minor bug fix in docs
      
      * Copy files from adding_luke_v2 and improve docs
      
      * change the default value of use_entity_aware_attention to True
      
      * remove word_hidden_states
      
      * fix head models
      
      * fix tests
      
      * fix the conversion script
      
      * add integration tests for the pretrained large model
      
      * improve docstring
      
      * Improve docs, make style
      
      * fix _init_weights for pytorch 1.8
      
      * improve docs
      
      * fix tokenizer to construct entity sequence with [MASK] entity when entities=None
      
      * Make fix-copies
      
      * Make style & quality
      
      * Bug fixes
      
      * Add LukeTokenizer to init
      
      * Address most comments by @patil-suraj and @LysandreJik
      
      * rename _compute_extended_attention_mask to get_extended_attention_mask
      
      * add comments to LukeSelfAttention
      
      * fix the documentation of the tokenizer
      
      * address comments by @patil-suraj, @LysandreJik, and @sgugger
      
      * improve docs
      
      * Make style, quality and fix-copies
      
      * Improve docs
      
      * fix docs
      
      * add "entity_span_classification" task
      
      * update example code for LukeForEntitySpanClassification
      
      * improve docs
      
      * improve docs
      
      * improve the code example in luke.rst
      
      * rename the classification layer in LukeForEntityClassification from typing to classifier
      
      * add bias to the classifier in LukeForEntitySpanClassification
      
      * update docs to use fine-tuned hub models in code examples of the head models
      
      * update the example sentences
      
      * Make style & quality
      
      * Add require_torch to tokenizer tests
      
      * Add require_torch to tokenizer tests
      
      * Address comments by @sgugger and add community notebooks
      
      * Make fix-copies
      Co-authored-by: Ikuya Yamada <ikuya@ikuya.net>
  8. 30 Apr, 2021 2 commits
  9. 28 Apr, 2021 1 commit
  10. 14 Apr, 2021 1 commit
  11. 13 Apr, 2021 1 commit
  12. 12 Apr, 2021 2 commits
    • Add DeiT (PyTorch) (#11056) · 9f126097
      NielsRogge authored
      * First draft of deit
      
      * More improvements
      
      * Remove DeiTTokenizerFast from init
      
      * Conversion script works
      
      * Add DeiT to ViT conversion script
      
      * Add tests, add head model, add support for deit in vit conversion script
      
      * Update model checkpoint names
      
      * Update image_mean and image_std, set resample to bicubic
      
      * Improve docs
      
      * Docs improvements
      
      * Add DeiTForImageClassificationWithTeacher to init
      
      * Address comments by @sgugger
      
      * Improve feature extractors
      
      * Make fix-copies
      
      * Minor fixes
      
      * Address comments by @patil-suraj
      
      * All models uploaded
      
      * Fix tests
      
      * Remove labels argument from DeiTForImageClassificationWithTeacher
      
      * Fix-copies, style and quality
      
      * Fix tests
      
      * Fix typo
      
      * Multiple docs improvements
      
      * More docs fixes
    • Added documentation for data collator. (#10941) · 0c6fcd30
      fghuman authored
      
      
      * Added documentation for data collator.
      
      * Update docs/source/data_collator.rst
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Added documentation for data collator.
      
      * Added documentation for the data collator.
      
      * Merge branch 'doc_DataCollator' of C:\Users\mahii\PycharmProjects\transformers with conflicts.
      
      * Update documentation for the data collator.
      
      * Update documentation for the data collator.
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Amna <A.A.Ahmad@student.tudelft.nl>
  13. 09 Apr, 2021 1 commit
  14. 08 Apr, 2021 1 commit
  15. 05 Apr, 2021 1 commit
  16. 01 Apr, 2021 1 commit
    • Add Vision Transformer and ViTFeatureExtractor (#10950) · 30677dc7
      NielsRogge authored
      
      
      * Squash all commits into one
      
      * Update ViTFeatureExtractor to use image_utils instead of torchvision
      
      * Remove torchvision and add Pillow
      
      * Small docs improvement
      
      * Address most comments by @sgugger
      
      * Fix tests
      
      * Clean up conversion script
      
      * Pooler first draft
      
      * Fix quality
      
      * Improve conversion script
      
      * Make style and quality
      
      * Make fix-copies
      
      * Minor docs improvements
      
      * Should use fix-copies instead of manual handling
      
      * Revert "Should use fix-copies instead of manual handling"
      
      This reverts commit fd4e591bce4496d41406425c82606a8fdaf8a50b.
      
      * Place ViT in alphabetical order
      Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
  17. 30 Mar, 2021 2 commits
    • GPT Neo (#10848) · 86026437
      Suraj Patil authored
      
      
      * lets begin
      
      * boom boom
      
      * fix out proj in attn
      
      * fix attention
      
      * fix local attention
      
      * add tokenizer
      
      * fix imports
      
      * autotokenizer
      
      * fix checkpoint name
      
      * cleanup
      
      * more clean-up
      
      * more cleanup
      
      * output attentions
      
      * fix attn mask creation
      
      * fix imports
      
      * config doc
      
      * add tests
      
      * add slow tests
      
      * quality
      
      * add conversion script
      
      * copyright
      
      * typo
      
      * another bites the dust
      
      * fix attention tests
      
      * doc
      
      * add embed init in convert function
      
      * fix copies
      
      * remove tokenizer
      
      * enable caching
      
      * address review comments
      
      * improve config and create attn layer list internally
      
      * more consistent naming
      
      * init hf config from mesh-tf config json file
      
      * remove neo tokenizer from doc
      
      * handle attention_mask in local attn layer
      
      * attn_layers => attention_layers
      
      * add tokenizer_class in config
      
      * fix docstring
      
      * raise if len of attention_layers is not same as num_layers
      
      * remove tokenizer_class from config
      
      * more consistent naming
      
      * fix doc
      
      * fix checkpoint names
      
      * fp16 compat
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
    • BigBird (#10183) · 6dfd0272
      Vasudev Gupta authored
      
      
      * init bigbird
      
      * model.__init__ working, conversion script ready, config updated
      
      * add conversion script
      
      * BigBirdEmbeddings working :)
      
      * slightly update conversion script
      
      * BigBirdAttention working :) ; some bug in layer.output.dense
      
      * add debugger-notebook
      
      * forward() working for BigBirdModel :) ; replaced gelu with gelu_fast
      
      * tf code adapted to torch till rand_attn in bigbird_block_sparse_attention ; till now everything working :)
      
      * BigBirdModel working in block-sparse attention mode :)
      
      * add BigBirdForPreTraining
      
      * small fix
      
      * add tokenizer for BigBirdModel
      
      * fix config & hence modeling
      
      * fix base prefix
      
      * init testing
      
      * init tokenizer test
      
      * pos_embed must be absolute, attn_type=original_full when add_cross_attn=True , nsp loss is optional in BigBirdForPreTraining, add assert statements
      
      * remove position_embedding_type arg
      
      * complete normal tests
      
      * add comments to block sparse attention
      
      * add attn_probs for sliding & global tokens
      
      * create fn for block sparse attn mask creation
      
      * add special tests
      
      * restore pos embed arg
      
      * minor fix
      
      * attn probs update
      
      * make big bird fully gpu friendly
      
      * fix tests
      
      * remove pruning
      
      * correct tokenizer & minor fixes
      
      * update conversion script , remove norm_type
      
      * tokenizer-inference test add
      
      * remove extra comments
      
      * add docs
      
      * save intermediate
      
      * finish trivia_qa conversion
      
      * small update to forward
      
      * correct qa and layer
      
      * better error message
      
      * BigBird QA ready
      
      * fix rebased
      
      * add trivia-qa debugger notebook
      
      * qa setup
      
      * fixed till embeddings
      
      * some issue in q/k/v_layer
      
      * fix bug in conversion-script
      
      * fixed till self-attn
      
      * qa fixed except layer norm
      
      * add qa end2end test
      
      * fix gradient ckpting ; other qa test
      
      * speed-up big bird a bit
      
      * hub_id=google
      
      * clean up
      
      * make quality
      
      * speed up einsum with bmm
      
      * finish perf improvements for big bird
      
      * remove wav2vec2 tok
      
      * fix tokenizer
      
      * include docs
      
      * correct docs
      
      * add helper to auto pad block size
      
      * make style
      
      * remove fast tokenizer for now
      
      * fix some
      
      * add pad test
      
      * finish
      
      * fix some bugs
      
      * fix another bug
      
      * fix buffer tokens
      
      * fix comment and merge from master
      
      * add comments
      
      * make style
      
      * commit some suggestions
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Fix typos
      
      * fix some more suggestions
      
      * add another patch
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * fix copies
      
      * another path
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * update
      
      * update nit suggestions
      
      * make style
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
  18. 25 Mar, 2021 1 commit
    • Layout lm tf 2 (#10636) · 4684bfc7
      Amir Tahmasbi authored
      
      
      * Added embeddings layer
      
      * Added layoutlm layers, main model, maskedlm and token classification classes
      
      * Added model classes to tf auto models
      
      * Added model to PT to TF conversion script
      
      * Added model to doc README
      
      * Added tests
      
      * Removed unused imports
      
      * Added layoutlm model, test, and doc for sequence classification, and fix imports in __init__.py
      
      * Made tests pass!
      
      * Fixed typos in imports and docs
      
      * Fixed a typo in embeddings layer
      
      * Removed imports
      
      * Fixed formatting issues, imports, tests
      
      * Added layoutlm layers, main model, maskedlm and token classification classes
      
      * Added model classes to tf auto models
      
      * Added model to PT to TF conversion script
      
      * Removed unused imports
      
      * Added layoutlm model, test, and doc for sequence classification, and fix imports in __init__.py
      
      * Made tests pass!
      
      * Fixed typos in imports and docs
      
      * Removed imports
      
      * Fixed small formatting issues
      
      * Removed duplicates import from main __init__.py
      
      * Changed default arg to true for adding pooling layer to tf layoutlm
      
      * Fixed formatting issues
      
      * Style
      
      * Added copied from to classes copied from bert
      
      * Fixed doc strings examples to work with layoutlm inputs
      
      * Removed PyTorch reference in doc strings example
      
      * Added integration tests
      
      * Cleaned up initialization file
      
      * Updated model checkpoint identifiers
      
      * Fixed imports
      Co-authored-by: Amir Tahmasbi <amir@ehsai.ca>
      Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
  19. 23 Mar, 2021 1 commit
  20. 16 Mar, 2021 1 commit
  21. 11 Mar, 2021 1 commit
  22. 10 Mar, 2021 1 commit
    • Speech2TextTransformer (#10175) · d26b37e7
      Suraj Patil authored
      
      
      * s2t
      
      * fix config
      
      * conversion script
      
      * fix import
      
      * add tokenizer
      
      * fix tok init
      
      * fix tokenizer
      
      * first version working
      
      * fix embeds
      
      * fix lm head
      
      * remove extra heads
      
      * fix convert script
      
      * handle encoder attn mask
      
      * style
      
      * better enc attn mask
      
      * override _prepare_attention_mask_for_generation
      
      * handle attn_mask in encoder and decoder
      
      * input_ids => input_features
      
      * enable use_cache
      
      * remove old code
      
      * expand embeddings if needed
      
      * remove logits bias
      
      * masked_lm_loss => loss
      
      * hack tokenizer to support feature processing
      
      * fix model_input_names
      
      * style
      
      * fix error message
      
      * doc
      
      * remove inputs_embeds
      
      * remove input_embeds
      
      * remove unnecessary docstring
      
      * quality
      
      * SpeechToText => Speech2Text
      
      * style
      
      * remove shared_embeds
      
      * subsample => conv
      
      * remove Speech2TextTransformerDecoderWrapper
      
      * update output_lengths formula
      
      * fix table
      
      * remove max_position_embeddings
      
      * update conversion scripts
      
      * add possibility to do upper case for now
      
      * add FeatureExtractor and Processor
      
      * add tests for extractor
      
      * require_torch_audio => require_torchaudio
      
      * add processor test
      
      * update import
      
      * remove classification head
      
      * attention mask is now 1D
      
      * update docstrings
      
      * attention mask should be of type long
      
      * handle attention mask from generate
      
      * always return attention_mask
      
      * fix test
      
      * style
      
      * doc
      
      * Speech2TextTransformer => Speech2Text
      
      * Speech2TextTransformerConfig => Speech2TextConfig
      
      * remove dummy_inputs
      
      * nit
      
      * style
      
      * multilingual tok
      
      * fix tokenizer
      
      * add tgt_lang setter
      
      * save lang_codes
      
      * fix tokenizer
      
      * add forced_bos_token_id to tokenizer
      
      * apply review suggestions
      
      * add torchaudio to extra deps
      
      * add speech deps to CI
      
      * fix dep
      
      * add libsndfile to ci
      
      * libsndfile1
      
      * add speech to extras all
      
      * libsndfile1 -> libsndfile1
      
      * libsndfile
      
      * libsndfile1-dev
      
      * apt update
      
      * add sudo to install
      
      * update deps table
      
      * install libsndfile1-dev on CI
      
      * tuple to list
      
      * init conv layer
      
      * add model tests
      
      * quality
      
      * add integration tests
      
      * skip_special_tokens
      
      * add speech_to_text_transformer in toctree
      
      * fix tokenizer
      
      * fix fp16 tests
      
      * add tokenizer tests
      
      * fix copyright
      
      * input_values => input_features
      
      * doc
      
      * add model in readme
      
      * doc
      
      * change checkpoint names
      
      * fix copyright
      
      * fix code example
      
      * add max_model_input_sizes in tokenizer
      
      * fix integration tests
      
      * add do_lower_case to tokenizer
      
      * remove clamp trick
      
      * fix "Add modeling imports here"
      
      * fix copyrights
      
      * fix tests
      
      * SpeechToTextTransformer => SpeechToText
      
      * fix naming
      
      * fix table formatting
      
      * fix typo
      
      * style
      
      * fix typos
      
      * remove speech dep from extras[testing]
      
      * fix copies
      
      * rename doc file,
      
      * put imports under is_torch_available
      
      * run feat extract tests when torch is available
      
      * dummy objects for processor and extractor
      
      * fix imports in tests
      
      * fix import in modeling test
      
      * fix imports
      
      * fix torch import
      
      * fix imports again
      
      * fix positional embeddings
      
      * fix typo in import
      
      * adapt new extractor refactor
      
      * style
      
      * fix torchscript test
      
      * doc
      
      * doc
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * fix docs, copied from, style
      
      * fix docstring
      
      * handle imports
      
      * remove speech from all extra deps
      
      * remove s2t from seq2seq lm mapping
      
      * better names
      
      * skip training tests
      
      * add install instructions
      
      * List => Tuple
      
      * doc
      
      * fix conversion script
      
      * fix urls
      
      * add instruction for libsndfile
      
      * fix fp16 test
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
  23. 08 Mar, 2021 1 commit
    • Add TFRag (#9002) · 696e8a43
      Ratthachat (Jung) authored
      * Create modeling_tf_dpr.py
      
      * Add TFDPR
      
      * Add back TFPegasus, TFMarian, TFMBart, TFBlenderBot
      
      last commit accidentally deleted these 4 lines, so I recover them back
      
      * Add TFDPR
      
      * Add TFDPR
      
      * clean up some comments, add TF input-style doc string
      
      * Add TFDPR
      
      * Make return_dict=False as default
      
      * Fix return_dict bug (in .from_pretrained)
      
      * Add get_input_embeddings()
      
      * Create test_modeling_tf_dpr.py
      
      The current version already passes all 27 tests!
      Please see the test run at : 
      https://colab.research.google.com/drive/1czS_m9zy5k-iSJbzA_DP1k1xAAC_sdkf?usp=sharing
      
      
      
      * fix quality
      
      * delete init weights
      
      * run fix copies
      
      * fix repo consis
      
      * del config_class, load_tf_weights
      
      They should be 'pytorch only'
      
      * add config_class back
      
      after removing it, test failed ... so totally only removing "use_tf_weights = None" on Lysandre suggestion
      
      * newline after .. note::
      
      * import tf, np (Necessary for ModelIntegrationTest)
      
      * slow_test from_pretrained with from_pt=True
      
      At the moment we don't have TF weights (since we don't have an official TF model)
      Previously, I did not run slow test, so I missed this bug
      
      * Add simple TFDPRModelIntegrationTest
      
      Note that this is just a test that TF and PyTorch give approximately the same output.
      However, I could not test with the official DPR repo's output yet
      
      * upload correct tf model
      
      * remove position_ids as missing keys
      
      * create modeling_tf_rag
      
      * add tests for tf
      
      * add tf tests
      
      * revert wrong pt commit
      
      * further refactor
      
      * further refactor
      
      * refactor
      
      * Update modeling_tf_rag.py
      
      - input_processing
      - fix prepare_input_for_generation (mostly fix generate bug)
      - bring back from_pretrained hack in order to test generate
      
      * delete colab pieces of code
      
      * Show case of greedy "generate"
      
      Temporarily change from beam_search test to greedy_search test to show case that TF and PT do get equivalent output.
      
      * cosmetic update
      
      * correct typos
      
      * update
      
      * push some progress
      
      * make easy check
      
      * fix rag save from pretrained
      
      * Update src/transformers/modeling_tf_utils.py
      
      * remove commented out lines
      
      * delete unnecessary lines
      
      * add simple test case for nq_checkpoint
      
      Add nq_checkpoint test to show that current version without hack still fails
      
      * temporarily put ugly hack back again
      
      * Add TFRagSequenceForGeneration!!
      
      * __init__.py , import TFRagSequenceForGeneration
      
      * Add TFRagSequence tests!
      
      * rag init.py - add TFRagSequenceForGeneration
      
      * fix from_pretrained
      
      * fix prepare_inputs_for_generation
      
      * Beam search for RagToken!
      
      * minor clean up
      
      * add tf.cast in TFRagModel
      
      * More tf.cast
      
      * Add all remaining tests (still have issues)
      
      * delete all T5 related
      
      * make style
      
      * fix load weight prefix
      
      * fix bart
      
      * fix return_dict for tf_rag
      
      make all tests pass .. Hooray
      
      * fix some tests
      
      * fix code quality
      
      * fix quality check
      
      * finish tests tf rag
      
      * add tf rag to docs
      
      * remove TFT5 from docstring
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * remove TFT5 from docstring
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Delete outdated comments
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * improve doc strings
      
      * add generative model classes
      
      * fix adjust token logic
      
      * refactor generate for TFRag
      
      * using shape_list, not _get_shape
      Co-authored-by: Julien Plu <plu.julien@gmail.com>
      
      * axis=[1]->axis=1
      
      * delete NEED_HELP comment
      
      * improve readability
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * improve readability
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * improve readability
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Indicating model is in a developing state in docstrings
      
      As suggested by Julien
      
      * small last changes
      
      * apply Sylvain's suggestions
      
      * finish tf rag
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: patrickvonplaten <patrick@huggingface.co>
      Co-authored-by: Julien Plu <plu.julien@gmail.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      696e8a43
  24. 06 Mar, 2021 1 commit
    • Add m2m100 (#10236) · f6e74a63
      Suraj Patil authored
      * m2m_100
      
      * no layernorm_embedding
      
      * sinusoidal positional embeddings
      
      * update pos embeddings
      
      * add default config values
      
      * tokenizer
      
      * add conversion script
      
      * fix config
      
      * fix pos embed
      
      * remove _float_tensor
      
      * update tokenizer
      
      * update lang codes
      
      * handle lang codes
      
      * fix pos embeds
      
      * fix spm key
      
      * put embedding weights on device
      
      * remove qa and seq classification heads
      
      * fix convert script
      
      * lang codes on one line
      
      * fix embeds
      
      * fix tokenizer
      
      * fix tokenizer
      
      * add fast tokenizer
      
      * style
      
      * M2M100MT => M2M100
      
      * fix copyright, style
      
      * tokenizer converter
      
      * vocab file
      
      * remove fast tokenizer
      
      * fix embeds
      
      * fix tokenizer
      
      * fix tests
      
      * add tokenizer tests
      
      * add integration test
      
      * quality
      
      * fix model name
      
      * fix test
      
      * doc
      
      * doc
      
      * fix doc
      
      * add copied from statements
      
      * fix tokenizer tests
      
      * apply review suggestions
      
      * fix urls
      
      * fix shift_tokens_right
      
      * apply review suggestions
      
      * fix
      
      * fix doc
      
      * add lang code to id
      
      * remove unused function
      
      * update checkpoint names
      
      * fix copy
      
      * fix tokenizer
      
      * fix checkpoint names
      
      * fix merge issue
      
      * style
      f6e74a63
  25. 01 Mar, 2021 1 commit
  26. 25 Feb, 2021 2 commits
    • I-BERT model support (#10153) · 63645b3b
      Sehoon Kim authored
      
      
      * IBertConfig, IBertTokenizer added
      
      * IBert Model names modified
      
      * tokenizer bugfix
      
      * embedding -> QuantEmbedding
      
      * quant utils added
      
      * quant_mode added to configuration
      
      * QuantAct added, Embedding layer + QuantAct addition
      
      * QuantAct added
      
      * unused path removed, QKV quantized
      
      * self attention layer all quantized, except softmax
      
      * temporary commit
      
      * all linear layers quantized
      
      * quant_utils bugfix
      
      * bugfix: requantization missing
      
      * IntGELU added
      
      * IntSoftmax added
      
      * LayerNorm implemented
      
      * LayerNorm implemented all
      
      * names changed: roberta->ibert
      
      * config does not inherit from RoBERTa
      
      * No support for CausalLM
      
      * static quantization added, quantize_model.py removed
      
      * import modules uncommented
      
      * copyrights fixed
      
      * minor bugfix
      
      * quant_modules, quant_utils merged as one file
      
      * import * fixed
      
      * unused runfile removed
      
      * make style run
      
      * configuration.py docstring fixed
      
      * refactoring: comments removed, function name fixed
      
      * unused dependency removed
      
      * typo fixed
      
      * comments(Copied from), assertion string added
      
      * refactoring: super(..) -> super(), etc.
      
      * refactoring
      
      * refactoring
      
      * make style
      
      * refactoring
      
      * cuda -> to(x.device)
      
      * weight initialization removed
      
      * QuantLinear set_param removed
      
      * QuantEmbedding set_param removed
      
      * IntLayerNorm set_param removed
      
      * assert string added
      
      * assertion error message fixed
      
      * is_decoder removed
      
      * enc-dec arguments/functions removed
      
      * Converter removed
      
      * quant_modules docstring fixed
      
      * convert_slow_tokenizer rolled back
      
      * quant_utils docstring fixed
      
      * unused arguments, e.g. use_cache, removed from config
      
      * weight initialization condition fixed
      
      * x_min, x_max initialized with small values to avoid div-zero exceptions
      
      * testing code for ibert
      
      * test emb, linear, gelu, softmax added
      
      * test ln and act added
      
      * style reformatted
      
      * force_dequant added
      
      * error tests overridden
      
      * make style
      
      * Style + Docs
      
      * force dequant tests added
      
      * Fix fast tokenizer in init
      
      * Fix doc
      
      * Remove space
      
      * docstring, IBertConfig, chunk_size
      
      * test_modeling_ibert refactoring
      
      * quant_modules.py refactoring
      
      * e2e integration test added
      
      * tokenizers removed
      
      * IBertConfig added to tokenizer_auto.py
      
      * bugfix
      
      * fix docs & test
      
      * fix style num 2
      
      * final fixes
      Co-authored-by: Sehoon Kim <sehoonkim@berkeley.edu>
      Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
      Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      63645b3b
    • [PretrainedFeatureExtractor] + Wav2Vec2FeatureExtractor, Wav2Vec2Processor, Wav2Vec2Tokenizer (#10324) · cb38ffcc
      Patrick von Platen authored
      
      * push to show
      
      * small improvement
      
      * small improvement
      
      * Update src/transformers/feature_extraction_utils.py
      
      * Update src/transformers/feature_extraction_utils.py
      
      * implement base
      
      * add common tests
      
      * make all tests pass for wav2vec2
      
      * make padding work & add more tests
      
      * finalize feature extractor utils
      
      * add call method to feature extraction
      
      * finalize feature processor
      
      * finish tokenizer
      
      * finish general processor design
      
      * finish tests
      
      * typo
      
      * remove bogus file
      
      * finish docstring
      
      * add docs
      
      * finish docs
      
      * small fix
      
      * correct docs
      
      * save intermediate
      
      * load changes
      
      * apply changes
      
      * apply changes to doc
      
      * change tests
      
      * apply Suraj's recommendations
      
      * final changes
      
      * Apply suggestions from code review
      
      * fix typo
      
      * fix import
      
      * correct docstring
      cb38ffcc
  27. 22 Feb, 2021 1 commit
  28. 19 Feb, 2021 1 commit
  29. 15 Feb, 2021 1 commit
    • Add mBART-50 (#10154) · 6fc940ed
      Suraj Patil authored
      * add tokenizer for mBART-50
      
      * update tokenizers
      
      * make src_lang and tgt_lang optional
      
      * update tokenizer test
      
      * add setter
      
      * update docs
      
      * update conversion script
      
      * update docs
      
      * update conversion script
      
      * update tokenizer
      
      * update test
      
      * update docs
      
      * doc
      
      * address Sylvain's suggestions
      
      * fix test
      
      * fix formatting
      
      * nits
      6fc940ed
  30. 03 Feb, 2021 1 commit
  31. 02 Feb, 2021 1 commit
    • Wav2Vec2 (#9659) · d6217fb3
      Patrick von Platen authored
      
      
      * add raw scaffold
      
      * implement feat extract layers
      
      * make style
      
      * remove +
      
      * correctly convert weights
      
      * make feat extractor work
      
      * make feature extraction proj work
      
      * run forward pass
      
      * finish forward pass
      
      * Successful decoding example
      
      * remove unused files
      
      * more changes
      
      * add wav2vec tokenizer
      
      * add new structure
      
      * fix run forward
      
      * add other layer norm architecture
      
      * finish 2nd structure
      
      * add model tests
      
      * finish tests for tok and model
      
      * clean-up
      
      * make style
      
      * finish docstring for model and config
      
      * make style
      
      * correct docstring
      
      * correct tests
      
      * change checkpoints to fairseq
      
      * fix examples
      
      * finish wav2vec2
      
      * make style
      
      * apply Sylvain's suggestions
      
      * apply Lysandre's suggestions
      
      * change print to log.info
      
      * re-add assert statement
      
      * add input_values as required input name
      
      * finish wav2vec2 tokenizer
      
      * Update tests/test_tokenization_wav2vec2.py
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * apply Sylvain's suggestions
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      d6217fb3
  32. 01 Feb, 2021 1 commit
  33. 27 Jan, 2021 2 commits