1. 06 Jan, 2021 1 commit
  2. 05 Jan, 2021 3 commits
    • [PyTorch Bart] Split Bart into different models (#9343) · eef66035
      Patrick von Platen authored
      * first try
      
      * remove old template
      
      * finish bart
      
      * finish mbart
      
      * delete unnecessary line
      
      * init pegasus
      
      * save intermediate
      
      * correct pegasus
      
      * finish pegasus
      
      * remove cookie cutter leftover
      
      * add marian
      
      * finish blenderbot
      
      * replace in file
      
      * correctly split blenderbot
      
      * delete "old" folder
      
      * correct "add statement"
      
      * adapt config for tf comp
      
      * correct configs for tf
      
      * remove ipdb
      
      * fix more stuff
      
      * fix mbart
      
      * push pegasus fix
      
      * fix mbart
      
      * more fixes
      
      * fix research projects code
      
      * finish docs for bart, mbart, and marian
      
      * delete unnecessary file
      
      * correct attn typo
      
      * correct configs
      
      * remove pegasus for seq class
      
      * correct peg docs
      
      * correct peg docs
      
      * finish configs
      
      * further improve docs
      
      * add copied from statements to mbart
      
      * fix copied from in mbart
      
      * add copy statements to marian
      
      * add copied from to marian
      
      * add pegasus copied from
      
      * finish pegasus
      
      * finish copied from
      
      * Apply suggestions from code review
      
      * make style
      
      * backward comp blenderbot
      
      * apply Lysandre's and Sylvain's suggestions
      
      * apply suggestions
      
      * push last fixes
      
      * fix docs
      
      * fix tok tests
      
      * fix imports code style
      
      * fix doc
      eef66035
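Several bullets above add "copied from" statements: after the split, each Bart-derived model (mBART, Pegasus, Marian, Blenderbot) carries its own copy of shared code, kept in sync by a consistency check (`make fix-copies`, which compares source text against the referenced definition). A toy stand-in for that check, with hypothetical names and bytecode comparison instead of text comparison:

```python
# Toy "Copied from" consistency check: a function marked as copied from
# another must behave identically. The real check diffs source text; this
# sketch compares compiled bytecode. All names here are hypothetical.

def bart_shift(ids, pad):
    # shift tokens one position to the right, filling with the pad id
    return [pad] + ids[:-1]

def mbart_shift(ids, pad):  # Copied from bart_shift
    return [pad] + ids[:-1]

def copies_consistent(source, copy):
    """True when both functions compile to identical bytecode."""
    return source.__code__.co_code == copy.__code__.co_code

print(copies_consistent(bart_shift, mbart_shift))  # True
```

This is the design trade-off the PR makes: duplicated files per model instead of one shared Bart file, with an automated check keeping the duplicates honest.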
    • LED (#9278) · 189387e9
      Patrick von Platen authored
      * create model
      
      * add integration
      
      * save current state
      
      * make integration tests pass
      
      * add one more test
      
      * add explanation to tests
      
      * remove from bart
      
      * add padding
      
      * remove unnecessary test
      
      * make all tests pass
      
      * re-add cookie cutter tests
      
      * finish PyTorch
      
      * fix attention test
      
      * Update tests/test_modeling_common.py
      
      * revert change
      
      * remove unused file
      
      * add string to doc
      
      * save intermediate
      
      * make tf integration tests pass
      
      * finish tf
      
      * fix doc
      
      * fix docs again
      
      * add led to doctree
      
      * add to auto tokenizer
      
      * added tips for led
      
      * make style
      
      * apply jplus statements
      
      * correct tf longformer
      
      * apply lysandres suggestions
      
      * apply sylvains suggestions
      
      * Apply suggestions from code review
      189387e9
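One bullet above is "add padding": LED's local attention operates on fixed-size windows, so inputs are padded up to a multiple of the window size before attention runs. A minimal sketch of that idea in plain Python (hypothetical names; the real model also pads the attention mask and position ids):

```python
# Pad a token sequence so its length is a multiple of the local attention
# window, as LED-style sliding-window attention requires. Toy sketch only.

def pad_to_window(token_ids, window_size, pad_token_id=0):
    """Return token_ids extended with pad_token_id to a window multiple."""
    remainder = len(token_ids) % window_size
    if remainder == 0:
        return token_ids
    return token_ids + [pad_token_id] * (window_size - remainder)

print(pad_to_window([1, 2, 3, 4, 5], 4))  # [1, 2, 3, 4, 5, 0, 0, 0]
```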
    • Use stable functions (#9369) · 4225740a
      Julien Plu authored
      4225740a
  3. 04 Jan, 2021 1 commit
  4. 25 Dec, 2020 1 commit
  5. 24 Dec, 2020 1 commit
    • Proposed Fix: [RagSequenceForGeneration] generate "without" input_ids (#9220) · f3a3b91d
      Ratthachat (Jung) authored
      * Create modeling_tf_dpr.py
      
      * Add TFDPR
      
      * Add back TFPegasus, TFMarian, TFMBart, TFBlenderBot
      
      the last commit accidentally deleted these 4 lines, so I restored them
      
      * Add TFDPR
      
      * Add TFDPR
      
      * clean up some comments, add TF input-style doc string
      
      * Add TFDPR
      
      * Make return_dict=False as default
      
      * Fix return_dict bug (in .from_pretrained)
      
      * Add get_input_embeddings()
      
      * Create test_modeling_tf_dpr.py
      
      The current version has already passed all 27 tests!
      Please see the test run at:
      https://colab.research.google.com/drive/1czS_m9zy5k-iSJbzA_DP1k1xAAC_sdkf?usp=sharing
      
      * fix quality
      
      * delete init weights
      
      * run fix copies
      
      * fix repo consis
      
      * del config_class, load_tf_weights
      
      They should be 'PyTorch only'
      
      * add config_class back
      
      after removing it, a test failed, so per Lysandre's suggestion only "use_tf_weights = None" is removed
      
      * newline after .. note::
      
      * import tf, np (Necessary for ModelIntegrationTest)
      
      * slow_test from_pretrained with from_pt=True
      
      At the moment we don't have TF weights (since there is no official TF model yet).
      Previously I did not run the slow tests, so I missed this bug
      
      * Add simple TFDPRModelIntegrationTest
      
      Note that this just tests that TF and PyTorch give approximately the same output.
      However, I could not test against the official DPR repo's output yet
      
      * upload correct tf model
      
      * remove position_ids as missing keys
      
      * fix RagSeq generate with context_input_ids
      
      * apply style
      
      * delete unused lines
      
      * Add test_rag_sequence_generate_batch_from_context_input_ids
      
      * Readability improved
      
      * stylying
      
      * Stylize
      
      * typos
      
      * add check_model_generate_from_context_input_ids
      
      * make style
      
      * Apply suggestions from code review
      
      * make style2
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: patrickvonplaten <patrick@huggingface.co>
      f3a3b91d
  6. 23 Dec, 2020 1 commit
    • Add caching mechanism to BERT, RoBERTa (#9183) · 88ef8893
      Suraj Patil authored
      * add past_key_values
      
      * add use_cache option
      
      * make mask before cutting ids
      
      * adjust position_ids according to past_key_values
      
      * flatten past_key_values
      
      * fix positional embeds
      
      * fix _reorder_cache
      
      * set use_cache to false when not decoder, fix attention mask init
      
      * add test for caching
      
      * add past_key_values for Roberta
      
      * fix position embeds
      
      * add caching test for roberta
      
      * add doc
      
      * make style
      
      * doc, fix attention mask, test
      
      * small fixes
      
      * address Patrick's comments
      
      * input_ids shouldn't start with pad token
      
      * use_cache only when decoder
      
      * make consistent with bert
      
      * make copies consistent
      
      * add use_cache to encoder
      
      * add past_key_values to tapas attention
      
      * apply suggestions from code review
      
      * make copies consistent
      
      * add attn mask in tests
      
      * remove copied from longformer
      
      * apply suggestions from code review
      
      * fix bart test
      
      * nit
      
      * simplify model outputs
      
      * fix doc
      
      * fix output ordering
      88ef8893
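The past_key_values mechanism added here lets a decoder reuse the keys and values computed at earlier steps, so each generation step only processes the newest token instead of re-encoding the whole prefix. A pure-Python toy of the cache bookkeeping (the real cache holds per-layer key/value tensors; the "attention" below is just an average, for illustration):

```python
# Toy incremental decoding with a growing key/value cache, in the spirit of
# the past_key_values / use_cache options this PR adds. Hypothetical names.

def step(token_embedding, past_keys, past_values):
    """One decoding step: reuse the cache, append the new key/value."""
    keys = past_keys + [token_embedding]      # cache grows by one per step
    values = past_values + [token_embedding]
    out = sum(values) / len(values)           # toy "attention": running mean
    return out, keys, values

def decode(tokens):
    keys, values, outputs = [], [], []
    for t in tokens:                          # each step feeds ONE new token
        out, keys, values = step(t, keys, values)
        outputs.append(out)
    return outputs

print(decode([1.0, 3.0, 5.0]))  # [1.0, 2.0, 3.0]
```

Note how "set use_cache to false when not decoder" above makes sense in this picture: an encoder sees the full input at once, so there is nothing to cache between steps.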
  7. 22 Dec, 2020 3 commits
  8. 21 Dec, 2020 3 commits
  9. 19 Dec, 2020 1 commit
    • Added TF TransfoXL Sequence Classification (#9169) · e0e255be
      sandip authored
      * TF Transfoxl seq classification
      
      * Update test_modeling_tf_transfo_xl.py
      
      Added num_labels to config level
      
      * code refactor
      
      * code refactor
      
      * code refactor
      e0e255be
  10. 18 Dec, 2020 2 commits
  11. 17 Dec, 2020 1 commit
  12. 16 Dec, 2020 3 commits
    • TableQuestionAnsweringPipeline (#9145) · 1c1a2ffb
      Lysandre Debut authored
      
      
      * AutoModelForTableQuestionAnswering
      
      * TableQuestionAnsweringPipeline
      
      * Apply suggestions from Patrick's code review
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Sylvain and Patrick comments
      
      * Better PyTorch/TF error message
      
      * Add integration tests
      
      * Argument Handler naming
      Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
      
      * Fix docs to appease the documentation gods
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      1c1a2ffb
    • AutoModelForTableQuestionAnswering (#9154) · 07384baf
      Lysandre Debut authored
      * AutoModelForTableQuestionAnswering
      
      * Update src/transformers/models/auto/modeling_auto.py
      
      * Style
      07384baf
    • [Flax] Align FlaxBertForMaskedLM with BertForMaskedLM, implement from_pretrained, init (#9054) · 640e6fe1
      Patrick von Platen authored
      
      
      * save intermediate
      
      * save intermediate
      
      * save intermediate
      
      * correct flax bert model file
      
      * new module / model naming
      
      * make style
      
      * almost finish BERT
      
      * finish roberta
      
      * make fix-copies
      
      * delete keys file
      
      * last refactor
      
      * fixes in run_mlm_flax.py
      
      * remove pooled from run_mlm_flax.py
      
      * fix gelu | gelu_new
      
      * remove Module from inits
      
      * splits
      
      * dirty print
      
      * preventing warmup_steps == 0
      
      * smaller splits
      
      * make fix-copies
      
      * dirty print
      
      * dirty print
      
      * initial_evaluation argument
      
      * declaration order fix
      
      * proper model initialization/loading
      
      * proper initialization
      
      * run_mlm_flax improvements: improper model inputs bugfix + automatic dataset splitting + tokenizers parallelism warning + avoiding warmup_steps=0 bug
      
      * removed tokenizers warning hack, fixed model re-initialization
      
      * reverted training_args.py changes
      
      * fix flax from pretrained
      
      * improve test in flax
      
      * apply sylvains tips
      
      * update init
      
      * make 0.3.0 compatible
      
      * revert tevens changes
      
      * revert tevens changes 2
      
      * finalize revert
      
      * fix bug
      
      * add docs
      
      * add pretrained to init
      
      * Update src/transformers/modeling_flax_utils.py
      
      * fix copies
      
      * final improvements
      Co-authored-by: TevenLeScao <teven.lescao@gmail.com>
      640e6fe1
  13. 15 Dec, 2020 7 commits
    • [WIP] Tapas v4 (tres) (#9117) · 1551e2dc
      NielsRogge authored
      
      
      * First commit: adding all files from tapas_v3
      
      * Fix multiple bugs including soft dependency and new structure of the library
      
      * Improve testing by adding torch_device to inputs and adding dependency on scatter
      
      * Use Python 3 inheritance rather than Python 2
      
      * First draft model cards of base sized models
      
      * Remove model cards as they are already on the hub
      
      * Fix multiple bugs with integration tests
      
      * All model integration tests pass
      
      * Remove print statement
      
      * Add test for convert_logits_to_predictions method of TapasTokenizer
      
      * Incorporate suggestions by Google authors
      
      * Fix remaining tests
      
      * Change position embeddings sizes to 512 instead of 1024
      
      * Comment out positional embedding sizes
      
      * Update PRETRAINED_VOCAB_FILES_MAP and PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES
      
      * Added more model names
      
      * Fix truncation when no max length is specified
      
      * Disable torchscript test
      
      * Make style & make quality
      
      * Quality
      
      * Address CI needs
      
      * Test the Masked LM model
      
      * Fix the masked LM model
      
      * Truncate when overflowing
      
      * More much needed docs improvements
      
      * Fix some URLs
      
      * Some more docs improvements
      
      * Test PyTorch scatter
      
      * Set to slow + minify
      
      * Calm flake8 down
      
      * Add add_pooling_layer argument to TapasModel
      
      Address comments by @sgugger and @patrickvonplaten
      
      * Fix issue in docs + fix style and quality
      
      * Clean up conversion script and add task parameter to TapasConfig
      
      * Revert the task parameter of TapasConfig
      
      Some minor fixes
      
      * Improve conversion script and add test for absolute position embeddings
      
      * Improve conversion script and add test for absolute position embeddings
      
      * Fix bug with reset_position_index_per_cell arg of the conversion cli
      
      * Add notebooks to the examples directory and fix style and quality
      
      * Apply suggestions from code review
      
      * Move from `nielsr/` to `google/` namespace
      
      * Apply Sylvain's comments
      Co-authored-by: sgugger <sylvain.gugger@gmail.com>
      Co-authored-by: Rogge Niels <niels.rogge@howest.be>
      Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      1551e2dc
    • Add possibility to switch between APEX and AMP in Trainer (#9137) · ad895af9
      Sylvain Gugger authored
      
      
      * Add possibility to switch between APEX and AMP in Trainer
      
      * Update src/transformers/training_args.py
      Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
      
      * Address review comments
      
      * Update src/transformers/training_args.py
      Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
      ad895af9
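This PR makes the mixed-precision backend a user choice rather than an implicit one. A toy sketch of the kind of selection logic such a switch implies, assuming an "auto" default that prefers native AMP when it is available (hypothetical names; the actual logic lives in training_args.py and trainer.py):

```python
# Toy backend switch between APEX and native AMP. All names hypothetical;
# a sketch of the selection idea, not the Trainer's real implementation.

def pick_fp16_backend(choice="auto", apex_available=False, amp_available=True):
    """'auto' prefers native AMP when present, else falls back to APEX."""
    if choice == "auto":
        return "amp" if amp_available else "apex"
    return choice  # explicit "amp" or "apex" is honored as-is

print(pick_fp16_backend())        # amp
print(pick_fp16_backend("apex"))  # apex
```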
    • Add large model config (#9140) · 0b2f46fa
      Lysandre Debut authored
      0b2f46fa
    • [TF Bart] Refactor TFBart (#9029) · abc573f5
      Patrick von Platen authored
      * reorder file
      
      * delete unnecessary function
      
      * make style
      
      * save intermediate
      
      * fix attention masks
      
      * correct tf bart past key values
      
      * solve merge conflict bug
      
      * correct tensor dims
      
      * save intermediate tf
      
      * change attn layer
      
      * fix typo re-order past
      
      * inputs_embeds
      
      * make fix copies
      
      * finish tests
      
      * fix graph mode
      
      * apply Lysandre's suggestions
      abc573f5
    • Added TF OpenAI GPT1 Sequence Classification (#9105) · 389aba34
      sandip authored
      
      
      * TF OpenAI GPT Sequence Classification
      
      * Update src/transformers/models/openai/modeling_tf_openai.py
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      389aba34
    • Fix tf2.4 (#9120) · ef2d4cd4
      Julien Plu authored
      
      
      * Fix tests for TF 2.4
      
      * Remove <2.4 limitation
      
      * Add version condition
      
      * Update tests/test_optimization_tf.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update tests/test_optimization_tf.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update tests/test_optimization_tf.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      ef2d4cd4
    • Fix T5 model parallel test (#9107) · 6ccea048
      Lysandre Debut authored
      6ccea048
  14. 14 Dec, 2020 4 commits
    • Fix T5 and BART for TF (#9063) · df3f4d2a
      Julien Plu authored
      * Fix T5 for graph compilation+execution
      
      * Fix BART
      
      * Fix import
      
      * Fix naming
      
      * fix attribute name
      
      * Oops
      
      * fix import
      
      * fix tests
      
      * fix tests
      
      * Update test
      
      * Add missing import
      
      * Address Patrick's comments
      
      * Style
      
      * Address Patrick's comment
      df3f4d2a
    • Add parallelization support for T5EncoderModel (#9082) · a9c8bff7
      Ahmed Elnaggar authored
      
      
      * add model parallelism to T5EncoderModel
      
      * remove decoder from T5EncoderModel parallelize
      
      * update T5EncoderModel docs
      
      * Extend T5ModelTest for T5EncoderModel
      
      * fix T5Stack using range for get_device_map
      
      * fix style
      Co-authored-by: Ahmed Elnaggar <elnaggar@rostlab.informatik.tu-muenchen.de>
      a9c8bff7
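Model parallelism here works by assigning contiguous blocks of transformer layers to different devices via a device map (one bullet above fixes T5Stack to build it with `range`). A toy sketch of an even layer-to-device split, with hypothetical names (not the library's actual helper):

```python
# Toy device map: assign consecutive blocks of layers to each device,
# distributing any remainder to the earliest devices. Hypothetical names.

def get_device_map(n_layers, devices):
    """Map each device id to the list of layer indices it should host."""
    per_device = n_layers // len(devices)
    layers = list(range(n_layers))
    device_map = {}
    start = 0
    for i, device in enumerate(devices):
        extra = 1 if i < n_layers % len(devices) else 0
        end = start + per_device + extra
        device_map[device] = layers[start:end]
        start = end
    return device_map

print(get_device_map(6, [0, 1]))  # {0: [0, 1, 2], 1: [3, 4, 5]}
```

Keeping each block contiguous minimizes cross-device transfers: activations only cross a device boundary once per block.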
    • [RAG, Bart] Align RAG, Bart cache with T5 and other models of transformers (#9098) · fa1ddced
      Patrick von Platen authored
      * fix rag
      
      * fix slow test
      
      * fix past in bart
      fa1ddced
    • Fix embeddings resizing in TF models (#8657) · 51d9c569
      Julien Plu authored
      * Resize the biases in same time than the embeddings
      
      * Trigger CI
      
      * Biases are not reset anymore
      
      * Remove get_output_embeddings + better LM model detection in generation utils
      
      * Apply style
      
      * First test on BERT
      
      * Update docstring + new name
      
      * Apply the new resizing logic to all the models
      
      * fix tests
      
      * Apply style
      
      * Update the template
      
      * Fix naming
      
      * Fix naming
      
      * Apply style
      
      * Apply style
      
      * Remove unused import
      
      * Revert get_output_embeddings
      
      * Trigger CI
      
      * Update num parameters
      
      * Restore get_output_embeddings in TFPretrainedModel and add comments
      
      * Style
      
      * Add decoder resizing
      
      * Style
      
      * Fix tests
      
      * Separate bias and decoder resize
      
      * Fix tests
      
      * Fix tests
      
      * Apply style
      
      * Add bias resizing in MPNet
      
      * Trigger CI
      
      * Apply style
      51d9c569
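The key point of this PR is captured in its first bullet: when the vocabulary size changes, the output bias must be resized in the same operation as the embedding table, with existing weights preserved and only the new slots freshly initialized. A plain-Python toy of that logic (hypothetical names, lists instead of TF tensors):

```python
# Toy embedding + tied-bias resize: keep old rows/entries, initialize only
# the newly added slots. A sketch of the idea, not the TF implementation.

def resize_embeddings(embeddings, bias, new_size, init=0.0):
    """Resize an embedding table and its tied output bias together."""
    old_size = len(embeddings)
    dim = len(embeddings[0])
    if new_size <= old_size:                  # shrinking: truncate both
        return embeddings[:new_size], bias[:new_size]
    new_rows = [[init] * dim for _ in range(new_size - old_size)]
    return embeddings + new_rows, bias + [init] * (new_size - old_size)

emb, b = resize_embeddings([[1.0, 2.0], [3.0, 4.0]], [0.5, 0.6], 3)
print(emb)  # [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]]
print(b)    # [0.5, 0.6, 0.0]
```

Resizing the two together is what "Biases are not reset anymore" above refers to: before this fix, growing the vocabulary could silently discard the trained bias.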
  15. 11 Dec, 2020 1 commit
  16. 10 Dec, 2020 1 commit
  17. 09 Dec, 2020 4 commits
  18. 08 Dec, 2020 2 commits