1. 13 Apr, 2022 1 commit
    • Add self training code for text classification (#16738) · 34ef029d
      Tu Vu authored
      * Add self-training code for text-classification
      
      * Delete strata
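The self-training recipe behind #16738 (train a teacher on labeled data, pseudo-label the unlabeled pool, and keep only confident predictions for retraining) can be sketched as below. The helper name and the threshold value are illustrative, not the script's actual API:

```python
def select_pseudo_labels(probs, threshold=0.9):
    """Keep unlabeled examples whose top predicted class probability
    exceeds the confidence threshold; return (index, label) pairs."""
    selected = []
    for i, dist in enumerate(probs):
        label = max(range(len(dist)), key=lambda c: dist[c])
        if dist[label] >= threshold:
            selected.append((i, label))
    return selected

# Example: only the first and last predictions are confident enough.
probs = [
    [0.95, 0.05],   # confident, kept with label 0
    [0.60, 0.40],   # uncertain, dropped
    [0.08, 0.92],   # confident, kept with label 1
]
print(select_pseudo_labels(probs))  # [(0, 0), (2, 1)]
```

The selected pairs would then be appended to the labeled set before the next training round.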
  2. 12 Apr, 2022 1 commit
  3. 11 Apr, 2022 2 commits
    • Fix example logs repeating themselves (#16669) · 69233cf0
      Zachary Mueller authored
      Move declaration of log streams to before tests, so that results won't get compounded on top of each other
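The bug fixed in #16669 is a classic logging pitfall: attaching a fresh stream handler on every run makes each message print once per earlier attachment, so results compound. A minimal standalone sketch of the guard (not the literal change in the example scripts):

```python
import logging

def get_logger(name="examples"):
    """Return a logger whose stream handler is declared exactly once.
    Without the guard, each call would add another handler and every
    log line would be emitted multiple times."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # attach the stream only on first use
        logger.addHandler(logging.StreamHandler())
        logger.setLevel(logging.INFO)
    return logger

a = get_logger()
b = get_logger()
print(len(a.handlers))  # 1, the second call reused the existing handler
```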
    • Jia multi gpu eval (#16428) · 4868a830
      Jia LI authored
      
      
      * add simple multi-GPU completion
      
      * add human_eval_multi_gpu
      
      * use copy strategy to distribute across gpu, to avoid padding
      
      * add doc string
      
      * update code style
      
      * use task id to arrange output
      
      * truncate input to avoid zero pad
      
      * Stop the copy mechanism
      
      * update style
      
      * restore copies to scale better in distributed mode
      
      * update style
      
      * replace human eval
      
      * Apply suggestions from code review
      
      1. Tokenize all input at the same time
      2. use attention_mask to get the input length
      3. other small fixes
      Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
      
      * correct typo and update docstring
      
      * update code style
      
      * remove num sample division constraint
      
      * remove max len calculation
      
      * use accelerator.gather once to speed up
      
      * use accelerate set_seed; update accelerate version
      
      * correct gather bug
      Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
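The multi-GPU evaluation flow described in the bullets above (distribute prompts across processes, generate, gather everything once, and use the task id to restore the original order) can be sketched in plain Python; the helper names are illustrative, and in the real script the gather step is `accelerator.gather`:

```python
def shard(tasks, num_procs):
    """Round-robin split of (task_id, prompt) pairs across processes."""
    return [tasks[rank::num_procs] for rank in range(num_procs)]

def gather_in_order(per_proc_outputs):
    """Flatten the per-process (task_id, result) lists and restore the
    original ordering by task id, mimicking a single gather + sort."""
    merged = [item for outputs in per_proc_outputs for item in outputs]
    return [result for _, result in sorted(merged)]

tasks = list(enumerate(["p0", "p1", "p2", "p3", "p4"]))
shards = shard(tasks, 2)
# pretend each process "generates" by upper-casing its prompts
outputs = [[(i, p.upper()) for i, p in s] for s in shards]
print(gather_in_order(outputs))  # ['P0', 'P1', 'P2', 'P3', 'P4']
```

Carrying the task id through generation is what makes a single gather at the end sufficient.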
  4. 08 Apr, 2022 1 commit
    • Add TAPEX (#16473) · 4ef0abb7
      NielsRogge authored
      
      
      * Add TapexTokenizer
      
      * Improve docstrings and provide option to provide answer
      
      * Remove option for pretokenized inputs
      
      * Add TAPEX to README
      
      * Fix copies
      
      * Remove option for pretokenized inputs
      
      * Initial commit: add tapex fine-tuning examples on both table-based question answering and table-based fact verification.
      
      * - Draft a README file for running the script and introducing some background.
      - Remove unused code lines in tabfact script.
      - Disable the default `pad_to_max_length` option which is memory-consuming.
      
      * * Support `as_target_tokenizer` function for TapexTokenizer.
      * Fix the do_lower_case behaviour of TapexTokenizer.
      * Add unit tests for target scenarios and cased/uncased scenarios for both source and target.
      
      * * Replace the label BartTokenizer with TapexTokenizer's as_target_tokenizer function.
      * Fix typos in tapex example README.
      
      * * fix the evaluation script - remove the property `task_name`
      
      * * Make the label space more clear for tabfact tasks
      
      * * Using a new fine-tuning script for tapex-base on tabfact.
      
      * * Remove the lowercase code outside the tokenizer - we use the tokenizer to control whether do_lower_case
      * Guarantee the hyper-parameter can be run without out-of-memory on 16GB card and report the new reproduced number on wikisql
      
      * * Remove the default tokenizer_name option.
      * Provide evaluation command.
      
      * * Support for WikiTableQuestion dataset.
      
      * Fix a typo in README.
      
      * * Fix the dataset's key name in WikiTableQuestions
      
      * Run make fixup and move test to folder
      
      * Fix quality
      
      * Apply suggestions from code review
      
      * Apply suggestions from code review
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * Apply suggestions from code review
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Apply some more suggestions from code review
      
      * Improve docstrings
      
      * Overwrite failing test
      
      * Improve comment in example scripts
      
      * Fix rebase
      
      * Add TAPEX to Auto mapping
      
      * Add TAPEX to auto config mappings
      
      * Put TAPEX higher than BART in auto mapping
      
      * Add TAPEX to doc tests
      Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>
      Co-authored-by: SivilTaram <qianlxc@outlook.com>
      Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home>
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
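TAPEX feeds tables to a BART-style seq2seq model by flattening them into a single text sequence. A rough sketch of the row-major linearization (the exact separators and casing handled by `TapexTokenizer` may differ from this illustration):

```python
def linearize_table(header, rows):
    """Flatten a table into a row-major string of the shape a
    TAPEX-style tokenizer feeds to a seq2seq model."""
    parts = ["col : " + " | ".join(header)]
    for i, row in enumerate(rows, start=1):
        parts.append(f"row {i} : " + " | ".join(str(c) for c in row))
    return " ".join(parts)

table = linearize_table(["city", "population"],
                        [["Paris", 2148000], ["Lyon", 513000]])
print(table)
# col : city | population row 1 : Paris | 2148000 row 2 : Lyon | 513000
```

The question (or statement, for TabFact) is prepended to this string before tokenization.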
  5. 31 Mar, 2022 1 commit
  6. 30 Mar, 2022 1 commit
  7. 28 Mar, 2022 1 commit
  8. 24 Mar, 2022 1 commit
  9. 23 Mar, 2022 2 commits
    • Decision transformer gym (#15845) · aff9bc40
      Edward Beeching authored
      
      
      * Created the Decision Transformer Model
      
      * updating tests, copy to other machine
      
      * Added last hidden size to Decision Transformer modelling outputs
      
      * Removed copy of original DT file
      
      * made a temporary change to gpt2 to have it conform with the Decision Transformer version
      
      * Updated tests
      
      * Ignoring a file used to test the DT model
      
      * added comments to config file
      
      * added comments and argument descriptions to decision transformer file
      
      * Updated doc
      
      * Ran "make style"
      
      * Remove old model imports
      
      * Removed unused imports, cleaned up init file
      
      * Update docs/source/model_doc/decision_transformer.mdx
      
      added my username
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * Reverted changes made to gpt2
      
      * Removed datasets submodule
      
      * Update the modeling outputs to include gpt2 attentions, hidden states and last hidden states
      
      * Added support for return of hidden states, attentions and return dict of gpt2 model.
      
      * Updated tests to include many of the ModelTesterMixin tests. 
      
      The following tests are skipped: test_generate_without_input_ids, test_pruning, test_resize_embeddings, test_head_masking, test_attention_outputs, test_hidden_states_output, test_inputs_embeds, test_model_common_attributes
      
      * Added missing line to the end of gpt2 file
      
      * Added an integration test for the Decision Transformer
      
      Test performs an autoregressive evaluation for two time steps
      
      * Set done and info to _ to fix failing test
      
      * Updated integration test to be deterministic and check expected outputs
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Removed unnecessary config options
      
      * Cleaned up commented code and old comments.
      
      * Cleaned up commented code.
      
      * Changed DecisionTransformer to Decision Transformer
      
      * Added Decision Transformer to the main README file
      
      * Added copy of GPT2 called DecisionTransformerGPT2Model
      
      * isorted imports
      
      * isorted imports
      
      * Added model to non-English README files
      
      * Ran make fix-copies and corrected some cases.
      
      * Updated index file to include Decision Transformer
      
      * Added gpt2 model as copy inside the Decision Transformer model file
      
      * Added the unit test file to the list of TEST_FILES_WITH_NO_COMMON_TESTS
      
      * Deleted redundant checkpoint files (I don't know how these got committed)
      
      * Removed testing files. (These should have never been committed)
      
      * Removed accidentally committed files
      
      * Moved the Decision Transformer test to its own directory
      
      * Add type hints for Pegasus (#16324)
      
      * Funnel type hints (#16323)
      
      * add pt funnel type hints
      
      * add tf funnel type hints
      
      * Add type hints for ProphetNet PyTorch (#16272)
      
      * [GLPN] Improve docs (#16331)
      
      * Add link to notebook
      
      * Add link
      
      * Fix bug
      Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
      
      * Added type hints for Pytorch Marian calls (#16200)
      
      * Added type hinting for forward functions in pytorch marian
      
      * typo correction
      
      * Removed type hints on functions from BART per Suraj Patil request
      
      * fix import pb
      
      * fix typo
      
      * corrected tuple call
      
      * ran black
      
      * after fix-copies
      Some optional tags on primitives were removed, past_key_values in MarianForCausalLM changed from Tuple of Tuple to List
      
      * Fixing copies to roformer and pegasus
      Co-authored-by: Clementine Fourrier <cfourrie@inria.fr>
      Co-authored-by: matt <rocketknight1@gmail.com>
      
      * Moved DecisionTransformerOutput to modeling_decision_transformer
      
      * Moved the example usage to research project and cleaned comments
      
      * Made tests ignore the copy of gpt2 in Decision Transformer
      
      * Added module output to modelling decision transformer
      
      * removed copied gpt2 model from list of transformers models
      
      * Updated tests and created __init__ file for new test location
      
      * Update README.md
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/decision_transformer/configuration_decision_transformer.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Removed unneeded summary type from config file
      
      * Fixed copies
      
      * Updated pretrained config map to refer to hopper-medium checkpoint
      
      * done (#16340)
      
      * Added Decision transformer to model docs
      
      * Update src/transformers/models/decision_transformer/modeling_decision_transformer.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/decision_transformer/modeling_decision_transformer.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/decision_transformer/configuration_decision_transformer.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Add type annotations for Rembert/Splinter and copies (#16338)
      
      * undo black autoformat
      
      * minor fix to rembert forward with default
      
      * make fix-copies, make quality
      
      * Adding types to template model
      
      * Removing List from the template types
      
      * Remove `Optional` from a couple of types that don't accept `None`
      Co-authored-by: matt <rocketknight1@gmail.com>
      
      * [Bug template] Shift responsibilities for long-range (#16344)
      
      * Fix code repetition in serialization guide (#16346)
      
      * Adopt framework-specific blocks for content (#16342)
      
      * refactor code samples with framework-specific blocks
      
      * update training.mdx
      
      * 🖍 apply feedback
      
      * Updates the default branch from master to main (#16326)
      
      * Updates the default branch from master to main
      
      * Links from `master` to `main`
      
      * Typo
      
      * Update examples/flax/README.md
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Updated model with custom docstring example
      
      
      * Updated copies, config auto, and readme files.
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Dan Tegzes <48134725+Tegzes@users.noreply.github.com>
      Co-authored-by: Adam Montgomerie <adam@avanssion.com>
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
      Co-authored-by: Clémentine Fourrier <22726840+clefourrier@users.noreply.github.com>
      Co-authored-by: Clementine Fourrier <cfourrie@inria.fr>
      Co-authored-by: matt <rocketknight1@gmail.com>
      Co-authored-by: Francesco Saverio Zuppichini <francesco.zuppichini@gmail.com>
      Co-authored-by: Jacob Dineen <54680234+jacobdineen@users.noreply.github.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
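The Decision Transformer introduced in #15845 conditions each action on the return still to be achieved, the "returns-to-go". That quantity is just a (discounted) suffix sum of the episode's rewards; a minimal sketch, independent of the model code:

```python
def returns_to_go(rewards, gamma=1.0):
    """Suffix sums of rewards: the target return remaining at each
    timestep, which a Decision Transformer conditions its actions on."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        rtg[t] = running
    return rtg

print(returns_to_go([1.0, 0.0, 2.0]))  # [3.0, 2.0, 2.0]
```

At inference time the first entry is replaced by a desired target return rather than one computed from logged rewards.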
    • Updates the default branch from master to main (#16326) · eca77f47
      Lysandre Debut authored
      
      
      * Updates the default branch from master to main
      
      * Links from `master` to `main`
      
      * Typo
      
      * Update examples/flax/README.md
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
  10. 21 Mar, 2022 1 commit
  11. 17 Mar, 2022 1 commit
  12. 16 Mar, 2022 3 commits
  13. 15 Mar, 2022 1 commit
    • Add the XTREME-S fine-tuning example (#15985) · 99fd3eb4
      Anton Lozhkov authored
      * CTC+classification draft
      
      * CTC+classification draft
      
      * style
      
      * multilingual runs
      
      * Fix race condition during processor.from_pretrained
      
      * Merge covost experiments
      
      * Add README
      
      * Quality
      
      * Switch to .all configs
      
      * Fix typos
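The CTC branch of this example trains a model whose greedy decoding step collapses repeated ids and drops blank tokens before mapping ids back to characters. A minimal sketch of that collapse (the blank id of 0 is an assumption for illustration):

```python
def ctc_greedy_collapse(ids, blank=0):
    """Collapse consecutive repeats and remove blanks: the greedy
    decoding step applied after a CTC head's argmax."""
    out = []
    prev = None
    for i in ids:
        if i != prev and i != blank:
            out.append(i)
        prev = i
    return out

# A blank between two identical ids keeps them as separate symbols.
print(ctc_greedy_collapse([0, 3, 3, 0, 3, 5, 5, 0]))  # [3, 3, 5]
```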
  14. 12 Mar, 2022 1 commit
    • [Deepspeed] add support for bf16 mode (#14569) · 580dd87c
      Stas Bekman authored
      
      
      * [WIP] add support for bf16 mode
      
      * prep for bf16
      
      * prep for bf16
      
      * fix; zero2/bf16 is ok
      
      * check bf16 is available
      
      * test fixes
      
      * enable zero3_bf16
      
      * config files
      
      * docs
      
      * split stage_dtype; merge back to non-dtype-specific config file
      
      * fix doc
      
      * cleanup
      
      * cleanup
      
      * bfloat16 => bf16 to match the PR changes
      
      * s/zero_gather_fp16_weights_on_model_save/zero_gather_16bit_weights_on_model_save/; s/save_fp16_model/save_16bit_model/
      
      * test fixes/skipping
      
      * move
      
      * fix
      
      * Update docs/source/main_classes/deepspeed.mdx
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * backticks
      
      * cleanup
      
      * cleanup
      
      * cleanup
      
      * new version
      
      * add note about grad accum in bf16
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
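The bf16 mode added here is switched on in the DeepSpeed config file. A minimal fragment combining it with ZeRO stage 2 might look like the following sketch (values illustrative; `"auto"` entries are placeholders the HF Trainer integration fills in):

```json
{
  "bf16": {
    "enabled": true
  },
  "zero_optimization": {
    "stage": 2
  },
  "train_micro_batch_size_per_gpu": "auto"
}
```

Note the commit's rename: `zero_gather_fp16_weights_on_model_save` became the dtype-neutral `zero_gather_16bit_weights_on_model_save`, since the same option now covers both fp16 and bf16.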
  15. 10 Mar, 2022 1 commit
  16. 04 Mar, 2022 1 commit
  17. 02 Mar, 2022 1 commit
  18. 21 Feb, 2022 1 commit
  19. 15 Feb, 2022 1 commit
  20. 11 Feb, 2022 1 commit
  21. 09 Feb, 2022 1 commit
  22. 07 Feb, 2022 1 commit
  23. 31 Jan, 2022 2 commits
    • Jonatas Grosman
    • Add (M)Luke model training for Token Classification in the examples (#14880) · aa19f478
      Julien Plu authored
      * Add Luke training
      
      * Fix true label tags
      
      * Update the data collator for Luke
      
      * Some training refactor for Luke
      
      * Improve data collator for Luke
      
      * Fix import
      
      * Fix datasets concatenation
      
      * Add the --max_entity_length argument for Luke models
      
      * Remove unused code
      
      * Fix style issues
      
      * Fix style issues
      
      * Move the Luke training into a separate folder
      
      * Fix style
      
      * Fix naming
      
      * Fix filtering
      
      * Fix filtering
      
      * Fix filter
      
      * Update some preprocessing
      
      * Move luke to research_projects
      
      * Checkstyle
      
      * Address comments
      
      * Fix style
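The `--max_entity_length` argument bounds the per-example entity sequences that a LUKE-style data collator must truncate or pad to a fixed length. A hypothetical sketch of that step (the helper name and pad id of -1 are illustrative, not the example's actual API):

```python
def pad_entities(entity_ids, max_entity_length, pad_id=-1):
    """Truncate or right-pad each example's entity id list so every
    row in the batch has exactly max_entity_length entries."""
    batch = []
    for ids in entity_ids:
        ids = ids[:max_entity_length]
        batch.append(ids + [pad_id] * (max_entity_length - len(ids)))
    return batch

print(pad_entities([[4, 7], [1, 2, 3, 4, 5]], max_entity_length=3))
# [[4, 7, -1], [1, 2, 3]]
```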
  24. 27 Jan, 2022 4 commits
  25. 24 Jan, 2022 1 commit
  26. 21 Jan, 2022 2 commits
  27. 20 Jan, 2022 2 commits
  28. 19 Jan, 2022 3 commits