- 12 Jul, 2021 1 commit
-
-
Lysandre Debut authored
* Cleanup test * Skip TF TransfoXL test
-
- 14 Jun, 2021 1 commit
-
-
Stas Bekman authored
* consistent nn. and nn.functional: p3 templates * restore
-
- 05 May, 2021 1 commit
-
-
Patrick von Platen authored
* lazy_init_weights * remove ipdb * save int * add necessary code * remove unnecessary utils * Update src/transformers/models/t5/modeling_t5.py * clean * add tests * correct * finish tests * finish tests * fix some more tests * fix xlnet & transfo-xl * fix more tests * make sure tests are independent * fix tests more * finist tests * final touches * Update src/transformers/modeling_utils.py * Apply suggestions from code review * Update src/transformers/modeling_utils.py Co-authored-by:
Stas Bekman <stas00@users.noreply.github.com> * Update src/transformers/modeling_utils.py Co-authored-by:
Stas Bekman <stas00@users.noreply.github.com> * clean tests * give arg positive name * add more mock weights to xlnet Co-authored-by:
Stas Bekman <stas00@users.noreply.github.com>
-
- 06 Jan, 2021 1 commit
-
-
Simon Brandeis authored
* Define new output dataclasses for greedy generation * Add output_[...] flags in greedy generation methods Added output_attentions, output_hidden_states, output_scores flags in generate and greedy_search methods in GenerationMixin. * [WIP] Implement logic and tests for output flags in generation * Update GreedySearchOutput classes & docstring * Implement greedy search output accumulation logic Update greedy_search unittests Fix generate method return value docstring Properly init flags with the default config * Update configuration to add output_scores flag * Fix test_generation_utils Sort imports and fix isinstance tests for GreedySearchOutputs * Fix typo in generation_utils * Add return_dict_in_generate for backwards compatibility * Add return_dict_in_generate flag in config * Fix tyPo in configuration * Fix handling of attentions and hidden_states flags * Make style & quality * first attempt attentions * some corrections * improve tests * special models requires special test * disable xlm test for now * clean tests * fix for tf * isort * Add output dataclasses for other generation methods * Add logic to return dict in sample generation * Complete test for sample generation - Pass output_attentions and output_hidden_states flags to encoder in encoder-decoder models - Fix import satements order in test_generation_utils file * Add logic to return dict in sample generation - Refactor tests to avoid using self.assertTrue, which provides scarce information when the test fails - Add tests for the three beam_search methods: vanilla, sample and grouped * Style doc * Fix copy-paste error in generation tests * Rename logits to scores and refactor * Refactor group_beam_search for consistency * make style * add sequences_scores * fix all tests * add docs * fix beam search finalize test * correct docstring * clean some files * Made suggested changes to the documentation * Style doc ? * Style doc using the Python util * Update src/transformers/generation_utils.py * fix empty lines * fix all test Co-authored-by:Patrick von Platen <patrick.v.platen@gmail.com>
-
- 07 Dec, 2020 1 commit
-
-
Sylvain Gugger authored
* Add copyright everywhere missing * Style
-
- 02 Dec, 2020 2 commits
-
-
Patrick von Platen authored
* fix resize tokens * correct mobile_bert * move embedding fix into modeling_utils.py * refactor * fix lm head resize * refactor * break lines to make sylvain happy * add news tests * fix typo * improve test * skip bart-like for now * check if base_model = get(...) is necessary * clean files * improve test * fix tests * revert style templates * Update templates/adding_a_new_model/cookiecutter-template-{{cookiecutter.modelname}}/modeling_{{cookiecutter.lowercase_modelname}}.py -
sandip authored
* Transfoxl sequence classification * Transfoxl sequence classification
-
- 25 Nov, 2020 1 commit
-
-
Joe Davison authored
* bart output hidden states upstream * same w/ decoder * add tests * fix prophetnet * fix gpt2 and ctrl * fix fstm and skip test for reformer and longformer * fix all models Co-authored-by:Patrick von Platen <patrick.v.platen@gmail.com>
-
- 17 Nov, 2020 1 commit
-
-
Sylvain Gugger authored
* Put models in subfolders * Styling * Fix imports in tests * More fixes in test imports * Sneaky hidden imports * Fix imports in doc files * More sneaky imports * Finish fixing tests * Fix examples * Fix path for copies * More fixes for examples * Fix dummy files * More fixes for example * More model import fixes * Is this why you're unhappy GitHub? * Fix imports in conver command
-
- 16 Nov, 2020 1 commit
-
-
Sylvain Gugger authored
* Use the CI to identify failing tests * Remove from all examples and tests * More default switch * Fixes * More test fixes * More fixes * Last fixes hopefully * Use the CI to identify failing tests * Remove from all examples and tests * More default switch * Fixes * More test fixes * More fixes * Last fixes hopefully * Run on the real suite * Fix slow tests
-
- 10 Nov, 2020 1 commit
-
-
Stas Bekman authored
* s|multiple_gpu|multi_gpu|g; s|multigpu|multi_gpu|g' * doc
-
- 09 Nov, 2020 1 commit
-
-
Patrick von Platen authored
* add training tests * correct longformer * fix docs * fix some tests * fix some more train tests * remove ipdb * fix multiple edge case model training * fix funnel and prophetnet * clean gpt models * undo renaming of albert
-
- 03 Nov, 2020 1 commit
-
-
Patrick von Platen authored
* first draft * show design proposition for new generate method * up * make better readable * make first version * gpt2 tests pass * make beam search for gpt2 work * add first encoder-decoder code * delete typo * make t5 work * save indermediate * make bart work with beam search * finish beam search bart / t5 * add default kwargs * make more tests pass * fix no bad words sampler * some fixes and tests for all distribution processors * fix test * fix rag slow tests * merge to master * add nograd to generate * make all slow tests pass * speed up generate * fix edge case bug * small fix * correct typo * add type hints and docstrings * fix typos in tests * add beam search tests * add tests for beam scorer * fix test rag * finish beam search tests * move generation tests in seperate file * fix generation tests * more tests * add aggressive generation tests * fix tests * add gpt2 sample test * add more docstring * add more docs * finish doc strings * apply some more of sylvains and sams comments * fix some typos * make fix copies * apply lysandres and sylvains comments * final corrections on examples * small fix for reformer
-
- 30 Oct, 2020 1 commit
-
-
Lysandre Debut authored
* Test TF GPU CI * Change cache * Fix missing torch requirement * Fix some model tests Style * LXMERT * MobileBERT * Longformer skip test * XLNet * The rest of the tests * RAG goes OOM in multi gpu setup * YAML test files * Last fixes * Skip doctests * Fill mask tests * Yaml files * Last test fix * Style * Update cache * Change ONNX tests to slow + use tiny model
-
- 20 Oct, 2020 1 commit
-
-
Stas Bekman authored
* rename skip targets + docs * fix quotes * style * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * small improvements * fix Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 26 Aug, 2020 1 commit
-
-
Lysandre authored
-
- 24 Aug, 2020 1 commit
-
-
Sylvain Gugger authored
* Run new isort * More changes * Update CI, CONTRIBUTING and benchmarks
-
- 13 Aug, 2020 1 commit
-
-
Stas Bekman authored
* cleanup torch unittests: part 2 * remove trailing comma added by isort, and which breaks flake * one more comma * revert odd balls * part 3: odd cases * more ["key"] -> .key refactoring * .numpy() is not needed * more unncessary .numpy() removed * more simplification
-
- 31 Jul, 2020 1 commit
-
-
Sylvain Gugger authored
* Use return_dict=True in all tests * Formatting
-
- 02 Jul, 2020 1 commit
-
-
Teven authored
* Changed expected_output_ids in TransfoXL generation test to match #4826 generation PR. * making black happy * making isort happy
-
- 01 Jul, 2020 1 commit
-
-
Sam Shleifer authored
-
- 16 Jun, 2020 1 commit
-
-
Amil Khare authored
Co-authored-by:Sam Shleifer <sshleifer@gmail.com>
-
- 10 Jun, 2020 1 commit
-
-
RafaelWO authored
* Fixed resize_token_embeddings for transfo_xl model * Fixed resize_token_embeddings for transfo_xl. Added custom methods to TransfoXLPreTrainedModel for resizing layers of the AdaptiveEmbedding. * Updated docstring * Fixed resizinhg cutoffs; added check for new size of embedding layer. * Added test for resize_token_embeddings * Fixed code quality * Fixed unchanged cutoffs in model.config Co-authored-by:Rafael Weingartner <rweingartner.its-b2015@fh-salzburg.ac.at>
-
- 02 Jun, 2020 1 commit
-
-
Julien Chaumond authored
* Kill model archive maps * Fixup * Also kill model_archive_map for MaskedBertPreTrainedModel * Unhook config_archive_map * Tokenizers: align with model id changes * make style && make quality * Fix CI
-
- 19 May, 2020 2 commits
-
-
Patrick von Platen authored
* fix gpu slow tests in pytorch * change model to device syntax
-
Julien Chaumond authored
* Test case for #3936 * multigpu tests pass on pytorch 1.4.0 * Fixup * multigpu tests pass on pytorch 1.5.0 * Update src/transformers/modeling_utils.py * Update src/transformers/modeling_utils.py * rename multigpu to require_multigpu * mode doc
-
- 01 May, 2020 1 commit
-
-
Julien Chaumond authored
There's an inconsistency right now where: - we load some models into CACHE_DIR - and some models in the default cache - and often, in both for the same models When running the RUN_SLOW tests, this takes a lot of disk space, time, and bandwidth. I'd rather always use the default cache
-
- 13 Apr, 2020 1 commit
-
-
Teven authored
* Shifting labels inside TransfoXLLMHead * Changed doc to reflect change * Updated pytorch test * removed IDE whitespace changes * black reformat Co-authored-by:TevenLeScao <teven.lescao@gmail.com>
-
- 20 Mar, 2020 1 commit
-
-
Patrick von Platen authored
* make style * fix conflicts
-
- 18 Mar, 2020 1 commit
-
-
Patrick von Platen authored
Adding LM Head to Transfo-XL and first step to fixing problem with Adaptive Embeddings in TransfoXL (#3286) * first commit * work in progress * make language generation task pass * update to working version for LM * delete print * remove dead code * make style
-
- 08 Mar, 2020 2 commits
-
-
patrickvonplaten authored
-
patrickvonplaten authored
-
- 25 Feb, 2020 2 commits
-
-
Patrick von Platen authored
-
Patrick von Platen authored
-
- 24 Feb, 2020 1 commit
-
-
Patrick von Platen authored
* add slow generate lm_model tests * fix conflicts * merge conflicts * fix conflicts * add slow generate lm_model tests * make style * delete unused variable * fix conflicts * fix conflicts * fix conflicts * delete unused variable * fix conflicts * finished hard coded tests
-
- 21 Feb, 2020 1 commit
-
-
Patrick von Platen authored
* improving generation * finalized special token behaviour for no_beam_search generation * solved modeling_utils merge conflict * solve merge conflicts in modeling_utils.py * add run_generation improvements from PR #2749 * adapted language generation to not use hardcoded -1 if no padding token is available * remove the -1 removal as hard coded -1`s are not necessary anymore * add lightweight language generation testing for randomely initialized models - just checking whether no errors are thrown * add slow language generation tests for pretrained models using hardcoded output with pytorch seed * delete ipdb * check that all generated tokens are valid * renaming * renaming Generation -> Generate * make style * updated so that generate_beam_search has same token behavior than generate_no_beam_search * consistent return format for run_generation.py * deleted pretrain lm generate tests -> will be added in another PR * cleaning of unused if statements and renaming * run_generate will always return an iterable * make style * consistent renaming * improve naming, make sure generate function always returns the same tensor, add docstring * add slow tests for all lmhead models * make style and improve example comments modeling_utils * better naming and refactoring in modeling_utils * improving generation * finalized special token behaviour for no_beam_search generation * solved modeling_utils merge conflict * solve merge conflicts in modeling_utils.py * add run_generation improvements from PR #2749 * adapted language generation to not use hardcoded -1 if no padding token is available * remove the -1 removal as hard coded -1`s are not necessary anymore * add lightweight language generation testing for randomely initialized models - just checking whether no errors are thrown * add slow language generation tests for pretrained models using hardcoded output with pytorch seed * delete ipdb * check that all generated tokens are valid * renaming * renaming Generation -> Generate * make style * updated so that generate_beam_search has same token behavior than generate_no_beam_search * consistent return format for run_generation.py * deleted pretrain lm generate tests -> will be added in another PR * cleaning of unused if statements and renaming * run_generate will always return an iterable * make style * consistent renaming * improve naming, make sure generate function always returns the same tensor, add docstring * add slow tests for all lmhead models * make style and improve example comments modeling_utils * better naming and refactoring in modeling_utils * changed fast random lm generation testing design to more general one * delete in old testing design in gpt2 * correct old variable name * temporary fix for encoder_decoder lm generation tests - has to be updated when t5 is fixed * adapted all fast random generate tests to new design * better warning description in modeling_utils * better comment * better comment and error message Co-authored-by:Thomas Wolf <thomwolf@users.noreply.github.com>
-
- 06 Jan, 2020 2 commits
-
-
alberduris authored
-
alberduris authored
-
- 22 Dec, 2019 2 commits
-
-
Aymeric Augustin authored
-
Aymeric Augustin authored
I suspect the wrapper classes were created in order to prevent the abstract base class (TF)CommonModelTester from being included in test discovery and running, because that would fail. I solved this by replacing the abstract base class with a mixin. Code changes are just de-indenting and automatic reformattings performed by black to use the extra line space.
-