- 01 Sep, 2021 13 commits
-
-
Patrick von Platen authored
* fix_torch_device_generate_test * remove @ * up * correct some bugs * correct model * finish speech2text extension * up * up * up * up * Update utils/custom_init_isort.py * up * up * update with tokenizer * correct old tok * correct old tok * fix bug * up * up * add more tests * up * fix docs * up * fix some more tests * add better config * correct some more things * fix tests * improve docs * Apply suggestions from code review * Apply suggestions from code review * final fixes * finalize * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * apply suggestions Lysandre and Sylvain * apply nicos suggestions * upload everything * finish Co-authored-by:
Patrick von Platen <patrick@huggingface.co> Co-authored-by: your_github_username <your_github_email> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
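Assuming the "speech2text extension" above refers to the Speech2Text2 / SpeechEncoderDecoder addition, a minimal usage sketch could look like the following. The `facebook/s2t-wav2vec2-large-en-de` checkpoint name is an assumption, and silence is used as a stand-in for real 16 kHz audio:
```python
import numpy as np
from transformers import SpeechEncoderDecoderModel, Speech2Text2Processor

# Checkpoint name is an assumption; any speech encoder + Speech2Text2 decoder pair should work.
model = SpeechEncoderDecoderModel.from_pretrained("facebook/s2t-wav2vec2-large-en-de")
processor = Speech2Text2Processor.from_pretrained("facebook/s2t-wav2vec2-large-en-de")

speech = np.zeros(16_000, dtype=np.float32)  # one second of silence standing in for real audio
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

generated_ids = model.generate(inputs["input_values"], attention_mask=inputs["attention_mask"])
print(processor.batch_decode(generated_ids, skip_special_tokens=True))
```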
-
Lysandre Debut authored
-
Hamid Shojanazeri authored
Fix for the issue of device-id getting hardcoded for token_type_ids during Tracing for ConvBert (#12287) * added token_type_ids buffer to fix the issue #5664 * Handling the case that position_id buffer is not registered * added token_type_ids buffer to fix the issue #5664 * modified to support device conversion when the model is traced
-
Hamid Shojanazeri authored
Fix for the issue of device-id getting hardcoded for position-ids during Tracing for DistilBERT (#12290) * registered buffer for position-ids to address issues similar to issue #5664 * added comment * added the flag to prevent the buffer from being added to the state_dict
-
Hamid Shojanazeri authored
Fix for the issue of device-id getting hardcoded for position-ids during Tracing for Flaubert (#12292) * adding position_ids buffer to fix the issue similar to #5664 * adding position-id buffer to address similar issues to #5664
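The three tracing fixes above (ConvBert, DistilBERT, Flaubert) share one pattern: keep the default position/token-type ids in a non-persistent module buffer so that, after tracing, they follow the module's device instead of being baked in as constants. A minimal sketch of that pattern with a hypothetical embedding module (not the actual model code):
```python
import torch
from torch import nn


class EmbeddingsWithBufferedIds(nn.Module):
    """Hypothetical embedding module illustrating the buffer pattern used in the fixes above."""

    def __init__(self, vocab_size=30522, hidden_size=768, max_position_embeddings=512):
        super().__init__()
        self.word_embeddings = nn.Embedding(vocab_size, hidden_size)
        self.position_embeddings = nn.Embedding(max_position_embeddings, hidden_size)
        # Registered as a buffer so it moves with .to(device) and is not treated as a traced
        # constant; persistent=False keeps it out of the state_dict, as the DistilBERT fix notes.
        # The same idea applies to token_type_ids in the ConvBert fix.
        self.register_buffer(
            "position_ids", torch.arange(max_position_embeddings).expand((1, -1)), persistent=False
        )

    def forward(self, input_ids):
        seq_length = input_ids.size(1)
        position_ids = self.position_ids[:, :seq_length]  # already on the module's device
        return self.word_embeddings(input_ids) + self.position_embeddings(position_ids)
```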
-
Lysandre Debut authored
* Torchscript test for Flaubert * Update tests/test_modeling_flaubert.py * Update tests/test_modeling_flaubert.py
-
Lysandre Debut authored
* Torchscript test for ConvBERT * Apply suggestions from code review
-
Lysandre Debut authored
* Torchscript test for DistilBERT * Update tests/test_modeling_distilbert.py
-
Lysandre Debut authored
* Torchscript test * Remove print statement
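The TorchScript tests above boil down to tracing a model and checking that the traced graph still works, including after moving it to another device. A rough sketch along those lines, using DistilBERT and the library's `torchscript=True` flag; this is not the exact test code:
```python
import torch
from transformers import DistilBertModel, DistilBertTokenizerFast

tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")
model = DistilBertModel.from_pretrained("distilbert-base-uncased", torchscript=True)
model.eval()

inputs = tokenizer("TorchScript tracing sanity check", return_tensors="pt")
traced = torch.jit.trace(model, (inputs["input_ids"], inputs["attention_mask"]))

# With position ids kept in a buffer, the traced model can be moved to another device.
if torch.cuda.is_available():
    traced.to("cuda")
    outputs = traced(inputs["input_ids"].to("cuda"), inputs["attention_mask"].to("cuda"))
```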
-
Anton Lozhkov authored
* Add the audio classification pipeline * Remove autoconfig exception * Mark ffmpeg test as slow * Rearrange pipeline tests * Add small test * Replace asserts with ValueError
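A minimal usage sketch for the new pipeline; the checkpoint name is an assumption (any audio classification model should work), and decoding audio files requires ffmpeg:
```python
from transformers import pipeline

# "superb/wav2vec2-base-superb-ks" is assumed here as an example keyword-spotting checkpoint.
classifier = pipeline("audio-classification", model="superb/wav2vec2-base-superb-ks")

predictions = classifier("speech_sample.wav")  # also accepts a raw waveform as a numpy array
print(predictions)  # e.g. [{"label": "...", "score": ...}, ...]
```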
-
Patrick von Platen authored
-
Jonathan Chang authored
* Add option to add flax * Add flax template for __init__.py * Add flax template for .rst * Copy TF modeling template * Add a missing line in modeling_tf_... template * Update first half of modeling_flax_.. * Update encoder flax template * Copy test_modeling_tf... as test_modeling_flax... * Replace some TF to Flax in test_modeling_flax_... * Replace tf to np some function might not work, like _assert_tensors_equal * Replace remaining tf to np (might not work) * Fix cookiecutter * Add Flax in to_replace_... template * Update transformers-cli add-new-model * Save generate_flax in configuration.json This will be read by transformers-cli * Fix to_replace_... and cli * Fix replace cli * Fix cookiecutter name * Move docstring earlier to avoid not defined error * Fix a missing Module * Add encoder-decoder flax template from bart * Fix flax test * Make style * Fix endif * Fix replace all "utf-8 -> unp-8" * Update comment * Fix flax template (add missing ..._DOCSTRING) * Use flax_bart imports in template (was t5) * Fix unp * Update templates/adding_a_new_model/tests * Revert "Fix unp" This reverts commit dc9002a41d902c4f9b07343eab1cb350c8b7fd57. * Remove one line of copied from to suppress CI error * Use generate_tensorflow_pytorch_and_flax * Add a missing part * fix typo * fix flax config * add examples for flax * small rename * correct modeling imports * correct auto loading * corrects some flax tests * correct small typo * correct as type * finish modif * correct more templates * final fixes * add file testers * up * make sure tests match template regex * correct pytorch * correct tf * correct more tf * correct imports * minor error * minor error * correct init * more fixes * correct more flax tests * correct flax test * more fixes * correct docs * update * fix Co-authored-by:Patrick von Platen <patrick.v.platen@gmail.com>
-
Patrick von Platen authored
-
- 31 Aug, 2021 21 commits
-
-
Stella Biderman authored
* Test GPTJ implementation * Fixed conflicts * Update __init__.py * Update __init__.py * change GPT_J to GPTJ * fix missing imports and typos * use einops for now (need to change to torch ops later) * Use torch ops instead of einsum * remove einops deps * Update configuration_auto.py * Added GPT J * Update gptj.rst * Update __init__.py * Update test_modeling_gptj.py * Added GPT J * Changed configs to match GPT2 instead of GPT Neo * Removed non-existent sequence model * Update configuration_auto.py * Update configuration_auto.py * Update configuration_auto.py * Update modeling_gptj.py * Update modeling_gptj.py * Progress on updating configs to agree with GPT2 * Update modeling_gptj.py * num_layers -> n_layer * layer_norm_eps -> layer_norm_epsilon * attention_layers -> num_hidden_layers * Update modeling_gptj.py * attention_pdrop -> attn_pdrop * hidden_act -> activation_function * Update configuration_gptj.py * Update configuration_gptj.py * Update configuration_gptj.py * Update configuration_gptj.py * Update configuration_gptj.py * Update modeling_gptj.py * Update modeling_gptj.py * Update modeling_gptj.py * Update modeling_gptj.py * Update modeling_gptj.py * Update modeling_gptj.py * fix layernorm and lm_head size delete attn_type * Update docs/source/model_doc/gptj.rst Co-authored-by:
Suraj Patil <surajp815@gmail.com> * removed claim that GPT J uses local attention * Removed GPTJForSequenceClassification * Update src/transformers/models/gptj/configuration_gptj.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Removed unsupported boilerplate * Update tests/test_modeling_gptj.py Co-authored-by:
Suraj Patil <surajp815@gmail.com> * Update src/transformers/models/gptj/modeling_gptj.py Co-authored-by:
Suraj Patil <surajp815@gmail.com> * Update src/transformers/models/gptj/modeling_gptj.py Co-authored-by:
Suraj Patil <surajp815@gmail.com> * Update src/transformers/models/gptj/modeling_gptj.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update tests/test_modeling_gptj.py Co-authored-by:
Eric Hallahan <eric@hallahans.name> * Update tests/test_modeling_gptj.py Co-authored-by:
Eric Hallahan <eric@hallahans.name> * Update tests/test_modeling_gptj.py Co-authored-by:
Eric Hallahan <eric@hallahans.name> * Update src/transformers/models/gptj/modeling_gptj.py Co-authored-by:
Suraj Patil <surajp815@gmail.com> * Update __init__.py * Update configuration_gptj.py * Update modeling_gptj.py * Corrected indentation * Remove stray backslash * Delete .DS_Store * Delete .DS_Store * Delete .DS_Store * Delete .DS_Store * Delete .DS_Store * Update docs to match * Remove tf loading * Remove config.jax * Remove stray `else:` statement * Remove references to `load_tf_weights_in_gptj` * Adapt tests to match output from GPT-J 6B * Apply suggestions from code review Co-authored-by:
Suraj Patil <surajp815@gmail.com> * Default `activation_function` to `gelu_new` - Specify the approximate formulation of GELU to ensure parity with the default setting of `jax.nn.gelu()` * Fix part of the config documentation * Revert "Update configuration_auto.py" This reverts commit e9860e9c043b6ebf57a0e705044e9ec9ba2263bb. * Revert "Update configuration_auto.py" This reverts commit cfaaae4c4dc70f1fbe9abd60fc8bd0b863b8c011. * Revert "Update configuration_auto.py" This reverts commit 687788954fd0cfbc567fa1202d56a4ff9271944f. * Revert "Update configuration_auto.py" This reverts commit 194d024ea87d4fcef0dcb08e57f52c47511a9fc6. * Hyphenate GPT-J * Undid sorting of the models alphabetically * Reverting previous commit * fix style and quality issues * Update docs/source/model_doc/gptj.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/test_modeling_gptj.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/gptj/modeling_gptj.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/gptj/modeling_gptj.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/gptj/modeling_gptj.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/gptj/configuration_gptj.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/gptj/configuration_gptj.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/gptj/configuration_gptj.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/gptj/modeling_gptj.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/gptj/modeling_gptj.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/gptj/modeling_gptj.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/gptj/modeling_gptj.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/gptj/modeling_gptj.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Replaced GPTJ-specific code with generic code * Update src/transformers/models/gptj/modeling_gptj.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Made the code always use rotary positional encodings * Update index.rst * Fix documentation * Combine attention classes - Condense all attention operations into `GPTJAttention` - Replicate GPT-2 and improve code clarity by renaming `GPTJAttention.attn_pdrop` and `GPTJAttention.resid_pdrop` to `GPTJAttention.attn_dropout` and `GPTJAttention.resid_dropout` * Removed `config.rotary_dim` from tests * Update test_modeling_gptj.py * Update test_modeling_gptj.py * Fix formatting * Removed depreciated argument `layer_id` to `GPTJAttention` * Update modeling_gptj.py * Update modeling_gptj.py * Fix code quality * Restore model functionality * Save `lm_head.weight` in checkpoints * Fix crashes when loading with reduced precision * refactor self._attn(...)` and rename layer weights" * make sure logits are in fp32 for sampling * improve docs * Add `GPTJForCausalLM` to `TextGenerationPipeline` whitelist * Added GPT-J to the README * Fix doc/readme consistency * Add rough parallelization support - Remove unused imports and variables - Clean up docstrings - Port experimental parallelization code from GPT-2 into GPT-J * Clean up loose ends * Fix index.rst Co-authored-by:
kurumuz <kurumuz1@gmail.com> Co-authored-by:
Suraj Patil <surajp815@gmail.com> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Eric Hallahan <eric@hallahans.name> Co-authored-by:
Leo Gao <54557097+leogao2@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: your_github_username <your_github_email> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com>
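A short generation sketch for the new model, assuming the public `EleutherAI/gpt-j-6B` checkpoint and a GPU with enough memory (the model also runs on CPU in float32, just slowly):
```python
import torch
from transformers import AutoTokenizer, GPTJForCausalLM

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = GPTJForCausalLM.from_pretrained("EleutherAI/gpt-j-6B", torch_dtype=torch.float16).to("cuda")

inputs = tokenizer("The Eiffel Tower is located in", return_tensors="pt").to("cuda")
# Sampling works in half precision because logits are upcast to fp32, per the commit above.
generated = model.generate(**inputs, do_sample=True, max_new_tokens=20)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```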
-
Lysandre authored
-
Lysandre authored
-
Lysandre authored
-
Lysandre authored
-
Patrick von Platen authored
* correct * also comment out multi-gpu test push
-
Sylvain Gugger authored
* Add generate kwargs to Seq2SeqTrainingArguments * typo * Address review comments + doc * Style
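A sketch of the new arguments; `generation_max_length` and `generation_num_beams` are assumptions for the generate kwargs mentioned above, and they only take effect when `predict_with_generate=True`:
```python
from transformers import Seq2SeqTrainingArguments

# Hypothetical values; these kwargs are forwarded to model.generate() during evaluation/prediction.
training_args = Seq2SeqTrainingArguments(
    output_dir="out",
    predict_with_generate=True,
    generation_max_length=128,
    generation_num_beams=4,
)
```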
-
Matt authored
-
Lysandre authored
-
Matt authored
* Adding a TF variant of the DataCollatorForTokenClassification to get feedback * Added a Numpy variant and a post_init check to fail early if a missing import is found * Fixed call to Numpy variant * Added a couple more of the collators * Update src/transformers/data/data_collator.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Fixes, style pass, finished DataCollatorForSeqToSeq * Added all the LanguageModeling DataCollators, except SOP and PermutationLanguageModeling * Adding DataCollatorForPermutationLanguageModeling * Style pass * Add missing `__call__` for PLM * Remove `post_init` checks for frameworks because the imports inside them were making us fail code quality checks * Remove unused imports * First attempt at some TF tests * A second attempt to make any of those tests actually work * TF tests, round three * TF tests, round four * TF tests, round five * TF tests, all enabled! * Style pass * Merging tests into `test_data_collator.py` * Merging tests into `test_data_collator.py` * Fixing up test imports * Fixing up test imports * Trying shuffling the conditionals around * Commenting out non-functional old tests * Completed all tests for all three frameworks * Style pass * Fixed test typo * Style pass * Move standard `__call__` method to mixin * Rearranged imports for `test_data_collator` * Fix data collator typo "torch" -> "pt" * Fixed the most embarrassingly obvious bug * Update src/transformers/data/data_collator.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Renaming mixin * Updating docs Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Dalton Walker <dalton_walker@icloud.com> Co-authored-by:
Andrew Romans <andrew.romans@hotmail.com>
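A usage sketch of the multi-framework collators, assuming the framework is selected with a `return_tensors` argument taking "pt", "tf" or "np" (consistent with the "torch" -> "pt" typo fix above):
```python
from transformers import AutoTokenizer, DataCollatorForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
collator = DataCollatorForTokenClassification(tokenizer, return_tensors="tf")  # or "pt" / "np"

features = [
    {"input_ids": [101, 7592, 2088, 102], "labels": [0, 1, 1, 0]},
    {"input_ids": [101, 102], "labels": [0, 0]},
]
batch = collator(features)  # padded tf.Tensor batch; labels are padded with -100 so the loss ignores them
```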
-
Sylvain Gugger authored
-
Jongheon Kim authored
Set missing seq_length variable when using inputs_embeds with ALBERT & Remove code duplication (#13152) * Set seq_length variable when using inputs_embeds * remove code duplication
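A small sketch of the code path being fixed: calling ALBERT with `inputs_embeds` instead of `input_ids`, where the sequence length now has to be derived from the embeddings themselves:
```python
from transformers import AlbertModel, AlbertTokenizerFast

tokenizer = AlbertTokenizerFast.from_pretrained("albert-base-v2")
model = AlbertModel.from_pretrained("albert-base-v2")

input_ids = tokenizer("seq_length must also be derived from inputs_embeds", return_tensors="pt").input_ids
inputs_embeds = model.get_input_embeddings()(input_ids)

outputs = model(inputs_embeds=inputs_embeds)  # previously tripped over the unset seq_length variable
```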
-
Jake Tae authored
`at` should be `a1`
-
Stas Bekman authored
fix a few implementation links
-
Sylvain Gugger authored
-
Kamal Raj authored
* Deberta_v2 tf * added new line at the end of file, make style * +V2, typo * remove never executed branch of code * remove comment and fix typo in url filter * cleanup according to review comments * added # Copied from
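A minimal sketch for the new TensorFlow port, assuming the `microsoft/deberta-v2-xlarge` checkpoint (add `from_pt=True` if only PyTorch weights are available for a given checkpoint):
```python
from transformers import AutoTokenizer, TFDebertaV2Model

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v2-xlarge")
model = TFDebertaV2Model.from_pretrained("microsoft/deberta-v2-xlarge")

inputs = tokenizer("DeBERTa-v2 now has a TensorFlow implementation", return_tensors="tf")
outputs = model(inputs)
print(outputs.last_hidden_state.shape)
```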
-
Apoorv Garg authored
-
tucan9389 authored
* Add GPT2ForTokenClassification * Fix dropout exception for GPT2 NER * Remove sequence label in test * Change TokenClassifierOutput to TokenClassifierOutputWithPast * Fix for black formatter * Remove dummy * Update docs for GPT2ForTokenClassification * Fix check_inits ci fail * Update dummy_pt_objects after make fix-copies * Remove TokenClassifierOutputWithPast * Fix tuple input issue Co-authored-by: danielsejong55@gmail.com <danielsejong55@gmail.com>
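A usage sketch for the new head; the base checkpoint and label count are placeholders, and the classification head is randomly initialised until fine-tuned:
```python
from transformers import GPT2TokenizerFast, GPT2ForTokenClassification

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2ForTokenClassification.from_pretrained("gpt2", num_labels=5)

inputs = tokenizer("Hugging Face is based in New York City", return_tensors="pt")
logits = model(**inputs).logits       # (batch_size, sequence_length, num_labels)
predictions = logits.argmax(dim=-1)
```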
-
Serhiy-Shekhovtsov authored
-
Patrick von Platen authored
* up * finish * Apply suggestions from code review * apply Lysandres suggestions * adapt circle ci as well * finish * Update setup.py
-
Sylvain Gugger authored
* Incorporate tests dependencies in tests_fetcher * Harder modif * Debug * Loop through all files * Last modules * Remove debug statement
-
- 30 Aug, 2021 6 commits
-
-
Olatunji Ruwase authored
* Use DS callable API to allow hf_scheduler + ds_optimizer * Preserve backward-compatibility * Restore backward compatibility * Tweak arg positioning * Tweak arg positioning * bump the required version * Undo indent * Update src/transformers/trainer.py * style Co-authored-by:
Stas Bekman <stas@stason.org> Co-authored-by:
Stas Bekman <stas00@users.noreply.github.com>
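The change above lets an HF-created scheduler wrap a DeepSpeed-configured optimizer. A hedged sketch of what such a setup might look like; the exact config keys are assumptions based on the usual DeepSpeed integration ("auto" values are filled in from the Trainer arguments):
```python
from transformers import TrainingArguments

# The DeepSpeed config defines the optimizer but no scheduler, so the Trainer's (HF)
# scheduler is created on top of the DeepSpeed optimizer via the callable API.
ds_config = {
    "zero_optimization": {"stage": 2},
    "optimizer": {"type": "AdamW", "params": {"lr": "auto", "weight_decay": "auto"}},
    # no "scheduler" section here on purpose
}

training_args = TrainingArguments(output_dir="out", learning_rate=3e-5, deepspeed=ds_config)
```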
-
Laura Hanu authored
* added missing __spec__ to _LazyModule * test __spec__ is not None after module import * changed module_spec arg to be optional in _LazyModule * fix style issue * added module spec test to test_file_utils
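A tiny check of the behaviour being fixed, roughly what the added test verifies: the lazily-initialised `transformers` module should still expose a module spec after import:
```python
import importlib.util

import transformers

assert transformers.__spec__ is not None          # was None for _LazyModule before the fix
assert importlib.util.find_spec("transformers") is not None
```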
-
Sylvain Gugger authored
* Fix release utils * Update docs/source/conf.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
Sylvain Gugger authored
* Fix AutoTokenizer when a tokenizer has no fast version * Add test
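A sketch of the fixed behaviour; Transformer-XL is assumed here as an example of a model whose tokenizer has no fast (Rust) version, so `AutoTokenizer` falls back to the slow implementation instead of failing:
```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("transfo-xl-wt103")
print(tokenizer.is_fast)  # False: the slow tokenizer is returned because no fast version exists
```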
-
Li-Huai (Allan) Lin authored
* Correct outdated function signatures on website. * Upgrade sphinx to 3.5.4 (latest 3.x) * Test * Test * Test * Test * Test * Test * Revert unnecessary changes. * Change sphinx version to 3.5.4 * Test python 3.7.11
-
Kamal Raj authored
* albert flax * year -> 2021 * docstring updated for flax * removed head_mask * removed from_pt * removed passing attention_mask to embedding layer
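A minimal sketch for the new Flax port (the commit's "removed from_pt" suggests native Flax weights are available for `albert-base-v2`; otherwise pass `from_pt=True`):
```python
from transformers import AlbertTokenizerFast, FlaxAlbertModel

tokenizer = AlbertTokenizerFast.from_pretrained("albert-base-v2")
model = FlaxAlbertModel.from_pretrained("albert-base-v2")

inputs = tokenizer("ALBERT now has a Flax implementation", return_tensors="np")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)
```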
-