- 08 Dec, 2021 1 commit
-
-
NielsRogge authored
* First draft * Style and remove mlm * Make forward pass work * More improvements * More improvements * Fix bug * More improvements * More improvements * Add PerceiverTokenizer first draft * Improve conversion script * More improvements * Make conversion script work for the encoder * Make conversion script work with local pickle files * Style & quality, fix-copies * Add dummy input to conversion script * Add absolute position embeddings to TextPreProcessor * Make forward pass of encoder work * More improvements * Move text preprocessor to separate script * More improvements * More improvements * Add post processor * Make MLM model work * Style * Add PerceiverForMaskedLM * Add PerceiverImagePreprocessor * Make style * Make PerceiverForImageClassification work * More improvements * More improvements * Use tokenizer in conversion script * Use PerceiverForMaskedLM in conversion script * Define custom PerceiverModelOutput * Improve PerceiverAttention to make it work for both MLM and image classification * More improvements * More improvements * More improvements to the conversion script * Make conversion script work for both MLM and image classification * Add PerceiverFeatureExtractor * More improvements * Style and quality * Add center cropping * Fix bug * Small fix * Add print statement * Fix bug in image preprocessor * Fix bug with conversion script * Make output position embeddings an nn.Parameter layer instead of nn.Embedding * Comment out print statements * Add position encoding classes * More improvements * Use position_encoding_kwargs * Add PerceiverForImageClassificationFourier * Make style & quality * Add PerceiverForImageClassificationConvProcessing * Style & quality * Add flow model * Move processors to modeling file * Make position encodings modular * Make basic decoder use modular position encodings * Add PerceiverForOpticalFlow to conversion script * Add AudioPreprocessor * Make it possible for the basic decoder to use Fourier position embeddings * Add PerceiverForMultimodalAutoencoding * Improve model for optical flow * Improve _build_network_inputs method * Add print statement * Fix device issue * Fix device of Fourier embeddings * Add print statements for debugging * Add another print statement * Add another print statement * Add another print statement * Add another print statement * Improve PerceiverAudioPreprocessor * Improve conversion script for multimodal modal * More improvements * More improvements * Improve multimodal model * Make forward pass multimodal model work * More improvements * Improve tests * Fix some more tests * Add output dataclasses * Make more tests pass * Add print statements for debuggin * Add tests for image classification * Add PerceiverClassifierOutput * More improvements * Make more tests pass for the optical flow model * Make style & quality * Small improvements * Don't support training for optical flow model for now * Fix _prepare_for_class for tests * Make more tests pass, add some docs * Add multimodal model to tests * Minor fixes * Fix tests * Improve conversion script * Make fixup * Remove pos_dim argument * Fix device issue * Potential fix for OOM * Revert previous commit * Fix test_initialization * Add print statements for debugging * Fix print statement * Add print statement * Add print statement * Add print statement * Add print statement * Add print statement * Add print statement * Remove need for output_shape * Comment out output_shape * Remove unnecessary code * Improve docs * Fix make fixup * Remove PerceiverTextProcessor from init 
* Improve docs * Small improvement * Apply first batch of suggestions from code review * Apply more suggestions from code review * Update docstrings * Define dicts beforehand for readability * Rename task to architecture in conversion script, include PerceiverModel in tests * Add print statements for debugging * Fix tests on GPU * Remove preprocessors, postprocessors and decoders from main init * Add integration test * Fix docs * Replace einops by torch * Update for new docs frontend * Rename PerceiverForImageClassification * Improve docs * Improve docs * Improve docs of PerceiverModel * Fix some more tests * Improve center_crop * Add PerceiverForSequenceClassification * Small improvements * Fix tests * Add integration test for optical flow model * Clean up * Add tests for tokenizer * Fix tokenizer by adding special tokens properly * Fix CI
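A minimal usage sketch of the text (MLM) path this commit adds, using the PerceiverTokenizer and PerceiverForMaskedLM classes named above; the checkpoint id and example input are assumptions, not taken from the commit:

```python
# Hedged sketch of the Perceiver MLM path; the checkpoint name is assumed.
from transformers import PerceiverTokenizer, PerceiverForMaskedLM

tokenizer = PerceiverTokenizer.from_pretrained("deepmind/language-perceiver")
model = PerceiverForMaskedLM.from_pretrained("deepmind/language-perceiver")

# The Perceiver tokenizer operates on raw UTF-8 bytes, so there is no subword vocabulary.
inputs = tokenizer("The Perceiver works on text, images, audio and more.", return_tensors="pt")
outputs = model(inputs=inputs["input_ids"], attention_mask=inputs["attention_mask"])
print(outputs.logits.shape)  # (batch, sequence_length, vocab_size)
```

The same PerceiverModel backbone is reused by the image-classification, optical-flow and multimodal autoencoding heads listed above; only the pre- and post-processors change.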
-
- 01 Dec, 2021 2 commits
-
-
Sylvain Gugger authored
* Convert PretrainedConfig doc to Markdown * Use syntax * Add necessary doc files (#14496) * Doc fixes (#14499) * Fixes for the new front * Convert DETR file for table * Title is needed * Simplify a bit * Even simpler * Remove imports * Fix typo in toctree (#14516) * Fix checkpoints badge * Update versions.yml format (#14517) * Doc new front github actions (#14512) * Doc new front github actions * Fix docstring * Fix feature extraction utils import (#14515) * Address Julien's comments * Push to doc-builder * Ready for merge * Remove old build and deploy * Doc misc fixes (#14583) * Rm versions.yml from doc * Fix converting.rst * Rm pretrained_models from toctree * Fix index links (#14567) * Fix links in README * Localized READMEs * Fix copy script * Fix find doc script * Update README_ko.md Co-authored-by:
Julien Chaumond <julien@huggingface.co> Co-authored-by:
Julien Chaumond <julien@huggingface.co> * Adapt build command to new CLI tools (#14578) * Fix typo * Fix doc interlinks (#14589) * Convert PretrainedConfig doc to Markdown * Use syntax * Rm pattern <[a-z]+(.html).*> * Rm huggingface.co/transformers/master * Rm .html * Rm .html from index.mdx * Rm .html from model_summary.rst * Update index.mdx rm html * Update remove .html * Fix inner doc links * Fix interlink in preprocssing.rst * Update pr_checks Co-authored-by:
Sylvain Gugger <sylvain.gugger@gmail.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Convert PretrainedConfig doc to Markdown * Use syntax * Add necessary doc files (#14496) * Doc fixes (#14499) * Fixes for the new front * Convert DETR file for table * Title is needed * Simplify a bit * Even simpler * Remove imports * Fix checkpoints badge * Fix typo in toctree (#14516) * Update versions.yml format (#14517) * Doc new front github actions (#14512) * Doc new front github actions * Fix docstring * Fix feature extraction utils import (#14515) * Address Julien's comments * Push to doc-builder * Ready for merge * Remove old build and deploy * Doc misc fixes (#14583) * Rm versions.yml from doc * Fix converting.rst * Rm pretrained_models from toctree * Fix index links (#14567) * Fix links in README * Localized READMEs * Fix copy script * Fix find doc script * Update README_ko.md Co-authored-by:
Julien Chaumond <julien@huggingface.co> Co-authored-by:
Julien Chaumond <julien@huggingface.co> * Adapt build command to new CLI tools (#14578) * Fix typo * Fix doc interlinks (#14589) * Convert PretrainedConfig doc to Markdown * Use syntax * Rm pattern <[a-z]+(.html).*> * Rm huggingface.co/transformers/master * Rm .html * Rm .html from index.mdx * Rm .html from model_summary.rst * Update index.mdx rm html * Update remove .html * Fix inner doc links * Fix interlink in preprocssing.rst * Update pr_checks Co-authored-by:
Sylvain Gugger <sylvain.gugger@gmail.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Styling Co-authored-by:
Mishig Davaadorj <mishig.davaadorj@coloradocollege.edu> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Julien Chaumond <julien@huggingface.co>
-
Suraj Patil authored
* add flax gptj * no bias in attention dense * no wpe * fix rotary embeddings * fix rotary embeds * fix rotary embeds * quality * doc and quality * fix equivalence tests
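A minimal forward-pass sketch of the Flax class this commit adds; the checkpoint id is illustrative, and `from_pt=True` may be needed if the repository only ships PyTorch weights:

```python
# Hedged sketch of FlaxGPTJForCausalLM; the checkpoint id is assumed.
from transformers import AutoTokenizer, FlaxGPTJForCausalLM

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = FlaxGPTJForCausalLM.from_pretrained("EleutherAI/gpt-j-6B")  # add from_pt=True if no Flax weights

inputs = tokenizer("Hello, my name is", return_tensors="np")
outputs = model(**inputs)
print(outputs.logits.shape)  # (batch, sequence_length, vocab_size)
```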
-
- 30 Nov, 2021 3 commits
-
-
Suraj Patil authored
* init vision_text_dual_encoder * fix merge * remove extra heads * fix tests * remove VISION_TEXT_DUAL_ENCODER_PRETRAINED_CONFIG_ARCHIVE_MAP * remove archive map * fix imports * fix more imports * fix init * delete tokenizers * fix imports * clean * support clip's vision model * handle None config * begin tests * more test and few fixes * warn about newly init weights * more tests * add loss to model * remove extra classes from doc * add processor * doc and small fixes * add start docstr * update flax model * flax tests * more flax tests * doc * quality * doc and quality * fix doc * doc * remove comments * update warning * quality * fix docs * Apply suggestions from code review Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * replace asserts, fix imports * update imports * fix import * address some review comments * fix check * reduce tolerance * fix test * add flax integration test * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * address Sylvain's comments * fix style * add pt_flax_equivalence test in PT tests * add pt integration test * update test * use pre-trained checkpoint in examples Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
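A minimal sketch of the dual-encoder composition this commit enables; both backbone checkpoints below are illustrative choices, not taken from the commit:

```python
# Hedged sketch: pair any pretrained vision encoder with any text encoder, CLIP-style.
from transformers import (
    AutoFeatureExtractor,
    AutoTokenizer,
    VisionTextDualEncoderModel,
    VisionTextDualEncoderProcessor,
)

model = VisionTextDualEncoderModel.from_vision_text_pretrained(
    "openai/clip-vit-base-patch32", "roberta-base"  # vision backbone, text backbone (assumed)
)
processor = VisionTextDualEncoderProcessor(
    AutoFeatureExtractor.from_pretrained("openai/clip-vit-base-patch32"),
    AutoTokenizer.from_pretrained("roberta-base"),
)
# The projection layers are newly initialised (hence the "warn about newly init weights" item
# above), so the model is meant to be fine-tuned contrastively before its scores are used.
```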
-
Daniel Stancl authored
* Init Flax implementation for Blenderbot * Add a majority of stuff except for tests * make style quality * Add tests and fix some bugs * Add tests * Clean source code and fix some bugs * Fix copies and docs * Fix jax device condition for tests * Fix layer norm in the encoder * Fix a few typos in the test file * make fix-copies * make fix-copies * fix layer norm * Fix Flax params dtype (#13090) * Fix PR reference (#13098) * make fix-copies * Update tests/test_modeling_flax_blenderbot.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Suraj Patil <surajp815@gmail.com>
-
Kamal Raj authored
* TF Tapas first commit * updated docs * updated logger message * updated pytorch weight conversion script to support scalar array * added use_cache to tapas model config to work properly with tf input_processing * 1. rm embeddings_sum 2. added # Copied 3. + TFTapasMLMHead 4. and lot other small fixes * updated docs * + test for tapas * updated testing_utils to check is_tensorflow_probability_available * converted model logits post processing using numpy to work with both PT and TF models * + TFAutoModelForTableQuestionAnswering * added TF support * added test for TFAutoModelForTableQuestionAnswering * added test for TFAutoModelForTableQuestionAnswering pipeline * updated auto model docs * fixed typo in import * added tensorflow_probability to run tests * updated MLM head * updated tapas.rst with TF model docs * fixed optimizer import in docs * updated convert to np data from pt model is not `transformers.tokenization_utils_base.BatchEncoding` after pipeline upgrade * updated pipeline: 1. with torch.no_gard removed, pipeline forward handles 2. token_type_ids converted to numpy * updated docs. * removed `use_cache` from config * removed floats_tensor * updated code comment * updated Copyright Year and logits_aggregation Optional * updated docs and comments * updated docstring * fixed model weight loading * make fixup * fix indentation * added tf slow pipeline test * pip upgrade * upgrade python to 3.7 * removed from_pt from tests * revert commit f18cfa9
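A minimal sketch of the TensorFlow TAPAS port; it needs `tensorflow_probability` installed (as noted above), and the WTQ-finetuned checkpoint name is an assumption:

```python
# Hedged sketch of TFTapasForQuestionAnswering; requires tensorflow_probability.
import pandas as pd
from transformers import TapasTokenizer, TFTapasForQuestionAnswering

tokenizer = TapasTokenizer.from_pretrained("google/tapas-base-finetuned-wtq")
model = TFTapasForQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq")

table = pd.DataFrame({"Actor": ["Brad Pitt", "Leonardo Di Caprio"], "Age": ["56", "45"]})  # cells as strings
inputs = tokenizer(table=table, queries=["How old is Leonardo Di Caprio?"],
                   padding="max_length", return_tensors="tf")
outputs = model(**inputs)  # logits + logits_aggregation, post-processed with numpy for PT and TF alike
```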
-
- 19 Nov, 2021 2 commits
-
-
Shang Zhang authored
* clean up branch for add-qdqbert-model * README update for QAT example; update docstrings in modeling_qdqbert.py * Update qdqbert.rst * Update README.md * Update README.md * calibration data using traning set; QAT example runs in fp32 * re-use BERTtokenizer for qdqbert * Update docs/source/model_doc/qdqbert.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/model_doc/qdqbert.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/model_doc/qdqbert.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * remove qdqbert tokenizer * Update qdqbert.rst * update evaluate-hf-trt-qa.py * update configuration_qdqbert.py * update modeling_qdqbert.py: add copied statement; replace assert with ValueError * update copied from statement * add is_quantization_available; run make fix-copies * unittest add require_quantization * add backend dependency to qdqbert model * update README; update evaluate script; make style * lint * docs qdqbert update * circleci build_doc add pytorch-quantization for qdqbert * update README * update example readme with instructions to upgrade TensorRT to 8.2 * Update src/transformers/models/qdqbert/configuration_qdqbert.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/qdqbert/configuration_qdqbert.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/qdqbert/configuration_qdqbert.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/qdqbert/configuration_qdqbert.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * change quantization to pytorch_quantization for backend requirement * feed_forward_chunking not supported in QDQBert * make style * update model docstrings and comments in testing scripts * rename example to quantization-qdqbert; rename example scripts from qat to quant * Update src/transformers/models/qdqbert/modeling_qdqbert.py Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * rm experimental functions in quant_trainer * qa cleanup * make fix-copies for docs index.rst * fix doctree; use post_init() for qdqbert * fix early device assignment for qdqbert * fix CI:Model templates runner Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com>
-
NielsRogge authored
* Add integration test * Fix typo
-
- 18 Nov, 2021 1 commit
-
-
NielsRogge authored
* First draft * More improvements * Improve conversion script * Fix init weights for layer norm * Fix correct model for conversion script * Don't tie input and output embeddings * Add print statements for debugging * Add print statements for debugging * Fix vocab size of model * Improve documentation, remove fast tokenizer * Add ImageGPTForImageClassification, improve docs * Fix docs issue * Set verbosity level back to info * Improve tests * Fix tests and add figure * Delete tokenizer file * Remove ImageGPTTokenizer from init files * Remove ImageGPTLayer from init files * Remove ImageGPT tokenizer from docs * First draft of ImageGPTFeatureExtractor * Fix typo * Fix bug * More improvements * Apply suggestions from code review, add tests for feature extractor * Fix layernorm * Update save_pretrained method * Fix issue * Make all tests of ImageGPTFeatureExtractor pass * Update code examples * Rename model inputs to pixel_values * Improve code examples * Update init_weights to post_init * Fix post_init
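A minimal sketch of the ImageGPT classes this commit adds; the checkpoint id and test image are assumptions:

```python
# Hedged sketch of ImageGPTFeatureExtractor + ImageGPTForImageClassification.
import requests
from PIL import Image
from transformers import ImageGPTFeatureExtractor, ImageGPTForImageClassification

feature_extractor = ImageGPTFeatureExtractor.from_pretrained("openai/imagegpt-small")
model = ImageGPTForImageClassification.from_pretrained("openai/imagegpt-small")  # head is newly
# initialised and meant for linear probing / fine-tuning.

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# The feature extractor resizes, normalises and maps pixels to ImageGPT's 512 colour clusters.
inputs = feature_extractor(images=image, return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # (batch, num_labels)
```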
-
- 09 Nov, 2021 2 commits
-
-
Yih-Dar authored
* Start the work for TFViTModel * Convert to TF code - need to check in the follow up commits * Clean up model code * Expose TFViTModel * make style * make quality * Add test * make style & quality * Fix some imports * fix wrong usage - *kwargs => ** kwargs * Fix Conv2D weight loading (PT->TF) issue * Add tests for images with different sizes + fix model * Fix some common tests for TFViTModel * Use inputs instead of input_ids in test_compile_tf_model * Add a comment about transpose and Conv2D in convert_tf_weight_name_to_pt_weight_name * Avoid transpose in TFViT call * Fix Conv2D issue in load_tf2_weights_in_pytorch_model * Use tf.keras.layers.Conv2D instead of tf.nn.conv2d * Using simpler heuristic to detect Conv2D layer * Change convert_tf_weight_name_to_pt_weight_name to return TransposeType * Check tf_weight_shape is not None before using it * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix missing comma * fix input dtype Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
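A minimal sketch of the new TFViTModel; the checkpoint name is illustrative, and `from_pt=True` may be needed if the repository only ships PyTorch weights:

```python
# Hedged sketch of TFViTModel.
import requests
from PIL import Image
from transformers import ViTFeatureExtractor, TFViTModel

feature_extractor = ViTFeatureExtractor.from_pretrained("google/vit-base-patch16-224-in21k")
model = TFViTModel.from_pretrained("google/vit-base-patch16-224-in21k")  # from_pt=True if no TF weights

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = feature_extractor(images=image, return_tensors="tf")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, 197, 768): 196 patch tokens + [CLS] for a 224x224 input
```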
-
Yih-Dar authored
* Start the work on FlaxVisionEncoderDecoderModel * Add FlaxVisionEncoderDecoderModel * Add VisionEncoderDecoderConfig * Make FlaxVisionEncoderDecoderModel visible to transformers * Add test * Fix wrong getattr usage * Fix tests * Add FlaxAutoModelForVision2Seq * Expose FLAX_MODEL_FOR_VISION_2_SEQ_MAPPING * clean-up * add integration test * update expected logits * update expected scores * Add ViT2GPT2ModelIntegrationTest + some cleaning * Add projection layer + PT/Flax equivalence tests * Fix import * minor changes * make test slow again * Apply suggestions * Add modeling_flax_vision_encoder_decoder to _ignore_modules in get_model_modules() * fix copies * Apply suggestions from code review Co-authored-by:
Suraj Patil <surajp815@gmail.com> * split long strings in multiple lines * decoder_input_ids can't be None * Add back test_configuration_tie * Remove attention_mask parameter * fix test - encoder_last_hidden_state should be encoder_outputs.last_hidden_state instead of the projected vector * Apply suggestions from code review Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * Remove more encoder_attention_mask * remove encoder_attention_mask when calling self.decode (in FlaxVisionEncoderDecoderModule) * Fix style + pass 1s instead of None as encoder_attention_mask * fix init_weights * pass None for encoder_attention_mask * pass 1s instead of None as encoder_attention_mask * Fix doc style Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com> Co-authored-by:
Suraj Patil <surajp815@gmail.com> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com>
-
- 03 Nov, 2021 1 commit
-
-
Sylvain Gugger authored
* Start PR doc * Cleanup the quality checks and document them * Add reference in the contributing guide * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Rename file as per review suggestion Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
-
- 29 Oct, 2021 1 commit
-
-
Daniel Stancl authored
* Add the support for the fast (rust) implementation of BlenbderbotTokenizer * Fix a converter and a typo in a doc * Apply the patil-suraj's suggestion * (Nitpick) Fast tokenization -> Fast Tokenization in doc * Apply the SaulLu's suggestion * Apply Narsil's suggestion to fix test pipelines * Add encoder_no_repeat_ngram_size according to the Narsil's suggestion * Revert the last (unnecessary) commit * Override pipeline config for Blenderbot to allow for larger pos. emb. * make fix-copies
-
- 28 Oct, 2021 2 commits
-
-
Lysandre authored
-
NielsRogge authored
* First draft * Make style & quality * Improve conversion script * Add print statement to see actual slice * Make absolute tolerance smaller * Fix image classification models * Add post_process_semantic method * Disable padding * Improve conversion script * Rename to ForSemanticSegmentation, add integration test, remove post_process methods * Improve docs * Fix code quality * Fix feature extractor tests * Fix tests for image classification model * Delete file * Add is_torch_available to feature extractor * Improve documentation of feature extractor methods * Apply suggestions from @sgugger's code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply some more suggestions of code review * Rebase with master * Fix rebase issues * Make sure model only outputs hidden states when the user wants to * Apply suggestions from code review * Add pad method * Support padding of 2d images * Add print statement * Add print statement * Move padding method to SegformerFeatureExtractor * Fix issue * Add casting of segmentation maps * Add test for padding * Add small note about padding Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
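A minimal sketch of the renamed SegformerForSemanticSegmentation head and its feature extractor; the checkpoint and image are assumptions:

```python
# Hedged sketch of the SegFormer semantic-segmentation API.
import requests
from PIL import Image
from transformers import SegformerFeatureExtractor, SegformerForSemanticSegmentation

ckpt = "nvidia/segformer-b0-finetuned-ade-512-512"  # assumed ADE20k-finetuned checkpoint
feature_extractor = SegformerFeatureExtractor.from_pretrained(ckpt)
model = SegformerForSemanticSegmentation.from_pretrained(ckpt)

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = feature_extractor(images=image, return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # (batch, num_labels, height/4, width/4), to be upsampled to the input size
```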
-
- 26 Oct, 2021 1 commit
-
-
Patrick von Platen authored
* unispeech * add copy from * remove hubert copy from * finish for today * add unispeech-sat * adapt more * up * up * up * up * add modeling * add tests * up * up * finish * up * Apply suggestions from code review * up * up * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * up * up Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 18 Oct, 2021 1 commit
-
-
Dat Quoc Nguyen authored
* Add the pre-trained BARTpho model * Add the pre-trained BARTpho model * Add the pre-trained BARTpho model * Fix incorrectly sorted and/or formatted imports * Fix incorrectly sorted and/or formatted style * Fix check_dummies * Fix check_dummies * Fix check_dummies * Update docs/source/model_doc/bartpho.rst Co-authored-by:
Suraj Patil <surajp815@gmail.com> * Update src/transformers/models/bartpho/__init__.py Co-authored-by:
Suraj Patil <surajp815@gmail.com> * Update src/transformers/models/bartpho/tokenization_bartpho.py Co-authored-by:
Suraj Patil <surajp815@gmail.com> * Update tests/test_tokenization_bartpho.py Co-authored-by:
Suraj Patil <surajp815@gmail.com> * Update src/transformers/models/bartpho/tokenization_bartpho.py Co-authored-by:
Suraj Patil <surajp815@gmail.com> * Update tests/test_tokenization_bartpho.py Co-authored-by:
Suraj Patil <surajp815@gmail.com> * Update docs/source/model_doc/bartpho.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/model_doc/bartpho.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/bartpho/__init__.py Co-authored-by:
Suraj Patil <surajp815@gmail.com> * Add the pre-trained BARTpho model * Add Tips section in doc and details of monolingual_vocab_file * Fix conflicts * Add another tip related to monolingual_vocab_file * Readd dependency_versions_table.py * Handle failing checks * Remove test_list.txt * Remove md5sum.saved * Revise Readme.md Co-authored-by:
Suraj Patil <surajp815@gmail.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 15 Oct, 2021 1 commit
-
-
Anton Lozhkov authored
* Working encoder * SEW-D and tests * Further conv fixes * Automodels and conv inits * Update integration tests, add docs * Docs cleanup, resolve todos * Conf fix * Fix docs * Fix tests, apply suggestions * Update src/transformers/models/sew/modeling_sew.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Model conversion and updated no-mask tests * Remove copy of feature_proj * Style * Update src/transformers/models/auto/feature_extraction_auto.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/auto/feature_extraction_auto.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Move orgs Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
-
- 13 Oct, 2021 1 commit
-
-
NielsRogge authored
* First draft * Update self-attention of RoBERTa as proposition * Improve conversion script * Add TrOCR decoder-only model * More improvements * Make forward pass with pretrained weights work * More improvements * Some more improvements * More improvements * Make conversion work * Clean up print statements * Add documentation, processor * Add test files * Small improvements * Some more improvements * Make fix-copies, improve docs * Make all vision encoder decoder model tests pass * Make conversion script support other models * Update URL for OCR image * Update conversion script * Fix style & quality * Add support for the large-printed model * Fix some issues * Add print statement for debugging * Add print statements for debugging * Make possible fix for sinusoidal embedding * Further debugging * Potential fix v2 * Add more print statements for debugging * Add more print statements for debugging * Deubg more * Comment out print statements * Make conversion of large printed model possible, address review comments * Make it possible to convert the stage1 checkpoints * Clean up code, apply suggestions from code review * Apply suggestions from code review, use Microsoft models in tests * Rename encoder_hidden_size to cross_attention_hidden_size * Improve docs
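A minimal sketch of TrOCR driven through the vision-encoder-decoder classes referenced above; the checkpoint and input image path are assumptions:

```python
# Hedged sketch of TrOCR inference with TrOCRProcessor + VisionEncoderDecoderModel.
from PIL import Image
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-handwritten")
model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-handwritten")

image = Image.open("handwritten_line.png").convert("RGB")  # a single cropped text line (assumed path)
pixel_values = processor(images=image, return_tensors="pt").pixel_values

generated_ids = model.generate(pixel_values)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```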
-
- 12 Oct, 2021 1 commit
-
-
Yih-Dar authored
* Add cross attentions to TFGPT2Model * Add TFEncoderDecoderModel * Add TFBaseModelOutputWithPoolingAndCrossAttentions * Add cross attentions to TFBertModel * Fix past or past_key_values argument issue * Fix generation * Fix save and load * Add some checks and comments * Clean the code that deals with past keys/values * Add kwargs to processing_inputs * Add serving_output to TFEncoderDecoderModel * Some cleaning + fix use_cache value issue * Fix tests + add bert2bert/bert2gpt2 tests * Fix more tests * Ignore crossattention.bias when loading GPT2 weights into TFGPT2 * Fix return_dict_in_generate in tf generation * Fix is_token_logit_eos_token bug in tf generation * Finalize the tests after fixing some bugs * Fix another is_token_logit_eos_token bug in tf generation * Add/Update docs * Add TFBertEncoderDecoderModelTest * Clean test script * Add TFEncoderDecoderModel to the library * Add cross attentions to TFRobertaModel * Add TFRobertaEncoderDecoderModelTest * make style * Change the way of position_ids computation * bug fix * Fix copies in tf_albert * Remove some copied from and apply some fix-copies * Remove some copied * Add cross attentions to some other TF models * Remove encoder_hidden_states from TFLayoutLMModel.call for now * Make style * Fix TFRemBertForCausalLM * Revert the change to longformer + Remove copies * Revert the change to albert and convbert + Remove copies * make quality * make style * Add TFRembertEncoderDecoderModelTest * make quality and fix-copies * test TFRobertaForCausalLM * Fixes for failed tests * Fixes for failed tests * fix more tests * Fixes for failed tests * Fix Auto mapping order * Fix TFRemBertEncoder return value * fix tf_rembert * Check copies are OK * Fix missing TFBaseModelOutputWithPastAndCrossAttentions is not defined * Add TFEncoderDecoderModelSaveLoadTests * fix tf weight loading * check the change of use_cache * Revert the change * Add missing test_for_causal_lm for TFRobertaModelTest * Try cleaning past * fix _reorder_cache * Revert some files to original versions * Keep as many copies as possible * Apply suggested changes - Use raise ValueError instead of assert * Move import to top * Fix wrong require_torch * Replace more assert by raise ValueError * Add test_pt_tf_model_equivalence (the test won't pass for now) * add test for loading/saving * finish * finish * Remove test_pt_tf_model_equivalence * Update tf modeling template * Remove pooling, added in the prev. commit, from MainLayer * Update tf modeling test template * Move inputs["use_cache"] = False to modeling_tf_utils.py * Fix torch.Tensor in the comment * fix use_cache * Fix missing use_cache in ElectraConfig * Add a note to from_pretrained * Fix style * Change test_encoder_decoder_save_load_from_encoder_decoder_from_pt * Fix TFMLP (in TFGPT2) activation issue * Fix None past_key_values value in serving_output * Don't call get_encoderdecoder_model in TFEncoderDecoderModelTest.test_configuration_tie until we have a TF checkpoint on Hub * Apply review suggestions - style for cross_attns in serving_output * Apply review suggestions - change assert + docstrings * break the error message to respect the char limit * deprecate the argument past * fix docstring style * Update the encoder-decoder rst file * fix Unknown interpreted text role "method" * fix typo Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com>
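A minimal sketch of the new TFEncoderDecoderModel warm-start API, mirroring the existing PyTorch EncoderDecoderModel; the bert2bert pairing and example sentences below are illustrative:

```python
# Hedged sketch of TFEncoderDecoderModel; the cross-attention weights are newly initialised.
from transformers import BertTokenizer, TFEncoderDecoderModel

model = TFEncoderDecoderModel.from_encoder_decoder_pretrained("bert-base-uncased", "bert-base-uncased")
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id

inputs = tokenizer("HuggingFace is based in NYC.", return_tensors="tf")
labels = tokenizer("HuggingFace est basée à NYC.", return_tensors="tf").input_ids
outputs = model(input_ids=inputs.input_ids, labels=labels)  # seq2seq loss + logits for fine-tuning
```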
-
- 29 Sep, 2021 1 commit
-
-
Matt authored
* Keras callback to push to hub each epoch, or after N steps * Reworked the callback to use Repository * Use an Enum for save_strategy * Style pass * Correct type for tokenizer * Update src/transformers/keras_callbacks.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/keras_callbacks.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/keras_callbacks.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/keras_callbacks.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/keras_callbacks.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/keras_callbacks.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Adding print message to the final upload * Adding print message to the final upload * Change how we wait for the last process to finish * is_done is a property, not a method, derp * Docstrings and documentation * Style pass * Style edit * Docstring reformat * Docstring rewrite * Replacing print with internal logger Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
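A minimal sketch of the Keras callback this commit adds; `model`, `tokenizer` and `tf_train_dataset` are assumed to be an already-compiled TF model, its tokenizer, and a tf.data pipeline:

```python
# Hedged sketch of transformers.keras_callbacks.PushToHubCallback.
from transformers.keras_callbacks import PushToHubCallback

callback = PushToHubCallback(
    output_dir="./model_checkpoints",  # local clone of the Hub repo to push from (assumed path)
    save_strategy="epoch",             # or "steps" together with save_steps=N
    tokenizer=tokenizer,               # tokenizer files are uploaded alongside the weights
)
model.fit(tf_train_dataset, epochs=3, callbacks=[callback])  # pushes to the Hub after every epoch
```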
-
- 27 Sep, 2021 1 commit
-
-
Lysandre authored
-
- 22 Sep, 2021 1 commit
-
-
Lysandre Debut authored
* Add BlenderBot small tokenizer to the init * Update src/transformers/__init__.py Co-authored-by: Suraj Patil <surajp815@gmail.com> * Style * Bugfix Co-authored-by: Suraj Patil <surajp815@gmail.com>
-
- 21 Sep, 2021 1 commit
-
-
Kamal Raj authored
* beit-flax * updated FLAX_BEIT_MLM_DOCSTRING * removed bool_masked_pos from classification * updated Copyright * code refactoring: x -> embeddings * updated test: rm from_pt * Update docs/source/model_doc/beit.rst * model code dtype updates and other changes according to review * relative_position_bias revert back to pytorch design
-
- 20 Sep, 2021 1 commit
-
-
Gunjan Chhablani authored
* Init FNet * Update config * Fix config * Update model classes * Update tokenizers to use sentencepiece * Fix errors in model * Fix defaults in config * Remove position embedding type completely * Fix typo and take only real numbers * Fix type vocab size in configuration * Add projection layer to embeddings * Fix position ids bug in embeddings * Add minor changes * Add conversion script and remove CausalLM vestiges * Fix conversion script * Fix conversion script * Remove CausalLM Test * Update checkpoint names to dummy checkpoints * Add tokenizer mapping * Fix modeling file and corresponding tests * Add tokenization test file * Add PreTraining model test * Make style and quality * Make tokenization base tests work * Update docs * Add FastTokenizer tests * Fix fast tokenizer special tokens * Fix style and quality * Remove load_tf_weights vestiges * Add FNet to main README * Fix configuration example indentation * Comment tokenization slow test * Fix style * Add changes from review * Fix style * Remove bos and eos tokens from tokenizers * Add tokenizer slow test, TPU transforms, NSP * Add scipy check * Add scipy availabilty check to test * Fix tokenizer and use correct inputs * Remove remaining TODOs * Fix tests * Fix tests * Comment Fourier Test * Uncomment Fourier Test * Change to google checkpoint * Add changes from review * Fix activation function * Fix model integration test * Add more integration tests * Add comparison steps to MLM integration test * Fix style * Add masked tokenization fix * Improve mask tokenization fix * Fix index docs * Add changes from review * Fix issue * Fix failing import in test * some more fixes * correct fast tokenizer * finalize * make style * Remove additional tokenization logic * Set do_lower_case to False * Allow keeping accents * Fix tokenization test * Fix FNet Tokenizer Fast * fix tests * make style * Add tips to FNet docs Co-authored-by:patrickvonplaten <patrick.v.platen@gmail.com>
-
- 14 Sep, 2021 1 commit
-
-
Bhadresh Savani authored
* added initial files * fixes pipeline * fixes style and quality * fixes doc issue and positional encoding * fixes layer norm and test * fixes quality issue * fixes code quality * removed extra layer norm * added layer norm back in encoder and decoder * added more code copy quality checks * update tests * Apply suggestions from code review * fix import * fix test Co-authored-by:patil-suraj <surajp815@gmail.com>
-
- 10 Sep, 2021 1 commit
-
-
Nicolas Patry authored
* Enabling dataset iteration on pipelines. Enabling dataset iteration on pipelines. Unifying parameters under `set_parameters` function. Small fix. Last fixes after rebase Remove print. Fixing text2text `generate_kwargs` No more `self.max_length`. Fixing tf only conversational. Consistency in start/stop index over TF/PT. Speeding up drastically on TF (nasty bug where max_length would increase a ton.) Adding test for support for non fast tokenizers. Fixign GPU usage on zero-shot. Fix working on Tf. Update src/transformers/pipelines/base.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Update src/transformers/pipelines/base.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Small cleanup. Remove all asserts + simple format. * Fixing audio-classification for large PR. * Overly explicity null checking. * Encapsulating GPU/CPU pytorch manipulation directly within `base.py`. * Removed internal state for parameters of the pipeline. Instead of overriding implicitly internal state, we moved to real named arguments on every `preprocess`, `_forward`, `postprocess` function. Instead `_sanitize_parameters` will be used to split all kwargs of both __init__ and __call__ into the 3 kinds of named parameters. * Move import warnings. * Small fixes. * Quality. * Another small fix, using the CI to debug faster. * Last fixes. * Last fix. * Small cleanup of tensor moving. * is not None. * Adding a bunch of docs + a iteration test. * Fixing doc style. * KeyDataset = None guard. * RRemoving the Cuda test for pipelines (was testing). * Even more simple iteration test. * Correct import . * Long day. * Fixes in docs. * [WIP] migrating object detection. * Fixed the target_size bug. * Fixup. * Bad variable name. * Fixing `ensure_on_device` respects original ModelOutput.
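A minimal sketch of the dataset iteration this refactor enables; the model and dataset choices are illustrative:

```python
# Hedged sketch: streaming a datasets.Dataset through a pipeline instead of a Python list.
from datasets import load_dataset
from transformers import pipeline
from transformers.pipelines.pt_utils import KeyDataset

dataset = load_dataset("imdb", split="test[:100]")
pipe = pipeline("text-classification")  # add device=0 to run on GPU

# KeyDataset exposes only the "text" column; examples flow through
# preprocess -> _forward -> postprocess one by one, as described above.
for prediction in pipe(KeyDataset(dataset, "text")):
    print(prediction)
```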
-
- 08 Sep, 2021 1 commit
-
-
Li-Huai (Allan) Lin authored
* Complete basic mechanism * Save * Complete everything * Style & Quality * Update READMEs * Add testing * Fix README.md format * Apply suggestions * Fix format * Update utils/check_copies.py Co-authored-by:Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 01 Sep, 2021 2 commits
-
-
NielsRogge authored
* Remove disclaimer * First draft * Fix rebase * Improve docs some more * Add inference section * Improve example scripts section * Improve code examples of modeling files * Add docs regarding task prefix * Address @craffel's comments * Apply suggestions from @patrickvonplaten's review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Add suggestions from code review * Apply @sgugger's suggestions * Fix Flax code examples * Fix index.rst Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
-
Patrick von Platen authored
* fix_torch_device_generate_test * remove @ * up * correct some bugs * correct model * finish speech2text extension * up * up * up * up * Update utils/custom_init_isort.py * up * up * update with tokenizer * correct old tok * correct old tok * fix bug * up * up * add more tests * up * fix docs * up * fix some more tests * add better config * correct some more things " * fix tests * improve docs * Apply suggestions from code review * Apply suggestions from code review * final fixes * finalize * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * apply suggestions Lysandre and Sylvain * apply nicos suggestions * upload everything * finish Co-authored-by:
Patrick von Platen <patrick@huggingface.co> Co-authored-by: your_github_username <your_github_email> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
- 31 Aug, 2021 3 commits
-
-
Stella Biderman authored
* Test GPTJ implementation * Fixed conflicts * Update __init__.py * Update __init__.py * change GPT_J to GPTJ * fix missing imports and typos * use einops for now (need to change to torch ops later) * Use torch ops instead of einsum * remove einops deps * Update configuration_auto.py * Added GPT J * Update gptj.rst * Update __init__.py * Update test_modeling_gptj.py * Added GPT J * Changed configs to match GPT2 instead of GPT Neo * Removed non-existent sequence model * Update configuration_auto.py * Update configuration_auto.py * Update configuration_auto.py * Update modeling_gptj.py * Update modeling_gptj.py * Progress on updating configs to agree with GPT2 * Update modeling_gptj.py * num_layers -> n_layer * layer_norm_eps -> layer_norm_epsilon * attention_layers -> num_hidden_layers * Update modeling_gptj.py * attention_pdrop -> attn_pdrop * hidden_act -> activation_function * Update configuration_gptj.py * Update configuration_gptj.py * Update configuration_gptj.py * Update configuration_gptj.py * Update configuration_gptj.py * Update modeling_gptj.py * Update modeling_gptj.py * Update modeling_gptj.py * Update modeling_gptj.py * Update modeling_gptj.py * Update modeling_gptj.py * fix layernorm and lm_head size delete attn_type * Update docs/source/model_doc/gptj.rst Co-authored-by:
Suraj Patil <surajp815@gmail.com> * removed claim that GPT J uses local attention * Removed GPTJForSequenceClassification * Update src/transformers/models/gptj/configuration_gptj.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Removed unsupported boilerplate * Update tests/test_modeling_gptj.py Co-authored-by:
Suraj Patil <surajp815@gmail.com> * Update src/transformers/models/gptj/modeling_gptj.py Co-authored-by:
Suraj Patil <surajp815@gmail.com> * Update src/transformers/models/gptj/modeling_gptj.py Co-authored-by:
Suraj Patil <surajp815@gmail.com> * Update src/transformers/models/gptj/modeling_gptj.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update tests/test_modeling_gptj.py Co-authored-by:
Eric Hallahan <eric@hallahans.name> * Update tests/test_modeling_gptj.py Co-authored-by:
Eric Hallahan <eric@hallahans.name> * Update tests/test_modeling_gptj.py Co-authored-by:
Eric Hallahan <eric@hallahans.name> * Update src/transformers/models/gptj/modeling_gptj.py Co-authored-by:
Suraj Patil <surajp815@gmail.com> * Update __init__.py * Update configuration_gptj.py * Update modeling_gptj.py * Corrected indentation * Remove stray backslash * Delete .DS_Store * Delete .DS_Store * Delete .DS_Store * Delete .DS_Store * Delete .DS_Store * Update docs to match * Remove tf loading * Remove config.jax * Remove stray `else:` statement * Remove references to `load_tf_weights_in_gptj` * Adapt tests to match output from GPT-J 6B * Apply suggestions from code review Co-authored-by:
Suraj Patil <surajp815@gmail.com> * Default `activation_function` to `gelu_new` - Specify the approximate formulation of GELU to ensure parity with the default setting of `jax.nn.gelu()` * Fix part of the config documentation * Revert "Update configuration_auto.py" This reverts commit e9860e9c043b6ebf57a0e705044e9ec9ba2263bb. * Revert "Update configuration_auto.py" This reverts commit cfaaae4c4dc70f1fbe9abd60fc8bd0b863b8c011. * Revert "Update configuration_auto.py" This reverts commit 687788954fd0cfbc567fa1202d56a4ff9271944f. * Revert "Update configuration_auto.py" This reverts commit 194d024ea87d4fcef0dcb08e57f52c47511a9fc6. * Hyphenate GPT-J * Undid sorting of the models alphabetically * Reverting previous commit * fix style and quality issues * Update docs/source/model_doc/gptj.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/test_modeling_gptj.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/gptj/modeling_gptj.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/__init__.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/gptj/modeling_gptj.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/gptj/modeling_gptj.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/gptj/configuration_gptj.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/gptj/configuration_gptj.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/gptj/configuration_gptj.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/gptj/modeling_gptj.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/gptj/modeling_gptj.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/gptj/modeling_gptj.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/gptj/modeling_gptj.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/gptj/modeling_gptj.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Replaced GPTJ-specific code with generic code * Update src/transformers/models/gptj/modeling_gptj.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Made the code always use rotary positional encodings * Update index.rst * Fix documentation * Combine attention classes - Condense all attention operations into `GPTJAttention` - Replicate GPT-2 and improve code clarity by renaming `GPTJAttention.attn_pdrop` and `GPTJAttention.resid_pdrop` to `GPTJAttention.attn_dropout` and `GPTJAttention.resid_dropout` * Removed `config.rotary_dim` from tests * Update test_modeling_gptj.py * Update test_modeling_gptj.py * Fix formatting * Removed depreciated argument `layer_id` to `GPTJAttention` * Update modeling_gptj.py * Update modeling_gptj.py * Fix code quality * Restore model functionality * Save `lm_head.weight` in checkpoints * Fix crashes when loading with reduced precision * refactor self._attn(...)` and rename layer weights" * make sure logits are in fp32 for sampling * improve docs * Add `GPTJForCausalLM` to `TextGenerationPipeline` whitelist * Added GPT-J to the README * Fix doc/readme consistency * Add rough parallelization support - Remove unused imports and variables - Clean up docstrings - Port experimental parallelization code from GPT-2 into GPT-J * Clean up loose ends * Fix index.rst Co-authored-by:
kurumuz <kurumuz1@gmail.com> Co-authored-by:
Suraj Patil <surajp815@gmail.com> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Eric Hallahan <eric@hallahans.name> Co-authored-by:
Leo Gao <54557097+leogao2@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: your_github_username <your_github_email> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com>
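A minimal generation sketch for the GPT-J classes this commit adds; running the full 6B checkpoint needs a large GPU, so the snippet is illustrative rather than something to run casually:

```python
# Hedged sketch of GPTJForCausalLM; a CUDA device is assumed for half-precision inference.
import torch
from transformers import AutoTokenizer, GPTJForCausalLM

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = GPTJForCausalLM.from_pretrained("EleutherAI/gpt-j-6B", torch_dtype=torch.float16).to("cuda")

inputs = tokenizer("GPT-J was trained on the Pile and", return_tensors="pt").to("cuda")
generated = model.generate(**inputs, do_sample=True, max_new_tokens=40)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```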
-
Lysandre authored
-
Kamal Raj authored
* Deberta_v2 tf * added new line at the end of file, make style * +V2, typo * remove never executed branch of code * rm cmnt and fixed typo in url filter * cleanup according to review comments * added #Copied from
-
- 30 Aug, 2021 3 commits
-
-
Kamal Raj authored
* albert flax * year -> 2021 * docstring updated for flax * removed head_mask * removed from_pt * removed passing attention_mask to embedding layer
-
Kamal Raj authored
* distilbert-flax * added missing self * docs fix * removed tied kernal extra init * updated docs * x -> hidden states * removed head_mask * removed from_pt, +FLAX * updated year
-
NielsRogge authored
* First commit * Make style * Fix dummy objects * Add Detectron2 config * Add LayoutLMv2 pooler * More improvements, add documentation * More improvements * Add model tests * Add clarification regarding image input * Improve integration test * Fix bug * Fix another bug * Fix another bug * Fix another bug * More improvements * Make more tests pass * Make more tests pass * Improve integration test * Remove gradient checkpointing and add head masking * Add integration test * Add LayoutLMv2ForSequenceClassification to the tests * Add LayoutLMv2ForQuestionAnswering * More improvements * More improvements * Small improvements * Fix _LazyModule * Fix fast tokenizer * Move sync_batch_norm to a separate method * Replace dummies by requires_backends * Move calculation of visual bounding boxes to separate method + update README * Add models to main init * First draft * More improvements * More improvements * More improvements * More improvements * More improvements * Remove is_split_into_words * More improvements * Simply tesseract - no use of pandas anymore * Add LayoutLMv2Processor * Update is_pytesseract_available * Fix bugs * Improve feature extractor * Fix bug * Add print statement * Add truncation of bounding boxes * Add tests for LayoutLMv2FeatureExtractor and LayoutLMv2Tokenizer * Improve tokenizer tests * Make more tokenizer tests pass * Make more tests pass, add integration tests * Finish integration tests * More improvements * More improvements - update API of the tokenizer * More improvements * Remove support for VQA training * Remove some files * Improve feature extractor * Improve documentation and one more tokenizer test * Make quality and small docs improvements * Add batched tests for LayoutLMv2Processor, remove fast tokenizer * Add truncation of labels * Apply suggestions from code review * Improve processor tests * Fix failing tests and add suggestion from code review * Fix tokenizer test * Add detectron2 CI job * Simplify CI job * Comment out non-detectron2 jobs and specify number of processes * Add pip install torchvision * Add durations to see which tests are slow * Fix tokenizer test and make model tests smaller * Frist draft * Use setattr * Possible fix * Proposal with configuration * First draft of fast tokenizer * More improvements * Enable fast tokenizer tests * Make more tests pass * Make more tests pass * More improvements * Addd padding to fast tokenizer * Mkae more tests pass * Make more tests pass * Make all tests pass for fast tokenizer * Make fast tokenizer support overflowing boxes and labels * Add support for overflowing_labels to slow tokenizer * Add support for fast tokenizer to the processor * Update processor tests for both slow and fast tokenizers * Add head models to model mappings * Make style & quality * Remove Detectron2 config file * Add configurable option to label all subwords * Fix test * Skip visual segment embeddings in test * Use ResNet-18 backbone in tests instead of ResNet-101 * Proposal * Re-enable all jobs on CI * Fix installation of tesseract * Fix failing test * Fix index table * Add LayoutXLM doc page, first draft of code examples * Improve documentation a lot * Update expected boxes for Tesseract 4.0.0 beta * Use offsets to create labels instead of checking if they start with ## * Update expected boxes for Tesseract 4.1.1 * Fix conflict * Make variable names cleaner, add docstring, add link to notebooks * Revert "Fix conflict" This reverts commit a9b46ce9afe47ebfcfe7b45e6a121d49e74ef2c5. 
* Revert to make integration test pass * Apply suggestions from @LysandreJik's review * Address @patrickvonplaten's comments * Remove fixtures DocVQA in favor of dataset on the hub Co-authored-by:Lysandre <lysandre.debut@reseau.eseo.fr>
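A minimal sketch of the LayoutLMv2 processor flow described above; it assumes detectron2 and pytesseract are installed, and the document image path is illustrative:

```python
# Hedged sketch of LayoutLMv2Processor (OCR + tokenization) feeding a LayoutLMv2 head.
from PIL import Image
from transformers import LayoutLMv2ForSequenceClassification, LayoutLMv2Processor

processor = LayoutLMv2Processor.from_pretrained("microsoft/layoutlmv2-base-uncased")
model = LayoutLMv2ForSequenceClassification.from_pretrained(
    "microsoft/layoutlmv2-base-uncased", num_labels=2  # head is newly initialised, meant for fine-tuning
)

image = Image.open("scanned_document.png").convert("RGB")  # assumed input document
encoding = processor(image, return_tensors="pt")  # runs Tesseract to get words + boxes, then tokenizes
outputs = model(**encoding)
print(outputs.logits.shape)  # (batch, num_labels)
```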
-
- 24 Aug, 2021 1 commit
-
-
Ori Ram authored
-
- 23 Aug, 2021 1 commit
-
-
Yih-Dar authored
* make flax gpt2 working with cross attention * Remove encoder->decoder projection layer * A draft (incomplete) for FlaxEncoderDecoderModel * Add the method from_encoder_decoder_pretrained + the docstrings * Fix the mistakes of using EncoderDecoderModel * Fix style * Add FlaxEncoderDecoderModel to the library * Fix cyclic imports * Add FlaxEncoderDecoderModel to modeling_flax_auto.py * Remove question comments * add tests for FlaxEncoderDecoderModel * add flax_encoder_decoder to the lists of ignored entries in check_repo.py * fix missing required positional arguments * Remove **kwargs when creating FlaxEncoderDecoderModel in from_encoder_decoder_pretrained() Also fix generation eos/pad tokens issue * Fix: Use sequences from the generated_output * Change a check from assert to raise ValueError * Fix examples and token ids issues * Fix missing all_cross_attentions when outputting tuple in modeling_gpt2 * Remove the changes in configuration docstrings. * allow for bert 2 gpt2 * make fix-copies * Apply suggestions from code review Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * Change remaining examples to bert2gpt2 * Change the test to Bert2GPT2 * Fix examples * Fix import * Fix unpack bug * Rename to FlaxEncoderDecoderModelTest and change the test to bert2gpt2 * Apply suggestions from code review Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * Fix: NotImplentedError -> NotImplementedError * Apply suggestions from code review Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * up * finalize Co-authored-by:
ydshieh <ydshieh@user.noreply> Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com>
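A minimal sketch of the Flax encoder-decoder warm start (a bert2gpt2 pairing, as in the tests above); the checkpoint names are illustrative:

```python
# Hedged sketch of FlaxEncoderDecoderModel; the cross-attention weights are freshly initialised.
from transformers import AutoTokenizer, FlaxEncoderDecoderModel

model = FlaxEncoderDecoderModel.from_encoder_decoder_pretrained("bert-base-cased", "gpt2")
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

inputs = tokenizer("A long article to summarize.", return_tensors="np")
# decoder_input_ids must be provided explicitly (they can't be None, per the commit above).
outputs = model(input_ids=inputs["input_ids"], decoder_input_ids=inputs["input_ids"])
print(outputs.logits.shape)  # fine-tune seq2seq-style before relying on generate()
```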
-
- 17 Aug, 2021 1 commit
-
-
Ori Ram authored
* splinter template * initialize splinter classes * Splinter Tokenizer * splinter.rst * tokenization fixes * Documentation & some minor variable name changes * bug fix (added back question_token_id to config) + variable names * Minor bug fixes + variable name changes * Fix Splinter references after merge with new transformers * changes after running make style & quality * Fix documentation unindent * Fix doc indentation in tokenization_splinter * Fix also SplinterTokenizerFast * Add Splinter to index.rst and README * Fixdouble whitespace from index.rst * Fixed index.rst with 'make fix-copies' * Update docs/source/model_doc/splinter.rst Co-authored-by:
Suraj Patil <surajp815@gmail.com> * Update docs/source/model_doc/splinter.rst Co-authored-by:
Suraj Patil <surajp815@gmail.com> * Update docs/source/model_doc/splinter.rst Co-authored-by:
Suraj Patil <surajp815@gmail.com> * Update docs/source/model_doc/splinter.rst Co-authored-by:
Suraj Patil <surajp815@gmail.com> * Update src/transformers/models/splinter/__init__.py Co-authored-by:
Suraj Patil <surajp815@gmail.com> * Added "copied from BERT" comments * Removing unnexessary code from modeling_splinter * Update README.md Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/splinter/configuration_splinter.py Co-authored-by:
Suraj Patil <surajp815@gmail.com> * Remove references to TF modeling from splinter * Update src/transformers/models/splinter/modeling_splinter.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Remove unnecessary check * Update src/transformers/models/splinter/modeling_splinter.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add differences between Splinter and Bert tokenizers * Update src/transformers/models/splinter/modeling_splinter.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/splinter/tokenization_splinter_fast.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Remove unnecessary check * Doc formatting * Update src/transformers/models/splinter/tokenization_splinter.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/splinter/tokenization_splinter.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * bug fix: remove load_tf_weights attribute * Some minor quality changes * Update docs/source/model_doc/splinter.rst Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/models/splinter/configuration_splinter.py Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * Change FullyConnectedLayer to SplinterFullyConnectedLayer * Variable naming * Reove gather_positions function * Remove ClassificationHead as it's outdated * Update src/transformers/models/splinter/modeling_splinter.py Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * Remove hardcoded 102 token id * Minor style change * Added "tau" organization to all model identifiers & URLS * Added tau to the tests as well * Copy-from comments * Removed all unnecessary classes (e.g. SplinterForMaskedLM) * Running make fix-copies * Bug fix: Further removed unnecessary classes * Add Splinter to AutoTokenization * Add an integration test for Splinter * Removed initialize_new_qass from config - It will be done through different checkpoints * Removed `initialize_new_qass` from documentation as well * Added new checkpoint names (`tau/splinter-base-qass` and same for large) in the code * Minor change to test * SplinterTokenizer now doesn't abstract from BertTokenizer * SplinterTokenizerFast also dosn't abstract from Bert * style and quality * bug fix: import ing torch in tests only if it's available * Auto mappings * Changed copyrights in Splinter's files * Update src/transformers/models/splinter/configuration_splinter.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
yuvalkirstain <kirstain.yuval@gmail.com> Co-authored-by:
Suraj Patil <surajp815@gmail.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
Lysandre <lysandre.debut@reseau.eseo.fr> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
- 16 Aug, 2021 1 commit
-
-
Omar Sanseviero authored
* Fix frameworks table so it's alphabetical * Update index.rst * Don't differentiate when sorting between upper and lower case
-