Unverified Commit 74712e22 authored by Sylvain Gugger, committed by GitHub

Honor contributors to models (#11329)

* Honor contributors to models

* Fix typo

* Address review comments

* Add more authors
parent aad95c7c
@@ -31,7 +31,8 @@ According to the abstract,
 extractive summary.
 - Pegasus achieves SOTA summarization performance on all 12 downstream tasks, as measured by ROUGE and human eval.
-The Authors' code can be found `here <https://github.com/google-research/pegasus>`__.
+This model was contributed by `sshleifer <https://huggingface.co/sshleifer>`__. The Authors' code can be found `here
+<https://github.com/google-research/pegasus>`__.
 Checkpoints
...
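The pegasus.rst page touched by this hunk documents an abstractive summarizer. A minimal usage sketch, assuming the `google/pegasus-xsum` checkpoint and an illustrative article (neither appears in this diff), could look like:

.. code-block:: python

    from transformers import PegasusForConditionalGeneration, PegasusTokenizer

    # Assumption: the google/pegasus-xsum checkpoint fine-tuned for summarization
    tokenizer = PegasusTokenizer.from_pretrained("google/pegasus-xsum")
    model = PegasusForConditionalGeneration.from_pretrained("google/pegasus-xsum")

    article = "PG&E stated it scheduled the blackouts in response to forecasts for high winds."
    batch = tokenizer(article, truncation=True, padding="longest", return_tensors="pt")
    summary_ids = model.generate(**batch)
    print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True))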
@@ -50,7 +50,7 @@ Example of use:
 >>> # phobert = TFAutoModel.from_pretrained("vinai/phobert-base")
-The original code can be found `here <https://github.com/VinAIResearch/PhoBERT>`__.
+This model was contributed by `dqnguyen <https://huggingface.co/dqnguyen>`__. The original code can be found `here <https://github.com/VinAIResearch/PhoBERT>`__.
 PhobertTokenizer
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...
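The context line above is the TensorFlow variant of the usage example in phobert.rst. The PyTorch version of that example looks roughly like this (the word-segmented input sentence is only illustrative):

.. code-block:: python

    import torch
    from transformers import AutoModel, AutoTokenizer

    phobert = AutoModel.from_pretrained("vinai/phobert-base")
    tokenizer = AutoTokenizer.from_pretrained("vinai/phobert-base")

    # PhoBERT expects word-segmented Vietnamese input
    line = "Tôi là sinh_viên trường đại_học Công_nghệ ."
    input_ids = torch.tensor([tokenizer.encode(line)])
    with torch.no_grad():
        features = phobert(input_ids)  # last hidden states and pooled output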
@@ -43,6 +43,7 @@ outperforming parametric seq2seq models and task-specific retrieve-and-extract a
 tasks, we find that RAG models generate more specific, diverse and factual language than a state-of-the-art
 parametric-only seq2seq baseline.*
+This model was contributed by `ola13 <https://huggingface.co/ola13>`__.
 RagConfig
...
@@ -32,7 +32,8 @@ layers instead of the standard residuals, which allows storing activations only
 N times, where N is the number of layers. The resulting model, the Reformer, performs on par with Transformer models
 while being much more memory-efficient and much faster on long sequences.*
-The Authors' code can be found `here <https://github.com/google/trax/tree/master/trax/models/reformer>`__.
+This model was contributed by `patrickvonplaten <https://huggingface.co/patrickvonplaten>`__. The Authors' code can be
+found `here <https://github.com/google/trax/tree/master/trax/models/reformer>`__.
 Axial Positional Encodings
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...
@@ -20,8 +20,8 @@ The RetriBERT model was proposed in the blog post `Explain Anything Like I'm Fiv
 Question Answering <https://yjernite.github.io/lfqa.html>`__. RetriBERT is a small model that uses either a single or
 pair of BERT encoders with lower-dimension projection for dense semantic indexing of text.
-Code to train and use the model can be found `here
-<https://github.com/huggingface/transformers/tree/master/examples/distillation>`__.
+This model was contributed by `yjernite <https://huggingface.co/yjernite>`__. Code to train and use the model can be
+found `here <https://github.com/huggingface/transformers/tree/master/examples/distillation>`__.
 RetriBertConfig
...
@@ -44,7 +44,8 @@ Tips:
   separate your segments with the separation token :obj:`tokenizer.sep_token` (or :obj:`</s>`)
 - :doc:`CamemBERT <camembert>` is a wrapper around RoBERTa. Refer to this page for usage examples.
-The original code can be found `here <https://github.com/pytorch/fairseq/tree/master/examples/roberta>`_.
+This model was contributed by `julien-c <https://huggingface.co/julien-c>`__. The original code can be found `here
+<https://github.com/pytorch/fairseq/tree/master/examples/roberta>`_.
 RobertaConfig
...
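The tip about separating segments with :obj:`tokenizer.sep_token` can be seen directly from the tokenizer. A small sketch (the example sentences are made up):

.. code-block:: python

    from transformers import RobertaTokenizer

    tokenizer = RobertaTokenizer.from_pretrained("roberta-base")

    # Encoding a pair of segments inserts the </s></s> separator between them automatically
    encoding = tokenizer("First segment.", "Second segment.")
    print(tokenizer.decode(encoding["input_ids"]))
    # -> <s>First segment.</s></s>Second segment.</s>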
@@ -25,7 +25,8 @@ transcripts/translations autoregressively. Speech2Text has been fine-tuned on se
 `LibriSpeech <http://www.openslr.org/12>`__, `CoVoST 2 <https://github.com/facebookresearch/covost>`__, `MuST-C
 <https://ict.fbk.eu/must-c/>`__.
-The original code can be found `here <https://github.com/pytorch/fairseq/tree/master/examples/speech_to_text>`__.
+This model was contributed by `valhalla <https://huggingface.co/valhalla>`__. The original code can be found `here
+<https://github.com/pytorch/fairseq/tree/master/examples/speech_to_text>`__.
 Inference
...
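For the Inference section that follows this hunk, a hedged sketch assuming the `facebook/s2t-small-librispeech-asr` checkpoint; the silent dummy waveform only stands in for a real 16kHz recording:

.. code-block:: python

    from transformers import Speech2TextForConditionalGeneration, Speech2TextProcessor

    # Assumption: checkpoint name; the processor bundles the feature extractor and the tokenizer
    model = Speech2TextForConditionalGeneration.from_pretrained("facebook/s2t-small-librispeech-asr")
    processor = Speech2TextProcessor.from_pretrained("facebook/s2t-small-librispeech-asr")

    speech = [0.0] * 16_000  # one second of silence as a stand-in for real audio
    inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")
    generated_ids = model.generate(input_ids=inputs["input_features"], attention_mask=inputs["attention_mask"])
    print(processor.batch_decode(generated_ids, skip_special_tokens=True))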
@@ -47,6 +47,9 @@ Tips:
 - For best results when finetuning on sequence classification tasks, it is recommended to start with the
   `squeezebert/squeezebert-mnli-headless` checkpoint.
+This model was contributed by `forresti <https://huggingface.co/forresti>`__.
 SqueezeBertConfig
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...
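Following the tip about starting from the headless MNLI checkpoint, a finetuning setup could begin roughly like this; the binary-classification `num_labels` value is an illustrative assumption:

.. code-block:: python

    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("squeezebert/squeezebert-mnli-headless")
    # "headless" means the MNLI classification head was removed, so a fresh head is
    # initialized for the downstream task (binary classification here, as an example)
    model = AutoModelForSequenceClassification.from_pretrained(
        "squeezebert/squeezebert-mnli-headless", num_labels=2
    )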
@@ -48,7 +48,8 @@ Tips:
   layers to the decoder and auto-regressively generates the decoder output. - T5 uses relative scalar embeddings.
   Encoder input padding can be done on the left and on the right.
-The original code can be found `here <https://github.com/google-research/text-to-text-transfer-transformer>`__.
+This model was contributed by `thomwolf <https://huggingface.co/thomwolf>`__. The original code can be found `here
+<https://github.com/google-research/text-to-text-transfer-transformer>`__.
 Training
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...
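For the Training and inference material that follows this hunk, a minimal text-to-text sketch; the `t5-small` checkpoint and the translation prompt are illustrative choices, not part of the diff:

.. code-block:: python

    from transformers import T5ForConditionalGeneration, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained("t5-small")
    model = T5ForConditionalGeneration.from_pretrained("t5-small")

    # Tasks are phrased as text prefixes; the decoder generates the output auto-regressively
    input_ids = tokenizer("translate English to German: The house is wonderful.", return_tensors="pt").input_ids
    outputs = model.generate(input_ids)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))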
@@ -49,7 +49,8 @@ entailment (a binary classification task). For more details, see their follow-up
 intermediate pre-training <https://www.aclweb.org/anthology/2020.findings-emnlp.27/>`__ by Julian Martin Eisenschlos,
 Syrine Krichene and Thomas Müller.
-The original code can be found `here <https://github.com/google-research/tapas>`__.
+This model was contributed by `nielsr <https://huggingface.co/nielsr>`__. The original code can be found `here
+<https://github.com/google-research/tapas>`__.
 Tips:
...
@@ -41,7 +41,8 @@ Tips:
   original implementation trains on SQuAD with padding on the left, therefore the padding defaults are set to left.
 - Transformer-XL is one of the few models that has no sequence length limit.
-The original code can be found `here <https://github.com/kimiyoung/transformer-xl>`__.
+This model was contributed by `thomwolf <https://huggingface.co/thomwolf>`__. The original code can be found `here
+<https://github.com/kimiyoung/transformer-xl>`__.
 TransfoXLConfig
...
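The left-padding default mentioned in the tips can be checked directly on the tokenizer. A small sketch, assuming the `transfo-xl-wt103` checkpoint:

.. code-block:: python

    from transformers import TransfoXLLMHeadModel, TransfoXLTokenizer

    tokenizer = TransfoXLTokenizer.from_pretrained("transfo-xl-wt103")
    model = TransfoXLLMHeadModel.from_pretrained("transfo-xl-wt103")

    # The tokenizer pads on the left by default, matching the original SQuAD training setup
    print(tokenizer.padding_side)  # "left"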
@@ -67,7 +67,8 @@ Tips:
   improvement of 2% to training from scratch, but still 4% behind supervised pre-training.
-The original code (written in JAX) can be found `here <https://github.com/google-research/vision_transformer>`__.
+This model was contributed by `nielsr <https://huggingface.co/nielsr>`__. The original code (written in JAX) can be
+found `here <https://github.com/google-research/vision_transformer>`__.
 Note that we converted the weights from Ross Wightman's `timm library
 <https://github.com/rwightman/pytorch-image-models>`__, who already converted the weights from JAX to PyTorch. Credits
...
@@ -36,6 +36,8 @@ Tips:
 - Wav2Vec2 model was trained using connectionist temporal classification (CTC) so the model output has to be decoded
   using :class:`~transformers.Wav2Vec2CTCTokenizer`.
+This model was contributed by `patrickvonplaten <https://huggingface.co/patrickvonplaten>`__.
 Wav2Vec2Config
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...
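A CTC decoding sketch for the tip above, assuming the `facebook/wav2vec2-base-960h` checkpoint and using :class:`~transformers.Wav2Vec2Processor`, which wraps the feature extractor and the CTC tokenizer; the silent dummy waveform is only a stand-in for real 16kHz audio:

.. code-block:: python

    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
    model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

    speech = [0.0] * 16_000  # one second of silence as a stand-in for a real recording
    input_values = processor(speech, sampling_rate=16_000, return_tensors="pt").input_values

    logits = model(input_values).logits
    predicted_ids = torch.argmax(logits, dim=-1)
    # CTC output: argmax over frames, then repeats and blanks are collapsed during decoding
    transcription = processor.batch_decode(predicted_ids)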
@@ -42,7 +42,8 @@ Tips:
 - XLM has multilingual checkpoints which leverage a specific :obj:`lang` parameter. Check out the :doc:`multi-lingual
   <../multilingual>` page for more information.
-The original code can be found `here <https://github.com/facebookresearch/XLM/>`__.
+This model was contributed by `thomwolf <https://huggingface.co/thomwolf>`__. The original code can be found `here
+<https://github.com/facebookresearch/XLM/>`__.
 XLMConfig
...
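To illustrate the language-id mechanism from the tip, a sketch with a causal LM checkpoint; the checkpoint name and prompt are illustrative:

.. code-block:: python

    import torch
    from transformers import XLMTokenizer, XLMWithLMHeadModel

    tokenizer = XLMTokenizer.from_pretrained("xlm-clm-enfr-1024")
    model = XLMWithLMHeadModel.from_pretrained("xlm-clm-enfr-1024")

    input_ids = torch.tensor([tokenizer.encode("Wikipedia was used to")])
    # Multilingual checkpoints expect a `langs` tensor marking the language of every token
    english_id = tokenizer.lang2id["en"]
    langs = torch.full_like(input_ids, english_id)
    outputs = model(input_ids, langs=langs)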
@@ -44,7 +44,8 @@ Tips:
 - This implementation is the same as RoBERTa. Refer to the :doc:`documentation of RoBERTa <roberta>` for usage examples
   as well as the information relative to the inputs and outputs.
-The original code can be found `here <https://github.com/pytorch/fairseq/tree/master/examples/xlmr>`__.
+This model was contributed by `stefan-it <https://huggingface.co/stefan-it>`__. The original code can be found `here
+<https://github.com/pytorch/fairseq/tree/master/examples/xlmr>`__.
 XLMRobertaConfig
...
@@ -44,7 +44,8 @@ Tips:
   `examples/text-generation/run_generation.py`)
 - XLNet is one of the few models that has no sequence length limit.
-The original code can be found `here <https://github.com/zihangdai/xlnet/>`__.
+This model was contributed by `thomwolf <https://huggingface.co/thomwolf>`__. The original code can be found `here
+<https://github.com/zihangdai/xlnet/>`__.
 XLNetConfig
...
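The left-padding behaviour relied on by the `run_generation.py` tip is the tokenizer's default and can be verified directly; a minimal check, assuming the `xlnet-base-cased` checkpoint:

.. code-block:: python

    from transformers import XLNetLMHeadModel, XLNetTokenizer

    tokenizer = XLNetTokenizer.from_pretrained("xlnet-base-cased")
    model = XLNetLMHeadModel.from_pretrained("xlnet-base-cased")

    # Open-ended generation works best with a padded prompt on the left,
    # which is what run_generation.py prepends and what the tokenizer defaults to
    print(tokenizer.padding_side)  # "left"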
@@ -27,6 +27,10 @@ Tips:
 <INSERT TIPS ABOUT MODEL HERE>
+This model was contributed by `<INSERT YOUR HF USERNAME HERE>
+<https://huggingface.co/<INSERT YOUR HF USERNAME HERE>>`__. The original code can be found `here
+<<INSERT LINK TO GITHUB REPO HERE>>`__.
 {{cookiecutter.camelcase_modelname}}Config
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...