Unverified Commit 74712e22 authored by Sylvain Gugger, committed by GitHub

Honor contributors to models (#11329)

* Honor contributors to models

* Fix typo

* Address review comments

* Add more authors
parent aad95c7c
@@ -43,7 +43,8 @@ Tips:
similar to a BERT-like architecture with the same number of hidden layers as it has to iterate through the same
number of (repeating) layers.

-The original code can be found `here <https://github.com/google-research/ALBERT>`__.
+This model was contributed by `lysandre <https://huggingface.co/lysandre>`__. The original code can be found `here
+<https://github.com/google-research/ALBERT>`__.

AlbertConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......
@@ -35,7 +35,8 @@ According to the abstract,
state-of-the-art results on a range of abstractive dialogue, question answering, and summarization tasks, with gains
of up to 6 ROUGE.

-The Authors' code can be found `here <https://github.com/pytorch/fairseq/tree/master/examples/bart>`__.
+This model was contributed by `sshleifer <https://huggingface.co/sshleifer>`__. The Authors' code can be found `here
+<https://github.com/pytorch/fairseq/tree/master/examples/bart>`__.

Examples
......
@@ -16,7 +16,7 @@ BARThez
Overview
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-The BARThez model was proposed in `BARThez: a Skilled Pretrained French Sequence-to-Sequence Model`
+The BARThez model was proposed in `BARThez: a Skilled Pretrained French Sequence-to-Sequence Model
<https://arxiv.org/abs/2010.12321>`__ by Moussa Kamal Eddine, Antoine J.-P. Tixier, Michalis Vazirgiannis on 23 Oct,
2020.
@@ -35,7 +35,8 @@ summarization dataset, OrangeSum, that we release with this paper. We also conti
pretrained multilingual BART on BARThez's corpus, and we show that the resulting model, which we call mBARTHez,
provides a significant boost over vanilla BARThez, and is on par with or outperforms CamemBERT and FlauBERT.*

-The Authors' code can be found `here <https://github.com/moussaKam/BARThez>`__.
+This model was contributed by `moussakam <https://huggingface.co/moussakam>`__. The Authors' code can be found `here
+<https://github.com/moussaKam/BARThez>`__.

Examples
......
@@ -42,7 +42,8 @@ Tips:
- BERT was trained with the masked language modeling (MLM) and next sentence prediction (NSP) objectives. It is
  efficient at predicting masked tokens and at NLU in general, but is not optimal for text generation.

-The original code can be found `here <https://github.com/google-research/bert>`__.
+This model was contributed by `thomwolf <https://huggingface.co/thomwolf>`__. The original code can be found `here
+<https://github.com/google-research/bert>`__.

BertConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......
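A minimal sketch of the masked-token prediction the tip above refers to; the checkpoint name is only a common example, not something this change prescribes:

.. code-block:: python

    from transformers import pipeline

    # Predict the [MASK] token with a pretrained BERT checkpoint (illustrative choice).
    unmasker = pipeline("fill-mask", model="bert-base-uncased")
    print(unmasker("Paris is the [MASK] of France."))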
@@ -71,6 +71,8 @@ Tips:
- This implementation is the same as BERT, except for tokenization method. Refer to the :doc:`documentation of BERT
  <bert>` for more usage examples.

+This model was contributed by `cl-tohoku <https://huggingface.co/cl-tohoku>`__.
+
BertJapaneseTokenizer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......
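To illustrate the tip above (same architecture as BERT, only the tokenization differs), a short sketch assuming the `cl-tohoku/bert-base-japanese` checkpoint and its MeCab dependency (`fugashi`):

.. code-block:: python

    from transformers import AutoModel, AutoTokenizer

    # The tokenizer applies MeCab word segmentation followed by WordPiece;
    # the model itself is used exactly like a regular BERT model.
    tokenizer = AutoTokenizer.from_pretrained("cl-tohoku/bert-base-japanese")
    model = AutoModel.from_pretrained("cl-tohoku/bert-base-japanese")

    inputs = tokenizer("吾輩は猫である。", return_tensors="pt")
    outputs = model(**inputs)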
@@ -79,7 +79,8 @@ Tips:
- For summarization, sentence splitting, sentence fusion and translation, no special tokens are required for the input.
  Therefore, no EOS token should be added to the end of the input.

-The original code can be found `here <https://tfhub.dev/s?module-type=text-generation&subtype=module,placeholder>`__.
+This model was contributed by `patrickvonplaten <https://huggingface.co/patrickvonplaten>`__. The original code can be
+found `here <https://tfhub.dev/s?module-type=text-generation&subtype=module,placeholder>`__.

BertGenerationConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......
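A hedged sketch of the no-EOS tip above; the encoder-decoder checkpoint `google/roberta2roberta_L-24_bbc` is assumed here purely for illustration:

.. code-block:: python

    from transformers import AutoTokenizer, EncoderDecoderModel

    tokenizer = AutoTokenizer.from_pretrained("google/roberta2roberta_L-24_bbc")
    model = EncoderDecoderModel.from_pretrained("google/roberta2roberta_L-24_bbc")

    # Encode without special tokens so no EOS is appended to the input.
    input_ids = tokenizer(
        "This is the first sentence. This is the second sentence.",
        add_special_tokens=False,
        return_tensors="pt",
    ).input_ids
    outputs = model.generate(input_ids)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))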
@@ -54,8 +54,8 @@ Example of use:
    >>> # from transformers import TFAutoModel
    >>> # bertweet = TFAutoModel.from_pretrained("vinai/bertweet-base")

-The original code can be found `here <https://github.com/VinAIResearch/BERTweet>`__.
+This model was contributed by `dqnguyen <https://huggingface.co/dqnguyen>`__. The original code can be found `here
+<https://github.com/VinAIResearch/BERTweet>`__.

BertweetTokenizer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......
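For completeness, a PyTorch counterpart of the commented-out TensorFlow lines in the excerpt above (a sketch, not part of the diff):

.. code-block:: python

    import torch
    from transformers import AutoModel, AutoTokenizer

    bertweet = AutoModel.from_pretrained("vinai/bertweet-base")
    tokenizer = AutoTokenizer.from_pretrained("vinai/bertweet-base")

    # Encode a tweet-like input and extract its features.
    line = "SC has first two presumptive cases of coronavirus, DHEC confirms HTTPURL via @USER"
    input_ids = torch.tensor([tokenizer.encode(line)])
    with torch.no_grad():
        features = bertweet(input_ids)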
@@ -50,7 +50,8 @@ Tips:
- Current implementation supports only **ITC**.
- Current implementation doesn't support **num_random_blocks = 0**

-The original code can be found `here <https://github.com/google-research/bigbird>`__.
+This model was contributed by `vasudevgupta <https://huggingface.co/vasudevgupta>`__. The original code can be found
+`here <https://github.com/google-research/bigbird>`__.

BigBirdConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......
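The tips above constrain the sparse-attention settings; a small configuration sketch consistent with them (hyper-parameter values are illustrative):

.. code-block:: python

    from transformers import BigBirdConfig, BigBirdModel

    # Block-sparse (ITC) attention; num_random_blocks must stay > 0.
    config = BigBirdConfig(attention_type="block_sparse", block_size=64, num_random_blocks=3)
    model = BigBirdModel(config)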
@@ -36,7 +36,8 @@ and code publicly available. Human evaluations show our best models are superior
dialogue in terms of engagingness and humanness measurements. We then discuss the limitations of this work by analyzing
failure cases of our models.*

-The authors' code can be found `here <https://github.com/facebookresearch/ParlAI>`__ .
+This model was contributed by `sshleifer <https://huggingface.co/sshleifer>`__. The authors' code can be found `here
+<https://github.com/facebookresearch/ParlAI>`__ .

Implementation Notes
......
@@ -39,7 +39,8 @@ and code publicly available. Human evaluations show our best models are superior
dialogue in terms of engagingness and humanness measurements. We then discuss the limitations of this work by analyzing
failure cases of our models.*

-The authors' code can be found `here <https://github.com/facebookresearch/ParlAI>`__ .
+This model was contributed by `patrickvonplaten <https://huggingface.co/patrickvonplaten>`__. The authors' code can be
+found `here <https://github.com/facebookresearch/ParlAI>`__ .

BlenderbotSmallConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......
@@ -43,4 +43,5 @@ Tips:
that is sadly not open-sourced yet. It would be very useful for the community, if someone tries to implement the
algorithm to make BORT fine-tuning work.

-The original code can be found `here <https://github.com/alexa/bort/>`__.
+This model was contributed by `stefan-it <https://huggingface.co/stefan-it>`__. The original code can be found `here
+<https://github.com/alexa/bort/>`__.
@@ -37,7 +37,8 @@ Tips:
- This implementation is the same as RoBERTa. Refer to the :doc:`documentation of RoBERTa <roberta>` for usage examples
  as well as the information relative to the inputs and outputs.

-The original code can be found `here <https://camembert-model.fr/>`__.
+This model was contributed by `camembert <https://huggingface.co/camembert>`__. The original code can be found `here
+<https://camembert-model.fr/>`__.

CamembertConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......
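Since the implementation mirrors RoBERTa (see the tip above), usage only differs in the checkpoint; a short sketch with the reference `camembert-base` checkpoint:

.. code-block:: python

    from transformers import CamembertForMaskedLM, CamembertTokenizer

    tokenizer = CamembertTokenizer.from_pretrained("camembert-base")
    model = CamembertForMaskedLM.from_pretrained("camembert-base")

    # Same masked-LM interface as RoBERTa, with the RoBERTa-style <mask> token.
    inputs = tokenizer("Le camembert est <mask> !", return_tensors="pt")
    outputs = model(**inputs)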
@@ -34,8 +34,10 @@ ConvBERT significantly outperforms BERT and its variants in various downstream t
fewer model parameters. Remarkably, ConvBERTbase model achieves 86.4 GLUE score, 0.7 higher than ELECTRAbase, while
using less than 1/4 training cost. Code and pre-trained models will be released.*

-ConvBERT training tips are similar to those of BERT. The original implementation can be found here:
-https://github.com/yitu-opensource/ConvBert
+ConvBERT training tips are similar to those of BERT.
+
+This model was contributed by `abhishek <https://huggingface.co/abhishek>`__. The original implementation can be found
+here: https://github.com/yitu-opensource/ConvBert

ConvBertConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......
@@ -33,7 +33,8 @@ language model, which could facilitate several downstream Chinese NLP tasks, suc
cloze test, and language understanding. Extensive experiments demonstrate that CPM achieves strong performance on many
NLP tasks in the settings of few-shot (even zero-shot) learning.*

-The original implementation can be found here: https://github.com/TsinghuaAI/CPM-Generate
+This model was contributed by `canwenxu <https://huggingface.co/canwenxu>`__. The original implementation can be found
+here: https://github.com/TsinghuaAI/CPM-Generate

Note: We only have a tokenizer here, since the model architecture is the same as GPT-2.
......
@@ -46,7 +46,8 @@ Tips:
`reusing the past in generative models <../quickstart.html#using-the-past>`__ for more information on the usage of
this argument.

-The original code can be found `here <https://github.com/salesforce/ctrl>`__.
+This model was contributed by `keskarnitishr <https://huggingface.co/keskarnitishr>`__. The original code can be found
+`here <https://github.com/salesforce/ctrl>`__.

CTRLConfig
......
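A sketch of generation with cached past states, as the tip above discusses; the checkpoint identifier and the control code prefix are illustrative assumptions:

.. code-block:: python

    from transformers import CTRLLMHeadModel, CTRLTokenizer

    tokenizer = CTRLTokenizer.from_pretrained("ctrl")
    model = CTRLLMHeadModel.from_pretrained("ctrl")

    # CTRL prompts typically start with a control code such as "Books" or "Links";
    # generate() reuses the cached key/value states internally when use_cache=True.
    inputs = tokenizer("Books Once upon a time,", return_tensors="pt")
    outputs = model.generate(**inputs, max_length=50, use_cache=True)
    print(tokenizer.decode(outputs[0]))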
@@ -38,7 +38,8 @@ the training data performs consistently better on a wide range of NLP tasks, ach
pre-trained models will be made publicly available at https://github.com/microsoft/DeBERTa.*

-The original code can be found `here <https://github.com/microsoft/DeBERTa>`__.
+This model was contributed by `DeBERTa <https://huggingface.co/DeBERTa>`__. The original code can be found `here
+<https://github.com/microsoft/DeBERTa>`__.

DebertaConfig
......
@@ -58,7 +58,8 @@ New in v2:
- **900M model & 1.5B model** Two additional model sizes are available: 900M and 1.5B, which significantly improves the
  performance of downstream tasks.

-The original code can be found `here <https://github.com/microsoft/DeBERTa>`__.
+This model was contributed by `DeBERTa <https://huggingface.co/DeBERTa>`__. The original code can be found `here
+<https://github.com/microsoft/DeBERTa>`__.

DebertaV2Config
......
@@ -73,6 +73,8 @@ Tips:
`facebook/deit-base-patch16-384`. Note that one should use :class:`~transformers.DeiTFeatureExtractor` in order to
prepare images for the model.

+This model was contributed by `nielsr <https://huggingface.co/nielsr>`__.
+
DeiTConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......
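A sketch of the image-preparation step the tip above refers to; the distilled checkpoint and the test image URL are illustrative choices:

.. code-block:: python

    import requests
    from PIL import Image
    from transformers import DeiTFeatureExtractor, DeiTForImageClassificationWithTeacher

    url = "http://images.cocodataset.org/val2017/000000039769.jpg"
    image = Image.open(requests.get(url, stream=True).raw)

    # The feature extractor resizes and normalizes the image into pixel_values.
    feature_extractor = DeiTFeatureExtractor.from_pretrained("facebook/deit-base-distilled-patch16-224")
    model = DeiTForImageClassificationWithTeacher.from_pretrained("facebook/deit-base-distilled-patch16-224")

    inputs = feature_extractor(images=image, return_tensors="pt")
    outputs = model(**inputs)
    print(model.config.id2label[outputs.logits.argmax(-1).item()])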
@@ -44,7 +44,7 @@ Tips:
- DistilBERT doesn't have options to select the input positions (:obj:`position_ids` input). This could be added if
  necessary though, just let us know if you need this option.

-The original code can be found `here
+This model was contributed by `victorsanh <https://huggingface.co/victorsanh>`__. The original code can be found `here
<https://github.com/huggingface/transformers/tree/master/examples/distillation>`__.
......
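As the tip above notes, there is no :obj:`position_ids` argument; a minimal forward pass therefore only needs :obj:`input_ids` and :obj:`attention_mask`:

.. code-block:: python

    from transformers import DistilBertModel, DistilBertTokenizer

    tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")
    model = DistilBertModel.from_pretrained("distilbert-base-uncased")

    inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
    # No position_ids here: positions always default to the 0..n-1 range.
    outputs = model(input_ids=inputs["input_ids"], attention_mask=inputs["attention_mask"])
    last_hidden_state = outputs.last_hidden_state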
@@ -30,7 +30,8 @@ our dense retriever outperforms a strong Lucene-BM25 system largely by 9%-19% ab
retrieval accuracy, and helps our end-to-end QA system establish new state-of-the-art on multiple open-domain QA
benchmarks.*

-The original code can be found `here <https://github.com/facebookresearch/DPR>`__.
+This model was contributed by `lhoestq <https://huggingface.co/lhoestq>`__. The original code can be found `here
+<https://github.com/facebookresearch/DPR>`__.

DPRConfig
......