Unverified commit 74712e22 authored by Sylvain Gugger, committed by GitHub

Honor contributors to models (#11329)

* Honor contributors to models

* Fix typo

* Address review comments

* Add more authors
parent aad95c7c
@@ -43,7 +43,8 @@ Tips:
similar to a BERT-like architecture with the same number of hidden layers as it has to iterate through the same
number of (repeating) layers.
-The original code can be found `here <https://github.com/google-research/ALBERT>`__.
+This model was contributed by `lysandre <https://huggingface.co/lysandre>`__. The original code can be found `here
+<https://github.com/google-research/ALBERT>`__.
AlbertConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......
@@ -35,7 +35,8 @@ According to the abstract,
state-of-the-art results on a range of abstractive dialogue, question answering, and summarization tasks, with gains
of up to 6 ROUGE.
-The Authors' code can be found `here <https://github.com/pytorch/fairseq/tree/master/examples/bart>`__.
+This model was contributed by `sshleifer <https://huggingface.co/sshleifer>`__. The Authors' code can be found `here
+<https://github.com/pytorch/fairseq/tree/master/examples/bart>`__.
Examples
......
@@ -16,7 +16,7 @@ BARThez
Overview
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-The BARThez model was proposed in `BARThez: a Skilled Pretrained French Sequence-to-Sequence Model`
+The BARThez model was proposed in `BARThez: a Skilled Pretrained French Sequence-to-Sequence Model
<https://arxiv.org/abs/2010.12321>`__ by Moussa Kamal Eddine, Antoine J.-P. Tixier, Michalis Vazirgiannis on 23 Oct,
2020.
@@ -35,7 +35,8 @@ summarization dataset, OrangeSum, that we release with this paper. We also conti
pretrained multilingual BART on BARThez's corpus, and we show that the resulting model, which we call mBARTHez,
provides a significant boost over vanilla BARThez, and is on par with or outperforms CamemBERT and FlauBERT.*
-The Authors' code can be found `here <https://github.com/moussaKam/BARThez>`__.
+This model was contributed by `moussakam <https://huggingface.co/moussakam>`__. The Authors' code can be found `here
+<https://github.com/moussaKam/BARThez>`__.
Examples
......
@@ -42,7 +42,8 @@ Tips:
- BERT was trained with the masked language modeling (MLM) and next sentence prediction (NSP) objectives. It is
efficient at predicting masked tokens and at NLU in general, but is not optimal for text generation.
-The original code can be found `here <https://github.com/google-research/bert>`__.
+This model was contributed by `thomwolf <https://huggingface.co/thomwolf>`__. The original code can be found `here
+<https://github.com/google-research/bert>`__.
BertConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......
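As a quick illustration of the masked-token prediction mentioned in the BERT tips above, here is a minimal sketch using the ``fill-mask`` pipeline; the ``bert-base-uncased`` checkpoint is my choice for illustration and is not named in the hunk itself.

>>> from transformers import pipeline

>>> # BERT was pretrained with exactly this MLM objective: predict the token behind [MASK]
>>> unmasker = pipeline("fill-mask", model="bert-base-uncased")
>>> unmasker("Paris is the [MASK] of France.")  # "capital" is typically the top-ranked candidate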
@@ -71,6 +71,8 @@ Tips:
- This implementation is the same as BERT, except for tokenization method. Refer to the :doc:`documentation of BERT
<bert>` for more usage examples.
+This model was contributed by `cl-tohoku <https://huggingface.co/cl-tohoku>`__.
BertJapaneseTokenizer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......
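The BERT Japanese hunk notes that only the tokenization differs from BERT. A minimal usage sketch, assuming the ``cl-tohoku/bert-base-japanese`` checkpoint and that the MeCab dependencies (``fugashi``, ``ipadic``) are installed:

>>> from transformers import BertJapaneseTokenizer, BertModel

>>> # MeCab word segmentation followed by WordPiece is handled inside the tokenizer
>>> tokenizer = BertJapaneseTokenizer.from_pretrained("cl-tohoku/bert-base-japanese")
>>> model = BertModel.from_pretrained("cl-tohoku/bert-base-japanese")
>>> inputs = tokenizer("吾輩は猫である。", return_tensors="pt")
>>> outputs = model(**inputs)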
@@ -79,7 +79,8 @@ Tips:
- For summarization, sentence splitting, sentence fusion and translation, no special tokens are required for the input.
Therefore, no EOS token should be added to the end of the input.
-The original code can be found `here <https://tfhub.dev/s?module-type=text-generation&subtype=module,placeholder>`__.
+This model was contributed by `patrickvonplaten <https://huggingface.co/patrickvonplaten>`__. The original code can be
+found `here <https://tfhub.dev/s?module-type=text-generation&subtype=module,placeholder>`__.
BertGenerationConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......
@@ -54,8 +54,8 @@ Example of use:
>>> # from transformers import TFAutoModel
>>> # bertweet = TFAutoModel.from_pretrained("vinai/bertweet-base")
-The original code can be found `here <https://github.com/VinAIResearch/BERTweet>`__.
+This model was contributed by `dqnguyen <https://huggingface.co/dqnguyen>`__. The original code can be found `here
+<https://github.com/VinAIResearch/BERTweet>`__.
BertweetTokenizer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......
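The BERTweet hunk only shows the tail of the usage example (the commented-out TensorFlow variant). For completeness, a PyTorch sketch along the same lines, assuming the ``vinai/bertweet-base`` checkpoint and that ``AutoTokenizer`` resolves to the BERTweet tokenizer; the sample sentence is arbitrary:

>>> import torch
>>> from transformers import AutoModel, AutoTokenizer

>>> # PyTorch counterpart of the commented-out TF lines shown in the hunk above
>>> bertweet = AutoModel.from_pretrained("vinai/bertweet-base")
>>> tokenizer = AutoTokenizer.from_pretrained("vinai/bertweet-base")
>>> line = "SC has first two presumptive cases of coronavirus , DHEC confirms"
>>> input_ids = torch.tensor([tokenizer.encode(line)])
>>> with torch.no_grad():
...     features = bertweet(input_ids)  # features.last_hidden_state holds the contextual embeddings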
@@ -50,7 +50,8 @@ Tips:
- Current implementation supports only **ITC**.
- Current implementation doesn't support **num_random_blocks = 0**
-The original code can be found `here <https://github.com/google-research/bigbird>`__.
+This model was contributed by `vasudevgupta <https://huggingface.co/vasudevgupta>`__. The original code can be found
+`here <https://github.com/google-research/bigbird>`__.
BigBirdConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......
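The BigBird tips above constrain two configuration knobs (ITC attention only, ``num_random_blocks`` greater than 0). A hedged sketch of what that looks like in code, assuming the ``BigBirdConfig`` parameters ``attention_type``, ``block_size`` and ``num_random_blocks``:

>>> from transformers import BigBirdConfig, BigBirdModel

>>> # block-sparse attention (the ITC variant); num_random_blocks must stay strictly positive
>>> config = BigBirdConfig(attention_type="block_sparse", block_size=64, num_random_blocks=3)
>>> model = BigBirdModel(config)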
@@ -36,7 +36,8 @@ and code publicly available. Human evaluations show our best models are superior
dialogue in terms of engagingness and humanness measurements. We then discuss the limitations of this work by analyzing
failure cases of our models.*
-The authors' code can be found `here <https://github.com/facebookresearch/ParlAI>`__ .
+This model was contributed by `sshleifer <https://huggingface.co/sshleifer>`__. The authors' code can be found `here
+<https://github.com/facebookresearch/ParlAI>`__ .
Implementation Notes
......
@@ -39,7 +39,8 @@ and code publicly available. Human evaluations show our best models are superior
dialogue in terms of engagingness and humanness measurements. We then discuss the limitations of this work by analyzing
failure cases of our models.*
-The authors' code can be found `here <https://github.com/facebookresearch/ParlAI>`__ .
+This model was contributed by `patrickvonplaten <https://huggingface.co/patrickvonplaten>`__. The authors' code can be
+found `here <https://github.com/facebookresearch/ParlAI>`__ .
BlenderbotSmallConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......
@@ -43,4 +43,5 @@ Tips:
that is sadly not open-sourced yet. It would be very useful for the community, if someone tries to implement the
algorithm to make BORT fine-tuning work.
-The original code can be found `here <https://github.com/alexa/bort/>`__.
+This model was contributed by `stefan-it <https://huggingface.co/stefan-it>`__. The original code can be found `here
+<https://github.com/alexa/bort/>`__.
@@ -37,7 +37,8 @@ Tips:
- This implementation is the same as RoBERTa. Refer to the :doc:`documentation of RoBERTa <roberta>` for usage examples
as well as the information relative to the inputs and outputs.
-The original code can be found `here <https://camembert-model.fr/>`__.
+This model was contributed by `camembert <https://huggingface.co/camembert>`__. The original code can be found `here
+<https://camembert-model.fr/>`__.
CamembertConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......
@@ -34,8 +34,10 @@ ConvBERT significantly outperforms BERT and its variants in various downstream t
fewer model parameters. Remarkably, ConvBERTbase model achieves 86.4 GLUE score, 0.7 higher than ELECTRAbase, while
using less than 1/4 training cost. Code and pre-trained models will be released.*
-ConvBERT training tips are similar to those of BERT. The original implementation can be found here:
-https://github.com/yitu-opensource/ConvBert
+ConvBERT training tips are similar to those of BERT.
+This model was contributed by `abhishek <https://huggingface.co/abhishek>`__. The original implementation can be found
+here: https://github.com/yitu-opensource/ConvBert
ConvBertConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......
@@ -33,7 +33,8 @@ language model, which could facilitate several downstream Chinese NLP tasks, suc
cloze test, and language understanding. Extensive experiments demonstrate that CPM achieves strong performance on many
NLP tasks in the settings of few-shot (even zero-shot) learning.*
-The original implementation can be found here: https://github.com/TsinghuaAI/CPM-Generate
+This model was contributed by `canwenxu <https://huggingface.co/canwenxu>`__. The original implementation can be found
+here: https://github.com/TsinghuaAI/CPM-Generate
Note: We only have a tokenizer here, since the model architecture is the same as GPT-2.
......
@@ -46,7 +46,8 @@ Tips:
`reusing the past in generative models <../quickstart.html#using-the-past>`__ for more information on the usage of
this argument.
-The original code can be found `here <https://github.com/salesforce/ctrl>`__.
+This model was contributed by `keskarnitishr <https://huggingface.co/keskarnitishr>`__. The original code can be found
+`here <https://github.com/salesforce/ctrl>`__.
CTRLConfig
......
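The CTRL tip above points at reusing the past key/value states during generation. A rough sketch of that pattern, assuming ``CTRLLMHeadModel`` returns ``past_key_values`` when ``use_cache=True`` and accepts them back on the next forward pass, and that the original ``ctrl`` checkpoint is used:

>>> import torch
>>> from transformers import CTRLLMHeadModel, CTRLTokenizer

>>> tokenizer = CTRLTokenizer.from_pretrained("ctrl")
>>> model = CTRLLMHeadModel.from_pretrained("ctrl")
>>> inputs = tokenizer("Links Hello, my dog is cute", return_tensors="pt")  # "Links" is a CTRL control code
>>> out = model(**inputs, use_cache=True)
>>> past = out.past_key_values
>>> # Feed only the newly chosen token; the cached states stand in for the full prefix
>>> next_token = out.logits[:, -1, :].argmax(dim=-1, keepdim=True)
>>> out = model(input_ids=next_token, past_key_values=past, use_cache=True)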
@@ -38,7 +38,8 @@ the training data performs consistently better on a wide range of NLP tasks, ach
pre-trained models will be made publicly available at https://github.com/microsoft/DeBERTa.*
-The original code can be found `here <https://github.com/microsoft/DeBERTa>`__.
+This model was contributed by `DeBERTa <https://huggingface.co/DeBERTa>`__. The original code can be found `here
+<https://github.com/microsoft/DeBERTa>`__.
DebertaConfig
......
@@ -58,7 +58,8 @@ New in v2:
- **900M model & 1.5B model** Two additional model sizes are available: 900M and 1.5B, which significantly improves the
performance of downstream tasks.
-The original code can be found `here <https://github.com/microsoft/DeBERTa>`__.
+This model was contributed by `DeBERTa <https://huggingface.co/DeBERTa>`__. The original code can be found `here
+<https://github.com/microsoft/DeBERTa>`__.
DebertaV2Config
......
@@ -73,6 +73,8 @@ Tips:
`facebook/deit-base-patch16-384`. Note that one should use :class:`~transformers.DeiTFeatureExtractor` in order to
prepare images for the model.
+This model was contributed by `nielsr <https://huggingface.co/nielsr>`__.
DeiTConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......
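The DeiT tip above says images should be prepared with :class:`~transformers.DeiTFeatureExtractor`. A minimal classification sketch built on that; the distilled ``facebook/deit-base-distilled-patch16-224`` checkpoint, the ``DeiTForImageClassificationWithTeacher`` head, and the sample image URL are my choices for illustration, not named in the hunk:

>>> import requests
>>> from PIL import Image
>>> from transformers import DeiTFeatureExtractor, DeiTForImageClassificationWithTeacher

>>> image = Image.open(requests.get("http://images.cocodataset.org/val2017/000000039769.jpg", stream=True).raw)
>>> feature_extractor = DeiTFeatureExtractor.from_pretrained("facebook/deit-base-distilled-patch16-224")
>>> model = DeiTForImageClassificationWithTeacher.from_pretrained("facebook/deit-base-distilled-patch16-224")
>>> inputs = feature_extractor(images=image, return_tensors="pt")  # resizing + normalization, yields pixel_values
>>> logits = model(**inputs).logits
>>> print(model.config.id2label[logits.argmax(-1).item()])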
@@ -44,7 +44,7 @@ Tips:
- DistilBERT doesn't have options to select the input positions (:obj:`position_ids` input). This could be added if
necessary though, just let us know if you need this option.
-The original code can be found `here
+This model was contributed by `victorsanh <https://huggingface.co/victorsanh>`__. The original code can be found `here
<https://github.com/huggingface/transformers/tree/master/examples/distillation>`__.
......
@@ -30,7 +30,8 @@ our dense retriever outperforms a strong Lucene-BM25 system largely by 9%-19% ab
retrieval accuracy, and helps our end-to-end QA system establish new state-of-the-art on multiple open-domain QA
benchmarks.*
-The original code can be found `here <https://github.com/facebookresearch/DPR>`__.
+This model was contributed by `lhoestq <https://huggingface.co/lhoestq>`__. The original code can be found `here
+<https://github.com/facebookresearch/DPR>`__.
DPRConfig
......