chenpangpang / transformers · Commits

Commit 56d5d160 (unverified), authored Jun 05, 2020 by Sylvain Gugger, committed by GitHub on Jun 05, 2020

Add model and doc badges (#4811)

* Add badges for models and docs

parent 4ab74245

Changes: 1 changed file, docs/source/summary.rst, with 165 additions and 40 deletions (+165, -40)

@@ -50,6 +50,15 @@ that at each position, the model can only look at the tokens before in the atten
Original GPT
----------------------------------------------

.. raw:: html

    <a href="https://huggingface.co/models?filter=openai-gpt">
        <img alt="Models" src="https://img.shields.io/badge/All_model_pages-openai--gpt-blueviolet">
    </a>
    <a href="/model_doc/gpt">
        <img alt="Doc" src="https://img.shields.io/badge/Model_documentation-openai--gpt-blueviolet">
    </a>

`Improving Language Understanding by Generative Pre-Training <https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf>`_,
Alec Radford et al.

@@ -58,11 +67,18 @@ The first autoregressive model based on the transformer architecture, pretrained
The library provides versions of the model for language modeling and multitask language modeling/multiple choice
classification. More information in this :doc:`model documentation </model_doc/gpt>`.
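
For instance, a minimal sketch of the language-modeling version (``openai-gpt`` is the pretrained checkpoint
distributed with the library; passing the inputs as ``labels`` is one way to get the causal LM loss):

.. code-block:: python

    from transformers import OpenAIGPTTokenizer, OpenAIGPTLMHeadModel

    tokenizer = OpenAIGPTTokenizer.from_pretrained("openai-gpt")
    model = OpenAIGPTLMHeadModel.from_pretrained("openai-gpt")

    input_ids = tokenizer.encode("Machine learning is", return_tensors="pt")
    # With labels, the first element of the output tuple is the LM loss.
    loss = model(input_ids, labels=input_ids)[0]
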
GPT-2
----------------------------------------------

.. raw:: html

    <a href="https://huggingface.co/models?filter=gpt2">
        <img alt="Models" src="https://img.shields.io/badge/All_model_pages-gpt2-blueviolet">
    </a>
    <a href="/model_doc/gpt2">
        <img alt="Doc" src="https://img.shields.io/badge/Model_documentation-gpt2-blueviolet">
    </a>

`Language Models are Unsupervised Multitask Learners <https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf>`_,
Alec Radford et al.

@@ -72,11 +88,18 @@ more).
The library provides versions of the model for language modeling and multitask language modeling/multiple choice
classification. More information in this :doc:`model documentation </model_doc/gpt2>`.
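
A sketch of autoregressive generation with the language-modeling head (prompt and sampling settings here are
arbitrary):

.. code-block:: python

    from transformers import GPT2Tokenizer, GPT2LMHeadModel

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    input_ids = tokenizer.encode("In a shocking finding,", return_tensors="pt")
    # Each generated token only attends to the tokens before it.
    generated = model.generate(input_ids, max_length=30, do_sample=True, top_k=50)
    print(tokenizer.decode(generated[0]))
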
CTRL
----------------------------------------------
.. raw:: html

    <a href="https://huggingface.co/models?filter=ctrl">
        <img alt="Models" src="https://img.shields.io/badge/All_model_pages-ctrl-blueviolet">
    </a>
    <a href="/model_doc/ctrl">
        <img alt="Doc" src="https://img.shields.io/badge/Model_documentation-ctrl-blueviolet">
    </a>

`CTRL: A Conditional Transformer Language Model for Controllable Generation <https://arxiv.org/abs/1909.05858>`_,
Nitish Shirish Keskar et al.

@@ -86,11 +109,18 @@ wikipedia article, a book or a movie review.
The library provides a version of the model for language modeling only. More information in this
:doc:`model documentation </model_doc/ctrl>`.
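
A sketch of controlled generation; ``Books`` is one of the control codes from the paper, prepended to the prompt to
steer the style of the output:

.. code-block:: python

    from transformers import CTRLTokenizer, CTRLLMHeadModel

    tokenizer = CTRLTokenizer.from_pretrained("ctrl")
    model = CTRLLMHeadModel.from_pretrained("ctrl")

    # The leading control code conditions the generation on a domain/style.
    input_ids = tokenizer.encode("Books Once upon a time", return_tensors="pt")
    generated = model.generate(input_ids, max_length=30, repetition_penalty=1.2)
    print(tokenizer.decode(generated[0]))
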
Transformer-XL
----------------------------------------------

.. raw:: html

    <a href="https://huggingface.co/models?filter=transfo-xl">
        <img alt="Models" src="https://img.shields.io/badge/All_model_pages-transfo--xl-blueviolet">
    </a>
    <a href="/model_doc/transformerxl">
        <img alt="Doc" src="https://img.shields.io/badge/Model_documentation-transfo--xl-blueviolet">
    </a>

`Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context <https://arxiv.org/abs/1901.02860>`_,
Zihang Dai et al.

@@ -108,13 +138,20 @@ adjustments in the way attention scores are computed.
The library provides a version of the model for language modeling only. More information in this
:doc:`model documentation </model_doc/transformerxl>`.
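
A sketch of the recurrence mechanism: the ``mems`` returned for one segment are fed back for the next, so the second
segment can attend to the first (the output tuple layout assumed here follows the library at the time of writing):

.. code-block:: python

    from transformers import TransfoXLTokenizer, TransfoXLLMHeadModel

    tokenizer = TransfoXLTokenizer.from_pretrained("transfo-xl-wt103")
    model = TransfoXLLMHeadModel.from_pretrained("transfo-xl-wt103")

    first = tokenizer.encode("The quick brown fox", return_tensors="pt")
    # Without labels the model returns (prediction_scores, mems, ...).
    mems = model(first)[1]
    second = tokenizer.encode("jumps over the lazy dog", return_tensors="pt")
    # The cached hidden states give the second segment a longer context.
    outputs = model(second, mems=mems)
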
.. _reformer:

Reformer
----------------------------------------------

.. raw:: html

    <a href="https://huggingface.co/models?filter=reformer">
        <img alt="Models" src="https://img.shields.io/badge/All_model_pages-reformer-blueviolet">
    </a>
    <a href="/model_doc/reformer">
        <img alt="Doc" src="https://img.shields.io/badge/Model_documentation-reformer-blueviolet">
    </a>

`Reformer: The Efficient Transformer <https://arxiv.org/abs/2001.04451>`_,
Nikita Kitaev et al.

@@ -138,11 +175,18 @@ pretraining yet, though.
The library provides a version of the model for language modeling only.
More information in this :doc:`model documentation </model_doc/reformer>`.
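
A minimal sketch with the publicly released checkpoint trained on *Crime and Punishment* (the checkpoint name is the
one published with the model; the efficient attention is what makes long generations feasible):

.. code-block:: python

    from transformers import ReformerTokenizer, ReformerModelWithLMHead

    name = "google/reformer-crime-and-punishment"
    tokenizer = ReformerTokenizer.from_pretrained(name)
    model = ReformerModelWithLMHead.from_pretrained(name)

    input_ids = tokenizer.encode("A few months later", return_tensors="pt")
    generated = model.generate(input_ids, max_length=100, do_sample=True)
    print(tokenizer.decode(generated[0]))
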
XLNet
----------------------------------------------
.. raw:: html

    <a href="https://huggingface.co/models?filter=xlnet">
        <img alt="Models" src="https://img.shields.io/badge/All_model_pages-xlnet-blueviolet">
    </a>
    <a href="/model_doc/xlnet">
        <img alt="Doc" src="https://img.shields.io/badge/Model_documentation-xlnet-blueviolet">
    </a>

`XLNet: Generalized Autoregressive Pretraining for Language Understanding <https://arxiv.org/abs/1906.08237>`_,
Zhilin Yang et al.
@@ -156,20 +200,27 @@ XLNet also uses the same recurrence mechanism as TransformerXL to build long-ter
The library provides a version of the model for language modeling, token classification, sentence classification,
multiple choice classification and question answering.
More information in this :doc:`model documentation </model_doc/xlnet>`.
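
For example, the sentence-classification version wraps the base model with a classification head; that head is freshly
initialized, so it only produces meaningful scores after fine-tuning (a sketch):

.. code-block:: python

    import torch
    from transformers import XLNetTokenizer, XLNetForSequenceClassification

    tokenizer = XLNetTokenizer.from_pretrained("xlnet-base-cased")
    model = XLNetForSequenceClassification.from_pretrained("xlnet-base-cased")

    input_ids = tokenizer.encode("This movie was great!", return_tensors="pt")
    logits = model(input_ids)[0]          # one score per class (untrained head)
    print(torch.argmax(logits, dim=-1))
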
.. _autoencoding-models:
Autoencoding models
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
As mentioned before, these models rely on the encoder part of the original transformer and use no mask so the model can
look at all the tokens in the attention heads. For pretraining, inputs are a corrupted version of the sentence, usually
obtained by masking tokens, and targets are the original sentences.
BERT
----------------------------------------------
.. raw:: html

    <a href="https://huggingface.co/models?filter=bert">
        <img alt="Models" src="https://img.shields.io/badge/All_model_pages-bert-blueviolet">
    </a>
    <a href="/model_doc/bert">
        <img alt="Doc" src="https://img.shields.io/badge/Model_documentation-bert-blueviolet">
    </a>

`BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding <https://arxiv.org/abs/1810.04805>`_,
Jacob Devlin et al.
@@ -187,11 +238,18 @@ they are not related. The model has to predict if the sentences are consecutive
The library provides a version of the model for language modeling (traditional or masked), next sentence prediction,
token classification, sentence classification, multiple choice classification and question answering.
More information in this :doc:`model documentation </model_doc/bert>`.
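
A sketch of the masked language modeling version filling in a masked token:

.. code-block:: python

    import torch
    from transformers import BertTokenizer, BertForMaskedLM

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertForMaskedLM.from_pretrained("bert-base-uncased")

    input_ids = tokenizer.encode("The capital of France is [MASK].", return_tensors="pt")
    logits = model(input_ids)[0]
    # Find the masked position and take the highest-scoring token for it.
    mask_index = (input_ids[0] == tokenizer.mask_token_id).nonzero().item()
    predicted_id = logits[0, mask_index].argmax().item()
    print(tokenizer.convert_ids_to_tokens(predicted_id))
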
ALBERT
----------------------------------------------
.. raw:: html

    <a href="https://huggingface.co/models?filter=albert">
        <img alt="Models" src="https://img.shields.io/badge/All_model_pages-albert-blueviolet">
    </a>
    <a href="/model_doc/albert">
        <img alt="Doc" src="https://img.shields.io/badge/Model_documentation-albert-blueviolet">
    </a>

`ALBERT: A Lite BERT for Self-supervised Learning of Language Representations <https://arxiv.org/abs/1909.11942>`_,
Zhenzhong Lan et al.
@@ -209,11 +267,18 @@ Same as BERT but with a few tweaks:
The library provides a version of the model for masked language modeling, token classification, sentence
classification, multiple choice classification and question answering.
More information in this :doc:`model documentation </model_doc/albert>`.
RoBERTa
----------------------------------------------
.. raw:: html

    <a href="https://huggingface.co/models?filter=roberta">
        <img alt="Models" src="https://img.shields.io/badge/All_model_pages-roberta-blueviolet">
    </a>
    <a href="/model_doc/roberta">
        <img alt="Doc" src="https://img.shields.io/badge/Model_documentation-roberta-blueviolet">
    </a>

`RoBERTa: A Robustly Optimized BERT Pretraining Approach <https://arxiv.org/abs/1907.11692>`_,
Yinhan Liu et al.
@@ -228,11 +293,18 @@ Same as BERT with better pretraining tricks:
The library provides a version of the model for masked language modeling, token classification, sentence
classification, multiple choice classification and question answering.
More information in this :doc:`model documentation </model_doc/roberta>`.
DistilBERT
----------------------------------------------
.. raw:: html

    <a href="https://huggingface.co/models?filter=distilbert">
        <img alt="Models" src="https://img.shields.io/badge/All_model_pages-distilbert-blueviolet">
    </a>
    <a href="/model_doc/distilbert">
        <img alt="Doc" src="https://img.shields.io/badge/Model_documentation-distilbert-blueviolet">
    </a>

`DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter <https://arxiv.org/abs/1910.01108>`_,
Victor Sanh et al.
@@ -246,11 +318,18 @@ the same probabilities as the larger model. The actual objective is a combinatio
The library provides a version of the model for masked language modeling, token classification, sentence
classification and question answering. More information in this :doc:`model documentation </model_doc/distilbert>`.
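
For question answering, a sketch with a checkpoint already fine-tuned on SQuAD (the checkpoint name is the distilled
SQuAD model published alongside DistilBERT):

.. code-block:: python

    import torch
    from transformers import DistilBertTokenizer, DistilBertForQuestionAnswering

    name = "distilbert-base-uncased-distilled-squad"
    tokenizer = DistilBertTokenizer.from_pretrained(name)
    model = DistilBertForQuestionAnswering.from_pretrained(name)

    question = "Who wrote BERT?"
    context = "BERT was written by researchers at Google."
    input_ids = tokenizer.encode(question, context, return_tensors="pt")
    # The model scores every position as a possible answer start and end.
    start_scores, end_scores = model(input_ids)
    start, end = torch.argmax(start_scores), torch.argmax(end_scores)
    print(tokenizer.decode(input_ids[0, start:end + 1]))
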
XLM
----------------------------------------------
.. raw:: html

    <a href="https://huggingface.co/models?filter=xlm">
        <img alt="Models" src="https://img.shields.io/badge/All_model_pages-xlm-blueviolet">
    </a>
    <a href="/model_doc/xlm">
        <img alt="Doc" src="https://img.shields.io/badge/Model_documentation-xlm-blueviolet">
    </a>

`Cross-lingual Language Model Pretraining <https://arxiv.org/abs/1901.07291>`_,
Guillaume Lample and Alexis Conneau

A transformer model trained on several languages. There are three different types of training for this model and the
@@ -274,11 +353,18 @@ language.
The library provides a version of the model for language modeling, token classification, sentence classification and
question answering. More information in this :doc:`model documentation </model_doc/xlm>`.
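
For checkpoints that use language embeddings, a language id has to be supplied for every position; the XLM tokenizer
exposes the mapping as ``lang2id`` (a sketch with a causal-LM checkpoint):

.. code-block:: python

    import torch
    from transformers import XLMTokenizer, XLMWithLMHeadModel

    tokenizer = XLMTokenizer.from_pretrained("xlm-clm-enfr-1024")
    model = XLMWithLMHeadModel.from_pretrained("xlm-clm-enfr-1024")

    input_ids = tokenizer.encode("Wikipedia was used to", return_tensors="pt")
    # One language id per position selects the language embedding.
    langs = torch.full_like(input_ids, tokenizer.lang2id["en"])
    outputs = model(input_ids, langs=langs)
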
XLM-RoBERTa
----------------------------------------------

.. raw:: html

    <a href="https://huggingface.co/models?filter=xlm-roberta">
        <img alt="Models" src="https://img.shields.io/badge/All_model_pages-xlm--roberta-blueviolet">
    </a>
    <a href="/model_doc/xlmroberta">
        <img alt="Doc" src="https://img.shields.io/badge/Model_documentation-xlm--roberta-blueviolet">
    </a>

`Unsupervised Cross-lingual Representation Learning at Scale <https://arxiv.org/abs/1911.02116>`_,
Alexis Conneau et al.

@@ -289,22 +375,36 @@ masked language modeling on sentences coming from one language. However, the mod
The library provides a version of the model for masked language modeling, token classification, sentence
classification, multiple choice classification and question answering. More information in this
:doc:`model documentation </model_doc/xlmroberta>`.
FlauBERT
----------------------------------------------
.. raw:: html

    <a href="https://huggingface.co/models?filter=flaubert">
        <img alt="Models" src="https://img.shields.io/badge/All_model_pages-flaubert-blueviolet">
    </a>
    <a href="/model_doc/flaubert">
        <img alt="Doc" src="https://img.shields.io/badge/Model_documentation-flaubert-blueviolet">
    </a>

`FlauBERT: Unsupervised Language Model Pre-training for French <https://arxiv.org/abs/1912.05372>`_,
Hang Le et al.

Like RoBERTa, without the sentence ordering prediction (so just trained on the MLM objective).
The library provides a version of the model for language modeling and sentence classification. More information in
this :doc:`model documentation </model_doc/flaubert>`.
ELECTRA
----------------------------------------------
.. raw:: html

    <a href="https://huggingface.co/models?filter=electra">
        <img alt="Models" src="https://img.shields.io/badge/All_model_pages-electra-blueviolet">
    </a>
    <a href="/model_doc/electra">
        <img alt="Doc" src="https://img.shields.io/badge/Model_documentation-electra-blueviolet">
    </a>

`ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators <https://arxiv.org/abs/2003.10555>`_,
Kevin Clark et al.

@@ -317,13 +417,20 @@ traditional GAN setting) then the ELECTRA model is trained for a few steps.
The library provides a version of the model for masked language modeling, token classification and sentence
classification. More information in this :doc:`model documentation </model_doc/electra>`.
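
Beyond those heads, the pretraining discriminator itself is exposed; a sketch of it flagging which tokens look
replaced (the checkpoint name is the small discriminator published by the authors; positive logits mean "replaced"):

.. code-block:: python

    from transformers import ElectraTokenizer, ElectraForPreTraining

    name = "google/electra-small-discriminator"
    tokenizer = ElectraTokenizer.from_pretrained(name)
    model = ElectraForPreTraining.from_pretrained(name)

    # "ate" is an implausible token here, so the discriminator should flag it.
    input_ids = tokenizer.encode("The quick brown fox ate over the lazy dog", return_tensors="pt")
    scores = model(input_ids)[0]
    print((scores > 0).long())  # 1 = token predicted as replaced
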
.. _longformer:

Longformer
----------------------------------------------

.. raw:: html

    <a href="https://huggingface.co/models?filter=longformer">
        <img alt="Models" src="https://img.shields.io/badge/All_model_pages-longformer-blueviolet">
    </a>
    <a href="/model_doc/longformer">
        <img alt="Doc" src="https://img.shields.io/badge/Model_documentation-longformer-blueviolet">
    </a>

`Longformer: The Long-Document Transformer <https://arxiv.org/abs/2004.05150>`_,
Iz Beltagy et al.

A transformer model replacing the attention matrices by sparse matrices to go faster. Often, the local context (e.g.,
@@ -339,9 +446,6 @@ pretraining yet, though.
The library provides a version of the model for masked language modeling, token classification, sentence
classification, multiple choice classification and question answering. More information in this
:doc:`model documentation </model_doc/longformer>`.
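
A sketch showing why the sparse attention matters: an input of up to 4096 tokens fits where a dense transformer of the
same size would not (checkpoint name as published by AllenAI):

.. code-block:: python

    from transformers import LongformerTokenizer, LongformerForMaskedLM

    name = "allenai/longformer-base-4096"
    tokenizer = LongformerTokenizer.from_pretrained(name)
    model = LongformerForMaskedLM.from_pretrained(name)

    # Windowed (local) attention keeps memory roughly linear in sequence length.
    long_text = " ".join(["All work and no play makes Jack a dull boy."] * 300)
    input_ids = tokenizer.encode(long_text, return_tensors="pt", max_length=4096)
    outputs = model(input_ids)
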
.. _seq-to-seq-models:

Sequence-to-sequence models
@@ -352,8 +456,17 @@ As mentioned before, these models keep both the encoder and the decoder of the o
BART
----------------------------------------------
.. raw:: html

    <a href="https://huggingface.co/models?filter=bart">
        <img alt="Models" src="https://img.shields.io/badge/All_model_pages-bart-blueviolet">
    </a>
    <a href="/model_doc/bart">
        <img alt="Doc" src="https://img.shields.io/badge/Model_documentation-bart-blueviolet">
    </a>

`BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension <https://arxiv.org/abs/1910.13461>`_,
Mike Lewis et al.

Sequence-to-sequence model with an encoder and a decoder. The encoder is fed a corrupted version of the tokens, the
decoder is fed the original tokens (but has a mask to hide the future words, like a regular transformer decoder). For
the encoder, on the
@@ -367,22 +480,36 @@ pretraining tasks, a composition of the following transformations are applied:
The library provides a version of this model for conditional generation and sequence classification. More information
in this :doc:`model documentation </model_doc/bart>`.
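
For instance, summarization is just conditional generation with a fine-tuned checkpoint
(``facebook/bart-large-cnn`` is the CNN/DailyMail summarization weight published with the model):

.. code-block:: python

    from transformers import BartTokenizer, BartForConditionalGeneration

    tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
    model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")

    article = "The tower is 324 metres tall, about the same height as an 81-storey building."
    input_ids = tokenizer.encode(article, return_tensors="pt")
    # Beam search over the decoder produces the summary.
    summary_ids = model.generate(input_ids, num_beams=4, max_length=40, early_stopping=True)
    print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
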
MarianMT
----------------------------------------------
.. raw:: html

    <a href="https://huggingface.co/models?filter=marian">
        <img alt="Models" src="https://img.shields.io/badge/All_model_pages-marian-blueviolet">
    </a>
    <a href="/model_doc/marian">
        <img alt="Doc" src="https://img.shields.io/badge/Model_documentation-marian-blueviolet">
    </a>

`Marian: Fast Neural Machine Translation in C++ <https://arxiv.org/abs/1804.00344>`_,
Marcin Junczys-Dowmunt et al.

A framework for translation models, using the same models as BART.
The library provides a version of this model for conditional generation. More information in this
:doc:`model documentation </model_doc/marian>`.
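
A translation sketch; there is one checkpoint per language pair, named after the pair
(``prepare_translation_batch`` is the Marian tokenizer helper at the time of writing):

.. code-block:: python

    from transformers import MarianTokenizer, MarianMTModel

    # This checkpoint translates English to German.
    name = "Helsinki-NLP/opus-mt-en-de"
    tokenizer = MarianTokenizer.from_pretrained(name)
    model = MarianMTModel.from_pretrained(name)

    batch = tokenizer.prepare_translation_batch(["Where is the bus stop?"])
    translated = model.generate(**batch)
    print(tokenizer.decode(translated[0], skip_special_tokens=True))
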
T5
----------------------------------------------
.. raw:: html

    <a href="https://huggingface.co/models?filter=t5">
        <img alt="Models" src="https://img.shields.io/badge/All_model_pages-t5-blueviolet">
    </a>
    <a href="/model_doc/t5">
        <img alt="Doc" src="https://img.shields.io/badge/Model_documentation-t5-blueviolet">
    </a>

`Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer <https://arxiv.org/abs/1910.10683>`_,
Colin Raffel et al.

@@ -403,8 +530,6 @@ input becomes “My <x> very <y> .” and the target is “<x> dog is <y> . <z>
The library provides a version of this model for conditional generation. More information in this
:doc:`model documentation </model_doc/t5>`.
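
Every task is cast as text-to-text, so the task is selected with a textual prefix (a sketch using the public
``t5-small`` weights):

.. code-block:: python

    from transformers import T5Tokenizer, T5ForConditionalGeneration

    tokenizer = T5Tokenizer.from_pretrained("t5-small")
    model = T5ForConditionalGeneration.from_pretrained("t5-small")

    # The prefix tells the model which task to perform.
    input_ids = tokenizer.encode("translate English to German: The house is wonderful.",
                                 return_tensors="pt")
    outputs = model.generate(input_ids)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
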
.. _multimodal-models:

Multimodal models