Unverified Commit 68d92519 authored by AK391, committed by GitHub

Merge branch 'master' into master

parents 7480ded6 5cd7086f
...@@ -70,8 +70,12 @@ that at each position, the model can only look at the tokens before the attentio
<a href="model_doc/openai-gpt">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-openai--gpt-blueviolet">
</a>
<a href="https://huggingface.co/spaces/akhaliq/openai-gpt">
<img alt="Spaces" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue">
</a>
</div>
[Improving Language Understanding by Generative Pre-Training](https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf), Alec Radford et al.
The first autoregressive model based on the transformer architecture, pretrained on the Book Corpus dataset.
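As a minimal sketch of driving this checkpoint from the library (the prompt and generation length are illustrative choices, not part of this doc):

```python
from transformers import pipeline

# Load the original GPT checkpoint for autoregressive text generation.
generator = pipeline("text-generation", model="openai-gpt")

# The model extends the prompt one token at a time, left to right.
print(generator("The history of natural language processing", max_length=30)[0]["generated_text"])
```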
...@@ -88,8 +92,12 @@ classification.
<a href="model_doc/gpt2">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-gpt2-blueviolet">
</a>
<a href="https://huggingface.co/spaces/akhaliq/gpt2">
<img alt="Spaces" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue">
</a>
</div>
[Language Models are Unsupervised Multitask Learners](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf),
Alec Radford et al.
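A sketch of the same autoregressive use through the explicit model classes, assuming the base `gpt2` checkpoint (greedy decoding kept deliberately simple):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Encode a prompt and let the model extend it autoregressively.
inputs = tokenizer("Language models are unsupervised multitask learners because", return_tensors="pt")
outputs = model.generate(**inputs, max_length=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```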
...@@ -108,8 +116,12 @@ classification.
<a href="model_doc/ctrl">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-ctrl-blueviolet">
</a>
<a href="https://huggingface.co/spaces/akhaliq/tiny-ctrl">
<img alt="Spaces" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue">
</a>
</div>
[CTRL: A Conditional Transformer Language Model for Controllable Generation](https://arxiv.org/abs/1909.05858),
Nitish Shirish Keskar et al.
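CTRL conditions generation on a control code prepended to the prompt. A sketch assuming the full `ctrl` checkpoint (large, roughly 1.6B parameters) and the `Links` control code from the paper; both are assumptions relative to this diff:

```python
from transformers import pipeline

# CTRL steers generation with a control code placed at the start of the prompt.
generator = pipeline("text-generation", model="ctrl")

# "Links" is one of the control codes listed in the CTRL paper (assumed here).
print(generator("Links Hugging Face releases a new library", max_length=40)[0]["generated_text"])
```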
...@@ -128,8 +140,12 @@ The library provides a version of the model for language modeling only.
<a href="model_doc/transfo-xl">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-transfo--xl-blueviolet">
</a>
<a href="https://huggingface.co/spaces/akhaliq/transfo-xl-wt103">
<img alt="Spaces" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue">
</a>
</div>
[Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context](https://arxiv.org/abs/1901.02860), Zihang
Dai et al.
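A minimal sketch with the `transfo-xl-wt103` checkpoint (the one the Space above wraps); the segment-level memory that lets it look past a fixed window is handled inside the model:

```python
from transformers import pipeline

# Transformer-XL carries a memory of previous segments, so it can condition
# on context beyond a fixed window; here we simply sample a continuation.
generator = pipeline("text-generation", model="transfo-xl-wt103")
print(generator("The tower is 324 metres tall,", max_length=40)[0]["generated_text"])
```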
...@@ -158,8 +174,12 @@ The library provides a version of the model for language modeling only.
<a href="model_doc/reformer">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-reformer-blueviolet">
</a>
<a href="https://huggingface.co/spaces/akhaliq/reformer-crime-and-punishment">
<img alt="Spaces" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue">
</a>
</div>
[Reformer: The Efficient Transformer](https://arxiv.org/abs/2001.04451), Nikita Kitaev et al.
An autoregressive transformer model with lots of tricks to reduce memory footprint and compute time. Those tricks
...@@ -195,8 +215,12 @@ The library provides a version of the model for language modeling only.
<a href="model_doc/xlnet">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-xlnet-blueviolet">
</a>
<a href="https://huggingface.co/spaces/akhaliq/xlnet-base-cased">
<img alt="Spaces" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue">
</a>
</div>
[XLNet: Generalized Autoregressive Pretraining for Language Understanding](https://arxiv.org/abs/1906.08237), Zhilin
Yang et al.
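Although XLNet is pretrained with permutation language modeling, the library exposes it through the usual generation interface; a sketch assuming the `xlnet-base-cased` checkpoint:

```python
from transformers import pipeline

# The text-generation pipeline handles XLNet's need for a padded prefix
# internally, so generation looks the same as for other causal models.
generator = pipeline("text-generation", model="xlnet-base-cased")
print(generator("XLNet combines the best of", max_length=40)[0]["generated_text"])
```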
...@@ -229,6 +253,9 @@ corrupted versions.
<a href="model_doc/bert">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-bert-blueviolet">
</a>
<a href="https://huggingface.co/spaces/akhaliq/bert-base-uncased">
<img alt="Spaces" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue">
</a>
</div>
[BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/abs/1810.04805),
...@@ -257,8 +284,12 @@ token classification, sentence classification, multiple choice classification an
<a href="model_doc/albert">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-albert-blueviolet">
</a>
<a href="https://huggingface.co/spaces/akhaliq/albert-base-v2">
<img alt="Spaces" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue">
</a>
</div>
[ALBERT: A Lite BERT for Self-supervised Learning of Language Representations](https://arxiv.org/abs/1909.11942),
Zhenzhong Lan et al.
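A sketch exercising ALBERT's masked-LM head with the `albert-base-v2` checkpoint (the checkpoint name mirrors the Space above and is otherwise an illustrative choice):

```python
from transformers import pipeline

# ALBERT shares parameters across layers, so albert-base-v2 is small on disk;
# the fill-mask pipeline exercises its masked language modeling head.
unmasker = pipeline("fill-mask", model="albert-base-v2")
for prediction in unmasker("Paris is the [MASK] of France."):
    print(prediction["token_str"], prediction["score"])
```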
...@@ -285,8 +316,12 @@ classification, multiple choice classification and question answering.
<a href="model_doc/roberta">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-roberta-blueviolet">
</a>
<a href="https://huggingface.co/spaces/akhaliq/roberta-base">
<img alt="Spaces" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue">
</a>
</div>
[RoBERTa: A Robustly Optimized BERT Pretraining Approach](https://arxiv.org/abs/1907.11692), Yinhan Liu et al.
Same as BERT with better pretraining tricks:
...@@ -309,8 +344,12 @@ classification, multiple choice classification and question answering.
<a href="model_doc/distilbert">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-distilbert-blueviolet">
</a>
<a href="https://huggingface.co/spaces/akhaliq/distilbert-base-uncased">
<img alt="Spaces" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue">
</a>
</div>
[DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter](https://arxiv.org/abs/1910.01108),
Victor Sanh et al.
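A sketch of a typical downstream use, assuming the standard SST-2 fine-tuned checkpoint `distilbert-base-uncased-finetuned-sst-2-english` (a released checkpoint, but not one named in this diff):

```python
from transformers import pipeline

# A DistilBERT checkpoint fine-tuned on SST-2 powers a lightweight classifier.
classifier = pipeline(
    "sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english"
)
print(classifier("DistilBERT keeps most of BERT's accuracy at a fraction of the size."))
```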
...@@ -333,8 +372,12 @@ and question answering.
<a href="model_doc/convbert">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-convbert-blueviolet">
</a>
<a href="https://huggingface.co/spaces/akhaliq/conv-bert-base">
<img alt="Spaces" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue">
</a>
</div>
[ConvBERT: Improving BERT with Span-based Dynamic Convolution](https://arxiv.org/abs/2008.02496), Zihang Jiang,
Weihao Yu, Daquan Zhou, Yunpeng Chen, Jiashi Feng, Shuicheng Yan.
...@@ -362,8 +405,12 @@ and question answering.
<a href="model_doc/xlm">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-xlm-blueviolet">
</a>
<a href="https://huggingface.co/spaces/akhaliq/xlm-mlm-en-2048">
<img alt="Spaces" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue">
</a>
</div>
[Cross-lingual Language Model Pretraining](https://arxiv.org/abs/1901.07291), Guillaume Lample and Alexis Conneau
A transformer model trained on several languages. There are three different types of training for this model and the
...@@ -395,8 +442,12 @@ question answering.
<a href="model_doc/xlm-roberta">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-xlm--roberta-blueviolet">
</a>
<a href="https://huggingface.co/spaces/akhaliq/xlm-roberta-base">
<img alt="Spaces" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue">
</a>
</div>
[Unsupervised Cross-lingual Representation Learning at Scale](https://arxiv.org/abs/1911.02116), Alexis Conneau et
al.
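A sketch showing the shared multilingual vocabulary at work, assuming the `xlm-roberta-base` checkpoint (the prompts are illustrative):

```python
from transformers import pipeline

# XLM-R uses one vocabulary shared across ~100 languages, so the same
# fill-mask head serves English and French prompts alike.
unmasker = pipeline("fill-mask", model="xlm-roberta-base")
print(unmasker("Hello, I'm a <mask> model.")[0]["token_str"])
print(unmasker("Bonjour, je suis un modèle <mask>.")[0]["token_str"])
```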
...@@ -416,8 +467,12 @@ classification, multiple choice classification and question answering.
<a href="model_doc/flaubert">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-flaubert-blueviolet">
</a>
<a href="https://huggingface.co/spaces/akhaliq/flaubert_small_cased">
<img alt="Spaces" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue">
</a>
</div>
[FlauBERT: Unsupervised Language Model Pre-training for French](https://arxiv.org/abs/1912.05372), Hang Le et al.
Like RoBERTa, without the sentence ordering prediction (so just trained on the MLM objective).
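A sketch extracting contextual embeddings for French text, assuming the `flaubert/flaubert_small_cased` checkpoint (the name matches the Space linked above; treat it as an assumption):

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Encode a French sentence and pull out the contextual embeddings.
tokenizer = AutoTokenizer.from_pretrained("flaubert/flaubert_small_cased")
model = AutoModel.from_pretrained("flaubert/flaubert_small_cased")

inputs = tokenizer("Le camembert est délicieux.", return_tensors="pt")
with torch.no_grad():
    hidden_states = model(**inputs).last_hidden_state  # (batch, seq_len, hidden)
print(hidden_states.shape)
```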
...@@ -433,8 +488,12 @@ The library provides a version of the model for language modeling and sentence c
<a href="model_doc/electra">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-electra-blueviolet">
</a>
<a href="https://huggingface.co/spaces/akhaliq/electra_large_discriminator_squad2_512">
<img alt="Spaces" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue">
</a>
</div>
[ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators](https://arxiv.org/abs/2003.10555),
Kevin Clark et al.
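The pretrained discriminator can be queried directly to flag replaced tokens; a sketch assuming the released `google/electra-small-discriminator` checkpoint (not the one behind the Space above):

```python
import torch
from transformers import ElectraForPreTraining, ElectraTokenizerFast

discriminator = ElectraForPreTraining.from_pretrained("google/electra-small-discriminator")
tokenizer = ElectraTokenizerFast.from_pretrained("google/electra-small-discriminator")

# "fake" replaces the original verb; the discriminator should flag it.
fake_sentence = "The quick brown fox fake over the lazy dog"
inputs = tokenizer(fake_sentence, return_tensors="pt")
logits = discriminator(**inputs).logits

# A positive logit means "this token was replaced".
print((logits > 0).int().tolist())
```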
...@@ -456,8 +515,12 @@ classification.
<a href="model_doc/funnel">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-funnel-blueviolet">
</a>
<a href="https://huggingface.co/spaces/akhaliq/funnel-transformer-small">
<img alt="Spaces" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue">
</a>
</div>
[Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing](https://arxiv.org/abs/2006.03236), Zihang Dai et al.
Funnel Transformer is a transformer model using pooling, a bit like a ResNet model: layers are grouped in blocks, and
...@@ -488,8 +551,12 @@ classification, multiple choice classification and question answering.
<a href="model_doc/longformer">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-longformer-blueviolet">
</a>
<a href="https://huggingface.co/spaces/akhaliq/longformer-base-4096-finetuned-squadv1">
<img alt="Spaces" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue">
</a>
</div>
[Longformer: The Long-Document Transformer](https://arxiv.org/abs/2004.05150), Iz Beltagy et al.
A transformer model replacing the attention matrices by sparse matrices to go faster. Often, the local context (e.g.,
...@@ -526,8 +593,12 @@ As mentioned before, these models keep both the encoder and the decoder of the o
<a href="model_doc/bart">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-bart-blueviolet">
</a>
<a href="https://huggingface.co/spaces/akhaliq/bart-large-mnli">
<img alt="Spaces" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue">
</a>
</div>
[BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension](https://arxiv.org/abs/1910.13461), Mike Lewis et al.
Sequence-to-sequence model with an encoder and a decoder. Encoder is fed a corrupted version of the tokens, decoder is
...@@ -551,8 +622,12 @@ The library provides a version of this model for conditional generation and sequ
<a href="model_doc/pegasus">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-pegasus-blueviolet">
</a>
<a href="https://huggingface.co/spaces/akhaliq/pegasus_paraphrase">
<img alt="Spaces" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue">
</a>
</div>
[PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization](https://arxiv.org/pdf/1912.08777.pdf), Jingqing Zhang, Yao Zhao, Mohammad Saleh and Peter J. Liu on Dec 18, 2019.
Sequence-to-sequence model with the same encoder-decoder model architecture as BART. Pegasus is pre-trained jointly on
...@@ -580,8 +655,12 @@ The library provides a version of this model for conditional generation, which s
<a href="model_doc/marian">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-marian-blueviolet">
</a>
<a href="https://huggingface.co/spaces/akhaliq/opus-mt-zh-en">
<img alt="Spaces" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue">
</a>
</div>
[Marian: Fast Neural Machine Translation in C++](https://arxiv.org/abs/1804.00344), Marcin Junczys-Dowmunt et al.
A framework for translation models, using the same models as BART.
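A sketch translating with the `Helsinki-NLP/opus-mt-zh-en` checkpoint (the model the Space above wraps); each Marian checkpoint covers a single translation direction:

```python
from transformers import MarianMTModel, MarianTokenizer

# Each Helsinki-NLP/opus-mt-* checkpoint handles one language pair;
# this one translates Chinese to English.
model_name = "Helsinki-NLP/opus-mt-zh-en"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

batch = tokenizer(["我爱自然语言处理。"], return_tensors="pt")
generated = model.generate(**batch)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```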
...@@ -598,8 +677,12 @@ The library provides a version of this model for conditional generation.
<a href="model_doc/t5">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-t5-blueviolet">
</a>
<a href="https://huggingface.co/spaces/akhaliq/t5-base">
<img alt="Spaces" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue">
</a>
</div>
[Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/abs/1910.10683), Colin Raffel et al.
Uses the traditional transformer model (with a slight change in the positional embeddings, which are learned at each
...@@ -629,8 +712,12 @@ The library provides a version of this model for conditional generation.
<a href="model_doc/mt5">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-mt5-blueviolet">
</a>
<a href="https://huggingface.co/spaces/akhaliq/mt5-small-finetuned-arxiv-cs-finetuned-arxiv-cs-full">
<img alt="Spaces" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue">
</a>
</div>
[mT5: A massively multilingual pre-trained text-to-text transformer](https://arxiv.org/abs/2010.11934), Linting Xue
et al.
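Because mT5 ships without supervised fine-tuning, the raw checkpoint only completes sentinel-masked spans; a sketch assuming the released `google/mt5-small` size (an assumption relative to this diff):

```python
from transformers import MT5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("google/mt5-small")
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")

# The raw checkpoint is pretrained on span corruption only, so it fills in
# sentinel tokens rather than following task prefixes; fine-tune before use.
inputs = tokenizer("UN Offizier sagt, dass weiter <extra_id_0> werden muss.", return_tensors="pt")
outputs = model.generate(**inputs, max_length=20)
print(tokenizer.decode(outputs[0]))
```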
...@@ -649,8 +736,12 @@ The library provides a version of this model for conditional generation.
<a href="model_doc/mbart">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-mbart-blueviolet">
</a>
<a href="https://huggingface.co/spaces/akhaliq/mbart-large-50-one-to-many-mmt">
<img alt="Spaces" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue">
</a>
</div>
[Multilingual Denoising Pre-training for Neural Machine Translation](https://arxiv.org/abs/2001.08210) by Yinhan Liu,
Jiatao Gu, Naman Goyal, Xian Li, Sergey Edunov, Marjan Ghazvininejad, Mike Lewis, Luke Zettlemoyer.
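A sketch of one-to-many translation with `facebook/mbart-large-50-one-to-many-mmt` (mirroring the Space above); the target language is picked by forcing its language code as the first generated token:

```python
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

model_name = "facebook/mbart-large-50-one-to-many-mmt"
model = MBartForConditionalGeneration.from_pretrained(model_name)
tokenizer = MBart50TokenizerFast.from_pretrained(model_name, src_lang="en_XX")

inputs = tokenizer("The head of the United Nations says there is no military solution in Syria.", return_tensors="pt")

# Force the target-language code as the first generated token to select French.
generated = model.generate(**inputs, forced_bos_token_id=tokenizer.lang_code_to_id["fr_XX"])
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```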
...@@ -677,8 +768,12 @@ finetuning.
<a href="model_doc/prophetnet">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-prophetnet-blueviolet">
</a>
<a href="https://huggingface.co/spaces/akhaliq/prophetnet-large-uncased">
<img alt="Spaces" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue">
</a>
</div>
[ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training](https://arxiv.org/abs/2001.04063) by
Yu Yan, Weizhen Qi, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang, Ming Zhou.
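N-gram prediction is a pretraining objective; at inference ProphetNet generates like any sequence-to-sequence model. A sketch assuming the `microsoft/prophetnet-large-uncased-cnndm` summarization fine-tune (the checkpoint name is an assumption, not taken from this diff):

```python
from transformers import ProphetNetForConditionalGeneration, ProphetNetTokenizer

model_name = "microsoft/prophetnet-large-uncased-cnndm"  # assumed summarization fine-tune
tokenizer = ProphetNetTokenizer.from_pretrained(model_name)
model = ProphetNetForConditionalGeneration.from_pretrained(model_name)

article = (
    "the us state department said wednesday it had received no formal word from "
    "bolivia that it was expelling the us ambassador there."
)
inputs = tokenizer(article, return_tensors="pt")
summary_ids = model.generate(**inputs, num_beams=4, max_length=60)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```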
...@@ -701,8 +796,12 @@ summarization.
<a href="model_doc/xlm-prophetnet">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-xprophetnet-blueviolet">
</a>
<a href="https://huggingface.co/spaces/akhaliq/xprophetnet-large-wiki100-cased-xglue-ntg">
<img alt="Spaces" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue">
</a>
</div>
[ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training](https://arxiv.org/abs/2001.04063) by
Yu Yan, Weizhen Qi, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang, Ming Zhou.
...@@ -753,8 +852,12 @@ Some models use documents retrieval during (pre)training and inference for open-
<a href="model_doc/dpr">
<img alt="Doc" src="https://img.shields.io/badge/Model_documentation-dpr-blueviolet">
</a>
<a href="https://huggingface.co/spaces/akhaliq/dpr-question_encoder-bert-base-multilingual">
<img alt="Spaces" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue">
</a>
</div>
[Dense Passage Retrieval for Open-Domain Question Answering](https://arxiv.org/abs/2004.04906), Vladimir Karpukhin et
al.
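A sketch of DPR's two-tower setup: questions and passages are embedded by separate encoders and scored by a dot product, assuming the released `single-nq-base` checkpoints (not the multilingual one in the Space above):

```python
import torch
from transformers import (
    DPRContextEncoder, DPRContextEncoderTokenizer,
    DPRQuestionEncoder, DPRQuestionEncoderTokenizer,
)

q_tokenizer = DPRQuestionEncoderTokenizer.from_pretrained("facebook/dpr-question_encoder-single-nq-base")
q_encoder = DPRQuestionEncoder.from_pretrained("facebook/dpr-question_encoder-single-nq-base")
ctx_tokenizer = DPRContextEncoderTokenizer.from_pretrained("facebook/dpr-ctx_encoder-single-nq-base")
ctx_encoder = DPRContextEncoder.from_pretrained("facebook/dpr-ctx_encoder-single-nq-base")

# Questions and passages are embedded separately; retrieval is a dot product.
q_emb = q_encoder(**q_tokenizer("What is the capital of France?", return_tensors="pt")).pooler_output
p_emb = ctx_encoder(**ctx_tokenizer("Paris is the capital and largest city of France.", return_tensors="pt")).pooler_output
print(torch.matmul(q_emb, p_emb.T))
```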
...