Unverified Commit 3552d0e0 authored by Julien Chaumond, committed by GitHub

[model_cards] Migrate cards from this repo to model repos on huggingface.co (#9013)



* rm all model cards

* Update the .rst

@sgugger it is still not super crystal clear/streamlined, so let me know if you have any ideas to make it simpler

* Add a root-level README.md with simple instructions/context

* Update docs/source/model_sharing.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* make style

* rm all model cards
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
parent 29e45979
---
datasets:
- mnli
tags:
- distilbart
- distilbart-mnli
pipeline_tag: zero-shot-classification
---
# DistilBart-MNLI
distilbart-mnli is the distilled version of bart-large-mnli created using the **No Teacher Distillation** technique proposed for BART summarization by Hugging Face, [here](https://github.com/huggingface/transformers/tree/master/examples/seq2seq#distilbart).
We simply copy alternating layers from `bart-large-mnli` and fine-tune further on the same data.
| | matched acc | mismatched acc |
| ------------------------------------------------------------------------------------ | ----------- | -------------- |
| [bart-large-mnli](https://huggingface.co/facebook/bart-large-mnli) (baseline, 12-12) | 89.9 | 90.01 |
| [distilbart-mnli-12-1](https://huggingface.co/valhalla/distilbart-mnli-12-1) | 87.08 | 87.5 |
| [distilbart-mnli-12-3](https://huggingface.co/valhalla/distilbart-mnli-12-3) | 88.1 | 88.19 |
| [distilbart-mnli-12-6](https://huggingface.co/valhalla/distilbart-mnli-12-6) | 89.19 | 89.01 |
| [distilbart-mnli-12-9](https://huggingface.co/valhalla/distilbart-mnli-12-9) | 89.56 | 89.52 |
This is a very simple and effective technique, as we can see the performance drop is very small.
Detailed performance trade-offs will be posted in this [sheet](https://docs.google.com/spreadsheets/d/1dQeUvAKpScLuhDV1afaPJRRAE55s2LpIzDVA5xfqxvk/edit?usp=sharing).
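For reference, here is a minimal zero-shot-classification sketch with the `transformers` pipeline; `valhalla/distilbart-mnli-12-3` is one of the checkpoints listed above, and the sequence/labels are made up for illustration:
```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="valhalla/distilbart-mnli-12-3")

sequence = "one day I will see the world"
candidate_labels = ["travel", "cooking", "dancing"]
classifier(sequence, candidate_labels)
# => {'sequence': 'one day I will see the world', 'labels': [...], 'scores': [...]}
```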
## Fine-tuning
If you want to train these models yourself, clone the [distillbart-mnli repo](https://github.com/patil-suraj/distillbart-mnli) and follow the steps below.
Clone and install transformers from source
```bash
git clone https://github.com/huggingface/transformers.git
pip install -qqq -U ./transformers
```
Download MNLI data
```bash
python transformers/utils/download_glue_data.py --data_dir glue_data --tasks MNLI
```
Create student model
```bash
python create_student.py \
  --teacher_model_name_or_path facebook/bart-large-mnli \
  --student_encoder_layers 12 \
  --student_decoder_layers 6 \
  --save_path student-bart-mnli-12-6
```
Start fine-tuning
```bash
python run_glue.py args.json
```
You can find the logs of these trained models in this [wandb project](https://wandb.ai/psuraj/distilbart-mnli).
# ELECTRA-BASE-DISCRIMINATOR finetuned on SQuADv1
This is the electra-base-discriminator model fine-tuned on the SQuADv1 dataset for the question answering task.
## Model details
As mentioned in the original paper: ELECTRA is a new method for self-supervised language representation learning.
It can be used to pre-train transformer networks using relatively little compute.
ELECTRA models are trained to distinguish "real" input tokens vs "fake" input tokens generated by another neural network,
similar to the discriminator of a GAN. At small scale, ELECTRA achieves strong results even when trained on a single GPU.
At large scale, ELECTRA achieves state-of-the-art results on the SQuAD 2.0 dataset.
| Param | #Value |
|---------------------|--------|
| layers | 12 |
| hidden size | 768 |
| num attention heads  | 12     |
| on disk size | 436MB |
## Model training
This model was trained on a Google Colab V100 GPU.
You can find the fine-tuning Colab here
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/11yo-LaFsgggwmDSy2P8zD3tzf5cCb-DU?usp=sharing).
## Results
The results are slightly better than those reported in the paper, where the authors mention that electra-base achieves 84.5 EM and 90.8 F1.
| Metric | #Value |
|--------|--------|
| EM | 85.0520|
| F1 | 91.6050|
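EM and F1 above are the standard SQuAD metrics. For reference, a minimal sketch of computing them with the `squad` metric from the `datasets` library (the prediction/reference pair below is a made-up placeholder, not taken from this model's evaluation):
```python
from datasets import load_metric

squad_metric = load_metric("squad")

# Placeholder example showing the expected input format.
predictions = [{"id": "0", "prediction_text": "42"}]
references = [{"id": "0", "answers": {"text": ["42"], "answer_start": [0]}}]

print(squad_metric.compute(predictions=predictions, references=references))
# {'exact_match': 100.0, 'f1': 100.0}
```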
## Model in Action 🚀
```python3
from transformers import pipeline
nlp = pipeline('question-answering', model='valhalla/electra-base-discriminator-finetuned_squadv1')
nlp({
    'question': 'What is the answer to everything ?',
    'context': '42 is the answer to life the universe and everything'
})
=> {'answer': '42', 'end': 2, 'score': 0.981274963050339, 'start': 0}
```
> Created with ❤️ by Suraj Patil [![Github icon](https://cdn0.iconfinder.com/data/icons/octicons/1024/mark-github-32.png)](https://github.com/patil-suraj/)
[![Twitter icon](https://cdn0.iconfinder.com/data/icons/shift-logotypes/32/Twitter-32.png)](https://twitter.com/psuraj28)
# LONGFORMER-BASE-4096 fine-tuned on SQuAD v1
This is longformer-base-4096 model fine-tuned on SQuAD v1 dataset for question answering task.
[Longformer](https://arxiv.org/abs/2004.05150) was created by Iz Beltagy, Matthew E. Peters, and Arman Cohan from AllenAI. As the paper explains:
> `Longformer` is a BERT-like model for long documents.
The pre-trained model can handle sequences with up to 4096 tokens.
## Model Training
This model was trained on google colab v100 GPU. You can find the fine-tuning colab here [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1zEl5D-DdkBKva-DdreVOmN0hrAfzKG1o?usp=sharing).
A few things to keep in mind when using Longformer for the QA task:
by default, Longformer uses sliding-window local attention on all tokens, but for QA all question tokens should have global attention (please refer to the paper for more details). The `LongformerForQuestionAnswering` model automatically does that for you (a manual alternative is sketched below). To allow it to do that:
1. The input sequence must have three sep tokens, i.e. the sequence should be encoded like this:
   `<s> question</s></s> context</s>`. If you encode the question and context as an input pair, the tokenizer already takes care of that, so you shouldn't worry about it.
2. `input_ids` should always be a batch of examples.
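Here is a minimal sketch of setting global attention on the question tokens explicitly via `global_attention_mask` (supported in recent `transformers` versions); it roughly mirrors what `LongformerForQuestionAnswering` does automatically:
```python
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

tokenizer = AutoTokenizer.from_pretrained("valhalla/longformer-base-4096-finetuned-squadv1")
model = AutoModelForQuestionAnswering.from_pretrained("valhalla/longformer-base-4096-finetuned-squadv1")

question = "What has Huggingface done ?"
text = "Huggingface has democratized NLP. Huge thanks to Huggingface for this."
encoding = tokenizer(question, text, return_tensors="pt")
input_ids = encoding["input_ids"]

# Global attention on the question tokens: 1 up to (and including) the first </s>, 0 elsewhere.
first_sep = (input_ids[0] == tokenizer.sep_token_id).nonzero()[0].item()
global_attention_mask = torch.zeros_like(input_ids)
global_attention_mask[0, : first_sep + 1] = 1

outputs = model(input_ids,
                attention_mask=encoding["attention_mask"],
                global_attention_mask=global_attention_mask)
# With recent transformers versions the logits are at outputs.start_logits / outputs.end_logits.
```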
## Results
|Metric | # Value |
|-------------|---------|
| Exact Match | 85.1466 |
| F1 | 91.5415 |
## Model in Action 🚀
```python
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering
tokenizer = AutoTokenizer.from_pretrained("valhalla/longformer-base-4096-finetuned-squadv1")
model = AutoModelForQuestionAnswering.from_pretrained("valhalla/longformer-base-4096-finetuned-squadv1")
text = "Huggingface has democratized NLP. Huge thanks to Huggingface for this."
question = "What has Huggingface done ?"
encoding = tokenizer(question, text, return_tensors="pt")
input_ids = encoding["input_ids"]
# default is local attention everywhere
# the forward method will automatically set global attention on question tokens
attention_mask = encoding["attention_mask"]
start_scores, end_scores = model(input_ids, attention_mask=attention_mask)
all_tokens = tokenizer.convert_ids_to_tokens(input_ids[0].tolist())
answer_tokens = all_tokens[torch.argmax(start_scores) :torch.argmax(end_scores)+1]
answer = tokenizer.decode(tokenizer.convert_tokens_to_ids(answer_tokens))
# output => democratized NLP
```
`LongformerForQuestionAnswering` isn't supported in `pipeline` yet. I'll update this card once support has been added.
> Created with ❤️ by Suraj Patil [![Github icon](https://cdn0.iconfinder.com/data/icons/octicons/1024/mark-github-32.png)](https://github.com/patil-suraj/)
[![Twitter icon](https://cdn0.iconfinder.com/data/icons/shift-logotypes/32/Twitter-32.png)](https://twitter.com/psuraj28)
---
datasets:
- squad
tags:
- question-generation
widget:
- text: "Python is a programming language. It is developed by Guido Van Rossum and released in 1991. </s>"
license: mit
---
## T5 for question-generation
This is a [t5-base](https://arxiv.org/abs/1910.10683) model trained for the end-to-end question generation task. Simply input the text and the model will generate multiple questions.
You can play with the model using the inference API: just enter the text and see the results!
For more details, see [this](https://github.com/patil-suraj/question_generation) repo.
### Model in action 🚀
You'll need to clone the [repo](https://github.com/patil-suraj/question_generation).
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patil-suraj/question_generation/blob/master/question_generation.ipynb)
```python3
from pipelines import pipeline
text = "Python is an interpreted, high-level, general-purpose programming language. Created by Guido van Rossum \
and first released in 1991, Python's design philosophy emphasizes code \
readability with its notable use of significant whitespace."
nlp = pipeline("e2e-qg", model="valhalla/t5-base-e2e-qg")
nlp(text)
=> [
'Who created Python?',
'When was Python first released?',
"What is Python's design philosophy?"
]
```
---
datasets:
- squad
tags:
- question-generation
widget:
- text: "generate question: <hl> 42 <hl> is the answer to life, the universe and everything. </s>"
- text: "question: What is 42 context: 42 is the answer to life, the universe and everything. </s>"
license: mit
---
## T5 for multi-task QA and QG
This is a multi-task [t5-base](https://arxiv.org/abs/1910.10683) model trained for the question answering and answer-aware question generation tasks.
For question generation, the answer spans are highlighted within the text with special highlight tokens (`<hl>`) and the input is prefixed with 'generate question: '. For QA, the input is processed like this: `question: question_text context: context_text </s>`
You can play with the model using the inference API. Here's how you can use it:
For QG
`generate question: <hl> 42 <hl> is the answer to life, the universe and everything. </s>`
For QA
`question: What is 42 context: 42 is the answer to life, the universe and everything. </s>`
For more details, see [this](https://github.com/patil-suraj/question_generation) repo.
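If you would rather use plain `transformers` than the repo's `pipelines` wrapper, here is a minimal sketch that feeds the input formats documented above straight into the model (the `max_length` value is an arbitrary choice, and the expected outputs are the ones shown in the pipeline example below):
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("valhalla/t5-base-qa-qg-hl")
model = AutoModelForSeq2SeqLM.from_pretrained("valhalla/t5-base-qa-qg-hl")

def run(prompt):
    # prompt follows the QG/QA formats documented above;
    # recent tokenizers add the closing </s> automatically, so keeping it in the string is harmless
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    output_ids = model.generate(input_ids, max_length=64)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

# question generation: highlight the answer span with <hl> tokens
run("generate question: <hl> 42 <hl> is the answer to life, the universe and everything. </s>")
# expected: 'What is the answer to life, the universe and everything?'

# question answering
run("question: What is 42 context: 42 is the answer to life, the universe and everything. </s>")
# expected: 'the answer to life, the universe and everything'
```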
### Model in action 🚀
You'll need to clone the [repo](https://github.com/patil-suraj/question_generation).
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patil-suraj/question_generation/blob/master/question_generation.ipynb)
```python3
from pipelines import pipeline
nlp = pipeline("multitask-qa-qg", model="valhalla/t5-base-qa-qg-hl")
# to generate questions simply pass the text
nlp("42 is the answer to life, the universe and everything.")
=> [{'answer': '42', 'question': 'What is the answer to life, the universe and everything?'}]
# for qa pass a dict with "question" and "context"
nlp({
    "question": "What is 42 ?",
    "context": "42 is the answer to life, the universe and everything."
})
=> 'the answer to life, the universe and everything'
```
---
datasets:
- squad
tags:
- question-generation
widget:
- text: "<hl> 42 <hl> is the answer to life, the universe and everything. </s>"
- text: "Python is a programming language. It is developed by <hl> Guido Van Rossum <hl>. </s>"
- text: "Although <hl> practicality <hl> beats purity </s>"
license: mit
---
## T5 for question-generation
This is a [t5-base](https://arxiv.org/abs/1910.10683) model trained for the answer-aware question generation task. The answer spans are highlighted within the text with special highlight tokens.
You can play with the model using the inference API: just highlight the answer spans with `<hl>` tokens and end the text with `</s>`. For example:
`<hl> 42 <hl> is the answer to life, the universe and everything. </s>`
For more details, see [this](https://github.com/patil-suraj/question_generation) repo.
### Model in action 🚀
You'll need to clone the [repo](https://github.com/patil-suraj/question_generation).
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patil-suraj/question_generation/blob/master/question_generation.ipynb)
```python3
from pipelines import pipeline
nlp = pipeline("question-generation", model="valhalla/t5-base-qg-hl")
nlp("42 is the answer to life, universe and everything.")
=> [{'answer': '42', 'question': 'What is the answer to life, universe and everything?'}]
```
# T5 for question-answering
This is a T5-base model fine-tuned on SQuAD1.1 for QA using the text-to-text approach.
## Model training
This model was trained on a Colab TPU with 35GB RAM for 4 epochs.
## Results
| Metric | #Value |
|-------------|---------|
| Exact Match | 81.5610 |
| F1 | 89.9601 |
## Model in Action 🚀
```python
from transformers import AutoModelWithLMHead, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("valhalla/t5-base-squad")
model = AutoModelWithLMHead.from_pretrained("valhalla/t5-base-squad")
def get_answer(question, context):
    input_text = "question: %s context: %s </s>" % (question, context)
    features = tokenizer([input_text], return_tensors='pt')
    out = model.generate(input_ids=features['input_ids'],
                         attention_mask=features['attention_mask'])
    return tokenizer.decode(out[0])
context = "In Norse mythology, Valhalla is a majestic, enormous hall located in Asgard, ruled over by the god Odin."
question = "What is Valhalla ?"
get_answer(question, context)
# output: 'a majestic, enormous hall located in Asgard, ruled over by the god Odin'
```
Play with this model [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1a5xpJiUjZybfU9Mi-aDkOp116PZ9-wni?usp=sharing)
> Created by Suraj Patil [![Github icon](https://cdn0.iconfinder.com/data/icons/octicons/1024/mark-github-32.png)](https://github.com/patil-suraj/)
[![Twitter icon](https://cdn0.iconfinder.com/data/icons/shift-logotypes/32/Twitter-32.png)](https://twitter.com/psuraj28)
---
datasets:
- squad
tags:
- question-generation
widget:
- text: "answer: 42 context: 42 is the answer to life, the universe and everything. </s>"
- text: "answer: Guido Van Rossum context: Python is a programming language. It is developed by Guido Van Rossum. </s>"
- text: "answer: Explicit context: Explicit is better than implicit </s>"
license: mit
---
## T5 for question-generation
This is a [t5-small](https://arxiv.org/abs/1910.10683) model trained for the answer-aware question generation task. The answer text is prepended to the context text.
You can play with the model using the inference API: just put the input text in this format and see the results!
`answer: answer_text context: context_text </s>`
For example
`answer: 42 context: 42 is the answer to life, the universe and everything. </s>`
For more details, see [this](https://github.com/patil-suraj/question_generation) repo.
### Model in action 🚀
You'll need to clone the [repo](https://github.com/patil-suraj/question_generation).
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patil-suraj/question_generation/blob/master/question_generation.ipynb)
```python3
from pipelines import pipeline
nlp = pipeline("question-generation", qg_format="prepend")
nlp("42 is the answer to life, universe and everything.")
=> [{'answer': '42', 'question': 'What is the answer to life, universe and everything?'}]
```
---
datasets:
- squad
tags:
- question-generation
widget:
- text: "Python is developed by Guido Van Rossum and released in 1991. </s>"
license: mit
---
## T5 for question-generation
This is a [t5-small](https://arxiv.org/abs/1910.10683) model trained for the end-to-end question generation task. Simply input the text and the model will generate multiple questions.
You can play with the model using the inference API: just enter the text and see the results!
For more details, see [this](https://github.com/patil-suraj/question_generation) repo.
### Model in action 🚀
You'll need to clone the [repo](https://github.com/patil-suraj/question_generation).
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patil-suraj/question_generation/blob/master/question_generation.ipynb)
```python3
from pipelines import pipeline
text = "Python is an interpreted, high-level, general-purpose programming language. Created by Guido van Rossum \
and first released in 1991, Python's design philosophy emphasizes code \
readability with its notable use of significant whitespace."
nlp = pipeline("e2e-qg")
nlp(text)
=> [
'Who created Python?',
'When was Python first released?',
"What is Python's design philosophy?"
]
```
---
datasets:
- squad
tags:
- question-generation
widget:
- text: "generate question: <hl> 42 <hl> is the answer to life, the universe and everything. </s>"
- text: "question: What is 42 context: 42 is the answer to life, the universe and everything. </s>"
license: mit
---
## T5 for multi-task QA and QG
This is a multi-task [t5-small](https://arxiv.org/abs/1910.10683) model trained for the question answering and answer-aware question generation tasks.
For question generation, the answer spans are highlighted within the text with special highlight tokens (`<hl>`) and the input is prefixed with 'generate question: '. For QA, the input is processed like this: `question: question_text context: context_text </s>`
You can play with the model using the inference API. Here's how you can use it:
For QG
`generate question: <hl> 42 <hl> is the answer to life, the universe and everything. </s>`
For QA
`question: What is 42 context: 42 is the answer to life, the universe and everything. </s>`
For more details, see [this](https://github.com/patil-suraj/question_generation) repo.
### Model in action 🚀
You'll need to clone the [repo](https://github.com/patil-suraj/question_generation).
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patil-suraj/question_generation/blob/master/question_generation.ipynb)
```python3
from pipelines import pipeline
nlp = pipeline("multitask-qa-qg")
# to generate questions simply pass the text
nlp("42 is the answer to life, the universe and everything.")
=> [{'answer': '42', 'question': 'What is the answer to life, the universe and everything?'}]
# for qa pass a dict with "question" and "context"
nlp({
    "question": "What is 42 ?",
    "context": "42 is the answer to life, the universe and everything."
})
=> 'the answer to life, the universe and everything'
```
---
datasets:
- squad
tags:
- question-generation
widget:
- text: "<hl> 42 <hl> is the answer to life, the universe and everything. </s>"
- text: "Python is a programming language. It is developed by <hl> Guido Van Rossum <hl>. </s>"
- text: "Simple is better than <hl> complex <hl>. </s>"
license: mit
---
## T5 for question-generation
This is a [t5-small](https://arxiv.org/abs/1910.10683) model trained for the answer-aware question generation task. The answer spans are highlighted within the text with special highlight tokens.
You can play with the model using the inference API: just highlight the answer spans with `<hl>` tokens and end the text with `</s>`. For example:
`<hl> 42 <hl> is the answer to life, the universe and everything. </s>`
For more details, see [this](https://github.com/patil-suraj/question_generation) repo.
### Model in action 🚀
You'll need to clone the [repo](https://github.com/patil-suraj/question_generation).
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patil-suraj/question_generation/blob/master/question_generation.ipynb)
```python3
from pipelines import pipeline
nlp = pipeline("question-generation")
nlp("42 is the answer to life, universe and everything.")
=> [{'answer': '42', 'question': 'What is the answer to life, universe and everything?'}]
```
# <a name="introduction"></a> BERTweet: A pre-trained language model for English Tweets
- BERTweet is the first public large-scale language model pre-trained for English Tweets. BERTweet is trained based on the [RoBERTa](https://github.com/pytorch/fairseq/blob/master/examples/roberta/README.md) pre-training procedure, using the same model configuration as [BERT-base](https://github.com/google-research/bert).
- The corpus used to pre-train BERTweet consists of 850M English Tweets (16B word tokens ~ 80GB), containing 845M Tweets streamed from 01/2012 to 08/2019 and 5M Tweets related to the **COVID-19** pandemic.
- BERTweet does better than its competitors RoBERTa-base and [XLM-R-base](https://arxiv.org/abs/1911.02116) and outperforms previous state-of-the-art models on three downstream Tweet NLP tasks of Part-of-speech tagging, Named entity recognition and text classification.
The general architecture and experimental results of BERTweet can be found in our [paper](https://arxiv.org/abs/2005.10200):
    @inproceedings{bertweet,
        title     = {{BERTweet: A pre-trained language model for English Tweets}},
        author    = {Dat Quoc Nguyen and Thanh Vu and Anh Tuan Nguyen},
        booktitle = {Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations},
        year      = {2020}
    }
**Please CITE** our paper when BERTweet is used to help produce published results or is incorporated into other software.
For further information or requests, please go to [BERTweet's homepage](https://github.com/VinAIResearch/BERTweet)!
### <a name="install2"></a> Installation
- Python 3.6+, and PyTorch 1.1.0+ (or TensorFlow 2.0+)
- Install `transformers`:
- `git clone https://github.com/huggingface/transformers.git`
- `cd transformers`
- `pip3 install --upgrade .`
- Install `emoji`: `pip3 install emoji`
### <a name="models2"></a> Pre-trained models
Model | #params | Arch. | Pre-training data
---|---|---|---
`vinai/bertweet-base` | 135M | base | 845M English Tweets (cased)
`vinai/bertweet-covid19-base-cased` | 135M | base | 23M COVID-19 English Tweets (cased)
`vinai/bertweet-covid19-base-uncased` | 135M | base | 23M COVID-19 English Tweets (uncased)
The two pre-trained models `vinai/bertweet-covid19-base-cased` and `vinai/bertweet-covid19-base-uncased` result from further pre-training `vinai/bertweet-base` on a corpus of 23M COVID-19 English Tweets for 40 epochs.
### <a name="usage2"></a> Example usage
```python
import torch
from transformers import AutoModel, AutoTokenizer
bertweet = AutoModel.from_pretrained("vinai/bertweet-base")
tokenizer = AutoTokenizer.from_pretrained("vinai/bertweet-base")
# INPUT TWEET IS ALREADY NORMALIZED!
line = "SC has first two presumptive cases of coronavirus , DHEC confirms HTTPURL via @USER :cry:"
input_ids = torch.tensor([tokenizer.encode(line)])
with torch.no_grad():
    features = bertweet(input_ids)  # Model outputs are now tuples
## With TensorFlow 2.0+:
# from transformers import TFAutoModel
# bertweet = TFAutoModel.from_pretrained("vinai/bertweet-base")
```
### <a name="preprocess"></a> Normalize raw input Tweets
Before applying `fastBPE` to the pre-training corpus of 850M English Tweets, we tokenized these Tweets using `TweetTokenizer` from the NLTK toolkit and used the `emoji` package to translate emotion icons into text strings (here, each icon is referred to as a word token). We also normalized the Tweets by converting user mentions and web/url links into special tokens `@USER` and `HTTPURL`, respectively. Thus it is recommended to also apply the same pre-processing step for BERTweet-based downstream applications w.r.t. the raw input Tweets. BERTweet provides this pre-processing step by enabling the `normalization` argument.
```python
import torch
from transformers import AutoTokenizer
# Load the AutoTokenizer with a normalization mode if the input Tweet is raw
tokenizer = AutoTokenizer.from_pretrained("vinai/bertweet-base", normalization=True)
# from transformers import BertweetTokenizer
# tokenizer = BertweetTokenizer.from_pretrained("vinai/bertweet-base", normalization=True)
line = "SC has first two presumptive cases of coronavirus, DHEC confirms https://postandcourier.com/health/covid19/sc-has-first-two-presumptive-cases-of-coronavirus-dhec-confirms/article_bddfe4ae-5fd3-11ea-9ce4-5f495366cee6.html?utm_medium=social&utm_source=twitter&utm_campaign=user-share… via @postandcourier"
input_ids = torch.tensor([tokenizer.encode(line)])
```
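If the input Tweet is raw and you prefer to normalize it yourself, here is a rough sketch of the normalization described above (an approximation for illustration, not the exact pre-processing script used for BERTweet):
```python
from emoji import demojize
from nltk.tokenize import TweetTokenizer

tweet_tokenizer = TweetTokenizer()

def normalize_tweet(tweet):
    normalized = []
    for token in tweet_tokenizer.tokenize(tweet):
        if token.startswith("@") and len(token) > 1:
            normalized.append("@USER")          # user mentions -> @USER
        elif token.lower().startswith(("http", "www")):
            normalized.append("HTTPURL")        # web/url links -> HTTPURL
        else:
            normalized.append(demojize(token))  # emotion icons -> :text: strings
    return " ".join(normalized)

print(normalize_tweet("SC has first two presumptive cases of coronavirus 😢 https://t.co/abc via @DHEC"))
# Something like: "SC has first two presumptive cases of coronavirus :cry: HTTPURL via @USER"
# (the exact emoji alias depends on the emoji package version)
```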
# <a name="introduction"></a> BERTweet: A pre-trained language model for English Tweets
- BERTweet is the first public large-scale language model pre-trained for English Tweets. BERTweet is trained based on the [RoBERTa](https://github.com/pytorch/fairseq/blob/master/examples/roberta/README.md) pre-training procedure, using the same model configuration as [BERT-base](https://github.com/google-research/bert).
- The corpus used to pre-train BERTweet consists of 850M English Tweets (16B word tokens ~ 80GB), containing 845M Tweets streamed from 01/2012 to 08/2019 and 5M Tweets related to the **COVID-19** pandemic.
- BERTweet does better than its competitors RoBERTa-base and [XLM-R-base](https://arxiv.org/abs/1911.02116) and outperforms previous state-of-the-art models on three downstream Tweet NLP tasks of Part-of-speech tagging, Named entity recognition and text classification.
The general architecture and experimental results of BERTweet can be found in our [paper](https://arxiv.org/abs/2005.10200):
    @inproceedings{bertweet,
        title     = {{BERTweet: A pre-trained language model for English Tweets}},
        author    = {Dat Quoc Nguyen and Thanh Vu and Anh Tuan Nguyen},
        booktitle = {Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations},
        year      = {2020}
    }
**Please CITE** our paper when BERTweet is used to help produce published results or is incorporated into other software.
For further information or requests, please go to [BERTweet's homepage](https://github.com/VinAIResearch/BERTweet)!
### <a name="install2"></a> Installation
- Python 3.6+, and PyTorch 1.1.0+ (or TensorFlow 2.0+)
- Install `transformers`:
- `git clone https://github.com/huggingface/transformers.git`
- `cd transformers`
- `pip3 install --upgrade .`
- Install `emoji`: `pip3 install emoji`
### <a name="models2"></a> Pre-trained models
Model | #params | Arch. | Pre-training data
---|---|---|---
`vinai/bertweet-base` | 135M | base | 845M English Tweets (cased)
`vinai/bertweet-covid19-base-cased` | 135M | base | 23M COVID-19 English Tweets (cased)
`vinai/bertweet-covid19-base-uncased` | 135M | base | 23M COVID-19 English Tweets (uncased)
The two pre-trained models `vinai/bertweet-covid19-base-cased` and `vinai/bertweet-covid19-base-uncased` result from further pre-training `vinai/bertweet-base` on a corpus of 23M COVID-19 English Tweets for 40 epochs.
### <a name="usage2"></a> Example usage
```python
import torch
from transformers import AutoModel, AutoTokenizer
bertweet = AutoModel.from_pretrained("vinai/bertweet-covid19-base-cased")
tokenizer = AutoTokenizer.from_pretrained("vinai/bertweet-covid19-base-cased")
# INPUT TWEET IS ALREADY NORMALIZED!
line = "SC has first two presumptive cases of coronavirus , DHEC confirms HTTPURL via @USER :cry:"
input_ids = torch.tensor([tokenizer.encode(line)])
with torch.no_grad():
    features = bertweet(input_ids)  # Model outputs are now tuples
## With TensorFlow 2.0+:
# from transformers import TFAutoModel
# bertweet = TFAutoModel.from_pretrained("vinai/bertweet-covid19-base-cased")
```
### <a name="preprocess"></a> Normalize raw input Tweets
Before applying `fastBPE` to the pre-training corpus of 850M English Tweets, we tokenized these Tweets using `TweetTokenizer` from the NLTK toolkit and used the `emoji` package to translate emotion icons into text strings (here, each icon is referred to as a word token). We also normalized the Tweets by converting user mentions and web/url links into special tokens `@USER` and `HTTPURL`, respectively. Thus it is recommended to also apply the same pre-processing step for BERTweet-based downstream applications w.r.t. the raw input Tweets. BERTweet provides this pre-processing step by enabling the `normalization` argument.
```python
import torch
from transformers import AutoTokenizer
# Load the AutoTokenizer with a normalization mode if the input Tweet is raw
tokenizer = AutoTokenizer.from_pretrained("vinai/bertweet-covid19-base-cased", normalization=True)
# from transformers import BertweetTokenizer
# tokenizer = BertweetTokenizer.from_pretrained("vinai/bertweet-covid19-base-cased", normalization=True)
line = "SC has first two presumptive cases of coronavirus, DHEC confirms https://postandcourier.com/health/covid19/sc-has-first-two-presumptive-cases-of-coronavirus-dhec-confirms/article_bddfe4ae-5fd3-11ea-9ce4-5f495366cee6.html?utm_medium=social&utm_source=twitter&utm_campaign=user-share… via @postandcourier"
input_ids = torch.tensor([tokenizer.encode(line)])
```
# <a name="introduction"></a> BERTweet: A pre-trained language model for English Tweets
- BERTweet is the first public large-scale language model pre-trained for English Tweets. BERTweet is trained based on the [RoBERTa](https://github.com/pytorch/fairseq/blob/master/examples/roberta/README.md) pre-training procedure, using the same model configuration as [BERT-base](https://github.com/google-research/bert).
- The corpus used to pre-train BERTweet consists of 850M English Tweets (16B word tokens ~ 80GB), containing 845M Tweets streamed from 01/2012 to 08/2019 and 5M Tweets related to the **COVID-19** pandemic.
- BERTweet does better than its competitors RoBERTa-base and [XLM-R-base](https://arxiv.org/abs/1911.02116) and outperforms previous state-of-the-art models on three downstream Tweet NLP tasks of Part-of-speech tagging, Named entity recognition and text classification.
The general architecture and experimental results of BERTweet can be found in our [paper](https://arxiv.org/abs/2005.10200):
    @inproceedings{bertweet,
        title     = {{BERTweet: A pre-trained language model for English Tweets}},
        author    = {Dat Quoc Nguyen and Thanh Vu and Anh Tuan Nguyen},
        booktitle = {Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations},
        year      = {2020}
    }
**Please CITE** our paper when BERTweet is used to help produce published results or is incorporated into other software.
For further information or requests, please go to [BERTweet's homepage](https://github.com/VinAIResearch/BERTweet)!
### <a name="install2"></a> Installation
- Python 3.6+, and PyTorch 1.1.0+ (or TensorFlow 2.0+)
- Install `transformers`:
- `git clone https://github.com/huggingface/transformers.git`
- `cd transformers`
- `pip3 install --upgrade .`
- Install `emoji`: `pip3 install emoji`
### <a name="models2"></a> Pre-trained models
Model | #params | Arch. | Pre-training data
---|---|---|---
`vinai/bertweet-base` | 135M | base | 845M English Tweets (cased)
`vinai/bertweet-covid19-base-cased` | 135M | base | 23M COVID-19 English Tweets (cased)
`vinai/bertweet-covid19-base-uncased` | 135M | base | 23M COVID-19 English Tweets (uncased)
The two pre-trained models `vinai/bertweet-covid19-base-cased` and `vinai/bertweet-covid19-base-uncased` result from further pre-training `vinai/bertweet-base` on a corpus of 23M COVID-19 English Tweets for 40 epochs.
### <a name="usage2"></a> Example usage
```python
import torch
from transformers import AutoModel, AutoTokenizer
bertweet = AutoModel.from_pretrained("vinai/bertweet-covid19-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("vinai/bertweet-covid19-base-uncased")
# INPUT TWEET IS ALREADY NORMALIZED!
line = "SC has first two presumptive cases of coronavirus , DHEC confirms HTTPURL via @USER :cry:"
input_ids = torch.tensor([tokenizer.encode(line)])
with torch.no_grad():
    features = bertweet(input_ids)  # Model outputs are now tuples
## With TensorFlow 2.0+:
# from transformers import TFAutoModel
# bertweet = TFAutoModel.from_pretrained("vinai/bertweet-covid19-base-uncased")
```
### <a name="preprocess"></a> Normalize raw input Tweets
Before applying `fastBPE` to the pre-training corpus of 850M English Tweets, we tokenized these Tweets using `TweetTokenizer` from the NLTK toolkit and used the `emoji` package to translate emotion icons into text strings (here, each icon is referred to as a word token). We also normalized the Tweets by converting user mentions and web/url links into special tokens `@USER` and `HTTPURL`, respectively. Thus it is recommended to also apply the same pre-processing step for BERTweet-based downstream applications w.r.t. the raw input Tweets. BERTweet provides this pre-processing step by enabling the `normalization` argument.
```python
import torch
from transformers import AutoTokenizer
# Load the AutoTokenizer with a normalization mode if the input Tweet is raw
tokenizer = AutoTokenizer.from_pretrained("vinai/bertweet-covid19-base-uncased", normalization=True)
# from transformers import BertweetTokenizer
# tokenizer = BertweetTokenizer.from_pretrained("vinai/bertweet-covid19-base-uncased", normalization=True)
line = "SC has first two presumptive cases of coronavirus, DHEC confirms https://postandcourier.com/health/covid19/sc-has-first-two-presumptive-cases-of-coronavirus-dhec-confirms/article_bddfe4ae-5fd3-11ea-9ce4-5f495366cee6.html?utm_medium=social&utm_source=twitter&utm_campaign=user-share… via @postandcourier"
input_ids = torch.tensor([tokenizer.encode(line)])
```
# <a name="introduction"></a> PhoBERT: Pre-trained language models for Vietnamese
Pre-trained PhoBERT models are the state-of-the-art language models for Vietnamese ([Pho](https://en.wikipedia.org/wiki/Pho), i.e. "Phở", is a popular food in Vietnam):
- The two PhoBERT versions, "base" and "large", are the first public large-scale monolingual language models pre-trained for Vietnamese. The PhoBERT pre-training approach is based on [RoBERTa](https://github.com/pytorch/fairseq/blob/master/examples/roberta/README.md), which optimizes the [BERT](https://github.com/google-research/bert) pre-training procedure for more robust performance.
- PhoBERT outperforms previous monolingual and multilingual approaches, obtaining new state-of-the-art performances on four downstream Vietnamese NLP tasks of Part-of-speech tagging, Dependency parsing, Named-entity recognition and Natural language inference.
The general architecture and experimental results of PhoBERT can be found in our EMNLP-2020 Findings [paper](https://arxiv.org/abs/2003.00744):
    @article{phobert,
        title   = {{PhoBERT: Pre-trained language models for Vietnamese}},
        author  = {Dat Quoc Nguyen and Anh Tuan Nguyen},
        journal = {Findings of EMNLP},
        year    = {2020}
    }
**Please CITE** our paper when PhoBERT is used to help produce published results or is incorporated into other software.
For further information or requests, please go to [PhoBERT's homepage](https://github.com/VinAIResearch/PhoBERT)!
### Installation <a name="install2"></a>
- Python 3.6+, and PyTorch 1.1.0+ (or TensorFlow 2.0+)
- Install `transformers`:
- `git clone https://github.com/huggingface/transformers.git`
- `cd transformers`
- `pip3 install --upgrade .`
### Pre-trained models <a name="models2"></a>
Model | #params | Arch. | Pre-training data
---|---|---|---
`vinai/phobert-base` | 135M | base | 20GB of texts
`vinai/phobert-large` | 370M | large | 20GB of texts
### Example usage <a name="usage2"></a>
```python
import torch
from transformers import AutoModel, AutoTokenizer
phobert = AutoModel.from_pretrained("vinai/phobert-base")
tokenizer = AutoTokenizer.from_pretrained("vinai/phobert-base")
# INPUT TEXT MUST BE ALREADY WORD-SEGMENTED!
line = "Tôi là sinh_viên trường đại_học Công_nghệ ."
input_ids = torch.tensor([tokenizer.encode(line)])
with torch.no_grad():
    features = phobert(input_ids)  # Model outputs are now tuples
## With TensorFlow 2.0+:
# from transformers import TFAutoModel
# phobert = TFAutoModel.from_pretrained("vinai/phobert-base")
```
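The example above assumes the input text is already word-segmented. A rough sketch of segmenting raw Vietnamese text with VnCoreNLP's RDRSegmenter, as recommended on PhoBERT's homepage (the jar path below is a placeholder for your local VnCoreNLP download):
```python
# pip3 install vncorenlp, then download VnCoreNLP-1.1.1.jar and its word-segmentation models
from vncorenlp import VnCoreNLP

rdrsegmenter = VnCoreNLP("/path/to/VnCoreNLP-1.1.1.jar", annotators="wseg", max_heap_size="-Xmx500m")

sentences = rdrsegmenter.tokenize("Tôi là sinh viên trường đại học Công nghệ .")
for sentence in sentences:
    print(" ".join(sentence))
# Expected to match the segmented example above:
# Tôi là sinh_viên trường đại_học Công_nghệ .
```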
# <a name="introduction"></a> PhoBERT: Pre-trained language models for Vietnamese
Pre-trained PhoBERT models are the state-of-the-art language models for Vietnamese ([Pho](https://en.wikipedia.org/wiki/Pho), i.e. "Phở", is a popular food in Vietnam):
- The two PhoBERT versions, "base" and "large", are the first public large-scale monolingual language models pre-trained for Vietnamese. The PhoBERT pre-training approach is based on [RoBERTa](https://github.com/pytorch/fairseq/blob/master/examples/roberta/README.md), which optimizes the [BERT](https://github.com/google-research/bert) pre-training procedure for more robust performance.
- PhoBERT outperforms previous monolingual and multilingual approaches, obtaining new state-of-the-art performances on four downstream Vietnamese NLP tasks of Part-of-speech tagging, Dependency parsing, Named-entity recognition and Natural language inference.
The general architecture and experimental results of PhoBERT can be found in our EMNLP-2020 Findings [paper](https://arxiv.org/abs/2003.00744):
    @article{phobert,
        title   = {{PhoBERT: Pre-trained language models for Vietnamese}},
        author  = {Dat Quoc Nguyen and Anh Tuan Nguyen},
        journal = {Findings of EMNLP},
        year    = {2020}
    }
**Please CITE** our paper when PhoBERT is used to help produce published results or is incorporated into other software.
For further information or requests, please go to [PhoBERT's homepage](https://github.com/VinAIResearch/PhoBERT)!
### Installation <a name="install2"></a>
- Python 3.6+, and PyTorch 1.1.0+ (or TensorFlow 2.0+)
- Install `transformers`:
- `git clone https://github.com/huggingface/transformers.git`
- `cd transformers`
- `pip3 install --upgrade .`
### Pre-trained models <a name="models2"></a>
Model | #params | Arch. | Pre-training data
---|---|---|---
`vinai/phobert-base` | 135M | base | 20GB of texts
`vinai/phobert-large` | 370M | large | 20GB of texts
### Example usage <a name="usage2"></a>
```python
import torch
from transformers import AutoModel, AutoTokenizer
phobert = AutoModel.from_pretrained("vinai/phobert-large")
tokenizer = AutoTokenizer.from_pretrained("vinai/phobert-large")
# INPUT TEXT MUST BE ALREADY WORD-SEGMENTED!
line = "Tôi là sinh_viên trường đại_học Công_nghệ ."
input_ids = torch.tensor([tokenizer.encode(line)])
with torch.no_grad():
    features = phobert(input_ids)  # Model outputs are now tuples
## With TensorFlow 2.0+:
# from transformers import TFAutoModel
# phobert = TFAutoModel.from_pretrained("vinai/phobert-large")
```
---
language: zh
---
# albert_chinese_base
This is an albert_chinese_base model from [Google's github](https://github.com/google-research/ALBERT),
converted with huggingface's [conversion script](https://github.com/huggingface/transformers/blob/master/src/transformers/convert_albert_original_tf_checkpoint_to_pytorch.py).
## Attention
Since sentencepiece is not used in the albert_chinese_base model, you have to call `BertTokenizer` instead of `AlbertTokenizer`; loading with `AlbertTokenizer` would fail to load the vocabulary.
We can run a MaskedLM prediction to verify that this works.
## Verification
[colab trial](https://colab.research.google.com/drive/1Wjz48Uws6-VuSHv_-DcWLilv77-AaYgj)
```python
import torch
from torch.nn.functional import softmax
from transformers import AlbertForMaskedLM, BertTokenizer

pretrained = 'voidful/albert_chinese_base'
tokenizer = BertTokenizer.from_pretrained(pretrained)  # BertTokenizer, not AlbertTokenizer (see above)
model = AlbertForMaskedLM.from_pretrained(pretrained)

inputtext = "今天[MASK]情很好"

# position of the [MASK] token in the encoded sequence
maskpos = tokenizer.encode(inputtext, add_special_tokens=True).index(tokenizer.mask_token_id)

input_ids = torch.tensor(tokenizer.encode(inputtext, add_special_tokens=True)).unsqueeze(0)  # Batch size 1
outputs = model(input_ids, masked_lm_labels=input_ids)  # on recent transformers versions, pass labels=input_ids instead
loss, prediction_scores = outputs[:2]
logit_prob = softmax(prediction_scores[0, maskpos], dim=-1).data.tolist()
predicted_index = torch.argmax(prediction_scores[0, maskpos]).item()
predicted_token = tokenizer.convert_ids_to_tokens([predicted_index])[0]
print(predicted_token, logit_prob[predicted_index])
```
Result: `感 0.36333346366882324`