Added model cards for SciBERT models uploaded under AllenAI org (#3330)

* Create README.md
* model card
* add model card for cased

Commit 20139b7c authored Mar 18, 2020 by Kyle Lo, committed by GitHub Mar 18, 2020
2 changed files
--- a/model_cards/allenai/scibert_scivocab_cased/README.md
+++ b/model_cards/allenai/scibert_scivocab_cased/README.md
+# SciBERT
+
+This is the pretrained model presented in [SciBERT: A Pretrained Language Model for Scientific Text](https://www.aclweb.org/anthology/D19-1371/), which is a BERT model trained on scientific text.
+
+The training corpus consists of papers taken from [Semantic Scholar](https://www.semanticscholar.org). It contains 1.14M papers and 3.1B tokens. We trained on the full text of the papers, not just the abstracts.
+
+SciBERT has its own wordpiece vocabulary (scivocab) that's built to best match the training corpus. We trained cased and uncased versions. 
+
+Available models include:
+* `scibert_scivocab_cased`
+* `scibert_scivocab_uncased`
+
+
+The original repo can be found [here](https://github.com/allenai/scibert).
+
+If using these models, please cite the following paper:
+```
+@inproceedings{beltagy-etal-2019-scibert,
+    title = "SciBERT: A Pretrained Language Model for Scientific Text",
+    author = "Beltagy, Iz  and Lo, Kyle  and Cohan, Arman",
+    booktitle = "EMNLP",
+    year = "2019",
+    publisher = "Association for Computational Linguistics",
+    url = "https://www.aclweb.org/anthology/D19-1371"
+}
+```
--- a/model_cards/allenai/scibert_scivocab_uncased/README.md
+++ b/model_cards/allenai/scibert_scivocab_uncased/README.md
+# SciBERT
+
+This is the pretrained model presented in [SciBERT: A Pretrained Language Model for Scientific Text](https://www.aclweb.org/anthology/D19-1371/), which is a BERT model trained on scientific text.
+
+The training corpus consists of papers taken from [Semantic Scholar](https://www.semanticscholar.org). It contains 1.14M papers and 3.1B tokens. We trained on the full text of the papers, not just the abstracts.
+
+SciBERT has its own wordpiece vocabulary (scivocab) that's built to best match the training corpus. We trained cased and uncased versions. 
+
+Available models include:
+* `scibert_scivocab_cased`
+* `scibert_scivocab_uncased`
+
+
+The original repo can be found [here](https://github.com/allenai/scibert).
+
+If using these models, please cite the following paper:
+```
+@inproceedings{beltagy-etal-2019-scibert,
+    title = "SciBERT: A Pretrained Language Model for Scientific Text",
+    author = "Beltagy, Iz  and Lo, Kyle  and Cohan, Arman",
+    booktitle = "EMNLP",
+    year = "2019",
+    publisher = "Association for Computational Linguistics",
+    url = "https://www.aclweb.org/anthology/D19-1371"
+}
+```
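
Usage note for the two cards above: a minimal sketch of how the models might be loaded with the `transformers` library. It assumes the hub identifiers mirror the `model_cards/allenai/<name>` paths in this diff. The helper names `scibert_model_id` and `load_scibert` are hypothetical and are not part of the commit or of the original SciBERT repo.

```python
# Assumption: hub IDs follow the model_cards/allenai/<name> layout of this commit.
SCIBERT_VARIANTS = {
    "cased": "allenai/scibert_scivocab_cased",
    "uncased": "allenai/scibert_scivocab_uncased",
}


def scibert_model_id(variant: str = "uncased") -> str:
    """Return the assumed hub identifier for a SciBERT variant (hypothetical helper)."""
    try:
        return SCIBERT_VARIANTS[variant]
    except KeyError:
        expected = sorted(SCIBERT_VARIANTS)
        raise ValueError(f"unknown variant {variant!r}; expected one of {expected}") from None


def load_scibert(variant: str = "uncased"):
    """Load the tokenizer and encoder for one variant (hypothetical helper).

    Downloads the weights on first use, so it needs network access.
    """
    # Imported lazily so scibert_model_id works without transformers installed.
    from transformers import AutoModel, AutoTokenizer

    model_id = scibert_model_id(variant)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    return tokenizer, model
```

Calling `load_scibert("cased")` would return the tokenizer built on scivocab together with the matching encoder. The uncased variant is the default here only as an illustrative choice.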