Added precisions in SciBERT-NLI model card (#3410)

Commit 7372e62b (unverified), authored Mar 24, 2020 by Gabriele Sarti, committed by GitHub Mar 24, 2020

Showing with 17 additions and 13 deletions

--- a/model_cards/gsarti/scibert-nli/README.md
+++ b/model_cards/gsarti/scibert-nli/README.md
@@ -4,24 +4,26 @@ This is the model [SciBERT](https://github.com/allenai/scibert) [1] fine-tuned o
 The model uses the original `scivocab` wordpiece vocabulary and was trained using the **average pooling strategy** and a **softmax loss**.
-**Base model**: `allenai/scibert-scivocab-cased` from HuggingFace AutoModel
+**Base model**: `allenai/scibert-scivocab-cased` from HuggingFace's `AutoModel`.
+**Training time**: ~4 hours on the NVIDIA Tesla P100 GPU provided in Kaggle Notebooks.
 **Parameters**:
 | Parameter        | Value |
-|----------------|-------|
+|------------------|-------|
 | Batch size       | 64    |
 | Training steps   | 20000 |
 | Warmup steps     | 1450  |
+| Lowercasing      | True  |
+| Max. Seq. Length | 128   |
 **Performances**: The performance was evaluated on the test portion of the [STS dataset](http://ixa2.si.ehu.es/stswiki/index.php/STSbenchmark) using Spearman rank correlation and compared to the performances of a general BERT base model obtained with the same procedure to verify their similarity.
 | Model                         | Score       |
-|-----------------------------|-------------|
+|-------------------------------|-------------|
-| `scibert-nli` (ours)        | 74.50       |
+| `scibert-nli` (this)          | 74.50       |
-| `bert-base-nli-mean-tokens` | 77.12       |
+| `bert-base-nli-mean-tokens`[3]| 77.12       |
 An example usage for similarity-based scientific paper retrieval is provided in the [Covid Papers Browser](https://github.com/gsarti/covid-papers-browser) repository.
@@ -30,3 +32,5 @@ An example usage for similarity-based scientific paper retrieval is provided in
 [1] I. Beltagy et al, [SciBERT: A Pretrained Language Model for Scientific Text](https://www.aclweb.org/anthology/D19-1371/)
 [2] A. Conneau et al., [Supervised Learning of Universal Sentence Representations from Natural Language Inference Data](https://www.aclweb.org/anthology/D17-1070/)
+[3] N. Reimers et I. Gurevych, [Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks](https://www.aclweb.org/anthology/D19-1410/)
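
The card states that the model was trained with the **average pooling strategy**. As a rough illustration of what that term usually means, and not the training code behind this commit, the sketch below averages per-token vectors into a single sentence vector while skipping padding positions. The function name `mean_pool` and the toy inputs are hypothetical; in a real pipeline the token vectors would presumably come from the transformer's last hidden state, and the mask from the tokenizer.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]],
              attention_mask: List[int]) -> List[float]:
    """Average token vectors, ignoring positions whose mask value is 0.

    Hypothetical helper for illustration only; it assumes one mask entry
    per token and that all token vectors have the same length.
    """
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("one mask entry is required per token")

    # Keep only the vectors of real (non-padding) tokens.
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m]
    if not kept:
        raise ValueError("at least one non-padding token is required")

    dim = len(kept[0])
    # Component-wise mean over the kept token vectors.
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy input: three token vectors of size 2; the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
sentence_vector = mean_pool(tokens, mask)
print(sentence_vector)  # [2.0, 3.0]
```

Masking matters here because the card lists a maximum sequence length of 128: shorter inputs are typically padded up to a common length, and averaging over padding vectors would distort the sentence embedding.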