Commit 7372e62b, authored by Gabriele Sarti, committed by GitHub

Added precisions in SciBERT-NLI model card (#3410)

parent 471cce24
@@ -4,24 +4,26 @@ This is the model [SciBERT](https://github.com/allenai/scibert) [1] fine-tuned o
The model uses the original `scivocab` wordpiece vocabulary and was trained using the **average pooling strategy** and a **softmax loss**.

**Base model**: `allenai/scibert-scivocab-cased` from HuggingFace's `AutoModel`.

**Training time**: ~4 hours on the NVIDIA Tesla P100 GPU provided in Kaggle Notebooks.
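
The average pooling strategy mentioned above can be illustrated with a small sketch. It assumes the encoder has already produced one vector per wordpiece token together with an attention mask marking real tokens (1) and padding (0); the function name `mean_pool` and the toy vectors are hypothetical and not taken from the model card. In practice the same computation would run on tensors rather than Python lists.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors of one sentence, skipping padded positions.

    token_embeddings: list of per-token vectors (lists of floats, same length)
    attention_mask:   list of 0/1 flags, 1 for real tokens and 0 for padding
    """
    dim = len(token_embeddings[0])
    sums = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for d in range(dim):
                sums[d] += vector[d]
    count = max(count, 1)  # guard against an all-padding input
    return [s / count for s in sums]


# Toy input: two real tokens and one padding token, embedding size 2.
sentence_vector = mean_pool(
    [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]],
    [1, 1, 0],
)
```

The padding vector is excluded from the average, so the sentence vector depends only on the real tokens.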
**Parameters**:

| Parameter        | Value |
|------------------|-------|
| Batch size       | 64    |
| Training steps   | 20000 |
| Warmup steps     | 1450  |
| Lowercasing      | True  |
| Max. Seq. Length | 128   |

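
The card lists warmup and training step counts but does not say which learning-rate schedule was used. As an illustration only, the sketch below assumes a linear warmup followed by a linear decay to zero, which is a common choice; the function name `lr_scale` is hypothetical, and only the two step counts come from the table.

```python
def lr_scale(step, warmup_steps=1450, total_steps=20000):
    """Multiplier applied to the base learning rate at a given training step.

    Assumed schedule (not confirmed by the model card): the multiplier rises
    linearly from 0 to 1 over the warmup steps, then falls linearly to 0 at
    the final training step.
    """
    if step < warmup_steps:
        return step / warmup_steps
    return max(0.0, (total_steps - step) / (total_steps - warmup_steps))


halfway_through_warmup = lr_scale(725)
end_of_warmup = lr_scale(1450)
end_of_training = lr_scale(20000)
```

Under this assumption the learning rate peaks at step 1450, about 7% of the way through training.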
**Performances**: Performance was evaluated on the test portion of the [STS dataset](http://ixa2.si.ehu.es/stswiki/index.php/STSbenchmark) using Spearman rank correlation and compared with that of a general BERT base model obtained with the same procedure, to verify that the two are similar.

| Model                           | Score |
|---------------------------------|-------|
| `scibert-nli` (this model)      | 74.50 |
| `bert-base-nli-mean-tokens` [3] | 77.12 |

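
The evaluation described above can be sketched as follows: compute a similarity score for each sentence pair, then measure how well the ranking of those scores agrees with the ranking of the human-annotated gold scores. This is a minimal pure-Python sketch, not the evaluation script used for the table. It assumes cosine similarity between sentence vectors as the predicted score, and the helper names and toy numbers are hypothetical. The reported scores appear to be correlations multiplied by 100, which is also an assumption.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sentence vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data: three sentence pairs with made-up vectors and gold scores.
pairs = [
    ([1.0, 0.0], [1.0, 0.1]),
    ([1.0, 0.0], [0.5, 0.5]),
    ([1.0, 0.0], [0.0, 1.0]),
]
gold_scores = [4.8, 2.5, 0.2]
predicted = [cosine(u, v) for u, v in pairs]
score = spearman(predicted, gold_scores)
```

Because Spearman correlation depends only on ranks, the predicted similarities do not need to be on the same scale as the gold scores.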
An example usage for similarity-based scientific paper retrieval is provided in the [Covid Papers Browser](https://github.com/gsarti/covid-papers-browser) repository.
@@ -30,3 +32,5 @@ An example usage for similarity-based scientific paper retrieval is provided in
[1] I. Beltagy et al., [SciBERT: A Pretrained Language Model for Scientific Text](https://www.aclweb.org/anthology/D19-1371/)

[2] A. Conneau et al., [Supervised Learning of Universal Sentence Representations from Natural Language Inference Data](https://www.aclweb.org/anthology/D17-1070/)

[3] N. Reimers and I. Gurevych, [Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks](https://www.aclweb.org/anthology/D19-1410/)