We have released various pre-trained Cross Encoder models via our [Cross Encoder Hugging Face organization](https://huggingface.co/models?author=cross-encoder). Additionally, numerous community CrossEncoder models have been publicly released on the Hugging Face Hub.
Each of these models can be easily downloaded and used like so:
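For example, a minimal sketch (`model_name` is a placeholder for any of the models listed on this page):

```python
from sentence_transformers import CrossEncoder

# "model_name" is a placeholder for any CrossEncoder model on the Hub
model = CrossEncoder("model_name", max_length=512)
scores = model.predict([("Query", "Paragraph1"), ("Query", "Paragraph2")])
```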
Cross-Encoders require text pairs as input and output a score of 0...1 (if the Sigmoid activation function is used). They do not work for individual sentences and they do not compute embeddings for individual texts.
## MS MARCO
[MS MARCO Passage Retrieval](https://github.com/microsoft/MSMARCO-Passage-Ranking) is a large dataset of real user queries from the Bing search engine, annotated with relevant text passages.
```eval_rst
.. note::

    You can initialize these models with ``default_activation_function=torch.nn.Sigmoid()`` to force the model to return scores between 0 and 1. Otherwise, the raw value can reasonably range between -10 and 10.
```
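For example, a minimal sketch using one of the MS MARCO models listed below:

```python
import torch
from sentence_transformers import CrossEncoder

# Apply a Sigmoid to the raw logits so scores fall in [0, 1]
model = CrossEncoder(
    "cross-encoder/ms-marco-MiniLM-L-6-v2",
    default_activation_function=torch.nn.Sigmoid(),
)
```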
- [cross-encoder/ms-marco-TinyBERT-L-2-v2](https://huggingface.co/cross-encoder/ms-marco-TinyBERT-L-2-v2) - MRR@10 on MS Marco Dev Set: 32.56
- [cross-encoder/ms-marco-MiniLM-L-2-v2](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-2-v2) - MRR@10 on MS Marco Dev Set: 34.85
- [cross-encoder/ms-marco-MiniLM-L-4-v2](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-4-v2) - MRR@10 on MS Marco Dev Set: 37.70
- [cross-encoder/ms-marco-MiniLM-L-6-v2](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-6-v2) - MRR@10 on MS Marco Dev Set: 39.01
- [cross-encoder/ms-marco-MiniLM-L-12-v2](https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-12-v2) - MRR@10 on MS Marco Dev Set: 39.02
For details on the usage, see [Retrieve & Re-Rank](../../examples/applications/retrieve_rerank/README.md) or [MS MARCO Cross-Encoders](../pretrained-models/ce-msmarco.md).
## SQuAD (QNLI)
QNLI is based on the [SQuAD dataset](https://rajpurkar.github.io/SQuAD-explorer/) ([HF](https://huggingface.co/datasets/rajpurkar/squad)) and was introduced by the [GLUE Benchmark](https://arxiv.org/abs/1804.07461) ([HF](https://huggingface.co/datasets/nyu-mll/glue)). Given a passage from Wikipedia, annotators created questions that are answerable by that passage.
- [cross-encoder/qnli-distilroberta-base](https://huggingface.co/cross-encoder/qnli-distilroberta-base) - Accuracy on QNLI dev set: 90.96
- [cross-encoder/qnli-electra-base](https://huggingface.co/cross-encoder/qnli-electra-base) - Accuracy on QNLI dev set: 93.21
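A minimal usage sketch (the example question and passages are illustrative; the score indicates how likely the passage answers the question):

```python
from sentence_transformers import CrossEncoder

model = CrossEncoder("cross-encoder/qnli-electra-base")
scores = model.predict([
    ("What is the capital of France?", "Paris is the capital of France."),
    ("What is the capital of France?", "The Eiffel Tower is made of iron."),
])
# => scores close to 1 indicate that the passage answers the question
```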
## STSbenchmark
The following models were trained on the [STSbenchmark](https://huggingface.co/datasets/sentence-transformers/stsb) dataset. They return a score 0...1 indicating the semantic similarity of the given sentence pair.
- [cross-encoder/stsb-TinyBERT-L-4](https://huggingface.co/cross-encoder/stsb-TinyBERT-L-4) - STSbenchmark test performance: 85.50
- [cross-encoder/stsb-distilroberta-base](https://huggingface.co/cross-encoder/stsb-distilroberta-base) - STSbenchmark test performance: 87.92
- [cross-encoder/stsb-roberta-base](https://huggingface.co/cross-encoder/stsb-roberta-base) - STSbenchmark test performance: 90.17
- [cross-encoder/stsb-roberta-large](https://huggingface.co/cross-encoder/stsb-roberta-large) - STSbenchmark test performance: 91.47
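A minimal usage sketch (example sentences are illustrative):

```python
from sentence_transformers import CrossEncoder

model = CrossEncoder("cross-encoder/stsb-roberta-base")
scores = model.predict([
    ("A man is eating pizza", "A man eats something"),
    ("A man is eating pizza", "The girl is carrying a baby"),
])
# => scores in [0, 1]; higher means more semantically similar
```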
## Quora Duplicate Questions
These models have been trained on the [Quora duplicate questions dataset](https://huggingface.co/datasets/sentence-transformers/quora-duplicates). They can be used like the STSb models and give a score 0...1 indicating the probability that the two questions are duplicates.
-[cross-encoder/quora-distilroberta-base](https://huggingface.co/cross-encoder/quora-distilroberta-base) - Average Precision dev set: 87.48
-[cross-encoder/quora-roberta-base](https://huggingface.co/cross-encoder/quora-roberta-base) - Average Precision dev set: 87.80
-[cross-encoder/quora-roberta-large](https://huggingface.co/cross-encoder/quora-roberta-large) - Average Precision dev set: 87.91
```eval_rst
.. note::

    These models do not work for question similarity. The questions *How to learn Java* and *How to learn Python* will get a low score, as they are not duplicates. For question similarity, the respective bi-encoder trained on the Quora dataset yields much more meaningful results.
```
## NLI
Given two sentences, are they contradicting each other, entailing one another, or neutral? The following models were trained on the [SNLI](https://huggingface.co/datasets/stanfordnlp/snli) and [MultiNLI](https://huggingface.co/datasets/nyu-mll/multi_nli) datasets.
- [cross-encoder/nli-deberta-v3-base](https://huggingface.co/cross-encoder/nli-deberta-v3-base) - Accuracy on MNLI mismatched set: 90.04
- [cross-encoder/nli-deberta-base](https://huggingface.co/cross-encoder/nli-deberta-base) - Accuracy on MNLI mismatched set: 88.08
- [cross-encoder/nli-deberta-v3-xsmall](https://huggingface.co/cross-encoder/nli-deberta-v3-xsmall) - Accuracy on MNLI mismatched set: 87.77
- [cross-encoder/nli-deberta-v3-small](https://huggingface.co/cross-encoder/nli-deberta-v3-small) - Accuracy on MNLI mismatched set: 87.55
- [cross-encoder/nli-roberta-base](https://huggingface.co/cross-encoder/nli-roberta-base) - Accuracy on MNLI mismatched set: 87.47
- [cross-encoder/nli-MiniLM2-L6-H768](https://huggingface.co/cross-encoder/nli-MiniLM2-L6-H768) - Accuracy on MNLI mismatched set: 86.89
- [cross-encoder/nli-distilroberta-base](https://huggingface.co/cross-encoder/nli-distilroberta-base) - Accuracy on MNLI mismatched set: 83.98
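A minimal usage sketch: these models output three scores per pair, which can be mapped to the labels contradiction, entailment, and neutral (example sentences are illustrative; see the model cards for the label order):

```python
from sentence_transformers import CrossEncoder

model = CrossEncoder("cross-encoder/nli-deberta-v3-base")
scores = model.predict([
    ("A man is eating pizza", "A man eats something"),
    ("A black race car starts up in front of a crowd of people.", "A man is driving down a lonely road."),
])

# Convert the three-way scores into labels
label_mapping = ["contradiction", "entailment", "neutral"]
labels = [label_mapping[score_max] for score_max in scores.argmax(axis=1)]
# => ['entailment', 'contradiction']
```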
## Training Cross-Encoders
See the following examples for how to train Cross-Encoders:
- [training_stsbenchmark.py](../../../examples/training/cross-encoder/training_stsbenchmark.py) - Example of how to train for Semantic Textual Similarity (STS) on the STS benchmark dataset.
- [training_quora_duplicate_questions.py](../../../examples/training/cross-encoder/training_quora_duplicate_questions.py) - Example of how to train a Cross-Encoder to predict if two questions are duplicates. Uses the Quora Duplicate Questions dataset for training.
- [training_nli.py](../../../examples/training/cross-encoder/training_nli.py) - Example of a multi-label classification task for Natural Language Inference (NLI).
```eval_rst
.. note::

    The CrossEncoder training approach was not updated in v3.0, when `training Sentence Transformer models <../sentence_transformer/training_overview.html>`_ was improved. Improving CrossEncoder training is planned for a future major update.
```
The `CrossEncoder` class is a wrapper around Hugging Face's `AutoModelForSequenceClassification`, with some additional methods that make training and predicting scores a little easier. The saved models are 100% compatible with Hugging Face and can also be loaded with their classes.
First, you need some sentence pair data. The labels can either be continuous scores, like:
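A minimal sketch using `InputExample` (the sentences and scores are illustrative):

```python
from sentence_transformers import InputExample

train_samples = [
    InputExample(texts=["sentence1", "sentence2"], label=0.3),
    InputExample(texts=["Another", "pair"], label=0.8),
]
```

Or distinct classes, as in the [training_nli.py](../../../examples/training/cross-encoder/training_nli.py) example above, in which case `label` is an integer class index.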
Then, you define the base model and the number of labels. You can take any [Hugging Face pre-trained model](https://huggingface.co/models) that is compatible with AutoModel:
For binary tasks and tasks with continuous scores (like STS), we set `num_labels=1`. For classification tasks, we set it to the number of labels we have.
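For example (the base model name is illustrative):

```python
from sentence_transformers import CrossEncoder

# num_labels=1 for binary/continuous-score tasks;
# set it to the number of classes for classification tasks
model = CrossEncoder("distilroberta-base", num_labels=1)
```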
```eval_rst
We start the training by calling :meth:`CrossEncoder.fit <sentence_transformers.cross_encoder.CrossEncoder.fit>`:
```
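A minimal sketch, continuing from the training samples and model defined above:

```python
from torch.utils.data import DataLoader

train_dataloader = DataLoader(train_samples, shuffle=True, batch_size=16)

model.fit(
    train_dataloader=train_dataloader,
    epochs=1,
    warmup_steps=100,
)
```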
("How many people live in Berlin?","Berlin had a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers."),
("How many people live in Berlin?","Berlin is well known for its museums."),
])
#=>array([8.607138,-4.3200774],dtype=float32)
#3.Rankalistofpassagesforaquery
query="How many people live in Berlin?"
passages=[
"Berlin had a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.",
"Berlin is well known for its museums.",
"In 2014, the city state Berlin had 37,368 live births (+6.6%), a record number since 1991.",
"The urban area of Berlin comprised about 4.1 million people in 2014, making it the seventh most populous urban area in the European Union.",
"The city of Paris had a population of 2,165,423 people within its administrative city limits as of January 1, 2019",
"An estimated 300,000-420,000 Muslims reside in Berlin, making up about 8-11 percent of the population.",
"Berlin is subdivided into 12 boroughs or districts (Bezirke).",
"In 2015, the total labour force in Berlin was 1.85 million.",
"In 2013 around 600,000 Berliners were registered in one of the more than 2,300 sport and fitness clubs.",
"Berlin has a yearly total of about 135 million day visitors, which puts it in third place among the most-visited city destinations in the European Union.",
This prints:

```
Query: How many people live in Berlin?
8.92    The urban area of Berlin comprised about 4.1 million people in 2014, making it the seventh most populous urban area in the European Union.
8.61    Berlin had a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.
8.24    An estimated 300,000-420,000 Muslims reside in Berlin, making up about 8-11 percent of the population.
7.60    In 2014, the city state Berlin had 37,368 live births (+6.6%), a record number since 1991.
6.35    In 2013 around 600,000 Berliners were registered in one of the more than 2,300 sport and fitness clubs.
5.42    Berlin has a yearly total of about 135 million day visitors, which puts it in third place among the most-visited city destinations in the European Union.
3.45    In 2015, the total labour force in Berlin was 1.85 million.
0.33    Berlin is subdivided into 12 boroughs or districts (Bezirke).
-4.24   The city of Paris had a population of 2,165,423 people within its administrative city limits as of January 1, 2019
```
## Installation
We recommend **Python 3.8+**, **[PyTorch 1.11.0+](https://pytorch.org/get-started/locally/)**, and **[transformers v4.34.0+](https://github.com/huggingface/transformers)**. There are three options to install Sentence Transformers:
* **Default:** This allows for loading, saving, and inference (i.e., getting embeddings) of models.
* **Default and Training**: All of the above plus training.
* **Development**: All of the above plus some dependencies for developing Sentence Transformers, see [Editable Install](#editable-install).
## Install with pip
```eval_rst
.. tab:: Default

    ::

        pip install -U sentence-transformers

.. tab:: Default and Training

    ::

        pip install -U "sentence-transformers[train]"

    To use `Weights and Biases <https://wandb.ai/>`_ to track your training logs, you should also install ``wandb`` **(recommended)**::

        pip install wandb

    And to track your Carbon Emissions while training and have this information automatically included in your model cards, also install ``codecarbon`` **(recommended)**::

        pip install codecarbon
```
## Install from Source
You can install `sentence-transformers` directly from source to take advantage of the bleeding edge `master` branch rather than the latest stable release:
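For example, directly via pip (assuming `git` is available):

```
pip install git+https://github.com/UKPLab/sentence-transformers.git
```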
## Editable Install
Alternatively, you can clone the latest version from the [repository](https://github.com/UKPLab/sentence-transformers) and install it directly from the source code:

```
git clone https://github.com/UKPLab/sentence-transformers
cd sentence-transformers
pip install -e .
```
These commands link the cloned `sentence-transformers` folder into your Python library path, so that this folder is used when importing `sentence-transformers`.
## Install PyTorch with CUDA support
If you want to use a GPU / CUDA, you must install PyTorch with a matching CUDA version. Follow [PyTorch - Get Started](https://pytorch.org/get-started/locally/) for details on how to install PyTorch with CUDA support.
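To verify that PyTorch can see your GPU, a quick sanity check:

```python
import torch

# True if a CUDA-capable GPU is visible to PyTorch
print(torch.cuda.is_available())
```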