# Retrieve & Re-Rank
In [Semantic Search](../semantic-search/README.md) we showed how to use a SentenceTransformer to compute embeddings for queries, sentences, and paragraphs, and how to use these embeddings for semantic search.

For complex search tasks, for example question answering retrieval, search can be improved significantly by using **Retrieve & Re-Rank**.

## Retrieve & Re-Rank Pipeline

The following pipeline for Information Retrieval / Question Answering Retrieval works very well. All components are provided and explained in this article:

![InformationRetrieval](https://raw.githubusercontent.com/UKPLab/sentence-transformers/master/docs/img/InformationRetrieval.png)

Given a search query, we first use a **retrieval system** that retrieves a large list of e.g. 100 possible hits which are potentially relevant for the query. For the retrieval, we can use either lexical search, e.g. with Elasticsearch, or dense retrieval with a bi-encoder. However, the retrieval system might return documents that are not that relevant for the search query. Hence, in a second stage, we use a **re-ranker** based on a **cross-encoder** that scores the relevancy of all candidates for the given search query. The output is a ranked list of hits that we can present to the user.
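
As a rough sketch of this two-stage pipeline (the model names and the toy corpus here are illustrative choices, not fixed parts of the pipeline; both stages are explained in the sections below):

```python
from sentence_transformers import CrossEncoder, SentenceTransformer, util

# Stage 1: a bi-encoder retrieves candidates from the full corpus
bi_encoder = SentenceTransformer("multi-qa-mpnet-base-dot-v1")
# Stage 2: a cross-encoder re-ranks the retrieved candidates
cross_encoder = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

corpus = [
    "Python is a high-level programming language.",
    "The python is a large snake found in Africa and Asia.",
    "London is the capital of England.",
]
corpus_embeddings = bi_encoder.encode(corpus, convert_to_tensor=True)

query = "What is Python?"
query_embedding = bi_encoder.encode(query, convert_to_tensor=True)

# Retrieve e.g. the top 2 candidates; this model was trained for dot-product similarity
hits = util.semantic_search(
    query_embedding, corpus_embeddings, top_k=2, score_function=util.dot_score
)[0]

# Re-rank by scoring each (query, document) pair jointly with the cross-encoder
pairs = [(query, corpus[hit["corpus_id"]]) for hit in hits]
scores = cross_encoder.predict(pairs)
for score, (_, doc) in sorted(zip(scores, pairs), key=lambda x: x[0], reverse=True):
    print(f"{score:.2f}\t{doc}")
```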

## Retrieval: Bi-Encoder
For the retrieval of the candidate set, we can either use lexical search (e.g. [Elasticsearch](https://www.elastic.co/elasticsearch/)), or we can use a bi-encoder which is implemented in Sentence Transformers.

Lexical search looks for literal matches of the query words in your document collection. It will not recognize synonyms, acronyms or spelling variations. In contrast, semantic search (or dense retrieval) encodes the search query into vector space and retrieves the document embeddings that are close in vector space. 

![SemanticSearch](https://raw.githubusercontent.com/UKPLab/sentence-transformers/master/docs/img/SemanticSearch.png)

Semantic search overcomes the shortcomings of lexical search and can recognize synonyms and acronyms. Have a look at the [semantic search article](../semantic-search/README.md) for different options to implement semantic search.
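
To make the difference concrete, here is a small sketch that contrasts the two approaches. It assumes the third-party `rank_bm25` package (`pip install rank-bm25`) for the lexical side; the query and documents are made up for illustration:

```python
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer, util

corpus = [
    "London is the capital of the United Kingdom.",
    "Paris is the capital of France.",
]
query = "capital of the UK"  # "UK" never appears literally in the corpus

# Lexical search: BM25 only rewards literal word overlap, so the token "UK"
# contributes nothing and both documents look similar
bm25 = BM25Okapi([doc.lower().split() for doc in corpus])
print(bm25.get_scores(query.lower().split()))

# Dense retrieval: the bi-encoder places "UK" and "United Kingdom" close together
# in vector space, so the first document should score higher
model = SentenceTransformer("multi-qa-mpnet-base-dot-v1")
scores = util.dot_score(model.encode(query), model.encode(corpus))
print(scores)
```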


## Re-Ranker: Cross-Encoder

The retriever has to be efficient for large document collections with millions of entries. However, it might return irrelevant candidates. A re-ranker based on a Cross-Encoder can substantially improve the final results for the user. The query and a possible document are passed simultaneously to the transformer network, which then outputs a single score between 0 and 1 indicating how relevant the document is to the given query.

![CrossEncoder](https://raw.githubusercontent.com/UKPLab/sentence-transformers/master/docs/img/CrossEncoder.png)

The advantage of Cross-Encoders is their higher performance, as they perform attention across the query and the document. However, scoring thousands or millions of (query, document) pairs would be rather slow. Hence, we use the retriever to create a set of e.g. 100 possible candidates, which are then re-ranked by the Cross-Encoder.
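
A minimal sketch of the re-ranking step (in a real pipeline the candidates would come from the retriever; here they are hard-coded for brevity):

```python
from sentence_transformers import CrossEncoder

# Load a pre-trained MS MARCO re-ranker (see the last section of this article)
cross_encoder = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

query = "How many people live in Berlin?"
candidates = [
    "Berlin has a population of around 3.7 million people.",
    "Berlin is well known for its many museums.",
]

# Each (query, candidate) pair is scored jointly by the transformer network
scores = cross_encoder.predict([(query, candidate) for candidate in candidates])

# Present the candidates in order of descending relevance
for score, candidate in sorted(zip(scores, candidates), key=lambda x: x[0], reverse=True):
    print(f"{score:.2f}\t{candidate}")
```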

## Example Scripts

* **[retrieve_rerank_simple_wikipedia.ipynb](retrieve_rerank_simple_wikipedia.ipynb)** [ [Colab Version](https://colab.research.google.com/github/UKPLab/sentence-transformers/blob/master/examples/applications/retrieve_rerank/retrieve_rerank_simple_wikipedia.ipynb) ]: This script uses the smaller [Simple English Wikipedia](https://simple.wikipedia.org/wiki/Main_Page) as document collection to provide answers to user questions / search queries. First, we split all Wikipedia articles into paragraphs and encode them with a bi-encoder. If a new query / question is entered, it is encoded by the same bi-encoder and the paragraphs with the highest cosine-similarity are retrieved (see [semantic search](../semantic-search/README.md)). Next, the retrieved candidates are scored by a Cross-Encoder re-ranker and the 5 passages with the highest score from the Cross-Encoder are presented to the user.
* **[in_document_search_crossencoder.py](in_document_search_crossencoder.py):** If you only have a small set of paragraphs, we skip the retrieval stage. This is, for example, the case if you want to perform search within a single document. In this example, we take the Wikipedia article about Europe and split it into paragraphs. Then, the search query / question and all paragraphs are scored by the Cross-Encoder re-ranker, and the most relevant passages for the query are returned.


## Pre-trained Bi-Encoders (Retrieval)

The bi-encoder produces embeddings independently for your paragraphs and for your search queries. You can use it like this:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("multi-qa-mpnet-base-dot-v1")

docs = [
    "My first paragraph. That contains information",
    "Python is a programming language.",
]
document_embeddings = model.encode(docs)

query = "What is Python?"
query_embedding = model.encode(query)
```

For more details on how to compare the embeddings, see [semantic search](../semantic-search/README.md).
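
For example, continuing the snippet above, a dot product is a natural choice here because multi-qa-mpnet-base-dot-v1 was trained for dot-product similarity:

```python
from sentence_transformers import util

# Compare the query embedding against all document embeddings
scores = util.dot_score(query_embedding, document_embeddings)
print(scores)  # one score per document; higher means more relevant
```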

We provide pre-trained models based on:
- **MS MARCO:** 500k real user queries from the Bing search engine. See [MS MARCO models](../../../docs/pretrained-models/msmarco-v3.html)

## Pre-trained Cross-Encoders (Re-Ranker)

For pre-trained Cross-Encoder models, see [MS MARCO Cross-Encoders](../../../docs/pretrained-models/ce-msmarco.html)