<h1 align="center">Chinese Massive Text Embedding Benchmark</h1>
<p align="center">
    <a href="https://www.python.org/">
            <img alt="Build" src="https://img.shields.io/badge/Contribution-Welcome-blue">
    </a>
    <a href="https://huggingface.co/C-MTEB">
        <img alt="Build" src="https://img.shields.io/badge/C_MTEB-🤗-yellow">
    </a>
    <a href="https://www.python.org/">
        <img alt="Build" src="https://img.shields.io/badge/Made with-Python-red">
    </a>
</p>

<h4 align="center">
    <p>
        <a href="#installation">Installation</a> |
        <a href="#evaluation">Evaluation</a> |
        <a href="#leaderboard">Leaderboard</a> |
        <a href="#tasks">Tasks</a> |
        <a href="#acknowledgement">Acknowledgement</a>
    </p>
</h4>


## Installation
C-MTEB is developed on top of [MTEB](https://github.com/embeddings-benchmark/mteb). Install it from PyPI:
```bash
pip install -U C_MTEB
```
Or clone this repo and install it as an editable package:
```bash
git clone https://github.com/FlagOpen/FlagEmbedding.git
cd FlagEmbedding/C_MTEB
pip install -e .
```

## Evaluation

### Evaluate reranker
```bash
python eval_cross_encoder.py --model_name_or_path BAAI/bge-reranker-base
```

### Evaluate embedding model
* **With our scripts**

You can **reproduce the results of the `baai-general-embedding (bge)` models** using the provided Python script (see [eval_C-MTEB.py](./eval_C-MTEB.py)):
```bash
python eval_C-MTEB.py --model_name_or_path BAAI/bge-large-zh

# for MTEB leaderboard
python eval_MTEB.py --model_name_or_path BAAI/bge-large-en

```

* **With sentence-transformers**

You can use C-MTEB easily in the same way as [MTEB](https://github.com/embeddings-benchmark/mteb).

Note that original sentence-transformers models do not support instruction prefixes,
so this method cannot reproduce the performance of `bge-*` models.

```python
from mteb import MTEB
from C_MTEB import *
from sentence_transformers import SentenceTransformer

# Define the sentence-transformers model name
model_name = "bert-base-uncased"

model = SentenceTransformer(model_name)
evaluation = MTEB(task_langs=['zh'])
results = evaluation.run(model, output_folder=f"zh_results/{model_name}")
```
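The instruction limitation noted above can be worked around with a thin wrapper that prepends the instruction to every input before delegating to the underlying encoder. Below is a minimal, purely illustrative sketch: the `InstructedEncoder` and `DummyEncoder` names and the `"query: "` instruction string are assumptions for demonstration, not part of C-MTEB; for real `bge-*` evaluation, prefer the provided scripts.

```python
class InstructedEncoder:
    """Wrap any object exposing encode(sentences, **kwargs) so that a fixed
    instruction string is prepended to each sentence before encoding."""

    def __init__(self, base_model, instruction):
        self.base_model = base_model
        self.instruction = instruction

    def encode(self, sentences, **kwargs):
        # Prepend the instruction, then delegate to the wrapped encoder.
        return self.base_model.encode(
            [self.instruction + s for s in sentences], **kwargs
        )

# Stand-in encoder for illustration; in practice this would be a
# SentenceTransformer instance. It "embeds" a sentence as its length.
class DummyEncoder:
    def encode(self, sentences, **kwargs):
        return [[float(len(s))] for s in sentences]

wrapped = InstructedEncoder(DummyEncoder(), instruction="query: ")
print(wrapped.encode(["你好"]))  # the prefix is included before encoding
```

In practice, queries and passages may need different treatment (e.g. only queries get the instruction), so a real wrapper would likely distinguish the two.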


* **Using a custom model**

To evaluate a new model, you can load it via sentence-transformers if it is supported there.
Otherwise, implement the model as shown below: an `encode` function that takes a list of sentences as input and returns a list of embeddings (each embedding can be an `np.ndarray`, a `torch.Tensor`, etc.):

```python
class MyModel:
    def encode(self, sentences, batch_size=32, **kwargs):
        """ Returns a list of embeddings for the given sentences.
        Args:
            sentences (`List[str]`): List of sentences to encode
            batch_size (`int`): Batch size for the encoding

        Returns:
            `List[np.ndarray]` or `List[tensor]`: List of embeddings for the given sentences
        """
        pass

model = MyModel()
evaluation = MTEB(tasks=["T2Retrieval"])
evaluation.run(model)
```
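The `encode` interface above can be filled in with a concrete toy implementation. The sketch below uses deterministic hashed bag-of-characters vectors purely to show a self-contained, runnable model satisfying the interface; the `HashEmbedder` name and the hashing scheme are illustrative assumptions, and a real evaluation would use an actual embedding model.

```python
import hashlib

import numpy as np

class HashEmbedder:
    """Toy custom model: deterministic hashed bag-of-characters embeddings.
    Purely illustrative -- replace with a real encoder for meaningful scores."""

    def __init__(self, dim=64):
        self.dim = dim

    def encode(self, sentences, batch_size=32, **kwargs):
        embeddings = []
        for text in sentences:
            vec = np.zeros(self.dim, dtype=np.float32)
            for ch in text:
                # Hash each character into one of `dim` buckets and count it.
                idx = int(hashlib.md5(ch.encode("utf-8")).hexdigest(), 16) % self.dim
                vec[idx] += 1.0
            # L2-normalize so cosine similarity reduces to a dot product.
            norm = np.linalg.norm(vec)
            embeddings.append(vec / norm if norm > 0 else vec)
        return embeddings

model = HashEmbedder()
embs = model.encode(["深度学习", "机器学习"])
print(len(embs), embs[0].shape)
```

An instance of this class can be passed directly to `MTEB(...).run(model)` in place of `MyModel` above.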

## Acknowledgement

We thank the [Massive Text Embedding Benchmark](https://github.com/embeddings-benchmark/mteb) for its great toolkit and the Chinese NLP community for the open-source datasets.


## Citation

If you find this repository useful, please consider citing it:

```
@misc{c-pack,
      title={C-Pack: Packaged Resources To Advance General Chinese Embedding},
      author={Shitao Xiao and Zheng Liu and Peitian Zhang and Niklas Muennighoff},
      year={2023},
      eprint={2309.07597},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```