<div align="center">
  <img src="docs/zh_cn/_static/image/logo.svg" width="500px"/>
  <br />
  <br />

[![docs](https://readthedocs.org/projects/opencompass/badge)](https://opencompass.readthedocs.io/zh_CN)
[![license](https://img.shields.io/github/license/InternLM/opencompass.svg)](https://github.com/InternLM/opencompass/blob/main/LICENSE)

<!-- [![PyPI](https://badge.fury.io/py/opencompass.svg)](https://pypi.org/project/opencompass/) -->

[🌐Website](https://opencompass.org.cn/) |
[📘Documentation](https://opencompass.readthedocs.io/zh_CN/latest/index.html) |
[🛠️Installation](https://opencompass.readthedocs.io/zh_CN/latest/get_started.html#id1) |
[🤔Reporting Issues](https://github.com/InternLM/opencompass/issues/new/choose)

[English](/README.md) | 简体中文

</div>

<p align="center">
    👋 Join us on <a href="https://twitter.com/intern_lm" target="_blank">Twitter</a>, <a href="https://discord.gg/xa29JuW87d" target="_blank">Discord</a> and <a href="https://r.vansin.top/?r=internwx" target="_blank">WeChat</a>
</p>

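Welcome to OpenCompass!

Just as a compass guides us on a journey, we hope OpenCompass can guide you through the fog of evaluating large language models. OpenCompass provides a rich set of algorithms and features, and we hope it helps the community evaluate NLP models fairly, comprehensively, and with less effort.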

## News

- **\[2023.07.27\]** [CMMLU](https://github.com/haonan-li/CMMLU) is now supported! More datasets are welcome to join OpenCompass. 🔥🔥🔥
- **\[2023.07.21\]** Evaluation results for Llama-2 are now available on the OpenCompass [LLM leaderboard](https://opencompass.org.cn/leaderboard-llm)! 🔥🔥🔥
- **\[2023.07.19\]** [Llama-2](https://ai.meta.com/llama/) is now supported! Its evaluation results will be published soon. \[[Doc](./docs/zh_cn/get_started.md#安装)\] 🔥🔥🔥
- **\[2023.07.13\]** Released [MMBench](https://opencompass.org.cn/MMBench), a carefully curated dataset for evaluating the all-round abilities of multimodal models. 🔥🔥🔥

## Introduction

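OpenCompass is a one-stop platform for large model evaluation. Its main features are:

- **Open-source and reproducible**: a fair, public, and reproducible evaluation scheme for large models.

- **Comprehensive capability dimensions**: five capability dimensions covering 50+ datasets and about 300,000 questions, for a thorough assessment of model abilities.

- **Extensive model support**: 20+ HuggingFace and API models are already supported.

- **Efficient distributed evaluation**: a single command handles task partitioning and distributed evaluation, so a full evaluation of a hundred-billion-parameter model finishes within hours.

- **Diversified evaluation paradigms**: zero-shot, few-shot, and chain-of-thought evaluation are supported and can be combined with standard or dialogue-style prompt templates to draw out the best performance of each model.

- **Flexible extensibility**: want to add a new model or dataset, customize a more advanced task partitioning strategy, or even plug in a new cluster management system? Everything in OpenCompass is easy to extend!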
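
To make the evaluation paradigms listed above more concrete, here is a minimal sketch of how zero-shot, few-shot, and chain-of-thought prompts differ. It is an illustration only and does not use the OpenCompass API: the `build_prompt` helper, the `Q:`/`A:` template, and the reasoning cue are all assumptions made for this example.

```python
from typing import Sequence, Tuple

# (question, answer) pair used as an in-context demonstration (hypothetical type)
Example = Tuple[str, str]


def build_prompt(question: str,
                 examples: Sequence[Example] = (),
                 chain_of_thought: bool = False) -> str:
    """Build a plain-text prompt (illustrative helper, not part of OpenCompass).

    - no examples           -> zero-shot
    - one or more examples  -> few-shot
    - chain_of_thought=True -> start the answer with a reasoning cue
    """
    # Each demonstration is rendered as a solved question.
    parts = [f"Q: {q}\nA: {a}" for q, a in examples]
    # The target question is left unanswered, optionally with a reasoning cue.
    cue = "Let's think step by step." if chain_of_thought else ""
    parts.append(f"Q: {question}\nA: {cue}".rstrip())
    return "\n\n".join(parts)


zero_shot = build_prompt("What is 2 + 3?")
few_shot = build_prompt("What is 2 + 3?", examples=[("What is 1 + 1?", "2")])
cot = build_prompt("What is 2 + 3?", chain_of_thought=True)
```

The three variables hold the same question under the three paradigms. A dialogue-style template would wrap the same pieces in user and assistant turns instead of a single string.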

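## Leaderboard

We will progressively publish detailed performance leaderboards for open-source and API models; see the [OpenCompass Leaderboard](https://opencompass.org.cn/rank). To have a model evaluated, please send its repository URL or a standard API interface to `opencompass@pjlab.org.cn`.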

[![image](https://github.com/InternLM/opencompass/assets/13503330/76237116-a9dd-4207-abef-7ff73b89568a)](https://opencompass.org.cn/rank)

## Dataset Support

<table align="center">
  <tbody>
    <tr align="center" valign="bottom">
      <td>
        <b>Language</b>
      </td>
      <td>
        <b>Knowledge</b>
      </td>
      <td>
        <b>Reasoning</b>
      </td>
      <td>
        <b>Examination</b>
      </td>
      <td>
        <b>Understanding</b>
      </td>
    </tr>
    <tr valign="top">
      <td>
<details open>
<summary><b>Word Definition</b></summary>

- WiC
- SummEdits

</details>

<details open>
<summary><b>Idioms</b></summary>

- CHID

</details>

<details open>
<summary><b>Semantic Similarity</b></summary>

- AFQMC
- BUSTM

</details>

<details open>
<summary><b>Coreference Resolution</b></summary>

- CLUEWSC
- WSC
- WinoGrande

</details>

<details open>
<summary><b>Translation</b></summary>

- Flores

</details>
      </td>
      <td>
<details open>
<summary><b>Knowledge Question Answering</b></summary>

- BoolQ
- CommonSenseQA
- NaturalQuestions
- TriviaQA

</details>

<details open>
<summary><b>Multilingual Question Answering</b></summary>

- TyDi-QA

</details>
      </td>
      <td>
<details open>
<summary><b>Textual Entailment</b></summary>

- CMNLI
- OCNLI
- OCNLI_FC
- AX-b
- AX-g
- CB
- RTE

</details>

<details open>
<summary><b>Commonsense Reasoning</b></summary>

- StoryCloze
- StoryCloze-CN (coming soon)
- COPA
- ReCoRD
- HellaSwag
- PIQA
- SIQA

</details>

<details open>
<summary><b>Mathematical Reasoning</b></summary>

- MATH
- GSM8K

</details>

<details open>
<summary><b>Theorem Application</b></summary>

- TheoremQA

</details>

<details open>
<summary><b>Code</b></summary>

- HumanEval
- MBPP

</details>

<details open>
<summary><b>Comprehensive Reasoning</b></summary>

- BBH

</details>
      </td>
      <td>
<details open>
<summary><b>Junior High, High School, University, and Professional Examinations</b></summary>

- GAOKAO-2023
- CEval
- AGIEval
- MMLU
- GAOKAO-Bench
- CMMLU
- ARC

</details>
      </td>
      <td>
<details open>
<summary><b>Reading Comprehension</b></summary>

- C3
- CMRC
- DRCD
- MultiRC
- RACE

</details>

<details open>
<summary><b>Content Summary</b></summary>

- CSL
- LCSTS
- XSum

</details>

<details open>
<summary><b>Content Analysis</b></summary>

- EPRSTMT
- LAMBADA
- TNEWS

</details>
      </td>
    </tr>
  </tbody>
</table>

## Model Support

<table align="center">
  <tbody>
    <tr align="center" valign="bottom">
      <td>
        <b>Open-source Models</b>
      </td>
      <td>
        <b>API Models</b>
      </td>
      <!-- <td>
        <b>Custom Models</b>
      </td> -->
    </tr>
    <tr valign="top">
      <td>

- LLaMA
- Vicuna
- Alpaca
- Baichuan
- WizardLM
- ChatGLM-6B
- ChatGLM2-6B
- MPT
- Falcon
- TigerBot
- MOSS
- ……

</td>
<td>

- OpenAI
- Claude (coming soon)
- PaLM (coming soon)
- ……

</td>
<!-- <td>

- GLM
- ……

</td> -->
</tr>
  </tbody>
</table>

## Installation

The steps below show how to quickly install OpenCompass and prepare the datasets.

```bash
conda create --name opencompass python=3.10 pytorch torchvision pytorch-cuda -c nvidia -c pytorch -y
conda activate opencompass
git clone https://github.com/InternLM/opencompass opencompass
cd opencompass
pip install -e .
# Download the datasets into data/
wget https://github.com/InternLM/opencompass/releases/download/0.1.1/OpenCompassData.zip
unzip OpenCompassData.zip
```

Some third-party features, such as HumanEval and Llama, may require additional steps to work properly. For details, see the [Installation Guide](https://opencompass.readthedocs.io/zh_CN/latest/get_started.html).
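
After the download and unzip steps, it can help to confirm that the datasets actually landed where expected. The sketch below is a hypothetical helper, not part of OpenCompass. It assumes the archive unpacks into a `data/` directory under the repository root, as the comment in the install commands suggests, and it only checks that the directory exists and is not empty.

```python
from pathlib import Path


def data_dir_ready(root: str = "data") -> bool:
    """Return True if `root` is an existing directory with at least one entry.

    Hypothetical helper: the default path "data" is an assumption taken from
    the install comment, not a location verified against the archive layout.
    """
    path = Path(root)
    return path.is_dir() and any(path.iterdir())


if __name__ == "__main__":
    if data_dir_ready():
        print("data/ looks populated")
    else:
        print("data/ is missing or empty; re-run the wget and unzip steps")
```

Run it from the repository root. It does not validate individual datasets, so a passing check only means the archive was unpacked somewhere under `data/`.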

## Evaluation

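After installing OpenCompass and preparing the datasets as described above, read the [Quick Start](https://opencompass.readthedocs.io/zh_CN/latest/get_started.html#id3) to learn how to run an evaluation task.

For more tutorials, see our [documentation](https://opencompass.readthedocs.io/zh_CN/latest/index.html).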

## Acknowledgements

Some of the code in this project is adapted from [OpenICL](https://github.com/Shark-NLP/OpenICL).

## Citation

```bibtex
@misc{2023opencompass,
    title={OpenCompass: A Universal Evaluation Platform for Foundation Models},
    author={OpenCompass Contributors},
    howpublished = {\url{https://github.com/InternLM/OpenCompass}},
    year={2023}
}
```