@@ -15,7 +15,7 @@ English | [简体中文](README_zh-CN.md)
</div>
</div>
Welcome to **OpenCompass**!
Just like a compass guides us on our journey, OpenCompass will guide you through the complex landscape of evaluating large language models. With its powerful algorithms and intuitive interface, OpenCompass makes it easy to assess the quality and effectiveness of your NLP models.
...
@@ -37,7 +37,6 @@ OpenCompass is a one-stop platform for large model evaluation, aiming to provide
We provide the [OpenCompass Leaderboard](https://opencompass.org.cn/rank) for the community to rank all public models and API models. If you would like to join the evaluation, please send the model repository URL or a standard API interface to `opencompass@pjlab.org.cn`.
@@ -60,7 +60,7 @@ Here's a detailed step-by-step explanation of this case study:
<details>
<details>
<summary>prepare datasets</summary>
<summary>prepare datasets</summary>
The SiQA and PiQA benchmarks are downloaded automatically through their respective links, so no manual download is required. Some other datasets may require manual downloads; see [Prepare Datasets](docs/zh_cn/user_guides/dataset_prepare.md) for more information.
The SiQA and PiQA benchmarks are downloaded automatically through their respective links, so no manual download is required. Some other datasets may require manual downloads; see [Prepare Datasets](./user_guides/dataset_prepare.md) for more information.
Create a '.py' configuration file and add the following content:
@@ -39,11 +39,17 @@ The datasets supported by OpenCompass mainly include two parts:
[Huggingface Dataset](https://huggingface.co/datasets) provides a large number of datasets. OpenCompass supports most of the datasets commonly used for performance comparison; see `configs/dataset` for the full list of supported datasets.
2. OpenCompass Self-built Datasets
2. Third-party Datasets
In addition to supporting Huggingface's existing datasets, OpenCompass also provides some self-built Chinese-language datasets. In the future, a dataset-related link will be provided for users to download and use. To complete dataset preparation, follow the instructions in the document and place the datasets in the `./data` directory.
In addition to supporting Huggingface's existing datasets, OpenCompass also provides some third-party and self-built datasets. To complete dataset preparation, run the following commands to download the datasets and place them in the `./data` directory.
Note that the repository contains not only self-built datasets but also some HF-supported datasets, which are included for testing convenience.
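
After downloading, it can help to confirm that the datasets actually landed in `./data` before starting an evaluation. The following is a minimal sketch, not part of OpenCompass: the helper name `list_prepared_datasets` is hypothetical, and it assumes only that each dataset appears as a top-level entry under the data directory.

```python
from pathlib import Path


def list_prepared_datasets(data_dir: str = "./data") -> list[str]:
    """Return the sorted names of the entries found directly under data_dir.

    Hypothetical helper for illustration; the one-entry-per-dataset layout
    is an assumption, not something OpenCompass guarantees.
    """
    root = Path(data_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"{root} does not exist; download the datasets first")
    # Skip hidden files such as .DS_Store or .gitkeep.
    return sorted(
        entry.name for entry in root.iterdir() if not entry.name.startswith(".")
    )
```

Calling `list_prepared_datasets()` from the repository root lists whatever is present under `./data`; a `FileNotFoundError` indicates that the download step has not been run yet.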