Commit 0c75f0f9 authored by Songyang Zhang, committed by GitHub

[Update] Update introduction of CompassBench-2024-Q1 (#769)



* [Doc] Update Example of CompassBench

* [Doc] Update Example of CompassBench

* [Doc] Update Example of CompassBench

* update

* Update docs/zh_cn/advanced_guides/compassbench_intro.md
Co-authored-by: Fengzhe Zhou <zfz-960727@163.com>

---------
Co-authored-by: Fengzhe Zhou <zfz-960727@163.com>
parent 2163f939
...@@ -6,7 +6,8 @@ exclude: | ...@@ -6,7 +6,8 @@ exclude: |
opencompass/openicl/icl_evaluator/hf_metrics/| opencompass/openicl/icl_evaluator/hf_metrics/|
opencompass/datasets/lawbench/utils| opencompass/datasets/lawbench/utils|
opencompass/datasets/lawbench/evaluation_functions/| opencompass/datasets/lawbench/evaluation_functions/|
opencompass/datasets/medbench opencompass/datasets/medbench|
docs/zh_cn/advanced_guides/compassbench_intro.md
) )
repos: repos:
- repo: https://gitee.com/openmmlab/mirrors-flake8 - repo: https://gitee.com/openmmlab/mirrors-flake8
......
...@@ -6,7 +6,8 @@ exclude: | ...@@ -6,7 +6,8 @@ exclude: |
opencompass/openicl/icl_evaluator/hf_metrics/| opencompass/openicl/icl_evaluator/hf_metrics/|
opencompass/datasets/lawbench/utils| opencompass/datasets/lawbench/utils|
opencompass/datasets/lawbench/evaluation_functions/| opencompass/datasets/lawbench/evaluation_functions/|
opencompass/datasets/medbench/ opencompass/datasets/medbench/|
docs/zh_cn/advanced_guides/compassbench_intro.md
) )
repos: repos:
- repo: https://github.com/PyCQA/flake8 - repo: https://github.com/PyCQA/flake8
......
...@@ -31,6 +31,8 @@ At that time, we will release rankings for both open-source models and commercia ...@@ -31,6 +31,8 @@ At that time, we will release rankings for both open-source models and commercia
We sincerely invite various large models to join the OpenCompass to showcase their performance advantages in different fields. At the same time, we also welcome researchers and developers to provide valuable suggestions and contributions to jointly promote the development of the LLMs. If you have any questions or needs, please feel free to [contact us](mailto:opencompass@pjlab.org.cn). In addition, relevant evaluation contents, performance statistics, and evaluation methods will be open-source along with the leaderboard release. We sincerely invite various large models to join the OpenCompass to showcase their performance advantages in different fields. At the same time, we also welcome researchers and developers to provide valuable suggestions and contributions to jointly promote the development of the LLMs. If you have any questions or needs, please feel free to [contact us](mailto:opencompass@pjlab.org.cn). In addition, relevant evaluation contents, performance statistics, and evaluation methods will be open-source along with the leaderboard release.
We have provided the more details of the CompassBench 2023 in [Doc](docs/zh_cn/advanced_guides/compassbench_intro.md).
Let's look forward to the release of the OpenCompass 2023 LLM Annual Leaderboard! Let's look forward to the release of the OpenCompass 2023 LLM Annual Leaderboard!
## 🧭 Welcome ## 🧭 Welcome
......
...@@ -31,6 +31,8 @@ ...@@ -31,6 +31,8 @@
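We sincerely invite all kinds of large models to join the OpenCompass evaluation system to showcase their performance advantages in various fields. We also welcome researchers and developers to offer valuable opinions and suggestions to jointly advance the field of large models. If you have any questions or needs, please feel free to [contact us](mailto:opencompass@pjlab.org.cn). In addition, the relevant evaluation content, performance data, and evaluation methods will be open-sourced together with the leaderboard release. We sincerely invite all kinds of large models to join the OpenCompass evaluation system to showcase their performance advantages in various fields. We also welcome researchers and developers to offer valuable opinions and suggestions to jointly advance the field of large models. If you have any questions or needs, please feel free to [contact us](mailto:opencompass@pjlab.org.cn). In addition, the relevant evaluation content, performance data, and evaluation methods will be open-sourced together with the leaderboard release.
We provide some example questions used in this evaluation; for details, see [CompassBench 2023](docs/zh_cn/advanced_guides/compassbench_intro.md).
<p>Let's look forward together to the release of the OpenCompass 2023 Annual LLM Leaderboard and to the outstanding performance of the models on it!</p> <p>Let's look forward together to the release of the OpenCompass 2023 Annual LLM Leaderboard and to the outstanding performance of the models on it!</p>
## 🧭 Welcome ## 🧭 Welcome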
......
This diff is collapsed.
...@@ -70,6 +70,7 @@ OpenCompass 上手路线 ...@@ -70,6 +70,7 @@ OpenCompass 上手路线
advanced_guides/subjective_evaluation.md advanced_guides/subjective_evaluation.md
advanced_guides/circular_eval.md advanced_guides/circular_eval.md
advanced_guides/contamination_eval.md advanced_guides/contamination_eval.md
advanced_guides/compassbench_intro.md
.. _工具: .. _工具:
.. toctree:: .. toctree::
......