Initial commit

Commit c289ecc0 authored Oct 21, 2025 by xinghao
20 changed files
--- a/.codespellrc
+++ b/.codespellrc
+[codespell]
+skip = *.ipynb
+count =
+quiet-level = 3
+ignore-words-list = nd, ans, ques, rouge, softwares, wit
--- a/.github/ISSUE_TEMPLATE/1_bug-report.yml
+++ b/.github/ISSUE_TEMPLATE/1_bug-report.yml
+name: 🐞 Bug report
+description: Create a report to help us improve
+labels: ["bug"]
+title: "[Bug] "
+body:
+  - type: markdown
+    attributes:
+      value: |
+        For general questions or idea discussions, please post them to our [**Forum**](https://github.com/open-compass/opencompass/discussions).
+        If you have already identified the reason, we strongly appreciate you creating a new PR according to [the tutorial](https://opencompass.readthedocs.io/en/master/community/CONTRIBUTING.html)!
+        If you need our help, please fill in the following form to help us to identify the bug.
+  - type: checkboxes
+    attributes:
+      label: Prerequisite
+      description: Please check the following items before creating a new issue.
+      options:
+      - label: I have searched [Issues](https://github.com/open-compass/opencompass/issues/) and [Discussions](https://github.com/open-compass/opencompass/discussions) but cannot get the expected help.
+        required: true
+      - label: The bug has not been fixed in the [latest version](https://github.com/open-compass/opencompass).
+        required: true
+  - type: dropdown
+    id: task
+    attributes:
+      label: Type
+      description: The problem arises when
+      options:
+        - I'm evaluating with the officially supported tasks/models/datasets.
+        - I have modified the code (config is not considered code), or I'm working on my own tasks/models/datasets.
+    validations:
+      required: true
+  - type: textarea
+    id: environment
+    validations:
+      required: true
+    attributes:
+      label: Environment
+      description: |
+        Please run `python -c "import opencompass.utils;import pprint;pprint.pprint(dict(opencompass.utils.collect_env()))"` to collect necessary environment information and paste it here.
+      placeholder: |
+        ```python
+        # The output of the above command
+        ```
+  - type: textarea
+    attributes:
+      label: Reproduces the problem - code/configuration sample
+      description: |
+        Please provide a code or configuration sample that reproduces the problem you ran into. It can be a Colab link or just a code snippet.
+      placeholder: |
+        ```python
+        # Sample code to reproduce the problem
+        ```
+    validations:
+      required: true
+  - type: textarea
+    attributes:
+      label: Reproduces the problem - command or script
+      description: |
+        What command or script did you run?
+      placeholder: |
+        ```shell
+        The command or script you ran.
+        ```
+    validations:
+      required: true
+  - type: textarea
+    attributes:
+      label: Reproduces the problem - error message
+      description: |
+        Please provide the error message or logs you got, with the full traceback.
+        Tip: You can attach images or log files by dragging them into the text area.
+      placeholder: |
+        ```
+        The error message or logs you got, with the full traceback.
+        ```
+    validations:
+      required: true
+  - type: textarea
+    id: other
+    attributes:
+      label: Other information
+      description: |
+        Tell us anything else you think we should know.
+        1. What's your expected result?
+        2. What dataset did you use?
+        3. What do you think might be the reason?
--- a/.github/ISSUE_TEMPLATE/2_feature-request.yml
+++ b/.github/ISSUE_TEMPLATE/2_feature-request.yml
+name: 🚀 Feature request
+description: Suggest an idea for this project
+labels: ["enhancement"]
+title: "[Feature] "
+body:
+  - type: markdown
+    attributes:
+      value: |
+        For general questions or idea discussions, please post them to our [**Forum**](https://github.com/open-compass/opencompass/discussions).
+        If you have already implemented the feature, we strongly appreciate you creating a new PR according to [the tutorial](https://opencompass.readthedocs.io/en/master/community/CONTRIBUTING.html)!
+  - type: textarea
+    id: describe
+    validations:
+      required: true
+    attributes:
+      label: Describe the feature
+      description: |
+        What kind of feature do you want OpenCompass to add? If there is an official code release or third-party implementation, please also provide that information here; it would be very helpful.
+      placeholder: |
+        A clear and concise description of the motivation of the feature.
+        Ex1. It is inconvenient when \[....\].
+        Ex2. There is a recent paper \[....\], which is very helpful for \[....\].
+  - type: checkboxes
+    id: pr
+    attributes:
+      label: Will you implement it?
+      options:
+        - label: I would like to implement this feature and create a PR!
--- a/.github/ISSUE_TEMPLATE/3_bug-report_zh.yml
+++ b/.github/ISSUE_TEMPLATE/3_bug-report_zh.yml
+name: 🐞 报告 Bug
+description: 报告你在使用中遇到的不合预期的情况
+labels: ["bug"]
+title: "[Bug] "
+body:
+  - type: markdown
+    attributes:
+      value: |
+        我们推荐使用英语模板 Bug report，以便你的问题帮助更多人。
+        如果需要询问一般性的问题或者想法，请在我们的[**论坛**](https://github.com/open-compass/opencompass/discussions)讨论。
+        如果你已经有了解决方案，我们非常欢迎你直接创建一个新的 PR 来解决这个问题。创建 PR 的流程可以参考[文档](https://opencompass.readthedocs.io/zh_CN/master/community/CONTRIBUTING.html)。
+        如果你需要我们的帮助，请填写以下内容帮助我们定位 Bug。
+  - type: checkboxes
+    attributes:
+      label: 先决条件
+      description: 在创建新问题之前，请检查以下项目。
+      options:
+      - label: 我已经搜索过 [问题](https://github.com/open-compass/opencompass/issues/) 和 [讨论](https://github.com/open-compass/opencompass/discussions) 但未得到预期的帮助。
+        required: true
+      - label: 错误在 [最新版本](https://github.com/open-compass/opencompass) 中尚未被修复。
+        required: true
+  - type: dropdown
+    id: task
+    attributes:
+      label: 问题类型
+      description: 问题出现时
+      options:
+        - 我正在使用官方支持的任务/模型/数据集进行评估。
+        - 我修改了代码（配置不视为代码），或者我正在处理我自己的任务/模型/数据集。
+    validations:
+      required: true
+  - type: textarea
+    id: environment
+    validations:
+      required: true
+    attributes:
+      label: 环境
+      description: |
+        请运行 `python -c "import opencompass.utils;import pprint;pprint.pprint(dict(opencompass.utils.collect_env()))"` 来收集必要的环境信息并粘贴在此处。
+      placeholder: |
+        ```python
+        # 上述命令的输出
+        ```
+  - type: textarea
+    attributes:
+      label: 重现问题 - 代码/配置示例
+      description: |
+        请提供重现您遇到的问题的代码或配置示例。它可以是一个Colab链接或仅仅是一个代码片段。
+      placeholder: |
+        ```python
+        # 重现问题的示例代码
+        ```
+    validations:
+      required: true
+  - type: textarea
+    attributes:
+      label: 重现问题 - 命令或脚本
+      description: |
+        您运行了什么命令或脚本？
+      placeholder: |
+        ```shell
+        您运行的命令或脚本。
+        ```
+    validations:
+      required: true
+  - type: textarea
+    attributes:
+      label: 重现问题 - 错误信息
+      description: |
+        请提供您收到的错误消息或日志，并提供完整的追溯。
+        提示：您可以通过拖放图片或日志文件到文本区域来附加它们。
+      placeholder: |
+        ```
+        您收到的错误消息或日志，带有完整的追溯。
+        ```
+    validations:
+      required: true
+  - type: textarea
+    id: other
+    attributes:
+      label: 其他信息
+      description: |
+        告诉我们其他有价值的信息。
+        1. 你是否对代码或配置文件做了任何改动？
+        2. 你认为可能的原因是什么？
--- a/.github/ISSUE_TEMPLATE/4_feature-request_zh.yml
+++ b/.github/ISSUE_TEMPLATE/4_feature-request_zh.yml
+name: 🚀 功能建议
+description: 建议一项新的功能
+labels: ["enhancement"]
+title: "[Feature] "
+body:
+  - type: markdown
+    attributes:
+      value: |
+        推荐使用英语模板 Feature request，以便你的问题帮助更多人。
+        如果需要询问一般性的问题或者想法，请在我们的[**论坛**](https://github.com/open-compass/opencompass/discussions)讨论。
+        如果你已经实现了该功能，我们非常欢迎你直接创建一个新的 PR 来解决这个问题。创建 PR 的流程可以参考[文档](https://opencompass.readthedocs.io/zh_CN/master/community/CONTRIBUTING.html)。
+  - type: textarea
+    id: describe
+    validations:
+      required: true
+    attributes:
+      label: 描述该功能
+      description: |
+        你希望 OpenCompass 添加什么功能？如果存在相关的论文、官方实现或者第三方实现，请同时贴出链接，这将非常有帮助。
+      placeholder: |
+        简要说明该功能，及为什么需要该功能
+        例 1. 现在进行 xxx 的时候不方便
+        例 2. 最近的论文中提出了有一个很有帮助的 xx
+  - type: checkboxes
+    id: pr
+    attributes:
+      label: 是否希望自己实现该功能？
+      options:
+        - label: 我希望自己来实现这一功能，并向 OpenCompass 贡献代码！
--- a/.github/ISSUE_TEMPLATE/config.yml
+++ b/.github/ISSUE_TEMPLATE/config.yml
+blank_issues_enabled: false
+contact_links:
+  - name: 📚 OpenCompass Documentation (官方文档)
+    url: https://opencompass.readthedocs.io/en/latest/
+    about: Check if your question is answered in docs
+  - name: 💬 General questions (寻求帮助)
+    url: https://github.com/open-compass/opencompass/discussions
+    about: Ask general usage questions and discuss with other OpenCompass community members
+  - name: 🌐 Explore OpenCompass (官网)
+    url: https://opencompass.org.cn/
+    about: Get to know more about OpenCompass
--- a/.github/pull_request_template.md
+++ b/.github/pull_request_template.md
+Thanks for your contribution; we appreciate it a lot. Following the instructions below will keep your pull request healthy and help it get feedback more easily. If you do not understand some items, don't worry: just open the pull request and seek help from the maintainers.
+## Motivation
+Please describe the motivation of this PR and the goal you want to achieve through this PR.
+## Modification
+Please briefly describe what modification is made in this PR.
+## BC-breaking (Optional)
+Does the modification introduce changes that break the backward compatibility of the downstream repositories?
+If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.
+## Use cases (Optional)
+If this PR introduces a new feature, it is better to list some use cases here and update the documentation.
+## Checklist
+**Before PR**:
+- [ ] Pre-commit or other linting tools are used to fix the potential lint issues.
+- [ ] Bug fixes are fully covered by unit tests; the case that causes the bug should be added to the unit tests.
+- [ ] The modification is covered by complete unit tests. If not, please add more unit tests to ensure correctness.
+- [ ] The documentation has been modified accordingly, like docstring or example tutorials.
+**After PR**:
+- [ ] If the modification has potential influence on downstream or other related projects, this PR should be tested with those projects.
+- [ ] CLA has been signed and all committers have signed the CLA in this PR.
--- a/.github/workflows/daily-run-test.yml
+++ b/.github/workflows/daily-run-test.yml
--- a/.github/workflows/link-check.yml
+++ b/.github/workflows/link-check.yml
+name: 'Link check'
+on:
+  schedule:
+    # check links at 01:30 a.m. every day
+    - cron: '30 1 * * *'
+  workflow_dispatch: # allow manual trigger
+jobs:
+  link-check:
+    runs-on: ubuntu-latest
+    steps:
+      # - uses: actions/checkout@v3
+      - name: Install linkchecker
+        run: |
+          pip install linkchecker
+      - name: Run linkchecker
+        run: |
+          linkchecker https://opencompass.readthedocs.io/ --no-robots -t 30 --no-warnings \
+            --ignore-url "https://opencompass.readthedocs.io/.*/static/images/opencompass_logo.svg" \
+            --ignore-url "https://opencompass.readthedocs.io/.*/_static/images/icon-menu-dots.svg" \
+            --ignore-url "https://opencompass.readthedocs.io/policy" \
+            --ignore-url "https://opencompass.readthedocs.io/(en|zh_CN)/[0-9a-f]{40}/.*"
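Reviewer note on the `--ignore-url` patterns in `link-check.yml`: the last pattern appears intended to skip Read the Docs builds addressed by a full 40-character commit hash, while named versions such as `latest` are still checked. A minimal Python sketch of how the four patterns classify URLs, assuming linkchecker treats each `--ignore-url` value as a Python regular expression searched against the URL. The helper name `is_ignored` and the sample URLs are hypothetical.

```python
import re

# Patterns copied from the `linkchecker` invocation in link-check.yml.
IGNORE_PATTERNS = [
    r"https://opencompass.readthedocs.io/.*/static/images/opencompass_logo.svg",
    r"https://opencompass.readthedocs.io/.*/_static/images/icon-menu-dots.svg",
    r"https://opencompass.readthedocs.io/policy",
    r"https://opencompass.readthedocs.io/(en|zh_CN)/[0-9a-f]{40}/.*",
]


def is_ignored(url: str) -> bool:
    """Return True if any ignore pattern is found in the URL.

    Assumption: search semantics, as with Python's re.search.
    """
    return any(re.search(pattern, url) for pattern in IGNORE_PATTERNS)


# Hypothetical sample URLs, for illustration only.
hash_build = "https://opencompass.readthedocs.io/en/" + "a" * 40 + "/index.html"
latest_build = "https://opencompass.readthedocs.io/en/latest/index.html"

for url in (hash_build, latest_build):
    print(is_ignored(url), url)
```

Because every pattern starts with the full `https://` prefix, the result is the same whether the tool anchors at the start of the URL or searches anywhere in it.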
--- a/.github/workflows/lint.yml
+++ b/.github/workflows/lint.yml
+name: lint
+on: [push, pull_request]
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: true
+jobs:
+  lint:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v3
+      - name: Set up Python 3.10
+        uses: actions/setup-python@v4
+        with:
+          python-version: '3.10'
+      - name: Install pre-commit hook
+        run: |
+          pip install pre-commit==3.8.0 mmengine==0.10.5
+          pre-commit install
+      - name: Linting
+        run: pre-commit run --all-files
--- a/.github/workflows/pr-run-test.yml
+++ b/.github/workflows/pr-run-test.yml
+name: pr_run_test
+on:
+  pull_request:
+    paths-ignore:
+      - 'README.md'
+      - 'README_zh-CN.md'
+      - 'docs/**'
+      - 'configs/**'
+      - 'tools/**'
+  workflow_dispatch:
+  schedule:
+    - cron:  '56 22 * * *'
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: true
+env:
+  CONDA_ENV: pr_test
+  HF_DATASETS_OFFLINE: 1
+  HF_EVALUATE_OFFLINE: 1
+  TRANSFORMERS_OFFLINE: 1
+  VLLM_USE_MODELSCOPE: false
+  LMDEPLOY_USE_MODELSCOPE: false
+  HF_HUB_OFFLINE: 1
+  CONDA_PATH: /mnt/shared-storage-user/opencompass-shared/qa-llm-cicd/miniconda3
+  REPORT_ROOT: /mnt/shared-storage-user/opencompass-shared/qa-llm-cicd/eval_report/prtest
+  COMPASS_DATA_CACHE: /mnt/shared-storage-user/auto-eval-pipeline/opencompass/llmeval/compass_data_cache
+  HF_DATASETS_CACHE: /mnt/shared-storage-user/auto-eval-pipeline/opencompass/llmeval/hf_cache
+  HF_HUB_CACHE: /mnt/shared-storage-user/large-model-center-share-weights/hf_hub
+  KUBEBRAIN_CLUSTER_ENTRY: https://h.pjlab.org.cn
+  KUBEBRAIN_NAMESPACE: ailab-opencompass
+jobs:
+  pr_run_test:
+    runs-on: yidian_cu12
+    timeout-minutes: 45
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v2
+      - name: Prepare - Install opencompass
+        run: |
+          . ${{env.CONDA_PATH}}/bin/activate
+          conda activate ${{env.CONDA_ENV}}
+          python3 -m pip uninstall opencompass -y
+          python3 -m pip install .[full] -i https://pkg.pjlab.org.cn/repository/pypi-proxy/simple/ --trusted-host pkg.pjlab.org.cn --no-cache-dir
+          conda info --envs
+      - name: conda env
+        run: |
+          . ${{env.CONDA_PATH}}/bin/activate
+          conda activate ${{env.CONDA_ENV}}
+          conda info --envs
+          pip list
+          lmdeploy check_env
+      - name: Run test
+        run: |
+          . ${{env.CONDA_PATH}}/bin/activate
+          conda activate ${{env.CONDA_ENV}}
+          conda info --envs
+          rjob submit --name=pr-test-${{ github.run_id }}-${{ github.run_attempt }} --charged-group=opencompass_gpu --private-machine=group --group=opencompass_gpu --gpu=2 --cpu=32 --memory=32568 --private-machine=group --image=registry.h.pjlab.org.cn/ailab-puyu/xpuyu:torch-2.6.0-45d96d5f-0607 --env=COMPASS_DATA_CACHE=/mnt/shared-storage-user/auto-eval-pipeline/opencompass/llmeval/compass_data_cache --env=TIKTOKEN_CACHE_DIR=/mnt/shared-storage-user/auto-eval-pipeline/opencompass/llmeval/share_tiktoken --env=HF_HUB_CACHE=/mnt/shared-storage-user/large-model-center-share-weights/hf_hub --env=HF_ENDPOINT=https://hf-mirror.com --env=HF_DATASETS_CACHE=/mnt/shared-storage-user/auto-eval-pipeline/opencompass/llmeval/hf_cache --env=HF_HUB_CACHE=/mnt/shared-storage-user/large-model-center-share-weights/hf_hub --env=CUDA_MODULE_LOADING=EAGER --env=HF_DATASETS_OFFLINE=1 --env=TRANSFORMERS_OFFLINE=1 --env=HF_EVALUATE_OFFLINE=1 --env=HF_HUB_OFFLINE=1 --mount=gpfs://gpfs1/opencompass-shared:/mnt/shared-storage-user/opencompass-shared --mount=gpfs://gpfs1/auto-eval-pipeline:/mnt/shared-storage-user/auto-eval-pipeline --mount=gpfs://gpfs1/large-model-center-share-weights:/mnt/shared-storage-user/large-model-center-share-weights --host-network=True -- bash -exc '/mnt/shared-storage-user/opencompass-shared/qa-llm-cicd/pr_test.sh ${{env.REPORT_ROOT}}/${{ github.run_id }}'
+          for i in {1..300}; do
+            current_status=$(rjob get pr-test-${{ github.run_id }}-${{ github.run_attempt }} | grep -oP 'rjob [^:]+: \K[^ ]+')
+            if [[ $current_status == "Succeeded" || $current_status == "Failed" || $current_status == "Stopped" ]]; then
+                echo "Current status: $current_status, stop checking"
+                break
+            fi
+            sleep 6
+          done
+      - name:  Get result
+        run: |
+          for n in 1 2 3; do
+            score=$(sed -n '$p' ${{env.REPORT_ROOT}}/${{ github.run_id }}/regression_result${n}/*/summary/*.csv | awk -F ',' '{print $NF}')
+            if (( ${score%.*} >= 75 && ${score%.*} <= 80 )); then
+               echo "score is $score between 75 and 80"
+            else
+               echo "score is $score not between 75 and 80"
+               exit 1
+            fi
+          done
+      - name:  Uninstall opencompass
+        if: always()
+        run: |
+          . ${{env.CONDA_PATH}}/bin/activate
+          conda activate ${{env.CONDA_ENV}}
+          python3 -m pip uninstall opencompass -y
+          conda info --envs
+  notify_to_feishu:
+    if: ${{ always() && !cancelled() && contains(needs.*.result, 'failure') && (github.ref_name == 'develop' || github.ref_name == 'main') }}
+    needs: [pr_run_test]
+    timeout-minutes: 5
+    runs-on: self-hosted
+    steps:
+      - name: notify
+        run: |
+          curl -X POST -H "Content-Type: application/json" -d '{"msg_type":"post","content":{"post":{"zh_cn":{"title":"Opencompass- pr test failed","content":[[{"tag":"text","text":"branch: ${{github.ref_name}}, run action: ${{github.workflow}} failed. "},{"tag":"a","text":"Please click here for details ","href":"https://github.com/'${{ github.repository }}'/actions/runs/'${GITHUB_RUN_ID}'"},{"tag":"at","user_id":"'${{ secrets.USER_ID }}'"}]]}}}}'  ${{ secrets.WEBHOOK_URL }}
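Reviewer note on the `Get result` step in `pr-run-test.yml`: the step takes the last comma-separated field of the last line of each summary CSV and compares only its integer part with the 75 to 80 window, because `${score%.*}` strips everything from the last `.` onward. A minimal Python sketch of that logic follows. The function names and the sample CSV are hypothetical; the real column layout of the summary file is not shown in the workflow.

```python
def last_field_of_last_line(csv_text: str) -> str:
    """Mimic `sed -n '$p' file.csv | awk -F ',' '{print $NF}'`."""
    last_line = csv_text.splitlines()[-1]
    return last_line.split(",")[-1]


def score_in_range(score: str, low: int = 75, high: int = 80) -> bool:
    """Mimic `(( ${score%.*} >= low && ${score%.*} <= high ))`."""
    # Drop everything from the last '.' onward, as `${score%.*}` does.
    integer_part = int(score.rsplit(".", 1)[0])
    return low <= integer_part <= high


# Hypothetical summary CSV, for illustration only.
sample = "dataset,version,metric,mode,model\ndemo,abc123,accuracy,gen,80.90\n"
score = last_field_of_last_line(sample)
print(score, score_in_range(score))
```

One consequence of the integer truncation is that a score such as 80.90 passes the upper bound, while 74.99 fails the lower bound.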
--- a/.github/workflows/pr-stage-check.yml
+++ b/.github/workflows/pr-stage-check.yml
+name: pr_stage_test
+on:
+  pull_request:
+    paths-ignore:
+      - 'README.md'
+      - 'README_zh-CN.md'
+      - 'docs/**'
+      - 'configs/**'
+      - 'tools/**'
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: true
+jobs:
+  build:
+    runs-on: ubuntu-22.04
+    strategy:
+      matrix:
+        python-version: ['3.10']
+        include:
+          - torch: 2.5.1
+    steps:
+      - uses: actions/checkout@v3
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v4
+        with:
+          python-version: ${{ matrix.python-version }}
+      - name: Upgrade pip
+        run: python -m pip install --upgrade pip
+      - name: Install PyTorch
+        run: pip install torch==${{matrix.torch}} -f https://download.pytorch.org/whl/cpu/torch_stable.html
+      - name: Install system dependencies
+        run: |
+          sudo sed -i '$ a deb http://th.archive.ubuntu.com/ubuntu jammy main' /etc/apt/sources.list
+          sudo apt-get update && sudo apt-get install -y libc6 libffi-dev libncursesw6 wget unzip
+      - name: Upgrade pip
+        run: python -m pip install pip --upgrade
+      - name: Install opencompass dependencies
+        run: |
+          python -m pip install -r requirements.txt
+      - name: Build and install
+        run: python -m pip install -e .
+      - name: Prepare dataset
+        run: |
+          wget https://github.com/open-compass/opencompass/releases/download/0.2.2.rc1/OpenCompassData-core-20240207.zip
+          unzip OpenCompassData-core-20240207.zip
+      - name: Dry run test
+        run: |
+          python run.py --models hf_opt_125m --datasets siqa_gen winograd_ppl --dry-run
+  build_cu117:
+    runs-on: ubuntu-22.04
+    container:
+      image: nvidia/cuda:11.7.1-cudnn8-runtime-ubuntu22.04
+    strategy:
+      matrix:
+        python-version: ['3.10']
+    steps:
+      - uses: actions/checkout@v3
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v4
+        with:
+          python-version: ${{ matrix.python-version }}
+      - name: Fetch GPG keys
+        run: |
+          apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/3bf863cc.pub
+          apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/7fa2af80.pub
+      - name: Install Python-dev
+        run: apt-get update && apt-get install -y python${{matrix.python-version}}-dev
+        if: ${{matrix.python-version != 3.10}}
+      - name: Install system dependencies
+        run: |
+          apt-get update
+          apt-get install -y ffmpeg libsm6 libxext6 git ninja-build libglib2.0-0 libxrender-dev libc6 libc6-dev
+          sed -i '$ a deb http://th.archive.ubuntu.com/ubuntu jammy main' /etc/apt/sources.list
+          apt-get update && apt-get install -y libc6 libffi-dev libncursesw6 wget unzip
+      - name: Upgrade pip
+        run: python -m pip install pip --upgrade
+      - name: Install opencompass dependencies
+        run: |
+          python -m pip install -r requirements.txt
+      - name: Build and install
+        run: python -m pip install -e .
+      - name: Prepare dataset
+        run: |
+          wget https://github.com/open-compass/opencompass/releases/download/0.2.2.rc1/OpenCompassData-core-20240207.zip
+          unzip OpenCompassData-core-20240207.zip
+      - name: Dry run test
+        run: |
+          python run.py --models hf_opt_125m --datasets siqa_gen winograd_ppl --dry-run
+  build_windows:
+    runs-on: windows-2022
+    strategy:
+      matrix:
+        python-version: ['3.10']
+        platform: [cpu]
+    steps:
+      - uses: actions/checkout@v3
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v4
+        with:
+          python-version: ${{ matrix.python-version }}
+      - name: Upgrade pip
+        run: python -m pip install pip --upgrade
+      - name: Install PyTorch
+        run: pip install torch==2.5.1 -f https://download.pytorch.org/whl/cpu/torch_stable.html
+      - name: Install opencompass dependencies
+        run: |
+          pip install -r requirements.txt
+      - name: Build and install
+        run: pip install -e .
+      - name: Prepare dataset
+        run: |
+          Invoke-WebRequest -Uri https://github.com/open-compass/opencompass/releases/download/0.2.2.rc1/OpenCompassData-core-20240207.zip -OutFile OpenCompassData-core-20240207.zip
+          unzip OpenCompassData-core-20240207.zip
+      - name: Dry run test
+        run: |
+          python run.py --models hf_opt_125m --datasets siqa_gen winograd_ppl --dry-run
--- a/.github/workflows/publish-to-pypi.yml
+++ b/.github/workflows/publish-to-pypi.yml
+name: deploy
+on:
+  push:
+  workflow_dispatch:
+    inputs:
+      confirm_publish:
+        description: 'Type YES to confirm publishing to PyPI'
+        required: true
+        type: string
+jobs:
+  build-n-publish:
+    runs-on: ubuntu-latest
+    if: |
+      github.event_name == 'push' && startsWith(github.event.ref, 'refs/tags') ||
+      (github.event_name == 'workflow_dispatch' && inputs.confirm_publish == 'YES')
+    steps:
+      - uses: actions/checkout@v2
+      - name: Set up Python 3.10
+        uses: actions/setup-python@v4
+        with:
+          python-version: '3.10'
+      - name: Build opencompass
+        run: |
+          pip install wheel
+          python setup.py sdist bdist_wheel
+      - name: Publish distribution to PyPI
+        run: |
+          pip install twine
+          twine upload dist/* -u __token__ -p ${{ secrets.pypi_password }}
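Reviewer note on the job-level `if:` in `publish-to-pypi.yml`: since `&&` binds tighter than `||`, the job runs either for a push whose ref starts with `refs/tags`, or for a manual dispatch whose `confirm_publish` input equals `YES`. A minimal Python sketch of that predicate follows, with the hypothetical function name `should_publish`. It lowercases both sides of each comparison on the assumption, taken from the GitHub Actions expression documentation, that string comparison and `startsWith` ignore case.

```python
def should_publish(event_name: str, ref: str = "", confirm_publish: str = "") -> bool:
    """Model of the workflow's `if:` expression (a sketch, not the Actions runtime)."""
    # Push of a tag: github.event_name == 'push' && startsWith(github.event.ref, 'refs/tags')
    tag_push = (
        event_name.lower() == "push"
        and ref.lower().startswith("refs/tags")
    )
    # Manual run: github.event_name == 'workflow_dispatch' && inputs.confirm_publish == 'YES'
    confirmed_manual = (
        event_name.lower() == "workflow_dispatch"
        and confirm_publish.lower() == "yes"
    )
    return tag_push or confirmed_manual


print(should_publish("push", ref="refs/tags/v0.1.0"))
print(should_publish("push", ref="refs/heads/main"))
print(should_publish("workflow_dispatch", confirm_publish="YES"))
```

Under the case-insensitivity assumption, typing `yes` in the dispatch form would also confirm publishing.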
--- a/.gitignore
+++ b/.gitignore
+.DS_Store
+output_*/
+outputs/
+scripts/
+icl_inference_output/
+.vscode/
+tmp/
+configs/eval_subjective_alignbench_test.py
+configs/openai_key.py
+configs/secrets.py
+configs/datasets/log.json
+configs/eval_debug*.py
+configs/viz_*.py
+configs/**/*_bkup.py
+opencompass/**/*_bkup.py
+data
+work_dirs
+outputs
+models/*
+configs/internal/
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*$py.class
+*.ipynb
+# C extensions
+*.so
+# Distribution / packaging
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+# PyInstaller
+#  Usually these files are written by a python script from a template
+#  before PyInstaller builds the exe, so as to inject date/other infos into it.
+*.manifest
+*.spec
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+.hypothesis/
+.pytest_cache/
+# Translations
+*.mo
+*.pot
+# Django stuff:
+*.log
+local_settings.py
+db.sqlite3
+# Flask stuff:
+instance/
+.webassets-cache
+# Scrapy stuff:
+.scrapy
+.idea
+# Auto generate documentation
+docs/en/_build/
+docs/zh_cn/_build/
+# .zip
+*.zip
+# sft config ignore list
+configs/sft_cfg/*B_*
+configs/sft_cfg/1B/*
+configs/sft_cfg/7B/*
+configs/sft_cfg/20B/*
+configs/sft_cfg/60B/*
+configs/sft_cfg/100B/*
+configs/cky/
+configs/_internal_legacy*
+# in case llama is cloned into the opencompass directory
+llama/
+# in case ilagent is cloned into the opencompass directory
+ilagent/
+# ignore the config file for criticbench evaluation
+configs/sft_cfg/criticbench_eval/*
+# path of turbomind's model after running `lmdeploy.serve.turbomind.deploy`
+turbomind/
+# cibench output
+*.db
+*.pth
+*.pt
+*.onnx
+*.gz
+*.gz.*
+*.png
+*.txt
+*.jpg
+*.json
+*.jsonl
+*.csv
+*.npy
+*.c
+# aliyun
+core.*
--- a/.owners.yml
+++ b/.owners.yml
+assign:
+  issues: enabled
+  pull_requests: disabled
+  strategy:
+    # random
+    daily-shift-based
+  schedule:
+    '*/1 * * * *'
+  assignees:
+    - bittersweet1999
+    - liushz
+    - MaiziXiao
+    - acylam
+    - tonysy
--- a/.pre-commit-config-zh-cn.yaml
+++ b/.pre-commit-config-zh-cn.yaml
+exclude: |
+    (?x)^(
+      tests/data/|
+      tests/dataset/|
+      opencompass/models/internal/|
+      opencompass/utils/internal/|
+      opencompass/openicl/icl_evaluator/hf_metrics/|
+      opencompass/datasets/lawbench/utils|
+      opencompass/datasets/lawbench/evaluation_functions/|
+      opencompass/datasets/medbench/|
+      opencompass/datasets/teval/|
+      opencompass/datasets/NPHardEval/|
+      opencompass/datasets/TheoremQA|
+      opencompass/datasets/subjective/mtbench101.py|
+      docs/zh_cn/advanced_guides/compassbench_intro.md |
+      docs/zh_cn/advanced_guides/compassbench_v2_0.md |
+      opencompass/utils/datasets.py |
+      opencompass/utils/datasets_info.py
+    )
+repos:
+  - repo: https://gitee.com/openmmlab/mirrors-flake8
+    rev: 5.0.4
+    hooks:
+      - id: flake8
+        exclude: |
+            (?x)^(
+                opencompass/configs/|
+                examples/
+            )
+  - repo: https://gitee.com/openmmlab/mirrors-isort
+    rev: 5.11.5
+    hooks:
+      - id: isort
+        exclude: |
+            (?x)^(
+                opencompass/configs/|
+                examples/
+            )
+  - repo: https://gitee.com/openmmlab/mirrors-yapf
+    rev: v0.32.0
+    hooks:
+      - id: yapf
+        exclude: |
+            (?x)^(
+                opencompass/configs/|
+                examples/
+            )
+  - repo: https://gitee.com/openmmlab/mirrors-codespell
+    rev: v2.2.1
+    hooks:
+      - id: codespell
+        exclude: |
+            (?x)^(
+                .*\.jsonl|
+                .*\.md.template|
+                opencompass/configs/ |
+                examples/
+            )
+  - repo: https://gitee.com/openmmlab/mirrors-pre-commit-hooks
+    rev: v4.3.0
+    hooks:
+      - id: trailing-whitespace
+        exclude: |
+            (?x)^(
+              dicts/|
+              projects/.*?/dicts/
+            )
+      - id: check-yaml
+      - id: end-of-file-fixer
+        exclude: |
+            (?x)^(
+              dicts/|
+              projects/.*?/dicts/
+            )
+      - id: requirements-txt-fixer
+      - id: double-quote-string-fixer
+      - id: check-merge-conflict
+      - id: fix-encoding-pragma
+        args: ["--remove"]
+      - id: mixed-line-ending
+        args: ["--fix=lf"]
+  - repo: https://gitee.com/openmmlab/mirrors-mdformat
+    rev: 0.7.9
+    hooks:
+      - id: mdformat
+        args: ["--number", "--table-width", "200"]
+        additional_dependencies:
+          - mdformat-openmmlab
+          - mdformat_frontmatter
+          - linkify-it-py
+        exclude: configs/
+  - repo: https://gitee.com/openmmlab/mirrors-docformatter
+    rev: v1.3.1
+    hooks:
+      - id: docformatter
+        args: ["--in-place", "--wrap-descriptions", "79"]
+  - repo: local
+    hooks:
+    -   id: update-dataset-suffix
+        name: dataset suffix updater
+        entry: ./tools/update_dataset_suffix.py
+        language: script
+        pass_filenames: true
+        require_serial: true
+        files: ^opencompass/configs/datasets
+  - repo: local
+    hooks:
+    -   id: update-dataset-suffix-pacakge
+        name: dataset suffix updater(package)
+        entry: ./tools/update_dataset_suffix.py
+        language: script
+        pass_filenames: false
+        # require_serial: true
+        # files: ^opencompass/configs/datasets
+        args:
+          - --root_folder
+          - opencompass/configs/datasets
+  # - repo: https://github.com/open-mmlab/pre-commit-hooks
+  #   rev: v0.2.0  # Use the ref you want to point at
+  #   hooks:
+  #     - id: check-algo-readme
+      # - id: check-copyright
+      #   args: ["mmocr", "tests", "tools"]  # these directories will be checked
\ No newline at end of file
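Reviewer note on the `exclude` patterns in the pre-commit configs: they are verbose-mode (`(?x)`) regular expressions, so the line breaks and the spaces before some `|` separators are ignored. A minimal Python sketch of how such a pattern classifies paths follows, assuming pre-commit searches the pattern against each repository-relative path with Python's `re` module. The helper name `is_excluded`, the shortened pattern, and the sample paths are hypothetical.

```python
import re

# Shortened copy of the top-level `exclude` pattern. (?x) enables verbose mode,
# so newlines and spaces inside the pattern are not part of the match.
EXCLUDE = r"""(?x)^(
  tests/data/|
  opencompass/datasets/TheoremQA|
  opencompass/utils/datasets.py |
  opencompass/utils/datasets_info.py
)"""

# An empty alternative (a `|` directly before the closing parenthesis)
# matches the empty string, and therefore every path.
WITH_EMPTY_ALTERNATIVE = r"""(?x)^(
  dicts/|
)"""


def is_excluded(path: str, pattern: str = EXCLUDE) -> bool:
    """Return True if the pattern is found in the path (assumed search semantics)."""
    return re.search(pattern, path) is not None


for path in ("tests/data/sample.json", "opencompass/models/base.py"):
    print(is_excluded(path), path)
print(is_excluded("README.md", WITH_EMPTY_ALTERNATIVE))
```

The alternatives are anchored only at the start, so each one acts as a path prefix; a directory entry such as `tests/data/` excludes everything beneath it.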
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
+exclude: |
+    (?x)^(
+      tests/data/|
+      tests/dataset/|
+      opencompass/models/internal/|
+      opencompass/utils/internal/|
+      opencompass/openicl/icl_evaluator/hf_metrics/|
+      opencompass/datasets/lawbench/utils|
+      opencompass/datasets/lawbench/evaluation_functions/|
+      opencompass/datasets/medbench/|
+      opencompass/datasets/matbench/|
+      opencompass/datasets/teval/|
+      opencompass/datasets/NPHardEval/|
+      opencompass/datasets/TheoremQA|
+      opencompass/datasets/subjective/mtbench101.py|
+      docs/zh_cn/advanced_guides/compassbench_intro.md |
+      docs/zh_cn/advanced_guides/compassbench_v2_0.md |
+      opencompass/utils/datasets.py |
+      opencompass/utils/datasets_info.py
+    )
+repos:
+  - repo: https://github.com/PyCQA/flake8
+    rev: 5.0.4
+    hooks:
+      - id: flake8
+        exclude: |
+            (?x)^(
+                opencompass/configs/|
+                examples/
+            )
+  - repo: https://github.com/PyCQA/isort
+    rev: 5.11.5
+    hooks:
+      - id: isort
+        exclude: |
+            (?x)^(
+                opencompass/configs/|
+                examples/
+            )
+  - repo: https://github.com/pre-commit/mirrors-yapf
+    rev: v0.32.0
+    hooks:
+      - id: yapf
+        exclude: |
+            (?x)^(
+                opencompass/configs/|
+                examples/
+            )
+  - repo: https://github.com/codespell-project/codespell
+    rev: v2.2.1
+    hooks:
+      - id: codespell
+        exclude: |
+            (?x)^(
+                .*\.jsonl|
+                .*\.md.template|
+                opencompass/configs/ |
+                examples/
+            )
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v5.0.0
+    hooks:
+      - id: trailing-whitespace
+        exclude: |
+            (?x)^(
+              dicts/|
+              projects/.*?/dicts/
+            )
+      - id: check-yaml
+      - id: end-of-file-fixer
+        exclude: |
+            (?x)^(
+              dicts/|
+              projects/.*?/dicts/
+            )
+      - id: requirements-txt-fixer
+      - id: double-quote-string-fixer
+      - id: check-merge-conflict
+      - id: fix-encoding-pragma
+        args: ["--remove"]
+      - id: mixed-line-ending
+        args: ["--fix=lf"]
+  - repo: https://github.com/executablebooks/mdformat
+    rev: 0.7.9
+    hooks:
+      - id: mdformat
+        args: ["--number", "--table-width", "200"]
+        additional_dependencies:
+          - mdformat-openmmlab
+          - mdformat_frontmatter
+          - linkify-it-py
+        exclude: configs/
+  # - repo: https://github.com/myint/docformatter
+  #   rev: v1.3.1
+  #   hooks:
+  #     - id: docformatter
+  #       args: ["--in-place", "--wrap-descriptions", "79"]
+  - repo: local
+    hooks:
+    -   id: update-dataset-suffix
+        name: dataset suffix updater
+        entry: ./tools/update_dataset_suffix.py
+        language: script
+        pass_filenames: true
+        require_serial: true
+        files: ^opencompass/configs/datasets
+  - repo: local
+    hooks:
+    -   id: update-dataset-suffix-package
+        name: dataset suffix updater (package)
+        entry: ./tools/update_dataset_suffix.py
+        language: script
+        pass_filenames: false
+        # require_serial: true
+        # files: ^opencompass/configs/datasets
+        args:
+          - --root_folder
+          - opencompass/configs/datasets
+  # - repo: https://github.com/open-mmlab/pre-commit-hooks
+  #   rev: v0.2.0  # Use the ref you want to point at
+  #   hooks:
+  #     - id: check-algo-readme
+      # - id: check-copyright
+      #   args: ["mmocr", "tests", "tools"]  # these directories will be checked
\ No newline at end of file
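A note on the `exclude` values in the `.pre-commit-config.yaml` diff above: each one is a verbose-mode (`(?x)`) regular expression written as a multi-line YAML block. The sketch below is a rough illustration of how such a pattern behaves. It assumes pre-commit tests each file path with `re.search`. The helper name `is_excluded` and the sample paths are made up for this example. It shows that whitespace around `|` is ignored in verbose mode, and that an empty alternative (for instance one left behind by a stray trailing `|`) makes the pattern match every path.

```python
import re

# Verbose-mode pattern in the style of the config's exclude blocks.
# Whitespace (including the space before "|") is ignored under (?x).
EXCLUDE = r"""(?x)^(
    tests/data/ |
    opencompass/configs/|
    examples/
)
"""

# Same shape, but a trailing "|" leaves an empty last alternative.
EXCLUDE_WITH_EMPTY_BRANCH = r"""(?x)^(
    dicts/|
    projects/.*?/dicts/|
)
"""


def is_excluded(pattern: str, path: str) -> bool:
    """Return True if the path matches the exclude pattern (re.search semantics assumed)."""
    return re.search(pattern, path) is not None


if __name__ == "__main__":
    # Hypothetical paths, used only to compare the two patterns.
    for path in ("tests/data/a.json", "opencompass/models/base.py"):
        print(path,
              is_excluded(EXCLUDE, path),
              is_excluded(EXCLUDE_WITH_EMPTY_BRANCH, path))
```

Because the empty alternative matches the empty string at the start of any path, a hook carrying such a pattern would skip every file, so it is worth checking exclude blocks for a dangling `|`.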
--- a/LICENSE
+++ b/LICENSE
+Copyright 2020 OpenCompass Authors. All rights reserved.
+                                 Apache License
+                           Version 2.0, January 2004
+                        http://www.apache.org/licenses/
+   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+   1. Definitions.
+      "License" shall mean the terms and conditions for use, reproduction,
+      and distribution as defined by Sections 1 through 9 of this document.
+      "Licensor" shall mean the copyright owner or entity authorized by
+      the copyright owner that is granting the License.
+      "Legal Entity" shall mean the union of the acting entity and all
+      other entities that control, are controlled by, or are under common
+      control with that entity. For the purposes of this definition,
+      "control" means (i) the power, direct or indirect, to cause the
+      direction or management of such entity, whether by contract or
+      otherwise, or (ii) ownership of fifty percent (50%) or more of the
+      outstanding shares, or (iii) beneficial ownership of such entity.
+      "You" (or "Your") shall mean an individual or Legal Entity
+      exercising permissions granted by this License.
+      "Source" form shall mean the preferred form for making modifications,
+      including but not limited to software source code, documentation
+      source, and configuration files.
+      "Object" form shall mean any form resulting from mechanical
+      transformation or translation of a Source form, including but
+      not limited to compiled object code, generated documentation,
+      and conversions to other media types.
+      "Work" shall mean the work of authorship, whether in Source or
+      Object form, made available under the License, as indicated by a
+      copyright notice that is included in or attached to the work
+      (an example is provided in the Appendix below).
+      "Derivative Works" shall mean any work, whether in Source or Object
+      form, that is based on (or derived from) the Work and for which the
+      editorial revisions, annotations, elaborations, or other modifications
+      represent, as a whole, an original work of authorship. For the purposes
+      of this License, Derivative Works shall not include works that remain
+      separable from, or merely link (or bind by name) to the interfaces of,
+      the Work and Derivative Works thereof.
+      "Contribution" shall mean any work of authorship, including
+      the original version of the Work and any modifications or additions
+      to that Work or Derivative Works thereof, that is intentionally
+      submitted to Licensor for inclusion in the Work by the copyright owner
+      or by an individual or Legal Entity authorized to submit on behalf of
+      the copyright owner. For the purposes of this definition, "submitted"
+      means any form of electronic, verbal, or written communication sent
+      to the Licensor or its representatives, including but not limited to
+      communication on electronic mailing lists, source code control systems,
+      and issue tracking systems that are managed by, or on behalf of, the
+      Licensor for the purpose of discussing and improving the Work, but
+      excluding communication that is conspicuously marked or otherwise
+      designated in writing by the copyright owner as "Not a Contribution."
+      "Contributor" shall mean Licensor and any individual or Legal Entity
+      on behalf of whom a Contribution has been received by Licensor and
+      subsequently incorporated within the Work.
+   2. Grant of Copyright License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      copyright license to reproduce, prepare Derivative Works of,
+      publicly display, publicly perform, sublicense, and distribute the
+      Work and such Derivative Works in Source or Object form.
+   3. Grant of Patent License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      (except as stated in this section) patent license to make, have made,
+      use, offer to sell, sell, import, and otherwise transfer the Work,
+      where such license applies only to those patent claims licensable
+      by such Contributor that are necessarily infringed by their
+      Contribution(s) alone or by combination of their Contribution(s)
+      with the Work to which such Contribution(s) was submitted. If You
+      institute patent litigation against any entity (including a
+      cross-claim or counterclaim in a lawsuit) alleging that the Work
+      or a Contribution incorporated within the Work constitutes direct
+      or contributory patent infringement, then any patent licenses
+      granted to You under this License for that Work shall terminate
+      as of the date such litigation is filed.
+   4. Redistribution. You may reproduce and distribute copies of the
+      Work or Derivative Works thereof in any medium, with or without
+      modifications, and in Source or Object form, provided that You
+      meet the following conditions:
+      (a) You must give any other recipients of the Work or
+          Derivative Works a copy of this License; and
+      (b) You must cause any modified files to carry prominent notices
+          stating that You changed the files; and
+      (c) You must retain, in the Source form of any Derivative Works
+          that You distribute, all copyright, patent, trademark, and
+          attribution notices from the Source form of the Work,
+          excluding those notices that do not pertain to any part of
+          the Derivative Works; and
+      (d) If the Work includes a "NOTICE" text file as part of its
+          distribution, then any Derivative Works that You distribute must
+          include a readable copy of the attribution notices contained
+          within such NOTICE file, excluding those notices that do not
+          pertain to any part of the Derivative Works, in at least one
+          of the following places: within a NOTICE text file distributed
+          as part of the Derivative Works; within the Source form or
+          documentation, if provided along with the Derivative Works; or,
+          within a display generated by the Derivative Works, if and
+          wherever such third-party notices normally appear. The contents
+          of the NOTICE file are for informational purposes only and
+          do not modify the License. You may add Your own attribution
+          notices within Derivative Works that You distribute, alongside
+          or as an addendum to the NOTICE text from the Work, provided
+          that such additional attribution notices cannot be construed
+          as modifying the License.
+      You may add Your own copyright statement to Your modifications and
+      may provide additional or different license terms and conditions
+      for use, reproduction, or distribution of Your modifications, or
+      for any such Derivative Works as a whole, provided Your use,
+      reproduction, and distribution of the Work otherwise complies with
+      the conditions stated in this License.
+   5. Submission of Contributions. Unless You explicitly state otherwise,
+      any Contribution intentionally submitted for inclusion in the Work
+      by You to the Licensor shall be under the terms and conditions of
+      this License, without any additional terms or conditions.
+      Notwithstanding the above, nothing herein shall supersede or modify
+      the terms of any separate license agreement you may have executed
+      with Licensor regarding such Contributions.
+   6. Trademarks. This License does not grant permission to use the trade
+      names, trademarks, service marks, or product names of the Licensor,
+      except as required for reasonable and customary use in describing the
+      origin of the Work and reproducing the content of the NOTICE file.
+   7. Disclaimer of Warranty. Unless required by applicable law or
+      agreed to in writing, Licensor provides the Work (and each
+      Contributor provides its Contributions) on an "AS IS" BASIS,
+      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+      implied, including, without limitation, any warranties or conditions
+      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+      PARTICULAR PURPOSE. You are solely responsible for determining the
+      appropriateness of using or redistributing the Work and assume any
+      risks associated with Your exercise of permissions under this License.
+   8. Limitation of Liability. In no event and under no legal theory,
+      whether in tort (including negligence), contract, or otherwise,
+      unless required by applicable law (such as deliberate and grossly
+      negligent acts) or agreed to in writing, shall any Contributor be
+      liable to You for damages, including any direct, indirect, special,
+      incidental, or consequential damages of any character arising as a
+      result of this License or out of the use or inability to use the
+      Work (including but not limited to damages for loss of goodwill,
+      work stoppage, computer failure or malfunction, or any and all
+      other commercial damages or losses), even if such Contributor
+      has been advised of the possibility of such damages.
+   9. Accepting Warranty or Additional Liability. While redistributing
+      the Work or Derivative Works thereof, You may choose to offer,
+      and charge a fee for, acceptance of support, warranty, indemnity,
+      or other liability obligations and/or rights consistent with this
+      License. However, in accepting such obligations, You may act only
+      on Your own behalf and on Your sole responsibility, not on behalf
+      of any other Contributor, and only if You agree to indemnify,
+      defend, and hold each Contributor harmless for any liability
+      incurred by, or claims asserted against, such Contributor by reason
+      of your accepting any such warranty or additional liability.
+   END OF TERMS AND CONDITIONS
+   APPENDIX: How to apply the Apache License to your work.
+      To apply the Apache License to your work, attach the following
+      boilerplate notice, with the fields enclosed by brackets "[]"
+      replaced with your own identifying information. (Don't include
+      the brackets!)  The text should be enclosed in the appropriate
+      comment syntax for the file format. We also recommend that a
+      file or class name and description of purpose be included on the
+      same "printed page" as the copyright notice for easier
+      identification within third-party archives.
+   Copyright 2020 OpenCompass Authors.
+   Licensed under the Apache License, Version 2.0 (the "License");
+   you may not use this file except in compliance with the License.
+   You may obtain a copy of the License at
+       http://www.apache.org/licenses/LICENSE-2.0
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
--- a/MANIFEST.in
+++ b/MANIFEST.in
+recursive-include opencompass/configs *.py *.yml *.json *.txt *.md
+recursive-include opencompass/openicl/icl_evaluator/hf_metrics *.py
+recursive-include opencompass/datasets *.py *.yml *.json *.txt *.md *.yaml
--- a/README.md
+++ b/README.md
+<div align="center">
+  <img src="docs/zh_cn/_static/image/logo.svg" width="500px"/>
+  <br />
+  <br />
+</div>
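+## 🛠️ Installation
+The steps below cover a quick installation and dataset preparation.
+### 💻 Environment Setup
+We recommend setting up the environment with `docker`.
+- #### Create the container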
+  ```bash
+  docker run --shm-size 500g --network=host --name=opencompass --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v /path/to/workspace/:/path/to/workspace/ -v /opt/hyhal:/opt/hyhal:ro -it image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.4.1-ubuntu22.04-dtk25.04.1-py3.10 bash
+  ```
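+- #### Install OpenCompass via pip
+  ```bash
+  # Supports the vast majority of datasets and models
+  pip install -U opencompass
+  ```
+- #### Install OpenCompass from source
+  To use the latest OpenCompass features, you can also build it from a local checkout of the source: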
+  ```bash
+  cd opencompass
+  pip install -e .
+  ```
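+### 📂 Data Preparation
+#### Offline download in advance
+OpenCompass supports evaluation with local datasets. Download and extract them with the following commands:
+```bash
+# Download the datasets into data/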
+wget https://github.com/open-compass/opencompass/releases/download/0.2.2.rc1/OpenCompassData-core-20240207.zip
+unzip OpenCompassData-core-20240207.zip
+```
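+#### Automatic download from OpenCompass
+OpenCompass supports automatically downloading datasets from its storage server. Run an evaluation with the extra `--dry-run` flag to download them.
+The currently supported datasets are listed [here](https://github.com/open-compass/opencompass/blob/main/opencompass/utils/datasets_info.py#L259). More datasets will be uploaded soon.
+#### (Optional) Automatic download with ModelScope
+You can also load datasets through [ModelScope](https://www.modelscope.cn):
+Environment setup: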
+```bash
+pip install modelscope
+export DATASET_SOURCE=ModelScope
+```
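+Once the environment is configured, you can submit evaluation tasks directly without downloading all the data. The currently supported datasets are: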
+```bash
+humaneval, triviaqa, commonsenseqa, tydiqa, strategyqa, cmmlu, lambada, piqa, ceval, math, LCSTS, Xsum, winogrande, openbookqa, AGIEval, gsm8k, nq, race, siqa, mbpp, mmlu, hellaswag, ARC, BBH, xstory_cloze, summedits, GAOKAO-BENCH, OCNLI, cmnli
+```
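+Some third-party features, such as Humaneval and Llama, may require additional steps to work properly. See the [installation guide](https://opencompass.readthedocs.io/zh_CN/latest/get_started/installation.html) for details.
+## 🏗️ Evaluation
+After installing OpenCompass and preparing the datasets as described above, you can run your first evaluation.
+- ### First evaluation
+  OpenCompass supports configuration through either the command-line interface (CLI) or a Python script. The CLI is recommended for simple evaluation setups, and the script approach for more complex ones.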
+  ```bash
+  # Command-line interface (CLI)
+  opencompass --models hf_internlm2_5_1_8b_chat --datasets demo_gsm8k_chat_gen
+  # Python script
+  opencompass examples/eval_chat_demo.py
+  ```
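+  You can find more script examples in the [examples](./examples) folder.
+- ### API evaluation
+  OpenCompass is designed not to distinguish between open-source models and API models. You can evaluate both types in the same way, or even within the same configuration.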
+  ```bash
+  export OPENAI_API_KEY="YOUR_OPENAI_API_KEY"
+  # Command-line interface (CLI)
+  opencompass --models gpt_4o_2024_05_13 --datasets demo_gsm8k_chat_gen
+  # Python script
+  opencompass examples/eval_api_demo.py
+  ```
+## 📖 References
+https://github.com/open-compass/opencompass
\ No newline at end of file
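The offline dataset step in the README diff above (`wget` followed by `unzip`) can also be scripted. The following is a minimal sketch using only the Python standard library. The URL is copied from the README. The helper names `download_archive` and `extract_archive` are hypothetical and are not part of OpenCompass.

```python
import urllib.request
import zipfile
from pathlib import Path
from typing import List

# URL taken from the README; everything else here is illustrative.
DATA_URL = ("https://github.com/open-compass/opencompass/releases/download/"
            "0.2.2.rc1/OpenCompassData-core-20240207.zip")


def download_archive(url: str, target: Path) -> Path:
    """Download url to target, skipping the download if target already exists."""
    if not target.exists():
        urllib.request.urlretrieve(url, str(target))
    return target


def extract_archive(archive: Path, dest: Path) -> List[str]:
    """Unzip archive into dest and return the names of the archive members."""
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
        return zf.namelist()


if __name__ == "__main__":
    # Mirrors the README commands: fetch the zip, then unpack it in place.
    archive = download_archive(DATA_URL, Path("OpenCompassData-core-20240207.zip"))
    extract_archive(archive, Path("."))
```

Skipping the download when the archive is already present keeps repeated runs cheap, which may matter on shared machines where the file has been fetched once.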