Commit c289ecc0 authored by xinghao

Initial commit

Pipeline #3004 canceled with stages
[codespell]
skip = *.ipynb
count =
quiet-level = 3
ignore-words-list = nd, ans, ques, rouge, softwares, wit
name: 🐞 Bug report
description: Create a report to help us improve
labels: ["bug"]
title: "[Bug] "
body:
- type: markdown
attributes:
value: |
For general questions or idea discussions, please post them to our [**Forum**](https://github.com/open-compass/opencompass/discussions).
If you have already identified the reason, we strongly appreciate you creating a new PR according to [the tutorial](https://opencompass.readthedocs.io/en/master/community/CONTRIBUTING.html)!
If you need our help, please fill in the following form to help us to identify the bug.
- type: checkboxes
attributes:
label: Prerequisite
description: Please check the following items before creating a new issue.
options:
- label: I have searched [Issues](https://github.com/open-compass/opencompass/issues/) and [Discussions](https://github.com/open-compass/opencompass/discussions) but cannot get the expected help.
required: true
- label: The bug has not been fixed in the [latest version](https://github.com/open-compass/opencompass).
required: true
- type: dropdown
id: task
attributes:
label: Type
description: The problem arises when
options:
- I'm evaluating with the officially supported tasks/models/datasets.
- I have modified the code (config is not considered code), or I'm working on my own tasks/models/datasets.
validations:
required: true
- type: textarea
id: environment
validations:
required: true
attributes:
label: Environment
description: |
Please run `python -c "import opencompass.utils;import pprint;pprint.pprint(dict(opencompass.utils.collect_env()))"` to collect necessary environment information and paste it here.
placeholder: |
```python
# The output of the above command
```
- type: textarea
attributes:
label: Reproduces the problem - code/configuration sample
description: |
Please provide a code or configuration sample that reproduces the problem you ran into. It can be a Colab link or just a code snippet.
placeholder: |
```python
# Sample code to reproduce the problem
```
validations:
required: true
- type: textarea
attributes:
label: Reproduces the problem - command or script
description: |
What command or script did you run?
placeholder: |
```shell
The command or script you run.
```
validations:
required: true
- type: textarea
attributes:
label: Reproduces the problem - error message
description: |
Please provide the error message or logs you got, with the full traceback.
Tip: You can attach images or log files by dragging them into the text area.
placeholder: |
```
The error message or logs you got, with the full traceback.
```
validations:
required: true
- type: textarea
id: other
attributes:
label: Other information
description: |
Tell us anything else you think we should know.
1. What's your expected result?
2. What dataset did you use?
3. What do you think might be the reason?
name: 🚀 Feature request
description: Suggest an idea for this project
labels: ["enhancement"]
title: "[Feature] "
body:
- type: markdown
attributes:
value: |
For general questions or idea discussions, please post them to our [**Forum**](https://github.com/open-compass/opencompass/discussions).
If you have already implemented the feature, we strongly appreciate you creating a new PR according to [the tutorial](https://opencompass.readthedocs.io/en/master/community/CONTRIBUTING.html)!
- type: textarea
id: describe
validations:
required: true
attributes:
label: Describe the feature
description: |
What kind of feature do you want OpenCompass to add? If there is an official code release or third-party implementation, please also provide the information here, which would be very helpful.
placeholder: |
A clear and concise description of the motivation of the feature.
Ex1. It is inconvenient when \[....\].
Ex2. There is a recent paper \[....\], which is very helpful for \[....\].
- type: checkboxes
id: pr
attributes:
label: Will you implement it?
options:
- label: I would like to implement this feature and create a PR!
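name: 🐞 Report a bug
description: Report unexpected behavior you encountered during use
labels: ["bug"]
title: "[Bug] "
body:
- type: markdown
attributes:
value: |
We recommend using the English template "Bug report" so that your issue can help more people.
For general questions or ideas, please discuss them in our [**Forum**](https://github.com/open-compass/opencompass/discussions).
If you already have a solution, you are very welcome to create a new PR that fixes the problem directly. The PR process is described in [the documentation](https://opencompass.readthedocs.io/zh_CN/master/community/CONTRIBUTING.html).
If you need our help, please fill in the following form to help us locate the bug.
- type: checkboxes
attributes:
label: Prerequisite
description: Please check the following items before creating a new issue.
options:
- label: I have searched [Issues](https://github.com/open-compass/opencompass/issues/) and [Discussions](https://github.com/open-compass/opencompass/discussions) but did not get the expected help.
required: true
- label: The bug has not been fixed in the [latest version](https://github.com/open-compass/opencompass).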
required: true
- type: dropdown
id: task
attributes:
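label: Type
description: The problem arises when
options:
- I'm evaluating with the officially supported tasks/models/datasets.
- I have modified the code (config is not considered code), or I'm working on my own tasks/models/datasets.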
validations:
required: true
- type: textarea
id: environment
validations:
required: true
attributes:
label: Environment
description: |
Please run `python -c "import opencompass.utils;import pprint;pprint.pprint(dict(opencompass.utils.collect_env()))"` to collect the necessary environment information and paste it here.
placeholder: |
```python
# The output of the above command
```
- type: textarea
attributes:
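label: Reproduce the problem - code/configuration sample
description: |
Please provide a code or configuration sample that reproduces the problem you ran into. It can be a Colab link or just a code snippet.
placeholder: |
```python
# Sample code to reproduce the problem
```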
validations:
required: true
- type: textarea
attributes:
label: Reproduce the problem - command or script
description: |
What command or script did you run?
placeholder: |
```shell
The command or script you ran.
```
validations:
required: true
- type: textarea
attributes:
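label: Reproduce the problem - error message
description: |
Please provide the error message or logs you got, with the full traceback.
Tip: You can attach images or log files by dragging them into the text area.
placeholder: |
```
The error message or logs you got, with the full traceback.
```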
validations:
required: true
- type: textarea
id: other
attributes:
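label: Other information
description: |
Tell us any other valuable information.
1. Did you make any changes to the code or configuration files?
2. What do you think might be the reason?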
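name: 🚀 Feature suggestion
description: Suggest a new feature
labels: ["enhancement"]
title: "[Feature] "
body:
- type: markdown
attributes:
value: |
We recommend using the English template "Feature request" so that your issue can help more people.
For general questions or ideas, please discuss them in our [**Forum**](https://github.com/open-compass/opencompass/discussions).
If you have already implemented the feature, you are very welcome to create a new PR directly. The PR process is described in [the documentation](https://opencompass.readthedocs.io/zh_CN/master/community/CONTRIBUTING.html).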
- type: textarea
id: describe
validations:
required: true
attributes:
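label: Describe the feature
description: |
What feature would you like OpenCompass to add? If there is a related paper, official implementation, or third-party implementation, please also post the links, which would be very helpful.
placeholder: |
Briefly describe the feature and why it is needed.
Ex. 1: It is currently inconvenient to do xxx.
Ex. 2: A recent paper proposed a very helpful xx.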
- type: checkboxes
id: pr
attributes:
label: Would you like to implement this feature yourself?
options:
- label: I would like to implement this feature myself and contribute code to OpenCompass!
blank_issues_enabled: false
contact_links:
- name: 📚 OpenCompass Documentation (官方文档)
url: https://opencompass.readthedocs.io/en/latest/
about: Check if your question is answered in docs
- name: 💬 General questions (寻求帮助)
url: https://github.com/open-compass/opencompass/discussions
about: Ask general usage questions and discuss with other OpenCompass community members
- name: 🌐 Explore OpenCompass (官网)
url: https://opencompass.org.cn/
about: Get to know more about OpenCompass
Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help it get feedback more easily. If you do not understand some items, don't worry: just make the pull request and seek help from the maintainers.
## Motivation
Please describe the motivation of this PR and the goal you want to achieve through this PR.
## Modification
Please briefly describe what modification is made in this PR.
## BC-breaking (Optional)
Does the modification introduce changes that break the backward compatibility of the downstream repositories?
If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.
## Use cases (Optional)
If this PR introduces a new feature, it is better to list some use cases here and update the documentation.
## Checklist
**Before PR**:
- [ ] Pre-commit or other linting tools are used to fix the potential lint issues.
- [ ] Bug fixes are fully covered by unit tests; the case that causes the bug should be added to the unit tests.
- [ ] The modification is covered by complete unit tests. If not, please add more unit tests to ensure correctness.
- [ ] The documentation has been modified accordingly, such as docstrings or example tutorials.
**After PR**:
- [ ] If the modification has potential influence on downstream or other related projects, this PR should be tested with those projects.
- [ ] CLA has been signed and all committers have signed the CLA in this PR.
name: 'Link check'
on:
schedule:
# check links at 01:30 a.m. every day
- cron: '30 1 * * *'
workflow_dispatch: # allow manual trigger
jobs:
link-check:
runs-on: ubuntu-latest
steps:
# - uses: actions/checkout@v3
- name: Install linkchecker
run: |
pip install linkchecker
- name: Run linkchecker
run: |
linkchecker https://opencompass.readthedocs.io/ --no-robots -t 30 --no-warnings \
--ignore-url "https://opencompass.readthedocs.io/.*/static/images/opencompass_logo.svg" \
--ignore-url "https://opencompass.readthedocs.io/.*/_static/images/icon-menu-dots.svg" \
--ignore-url "https://opencompass.readthedocs.io/policy" \
--ignore-url "https://opencompass.readthedocs.io/(en|zh_CN)/[0-9a-f]{40}/.*"
name: lint
on: [push, pull_request]
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
jobs:
lint:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up Python 3.10
uses: actions/setup-python@v4
with:
python-version: '3.10'
- name: Install pre-commit hook
run: |
pip install pre-commit==3.8.0 mmengine==0.10.5
pre-commit install
- name: Linting
run: pre-commit run --all-files
name: pr_run_test
on:
pull_request:
paths-ignore:
- 'README.md'
- 'README_zh-CN.md'
- 'docs/**'
- 'configs/**'
- 'tools/**'
workflow_dispatch:
schedule:
- cron: '56 22 * * *'
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
env:
CONDA_ENV: pr_test
HF_DATASETS_OFFLINE: 1
HF_EVALUATE_OFFLINE: 1
TRANSFORMERS_OFFLINE: 1
VLLM_USE_MODELSCOPE: false
LMDEPLOY_USE_MODELSCOPE: false
HF_HUB_OFFLINE: 1
CONDA_PATH: /mnt/shared-storage-user/opencompass-shared/qa-llm-cicd/miniconda3
REPORT_ROOT: /mnt/shared-storage-user/opencompass-shared/qa-llm-cicd/eval_report/prtest
COMPASS_DATA_CACHE: /mnt/shared-storage-user/auto-eval-pipeline/opencompass/llmeval/compass_data_cache
HF_DATASETS_CACHE: /mnt/shared-storage-user/auto-eval-pipeline/opencompass/llmeval/hf_cache
HF_HUB_CACHE: /mnt/shared-storage-user/large-model-center-share-weights/hf_hub
KUBEBRAIN_CLUSTER_ENTRY: https://h.pjlab.org.cn
KUBEBRAIN_NAMESPACE: ailab-opencompass
jobs:
pr_run_test:
runs-on: yidian_cu12
timeout-minutes: 45
steps:
- name: Checkout repository
uses: actions/checkout@v2
- name: Prepare - Install opencompass
run: |
. ${{env.CONDA_PATH}}/bin/activate
conda activate ${{env.CONDA_ENV}}
python3 -m pip uninstall opencompass -y
python3 -m pip install .[full] -i https://pkg.pjlab.org.cn/repository/pypi-proxy/simple/ --trusted-host pkg.pjlab.org.cn --no-cache-dir
conda info --envs
- name: conda env
run: |
. ${{env.CONDA_PATH}}/bin/activate
conda activate ${{env.CONDA_ENV}}
conda info --envs
pip list
lmdeploy check_env
- name: Run test
run: |
. ${{env.CONDA_PATH}}/bin/activate
conda activate ${{env.CONDA_ENV}}
conda info --envs
rjob submit --name=pr-test-${{ github.run_id }}-${{ github.run_attempt }} --charged-group=opencompass_gpu --private-machine=group --group=opencompass_gpu --gpu=2 --cpu=32 --memory=32568 --private-machine=group --image=registry.h.pjlab.org.cn/ailab-puyu/xpuyu:torch-2.6.0-45d96d5f-0607 --env=COMPASS_DATA_CACHE=/mnt/shared-storage-user/auto-eval-pipeline/opencompass/llmeval/compass_data_cache --env=TIKTOKEN_CACHE_DIR=/mnt/shared-storage-user/auto-eval-pipeline/opencompass/llmeval/share_tiktoken --env=HF_HUB_CACHE=/mnt/shared-storage-user/large-model-center-share-weights/hf_hub --env=HF_ENDPOINT=https://hf-mirror.com --env=HF_DATASETS_CACHE=/mnt/shared-storage-user/auto-eval-pipeline/opencompass/llmeval/hf_cache --env=HF_HUB_CACHE=/mnt/shared-storage-user/large-model-center-share-weights/hf_hub --env=CUDA_MODULE_LOADING=EAGER --env=HF_DATASETS_OFFLINE=1 --env=TRANSFORMERS_OFFLINE=1 --env=HF_EVALUATE_OFFLINE=1 --env=HF_HUB_OFFLINE=1 --mount=gpfs://gpfs1/opencompass-shared:/mnt/shared-storage-user/opencompass-shared --mount=gpfs://gpfs1/auto-eval-pipeline:/mnt/shared-storage-user/auto-eval-pipeline --mount=gpfs://gpfs1/large-model-center-share-weights:/mnt/shared-storage-user/large-model-center-share-weights --host-network=True -- bash -exc '/mnt/shared-storage-user/opencompass-shared/qa-llm-cicd/pr_test.sh ${{env.REPORT_ROOT}}/${{ github.run_id }}'
for i in {1..300}; do
current_status=$(rjob get pr-test-${{ github.run_id }}-${{ github.run_attempt }} | grep -oP 'rjob [^:]+: \K[^ ]+')
if [[ $current_status == "Succeeded" || $current_status == "Failed" || $current_status == "Stopped" ]]; then
echo "Current status: $current_status, stop checking"
break
fi
sleep 6
done
- name: Get result
run: |
score=$(sed -n '$p' ${{env.REPORT_ROOT}}/${{ github.run_id }}/regression_result1/*/summary/*.csv | awk -F ',' '{print $NF}')
if (( ${score%.*} >= 75 && ${score%.*} <= 80 )); then
echo "score is $score between 75 and 80"
else
echo "score is $score not between 75 and 80"
exit 1
fi
score=$(sed -n '$p' ${{env.REPORT_ROOT}}/${{ github.run_id }}/regression_result2/*/summary/*.csv | awk -F ',' '{print $NF}')
if (( ${score%.*} >= 75 && ${score%.*} <= 80 )); then
echo "score is $score between 75 and 80"
else
echo "score is $score not between 75 and 80"
exit 1
fi
score=$(sed -n '$p' ${{env.REPORT_ROOT}}/${{ github.run_id }}/regression_result3/*/summary/*.csv | awk -F ',' '{print $NF}')
if (( ${score%.*} >= 75 && ${score%.*} <= 80 )); then
echo "score is $score between 75 and 80"
else
echo "score is $score not between 75 and 80"
exit 1
fi
- name: Uninstall opencompass
if: always()
run: |
. ${{env.CONDA_PATH}}/bin/activate
conda activate ${{env.CONDA_ENV}}
python3 -m pip uninstall opencompass -y
conda info --envs
notify_to_feishu:
if: ${{ always() && !cancelled() && contains(needs.*.result, 'failure') && (github.ref_name == 'develop' || github.ref_name == 'main') }}
needs: [pr_run_test]
timeout-minutes: 5
runs-on: self-hosted
steps:
- name: notify
run: |
curl -X POST -H "Content-Type: application/json" -d '{"msg_type":"post","content":{"post":{"zh_cn":{"title":"Opencompass- pr test failed","content":[[{"tag":"text","text":"branch: ${{github.ref_name}}, run action: ${{github.workflow}} failed. "},{"tag":"a","text":"Please click here for details ","href":"https://github.com/'${{ github.repository }}'/actions/runs/'${GITHUB_RUN_ID}'"},{"tag":"at","user_id":"'${{ secrets.USER_ID }}'"}]]}}}}' ${{ secrets.WEBHOOK_URL }}
name: pr_stage_test
on:
pull_request:
paths-ignore:
- 'README.md'
- 'README_zh-CN.md'
- 'docs/**'
- 'configs/**'
- 'tools/**'
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
jobs:
build:
runs-on: ubuntu-22.04
strategy:
matrix:
python-version: ['3.10']
include:
- torch: 2.5.1
steps:
- uses: actions/checkout@v3
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}
- name: Upgrade pip
run: python -m pip install --upgrade pip
- name: Install PyTorch
run: pip install torch==${{matrix.torch}} -f https://download.pytorch.org/whl/cpu/torch_stable.html
- name: Install system dependencies
run: |
sudo sed -i '$ a deb http://th.archive.ubuntu.com/ubuntu jammy main' /etc/apt/sources.list
sudo apt-get update && sudo apt-get install -y libc6 libffi-dev libncursesw6 wget unzip
- name: Upgrade pip
run: python -m pip install pip --upgrade
- name: Install opencompass dependencies
run: |
python -m pip install -r requirements.txt
- name: Build and install
run: python -m pip install -e .
- name: Prepare dataset
run: |
wget https://github.com/open-compass/opencompass/releases/download/0.2.2.rc1/OpenCompassData-core-20240207.zip
unzip OpenCompassData-core-20240207.zip
- name: Dry run test
run: |
python run.py --models hf_opt_125m --datasets siqa_gen winograd_ppl --dry-run
build_cu117:
runs-on: ubuntu-22.04
container:
image: nvidia/cuda:11.7.1-cudnn8-runtime-ubuntu22.04
strategy:
matrix:
python-version: ['3.10']
steps:
- uses: actions/checkout@v3
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}
- name: Fetch GPG keys
run: |
apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/3bf863cc.pub
apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/7fa2af80.pub
- name: Install Python-dev
run: apt-get update && apt-get install -y python${{matrix.python-version}}-dev
if: ${{matrix.python-version != 3.10}}
- name: Install system dependencies
run: |
apt-get update
apt-get install -y ffmpeg libsm6 libxext6 git ninja-build libglib2.0-0 libxrender-dev libc6 libc6-dev
sed -i '$ a deb http://th.archive.ubuntu.com/ubuntu jammy main' /etc/apt/sources.list
apt-get update && apt-get install -y libc6 libffi-dev libncursesw6 wget unzip
- name: Upgrade pip
run: python -m pip install pip --upgrade
- name: Install opencompass dependencies
run: |
python -m pip install -r requirements.txt
- name: Build and install
run: python -m pip install -e .
- name: Prepare dataset
run: |
wget https://github.com/open-compass/opencompass/releases/download/0.2.2.rc1/OpenCompassData-core-20240207.zip
unzip OpenCompassData-core-20240207.zip
- name: Dry run test
run: |
python run.py --models hf_opt_125m --datasets siqa_gen winograd_ppl --dry-run
build_windows:
runs-on: windows-2022
strategy:
matrix:
python-version: ['3.10']
platform: [cpu]
steps:
- uses: actions/checkout@v3
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}
- name: Upgrade pip
run: python -m pip install pip --upgrade
- name: Install PyTorch
run: pip install torch==2.5.1 -f https://download.pytorch.org/whl/cpu/torch_stable.html
- name: Install opencompass dependencies
run: |
pip install -r requirements.txt
- name: Build and install
run: pip install -e .
- name: Prepare dataset
run: |
Invoke-WebRequest -Uri https://github.com/open-compass/opencompass/releases/download/0.2.2.rc1/OpenCompassData-core-20240207.zip -OutFile OpenCompassData-core-20240207.zip
unzip OpenCompassData-core-20240207.zip
- name: Dry run test
run: |
python run.py --models hf_opt_125m --datasets siqa_gen winograd_ppl --dry-run
name: deploy
on:
push:
workflow_dispatch:
inputs:
confirm_publish:
description: 'Type YES to confirm publishing to PyPI'
required: true
type: string
jobs:
build-n-publish:
runs-on: ubuntu-latest
if: |
github.event_name == 'push' && startsWith(github.event.ref, 'refs/tags') ||
(github.event_name == 'workflow_dispatch' && inputs.confirm_publish == 'YES')
steps:
- uses: actions/checkout@v2
- name: Set up Python 3.10
uses: actions/setup-python@v4
with:
python-version: '3.10'
- name: Build opencompass
run: |
pip install wheel
python setup.py sdist bdist_wheel
- name: Publish distribution to PyPI
run: |
pip install twine
twine upload dist/* -u __token__ -p ${{ secrets.pypi_password }}
.DS_Store
output_*/
outputs/
scripts/
icl_inference_output/
.vscode/
tmp/
configs/eval_subjective_alignbench_test.py
configs/openai_key.py
configs/secrets.py
configs/datasets/log.json
configs/eval_debug*.py
configs/viz_*.py
configs/**/*_bkup.py
opencompass/**/*_bkup.py
data
work_dirs
outputs
models/*
configs/internal/
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
*.ipynb
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
.idea
# Auto generate documentation
docs/en/_build/
docs/zh_cn/_build/
# .zip
*.zip
# sft config ignore list
configs/sft_cfg/*B_*
configs/sft_cfg/1B/*
configs/sft_cfg/7B/*
configs/sft_cfg/20B/*
configs/sft_cfg/60B/*
configs/sft_cfg/100B/*
configs/cky/
configs/_internal_legacy*
# in case llama clone in the opencompass
llama/
# in case ilagent clone in the opencompass
ilagent/
# ignore the config file for criticbench evaluation
configs/sft_cfg/criticbench_eval/*
# path of turbomind's model after running `lmdeploy.serve.turbomind.deploy`
turbomind/
# cibench output
*.db
*.pth
*.pt
*.onnx
*.gz
*.gz.*
*.png
*.txt
*.jpg
*.json
*.jsonl
*.csv
*.npy
*.c
# aliyun
core.*
assign:
issues: enabled
pull_requests: disabled
strategy:
# random
daily-shift-based
schedule:
'*/1 * * * *'
assignees:
- bittersweet1999
- liushz
- MaiziXiao
- acylam
- tonysy
exclude: |
(?x)^(
tests/data/|
tests/dataset/|
opencompass/models/internal/|
opencompass/utils/internal/|
opencompass/openicl/icl_evaluator/hf_metrics/|
opencompass/datasets/lawbench/utils|
opencompass/datasets/lawbench/evaluation_functions/|
opencompass/datasets/medbench/|
opencompass/datasets/teval/|
opencompass/datasets/NPHardEval/|
opencompass/datasets/TheoremQA|
opencompass/datasets/subjective/mtbench101.py|
docs/zh_cn/advanced_guides/compassbench_intro.md |
docs/zh_cn/advanced_guides/compassbench_v2_0.md |
opencompass/utils/datasets.py |
opencompass/utils/datasets_info.py
)
repos:
- repo: https://gitee.com/openmmlab/mirrors-flake8
rev: 5.0.4
hooks:
- id: flake8
exclude: |
(?x)^(
opencompass/configs/|
examples/
)
- repo: https://gitee.com/openmmlab/mirrors-isort
rev: 5.11.5
hooks:
- id: isort
exclude: |
(?x)^(
opencompass/configs/|
examples/
)
- repo: https://gitee.com/openmmlab/mirrors-yapf
rev: v0.32.0
hooks:
- id: yapf
exclude: |
(?x)^(
opencompass/configs/|
examples/
)
- repo: https://gitee.com/openmmlab/mirrors-codespell
rev: v2.2.1
hooks:
- id: codespell
exclude: |
(?x)^(
.*\.jsonl|
.*\.md.template|
opencompass/configs/ |
examples/
)
- repo: https://gitee.com/openmmlab/mirrors-pre-commit-hooks
rev: v4.3.0
hooks:
- id: trailing-whitespace
exclude: |
(?x)^(
dicts/|
projects/.*?/dicts/
)
- id: check-yaml
- id: end-of-file-fixer
exclude: |
(?x)^(
dicts/|
projects/.*?/dicts/
)
- id: requirements-txt-fixer
- id: double-quote-string-fixer
- id: check-merge-conflict
- id: fix-encoding-pragma
args: ["--remove"]
- id: mixed-line-ending
args: ["--fix=lf"]
- repo: https://gitee.com/openmmlab/mirrors-mdformat
rev: 0.7.9
hooks:
- id: mdformat
args: ["--number", "--table-width", "200"]
additional_dependencies:
- mdformat-openmmlab
- mdformat_frontmatter
- linkify-it-py
exclude: configs/
- repo: https://gitee.com/openmmlab/mirrors-docformatter
rev: v1.3.1
hooks:
- id: docformatter
args: ["--in-place", "--wrap-descriptions", "79"]
- repo: local
hooks:
- id: update-dataset-suffix
name: dataset suffix updater
entry: ./tools/update_dataset_suffix.py
language: script
pass_filenames: true
require_serial: true
files: ^opencompass/configs/datasets
- repo: local
hooks:
- id: update-dataset-suffix-package
name: dataset suffix updater(package)
entry: ./tools/update_dataset_suffix.py
language: script
pass_filenames: false
# require_serial: true
# files: ^opencompass/configs/datasets
args:
- --root_folder
- opencompass/configs/datasets
# - repo: https://github.com/open-mmlab/pre-commit-hooks
# rev: v0.2.0 # Use the ref you want to point at
# hooks:
# - id: check-algo-readme
# - id: check-copyright
# args: ["mmocr", "tests", "tools"] # these directories will be checked
exclude: |
(?x)^(
tests/data/|
tests/dataset/|
opencompass/models/internal/|
opencompass/utils/internal/|
opencompass/openicl/icl_evaluator/hf_metrics/|
opencompass/datasets/lawbench/utils|
opencompass/datasets/lawbench/evaluation_functions/|
opencompass/datasets/medbench/|
opencompass/datasets/matbench/|
opencompass/datasets/teval/|
opencompass/datasets/NPHardEval/|
opencompass/datasets/TheoremQA|
opencompass/datasets/subjective/mtbench101.py|
docs/zh_cn/advanced_guides/compassbench_intro.md |
docs/zh_cn/advanced_guides/compassbench_v2_0.md |
opencompass/utils/datasets.py |
opencompass/utils/datasets_info.py
)
repos:
- repo: https://github.com/PyCQA/flake8
rev: 5.0.4
hooks:
- id: flake8
exclude: |
(?x)^(
opencompass/configs/|
examples/
)
- repo: https://github.com/PyCQA/isort
rev: 5.11.5
hooks:
- id: isort
exclude: |
(?x)^(
opencompass/configs/|
examples/
)
- repo: https://github.com/pre-commit/mirrors-yapf
rev: v0.32.0
hooks:
- id: yapf
exclude: |
(?x)^(
opencompass/configs/|
examples/
)
- repo: https://github.com/codespell-project/codespell
rev: v2.2.1
hooks:
- id: codespell
exclude: |
(?x)^(
.*\.jsonl|
.*\.md.template|
opencompass/configs/ |
examples/
)
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v5.0.0
hooks:
- id: trailing-whitespace
exclude: |
(?x)^(
dicts/|
projects/.*?/dicts/
)
- id: check-yaml
- id: end-of-file-fixer
exclude: |
(?x)^(
dicts/|
projects/.*?/dicts/
)
- id: requirements-txt-fixer
- id: double-quote-string-fixer
- id: check-merge-conflict
- id: fix-encoding-pragma
args: ["--remove"]
- id: mixed-line-ending
args: ["--fix=lf"]
- repo: https://github.com/executablebooks/mdformat
rev: 0.7.9
hooks:
- id: mdformat
args: ["--number", "--table-width", "200"]
additional_dependencies:
- mdformat-openmmlab
- mdformat_frontmatter
- linkify-it-py
exclude: configs/
# - repo: https://github.com/myint/docformatter
# rev: v1.3.1
# hooks:
# - id: docformatter
# args: ["--in-place", "--wrap-descriptions", "79"]
- repo: local
hooks:
- id: update-dataset-suffix
name: dataset suffix updater
entry: ./tools/update_dataset_suffix.py
language: script
pass_filenames: true
require_serial: true
files: ^opencompass/configs/datasets
- repo: local
hooks:
- id: update-dataset-suffix-package
name: dataset suffix updater(package)
entry: ./tools/update_dataset_suffix.py
language: script
pass_filenames: false
# require_serial: true
# files: ^opencompass/configs/datasets
args:
- --root_folder
- opencompass/configs/datasets
# - repo: https://github.com/open-mmlab/pre-commit-hooks
# rev: v0.2.0 # Use the ref you want to point at
# hooks:
# - id: check-algo-readme
# - id: check-copyright
# args: ["mmocr", "tests", "tools"] # these directories will be checked
Copyright 2020 OpenCompass Authors. All rights reserved.
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright 2020 OpenCompass Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
recursive-include opencompass/configs *.py *.yml *.json *.txt *.md
recursive-include opencompass/openicl/icl_evaluator/hf_metrics *.py
recursive-include opencompass/datasets *.py *.yml *.json *.txt *.md *.yaml
<div align="center">
<img src="docs/zh_cn/_static/image/logo.svg" width="500px"/>
<br />
<br />
</div>
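## 🛠️ Installation
The steps below cover a quick installation and dataset preparation.
### 💻 Environment Setup
Using `docker` to set up the environment is recommended.
- #### Create the container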
```bash
docker run --shm-size 500g --network=host --name=opencompass --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v /path/to/workspace/:/path/to/workspace/ -v /opt/hyhal:/opt/hyhal:ro -it image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.4.1-ubuntu22.04-dtk25.04.1-py3.10 bash
```
- #### Install OpenCompass via pip
```bash
# Supports the vast majority of datasets and models
pip install -U opencompass
```
- #### Install OpenCompass from source
To use the latest OpenCompass features, you can also build it from source:
```bash
cd opencompass
pip install -e .
```
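### 📂 Data Preparation
#### Offline download in advance
OpenCompass supports evaluation on local datasets. Download and extract the datasets with the following commands: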
```bash
# Download the datasets into data/
wget https://github.com/open-compass/opencompass/releases/download/0.2.2.rc1/OpenCompassData-core-20240207.zip
unzip OpenCompassData-core-20240207.zip
```
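Before launching an evaluation, it can help to confirm that the archive actually unpacked into `data/`. The sketch below is a minimal check, assuming the zip extracts into a `data/` directory in the current working directory, as the comment in the download step suggests. The function name `dataset_dir_ready` is hypothetical and not part of OpenCompass.

```bash
# Hypothetical helper (not part of OpenCompass): succeeds only if the given
# directory exists and contains at least one entry. Defaults to ./data.
dataset_dir_ready() {
    dir="${1:-data}"
    [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}
```

For example, `dataset_dir_ready data || echo "data/ is missing or empty" >&2` prints a warning when the download or unzip step needs to be repeated.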
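#### Automatic download from OpenCompass
OpenCompass can now download datasets automatically from its storage server. To download these datasets, run the evaluation with the additional `--dry-run` flag.
The list of currently supported datasets is [here](https://github.com/open-compass/opencompass/blob/main/opencompass/utils/datasets_info.py#L259). More datasets will be uploaded soon.
#### (Optional) Automatic download with ModelScope
You can also use [ModelScope](https://www.modelscope.cn) to load datasets:
Environment setup: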
```bash
pip install modelscope
export DATASET_SOURCE=ModelScope
```
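Once the environment is configured, there is no need to download all the data; just submit the evaluation task directly. The currently supported datasets are: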
```bash
humaneval, triviaqa, commonsenseqa, tydiqa, strategyqa, cmmlu, lambada, piqa, ceval, math, LCSTS, Xsum, winogrande, openbookqa, AGIEval, gsm8k, nq, race, siqa, mbpp, mmlu, hellaswag, ARC, BBH, xstory_cloze, summedits, GAOKAO-BENCH, OCNLI, cmnli
```
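Some third-party features, such as Humaneval and Llama, may require additional steps to work properly. For details, see the [Installation Guide](https://opencompass.readthedocs.io/zh_CN/latest/get_started/installation.html).
## 🏗️ Evaluation
Once OpenCompass is installed and the datasets are prepared as described above, you can run your first evaluation with OpenCompass!
- ### Your first evaluation
OpenCompass supports setting up configurations through the command-line interface (CLI) or Python scripts. The CLI is recommended for simple evaluation setups, and scripts for more complex ones.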
```bash
# Command-line interface (CLI)
opencompass --models hf_internlm2_5_1_8b_chat --datasets demo_gsm8k_chat_gen
# Python script
opencompass examples/eval_chat_demo.py
```
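You can find more example scripts in the [examples](./examples) folder.
- ### API evaluation
By design, OpenCompass does not distinguish between open-source models and API models. You can evaluate both model types in the same way, or even within the same setup.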
```bash
export OPENAI_API_KEY="YOUR_OPEN_API_KEY"
# Command-line interface (CLI)
opencompass --models gpt_4o_2024_05_13 --datasets demo_gsm8k_chat_gen
# Python script
opencompass examples/eval_api_demo.py
```
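An API evaluation depends on the key being visible to the `opencompass` process, so a small guard can catch a missing or unexported key before any requests are made. This is a sketch only: `require_env` is a hypothetical helper, not an OpenCompass command, and it assumes `printenv` is available on the system.

```bash
# Hypothetical helper (not part of OpenCompass): fail early when a required
# environment variable is missing. printenv only reports exported variables,
# which are the ones a child process such as opencompass would inherit.
require_env() {
    if [ -z "$(printenv "$1")" ]; then
        echo "error: environment variable $1 is not exported or is empty" >&2
        return 1
    fi
}
```

A typical use would be `require_env OPENAI_API_KEY && opencompass examples/eval_api_demo.py`, which starts the evaluation only when the key is set.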
## 📖 References
https://github.com/open-compass/opencompass