- 27 May, 2024 1 commit
-
-
jxd authored
* support CHARM (https://github.com/opendatalab/CHARM) reasoning tasks * fix lint error * add dataset card for CHARM * minor refactor * add txt --------- Co-authored-by: wujiang <wujiang@pjlab.org.cn> Co-authored-by: Leymore <zfz-960727@163.com>
-
- 24 May, 2024 2 commits
-
-
bittersweet1999 authored
-
klein authored
Fix the bug in drop_gen: wrong import
-
- 21 May, 2024 2 commits
-
-
liushz authored
* Add Math Evaluation with Judge Model Evaluator * Add Math Evaluation with Judge Model Evaluator * Add Math Evaluation with Judge Model Evaluator * Add Math Evaluation with Judge Model Evaluator * Fix Llama-3 meta template * Fix MATH with JudgeLM Evaluation * Fix MATH with JudgeLM Evaluation * Fix MATH with JudgeLM Evaluation * Fix MATH with JudgeLM Evaluation * Update accelerator * Update MathBench --------- Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>
-
Fengzhe Zhou authored
-
- 17 May, 2024 1 commit
-
-
Fengzhe Zhou authored
-
- 14 May, 2024 4 commits
-
-
Fengzhe Zhou authored
-
Fengzhe Zhou authored
-
Xu Song authored
* [Feat] Support dataset_suffix check for mixed configs * update mixed suffix * update suffix --------- Co-authored-by: Leymore <zfz-960727@163.com>
-
Fengzhe Zhou authored
* add TheoremQA with 5-shot * add huggingface_above_v4_33 classes * use num_worker partitioner in cli * update theoremqa * update TheoremQA * add TheoremQA * rename theoremqa -> TheoremQA * update TheoremQA output path * rewrite many model configs * update huggingface * further update * refine configs * update configs * update configs * add configs/eval_llama3_instruct.py * add summarizer multi faceted * update bbh datasets * update configs/models/hf_llama/lmdeploy_llama3_8b_instruct.py * rename class * update readme * update hf above v4.33
-
- 13 May, 2024 2 commits
-
-
Mo Li authored
* update few-shot example * add 128k
-
bittersweet1999 authored
-
- 08 May, 2024 1 commit
-
-
JuhaoLiang authored
* add AceGPT-MMLUArabic benchmark * update readme and fix lint issue * remove unused package * add MMLUArabic zero-shot settings * rename filename and update readme
-
- 06 May, 2024 5 commits
-
-
Fangyu Lei authored
* s3eval_branch * update s3eval
-
Xu Song authored
* [Fix] Fix AGIEval chinese sets * Create agieval_gen_617738.py * [Fix] Fix AGIEval chinese sets * Restore agieval_gen_64afd3.py * Update agieval_gen.py * Create agieval_mixed_0fa998.py * Update agieval_mixed.py
-
Yggdrasill7D6 authored
* add mgsm datasets * fix lint * fix lint * update mgsm * update mgsm * ease code spell * update * update * update --------- Co-authored-by: Leymore <zfz-960727@163.com>
-
klein authored
* [Feature] update drop dataset from openai simple eval * update drop template presentation * update --------- Co-authored-by: Leymore <zfz-960727@163.com>
-
Fengzhe Zhou authored
* add mmlu prompt from simple_evals, openai * return empty str on failure
-
- 30 Apr, 2024 1 commit
-
-
Alexander Lam authored
* fixed formatting based on pre-commit tests * fixed typo in comments; reduced the number of models in the eval config * fixed a bug in LLMCompressionDataset, where setting samples=None would result in passing test[:None] to load_dataset * removed unnecessary variable in _format_table_pivot; changed lark_reporter message to English
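The `samples=None` bug described above can be illustrated with a minimal sketch. It assumes the dataset loader builds a Hugging Face style split string such as `test[:100]`; the helper name `build_split` is hypothetical and is not taken from `LLMCompressionDataset`, whose actual fix may differ.

```python
from typing import Optional


def build_split(split: str = "test", samples: Optional[int] = None) -> str:
    """Build a split string, adding a slice only when a sample count is given.

    Hypothetical helper for illustration; not the project's actual code.
    """
    if samples is None:
        # Naive formatting would produce "test[:None]", which is not a
        # valid slice expression, so return the plain split name instead.
        return split
    if samples < 0:
        raise ValueError("samples must be non-negative")
    return f"{split}[:{samples}]"
```

The resulting string is what would then be passed as the `split` argument to `load_dataset`, so the full split is loaded when no sample limit is set.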
-
- 28 Apr, 2024 2 commits
-
-
liushz authored
* Add Math Evaluation with Judge Model Evaluator * Add Math Evaluation with Judge Model Evaluator * Add Math Evaluation with Judge Model Evaluator * Add Math Evaluation with Judge Model Evaluator * Fix Llama-3 meta template * Fix MATH with JudgeLM Evaluation * Fix MATH with JudgeLM Evaluation * Fix MATH with JudgeLM Evaluation * Fix MATH with JudgeLM Evaluation --------- Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>
-
Yggdrasill7D6 authored
* add flames datasets * fix lint * rm quota * add judgemodel info and fix os path * support flames dataset * support flames dataset --------- Co-authored-by: bittersweet1999 <1487910649@qq.com>
-
- 26 Apr, 2024 4 commits
-
-
Francis-llgg authored
* add gpqa_openai_simple_eval * trigger CI build * reorg --------- Co-authored-by: Leymore <zfz-960727@163.com>
-
klein authored
* modify the requirements/runtime.txt: numpy==1.23.4 --> numpy>=1.23.4 * update cibench: dataset and evaluation * cibench summarizer bug * update cibench * move extract_code import --------- Co-authored-by: zhangchuyu@pjlab.org.cn <zhangchuyu@pjlab.org.cn> Co-authored-by: Leymore <zfz-960727@163.com>
-
bittersweet1999 authored
* support arenahard * support arenahard * support arenahard
-
bittersweet1999 authored
* support openai math evaluation * support openai math evaluation * support openai math evaluation * support math llm judge * support math llm judge
-
- 24 Apr, 2024 1 commit
-
-
Jingming Zhuo authored
* [Feature] Add IFEval * add humaneval prompt from simple_evals, openai
-
- 22 Apr, 2024 2 commits
-
-
Fengzhe Zhou authored
* add TheoremQA with 5-shot * cherry pick from add-huggingface-above-v4.33, good TheoremQA results
-
bittersweet1999 authored
* fix multiround * fix
-
- 19 Apr, 2024 1 commit
-
-
Fengzhe Zhou authored
-
- 12 Apr, 2024 1 commit
-
-
liuwei130 authored
* add ChemBench * update results * molbench -> ChemBench --------- Co-authored-by: Leymore <zfz-960727@163.com>
-
- 09 Apr, 2024 1 commit
-
-
Fengzhe Zhou authored
-
- 07 Apr, 2024 3 commits
-
-
Mo Li authored
* Conflicts: configs/summarizers/needlebench.py * fix lint problems
-
Mo Li authored
* Squashed commit of the following:
commit c48ad194c3976dc63d1b60d8c8ab2d5ff9e1cbfe Author: DseidLi <2568818204@qq.com> Date: Tue Apr 2 16:57:43 2024 +0800 add atc_choice
commit 3ac6efea29619573e6fac8fa3cce464853dcead0 Merge: 2d4e5597 8e3a9c3 Author: DseidLi <2568818204@qq.com> Date: Tue Apr 2 16:41:38 2024 +0800 Merge branch 'atc_choice' into atc_add_choice
commit 8e3a9c396a3e5546d3faf584183f6fd60b974d5e Merge: 150a036 0a6a03fe Author: DseidLi <2568818204@qq.com> Date: Tue Mar 26 04:47:07 2024 +0800 Merge branch 'main' into atc_choice Conflicts: configs/summarizers/needlebench.py opencompass/datasets/needlebench/multi.py opencompass/datasets/needlebench/origin.py opencompass/datasets/needlebench/parallel.py
commit 150a036d6d990f26a57c974d1af83d88c31a0f9d Merge: 8d6ac9a 940dd18 Author: DseidLi <2568818204@qq.com> Date: Wed Mar 20 03:49:08 2024 +0800 Merge branch 'needlebench_fix' into atc_choice
commit 8d6ac9a1a43b1c9d0f0ea27e7d58968a203ea898 Author: DseidLi <2568818204@qq.com> Date: Wed Mar 20 03:41:49 2024 +0800 optimize needlebench code
commit 940dd18a4270f24bc69edd2a780182c68918e1a9 Author: DseidLi <2568818204@qq.com> Date: Wed Mar 20 03:39:46 2024 +0800 fix vllm
commit d8be6877bc41051f3edcc0421c462c834c0f1c9a Merge: ecad78a 2527fda Author: DseidLi <2568818204@qq.com> Date: Tue Mar 19 21:07:08 2024 +0800 Merge remote-tracking branch 'origin/add_1M_dataset' into atc_choice
commit 2527fda8a546595bcaea1e5261367bc1097faec8 Author: DseidLi <2568818204@qq.com> Date: Tue Mar 19 16:03:40 2024 +0800 add model configs
commit 75425acdf80d6d25ee24bb0aa60ac48539262e76 Author: DseidLi <2568818204@qq.com> Date: Tue Mar 19 16:02:15 2024 +0800 add prompt position args
commit 367ba1ba612a8cec5df1f80d5e5ae4e285baf38b Author: DseidLi <2568818204@qq.com> Date: Wed Feb 28 21:40:00 2024 +0800 add Needlebench-1000K configs
commit ecad78af14c4bb00fe325779114b384c57ab30bf Author: DseidLi <2568818204@qq.com> Date: Thu Mar 14 22:08:32 2024 +0800 fix atc
commit 08772c0787b18872abadc9ffec3223941a5ee0c2 Merge: 9f3f8cf caf1cf8a Author: DseidLi <2568818204@qq.com> Date: Thu Mar 14 22:07:28 2024 +0800 Merge branch 'main' into atc_choice Conflicts: configs/datasets/needlebench/readme.md configs/datasets/needlebench/readme_zh-CN.md configs/summarizers/needlebench.py opencompass/datasets/needlebench/atc.py opencompass/summarizers/needlebench.py
commit 9f3f8cfb4452722734d334114ac1d14110e57406 Author: DseidLi <2568818204@qq.com> Date: Thu Mar 14 21:35:53 2024 +0800 add atc-choice test
commit 52be7c1202376b4e09821188b826f1a805328129 Author: DseidLi <2568818204@qq.com> Date: Wed Mar 6 02:54:15 2024 +0800 update needlebench randomseed and add vllm qwen14b
commit fc1effce596ae2e5ece4933e8cd34aef8e64a6f9 Merge: 4e747ed caf1cf8a Author: DseidLi <2568818204@qq.com> Date: Wed Mar 6 02:51:14 2024 +0800 Merge branch 'main' into add_model_configs
commit 31834f9b23af3354ac3581ec86d693d0f05cdd1c Merge: 7dabc82 120bf8b3 Author: DseidLi <2568818204@qq.com> Date: Sun Mar 3 23:29:42 2024 +0800 Merge branch 'main' of https://github.com/open-compass/opencompass into atc_choice
commit 4e747ed1988ddbcfcc7fff334601259ade72d363 Author: DseidLi <2568818204@qq.com> Date: Sun Mar 3 22:15:25 2024 +0800 add internlm2-lmdeploy model and gemma configs
commit 7dabc828123d711c8cf834d6aab4137bb55e85ed Author: DseidLi <2568818204@qq.com> Date: Sat Mar 2 17:26:15 2024 +0800 add atc choice version -ZH
commit 996f8ae43d3f946a052f736717ead139d153e2dd Author: DseidLi <2568818204@qq.com> Date: Wed Feb 28 16:58:56 2024 +0800 update readme for needlebench
commit f7266e873cb34ccf18a7f20b2c5821af8416a14f Author: DseidLi <2568818204@qq.com> Date: Wed Feb 28 16:44:53 2024 +0800 move readme.md
commit 1c7375681dea13996802e45b878dc4929ea8fa65 Author: DseidLi <2568818204@qq.com> Date: Wed Feb 28 16:38:31 2024 +0800 fix linting error
commit b6524f3ebfb8a3a12a5ad3e3fa7a8a0921fcb6c1 Author: DseidLi <2568818204@qq.com> Date: Wed Feb 28 16:33:51 2024 +0800 lint summarizer
commit c0d1190e39d3b6724f677346df2572df9af59f25 Author: DseidLi <2568818204@qq.com> Date: Wed Feb 28 16:29:03 2024 +0800 add needlebench intro, fix summarizer
commit 0965baf78588e29d813b61d73f0ebd868a0ce3d0 Author: DseidLi <2568818204@qq.com> Date: Mon Feb 26 13:31:26 2024 +0800 fix bug in needlebench summarizer
commit 5d32b31eb85382026935f356190ad92b103afd98 Author: DseidLi <2568818204@qq.com> Date: Sat Feb 24 03:19:08 2024 +0800 update act prompt
commit af82a7f085e394d83aa84043e2881dd50115942c Merge: 32bf9fe 53fe788d Author: DseidLi <2568818204@qq.com> Date: Fri Feb 23 17:50:32 2024 +0800 Merge remote-tracking branch 'upstream/main' into needlebench
commit 32bf9fe802eaf8e8e5b33ff17b2a897058f8b66b Author: DseidLi <2568818204@qq.com> Date: Fri Feb 23 17:31:32 2024 +0800 simplify needlebench 32k, 128k, 200k for eval
commit a7cb025e05a48449de9839005fada02bd5bff15a Author: DseidLi <2568818204@qq.com> Date: Fri Feb 23 14:48:58 2024 +0800 add needlebench
* fix summarizer * remove repeated code * remove chinese comments
-
Mo Li authored
* add needlebench datasets suffix * fix import * update run.py args for summarizer key and dataset suffix * update utils/run.py
-
- 02 Apr, 2024 1 commit
-
-
bittersweet1999 authored
* support multi-model judge and moe judge * test_moe * test_moe * test * add moe judge * support multi-judge-model
-
- 28 Mar, 2024 1 commit
-
-
bittersweet1999 authored
* support alpacaeval_v2 * support alpacaeval * update docs * update docs
-
- 25 Mar, 2024 1 commit
-
-
Mo Li authored
* add Needlebench-1000K configs * add prompt position args * add model configs * Update parallel.py * fix lint
-
- 19 Mar, 2024 3 commits
-
-
Connor-Shen authored
* [Feature] update apps/taco * [Feature] update apps/taco
-
Connor-Shen authored
* update post process * update post process
-
Connor-Shen authored
* [Feat] Support TACO * update README * update README
-