- 08 Jan, 2024 1 commit
-
-
Yuchen Yan authored
Co-authored-by: yanyuchen04 <yanyuchen04@meituan.com>
-
- 05 Jan, 2024 2 commits
-
-
Connor-Shen authored
* support mbpp+ * support mbpp+ * minor fix * [Feat] minor fix --------- Co-authored-by: yingfhu <yingfhu@gmail.com>
-
bittersweet1999 authored
* add subject ir * Add ir dataset * Add ir dataset
-
- 04 Jan, 2024 1 commit
-
-
bittersweet1999 authored
* multi_round dataset * add multi_round evaluation
-
- 02 Jan, 2024 1 commit
-
-
Mo Li authored
* Add NeedleInAHaystack Test * Apply pre-commit formatting * Update configs/eval_hf_internlm_chat_20b_cdme.py Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com> * add needle in haystack test * update needle in haystack test * update plot function in tools_needleinahaystack.py * optimizing needleinahaystack dataset generation strategy * modify minor formatting issues * add English version support * change NeedleInAHaystackDataset to dynamic loading * change NeedleInAHaystackDataset to dynamic loading * fix needleinahaystack test eval bug * fix needleinahaystack config bug --------- Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
-
- 01 Jan, 2024 2 commits
-
-
Francis-llgg authored
* check * message * add * change prompt * change a para name * modify name of the file * delete a useless file
-
Francis-llgg authored
* add new dataset mastermath2024v1 * change it to simplified chinese prompt * change file name
-
- 29 Dec, 2023 3 commits
-
-
Mo Li authored
* Add NeedleInAHaystack Test * Apply pre-commit formatting * Update configs/eval_hf_internlm_chat_20b_cdme.py Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com> * add needle in haystack test * update needle in haystack test * update plot function in tools_needleinahaystack.py * optimizing needleinahaystack dataset generation strategy * modify minor formatting issues --------- Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
-
Hubert authored
* [Feat] update code dataset * [Feat] update code dataset * [Feat] update code dataset
-
bittersweet1999 authored
-
- 28 Dec, 2023 2 commits
-
-
bittersweet1999 authored
-
Connor-Shen authored
* add chinese_version of humaneval,mbpp * add humaneval&mbpp gen.py * minor fix * minor add --------- Co-authored-by: yingfhu <yingfhu@gmail.com>
-
- 27 Dec, 2023 3 commits
-
-
Hubert authored
-
bittersweet1999 authored
* add judgellm prompts * add judgelm prompts * update import info * fix situation that no abbr in config * fix situation that no abbr in config * add summarizer for other judgellm * change config name * add maxlen * add maxlen * dict assert * dict assert * fix strings * fix strings
-
Yang Yong authored
* Update LightllmApi and Fix mmlu bug * checkout mmlu_gen_a484b3.py --------- Co-authored-by: Leymore <zfz-960727@163.com>
-
- 26 Dec, 2023 1 commit
-
-
philipwangOvO authored
* add InfiniteBench * add InfiniteBench --------- Co-authored-by: wangchonghua <wangchonghua@pjlab.org.cn>
-
- 25 Dec, 2023 1 commit
-
-
Fengzhe Zhou authored
-
- 23 Dec, 2023 1 commit
-
-
Mo Li authored
* Add NeedleInAHaystack Test * Apply pre-commit formatting * Update configs/eval_hf_internlm_chat_20b_cdme.py Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com> * add needle in haystack test * update needle in haystack test --------- Co-authored-by: Songyang Zhang <tonysy@users.noreply.github.com>
-
- 20 Dec, 2023 2 commits
-
-
Skyfall-xzz authored
* [Feature] Add reasonbench dataset * add configs for supporting generative inference & merge datasets in the same category * modify config filename to prompt version * fix codes to meet pre-commit requirements * lint the code to meet pre-commit requirements * Align Load_data Sourcecode Briefly * fix bugs * reduce code redundancy
-
Jingming authored
* [Feature] Support the use of humaneval_plus. * [Feature] Add humaneval_plus_gen.py * minor check * [Fix] Fix bug --------- Co-authored-by: yingfhu <yingfhu@gmail.com>
-
- 19 Dec, 2023 2 commits
-
-
bittersweet1999 authored
-
Hubert authored
* minor add * minor add * minor fix
-
- 14 Dec, 2023 1 commit
-
-
Songyang Zhang authored
* update alignmentbench * update alignmentbench * update alignmentbench
-
- 13 Dec, 2023 1 commit
-
-
bittersweet1999 authored
* alignmentbench infer and judge * alignmentbench * alignmentbench done * alignment all done * alignment all done
-
- 12 Dec, 2023 1 commit
-
-
bittersweet1999 authored
[Feature] Add double order of subjective evaluation and removing duplicated response among two models (#692) * add features * add doc string * add doc string
-
- 11 Dec, 2023 2 commits
-
-
bittersweet1999 authored
* new version of subject * fixed draw * fixed draw * fixed draw * done * done * done * done * fixed lint
-
Hubert authored
-
- 09 Dec, 2023 1 commit
-
-
Xiaoming Shi authored
* update medbench * medbench update * format medbench * format --------- Co-authored-by: 施晓明 <PJLAB\shixiaoming@pjnl104220118l.pjlab.org> Co-authored-by: Leymore <zfz-960727@163.com>
-
- 08 Dec, 2023 1 commit
-
-
liyucheng09 authored
* add contamination analysis to ceval * fix bugs * add contamination docs * to pass CI check * update --------- Co-authored-by: zhangyifan1 <zhangyifan1@pjlab.org.cn> Co-authored-by: Leymore <zfz-960727@163.com>
-
- 06 Dec, 2023 1 commit
-
-
bittersweet1999 authored
* TabMWP * TabMWP * fixed * fixed * fixed * done * done * done * add new subjective judgement * add new subjective judgement * add new subjective judgement * add new subjective judgement * add new subjective judgement * modified to a more general way * modified to a more general way * final * final * add summarizer * add new summarize * fixed * fixed * fixed --------- Co-authored-by: caomaosong <caomaosong@pjlab.org.cn>
-
- 01 Dec, 2023 4 commits
-
-
rolellm authored
* added rolebench * fixed unreasonable variable names * fixed variable names in comments
-
liushz authored
* Update MathBench CodeInterpreter & fix MathBench Bug * Fix errors * update --------- Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn> Co-authored-by: Fengzhe Zhou <zfz-960727@163.com>
-
Hubert authored
* [Feat] update gsm8k and math agent config * minor fix
-
liushz authored
* Add WikiBench * Add WikiBench * format --------- Co-authored-by: Leymore <zfz-960727@163.com>
-
- 30 Nov, 2023 2 commits
-
-
liushz authored
* add Chinese version: csqa crowspairs nq * Update cn_data * Update cn_data * update format --------- Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn> Co-authored-by: Leymore <zfz-960727@163.com>
-
Ma Zerun authored
* [Feature] Support chat style inferencer. * [Fix] use new prompt * [Fix] use new prompt --------- Co-authored-by: yingfhu <yingfhu@gmail.com>
-
- 29 Nov, 2023 1 commit
-
-
Fengzhe Zhou authored
-
- 27 Nov, 2023 2 commits
-
-
liushz authored
* Add SVAMP dataset * Add SVAMP dataset * Add SVAMP dataset * Add gsm_hard dataset * Add gsm_hard dataset * format --------- Co-authored-by: Leymore <zfz-960727@163.com>
-
Fengzhe Zhou authored
-
- 23 Nov, 2023 1 commit
-
-
Fengzhe Zhou authored
-