- 16 May, 2024 1 commit
-
-
zhulinJulia24 authored
* Update pr-run-test.yml * Update daily-run-test.yml * Update daily-run-test.yml * Update pr-run-test.yml * Update daily-run-test.yml * Update daily-run-test.yml * Update daily-run-test.yml * Update daily-run-test.yml * Update oc_score_baseline.yaml * Update daily-run-test.yml * Update oc_score_assert.py --------- Co-authored-by: zhulin1 <zhulin1@pjlab.org.cn>
-
- 15 May, 2024 5 commits
-
-
Fengzhe Zhou authored
* enable HuggingFacewithChatTemplate with --accelerator via cli * rm vllm_internlm2_chat_7b
-
liushz authored
* Update accelerator * update run --------- Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn> Co-authored-by: Fengzhe Zhou <zfz-960727@163.com>
-
Fengzhe Zhou authored
-
Fengzhe Zhou authored
-
bittersweet1999 authored
Co-authored-by: Leymore <zfz-960727@163.com>
-
- 14 May, 2024 4 commits
-
-
Fengzhe Zhou authored
-
Fengzhe Zhou authored
-
Xu Song authored
* [Feat] Support dataset_suffix check for mixed configs * update mixed suffix * update suffix --------- Co-authored-by: Leymore <zfz-960727@163.com>
-
Fengzhe Zhou authored
* add TheoremQA with 5-shot * add huggingface_above_v4_33 classes * use num_worker partitioner in cli * update theoremqa * update TheoremQA * add TheoremQA * rename theoremqa -> TheoremQA * update TheoremQA output path * rewrite many model configs * update huggingface * further update * refine configs * update configs * update configs * add configs/eval_llama3_instruct.py * add summarizer multi faceted * update bbh datasets * update configs/models/hf_llama/lmdeploy_llama3_8b_instruct.py * rename class * update readme * update hf above v4.33
-
- 13 May, 2024 2 commits
-
-
Mo Li authored
* update few-shot example * add 128k
-
bittersweet1999 authored
-
- 11 May, 2024 1 commit
-
-
bittersweet1999 authored
* fix alpacaeval * fix alpacaeval
-
- 09 May, 2024 2 commits
-
-
Fengzhe Zhou authored
(cherry picked from commit 4beb6d9ab655d8a626971841b7acfd9fae9d438f) Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>
-
Alexander Lam authored
* added qwen moe and mixtral 8x22 model configs * updated README files news section
-
- 08 May, 2024 3 commits
-
-
Mo Li authored
* update needleinahaystack eval docs * update needlebench summarizer * fix english docs typo
-
bittersweet1999 authored
-
JuhaoLiang authored
* add AceGPT-MMLUArabic benchmark * update readme and fix lint issue * remove unused package * add MMLUArabic zero-shot settings * rename filename and update readme
-
- 06 May, 2024 5 commits
-
-
Fangyu Lei authored
* s3eval_branch * update s3eval
-
Xu Song authored
* [Fix] Fix AGIEval chinese sets * Create agieval_gen_617738.py * [Fix] Fix AGIEval chinese sets * Restore agieval_gen_64afd3.py * Update agieval_gen.py * Create agieval_mixed_0fa998.py * Update agieval_mixed.py
-
Yggdrasill7D6 authored
* add mgsm datasets * fix lint * fix lint * update mgsm * update mgsm * ease code spell * update * update * update --------- Co-authored-by: Leymore <zfz-960727@163.com>
-
klein authored
* [Feature] update drop dataset from openai simple eval * update drop template presentation * update --------- Co-authored-by: Leymore <zfz-960727@163.com>
-
Fengzhe Zhou authored
* add mmlu prompt from simple_evals, openai * return empty str on failure
-
- 30 Apr, 2024 3 commits
-
-
Yang Yong authored
-
Fengzhe Zhou authored
-
Alexander Lam authored
* fixed formatting based on pre-commit tests * fixed typo in comments; reduced the number of models in the eval config * fixed a bug in LLMCompressionDataset, where setting samples=None would result in passing test[:None] to load_dataset * removed unnecessary variable in _format_table_pivot; changed lark_reporter message to English
-
- 29 Apr, 2024 3 commits
-
-
Ikko Eltociear Ashimine authored
requiresments -> requirements
-
bittersweet1999 authored
-
Songyang Zhang authored
* [Update] Update performance of common benchmarks * [Update] Update performance of common benchmarks * [Update] Update performance of common benchmarks
-
- 28 Apr, 2024 5 commits
-
-
liushz authored
* Add Math Evaluation with Judge Model Evaluator * Add Math Evaluation with Judge Model Evaluator * Add Math Evaluation with Judge Model Evaluator * Add Math Evaluation with Judge Model Evaluator * Fix Llama-3 meta template * Fix MATH with JudgeLM Evaluation * Fix MATH with JudgeLM Evaluation * Fix MATH with JudgeLM Evaluation * Fix MATH with JudgeLM Evaluation --------- Co-authored-by: liuhongwei <liuhongwei@pjlab.org.cn>
-
bittersweet1999 authored
-
Lyu Han authored
* adapt to lmdeploy v0.4.0 * compatible
-
Yggdrasill7D6 authored
* add flames datasets * fix lint * rm quota * add judgemodel info and fix os path * support flames dataset * support flames dataset --------- Co-authored-by: bittersweet1999 <1487910649@qq.com>
-
Mo Li authored
* update NeedleInAHaystack Test Docs * update docs
-
- 26 Apr, 2024 6 commits
-
-
dmitrysarov authored
* fix output typing, change mutable list to immutable tuple * import missed type * format --------- Co-authored-by: Leymore <zfz-960727@163.com>
-
binary-husky authored
* fix relative path bug * format --------- Co-authored-by: hmp <505030475@qq.com> Co-authored-by: Leymore <zfz-960727@163.com>
-
Wang Xingjin authored
* add vllm get_ppl * add vllm get_ppl * format --------- Co-authored-by: xingjin.wang <xingjin.wang@mihoyo.com> Co-authored-by: Leymore <zfz-960727@163.com>
-
Haodong Duan authored
* Remove MultiModal * update index.rst * update README * remove mmbench codes * update news --------- Co-authored-by: Leymore <zfz-960727@163.com>
-
Francis-llgg authored
* add gpqa_openai_simple_eval * trigger CI build * reorg --------- Co-authored-by: Leymore <zfz-960727@163.com>
-
klein authored
* modify the requirements/runtime.txt: numpy==1.23.4 --> numpy>=1.23.4 * update cibench: dataset and evaluation * cibench summarizer bug * update cibench * move extract_code import --------- Co-authored-by: zhangchuyu@pjlab.org.cn <zhangchuyu@pjlab.org.cn> Co-authored-by: Leymore <zfz-960727@163.com>
-