- 22 Nov, 2024 1 commit
-
-
Baber Abbasi authored
-
- 20 Nov, 2024 1 commit
-
-
Baber Abbasi authored
* fix test task * don't call lm.chat_template each time
-
- 18 Nov, 2024 3 commits
-
-
Kozzy Voudouris authored
* Add metabench (Kipnis et al. 2024) * Update metabench tasks for full replication of original benchmarks, using publicly available datasets * Remove unnecessary import * Add permute versions of each task, where the answer orders are randomly shuffled. * Add metabench group for easier evaluations * Fix mmlu counts after removing duplicate * Add secondary datasets * Fix f-string error * Fix f-string error for permute processing * Add original hash to outputs for easy matching to original results * Add line break at end of utils files * Remove extra line from winogrande * Reformat for linters * fix multiple input test * appease pre-commit * Add metabench to tasks README * fix multiple input `test_doc_to_text` --------- Co-authored-by: Baber <baber@hey.com>
-
Baber Abbasi authored
-
Baber Abbasi authored
* add hf mamba to mamba_lm * fix _model_generate for hf
-
- 16 Nov, 2024 2 commits
-
-
Wonseok Hwang authored
* release kbl-v0.1 * fix linting * remove rag tasks as doc_to_text functions cause trouble * remove remaining rag tasks * remove unnecessary repeat in yaml files and rag dataset in hf-hub * remove unnecessary newline; introduce cfg files in lbox/kbl in hf * Make task yaml files consistent to hf-datasets-config * Make task yaml files consistent to hf-datasets-config * Remove trailing empty space in doc-to-text * Remove unnecessary yaml file * Fix task naming error * trailing space removed
-
Baber Abbasi authored
* pre-commit update * update github actions * make logging less verbose * fix artifacts
-
- 15 Nov, 2024 2 commits
-
-
Oyvind Tafjord authored
-
Nikodem Szwast authored
* refactor code, fix config path bug * update types to be from typing lib * add pre-commit formatting * specify version of ibm_watsonx_ai package * adjust get_watsonx_credentials() function, add minor refactor to address PR review comments * change missing installation hint from ibm_watsonx_ai to lm_eval[ibm_watsonx_ai]
-
- 12 Nov, 2024 1 commit
-
-
Alex Titterton authored
-
- 11 Nov, 2024 2 commits
-
-
Baber Abbasi authored
-
Baber Abbasi authored
* batch commit * Revert "batch commit" This reverts commit d859d1ca. * batch commit * checkout from main * checkout from main * checkout from main * checkout from main * checkout from main * cleanup * cleanup * cleanup * cleanup * cleanup * cleanup * cleanup * cleanup * cleanup * Chat template fix (#7) * cleanup * cleanup * cleanup * linting * fix tests * add ifeval install to new_task CI * Revert "add ifeval install to new_task CI" This reverts commit 1d19449bb7fbfa05d51e7cd20950475eae533bf1. * adds leaderboard tasks (#1) * adds leaderboard tasks * Delete lm_eval/tasks/leaderboard/leaderboard_chat_template.yaml * add readme * Delete lm_eval/tasks/leaderboard/mmlu_pro/mmlu_pro_chat_template.yaml * modify readme * fix bbh task * fix bbh salient task * modify the readme * Delete lm_eval/tasks/leaderboard/ifeval/README.md * Delete lm_eval/tasks/leaderboard/math/README.md * add leaderboard to the tasks repertory * add announcement about new leaderboard tasks * linting * Update README.md Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com> * installs ifeval dependency in new_task github workflow --------- Co-authored-by: Nathan Habib <nathan.habib@huggingface.com> Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com> * fix math parser * fix math parser * fix version * add warning about chat template --------- Co-authored-by: Nathan Habib <nathan.habib@huggingface.co> Co-authored-by: Nathan Habib <30601243+NathanHB@users.noreply.github.com> Co-authored-by: Nathan Habib <nathan.habib@huggingface.com> Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com> Co-authored-by: Nathan Habib <nathan.habib19@gmail.com>
-
- 09 Nov, 2024 2 commits
-
-
Baber Abbasi authored
* download nltk `punkt_tab` on LOCAL_RANK=0 * remove print * remove `time` * nit
-
Baber Abbasi authored
* switch `max_tokens` for `max_completion_tokens`. OpenAI ChatCompletions * remove stop, temp=1 for o1 * add chat assertion * HF_DATASETS_TRUST_REMOTE_CODE = True for task tests * move warning
-
- 07 Nov, 2024 3 commits
-
-
Baber Abbasi authored
* pass device_map other than auto for parallelize
-
Baber Abbasi authored
-
Baber Abbasi authored
-
- 06 Nov, 2024 1 commit
-
-
Rob Geada authored
-
- 05 Nov, 2024 3 commits
-
-
mtkachenko authored
* add jaqket_v2 and jcommonsenseqa * remove comments * remove num_beams as it is incompatible with vllm * add jnli + refactor * rename jnla -> jnli * add jsquad + replace colon chars with the Japanese unicode * ignore whitespaces in generation tasks * add marc_ja * add xwinograd + simplify other yamls * add mgsm and xlsum * refactor xlsum * add ja_leaderboard tag * edit README.md * update README.md * add credit + minor changes * run ruff format * address review comments + add group * remove aggregate_metric_list * remove tags * update tasks/README.md
-
zxcvuser authored
* Modify label errors in catcola and paws * Update version to 1.0 in pawsx_template_yaml * add changelog --------- Co-authored-by: Baber <baber@hey.com>
-
Sypherd authored
-
- 04 Nov, 2024 1 commit
-
-
Hailey Schoelkopf authored
-
- 01 Nov, 2024 1 commit
-
-
Sypherd authored
-
- 31 Oct, 2024 1 commit
-
-
Qubitium-ModelCloud authored
* support gptqmodel * code opt * add gptqmodel option * Update huggingface.py * Update pyproject.toml * gptqmodel version upgraded to 1.0.6 * GPTQModel version upgraded to 1.0.8 * Update pyproject.toml * fix ruff-format error * add gptqmodel test * Update gptqmodel test model * skip cuda * python3.8 compatible * Update README.md * Update README.md --------- Co-authored-by: CL-ModelCloud <cl@modelcloud.ai>
-
- 30 Oct, 2024 3 commits
-
-
Samuel Monson authored
-
zxcvuser authored
* Add xquad task * Update general README * Run pre-commit
-
Chris Kerwell Gresla authored
* fix: use lora_request for data parallel vllm evals * fix(docs): include type hint * chore: lint, et pre-commit al --------- Co-authored-by: Chris Kerwell Gresla <chris@wafer.systems>
-
- 25 Oct, 2024 1 commit
-
-
Kiersten Stokes authored
* Update pyproject.toml with watsonx package extra Signed-off-by: kiersten-stokes <kierstenstokes@gmail.com> * Remove unused function Signed-off-by: kiersten-stokes <kierstenstokes@gmail.com> --------- Signed-off-by: kiersten-stokes <kierstenstokes@gmail.com>
-
- 23 Oct, 2024 1 commit
-
-
Nikodem Szwast authored
* add support for IBM watsonx_llm * add ibm_watsonx_ai package to optional-dependencies * move global scope imports to inner scope * change cache to lru_cache * fix circular import * use 3.8 typing * use 3.8 typing --------- Co-authored-by: Baber <baber@hey.com>
-
- 22 Oct, 2024 2 commits
-
-
Leonid Sinev authored
* Replace generic exception classes with more specific ones * rerun pre-commit to pass linter tests * Revert "rerun pre-commit to pass linter tests" This reverts commit 67f88ccf144469853217704520e613196042d859. * reduce repetitions in errors or so * Replace generic exception class with a more specific one
-
Iker García-Ferrero authored
Update prompt according to: https://github.com/ikergarcia1996/NoticIA/blob/main/prompts.py
-
- 20 Oct, 2024 1 commit
-
-
Yuxian Gu authored
-
- 17 Oct, 2024 2 commits
- 16 Oct, 2024 1 commit
-
-
zxcvuser authored
* added tasks to spanish_bench * fixed capitalization in escola and run pre-commit * Update _flores_common_yaml * Update _flores_common_yaml * Update direct_yaml * Update cot_yaml * Update cot_yaml * Update _flores_common_yaml --------- Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
-
- 14 Oct, 2024 1 commit
-
-
Elron Bandel authored
* Add Unitxt Multimodality Support Signed-off-by: elronbandel <elronbandel@gmail.com> * Update Signed-off-by: elronbandel <elronbandel@gmail.com> * Fix formatting Signed-off-by: elronbandel <elronbandel@gmail.com> --------- Signed-off-by: elronbandel <elronbandel@gmail.com>
-
- 08 Oct, 2024 4 commits
-
-
Hailey Schoelkopf authored
-
Hailey Schoelkopf authored
-
Baber Abbasi authored
* max_images are passed on to vLLM's `limit_mm_per_prompt` * replace max image placeholders in string * handle chat_template error * move `fewshot_random_seed` to global
-
Baber Abbasi authored
* switch conditional checks to `self.backend` * nit * nit * commit feedback * fix test; update precommit hooks * add escape hatch for custom self.AUTO_MODEL_CLASS * add escape hatch for custom self.AUTO_MODEL_CLASS * fix * move assertion * add logging messages * update AUTO_MODEL_CLASS behavior in _get_backend --------- Co-authored-by: haileyschoelkopf <hailey@eleuther.ai>
-