Commits · 07bd7e23cb4a4ac8d787314ee189e17bbfa3a142 · gaoqiong / lm-evaluation-harness

11 Mar, 2025 1 commit
- initialize tokenizer with bos_token (#2781) · 07bd7e23
  Baber Abbasi authored Mar 11, 2025
  
  07bd7e23
21 Feb, 2025 1 commit

Lintang Sutawika authored Feb 20, 2025



* changed source of eval_logger

* allow eval_logger to be set from args

* removed verbosity arg from non-main methods

* fix logging

* pre-commit

* set verbosity in eval logger

* replace utils.eval_logger

* fix logging in main

* add logging to docs

* add logging message

* nit

* add logging to docs

* refactor setup_logging to utils

---------
Co-authored-by: Baber <baber@hey.com>

1ba35e62

19 Jan, 2025 1 commit
- update pre-commit (#2632) · f724be69
  Baber Abbasi authored Jan 19, 2025
```
* update pre-commit
```
  f724be69
15 Jan, 2025 1 commit

assistant prefill (#2615) · 703fbffd

Baber Abbasi authored Jan 15, 2025

* add assistant prefix

* add arc_challenge from llama

* nit

* nit

* nit

* add assistant prefix

* add mmlu_llama

* nit

* nit

* Revert "nit"

This reverts commit 6a97f8356237305e375212b966b30e8de59dd4bc.

* fix regex bug

* add assistant_prefix to vllm

* add `Question:`

* add mmlu_pro

* add fewshot assistant_prefix

* use `assistant_prefill`

* typehints

* nits

* nits

* add to docs

* add readme

703fbffd

07 Jan, 2025 1 commit

Fix gguf loading via Transformers (#2596) · 16cfe464

CL-ModelCloud authored Jan 07, 2025



* hf support load gguf file

* code review

* code review

* code clean up

* note about use_fast compat with gguf

---------
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>

16cfe464

19 Dec, 2024 1 commit
- add warning for truncation (#2585) · 6ccd520f
  Baber Abbasi authored Dec 19, 2024
```
* add warning for truncation
```
  6ccd520f
16 Dec, 2024 1 commit

batch `loglikelihood_rolling` across requests (#2559) · 0bfb0220

Baber Abbasi authored Dec 16, 2024

* batch all rolling token windows

* nit

* copy to vllm

* fix max_length for `get_rolling_token_windows`

* bugfix

* bugfix

* add type hints

0bfb0220

30 Nov, 2024 1 commit
- make utility function to handle `until` (#2518) · 0230356c
  Baber Abbasi authored Nov 30, 2024
```
* make utility function to handle `until`

* fix text
```
  0230356c
16 Nov, 2024 1 commit
- update pre-commit hooks and git actions (#2497) · badf273a
  Baber Abbasi authored Nov 16, 2024
```
* pre-commit update

* update github actions

* make logging less verbose

* fix artifacts
```
  badf273a
11 Nov, 2024 2 commits

change warning to debug (#2481) · 6b628d9a
Baber Abbasi authored Nov 11, 2024

6b628d9a

Fix chat template; fix leaderboard math (#2475) · 77c811ea

Baber Abbasi authored Nov 11, 2024

* batch commit

* Revert "batch commit"

This reverts commit d859d1ca.

* batch commit

* checkout from main

* checkout from main

* checkout from main

* checkout from main

* checkout from main

* cleanup

* cleanup

* cleanup

* cleanup

* cleanup

* cleanup

* cleanup

* cleanup

* cleanup

* Chat template fix (#7)

* cleanup

* cleanup

* cleanup

* linting

* fix tests

* add ifeval install to new_task CI

* Revert "add ifeval install to new_task CI"

This reverts commit 1d19449bb7fbfa05d51e7cd20950475eae533bf1.

* adds leaderboard tasks (#1)

* adds leaderboard tasks

* Delete lm_eval/tasks/leaderboard/leaderboard_chat_template.yaml

* add readme

* Delete lm_eval/tasks/leaderboard/mmlu_pro/mmlu_pro_chat_template.yaml

* modify readme

* fix bbh task

* fix bbh salient task

* modify the readme

* Delete lm_eval/tasks/leaderboard/ifeval/README.md

* Delete lm_eval/tasks/leaderboard/math/README.md

* add leaderboard to the tasks repertory

* add announcement about new leaderboard tasks

* linting

* Update README.md
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

* installs ifeval dependency in new_task github workflow

---------
Co-authored-by: Nathan Habib <nathan.habib@huggingface.com>
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

* fix math parser

* fix math parser

* fix version

* add warning about chat template

---------
Co-authored-by: Nathan Habib <nathan.habib@huggingface.co>
Co-authored-by: Nathan Habib <30601243+NathanHB@users.noreply.github.com>
Co-authored-by: Nathan Habib <nathan.habib@huggingface.com>
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
Co-authored-by: Nathan Habib <nathan.habib19@gmail.com>

77c811ea

07 Nov, 2024 1 commit
- pass device_map other than auto for parallelize (#2457) · 4155ec7f
  Baber Abbasi authored Nov 07, 2024
```
* pass device_map other than auto for parallelize
```
  4155ec7f
31 Oct, 2024 1 commit

Add GPTQModel support for evaluating GPTQ models (#2217) · 4f8e479e

Qubitium-ModelCloud authored Nov 01, 2024



* support gptqmodel

* code opt

* add gptqmodel option

* Update huggingface.py

* Update pyproject.toml

* gptqmodel version upgraded to 1.0.6

* GPTQModel version upgraded to 1.0.8

* Update pyproject.toml

* fix ruff-format error

* add gptqmodel test

* Update gptqmodel test model

* skip cuda

* python3.8 compatible

* Update README.md

* Update README.md

---------
Co-authored-by: CL-ModelCloud <cl@modelcloud.ai>

4f8e479e

22 Oct, 2024 1 commit

[Fix] Replace generic exception classes with more specific ones (#1989) · d4ae9635

Leonid Sinev authored Oct 22, 2024

* Replace generic exception classes with more specific ones

* rerun pre-commit to pass linter tests

* Revert "rerun pre-commit to pass linter tests"

This reverts commit 67f88ccf144469853217704520e613196042d859.

* reduce repetitions in errors or so

* Replace generic exception class with a more specific one

d4ae9635

08 Oct, 2024 1 commit

HF: switch conditional checks to `self.backend` from `AUTO_MODEL_CLASS` (#2353) · ab2c46c3

Baber Abbasi authored Oct 09, 2024



* switch conditional checks to `self.backend`

* nit

* nit

* commit feedback

* fix test; update precommit hooks

* add escape hatch for custom self.AUTO_MODEL_CLASS

* add escape hatch for custom self.AUTO_MODEL_CLASS

* fix

* move assertion

* add logging messages

* update AUTO_MODEL_CLASS behavior in _get_backend

---------
Co-authored-by: haileyschoelkopf <hailey@eleuther.ai>

ab2c46c3

13 Sep, 2024 1 commit

Multimodal prototyping (#2243) · fb963f0f

Lintang Sutawika authored Sep 13, 2024



* add WIP hf vlm class

* add doc_to_image

* add mmmu tasks

* fix merge conflicts

* add lintang's changes to hf_vlms.py

* fix doc_to_image

* added yaml_path for config-loading

* revert

* add line to process str type v

* update

* modeling cleanup

* add aggregation for mmmu

* rewrite MMMU processing code based on only MMMU authors' repo (doc_to_image still WIP)

* implemented doc_to_image

* update doc_to_image to accept list of features

* update functions

* readd image processed

* update args process

* bugfix for repeated images fed to model

* push WIP loglikelihood code

* commit most recent code (generative ; qwen2-vl testing)

* preliminary image_token_id handling

* small mmmu update: some qs have >4 mcqa options

* push updated modeling code

* use processor.apply_chat_template

* add mathvista draft

* nit

* nit

* ensure no footguns in text<>multimodal LM<>task incompatibility

* add notification to readme regarding launch of prototype!

* fix compatibility check

* reorganize mmmu configs

* chat_template=None

* add interleave chat_template

* add condition

* add max_images; interleave=true

* nit

* testmini_mcq

* nit

* pass image string; convert img

* add vllm

* add init

* vlm add multi attr

* fixup

* pass max images to vllm model init

* nit

* encoding to device

* fix HFMultimodalLM.chat_template ?

* add mmmu readme

* remove erroneous prints

* use HFMultimodalLM.chat_template ; restore tasks/__init__.py

* add docstring for replace_placeholders in utils

* fix `replace_placeholders`; set image_string=None

* fix typo

* cleanup + fix merge conflicts

* update MMMU readme

* del mathvista

* add some sample scores

* Update README.md

* add log msg for image_string value

---------
Co-authored-by: haileyschoelkopf <hailey@eleuther.ai>
Co-authored-by: Baber Abbasi <baber@eleuther.ai>
Co-authored-by: Baber <baber@hey.com>
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

fb963f0f

04 Sep, 2024 1 commit

Chat Template fix (cont. #2235) (#2269) · 7a1614eb

Baber Abbasi authored Sep 04, 2024



* default chat template method fix

* move chat_template to TemplateLM

* remove hotfix

* handle openai `chat_template`

* Update lm_eval/api/model.py
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

* add 'max_tokens' to gen_kwargs

* pre-commit

---------
Co-authored-by: KonradSzafer <szafer.konrad@gmail.com>
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

7a1614eb

28 Aug, 2024 1 commit

Fix `loglikelihood_rolling` caching ( #1821 ) (#2187) · 8138fd52

Hailey Schoelkopf authored Aug 28, 2024



* fix revision type

* allow for None-input loglikelihood reqs to be cached

* handle no remaining cache items

* pre-commit

* change cache_hook.add_partial(loglikelihood_rolling...) convention

---------
Co-authored-by: Baber Abbasi <baber@eleuther.ai>

8138fd52

22 Aug, 2024 1 commit
- Fix logging when resizing embedding layer in peft mode (#2239) · e9287fce
  Wessel Poelman authored Aug 22, 2024
  
  e9287fce
20 Aug, 2024 1 commit

Add multiple chat template (#2129) · 3740a5d2

KonradSzafer authored Aug 20, 2024



* multiple chat template support

* help doc update

* add transformers link to docstring

* model args update

* comment update

* statement simplification

* simplified chat_template property

* docs update

* removed template arg from HFLM class

* interface doc update

* model guide update

* interface doc update

* reuse apply_chat_template variable

* model guide refactor

* interface doc update

* removed old definition

* last nits

* last nits

* last nits

* better wording

* last nits

* Remove unnecessary Optional

* Apply suggestions from code review
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

* return variable rename

---------
Co-authored-by: Clémentine Fourrier <22726840+clefourrier@users.noreply.github.com>
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

3740a5d2

05 Aug, 2024 2 commits

fix revision type (#2184) · 7ff13e9e
Hailey Schoelkopf authored Aug 05, 2024

7ff13e9e

Dp and mp support (#2056) · 0ce7734d

Nathan Habib authored Aug 05, 2024

* batch commit

* Revert "batch commit"

This reverts commit d859d1ca.

* batch commit

* checkout from main

* checkout from main

* checkout from main

* checkout from main

* checkout from main

* cleanup

* cleanup

* cleanup

* cleanup

* cleanup

* cleanup

* cleanup

* cleanup

* linting

* add doc

* Update lm_eval/models/huggingface.py
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

* Update README.md
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

* Update lm_eval/models/huggingface.py

* linter

* Apply suggestions from code review
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

* style

* remove prepare

* fix

* style

* last check

* Update lm_eval/models/huggingface.py
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

---------
Co-authored-by: Clémentine Fourrier <22726840+clefourrier@users.noreply.github.com>
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
Co-authored-by: clementine@huggingface.co <clementine@huggingface.co>

0ce7734d

15 Jul, 2024 1 commit
- make recurrent_gemma model types included in the force-BOS case (#2105) · 9884ad6e
  Hailey Schoelkopf authored Jul 15, 2024
  
  9884ad6e
02 Jul, 2024 1 commit
- update gemma-2 default BOS behavior (#2049) · 67a990e7
  Hailey Schoelkopf authored Jul 01, 2024
  
  67a990e7
28 Jun, 2024 1 commit

Add chat template to `vllm` (#2034) · cc2d3463

Baber Abbasi authored Jun 28, 2024



* add chat template

* refactor token padding

* nit

* nit

* check on failing test

* check transformers version

* remove transformers pin

* add ids to test

* nit

* fixup

* fix bos bug

* nit

* fixup! fix bos bug

* increase tolerance for table test

* don't detokenize vllm logprobs

* Update lm_eval/models/utils.py
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

* pre-commit run --all-files

---------
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

cc2d3463

03 Jun, 2024 1 commit

Add chat template (#1873) · 070d31df

KonradSzafer authored Jun 03, 2024



* initial chat template

* tokenizer attribute check

* variable rename

* interface update

* system instruction

* system inst default update

* fewshot as multiturn

* typing update

* indent update

* added comments

* Adding a fewshot in a more readable way

* linting

* Moved apply chat template to LM

* multiturn alternation fix

* cache key update

* apply chat template method fix

* add system prompt hash to cache_key

* tokenizer name property for cache_key

* property name fix

* linting backward compatibility fix

* docs and errors update

* add documentation on adding chat template compatibility to model_guide

* fewshot as multiturn check fix

* saving system inst and chat template in results

* eval tracker update

* docs update

* Apply suggestions from code review
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

---------
Co-authored-by: haileyschoelkopf <hailey@eleuther.ai>
Co-authored-by: Clémentine Fourrier <22726840+clefourrier@users.noreply.github.com>
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

070d31df

30 May, 2024 1 commit

[HFLM]Add support for Ascend NPU (#1886) · 8f716817

Huazhong Ji authored May 31, 2024



* [HFLM]Add support for Ascend NPU
Co-authored-by: jiaqiw09 <jiaqiw960714@gmail.com>
Co-authored-by: zhabuye <2947436155@qq.com>

* bump accelerate dependency version to 0.26.0 for NPU compat.

---------
Co-authored-by: jiaqiw09 <jiaqiw960714@gmail.com>
Co-authored-by: zhabuye <2947436155@qq.com>
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

8f716817

24 May, 2024 1 commit
- [HFLM]Use Accelerate's API to reduce hard-coded CUDA code (#1880) · c4c15917
  Huazhong Ji authored May 24, 2024
  
  c4c15917
19 May, 2024 1 commit

Fix: support PEFT/LoRA with added tokens (#1828) · 86319a9b

Nick Doiron authored May 19, 2024



* resize model embeddings

* resize only

* tokenizer help

* load tokenizer before model

* add comment and run precommit lint

* Add log message
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

---------
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

86319a9b

07 May, 2024 2 commits
- Logging Updates (Alphabetize table printouts, fix eval tracker bug) (#1774) (#1791) · d4a913c4
  Hailey Schoelkopf authored May 07, 2024
```
* fix auto-batch size bug for seq2seq models

* alphabetize task + group tables ; fix eval tracker bug

* fix eval tracker bug
```
  d4a913c4
- Fix Caching Tests ; Remove `pretrained=gpt2` default (#1775) · 7fe2b93c
  Hailey Schoelkopf authored May 07, 2024
  
  7fe2b93c
03 May, 2024 1 commit

evaluation tracker implementation (#1766) · 59cf408a

KonradSzafer authored May 03, 2024

* evaluation tracker implementation

* OVModelForCausalLM test fix

* typo fix

* moved methods args

* multiple args in one flag

* loggers moved to dedicated dir

* improved filename sanitization

59cf408a

16 Apr, 2024 1 commit

Add delta weights model loading (#1712) · 12a165d1

KonradSzafer authored Apr 16, 2024

* added delta weights

* removed debug

* readme update

* better error handling

* autogptq warn

* warn update

* peft and delta error, explicitly deleting _model_delta

* linter fix

12a165d1

25 Mar, 2024 2 commits

Seq2seq fix (#1604) · 262f879a

Lintang Sutawika authored Mar 25, 2024



* fix on --task list

* add fixes to tokenization

* differentiate encoding for seq2seq and decoder

* return token setting

* format for pre-commit

* Seq2seq fix, pt2 (#1630)

* getting model class only when defined

* encode_pair handles None, add_special_tokens turned into dict with default value

---------
Co-authored-by: achervyakov <77295913+artemorloff@users.noreply.github.com>

262f879a

peft Version Assertion (#1635) · 8e72f267
WoosungMyung authored Mar 26, 2024
```
* peft Version Assertion

* fix the linter issue
```
8e72f267

20 Mar, 2024 1 commit

Fixes to Loglikelihood prefix token / VLLM (#1611) · c7b03ad4

Hailey Schoelkopf authored Mar 20, 2024

* make vllm use prefix_token_id ; have prefix_token_id be optional method to define

* custom_prefix_token_id wasn't set if not passed

c7b03ad4

19 Mar, 2024 2 commits
- fix until arg processing (#1608) · d4b8fc13
  achervyakov authored Mar 20, 2024
  
  d4b8fc13
- Revert "Patch for Seq2Seq Model predictions (#1584)" (#1601) · f871646f
  Hailey Schoelkopf authored Mar 19, 2024
```
This reverts commit b7923a84.
```
  f871646f
18 Mar, 2024 1 commit

use BOS token in loglikelihood (#1588) · a4192489

kwrobel.eth authored Mar 18, 2024



* use BOS token in loglikelihood

* improve comments

* add model arg

* log prefix token id

* log prefix token id

* Update lm_eval/api/model.py
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

* change name to prefix_token_id

---------
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

a4192489

17 Mar, 2024 1 commit

Patch for Seq2Seq Model predictions (#1584) · b7923a84

Lintang Sutawika authored Mar 18, 2024



* Differentiate _encode_pair setting for decoder and enc-dec models

* tok_decode to not skip special token so that eos doesn't become empty string

* Update model.py

* Update model.py

* Update huggingface.py

* Update lm_eval/models/huggingface.py
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

* Update model.py

---------
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

b7923a84