Commits · 0c134ee944d97998013eaff6f4e76d1b9fa87ecd · gaoqiong / lm-evaluation-harness

12 Sep, 2025 1 commit
- add quote to type hints (#3292) · 0c134ee9
  fxmarty-amd authored Sep 12, 2025
  
08 Sep, 2025 1 commit

Ignore seed when splitting batch in chunks with groupby (#3047) · 44398478

Slim Frikha authored Sep 09, 2025



* feat(vllm_causallms): make collator ignore seed when splitting batch into chunks

* fix(collator): revert PR changes

* fix(vllm-causallm): update collator call with groupby None

* feat(sglang-causallms): make generation accept a list of sampling params

---------
Co-authored-by: Baber <baber@hey.com>

21 Aug, 2025 1 commit
- Fix `add_bos_token` not updated for Gemma tokenizer (#3206) · 206b7722
  Cyrus Leung authored Aug 21, 2025
```
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
```
02 Aug, 2025 1 commit

Update vLLM compatibility (#3024) · bc811365

Cyrus Leung authored Aug 03, 2025



* Update vLLM compatibility
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>

* add TokensPrompt to all generate calls

---------
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Baber <baber@hey.com>

24 Jul, 2025 2 commits
- vllm: remove device (#3181) · 4f8195f1
  Baber Abbasi authored Jul 24, 2025
  
- fix vllm test issue that call pop() from None (#3182) · 5f5f35e5
  weiliang authored Jul 24, 2025
  
23 Jul, 2025 2 commits

Remove "device" from vllm_causallms.py (#3176) · 8c6fde08

Michael Goin authored Jul 23, 2025

Device has been a deprecated arg for a few releases of vLLM and is now removed in 0.10.0 https://github.com/vllm-project/vllm/pull/21349

Added `chat_template_args` to pass additional kwargs to tokenizer.apply_chat_template (#3164) · 2eea3f50

Avelina Asada Hadji-Kyriacou authored Jul 23, 2025



* added support for additional chat template arguments

* use `enable_thinking`

* add wrap logging function

* add `chat_template_args` back to HF

---------
Co-authored-by: Baber <baber@hey.com>

16 Jul, 2025 1 commit

truncate thinking tags in generations (#3145) · 51ede33c

Baber Abbasi authored Jul 17, 2025

* feat: add postprocessing for generated text to strip stop sequences and thinking tokens

* nit

* fix: trim leading whitespace after stripping thinking tokens from generation

* feat: add think_end_token to model_args

* nit

* nit

* nit

* add to readme

* nit

15 Jul, 2025 1 commit
- fix: vllm lora (#3132) · 3102a8e4
  MaYongQing authored Jul 15, 2025
  
25 Jun, 2025 1 commit
- remove system message if `TemplateError` (#3076) · 0f63d4f5
  Baber Abbasi authored Jun 25, 2025
  
08 Jun, 2025 1 commit

[longbench] fix metric calculation (#2983) · 147e9d61

Baber Abbasi authored Jun 08, 2025

* use all answers

* use middle truncation

* maybe fix classification score

* strip classification preds

* [vllm] remove stop tokens post-hoc

* strip all preds

* pacify pre-commit

* start on truncation utility

* add to readme

* add a footgun doc

* fix newline in yaml templates

* do not strip code_sim preds!

* fix pre-commit config

* fix instruction warning

* add not to longbench readme

03 Jun, 2025 1 commit
- fix: fix vllm issue with DP>1 (#3025) · d57e3d65
  Younes B authored Jun 03, 2025
  
26 May, 2025 1 commit

[vllm] data parallel for V1 (#3011) · 5a481f43

Baber Abbasi authored May 26, 2025

* add data_parallel for V1

* use Process instead of Queue

* ray used if V0 DP

* better error handling

* fix truncation warning comparison

23 May, 2025 1 commit
- [Fix] Update `resolve_hf_chat_template` arguments (#2992) · 357d4eaa
  fxmarty-amd authored May 23, 2025
```
* fix arguments

* pacify pre-commit

---------
Co-authored-by: Baber <baber@hey.com>
```
19 May, 2025 1 commit
- [SGLANG] Add the SGLANG generate API (#2997) · 53c65300
  Baber Abbasi authored May 19, 2025
```
* add `sglang-generate`

* nit

* nit

* nit

* pacify pre-commit
```
15 May, 2025 1 commit
- Add device arg to model_args passed to LLM object in VLLM model class (#2879) · 96966f53
  Filippo Momentè authored May 15, 2025
```
* fix: pass device arg in model_ar in vllm_causallms

* casting device arg to str in vLLM model args
```
10 May, 2025 1 commit
- fix: type error while checking context length (#2972) · 1c03af33
  Sungjae Lee authored May 10, 2025
  
09 May, 2025 1 commit
- add warning on truncation (#2962) · 2f03271d
  Baber Abbasi authored May 09, 2025
  
06 May, 2025 1 commit
- Add support for enable_thinking argument in vllm model, set default to False (#2947) · ab618f01
  Alexandre Marques authored May 06, 2025
  
16 Apr, 2025 1 commit
- fix resolve_hf_chat_template version (#2917) · 38ba7dce
  Baber Abbasi authored Apr 16, 2025
```
* fix resolve_hf_chat_template version

* pre-commit
```
14 Apr, 2025 1 commit

Extend support for chat template in vLLM (#2902) · 2a41c02e

Alexandre Marques authored Apr 14, 2025

* Add support for chat templates defined outside of tokenizer_config.json, as supported by vLLM

* Update template name to avoid conflict with other variable

20 Mar, 2025 2 commits
- [VLLM, SLANG] default temp=0.0 (#2819) · c6b9aeeb
  Baber Abbasi authored Mar 20, 2025
  
- Configure the pad tokens for Qwen when using vLLM (#2810) · 61b63da7
  Yifei Zhang authored Mar 20, 2025
  
11 Mar, 2025 1 commit
- initialize tokenizer with bos_token (#2781) · 07bd7e23
  Baber Abbasi authored Mar 11, 2025
  
27 Feb, 2025 1 commit
- fix vllm data parallel (#2746) · a87fe425
  Baber Abbasi authored Feb 27, 2025
```
* remove ray.remote resources

* remove kobtest tag (registered as group)
```
21 Feb, 2025 1 commit

Logging (#2203) · 1ba35e62

Lintang Sutawika authored Feb 20, 2025



* changed source of eval_logger

* allow eval_logger to be set from args

* removed verbosity arg from non-main methods

* fix logging

* pre-commit

* set verbosity in eval logger

* replace utils.eval_logger

* fix logging in main

* add logging to docs

* add logging message

* nit

* add logging to docs

* refactor setup_logging to utils

---------
Co-authored-by: Baber <baber@hey.com>

17 Feb, 2025 1 commit
- fix vllm (#2708) · 52df63b7
  Baber Abbasi authored Feb 17, 2025
```
* fix vllm

* fix data_parallel

* copy to multimodal
```
07 Feb, 2025 1 commit
- remove cuda device assertion (#2680) · a40fe42a
  Baber Abbasi authored Feb 07, 2025
  
19 Jan, 2025 1 commit
- update pre-commit (#2632) · f724be69
  Baber Abbasi authored Jan 19, 2025
```
* update pre-commit
```
15 Jan, 2025 1 commit

assistant prefill (#2615) · 703fbffd

Baber Abbasi authored Jan 15, 2025

* add assistant prefix

* add arc_challenge from llama

* nit

* nit

* nit

* add assistant prefix

* add mmlu_llama

* nit

* nit

* Revert "nit"

This reverts commit 6a97f8356237305e375212b966b30e8de59dd4bc.

* fix regex bug

* add assistant_prefix to vllm

* add `Question:`

* add mmlu_pro

* add fewshot assistant_prefix

* use `assistant_prefill`

* typehints

* nits

* nits

* add to docs

* add readme

16 Dec, 2024 1 commit

batch `loglikelihood_rolling` across requests (#2559) · 0bfb0220

Baber Abbasi authored Dec 16, 2024

* batch all rolling token windows

* nit

* copy to vllm

* fix max_length for `get_rolling_token_windows`

* bugfix

* bugfix

* add type hints

30 Nov, 2024 1 commit
- make utility function to handle `until` (#2518) · 0230356c
  Baber Abbasi authored Nov 30, 2024
```
* make utility function to handle `until`

* fix text
```
15 Nov, 2024 1 commit
- Fix revision parameter to vllm get_tokenizer (#2492) · e20e1ddc
  Oyvind Tafjord authored Nov 15, 2024
  
30 Oct, 2024 1 commit

Fix lora requests when dp with vllm (#2433) · 838a3e03

Chris Kerwell Gresla authored Oct 30, 2024



* fix: use lora_request for data parallel vllm evals

* fix(docs): include type hint

* chore: lint, et pre-commit al

---------
Co-authored-by: Chris Kerwell Gresla <chris@wafer.systems>

22 Oct, 2024 1 commit

[Fix] Replace generic exception classes with a more specific ones (#1989) · d4ae9635

Leonid Sinev authored Oct 22, 2024

* Replace generic exception classes with a more specific ones

* rerun pre-commit to pass linter tests

* Revert "rerun pre-commit to pass linter tests"

This reverts commit 67f88ccf144469853217704520e613196042d859.

* reduce repetitions in errors or so

* Replace generic exception class with a more specific one

04 Sep, 2024 1 commit

Chat Template fix (cont. #2235) (#2269) · 7a1614eb

Baber Abbasi authored Sep 04, 2024



* default chat template method fix

* move chat_template to TemplateLM

* remove hotfix

* handle openai `chat_template`

* Update lm_eval/api/model.py
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

* add 'max_tokens' to gen_kwargs

* pre-commit

---------
Co-authored-by: KonradSzafer <szafer.konrad@gmail.com>
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

30 Aug, 2024 1 commit

API: fix maxlen; vllm: prefix_token_id bug (#2262) · b31f92e8

Baber Abbasi authored Aug 30, 2024

* max_length - 1 (generation always >= 1)

* vllm: fix rolling prefix_token

* nit: add comment

* fixup! max_length should be handled for logliklihoods

28 Aug, 2024 1 commit

Fix `loglikelihood_rolling` caching ( #1821 ) (#2187) · 8138fd52

Hailey Schoelkopf authored Aug 28, 2024



* fix revision type

* allow for None-input loglikelihood reqs to be cached

* handle no remaining cache items

* pre-commit

* change cache_hook.add_partial(loglikelihood_rolling...) convention

---------
Co-authored-by: Baber Abbasi <baber@eleuther.ai>

02 Jul, 2024 1 commit
- update gemma-2 default BOS behavior (#2049) · 67a990e7
  Hailey Schoelkopf authored Jul 01, 2024
  