Commits · 793469e05a3e4cbc0dc8fcffd33df79238949fc8 · gaoqiong / lm-evaluation-harness

12 Jun, 2024 1 commit
- Fix self.max_tokens in anthropic_llms.py (#1848) · 793469e0
  Nikita Lozhnikov authored Jun 12, 2024
```
Fix bug where `self.max_tokens` was not set
```
  793469e0
11 Jun, 2024 1 commit
- add hacky add_bos_token forcing for Gemma to VLLM too (#1857) · b3e4c49a
  Hailey Schoelkopf authored Jun 11, 2024
  
  b3e4c49a
03 Jun, 2024 1 commit

KonradSzafer authored Jun 03, 2024



* initial chat template

* tokenizer attribute check

* variable rename

* interface update

* system instruction

* system inst default update

* fewshot as multiturn

* typing update

* indent update

* added comments

* Adding a fewshot in a more readable way

* linting

* Moved apply chat template to LM

* multiturn alternation fix

* cache key update

* apply chat template method fix

* add system prompt hash to cache_key

* tokenizer name property for cache_key

* property name fix

* linting backward compatibility fix

* docs and errors update

* add documentation on adding chat template compatibility to model_guide

* fewshot as multiturn check fix

* saving system inst and chat template in results

* eval tracker update

* docs update

* Apply suggestions from code review
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

---------
Co-authored-by: haileyschoelkopf <hailey@eleuther.ai>
Co-authored-by: Clémentine Fourrier <22726840+clefourrier@users.noreply.github.com>
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

070d31df

30 May, 2024 1 commit

[HFLM]Add support for Ascend NPU (#1886) · 8f716817

Huazhong Ji authored May 31, 2024



* [HFLM]Add support for Ascend NPU
Co-authored-by: jiaqiw09 <jiaqiw960714@gmail.com>
Co-authored-by: zhabuye <2947436155@qq.com>

* bump accelerate dependency version to 0.26.0 for NPU compat.

---------
Co-authored-by: jiaqiw09 <jiaqiw960714@gmail.com>
Co-authored-by: zhabuye <2947436155@qq.com>
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

8f716817

28 May, 2024 1 commit
- Updated vllm imports in vllm_causallms.py (#1890) · b4cd85d4
  Michael Goin authored May 28, 2024
```
* Reorder vllm imports in vllm_causallms.py

* Update vllm_causallms.py
```
  b4cd85d4
24 May, 2024 1 commit
- [HFLM]Use Accelerate's API to reduce hard-coded CUDA code (#1880) · c4c15917
  Huazhong Ji authored May 24, 2024
  
  c4c15917
23 May, 2024 1 commit
- Unpin vllm in dependencies (#1874) · 5711ab87
  Edward Gan authored May 23, 2024
  
  5711ab87
19 May, 2024 1 commit

Fix: support PEFT/LoRA with added tokens (#1828) · 86319a9b

Nick Doiron authored May 19, 2024



* resize model embeddings

* resize only

* tokenizer help

* load tokenizer before model

* add comment and run precommit lint

* Add log message
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

---------
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

86319a9b

07 May, 2024 2 commits
- Logging Updates (Alphabetize table printouts, fix eval tracker bug) (#1774) (#1791) · d4a913c4
  Hailey Schoelkopf authored May 07, 2024
```
* fix auto-batch size bug for seq2seq models

* alphabetize task + group tables ; fix eval tracker bug

* fix eval tracker bug
```
  d4a913c4
- Fix Caching Tests ; Remove `pretrained=gpt2` default (#1775) · 7fe2b93c
  Hailey Schoelkopf authored May 07, 2024
  
  7fe2b93c
05 May, 2024 2 commits
- Fix bug in setting until kwarg in openai completions (#1784) · 30c060d2
  ciaranby authored May 05, 2024
  
  30c060d2
- remove echo parameter in OpenAI completions API (#1779) · c34986da
  kwrobel.eth authored May 05, 2024
```
* remove echo parameter in OpenAI completions API

* remove context length parameter doc string
```
  c34986da
03 May, 2024 1 commit

evaluation tracker implementation (#1766) · 59cf408a

KonradSzafer authored May 03, 2024

* evaluation tracker implementation

* OVModelForCausalLM test fix

* typo fix

* moved methods args

* multiple args in one flag

* loggers moved to dedicated dir

* improved filename sanitization

59cf408a

02 May, 2024 2 commits
- Add option to set OpenVINO config (#1730) · e6394715
  Helena Kloosterman authored May 02, 2024
```
* Add option to set OpenVINO config

* Use utils.eval_logger for logging
```
  e6394715
- vllm lora support (#1756) · 83fd78a2
  bcicc authored May 02, 2024
```
* vllm lora support

* remove print

* version check, rename lora kwarg
```
  83fd78a2
18 Apr, 2024 1 commit
- fix error when appending eot_token_id for generate_until tasks (#1699) · dc5eba86
  Sergio Perez authored Apr 18, 2024
  
  dc5eba86
16 Apr, 2024 2 commits

Add `neuralmagic` models for `sparseml` and `deepsparse` (#1674) · 8b326be7

Michael Goin authored Apr 16, 2024



* Add neuralmagic models for SparseML and DeepSparse

* Update to latest and add test

* Format

* Fix list to List

* Format

* Add deepsparse/sparseml to automated testing

* Update pyproject.toml

* Update pyproject.toml

* Update README

* Fixes for dtype and device

* Format

* Fix test

* Apply suggestions from code review
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

* Address review comments!

---------
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

8b326be7

Add delta weights model loading (#1712) · 12a165d1

KonradSzafer authored Apr 16, 2024

* added delta weights

* removed debug

* readme update

* better error handling

* autogptq warn

* warn update

* peft and delta error, explicitly deleting _model_delta

* linter fix

12a165d1

05 Apr, 2024 1 commit

Anthropic Chat API (#1594) · 27924d77

Seungwoo Ryu authored Apr 06, 2024



* claude3

* supply for anthropic claude3

* supply for anthropic claude3

* anthropic config changes

* add callback options on anthropic

* line passed

* claude3 tiny change

* help anthropic installation

* mention sysprompt / being careful with format in readme

---------
Co-authored-by: haileyschoelkopf <hailey@eleuther.ai>

27924d77

01 Apr, 2024 1 commit

Fix CLI --batch_size arg for openai-completions/local-completions (#1656) · 9516087b

Michael Goin authored Apr 01, 2024

The OpenAI interface supports batch size as an argument to the completions API, but specifying it on the CLI (e.g. `lm_eval --model openai-completions --batch_size 16 ...`) does not work, because the value is never converted from str to int.

This is confirmed by my usage and stacktrace from running `OPENAI_API_KEY=dummy lm_eval --model local-completions --tasks gsm8k --batch_size 16 --model_args model=nm-testing/zephyr-beta-7b-gptq-g128,tokenizer_backend=huggingface,base_url=http://localhost:8000/v1`:
```
Traceback (most recent call last):
  File "/home/michael/venv/bin/lm_eval", line 8, in <module>
    sys.exit(cli_evaluate())
  File "/home/michael/code/lm-evaluation-harness/lm_eval/__main__.py", line 341, in cli_evaluate
    results = evaluator.simple_evaluate(
  File "/home/michael/code/lm-evaluation-harness/lm_eval/utils.py", line 288, in _wrapper
    return fn(*args, **kwargs)
  File "/home/michael/code/lm-evaluation-harness/lm_eval/evaluator.py", line 251, in simple_evaluate
    results = evaluate(
  File "/home/michael/code/lm-evaluation-harness/lm_eval/utils.py", line 288, in _wrapper
    return fn(*args, **kwargs)
  File "/home/michael/code/lm-evaluation-harness/lm_eval/evaluator.py", line 390, in evaluate
    resps = getattr(lm, reqtype)(cloned_reqs)
  File "/home/michael/code/lm-evaluation-harness/lm_eval/models/openai_completions.py", line 263, in generate_until
    list(sameuntil_chunks(re_ord.get_reordered(), self.batch_size)),
  File "/home/michael/code/lm-evaluation-harness/lm_eval/models/openai_completions.py", line 251, in sameuntil_chunks
    if len(ret) >= size or x[1] != lastuntil:
TypeError: '>=' not supported between instances of 'int' and 'str'
```

9516087b
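
The traceback in this commit comes down to comparing an `int` with a `str`. Below is a minimal sketch of that failure and of the kind of conversion that avoids it. The helpers `parse_batch_size` and `chunk_by_size` are hypothetical stand-ins written for illustration, not the harness's actual `sameuntil_chunks` or its real fix.

```python
from typing import Iterable, Iterator, List, Union


def parse_batch_size(raw: Union[int, str]) -> int:
    """Coerce a CLI-supplied batch size (often a string) to an int."""
    return int(raw)


def chunk_by_size(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` items (hypothetical helper)."""
    chunk: List[str] = []
    for item in items:
        chunk.append(item)
        if len(chunk) >= size:  # raises TypeError when `size` is a str
            yield chunk
            chunk = []
    if chunk:
        yield chunk


requests = ["a", "b", "c", "d", "e"]

# Passing the raw CLI string reproduces the same class of error.
try:
    list(chunk_by_size(requests, "2"))  # type: ignore[arg-type]
    raised = False
except TypeError:
    raised = True

# Converting first lets the comparison work as intended.
chunks = list(chunk_by_size(requests, parse_batch_size("2")))
```

Converting once at the boundary where the argument enters the program keeps the rest of the code free to assume an `int`.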

27 Mar, 2024 1 commit
- Fix conditional import for Nemo LM class (#1641) · 0dffdbb4
  Hailey Schoelkopf authored Mar 27, 2024
  
  0dffdbb4
26 Mar, 2024 1 commit

Integration of NeMo models into LM Evaluation Harness library (#1598) · e9d429e1

Sergio Perez authored Mar 26, 2024

* Integration of NeMo models into LM Evaluation Harness library

* rename nemo model as nemo_lm

* move nemo section in readme after hf section

* use self.eot_token_id in get_until()

* improve progress bar showing loglikelihood requests

* data replication or tensor/pipeline replication working fine within one node

* run pre-commit on modified files

* check whether dependencies are installed

* clarify usage of torchrun in README

e9d429e1

25 Mar, 2024 2 commits

Seq2seq fix (#1604) · 262f879a

Lintang Sutawika authored Mar 25, 2024



* fix on --task list

* add fixes to tokenization

* differentiate encoding for seq2seq and decoder

* return token setting

* format for pre-commit

* Seq2seq fix, pt2 (#1630)

* getting model class only when defined

* encode_pair handles None, add_special_tokens turned into dict with default value

---------
Co-authored-by: achervyakov <77295913+artemorloff@users.noreply.github.com>

262f879a

peft Version Assertion (#1635) · 8e72f267
WoosungMyung authored Mar 26, 2024
```
* peft Version Assertion

* fix the linter issue
```
8e72f267

21 Mar, 2024 1 commit
- OpenAI Completions -- fix passing of unexpected 'until' arg (#1612) · 34c9b7e4
  Hailey Schoelkopf authored Mar 21, 2024
  
  34c9b7e4
20 Mar, 2024 1 commit

Fixes to Loglikelihood prefix token / VLLM (#1611) · c7b03ad4

Hailey Schoelkopf authored Mar 20, 2024

* make vllm use prefix_token_id ; have prefix_token_id be optional method to define

* custom_prefix_token_id wasn't set if not passed

c7b03ad4

19 Mar, 2024 2 commits
- fix until arg processing (#1608) · d4b8fc13
  achervyakov authored Mar 20, 2024
  
  d4b8fc13
- Revert "Patch for Seq2Seq Model predictions (#1584)" (#1601) · f871646f
  Hailey Schoelkopf authored Mar 19, 2024
```
This reverts commit b7923a84.
```
  f871646f
18 Mar, 2024 1 commit

use BOS token in loglikelihood (#1588) · a4192489

kwrobel.eth authored Mar 18, 2024



* use BOS token in loglikelihood

* improve comments

* add model arg

* log prefix token id

* log prefix token id

* Update lm_eval/api/model.py
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

* change name to prefix_token_id

---------
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

a4192489

17 Mar, 2024 1 commit

Patch for Seq2Seq Model predictions (#1584) · b7923a84

Lintang Sutawika authored Mar 18, 2024



* Differentiate _encode_pair setting for decoder and enc-dec models

* tok_decode to not skip special token so that eos doesn't become empty string

* Update model.py

* Update model.py

* Update huggingface.py

* Update lm_eval/models/huggingface.py
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

* Update model.py

---------
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

b7923a84

13 Mar, 2024 1 commit

add manual tqdm disabling management (#1569) · e74ec966

achervyakov authored Mar 13, 2024



* add manual tqdm disabling management

* add typing to all new args

* apply precommit changes

---------
Co-authored-by: haileyschoelkopf <hailey@eleuther.ai>

e74ec966

09 Mar, 2024 1 commit

Add compatibility for vLLM's new Logprob object (#1549) · 8051d954

Antoni Baum authored Mar 09, 2024



* Add compatibility for vLLM's new Logprob object

* Fix

* Update lm_eval/models/vllm_causallms.py

* fix format?

* trailing whitespace

---------
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

8051d954

06 Mar, 2024 1 commit

Update installation commands in openai_completions.py and contributing... · 9e6e2402

Sungho Park authored Mar 07, 2024


Update installation commands in openai_completions.py and contributing document, and update wandb_args description (#1536)

* Update openai completions and docs/CONTRIBUTING.md

* Update wandb args description

* Update docs/interface.md

---------
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

9e6e2402

03 Mar, 2024 1 commit

Vllm update DP+TP (#1508) · e5e35fca

Baber Abbasi authored Mar 03, 2024

* use `@ray.remote` with distributed vLLM

* update versions

* bugfix

* unpin vllm

* fix pre-commit

* added version assertion error

* Revert "added version assertion error"

This reverts commit 8041e9b78e95eea9f4f4d0dc260115ba8698e9cc.

* added version assertion for DP

* expand DP note

* add warning

* nit

* pin vllm

* fix typos

e5e35fca

01 Mar, 2024 2 commits
- Improve data-parallel request partitioning for VLLM (#1477) · 27a3da96
  Hailey Schoelkopf authored Mar 01, 2024
```
* add undistribute + use more_itertools

* remove divide() util fn

* add more_itertools as dependency
```
  27a3da96
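
As a rough illustration of the partitioning idea named in this commit, here is a stdlib-only sketch of round-robin distribution and its inverse. It assumes a round-robin scheme; the functions `distribute` and `undistribute` below are illustrative re-implementations and are not taken from the harness or from `more_itertools`.

```python
from itertools import chain, zip_longest
from typing import List, Sequence, TypeVar

T = TypeVar("T")
_MISSING = object()  # sentinel used to pad shorter partitions


def distribute(n: int, items: Sequence[T]) -> List[List[T]]:
    """Split `items` round-robin into `n` partitions (illustrative)."""
    return [list(items[i::n]) for i in range(n)]


def undistribute(parts: Sequence[Sequence[T]]) -> List[T]:
    """Interleave round-robin partitions back into the original order."""
    interleaved = chain.from_iterable(zip_longest(*parts, fillvalue=_MISSING))
    return [x for x in interleaved if x is not _MISSING]


requests = list(range(7))
parts = distribute(3, requests)
restored = undistribute(parts)
```

Round-robin assignment keeps partition sizes within one item of each other, and the interleaving step restores the original request order after each worker returns its results.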
- always include EOS token in stop sequences if possible (#1480) · 284dd80d
  Hailey Schoelkopf authored Mar 01, 2024
  
  284dd80d
28 Feb, 2024 1 commit
- fix duplicated kwargs in some model init (#1495) · b177c82c
  Linsong Chu authored Feb 28, 2024
  
  b177c82c
27 Feb, 2024 2 commits

Fix AttributeError in huggingface.py When 'model_type' is Missing (#1489) · cc771eca

Rich authored Feb 27, 2024



* model_type attribute error

Getting attribute error when using a model without a 'model_type'

* fix w/ and w/out the 'model_type' specification

* use getattr(), also fix other config.model_type reference

* Update huggingface.py

---------
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

cc771eca
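
The `getattr()` approach mentioned in this commit can be sketched as follows. This is a minimal illustration, assuming a config object that may or may not define `model_type`. `get_model_type` is a hypothetical helper, and `SimpleNamespace` stands in for a real model config.

```python
from types import SimpleNamespace
from typing import Any, Optional


def get_model_type(config: Any) -> Optional[str]:
    """Return config.model_type, or None when the attribute is absent."""
    return getattr(config, "model_type", None)


with_type = SimpleNamespace(model_type="llama")
without_type = SimpleNamespace()  # stand-in for a config lacking model_type

print(get_model_type(with_type))
print(get_model_type(without_type))
```

Using `getattr` with a default avoids the `AttributeError` that direct attribute access would raise on the second object.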

Refactor `evaluator.evaluate` (#1441) · 5ccd65d4

Baber Abbasi authored Feb 27, 2024



* change `all_gather` to `gather`

* add TaskOutput utility class

* Add FilterResults class and refactor task handling.

* Rename `key` to `filter_key` for clarity

* Add `print_writeout` function in utils.py

* Add function to calculate limit size.

* Add doc_iterator method to Task class

* Refactor `doc_iterator` and cleanup in Task class

* remove superfluous bits

* change `all_gather` to `gather`

* bugfix

* bugfix

* fix `gather`

* Refactor `gather` loop

* Refactor aggregate metrics calculation

* Refactor and simplify aggregate metrics calculation
Removed unused code

* Simplify metrics calculation and remove unused code.

* simplify the metrics calculation in `utils.py` and `evaluator.py`.

* Fix group metric

* change evaluate to hf_evaluate

* change evaluate to hf_evaluate

* add docs

* add docs

* nits

* make isslice keyword only

* nit

* add todo

* nit

* nit

* nit: swap order samples_metrics tuple

* move instance sorting outside loop

* nit

* nit

* Add __repr__ for ConfigurableTask

* nit

* nit

* Revert "nit"

This reverts commit dab8d9977a643752a17f840fd8cf7e4b107df28f.

* fix some logging

* nit

* fix `predict_only` bug. thanks to `@LSinev`!

* change `print_tasks` to `prepare_print_tasks`

* nits

* move eval utils

* move eval utils

* nit

* add comment

* added tqdm descriptions

* Update lm_eval/evaluator_utils.py
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

* fix mgsm bug

* nit

* fix `build_all_requests`

* pre-commit

* add ceil to limit

---------
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

5ccd65d4

26 Feb, 2024 1 commit
- Revert "setting trust_remote_code (#1467)" (#1474) · f6befdb9
  Hailey Schoelkopf authored Feb 26, 2024
```
This reverts commit c1145dfd.
```
  f6befdb9