- 06 Jul, 2025 1 commit
Baber Abbasi authored
- 03 Jul, 2025 1 commit
Ankush authored
* fix(hf-gguf): skip gguf_file if external tokenizer is provided
* docs(readme): add instructions for evaluating GGUF models with Hugging Face backend (see the sketch below)
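For reference, a minimal sketch of how a GGUF checkpoint is loaded through transformers while a separate tokenizer is supplied, which is why gguf_file does not need to be passed to the tokenizer when an external one is provided. The repo and file names are placeholders, and this is not the harness's actual loading code.

```python
# Sketch: load GGUF weights via transformers, but take the tokenizer from a
# separate (external) repo. Names below are placeholders, not from the commit.
from transformers import AutoModelForCausalLM, AutoTokenizer

gguf_repo = "some-org/some-model-GGUF"   # hypothetical GGUF repo
gguf_file = "model.Q4_K_M.gguf"          # hypothetical quantized file

# The model weights come from the GGUF file ...
model = AutoModelForCausalLM.from_pretrained(gguf_repo, gguf_file=gguf_file)

# ... but when an external tokenizer is given, gguf_file is skipped for it.
tokenizer = AutoTokenizer.from_pretrained("some-org/base-model")
```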
- 30 Jun, 2025 1 commit
Baber Abbasi authored
* Try fixing issue 3026, which is caused by the quantization_config argument introduced in commit 758c5ed8. The argument is a dict, but for a GPTQ-quantized model this conflicts with the Hugging Face interface, which expects a QuantizationConfigMixin. The initial fix removes the quantization_config argument in HFLM._create_model() in lm_eval/models/huggingface.py; the follow-up items below restore the functionality provided by the previous commit.
* wrap quantization_config in AutoQuantizationConfig
* handle quantization_config when it is not a dict
* wrap quantization_config in AutoQuantizationConfig if it is a dict (sketched below)
Co-authored-by: shanhx2000 <hs359@duke.edu>
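A minimal sketch of the dict-wrapping idea from the follow-up items, assuming transformers' AutoQuantizationConfig; the exact integration inside HFLM._create_model() may differ.

```python
# Sketch: wrap a plain-dict quantization config before passing it to
# from_pretrained(), which expects a QuantizationConfigMixin instance.
from transformers.quantizers.auto import AutoQuantizationConfig

def normalize_quantization_config(quantization_config):
    """Wrap dicts in the matching config class; pass other values through."""
    if isinstance(quantization_config, dict):
        # Dispatches on the "quant_method" key (e.g. gptq, awq, torchao).
        return AutoQuantizationConfig.from_dict(quantization_config)
    return quantization_config
```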
- 25 Jun, 2025 2 commits
Younes B authored
* add subfolder
* lint
* change it to empty string
* fix typehints
Co-authored-by: Baber <baber@hey.com>
Baber Abbasi authored
- 23 Jun, 2025 1 commit
NourFahmy authored
* Fix Anthropic API compatibility issues in chat completions. This solves two compatibility issues between the LM Eval Harness and Anthropic's API: 1) the type field: Anthropic's Messages API doesn't accept the type field that other APIs expect and that was previously included; 2) stop sequences: Anthropic requires stop sequences to contain non-whitespace characters. Tested with the most recent Anthropic models (claude-sonnet-4-0, claude-opus-4-0); this resolved local API errors. (Both fixes are sketched below.)
* pacify pre-commit
* add type
Co-authored-by: Baber <baber@hey.com>
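A self-contained sketch of the two fixes described above (dropping the type field from messages and filtering whitespace-only stop sequences). The function name and payload shape are illustrative, not the harness's actual request code.

```python
# Sketch: sanitize a chat-completion request for Anthropic's Messages API.
def sanitize_anthropic_payload(messages, stop=None):
    # 1) Strip the "type" field that OpenAI-style payloads may carry per
    #    message; Anthropic's Messages API rejects it.
    clean_messages = [
        {k: v for k, v in msg.items() if k != "type"} for msg in messages
    ]
    # 2) Anthropic requires stop sequences to contain non-whitespace
    #    characters, so drop whitespace-only entries such as "\n".
    clean_stop = [s for s in (stop or []) if s.strip()]
    return {"messages": clean_messages, "stop_sequences": clean_stop}
```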
- 08 Jun, 2025 1 commit
Baber Abbasi authored
* use all answers
* use middle truncation (sketched below)
* maybe fix classification score
* strip classification preds
* [vllm] remove stop tokens post-hoc
* strip all preds
* pacify pre-commit
* start on truncation utility
* add to readme
* add a footgun doc
* fix newline in yaml templates
* do not strip code_sim preds!
* fix pre-commit config
* fix instruction warning
* add note to longbench readme
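Middle truncation keeps the head and tail of an over-long token sequence and drops the middle, so both the instruction at the start and the question at the end survive. A minimal sketch of the general technique (not necessarily the exact utility added here):

```python
# Sketch: middle truncation of a token sequence to max_len tokens.
def truncate_middle(tokens: list[int], max_len: int) -> list[int]:
    if len(tokens) <= max_len:
        return tokens
    half = max_len // 2
    # Keep the first `half` tokens and the last `max_len - half` tokens.
    return tokens[:half] + tokens[len(tokens) - (max_len - half):]
```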
- 03 Jun, 2025 1 commit
Younes B authored
- 02 Jun, 2025 1 commit
Yury Sulsky authored
- 26 May, 2025 1 commit
Baber Abbasi authored
* add data_parallel for V1
* use Process instead of Queue
* ray used if V0 DP
* better error handling
* fix truncation warning comparison
- 23 May, 2025 2 commits
Ameya Godbole authored
* Fix error caused by grouping queries with different continuation lengths: make Collator choose the query with the longest continuation as the candidate for generation (see the sketch below)
* use max for key selection
* added comments explaining variable continuation length (identical ctx + cont[:-1])
Co-authored-by: Baber <baber@hey.com>
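A minimal sketch of the selection rule: within a group of requests that share the same tokenized prefix, use max to pick the member with the longest continuation as the one actually run. Names are illustrative.

```python
# Sketch: choose the generation candidate for a group of
# (context_tokens, continuation_tokens) pairs by longest continuation.
def pick_generation_candidate(group):
    return max(group, key=lambda pair: len(pair[1]))

# Example: both requests share a context; the longer continuation wins.
group = [([1, 2, 3], [4, 5]), ([1, 2, 3], [4, 5, 6])]
assert pick_generation_candidate(group) == ([1, 2, 3], [4, 5, 6])
```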
fxmarty-amd authored
* fix arguments
* pacify pre-commit
Co-authored-by: Baber <baber@hey.com>
- 21 May, 2025 3 commits
achervyakov authored
* first version of image resizing
* fixed bug
* clean up `resize_image`
Co-authored-by: Artem Safin <artemsafin67@gmail.com>
Co-authored-by: Baber <baber@hey.com>
Baber Abbasi authored
* use images with apis * pacify pre-commit
Rob Geada authored
* Log tokenized request warning only once (see the sketch below)
* Fix logging for the concurrent use case as well
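A minimal, thread-safe warn-once guard in the spirit of this change (the harness's actual implementation may differ):

```python
# Sketch: emit a given warning at most once per process, even when requests
# are tokenized concurrently from multiple threads.
import logging
import threading

logger = logging.getLogger(__name__)
_warned = False
_warn_lock = threading.Lock()

def warn_tokenized_request_once(message: str) -> None:
    global _warned
    with _warn_lock:
        if _warned:
            return
        _warned = True
    logger.warning(message)
```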
- 19 May, 2025 1 commit
Baber Abbasi authored
* add `sglang-generate` * nit * nit * nit * pacify pre-commit
- 15 May, 2025 1 commit
Filippo Momentè authored
* fix: pass device arg in model_args in vllm_causallms
* casting device arg to str in vLLM model args
- 10 May, 2025 1 commit
Sungjae Lee authored
- 09 May, 2025 1 commit
Baber Abbasi authored
- 06 May, 2025 1 commit
Alexandre Marques authored
- 18 Apr, 2025 1 commit
Avelina9X authored
* Added softmax_dtype argument to coerce log_softmax computations (sketched below)
* move softmax_dtype
Co-authored-by: Baber <baber@hey.com>
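What a softmax_dtype option does, in a minimal PyTorch sketch: compute log_softmax in a wider dtype (e.g. float32) even when the model emits float16/bfloat16 logits. The actual argument wiring in the harness may differ.

```python
# Sketch: coerce the log_softmax computation to a chosen dtype.
import torch
import torch.nn.functional as F

def logprobs(logits: torch.Tensor, softmax_dtype: torch.dtype | None = None) -> torch.Tensor:
    if softmax_dtype is not None:
        # F.log_softmax upcasts internally via its dtype argument.
        return F.log_softmax(logits, dim=-1, dtype=softmax_dtype)
    return F.log_softmax(logits, dim=-1)

# Half-precision logits in, full-precision log-probabilities out.
x = torch.randn(2, 8, dtype=torch.float16)
assert logprobs(x, softmax_dtype=torch.float32).dtype == torch.float32
```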
- 16 Apr, 2025 2 commits
achervyakov authored
Baber Abbasi authored
* fix resolve_hf_chat_template version * pre-commit
- 15 Apr, 2025 1 commit
Jerry Zhang authored
* Add support for quantization_config. Summary: previously quantization_config was ignored, so torchao-quantized models were not supported; this PR adds that support. Test Plan: lm_eval --model hf --model_args pretrained=jerryzh168/gemma3-int4wo --tasks hellaswag --device cuda:0 --batch_size 8 (Python equivalent sketched below)
* quantization_config is optional
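The test plan above uses the CLI; an equivalent call through the Python API might look like the following sketch (assuming lm_eval.simple_evaluate; the arguments simply mirror the command line).

```python
# Sketch: the same evaluation as the CLI test plan, via the Python API.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=jerryzh168/gemma3-int4wo",  # torchao-quantized checkpoint from the test plan
    tasks=["hellaswag"],
    device="cuda:0",
    batch_size=8,
)
print(results["results"]["hellaswag"])
```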
- 14 Apr, 2025 1 commit
Alexandre Marques authored
* Add support for chat templates defined outside of tokenizer_config.json, as supported by vLLM (see the sketch below)
* Update template name to avoid conflict with other variable
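For illustration, a minimal sketch of using a chat template that lives outside tokenizer_config.json: read the Jinja template from a file and assign it to the tokenizer before applying it. The file name and model are placeholders; how lm-eval/vLLM plumb the template through is not shown.

```python
# Sketch: apply a chat template stored in a standalone .jinja file instead of
# one embedded in tokenizer_config.json.
from pathlib import Path
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("some-org/some-model")     # placeholder model
tokenizer.chat_template = Path("chat_template.jinja").read_text()    # placeholder file

prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Hello!"}],
    tokenize=False,
    add_generation_prompt=True,
)
```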
- 04 Apr, 2025 1 commit
Nikodem Szwast authored
* update authentication methods, add support for deployment_id
* run pre-commit on changed file
- 20 Mar, 2025 2 commits
Baber Abbasi authored
Yifei Zhang authored
- 18 Mar, 2025 1 commit
Baber Abbasi authored
* add min_pixels, max_pixels * fix
- 17 Mar, 2025 1 commit
Kiersten Stokes authored
* Add support for token-based auth for watsonx models
* Fix lint
* Move dotenv import to inner scope
* Improve readability of _verify_credentials
- 14 Mar, 2025 2 commits
achervyakov authored
* Added audio-modality pipeline for the qwen2-audio model
* Beautify imports
* fix apply_chat_template args
* update default audio placeholders list
* add demo task - common_voice subset
* add audiolm_qwen libs to pyproject.toml
* pre-commit beautify
Co-authored-by: Alexandra Rak <rakalexandra@mail.ru>
daniel-salib authored
- 11 Mar, 2025 1 commit
Baber Abbasi authored
- 04 Mar, 2025 1 commit
Lucia Quirke authored
* Enable steering HF models (general pattern sketched below)
* increase HF download timeout
* Update readme; improve steering vector device handling
* Update latest news
* remove HF timeout increase
* fix tests
* ignore sae lens test
* fix accidental force push
Co-authored-by: Matthew Khoriaty <matthewkhoriaty2026@u.northwestern.edu>
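Steering HF models is commonly done by adding a fixed vector to one layer's hidden states during the forward pass. The sketch below shows that general activation-steering pattern with a forward hook; it is not the code added in this commit, and the model, layer index, scale, and vector are placeholders.

```python
# Sketch: add a steering vector to the hidden states of one decoder block
# via a forward hook (general pattern; placeholders only).
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")        # placeholder model
steering_vector = torch.randn(model.config.hidden_size)     # placeholder vector
scale = 4.0

def steer(module, inputs, output):
    # GPT-2 blocks return a tuple whose first element is the hidden states.
    hidden = output[0] + scale * steering_vector.to(output[0])
    return (hidden,) + output[1:]

handle = model.transformer.h[6].register_forward_hook(steer)  # placeholder layer
# ... run the evaluation with steering active ...
handle.remove()
```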
- 27 Feb, 2025 1 commit
Baber Abbasi authored
* remove ray.remote resources
* remove kobest tag (registered as group)
- 25 Feb, 2025 1 commit
Jinwei authored
* initial components to support sglang
* init of class SGLangLM
* draft for generate_until of SGLang model
* mock loglikelihood
* initial loglikelihood_tokens
* todo: fix bug of sglang engine init
* implement generation tasks and test
* support output types loglikelihood and loglikelihood_rolling (#1)
* support dp_size > 1
* fix typo
* add tests and clean code
* skip tests of sglang for now
* fix OOM error of sglang pytest
* finish tests for sglang
* add sglang to readme (usage sketched below)
* fix OOM of tests and clean SGLang model
* update readme
* clean pyproject and add tests for evaluator
* add accuracy tests (passed locally)
* add notes for test
* Update README.md
* pre-commit
Co-authored-by: Xiaotong Jiang <xiaotong.jiang@databricks.com>
Co-authored-by: Baber Abbasi <92168766+baberabb@users.noreply.github.com>
Co-authored-by: Baber <baber@hey.com>
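Per the README additions mentioned above, the SGLang backend is selected like any other backend. A hedged sketch via the Python API; the pretrained path is a placeholder and the dp_size option is taken from the commit notes.

```python
# Sketch: run an evaluation against the SGLang backend through the Python API.
import lm_eval

results = lm_eval.simple_evaluate(
    model="sglang",
    model_args="pretrained=meta-llama/Llama-3.1-8B-Instruct,dp_size=1",  # placeholder model path
    tasks=["gsm8k"],
    batch_size=8,
)
```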
- 24 Feb, 2025 1 commit
Jocelyn authored
* add o3-mini support * fix linter tests
- 21 Feb, 2025 1 commit
Lintang Sutawika authored
* changed source of eval_logger
* allow eval_logger to be set from args
* removed verbosity arg from non-main methods
* fix logging
* pre-commit
* set verbosity in eval logger
* replace utils.eval_logger
* fix logging in main
* add logging to docs
* add logging message
* nit
* add logging to docs
* refactor setup_logging to utils (sketched below)
Co-authored-by: Baber <baber@hey.com>
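A minimal sketch of what a setup_logging utility of this kind typically does: configure the package logger once, with verbosity taken from the CLI args instead of being threaded through every method. Illustrative only, not the exact helper added here.

```python
# Sketch: one-time logging setup keyed off a verbosity string from the CLI.
import logging

def setup_logging(verbosity: str = "INFO") -> logging.Logger:
    logging.basicConfig(
        level=getattr(logging, verbosity.upper(), logging.INFO),
        format="%(asctime)s %(levelname)s [%(name)s] %(message)s",
    )
    return logging.getLogger("lm-eval")

eval_logger = setup_logging("DEBUG")
eval_logger.debug("logging configured")
```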
- 17 Feb, 2025 1 commit
Baber Abbasi authored
* fix vllm * fix data_parallel * copy to multimodal
- 12 Feb, 2025 1 commit
achervyakov authored