1. 06 Jul, 2025 1 commit
  2. 03 Jul, 2025 1 commit
    • Ankush's avatar
      Bugfix/hf tokenizer gguf override (#3098) · ff41a856
      Ankush authored
      * fix(hf-gguf): skip gguf_file if external tokenizer is provided
      
      * docs(readme): add instructions for evaluating GGUF models with Hugging Face backend
      ff41a856
  3. 30 Jun, 2025 1 commit
    • Baber Abbasi's avatar
      [HF] fix quantization config (#3039) · fea4d11d
      Baber Abbasi authored
      * Try fixing issue 3026 which is caused by the quantization_config argument introduced in Commit 758c5ed8
      
      .
      The argument is in Dict type, but for a GPTQ quantized model, it has a conflict with the huggingface interface which expects QuantizationConfigMixin type.
      Current solution is removing quantization_config argument in HFLM._create_model() of lm_eval/models/huggingface.py.
      Require further modification to restore the functionality provided by the previous commit.
      
      * wrap quantization_config in AutoQuantizationConfig
      
      * handle quantization config not dict
      
      * wrap quantization_config in AutoQuantizationConfig if dict
      
      ---------
      Co-authored-by: default avatarshanhx2000 <hs359@duke.edu>
      fea4d11d
  4. 25 Jun, 2025 2 commits
  5. 23 Jun, 2025 1 commit
    • NourFahmy's avatar
      Fix Anthropic API compatibility issues in chat completions (#3054) · 8bc46207
      NourFahmy authored
      
      
      * Fix Anthropic API compatibility issues in chat completions
      
      solves two important compatibility issues between the LM Eval Harness and Anthropic's API:
      
      1) The type field issue - Anthropic's Messages API doesn't accept the type field that other APIs might expect, that was previously included
      2) The stop sequences issue - Anthropic requires stop sequences to contain non-whitespace characters
      
      tested with most recent models from anthopic; claude-sonnet-4-0, claude-opus-4-0, resolved my local api errors
      
      * pacufy pre-commit
      
      * add type
      
      ---------
      Co-authored-by: default avatarBaber <baber@hey.com>
      8bc46207
  6. 08 Jun, 2025 1 commit
    • Baber Abbasi's avatar
      [longbench] fix metric calculation (#2983) · 147e9d61
      Baber Abbasi authored
      * use all answers
      
      * use middle truncation
      
      * maybe fix classification score
      
      * strip classification preds
      
      * [vllm] remove stop tokens post-hoc
      
      * strip all preds
      
      * pacify pre-commit
      
      * start on truncation utility
      
      * add to readme
      
      * add a footgun doc
      
      * fix newline in yaml templates
      
      * do not strip code_sim preds!
      
      * fix pre-commit config
      
      * fix instruction warning
      
      * add not to longbench readme
      147e9d61
  7. 03 Jun, 2025 1 commit
  8. 02 Jun, 2025 1 commit
  9. 26 May, 2025 1 commit
  10. 23 May, 2025 2 commits
  11. 21 May, 2025 3 commits
  12. 19 May, 2025 1 commit
  13. 15 May, 2025 1 commit
  14. 10 May, 2025 1 commit
  15. 09 May, 2025 1 commit
  16. 06 May, 2025 1 commit
  17. 18 Apr, 2025 1 commit
  18. 16 Apr, 2025 2 commits
  19. 15 Apr, 2025 1 commit
    • Jerry Zhang's avatar
      Add support for quantization_config (#2842) · 758c5ed8
      Jerry Zhang authored
      * Add support for quantization_config
      
      Summary:
      Previously quantization_config is ignored, so torchao quantized models are not supported,
      this PR adds that.
      
      Test Plan:
      lm_eval --model hf --model_args pretrained=jerryzh168/gemma3-int4wo --tasks hellaswag --device cuda:0 --batch_size 8
      
      Reviewers:
      
      Subscribers:
      
      Tasks:
      
      Tags:
      
      * quantization_config is optional
      758c5ed8
  20. 14 Apr, 2025 1 commit
  21. 04 Apr, 2025 1 commit
  22. 20 Mar, 2025 2 commits
  23. 18 Mar, 2025 1 commit
  24. 17 Mar, 2025 1 commit
  25. 14 Mar, 2025 2 commits
  26. 11 Mar, 2025 1 commit
  27. 04 Mar, 2025 1 commit
  28. 27 Feb, 2025 1 commit
  29. 25 Feb, 2025 1 commit
    • Jinwei's avatar
      Support SGLang as Potential Backend for Evaluation (#2703) · 29971faa
      Jinwei authored
      
      
      * initial components to support sglang
      
      * init of class SGLangLM
      
      * draft for generate_until of SGLang model
      
      * mock loglikelihood
      
      * initial loglikelihood_tokens
      
      * todo: fix bug of sglang engine init
      
      * implement generation tasks and test
      
      * support output type loglikelihood and loglikelihood_rolling (#1)
      
      * .
      
      * loglikelihood_rolling
      
      * /
      
      * support dp_size>1
      
      * typo
      
      * add tests and clean code
      
      * skip tests of sglang for now
      
      * fix OOM error of sglang pytest
      
      * finish test for sglang
      
      * add sglang to readme
      
      * fix OOM of tests and clean SGLang model
      
      * update readme
      
      * clean pyproject and add tests for evaluator
      
      * add accuracy tests and it passed locally
      
      * add notes for test
      
      * Update README.md
      
      update readme
      
      * pre-commit
      
      ---------
      Co-authored-by: default avatarXiaotong Jiang <xiaotong.jiang@databricks.com>
      Co-authored-by: default avatarBaber Abbasi <92168766+baberabb@users.noreply.github.com>
      Co-authored-by: default avatarBaber <baber@hey.com>
      29971faa
  30. 24 Feb, 2025 1 commit
  31. 21 Feb, 2025 1 commit
    • Lintang Sutawika's avatar
      Logging (#2203) · 1ba35e62
      Lintang Sutawika authored
      
      
      * changed source of eval_logger
      
      * allow eval_logger to be set from args
      
      * removed verbosity arg from non-main methods
      
      * fix logging
      
      * pre-commit
      
      * set verbosity in eval logger
      
      * replace utils.eval_logger
      
      * fix logging in main
      
      * add logging to docs
      
      * add logging message
      
      * nit
      
      * add logging to docs
      
      * refactor setup_logging to utils
      
      ---------
      Co-authored-by: default avatarBaber <baber@hey.com>
      1ba35e62
  32. 17 Feb, 2025 1 commit
  33. 12 Feb, 2025 1 commit