1. 06 Jul, 2025 1 commit
  2. 19 May, 2025 1 commit
  3. 14 Mar, 2025 1 commit
  4. 04 Mar, 2025 1 commit
  5. 25 Feb, 2025 1 commit
    • Support SGLang as Potential Backend for Evaluation (#2703) · 29971faa
      Jinwei authored
      
      
      * initial components to support sglang
      
      * init of class SGLangLM
      
      * draft for generate_until of SGLang model
      
      * mock loglikelihood
      
      * initial loglikelihood_tokens
      
      * todo: fix bug of sglang engine init
      
      * implement generation tasks and test
      
      * support output type loglikelihood and loglikelihood_rolling (#1)
      
      * .
      
      * loglikelihood_rolling
      
      * /
      
      * support dp_size>1
      
      * typo
      
      * add tests and clean code
      
      * skip tests of sglang for now
      
      * fix OOM error of sglang pytest
      
      * finish test for sglang
      
      * add sglang to readme
      
      * fix OOM of tests and clean SGLang model
      
      * update readme
      
      * clean pyproject and add tests for evaluator
      
      * add accuracy tests and it passed locally
      
      * add notes for test
      
      * Update README.md
      
      update readme
      
      * pre-commit
      
      ---------
      Co-authored-by: Xiaotong Jiang <xiaotong.jiang@databricks.com>
      Co-authored-by: Baber Abbasi <92168766+baberabb@users.noreply.github.com>
      Co-authored-by: Baber <baber@hey.com>
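For context, a backend added this way is selected through the harness's standard CLI. A minimal sketch of how the SGLang backend might be invoked (the `sglang` model name and the `pretrained`/`dp_size` model_args keys are assumptions inferred from the commit messages above, not verified flags; the checkpoint name is illustrative):

```shell
# Hypothetical invocation of lm-evaluation-harness with the SGLang backend.
# Model name and model_args keys are inferred from the commits above.
lm_eval \
  --model sglang \
  --model_args pretrained=meta-llama/Llama-3.1-8B-Instruct,dp_size=2 \
  --tasks gsm8k \
  --batch_size auto
```

The `dp_size=2` argument corresponds to the "support dp_size>1" commit in this PR.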
  6. 13 Dec, 2024 1 commit
  7. 23 Oct, 2024 1 commit
    • Support for IBM watsonx_llm (#2397) · 1185e89a
      Nikodem Szwast authored
      
      
      * add support for IBM watsonx_llm
      
      * add ibm_watsonx_ai package to optional-dependencies
      
      * move global scope imports to inner scope
      
      * change cache to lru_cache
      
      * fix circular import
      
      * use 3.8 typing
      
      * use 3.8 typing
      
      ---------
      Co-authored-by: Baber <baber@hey.com>
  8. 13 Sep, 2024 1 commit
    • Multimodal prototyping (#2243) · fb963f0f
      Lintang Sutawika authored
      
      
      * add WIP hf vlm class
      
      * add doc_to_image
      
      * add mmmu tasks
      
      * fix merge conflicts
      
      * add lintang's changes to hf_vlms.py
      
      * fix doc_to_image
      
      * added yaml_path for config-loading
      
      * revert
      
      * add line to process str type v
      
      * update
      
      * modeling cleanup
      
      * add aggregation for mmmu
      
      * rewrite MMMU processing code based on only MMMU authors' repo (doc_to_image still WIP)
      
      * implemented doc_to_image
      
      * update doc_to_image to accept list of features
      
      * update functions
      
      * readd image processed
      
      * update args process
      
      * bugfix for repeated images fed to model
      
      * push WIP loglikelihood code
      
      * commit most recent code (generative ; qwen2-vl testing)
      
      * preliminary image_token_id handling
      
      * small mmmu update: some qs have >4 mcqa options
      
      * push updated modeling code
      
      * use processor.apply_chat_template
      
      * add mathvista draft
      
      * nit
      
      * nit
      
      * ensure no footguns in text<>multimodal LM<>task incompatibility
      
      * add notification to readme regarding launch of prototype!
      
      * fix compatibility check
      
      * reorganize mmmu configs
      
      * chat_template=None
      
      * add interleave chat_template
      
      * add condition
      
      * add max_images; interleave=true
      
      * nit
      
      * testmini_mcq
      
      * nit
      
      * pass image string; convert img
      
      * add vllm
      
      * add init
      
      * vlm add multi attr
      
      * fixup
      
      * pass max images to vllm model init
      
      * nit
      
      * encoding to device
      
      * fix HFMultimodalLM.chat_template ?
      
      * add mmmu readme
      
      * remove erroneous prints
      
      * use HFMultimodalLM.chat_template ; restore tasks/__init__.py
      
      * add docstring for replace_placeholders in utils
      
      * fix `replace_placeholders`; set image_string=None
      
      * fix typo
      
      * cleanup + fix merge conflicts
      
      * update MMMU readme
      
      * del mathvista
      
      * add some sample scores
      
      * Update README.md
      
      * add log msg for image_string value
      
      ---------
      Co-authored-by: haileyschoelkopf <hailey@eleuther.ai>
      Co-authored-by: Baber Abbasi <baber@eleuther.ai>
      Co-authored-by: Baber <baber@hey.com>
      Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
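A multimodal run of this prototype would go through the same CLI. A hedged sketch (the `hf-multimodal` model name, the `max_images`/`interleave` keys, and the `mmmu_val` task name are assumptions pieced together from the commit messages above; the Qwen2-VL checkpoint echoes the "qwen2-vl testing" commit):

```shell
# Hypothetical multimodal evaluation; names inferred from the commits above.
lm_eval \
  --model hf-multimodal \
  --model_args pretrained=Qwen/Qwen2-VL-7B-Instruct,max_images=2,interleave=true \
  --tasks mmmu_val \
  --batch_size 8
```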
  9. 22 Jul, 2024 1 commit
    • Refactor API models (#2008) · 42dc2448
      Baber Abbasi authored
      
      
      * refactor pad_token handling to fn
      
      * fix docs
      
      * add pad_token_handling to vllm
      
      * start on API superclass
      
      * don't detokenize the returned logits
      
      * streamline vllm tokenizer
      
      * add type hint
      
      * pre-commit
      
      * seems to be in working order
      
      * add model to init
      
      * refactor api models
      
      * nit
      
      * cleanup
      
      * add pbar
      
      * fix type hints
      
      * change optional dependencies
      
      * json encode chat template
      
      * add type hints
      
      * deal with different prompt input requirements
      
      * nits
      
      * fix
      
      * cache inside async
      
      * fix
      
      * fix
      
      * nits
      
      * nits
      
      * nits
      
      * nit
      
      * fixup
      
      * fixup
      
      * nit
      
      * add dummy retry
      
      * add dummy retry
      
      * handle imports; skip failing test
      
      * add type hint
      
      * add tests
      
      * add dependency to tests
      
      * add package names to exception
      
      * nit
      
      * docs; type hints
      
      * handle api key
      
      * nit
      
      * tokenizer bug
      
      * fix tokenizer
      
      * nit
      
      * nit
      
      * add better error messages
      
      * nit
      
      * remove decorator
      
      * CI: install api dep
      
      * revert evaluator.py
      
      * consolidate
      
      * consolidate
      
      * nits
      
      * nit
      
      * fix typealias
      
      * nit
      
      * nit
      
      * nit
      
      * Update lm_eval/models/api_models.py
      
      typo
      Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
      
      * Update lm_eval/models/openai_completions.py
      Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
      
      * Update lm_eval/models/anthropic_llms.py
      Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
      
      * Update lm_eval/models/api_models.py
      Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
      
      * fix typo
      
      * add news section
      
      * add info for API
      
      * pre-commit
      
      * typo
      
      * fix bug: unpack loglikelihood requests
      
      * fix bug: shared gen_kwargs mutated
      
      * nit: handle copy properly
      
      * Update README.md
      
      * Update README.md
      
      * Update README.md
      
      * Update api_models.py
      
      * Update README.md
      
      ---------
      Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
  10. 16 Apr, 2024 1 commit
  11. 26 Mar, 2024 1 commit
    • Integration of NeMo models into LM Evaluation Harness library (#1598) · e9d429e1
      Sergio Perez authored
      * Integration of NeMo models into LM Evaluation Harness library
      
      * rename nemo model as nemo_lm
      
      * move nemo section in readme after hf section
      
      * use self.eot_token_id in get_until()
      
      * improve progress bar showing loglikelihood requests
      
      * data replication or tensor/pipeline replication working fine within one node
      
      * run pre-commit on modified files
      
      * check whether dependencies are installed
      
      * clarify usage of torchrun in README
  12. 26 Feb, 2024 1 commit
  13. 18 Feb, 2024 1 commit
  14. 06 Feb, 2024 1 commit
  15. 05 Feb, 2024 1 commit
  16. 26 Jan, 2024 1 commit
    • Add causalLM OpenVino models (#1290) · 97a67d27
      NoushNabi authored
      
      
      * added intel optimum
      
      * added intel optimum in readme
      
      * modified intel optimum
      
      * modified intel optimum
      
      * modified intel optimum
      
      * modified install optimum
      
      * modified path of IR file
      
      * added openvino_device
      
      * added openvino_device2
      
      * changed optimum-causal to openvino-causal
      
      * Update README.md
      
      * Update README.md
      
      * remove `lm_eval.base` import
      
      * update openvino-causal -> openvino ; pass device through super().__init__()
      
      * Update README.md
      
      * Add optimum to tests dependencies
      
      * apply pre-commit
      
      * fix so tests pass
      
      ---------
      Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
      Co-authored-by: haileyschoelkopf <hailey@eleuther.ai>
  17. 22 Dec, 2023 1 commit
    • Upstream Mamba Support (`mamba_ssm`) (#1110) · 5503b274
      Hailey Schoelkopf authored
      * modularize HFLM code
      
      * pass through extra kwargs to AutoModel.from_pretrained call
      
      * remove explicit model_kwargs
      
      * rename gptq -> autogptq
      
      * fix tokenizer pad token errors
      
      * ensure model always respects device_map and autogptq's selected devices
      
      * add a _get_config helper fn
      
      * add mambaLMWrapper
      
      * add mamba extra
      
      * add mamba extra
      
      * fix conditional import
      
      * Fix botched merge commit
      
      * Remove beginning-of-file comment for consistency
      
      * Add docstring for mambaLM re: supported kwargs
      
      * Alphabetize extras
      
      * Update extras table
      
      * appease precommit
      
      * run precommit on mamba_lm
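The `mamba` extra and `mambaLMWrapper` mentioned in the commits above would be used roughly as follows (a sketch only: the `mamba_ssm` model name comes from the PR title, but the `pretrained` key and checkpoint name are assumptions):

```shell
# Hypothetical: install the mamba extra, then evaluate a Mamba checkpoint.
pip install "lm_eval[mamba]"
lm_eval \
  --model mamba_ssm \
  --model_args pretrained=state-spaces/mamba-2.8b \
  --tasks lambada_openai
```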
  18. 27 Nov, 2023 1 commit
  19. 22 Nov, 2023 1 commit
  20. 21 Nov, 2023 1 commit
  21. 03 Nov, 2023 1 commit
  22. 04 Aug, 2023 3 commits
  23. 02 Aug, 2023 2 commits
  24. 27 Jun, 2023 1 commit
  25. 22 Jun, 2023 3 commits
  26. 21 Jun, 2023 1 commit
  27. 20 Jun, 2023 1 commit
  28. 12 Jun, 2023 1 commit
  29. 08 Jun, 2023 2 commits
  30. 07 Jun, 2023 1 commit
  31. 08 May, 2023 1 commit
  32. 24 Apr, 2023 2 commits
  33. 23 Apr, 2023 1 commit