1. 12 Jun, 2024 1 commit
  2. 11 Jun, 2024 1 commit
  3. 03 Jun, 2024 1 commit
  4. 30 May, 2024 1 commit
  5. 28 May, 2024 1 commit
  6. 24 May, 2024 1 commit
  7. 23 May, 2024 1 commit
  8. 19 May, 2024 1 commit
  9. 07 May, 2024 2 commits
  10. 05 May, 2024 2 commits
  11. 03 May, 2024 1 commit
    • evaluation tracker implementation (#1766) · 59cf408a
      KonradSzafer authored
      * evaluation tracker implementation
      
      * OVModelForCausalLM test fix
      
      * typo fix
      
      * moved methods args
      
      * multiple args in one flag
      
      * loggers moved to dedicated dir
      
      * improved filename sanitization
  12. 02 May, 2024 2 commits
  13. 18 Apr, 2024 1 commit
  14. 16 Apr, 2024 2 commits
  15. 05 Apr, 2024 1 commit
    • Anthropic Chat API (#1594) · 27924d77
      Seungwoo Ryu authored
      
      
      * claude3
      
      * supply for anthropic claude3
      
      * supply for anthropic claude3
      
      * anthropic config changes
      
      * add callback options on anthropic
      
      * line passed
      
      * claude3 tiny change
      
      * help anthropic installation
      
      * mention sysprompt / being careful with format in readme
      
      ---------
      Co-authored-by: haileyschoelkopf <hailey@eleuther.ai>
  16. 01 Apr, 2024 1 commit
    • Fix CLI --batch_size arg for openai-completions/local-completions (#1656) · 9516087b
      Michael Goin authored
      The OpenAI interface supports batch size as an argument to the completions API, but specifying it on the CLI, e.g. `lm_eval --model openai-completions --batch_size 16 ...`, fails because the value is never converted from str to int.
      
      This is confirmed by my usage and the stack trace from running `OPENAI_API_KEY=dummy lm_eval --model local-completions --tasks gsm8k --batch_size 16 --model_args model=nm-testing/zephyr-beta-7b-gptq-g128,tokenizer_backend=huggingface,base_url=http://localhost:8000/v1`:
      ```
      Traceback (most recent call last):
        File "/home/michael/venv/bin/lm_eval", line 8, in <module>
          sys.exit(cli_evaluate())
        File "/home/michael/code/lm-evaluation-harness/lm_eval/__main__.py", line 341, in cli_evaluate
          results = evaluator.simple_evaluate(
        File "/home/michael/code/lm-evaluation-harness/lm_eval/utils.py", line 288, in _wrapper
          return fn(*args, **kwargs)
        File "/home/michael/code/lm-evaluation-harness/lm_eval/evaluator.py", line 251, in simple_evaluate
          results = evaluate(
        File "/home/michael/code/lm-evaluation-harness/lm_eval/utils.py", line 288, in _wrapper
          return fn(*args, **kwargs)
        File "/home/michael/code/lm-evaluation-harness/lm_eval/evaluator.py", line 390, in evaluate
          resps = getattr(lm, reqtype)(cloned_reqs)
        File "/home/michael/code/lm-evaluation-harness/lm_eval/models/openai_completions.py", line 263, in generate_until
          list(sameuntil_chunks(re_ord.get_reordered(), self.batch_size)),
        File "/home/michael/code/lm-evaluation-harness/lm_eval/models/openai_completions.py", line 251, in sameuntil_chunks
          if len(ret) >= size or x[1] != lastuntil:
      TypeError: '>=' not supported between instances of 'int' and 'str'
      ```
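The failure in the traceback above comes from comparing `len(ret)` (an int) with a batch size that arrived from the CLI as a str. The following is a minimal sketch of the kind of str->int coercion the commit describes, not the actual patch. `coerce_batch_size` and `chunk_by_until` are hypothetical names, and `chunk_by_until` is only a simplified stand-in for the `sameuntil_chunks` helper seen in the traceback, with each request modeled as a `(prompt, until)` pair.

```python
from typing import Iterator, List, Sequence, Tuple, Union

Request = Tuple[str, str]  # (prompt, until) pair; an assumption for illustration


def coerce_batch_size(value: Union[int, str]) -> int:
    """Convert a CLI-supplied batch size (often a str) to a positive int."""
    size = int(value)  # raises ValueError for non-numeric input
    if size < 1:
        raise ValueError(f"batch size must be >= 1, got {size}")
    return size


def chunk_by_until(
    requests: Sequence[Request], size: Union[int, str]
) -> Iterator[List[Request]]:
    """Yield chunks of at most `size` requests that share the same stop string."""
    # Without this coercion, `len(chunk) >= "16"` raises the TypeError shown above.
    size = coerce_batch_size(size)
    chunk: List[Request] = []
    last_until = None
    for req in requests:
        # Start a new chunk when the current one is full or the stop string changes.
        if chunk and (len(chunk) >= size or req[1] != last_until):
            yield chunk
            chunk = []
        chunk.append(req)
        last_until = req[1]
    if chunk:
        yield chunk
```

Calling `list(chunk_by_until(requests, "16"))` then behaves the same as passing the int `16`, which is the behavior a user would expect from `--batch_size 16`.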
  17. 27 Mar, 2024 1 commit
  18. 26 Mar, 2024 1 commit
    • Integration of NeMo models into LM Evaluation Harness library (#1598) · e9d429e1
      Sergio Perez authored
      * Integration of NeMo models into LM Evaluation Harness library
      
      * rename nemo model as nemo_lm
      
      * move nemo section in readme after hf section
      
      * use self.eot_token_id in get_until()
      
      * improve progress bar showing loglikelihood requests
      
      * data replication or tensor/pipeline replication working fine within one node
      
      * run pre-commit on modified files
      
      * check whether dependencies are installed
      
      * clarify usage of torchrun in README
  19. 25 Mar, 2024 2 commits
  20. 21 Mar, 2024 1 commit
  21. 20 Mar, 2024 1 commit
  22. 19 Mar, 2024 2 commits
  23. 18 Mar, 2024 1 commit
  24. 17 Mar, 2024 1 commit
  25. 13 Mar, 2024 1 commit
  26. 09 Mar, 2024 1 commit
  27. 06 Mar, 2024 1 commit
  28. 03 Mar, 2024 1 commit
    • Vllm update DP+TP (#1508) · e5e35fca
      Baber Abbasi authored
      * use `@ray.remote` with distributed vLLM
      
      * update versions
      
      * bugfix
      
      * unpin vllm
      
      * fix pre-commit
      
      * added version assertion error
      
      * Revert "added version assertion error"
      
      This reverts commit 8041e9b78e95eea9f4f4d0dc260115ba8698e9cc.
      
      * added version assertion for DP
      
      * expand DP note
      
      * add warning
      
      * nit
      
      * pin vllm
      
      * fix typos
  29. 01 Mar, 2024 2 commits
  30. 28 Feb, 2024 1 commit
  31. 27 Feb, 2024 2 commits
    • Fix AttributeError in huggingface.py When 'model_type' is Missing (#1489) · cc771eca
      Rich authored
      
      
      * model_type attribute error
      
      Getting attribute error when using a model without a 'model_type'
      
      * fix w/ and w/out the 'model_type' specification
      
      * use getattr(), also fix other config.model_type reference
      
      * Update huggingface.py
      
      ---------
      Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
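The commit above mentions switching to `getattr()` so that a config without a `model_type` attribute no longer raises `AttributeError`. A minimal sketch of that pattern follows. It is illustrative only: `get_model_type` is a hypothetical helper, and `SimpleNamespace` stands in for a real model config object.

```python
from types import SimpleNamespace
from typing import Any, Optional


def get_model_type(config: Any, default: Optional[str] = None) -> Optional[str]:
    """Return config.model_type, or `default` when the attribute is absent."""
    # Direct access (config.model_type) raises AttributeError if the field is
    # missing; getattr with a default returns the fallback instead.
    return getattr(config, "model_type", default)


with_type = SimpleNamespace(model_type="llama")  # stand-in config with the field
without_type = SimpleNamespace()                 # stand-in config lacking it

print(get_model_type(with_type))
print(get_model_type(without_type, default="unknown"))
```

Callers can then branch on the returned value rather than wrapping every `config.model_type` access in a try/except.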
    • Refactor `evaluator.evaluate` (#1441) · 5ccd65d4
      Baber Abbasi authored
      
      
      * change `all_gather` to `gather`
      
      * add TaskOutput utility class
      
      * Add FilterResults class and refactor task handling.
      
      * Rename `key` to `filter_key` for clarity
      
      * Add `print_writeout` function in utils.py
      
      * Add function to calculate limit size.
      
      * Add doc_iterator method to Task class
      
      * Refactor `doc_iterator` and cleanup in Task class
      
      * remove superfluous bits
      
      * change `all_gather` to `gather`
      
      * bugfix
      
      * bugfix
      
      * fix `gather`
      
      * Refactor `gather` loop
      
      * Refactor aggregate metrics calculation
      
      * Refactor and simplify aggregate metrics calculation
      Removed unused code
      
      * Simplify metrics calculation and remove unused code.
      
      * simplify the metrics calculation in `utils.py` and `evaluator.py`.
      
      * Fix group metric
      
      * change evaluate to hf_evaluate
      
      * change evaluate to hf_evaluate
      
      * add docs
      
      * add docs
      
      * nits
      
      * make isslice keyword only
      
      * nit
      
      * add todo
      
      * nit
      
      * nit
      
      * nit: swap order samples_metrics tuple
      
      * move instance sorting outside loop
      
      * nit
      
      * nit
      
      * Add __repr__ for ConfigurableTask
      
      * nit
      
      * nit
      
      * Revert "nit"
      
      This reverts commit dab8d9977a643752a17f840fd8cf7e4b107df28f.
      
      * fix some logging
      
      * nit
      
      * fix `predict_only` bug. thanks to `@LSinev`!
      
      * change `print_tasks` to `prepare_print_tasks`
      
      * nits
      
      * move eval utils
      
      * move eval utils
      
      * nit
      
      * add comment
      
      * added tqdm descriptions
      
      * Update lm_eval/evaluator_utils.py
      Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
      
      * fix mgsm bug
      
      * nit
      
      * fix `build_all_requests`
      
      * pre-commit
      
      * add ceil to limit
      
      ---------
      Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
  32. 26 Feb, 2024 1 commit