1. 04 Sep, 2024 1 commit
  2. 28 Aug, 2024 1 commit
  3. 22 Jul, 2024 1 commit
    • Refactor API models (#2008) · 42dc2448
      Baber Abbasi authored
      
      
      * refactor pad_token handling to fn
      
      * fix docs
      
      * add pad_token_handling to vllm
      
      * start on API superclass
      
      * don't detokenize the returned logits
      
      * streamline vllm tokenizer
      
      * add type hint
      
      * pre-commit
      
      * seems to be in working order
      
      * add model to init
      
      * refactor api models
      
      * nit
      
      * cleanup
      
      * add pbar
      
      * fix type hints
      
      * change optional dependencies
      
      * json encode chat template
      
      * add type hints
      
      * deal with different prompt input requirements
      
      * nits
      
      * fix
      
      * cache inside async
      
      * fix
      
      * fix
      
      * nits
      
      * nits
      
      * nits
      
      * nit
      
      * fixup
      
      * fixup
      
      * nit
      
      * add dummy retry
      
      * add dummy retry
      
      * handle imports; skip failing test
      
      * add type hint
      
      * add tests
      
      * add dependency to tests
      
      * add package names to exception
      
      * nit
      
      * docs; type hints
      
      * handle api key
      
      * nit
      
      * tokenizer bug
      
      * fix tokenizer
      
      * nit
      
      * nit
      
      * add better error messages
      
      * nit
      
      * remove decorator
      
      * CI: install api dep
      
      * revert evaluator.py
      
      * consolidate
      
      * consolidate
      
      * nits
      
      * nit
      
      * fix typealias
      
      * nit
      
      * nit
      
      * nit
      
      * Update lm_eval/models/api_models.py
      
      typo
      Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
      
      * Update lm_eval/models/openai_completions.py
      Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
      
      * Update lm_eval/models/anthropic_llms.py
      Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
      
      * Update lm_eval/models/api_models.py
      Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
      
      * fix typo
      
      * add news section
      
      * add info for API
      
      * pre-commit
      
      * typo
      
      * fix bug: unpack loglikelihood requests
      
      * fix bug: shared gen_kwargs mutated
      
      * nit: handle copy properly
      
      * Update README.md
      
      * Update README.md
      
      * Update README.md
      
      * Update api_models.py
      
      * Update README.md
      
      ---------
      Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
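The "shared gen_kwargs mutated" fix listed in this commit refers to a common pitfall: popping keys from a dictionary that several requests share. The following is a minimal sketch of that class of bug and one way to avoid it, not the code from the commit. The helper `build_payload` and the key names are hypothetical and chosen only for illustration.

```python
import copy


def build_payload(prompt, gen_kwargs):
    """Hypothetical helper: build a request payload without mutating the caller's dict."""
    # Work on a private copy so that pop() below cannot alter the shared dict.
    kwargs = copy.deepcopy(gen_kwargs)
    max_tokens = kwargs.pop("max_gen_toks", 256)  # illustrative key name
    stop = kwargs.pop("until", [])                # illustrative key name
    return {"prompt": prompt, "max_tokens": max_tokens, "stop": stop, **kwargs}


# One dict shared by every request of a task.
shared = {"max_gen_toks": 32, "until": ["\n"], "temperature": 0.0}

first = build_payload("2 + 2 =", shared)
second = build_payload("3 + 3 =", shared)
```

Without the copy, the first call would pop `max_gen_toks` and `until` out of `shared`, and the second request would silently fall back to the defaults.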
  4. 28 Jun, 2024 1 commit
  5. 03 Jun, 2024 1 commit
  6. 25 Mar, 2024 1 commit
    • Seq2seq fix (#1604) · 262f879a
      Lintang Sutawika authored
      
      
      * fix on --task list
      
      * add fixes to tokenization
      
      * differentiate encoding for seq2seq and decoder
      
      * return token setting
      
      * format for pre-commit
      
      * Seq2seq fix, pt2 (#1630)
      
      * getting model class only when defined
      
      * encode_pair handles None, add_special_tokens turned into dict with default value
      
      ---------
      Co-authored-by: achervyakov <77295913+artemorloff@users.noreply.github.com>
  7. 20 Mar, 2024 1 commit
  8. 19 Mar, 2024 1 commit
  9. 18 Mar, 2024 1 commit
  10. 17 Mar, 2024 1 commit
  11. 13 Mar, 2024 1 commit
  12. 06 Mar, 2024 1 commit
  13. 27 Feb, 2024 1 commit
    • Refactor `evaluator.evaluate` (#1441) · 5ccd65d4
      Baber Abbasi authored
      
      
      * change `all_gather` to `gather`
      
      * add TaskOutput utility class
      
      * Add FilterResults class and refactor task handling.
      
      * Rename `key` to `filter_key` for clarity
      
      * Add `print_writeout` function in utils.py
      
      * Add function to calculate limit size.
      
      * Add doc_iterator method to Task class
      
      * Refactor `doc_iterator` and cleanup in Task class
      
      * remove superfluous bits
      
      * change `all_gather` to `gather`
      
      * bugfix
      
      * bugfix
      
      * fix `gather`
      
      * Refactor `gather` loop
      
      * Refactor aggregate metrics calculation
      
      * Refactor and simplify aggregate metrics calculation
      Removed unused code
      
      * Simplify metrics calculation and remove unused code.
      
      * simplify the metrics calculation in `utils.py` and `evaluator.py`.
      
      * Fix group metric
      
      * change evaluate to hf_evaluate
      
      * change evaluate to hf_evaluate
      
      * add docs
      
      * add docs
      
      * nits
      
      * make isslice keyword only
      
      * nit
      
      * add todo
      
      * nit
      
      * nit
      
      * nit: swap order samples_metrics tuple
      
      * move instance sorting outside loop
      
      * nit
      
      * nit
      
      * Add __repr__ for ConfigurableTask
      
      * nit
      
      * nit
      
      * Revert "nit"
      
      This reverts commit dab8d9977a643752a17f840fd8cf7e4b107df28f.
      
      * fix some logging
      
      * nit
      
      * fix `predict_only` bug. thanks to `@LSinev`!
      
      * change `print_tasks` to `prepare_print_tasks`
      
      * nits
      
      * move eval utils
      
      * move eval utils
      
      * nit
      
      * add comment
      
      * added tqdm descriptions
      
      * Update lm_eval/evaluator_utils.py
      Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
      
      * fix mgsm bug
      
      * nit
      
      * fix `build_all_requests`
      
      * pre-commit
      
      * add ceil to limit
      
      ---------
      Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
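Two entries in this commit ("Add function to calculate limit size" and "add ceil to limit") concern turning a user-supplied limit into a number of documents. A minimal sketch of that idea follows, assuming a limit below 1.0 is read as a fraction of the dataset and anything else as an absolute count. The function name `sample_size` is hypothetical and is not taken from the repository.

```python
import math
from typing import Optional, Union


def sample_size(n_docs: int, limit: Optional[Union[int, float]]) -> Optional[int]:
    """Hypothetical helper: resolve a limit into a document count."""
    if limit is None:
        return None  # no limit: evaluate every document
    if limit < 1.0:
        # Fractional limit: round up so a small fraction never yields zero docs.
        return int(math.ceil(n_docs * limit))
    return int(limit)  # absolute count
```

Rounding up rather than truncating means that, for example, a 10% limit on a three-document task still selects one document instead of none.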
  14. 26 Feb, 2024 2 commits
    • Create a means for caching task registration and request building. Ad… (#1372) · 1e6c9272
      Aaron V authored
      
      
      * Create a means for caching task registration and request building. Add the ability to specify an args dict for simple_evaluate().
      
      * Remove extra S in cache path in caching module
      Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
      
      * Rename requests cache args, make model_args polymorphic so that a dict can also be accepted.
      
      * Update docs to reflect new caching behavior, add CLI args for requests caching. Create a function for deleting items in the cache.
      
      * Update documentation, fix minor bug with arg parsing for requests caching where an undefined variable was used.
      
      * Remove line from gitignore, add to cli for caching datasets.
      
      * Add hashing suffix to .pickles. Update test script typo.
      
      * Favor isinstance() over type() in evaluator.py
      
      * Add tests for caching, gets tests working, remove unneeded arg from build_all_requests().
      
      * Update arg description to simple_evaluate.
      
      * Update pyproject.toml
      
      * Fix typehint
      
      * Remove the use of random() for creating default cache pickle hash.
      
      * Check that cache dir exists before clearing it in request cache tests.
      
      * Fix linting problems.
      
      * Fix additional formatting errors.
      
      * Remove trailing whitespace.
      
      * Add new line to the end of .gitignore.
      
      ---------
      Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
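The entries "Add hashing suffix to .pickles" and "Remove the use of random() for creating default cache pickle hash" point at a general requirement: a cache file name has to be reproducible across runs, or the cache can never be hit. The sketch below shows one way to derive a deterministic suffix from the settings that define a cache entry. It is an assumption about the approach, not the commit's implementation, and `cache_file_name` and its parameters are hypothetical.

```python
import hashlib


def cache_file_name(task_name: str, settings: dict) -> str:
    """Hypothetical helper: derive a stable cache file name from task settings."""
    # Sort the items so that key order does not change the resulting hash.
    key = repr(sorted(settings.items())).encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()[:16]
    return f"{task_name}-{digest}.pickle"


a = cache_file_name("demo_task", {"num_fewshot": 5, "limit": 100})
b = cache_file_name("demo_task", {"limit": 100, "num_fewshot": 5})
c = cache_file_name("demo_task", {"num_fewshot": 0, "limit": 100})
```

Here `a` and `b` are identical because the settings match, while `c` differs. A suffix drawn from `random()` would differ on every run and defeat the cache.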
    • Add Gemma support (Add flag to control BOS token usage) (#1465) · 4c51111c
      Hailey Schoelkopf authored
      
      
      * add add_bos_token to HFLM
      
      * add BOS token flag to other local model classes
      
      ---------
      Co-authored-by: Lintang Sutawika <lintang@eleuther.ai>
  15. 22 Feb, 2024 1 commit
  16. 20 Dec, 2023 1 commit
    • Switch Linting to `ruff` (#1166) · 65b8761d
      Baber Abbasi authored
      * add ruff and isort. remove black and flake8
      
      * remove unnecessary dependencies
      
      * remove dependency from table
      
      * change order
      
      * ran ruff
      
      * check 3.9
      
      * exclude evaluator
      
      * update CI workflow
      
      * use ruff config in pyproject.toml
      
      * test
      
      * add isort rules to ruff
      
      * sort imports
      
      * import `make_table`
      
      * try stages for no-commit-to-branch
      
      * turn on mypy for pre-commit
      
      * test
      
      * test
      
      * test
      
      * change no-commit-to-branch to default
      
      * nits
      
      * fixed dependency
  17. 29 Nov, 2023 1 commit
  18. 02 Nov, 2023 2 commits
  19. 18 Oct, 2023 2 commits
  20. 13 Sep, 2023 1 commit
  21. 25 Aug, 2023 1 commit
  22. 07 Aug, 2023 1 commit
  23. 16 Jul, 2023 2 commits
  24. 28 Jun, 2023 4 commits
  25. 23 Jun, 2023 1 commit
  26. 19 Jun, 2023 1 commit
  27. 07 Jun, 2023 1 commit
  28. 19 May, 2023 1 commit
  29. 02 May, 2023 1 commit
  30. 24 Apr, 2023 1 commit
  31. 19 Apr, 2023 1 commit