1. 16 Oct, 2025 1 commit
  2. 15 Oct, 2025 3 commits
  3. 02 Oct, 2025 1 commit
  4. 12 Sep, 2025 1 commit
  5. 08 Sep, 2025 1 commit
  6. 21 Aug, 2025 1 commit
  7. 02 Aug, 2025 1 commit
  8. 24 Jul, 2025 2 commits
  9. 23 Jul, 2025 2 commits
  10. 16 Jul, 2025 1 commit
    • truncate thinking tags in generations (#3145) · 51ede33c
      Baber Abbasi authored
      * feat: add postprocessing for generated text to strip stop sequences and thinking tokens
      
      * nit
      
      * fix: trim leading whitespace after stripping thinking tokens from generation
      
      * feat: add think_end_token to model_args
      
      * nit
      
      * nit
      
      * nit
      
      * add to readme
      
      * nit
  11. 15 Jul, 2025 1 commit
  12. 25 Jun, 2025 1 commit
  13. 08 Jun, 2025 1 commit
    • [longbench] fix metric calculation (#2983) · 147e9d61
      Baber Abbasi authored
      * use all answers
      
      * use middle truncation
      
      * maybe fix classification score
      
      * strip classification preds
      
      * [vllm] remove stop tokens post-hoc
      
      * strip all preds
      
      * pacify pre-commit
      
      * start on truncation utility
      
      * add to readme
      
      * add a footgun doc
      
      * fix newline in yaml templates
      
      * do not strip code_sim preds!
      
      * fix pre-commit config
      
      * fix instruction warning
      
      * add note to longbench readme
  14. 03 Jun, 2025 1 commit
  15. 26 May, 2025 1 commit
  16. 23 May, 2025 1 commit
  17. 19 May, 2025 1 commit
  18. 15 May, 2025 1 commit
  19. 10 May, 2025 1 commit
  20. 09 May, 2025 1 commit
  21. 06 May, 2025 1 commit
  22. 16 Apr, 2025 1 commit
  23. 14 Apr, 2025 1 commit
  24. 20 Mar, 2025 2 commits
  25. 11 Mar, 2025 1 commit
  26. 27 Feb, 2025 1 commit
  27. 21 Feb, 2025 1 commit
    • Logging (#2203) · 1ba35e62
      Lintang Sutawika authored
      * changed source of eval_logger
      
      * allow eval_logger to be set from args
      
      * removed verbosity arg from non-main methods
      
      * fix logging
      
      * pre-commit
      
      * set verbosity in eval logger
      
      * replace utils.eval_logger
      
      * fix logging in main
      
      * add logging to docs
      
      * add logging message
      
      * nit
      
      * add logging to docs
      
      * refactor setup_logging to utils
      
      ---------
      Co-authored-by: Baber <baber@hey.com>
  28. 17 Feb, 2025 1 commit
  29. 07 Feb, 2025 1 commit
  30. 19 Jan, 2025 1 commit
  31. 15 Jan, 2025 1 commit
    • assistant prefill (#2615) · 703fbffd
      Baber Abbasi authored
      * add assistant prefix
      
      * add arc_challenge from llama
      
      * nit
      
      * nit
      
      * nit
      
      * add assistant prefix
      
      * add mmlu_llama
      
      * nit
      
      * nit
      
      * Revert "nit"
      
      This reverts commit 6a97f8356237305e375212b966b30e8de59dd4bc.
      
      * fix regex bug
      
      * add assistant_prefix to vllm
      
      * add `Question:`
      
      * add mmlu_pro
      
      * add fewshot assistant_prefix
      
      * use `assistant_prefill`
      
      * typehints
      
      * nits
      
      * nits
      
      * add to docs
      
      * add readme
  32. 16 Dec, 2024 1 commit
  33. 30 Nov, 2024 1 commit
  34. 15 Nov, 2024 1 commit
  35. 30 Oct, 2024 1 commit