  1. 16 Oct, 2025 6 commits
  2. 15 Oct, 2025 3 commits
  3. 02 Oct, 2025 1 commit
  4. 12 Sep, 2025 1 commit
  5. 08 Sep, 2025 1 commit
  6. 21 Aug, 2025 1 commit
  7. 02 Aug, 2025 1 commit
  8. 24 Jul, 2025 2 commits
  9. 23 Jul, 2025 2 commits
  10. 16 Jul, 2025 1 commit
    • truncate thinking tags in generations (#3145) · 51ede33c
      Baber Abbasi authored
      * feat: add postprocessing for generated text to strip stop sequences and thinking tokens
      
      * nit
      
      * fix: trim leading whitespace after stripping thinking tokens from generation
      
      * feat: add think_end_token to model_args
      
      * nit
      
      * nit
      
      * nit
      
      * add to readme
      
      * nit
  11. 15 Jul, 2025 1 commit
  12. 25 Jun, 2025 1 commit
  13. 08 Jun, 2025 1 commit
    • [longbench] fix metric calculation (#2983) · 147e9d61
      Baber Abbasi authored
      * use all answers
      
      * use middle truncation
      
      * maybe fix classification score
      
      * strip classification preds
      
      * [vllm] remove stop tokens post-hoc
      
      * strip all preds
      
      * pacify pre-commit
      
      * start on truncation utility
      
      * add to readme
      
      * add a footgun doc
      
      * fix newline in yaml templates
      
      * do not strip code_sim preds!
      
      * fix pre-commit config
      
      * fix instruction warning
      
      * add note to longbench readme
  14. 03 Jun, 2025 1 commit
  15. 26 May, 2025 1 commit
  16. 23 May, 2025 1 commit
  17. 19 May, 2025 1 commit
  18. 15 May, 2025 1 commit
  19. 10 May, 2025 1 commit
  20. 09 May, 2025 1 commit
  21. 06 May, 2025 1 commit
  22. 16 Apr, 2025 1 commit
  23. 14 Apr, 2025 1 commit
  24. 20 Mar, 2025 2 commits
  25. 11 Mar, 2025 1 commit
  26. 27 Feb, 2025 1 commit
  27. 21 Feb, 2025 1 commit
    • Logging (#2203) · 1ba35e62
      Lintang Sutawika authored
      * changed source of eval_logger
      
      * allow eval_logger to be set from args
      
      * removed verbosity arg from non-main methods
      
      * fix logging
      
      * pre-commit
      
      * set verbosity in eval logger
      
      * replace utils.eval_logger
      
      * fix logging in main
      
      * add logging to docs
      
      * add logging message
      
      * nit
      
      * add logging to docs
      
      * refactor setup_logging to utils
      
      ---------
      Co-authored-by: Baber <baber@hey.com>
  28. 17 Feb, 2025 1 commit
  29. 07 Feb, 2025 1 commit
  30. 19 Jan, 2025 1 commit