1. 04 Oct, 2025 1 commit
    • Fewshot refactor (#3227) · 003e5852
      Baber Abbasi authored
      
      
      * overhaul `ContextSampler`
      
      * refactor masakhapos
      
      * move multi_target to `exact_match`
      
      * remove doc_to_choice from `boolq-seq2seq`
      
      * remove doc_to_choice in generation process_results
      
      * Remove unused `doc_to_choice` and fix superglue whitespaces
      
      * require multiple_inputs and multiple_targets to be explicitly set in taskconfig
      
      * fix copa; better logging in task init
      
      * fix doc_to_target to return int rather than str (deprecated)
      
      * fix processing regression; recursively parse lists from template
      
      * remove redundant jinja parsing logic
      
      * remove promptsource
      
      * for multiple_inputs use `doc_to_text: list[str]`
      
      * Refactor `ContextSampler` `fewshot_context`
      
      * fix multiple_input context
      
      * fix `target_delimiter` with `gen_prefix`
      
      * `doc_to_text` is list for multiple_inputs
      
      * Refactor `count_bytes` and `count_words` methods to `@staticmethod`
      
      * convert has_*(train/test/validation) to properties
      
      * remove `multi_target` `generate_until`
      
      * fix `doc_to_target`/`multiple_targets` handling; add tests
      
      * rename `multi_target` to `multiple_targets`
      
      * evaluate list when multiple targets
      
      * allow doc_to_target to return list
      
      * Remove gen_prefix space and add warning (#3239)
      
      * Remove gen_prefix space and add warning
      
      * fix null gen_prefix bug again
      
      * use git tests
      
      ---------
      Co-authored-by: Boaz Ben-Dov <bendboaz@gmail.com>
  2. 25 Sep, 2025 2 commits
  3. 14 Feb, 2025 1 commit
  4. 19 Jan, 2025 1 commit
  5. 17 Jan, 2025 1 commit
  6. 15 Jan, 2025 1 commit
    • assistant prefill (#2615) · 703fbffd
      Baber Abbasi authored
      * add assistant prefix
      
      * add arc_challenge from llama
      
      * nit
      
      * nit
      
      * nit
      
      * add assistant prefix
      
      * add mmlu_llama
      
      * nit
      
      * nit
      
      * Revert "nit"
      
      This reverts commit 6a97f8356237305e375212b966b30e8de59dd4bc.
      
      * fix regex bug
      
      * add assistant_prefix to vllm
      
      * add `Question:`
      
      * add mmlu_pro
      
      * add fewshot assistant_prefix
      
      * use `assistant_prefill`
      
      * typehints
      
      * nits
      
      * nits
      
      * add to docs
      
      * add readme
  7. 22 Aug, 2024 1 commit
  8. 05 Aug, 2024 1 commit
    • Mmlu Pro (#1961) · 69d56f45
      Yu Shi Jie authored
      
      
      * initialized mmlu_pro task
      
      * added generative mmlu-pro
      
      * added cot fewshot for mmlu-pro
      
      * Initial commit
      
      * updated mmlu-pro to take on 3 splits: test, val, dev
      
      * mmlu-pro: added continuation and flan_cot_zeroshot
      
      * added README.md for mmlu_pro
      
      * removed
      
      * update files
      
      * moved files out, and removed unused versions
      
      * updated
      
      * mmlu_pro:
      
      -changed task 'other' to 'miscellaneous'
      there is already a group named 'other'
      task and group with the same alias (e.g. mmlu_pro_other_generative) throws an error
      
      -fixed yaml backslash escape for fewshot cot
      
      * changed choices -> options in yaml config to fit dataset schema
      
      * ONLY FOR DEFAULT: fixed yaml file to use variable number of choices
      
      * mmlu-pro: fixed doc_to_text/choice/target configs for all variants
      
      * mmlu-pro: minor fixes
      
      * mmlu-pro/default: aligned with mmlu updates
      
      * mmlu-pro: update yaml content in line with mmlu
      
      * mmlu-pro: fixed mislabelling of task (math->chemistry)
      
      * mmlu-pro: fixed yaml formatting
      
      * add custom fewshot doc_to_text, target, and choice
      
      * add process for each subtask
      
      * add process for each subtask
      
      * pre-commit
      
      * pre-commit
      
      * format
      
      * resolved left out merge
      
      * deleted folders + updated readme
      
      * Update evaluator.py
      
      * Update evaluator.py
      
      ---------
      Co-authored-by: Yu Shi Jie <shijie@tensorplex.ai>
      Co-authored-by: lintangsutawika <lintang@eleuther.ai>
      Co-authored-by: root <root@455bdd73-01.cloud.together.ai>
      Co-authored-by: Lintang Sutawika <lintang@sutawika.com>
  9. 08 Jul, 2024 1 commit
  10. 03 Jun, 2024 1 commit
  11. 31 May, 2024 1 commit
  12. 06 May, 2024 1 commit
    • Provide ability for custom sampler for ConfigurableTask (#1616) · ae72cebc
      LSinev authored
      * Added fewshot sampling seeds to evaluator.simple_evaluate signature
      
      Way to control seed of fewshot sampling
      may help with #1591
      
      * Added ability for custom sampler for ConfigurableTask
      
      May be set in config like
      ```
      fewshot_config:
        sampler: !function utils.MyFewshotSampler
      ```
      
      * explicitly set fewshot random generator seed for HFLM generate_until_task test
      
      * add backward compatibility for three args seed setup
      
      * save seeds info to logs/reports
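
The log above only names the config hook (`sampler: !function utils.MyFewshotSampler`) and does not show what such a class contains. As a rough, standalone sketch of the idea, a seeded few-shot sampler could look like the following. The constructor arguments (`docs`, `rnd`, `seed`) and the `sample` method are assumptions made for illustration, not the harness's confirmed interface; a real custom sampler would need to match whatever base class and signature the library expects.

```python
import random
from typing import Any, List, Optional, Sequence


class MyFewshotSampler:
    """Hypothetical sampler that draws few-shot examples with its own RNG.

    This class is illustrative only; it does not subclass or import anything
    from the evaluation harness.
    """

    def __init__(
        self,
        docs: Sequence[Any],
        rnd: Optional[random.Random] = None,
        seed: int = 1234,
    ) -> None:
        self.docs = list(docs)
        # Use the caller's generator if given, otherwise build a seeded one
        # so that repeated runs pick the same few-shot examples.
        self.rnd = rnd if rnd is not None else random.Random(seed)

    def sample(self, n: int) -> List[Any]:
        """Return n distinct documents chosen by the seeded generator."""
        if n > len(self.docs):
            raise ValueError(f"requested {n} examples, only {len(self.docs)} available")
        return self.rnd.sample(self.docs, n)


if __name__ == "__main__":
    pool = [f"example-{i}" for i in range(10)]
    first = MyFewshotSampler(pool, seed=7).sample(3)
    second = MyFewshotSampler(pool, seed=7).sample(3)
    # Same seed and same pool give the same selection.
    print(first == second)
```

Keeping the random generator inside the sampler, rather than relying on global random state, is what makes a per-task fewshot seed meaningful: two runs with the same seed and document pool select the same examples.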
  13. 20 Dec, 2023 1 commit
    • Switch Linting to `ruff` (#1166) · 65b8761d
      Baber Abbasi authored
      * add ruff and isort. remove black and flake8
      
      * remove unnecessary dependencies
      
      * remove dependency from table
      
      * change order
      
      * ran ruff
      
      * check 3.9
      
      * exclude evaluator
      
      * update CI workflow
      
      * use ruff config in pyproject.toml
      
      * test
      
      * add isort rules to ruff
      
      * sort imports
      
      * import `make_table`
      
      * try stages for no-commit-to-branch
      
      * turn on mypy for pre-commit
      
      * test
      
      * test
      
      * test
      
      * change no-commit-to-branch to default
      
      * nits
      
      * fixed dependency
  14. 04 Dec, 2023 1 commit
  15. 15 Sep, 2023 1 commit
  16. 14 Sep, 2023 1 commit
  17. 25 Aug, 2023 1 commit
  18. 09 Aug, 2023 1 commit
  19. 13 Jul, 2023 3 commits
  20. 22 Jun, 2023 1 commit
  21. 21 Jun, 2023 4 commits
  22. 16 Jun, 2023 1 commit
  23. 19 May, 2023 1 commit
  24. 01 May, 2023 1 commit
  25. 19 Apr, 2023 1 commit