    Add `--samples` Argument for Fine-Grained Task Evaluation in... · d693dcd2
    Felipe Maia Polo authored
    
     Add `--samples` Argument for Fine-Grained Task Evaluation in `lm-evaluation-harness`. This feature is the first step towards efficient multi-prompt evaluation with PromptEval [1,2] (#2520)
    
    * added option --examples
    
    * specifying examples in dictionary
    
    * run pre-commit - fix arg type
    
    Signed-off-by: Mírian Silva <mirianfrsilva@ibm.com>
    
    * fixing bug when examples==None
    
    * fixing bug when examples==None
    
    * limit or examples must be None in simple_evaluate.py and in evaluator.py
    
    * run pre-commit (fix formatting)
    
    Signed-off-by: Mírian Silva <mirianfrsilva@ibm.com>
    
    * merge main and run pre-commit (fix formatting)
    
    Signed-off-by: Mírian Silva <mirianfrsilva@ibm.com>
    
    * Update __main__.py
    
    undefined "limit" and "examples"
    
    * update branch, fix conflicts, run pre-commit
    
    * nits
    
    * nits
    
    * change 'examples' to 'samples'
    
    ---------
    
    Signed-off-by: Mírian Silva <mirianfrsilva@ibm.com>
    Co-authored-by: mirianfrsilva <mirianfrsilva@ibm.com>
    Co-authored-by: Stella Biderman <stellabiderman@gmail.com>
    Co-authored-by: Baber <baber@hey.com>
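
    The commit notes above mention two behaviors: samples are specified in a dictionary, and `limit` or `samples` must be None. The sketch below illustrates one way such a check and selection could look. It is not the code from this commit: the function names `check_limit_and_samples` and `select_docs`, and the assumed dictionary shape (task name mapped to a list of document indices), are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Sequence


def check_limit_and_samples(
    limit: Optional[float],
    samples: Optional[Dict[str, List[int]]],
) -> None:
    """Hypothetical guard: at most one of `limit` and `samples` may be set."""
    if limit is not None and samples is not None:
        raise ValueError("Either 'limit' or 'samples' must be None, but both were given.")


def select_docs(
    docs: Sequence[str],
    task_name: str,
    samples: Optional[Dict[str, List[int]]],
) -> List[str]:
    """Hypothetical selection: keep only the requested indices for a task.

    Assumes `samples` maps a task name to a list of document indices.
    If no indices are given for the task, all documents are kept.
    """
    if samples is None or task_name not in samples:
        return list(docs)
    return [docs[i] for i in samples[task_name]]


# Example usage with made-up task names and documents.
docs = ["doc0", "doc1", "doc2", "doc3"]
samples = {"task_a": [0, 2]}
check_limit_and_samples(limit=None, samples=samples)
selected = select_docs(docs, "task_a", samples)
untouched = select_docs(docs, "task_b", samples)
```

    Raising an error when both options are set keeps the two selection mechanisms from silently overriding each other, which is the constraint the "limit or examples must be None" note describes.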
task.py 71.4 KB