raise ValueError(f"Attempted to load model '{model_name}', but no model for this name found! Supported model names: {', '.join(MODEL_REGISTRY.keys())}")
TASK_REGISTRY = {}
...
@@ -133,7 +136,7 @@ searching in HF Evaluate library..."
def register_aggregation(name):
# TODO: should we enforce a specific interface to aggregation metrics?
This list keeps track of which tasks' implementations have been ported to YAML / v2.0 of the Eval Harness.
Boxes should be checked iff tasks are implemented in v2.0 and tested for regression. Tasks should be struck through if checked *against original introducing paper* implementation or popularizing implementation.
Boxes should be checked iff tasks are implemented in the refactor and tested for regression. Tasks should be struck through if checked *against original introducing paper* implementation or popularizing implementation.
- [ ] Glue
- [ ] SuperGlue
- [x] SuperGlue
- [ ] CoQA
- [ ] DROP
- [x] ~~Lambada~~
...
@@ -31,7 +31,7 @@ Boxes should be checked iff tasks are implemented in v2.0 and tested for regress