gaoqiong / lm-evaluation-harness
Commit b0040ba0 (Unverified)
Authored Aug 21, 2025 by James A. Michaelov
Committed by GitHub, Aug 21, 2025
Add BLiMP-NL (#3221)
* add blimp_nl
* add template yaml file
Parent: 1bd96448
Changes: 88
Showing 20 changed files with 348 additions and 0 deletions
lm_eval/tasks/blimp_nl/auxiliaries__semi_aspectual_1.yaml (+3 -0)
lm_eval/tasks/blimp_nl/auxiliaries__semi_aspectual_2.yaml (+3 -0)
lm_eval/tasks/blimp_nl/binding_principle_a__c_command.yaml (+3 -0)
lm_eval/tasks/blimp_nl/binding_principle_a__monomorphemic.yaml (+3 -0)
lm_eval/tasks/blimp_nl/blimp_nl_group.yaml (+291 -0)
lm_eval/tasks/blimp_nl/complementive__ditransitive.yaml (+3 -0)
lm_eval/tasks/blimp_nl/complementive__intransitive.yaml (+3 -0)
lm_eval/tasks/blimp_nl/complementive__position_adverb.yaml (+3 -0)
lm_eval/tasks/blimp_nl/complementive__position_verb.yaml (+3 -0)
lm_eval/tasks/blimp_nl/complementive__transitive.yaml (+3 -0)
lm_eval/tasks/blimp_nl/crossing_dependencies__cross_dependency.yaml (+3 -0)
lm_eval/tasks/blimp_nl/determiners__geen_expletive.yaml (+3 -0)
lm_eval/tasks/blimp_nl/determiners__geen_scrambling_1.yaml (+3 -0)
lm_eval/tasks/blimp_nl/determiners__geen_scrambling_2.yaml (+3 -0)
lm_eval/tasks/blimp_nl/determiners__negative_polarity.yaml (+3 -0)
lm_eval/tasks/blimp_nl/extraposition__adjectival_adverbial.yaml (+3 -0)
lm_eval/tasks/blimp_nl/extraposition__adjectival_supplementive.yaml (+3 -0)
lm_eval/tasks/blimp_nl/extraposition__argument_nominal.yaml (+3 -0)
lm_eval/tasks/blimp_nl/finite_argument_clause__complementizer.yaml (+3 -0)
lm_eval/tasks/blimp_nl/finite_argument_clause__perception_dat.yaml (+3 -0)
lm_eval/tasks/blimp_nl/auxiliaries__semi_aspectual_1.yaml
0 → 100644
dataset_name: auxiliaries__semi_aspectual_1
include: _template_yaml
task: blimp_nl__auxiliaries__semi_aspectual_1
lm_eval/tasks/blimp_nl/auxiliaries__semi_aspectual_2.yaml
0 → 100644
dataset_name: auxiliaries__semi_aspectual_2
include: _template_yaml
task: blimp_nl__auxiliaries__semi_aspectual_2
lm_eval/tasks/blimp_nl/binding_principle_a__c_command.yaml
0 → 100644
dataset_name: binding_principle_a__c_command
include: _template_yaml
task: blimp_nl__binding_principle_a__c_command
lm_eval/tasks/blimp_nl/binding_principle_a__monomorphemic.yaml
0 → 100644
dataset_name: binding_principle_a__monomorphemic
include: _template_yaml
task: blimp_nl__binding_principle_a__monomorphemic
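The per-paradigm files in this commit share one three-key pattern: `dataset_name` is the paradigm name, `include` points at `_template_yaml`, and `task` is the paradigm name prefixed with `blimp_nl__`. A minimal sketch of how such files could be rendered from a list of names follows. The function name `render_task_yaml` is hypothetical and not part of the commit, and the contents of `_template_yaml` are not shown on this page, so the sketch does not touch them.

```python
def render_task_yaml(dataset_name: str, prefix: str = "blimp_nl") -> str:
    """Return the three-line YAML body used by each per-paradigm file.

    Hypothetical helper: it only mirrors the pattern visible in the diff.
    """
    return (
        f"dataset_name: {dataset_name}\n"
        "include: _template_yaml\n"
        f"task: {prefix}__{dataset_name}\n"
    )


if __name__ == "__main__":
    # Two paradigm names taken from the files shown above.
    for name in ["auxiliaries__semi_aspectual_1", "binding_principle_a__c_command"]:
        print(f"# {name}.yaml")
        print(render_task_yaml(name))
```

Keeping the shared settings in the template means each new paradigm needs only these three lines.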
lm_eval/tasks/blimp_nl/blimp_nl_group.yaml
0 → 100644
group: blimp_nl
task:
  - group: blimp_nl__adpositional_phrases
    task:
      - blimp_nl__adpositional_phrases__argument_r_extraction
      - blimp_nl__adpositional_phrases__argument_scrambling
    aggregate_metric_list:
      - metric: acc
        aggregation: mean
        weight_by_size: false
      - metric: acc_norm
        aggregation: mean
        weight_by_size: false
  - group: blimp_nl__adverbial_modification
    task:
      - blimp_nl__adverbial_modification__position_proform
      - blimp_nl__adverbial_modification__position_type
    aggregate_metric_list:
      - metric: acc
        aggregation: mean
        weight_by_size: false
      - metric: acc_norm
        aggregation: mean
        weight_by_size: false
  - group: blimp_nl__anaphor_agreement
    task:
      - blimp_nl__anaphor_agreement__number
      - blimp_nl__anaphor_agreement__person
    aggregate_metric_list:
      - metric: acc
        aggregation: mean
        weight_by_size: false
      - metric: acc_norm
        aggregation: mean
        weight_by_size: false
  - group: blimp_nl__argument_structure
    task:
      - blimp_nl__argument_structure__argument_number_ditransitive
      - blimp_nl__argument_structure__argument_number_in_transitive
      - blimp_nl__argument_structure__ditransitive_nomdat_1
      - blimp_nl__argument_structure__ditransitive_nomdat_2
      - blimp_nl__argument_structure__ditransitive_nomdat_3
      - blimp_nl__argument_structure__intransitive_unaccusative_1
      - blimp_nl__argument_structure__intransitive_unaccusative_2
      - blimp_nl__argument_structure__intransitive_unaccusative_3
    aggregate_metric_list:
      - metric: acc
        aggregation: mean
        weight_by_size: false
      - metric: acc_norm
        aggregation: mean
        weight_by_size: false
  - group: blimp_nl__auxiliaries
    task:
      - blimp_nl__auxiliaries__order_1
      - blimp_nl__auxiliaries__order_2
      - blimp_nl__auxiliaries__perfect
      - blimp_nl__auxiliaries__semi_aspectual_1
      - blimp_nl__auxiliaries__semi_aspectual_2
    aggregate_metric_list:
      - metric: acc
        aggregation: mean
        weight_by_size: false
      - metric: acc_norm
        aggregation: mean
        weight_by_size: false
  - group: blimp_nl__binding_principle_a
    task:
      - blimp_nl__binding_principle_a__c_command
      - blimp_nl__binding_principle_a__monomorphemic
    aggregate_metric_list:
      - metric: acc
        aggregation: mean
        weight_by_size: false
      - metric: acc_norm
        aggregation: mean
        weight_by_size: false
  - group: blimp_nl__complementive
    task:
      - blimp_nl__complementive__ditransitive
      - blimp_nl__complementive__intransitive
      - blimp_nl__complementive__position_adverb
      - blimp_nl__complementive__position_verb
      - blimp_nl__complementive__transitive
    aggregate_metric_list:
      - metric: acc
        aggregation: mean
        weight_by_size: false
      - metric: acc_norm
        aggregation: mean
        weight_by_size: false
  - group: blimp_nl__crossing_dependencies
    task:
      - blimp_nl__crossing_dependencies__cross_dependency
    aggregate_metric_list:
      - metric: acc
        aggregation: mean
        weight_by_size: false
      - metric: acc_norm
        aggregation: mean
        weight_by_size: false
  - group: blimp_nl__determiners
    task:
      - blimp_nl__determiners__geen_expletive
      - blimp_nl__determiners__geen_scrambling_1
      - blimp_nl__determiners__geen_scrambling_2
      - blimp_nl__determiners__negative_polarity
    aggregate_metric_list:
      - metric: acc
        aggregation: mean
        weight_by_size: false
      - metric: acc_norm
        aggregation: mean
        weight_by_size: false
  - group: blimp_nl__extraposition
    task:
      - blimp_nl__extraposition__adjectival_adverbial
      - blimp_nl__extraposition__adjectival_supplementive
      - blimp_nl__extraposition__argument_nominal
    aggregate_metric_list:
      - metric: acc
        aggregation: mean
        weight_by_size: false
      - metric: acc_norm
        aggregation: mean
        weight_by_size: false
  - group: blimp_nl__finite_argument_clause
    task:
      - blimp_nl__finite_argument_clause__complementizer
      - blimp_nl__finite_argument_clause__perception_dat
      - blimp_nl__finite_argument_clause__perception_of
      - blimp_nl__finite_argument_clause__position
      - blimp_nl__finite_argument_clause__sluicing_1
      - blimp_nl__finite_argument_clause__sluicing_2
    aggregate_metric_list:
      - metric: acc
        aggregation: mean
        weight_by_size: false
      - metric: acc_norm
        aggregation: mean
        weight_by_size: false
  - group: blimp_nl__infinitival_argument_clause
    task:
      - blimp_nl__infinitival_argument_clause__bare_verb_cluster
      - blimp_nl__infinitival_argument_clause__bare_verb_type_1
      - blimp_nl__infinitival_argument_clause__bare_verb_type_2
      - blimp_nl__infinitival_argument_clause__bare_verb_type_3
      - blimp_nl__infinitival_argument_clause__om_te
      - blimp_nl__infinitival_argument_clause__te_om_te_difference_1
      - blimp_nl__infinitival_argument_clause__te_om_te_difference_2
      - blimp_nl__infinitival_argument_clause__te_transparant_split
      - blimp_nl__infinitival_argument_clause__verb_type
    aggregate_metric_list:
      - metric: acc
        aggregation: mean
        weight_by_size: false
      - metric: acc_norm
        aggregation: mean
        weight_by_size: false
  - group: blimp_nl__nominalization
    task:
      - blimp_nl__nominalization__type_inf_1
      - blimp_nl__nominalization__type_inf_2
    aggregate_metric_list:
      - metric: acc
        aggregation: mean
        weight_by_size: false
      - metric: acc_norm
        aggregation: mean
        weight_by_size: false
  - group: blimp_nl__parasitic_gaps
    task:
      - blimp_nl__parasitic_gaps__scrambling
      - blimp_nl__parasitic_gaps__structure_type_1
      - blimp_nl__parasitic_gaps__structure_type_2
      - blimp_nl__parasitic_gaps__structure_type_3
    aggregate_metric_list:
      - metric: acc
        aggregation: mean
        weight_by_size: false
      - metric: acc_norm
        aggregation: mean
        weight_by_size: false
  - group: blimp_nl__passive
    task:
      - blimp_nl__passive__aci
      - blimp_nl__passive__ditransitive_1
      - blimp_nl__passive__ditransitive_2
      - blimp_nl__passive__impersonal
    aggregate_metric_list:
      - metric: acc
        aggregation: mean
        weight_by_size: false
      - metric: acc_norm
        aggregation: mean
        weight_by_size: false
  - group: blimp_nl__quantifiers
    task:
      - blimp_nl__quantifiers__universal_difference_agreement_plural
      - blimp_nl__quantifiers__universal_difference_agreement_singular
    aggregate_metric_list:
      - metric: acc
        aggregation: mean
        weight_by_size: false
      - metric: acc_norm
        aggregation: mean
        weight_by_size: false
  - group: blimp_nl__r_words
    task:
      - blimp_nl__r_words__adverbial
      - blimp_nl__r_words__weak_proform
    aggregate_metric_list:
      - metric: acc
        aggregation: mean
        weight_by_size: false
      - metric: acc_norm
        aggregation: mean
        weight_by_size: false
  - group: blimp_nl__relativization
    task:
      - blimp_nl__relativization__island
      - blimp_nl__relativization__pied_piping
      - blimp_nl__relativization__resumptive_prolepsis
    aggregate_metric_list:
      - metric: acc
        aggregation: mean
        weight_by_size: false
      - metric: acc_norm
        aggregation: mean
        weight_by_size: false
  - group: blimp_nl__topicalization
    task:
      - blimp_nl__topicalization__island
      - blimp_nl__topicalization__question_similarity_1
      - blimp_nl__topicalization__question_similarity_2
      - blimp_nl__topicalization__resumptive_prolepsis
    aggregate_metric_list:
      - metric: acc
        aggregation: mean
        weight_by_size: false
      - metric: acc_norm
        aggregation: mean
        weight_by_size: false
  - group: blimp_nl__verb_second
    task:
      - blimp_nl__verb_second__order_embedded
      - blimp_nl__verb_second__order_main
    aggregate_metric_list:
      - metric: acc
        aggregation: mean
        weight_by_size: false
      - metric: acc_norm
        aggregation: mean
        weight_by_size: false
  - group: blimp_nl__wh_movement
    task:
      - blimp_nl__wh_movement__filler_effect_gap
      - blimp_nl__wh_movement__filler_effect_no_gap
      - blimp_nl__wh_movement__hierarchy
      - blimp_nl__wh_movement__question_formation
      - blimp_nl__wh_movement__stranding_1
      - blimp_nl__wh_movement__stranding_2
    aggregate_metric_list:
      - metric: acc
        aggregation: mean
        weight_by_size: false
      - metric: acc_norm
        aggregation: mean
        weight_by_size: false
  - group: blimp_nl__wh_movement_restrictions
    task:
      - blimp_nl__wh_movement_restrictions__bridge_verb_1
      - blimp_nl__wh_movement_restrictions__bridge_verb_2
      - blimp_nl__wh_movement_restrictions__island_1
      - blimp_nl__wh_movement_restrictions__island_2
      - blimp_nl__wh_movement_restrictions__resumptive_prolepsis
      - blimp_nl__wh_movement_restrictions__superiority
    aggregate_metric_list:
      - metric: acc
        aggregation: mean
        weight_by_size: false
      - metric: acc_norm
        aggregation: mean
        weight_by_size: false
aggregate_metric_list:
  - metric: acc
    aggregation: mean
    weight_by_size: false
  - metric: acc_norm
    aggregation: mean
    weight_by_size: false
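Every group in this file aggregates `acc` and `acc_norm` with `aggregation: mean` and `weight_by_size: false`. As I understand the harness's group configuration, that means an unweighted (macro) mean in which each subtask counts equally regardless of how many items it has, whereas `weight_by_size: true` would weight each subtask by its item count. The sketch below illustrates the difference. The function `aggregate_mean` and the scores and sizes are hypothetical illustrations, not harness code or real results.

```python
from typing import Sequence


def aggregate_mean(
    scores: Sequence[float],
    sizes: Sequence[int],
    weight_by_size: bool = False,
) -> float:
    """Combine per-subtask scores into one group score.

    Hypothetical helper: with weight_by_size=False every subtask counts
    equally; with True each score is weighted by its number of items.
    """
    if not scores or len(scores) != len(sizes):
        raise ValueError("scores and sizes must be non-empty and equal in length")
    if weight_by_size:
        total = sum(sizes)
        return sum(s * n for s, n in zip(scores, sizes)) / total
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Made-up accuracies for two subtasks of different sizes.
    scores = [0.9, 0.5]
    sizes = [100, 300]
    print("unweighted:", aggregate_mean(scores, sizes))
    print("weighted:  ", aggregate_mean(scores, sizes, weight_by_size=True))
```

With these made-up inputs the unweighted mean is about 0.7 and the size-weighted mean about 0.6, so the choice matters whenever subtasks differ in size.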
lm_eval/tasks/blimp_nl/complementive__ditransitive.yaml
0 → 100644
dataset_name: complementive__ditransitive
include: _template_yaml
task: blimp_nl__complementive__ditransitive
lm_eval/tasks/blimp_nl/complementive__intransitive.yaml
0 → 100644
dataset_name: complementive__intransitive
include: _template_yaml
task: blimp_nl__complementive__intransitive
lm_eval/tasks/blimp_nl/complementive__position_adverb.yaml
0 → 100644
dataset_name: complementive__position_adverb
include: _template_yaml
task: blimp_nl__complementive__position_adverb
lm_eval/tasks/blimp_nl/complementive__position_verb.yaml
0 → 100644
dataset_name: complementive__position_verb
include: _template_yaml
task: blimp_nl__complementive__position_verb
lm_eval/tasks/blimp_nl/complementive__transitive.yaml
0 → 100644
dataset_name: complementive__transitive
include: _template_yaml
task: blimp_nl__complementive__transitive
lm_eval/tasks/blimp_nl/crossing_dependencies__cross_dependency.yaml
0 → 100644
dataset_name: crossing_dependencies__cross_dependency
include: _template_yaml
task: blimp_nl__crossing_dependencies__cross_dependency
lm_eval/tasks/blimp_nl/determiners__geen_expletive.yaml
0 → 100644
dataset_name: determiners__geen_expletive
include: _template_yaml
task: blimp_nl__determiners__geen_expletive
lm_eval/tasks/blimp_nl/determiners__geen_scrambling_1.yaml
0 → 100644
dataset_name: determiners__geen_scrambling_1
include: _template_yaml
task: blimp_nl__determiners__geen_scrambling_1
lm_eval/tasks/blimp_nl/determiners__geen_scrambling_2.yaml
0 → 100644
dataset_name: determiners__geen_scrambling_2
include: _template_yaml
task: blimp_nl__determiners__geen_scrambling_2
lm_eval/tasks/blimp_nl/determiners__negative_polarity.yaml
0 → 100644
dataset_name: determiners__negative_polarity
include: _template_yaml
task: blimp_nl__determiners__negative_polarity
lm_eval/tasks/blimp_nl/extraposition__adjectival_adverbial.yaml
0 → 100644
dataset_name: extraposition__adjectival_adverbial
include: _template_yaml
task: blimp_nl__extraposition__adjectival_adverbial
lm_eval/tasks/blimp_nl/extraposition__adjectival_supplementive.yaml
0 → 100644
dataset_name: extraposition__adjectival_supplementive
include: _template_yaml
task: blimp_nl__extraposition__adjectival_supplementive
lm_eval/tasks/blimp_nl/extraposition__argument_nominal.yaml
0 → 100644
dataset_name: extraposition__argument_nominal
include: _template_yaml
task: blimp_nl__extraposition__argument_nominal
lm_eval/tasks/blimp_nl/finite_argument_clause__complementizer.yaml
0 → 100644
dataset_name: finite_argument_clause__complementizer
include: _template_yaml
task: blimp_nl__finite_argument_clause__complementizer
lm_eval/tasks/blimp_nl/finite_argument_clause__perception_dat.yaml
0 → 100644
dataset_name: finite_argument_clause__perception_dat
include: _template_yaml
task: blimp_nl__finite_argument_clause__perception_dat
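Task names in this commit have the shape `blimp_nl__<phenomenon>__<paradigm>`, and the subgroups in `blimp_nl_group.yaml` are named `blimp_nl__<phenomenon>`. The sketch below shows one way to recover that grouping from a flat list of task names by splitting on the double underscore. The helper `group_by_phenomenon` is hypothetical and is not claimed to be how the group file was actually produced.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_phenomenon(task_names: Iterable[str]) -> Dict[str, List[str]]:
    """Map each subgroup name (prefix__phenomenon) to its task names.

    Hypothetical helper: assumes every name has the form
    prefix__phenomenon__paradigm, as the names in this commit do.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in task_names:
        # Split at most twice so the paradigm part stays in one piece.
        prefix, phenomenon, _paradigm = name.split("__", 2)
        groups[f"{prefix}__{phenomenon}"].append(name)
    return dict(groups)


if __name__ == "__main__":
    names = [
        "blimp_nl__complementive__transitive",
        "blimp_nl__complementive__position_verb",
        "blimp_nl__determiners__geen_expletive",
    ]
    for group, members in group_by_phenomenon(names).items():
        print(group, members)
```

A check like this could also flag a per-paradigm file whose `task` name does not appear under any subgroup.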