Commit fa837646 authored by LSinev, committed by GitHub

Fix typos in multiple places (#2244)

The ACLUE BibTeX typo was reported to ACL Anthology and is fixed here, since the title in the PDF is correct.
parent 259b756a
...@@ -69,7 +69,7 @@ ...@@ -69,7 +69,7 @@
| [mgsm](mgsm/README.md) | Benchmark of multilingual grade-school math problems. | Spanish, French, German, Russian, Chinese, Japanese, Thai, Swahili, Bengali, Telugu | | [mgsm](mgsm/README.md) | Benchmark of multilingual grade-school math problems. | Spanish, French, German, Russian, Chinese, Japanese, Thai, Swahili, Bengali, Telugu |
| [minerva_math](minerva_math/README.md) | Mathematics-focused tasks requiring numerical reasoning and problem-solving skills. | English | | [minerva_math](minerva_math/README.md) | Mathematics-focused tasks requiring numerical reasoning and problem-solving skills. | English |
| mmlu | Massive Multitask Language Understanding benchmark for broad domain language evaluation. Several variants are supported. | English | | mmlu | Massive Multitask Language Understanding benchmark for broad domain language evaluation. Several variants are supported. | English |
| [mmlusr](mmlusr/README.md) | Variation of MMLU designed to be more rigourous. | English | | [mmlusr](mmlusr/README.md) | Variation of MMLU designed to be more rigorous. | English |
| model_written_evals | Evaluation tasks auto-generated for evaluating a collection of AI Safety concerns. | | | model_written_evals | Evaluation tasks auto-generated for evaluating a collection of AI Safety concerns. | |
| [mutual](mutual/README.md) | A retrieval-based dataset for multi-turn dialogue reasoning. | English | | [mutual](mutual/README.md) | A retrieval-based dataset for multi-turn dialogue reasoning. | English |
| [nq_open](nq_open/README.md) | Open domain question answering tasks based on the Natural Questions dataset. | English | | [nq_open](nq_open/README.md) | Open domain question answering tasks based on the Natural Questions dataset. | English |
......
...@@ -492,7 +492,7 @@ class TaskManager: ...@@ -492,7 +492,7 @@ class TaskManager:
"`group` and `group_alias` keys in tasks' configs will no longer be used in the next release of lm-eval. " "`group` and `group_alias` keys in tasks' configs will no longer be used in the next release of lm-eval. "
"`tag` will be used to allow to call a collection of tasks just like `group`. " "`tag` will be used to allow to call a collection of tasks just like `group`. "
"`group` will be removed in order to not cause confusion with the new ConfigurableGroup " "`group` will be removed in order to not cause confusion with the new ConfigurableGroup "
"which will be the offical way to create groups with addition of group-wide configuations." "which will be the official way to create groups with addition of group-wide configurations."
) )
print_info = False print_info = False
# attr = "tag" # attr = "tag"
......
...@@ -14,7 +14,7 @@ Homepage: https://github.com/isen-zhang/ACLUE ...@@ -14,7 +14,7 @@ Homepage: https://github.com/isen-zhang/ACLUE
```bibtex ```bibtex
@inproceedings{zhang-li-2023-large, @inproceedings{zhang-li-2023-large,
title = "Can Large Langauge Model Comprehend {A}ncient {C}hinese? A Preliminary Test on {ACLUE}", title = "Can Large Language Model Comprehend {A}ncient {C}hinese? A Preliminary Test on {ACLUE}",
author = "Zhang, Yixuan and Li, Haonan", author = "Zhang, Yixuan and Li, Haonan",
booktitle = "Proceedings of the Ancient Language Processing Workshop", booktitle = "Proceedings of the Ancient Language Processing Workshop",
month = sep, month = sep,
......
...@@ -16,8 +16,8 @@ Homepage: https://eqbench.com/ ...@@ -16,8 +16,8 @@ Homepage: https://eqbench.com/
NOTE: There are some key differences between the lm-evaluation-harness version and the implementation described in the EQ-Bench paper (These have been OK'd by the author): NOTE: There are some key differences between the lm-evaluation-harness version and the implementation described in the EQ-Bench paper (These have been OK'd by the author):
- The lm-eval version uses the EQ-Bench v2 test set (171 questions) and score calculation. It does not incorporate the revision part of the prompt, as per v2.1 (https://github.com/EQ-bench/EQ-Bench) - The lm-eval version uses the EQ-Bench v2 test set (171 questions) and score calculation. It does not incorporate the revision part of the prompt, as per v2.1 (https://github.com/EQ-bench/EQ-Bench)
- No retries in lm-eval version (EQ-Bench pipeline retries with successively higher temps if it encounters unparseable answers) - No retries in lm-eval version (EQ-Bench pipeline retries with successively higher temps if it encounters unparsable answers)
- In the original implementation, unparseable answers are excluded from the final score, and 83% of answers have to be parseable or a fail is returned. The lm-eval version instead assigns 0 to unparsable answers and has no fail criteria. So for lower performing models, there may be differences with the EQ-Bench leaderboard. - In the original implementation, unparsable answers are excluded from the final score, and 83% of answers have to be parseable or a fail is returned. The lm-eval version instead assigns 0 to unparsable answers and has no fail criteria. So for lower performing models, there may be differences with the EQ-Bench leaderboard.
### Citation ### Citation
......
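The scoring difference described in the EQ-Bench note above can be illustrated with a minimal sketch. The function names, the use of `None` to mark an unparsable answer, and the plain mean used as the aggregate are all assumptions made for illustration; this is not the actual EQ-Bench or lm-eval score calculation, only the handling of unparsable answers that the note describes.

```python
from typing import Optional, Sequence


def original_style(
    scores: Sequence[Optional[float]], min_parseable: float = 0.83
) -> Optional[float]:
    """Hypothetical sketch: drop unparsable answers (None) from the average.

    Returns None (a "fail") when fewer than `min_parseable` of the
    answers could be parsed.
    """
    parsed = [s for s in scores if s is not None]
    if not scores or len(parsed) / len(scores) < min_parseable:
        return None
    return sum(parsed) / len(parsed)


def lm_eval_style(scores: Sequence[Optional[float]]) -> float:
    """Hypothetical sketch: score unparsable answers as 0, never fail."""
    if not scores:
        return 0.0
    return sum(0.0 if s is None else s for s in scores) / len(scores)


# Nine parsed answers scoring 50.0 each, plus one unparsable answer.
scores = [50.0] * 9 + [None]
print(original_style(scores))  # 50.0 (the unparsable answer is excluded)
print(lm_eval_style(scores))   # 45.0 (the unparsable answer counts as 0)
```

Under these assumptions, the gap between the two numbers grows with the share of unparsable answers, which is consistent with the note that lower performing models may differ from the leaderboard.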
...@@ -78,7 +78,7 @@ _ENDING_OPTIONS = ("Any other questions?", "Is there anything else I can help wi ...@@ -78,7 +78,7 @@ _ENDING_OPTIONS = ("Any other questions?", "Is there anything else I can help wi
# The number of highlighted sections. # The number of highlighted sections.
_NUM_HIGHLIGHTED_SECTIONS = 4 _NUM_HIGHLIGHTED_SECTIONS = 4
# The section spliter. # The section splitter.
_SECTION_SPLITER = ("Section", "SECTION") _SECTION_SPLITER = ("Section", "SECTION")
# The number of sections. # The number of sections.
...@@ -153,7 +153,7 @@ class ResponseLanguageChecker(Instruction): ...@@ -153,7 +153,7 @@ class ResponseLanguageChecker(Instruction):
return self._description_pattern.format(language=_LANGUAGES[self._language]) return self._description_pattern.format(language=_LANGUAGES[self._language])
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return {"language": self._language} return {"language": self._language}
def get_instruction_args_keys(self): def get_instruction_args_keys(self):
...@@ -223,7 +223,7 @@ class NumberOfSentences(Instruction): ...@@ -223,7 +223,7 @@ class NumberOfSentences(Instruction):
) )
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return { return {
"num_sentences": self._num_sentences_threshold, "num_sentences": self._num_sentences_threshold,
"relation": self._comparison_relation, "relation": self._comparison_relation,
...@@ -276,7 +276,7 @@ class PlaceholderChecker(Instruction): ...@@ -276,7 +276,7 @@ class PlaceholderChecker(Instruction):
return self._description_pattern.format(num_placeholders=self._num_placeholders) return self._description_pattern.format(num_placeholders=self._num_placeholders)
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return {"num_placeholders": self._num_placeholders} return {"num_placeholders": self._num_placeholders}
def get_instruction_args_keys(self): def get_instruction_args_keys(self):
...@@ -323,7 +323,7 @@ class BulletListChecker(Instruction): ...@@ -323,7 +323,7 @@ class BulletListChecker(Instruction):
return self._description_pattern.format(num_bullets=self._num_bullets) return self._description_pattern.format(num_bullets=self._num_bullets)
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return {"num_bullets": self._num_bullets} return {"num_bullets": self._num_bullets}
def get_instruction_args_keys(self): def get_instruction_args_keys(self):
...@@ -362,7 +362,7 @@ class ConstrainedResponseChecker(Instruction): ...@@ -362,7 +362,7 @@ class ConstrainedResponseChecker(Instruction):
) )
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return None return None
def get_instruction_args_keys(self): def get_instruction_args_keys(self):
...@@ -393,7 +393,7 @@ class ConstrainedStartChecker(Instruction): ...@@ -393,7 +393,7 @@ class ConstrainedStartChecker(Instruction):
"""Build the instruction description. """Build the instruction description.
Args: Args:
starter: A string representing the keyward that the response should start starter: A string representing the keyword that the response should start
with. with.
Returns: Returns:
...@@ -409,7 +409,7 @@ class ConstrainedStartChecker(Instruction): ...@@ -409,7 +409,7 @@ class ConstrainedStartChecker(Instruction):
return self._description_pattern.format(starter=self._starter) return self._description_pattern.format(starter=self._starter)
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return {"starter": self._starter} return {"starter": self._starter}
def get_instruction_args_keys(self): def get_instruction_args_keys(self):
...@@ -458,7 +458,7 @@ class HighlightSectionChecker(Instruction): ...@@ -458,7 +458,7 @@ class HighlightSectionChecker(Instruction):
return self._description_pattern.format(num_highlights=self._num_highlights) return self._description_pattern.format(num_highlights=self._num_highlights)
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return {"num_highlights": self._num_highlights} return {"num_highlights": self._num_highlights}
def get_instruction_args_keys(self): def get_instruction_args_keys(self):
...@@ -469,12 +469,12 @@ class HighlightSectionChecker(Instruction): ...@@ -469,12 +469,12 @@ class HighlightSectionChecker(Instruction):
"""Checks if the number of highlighted sections meets the requirement. """Checks if the number of highlighted sections meets the requirement.
Args: Args:
value: a string repesenting the response. The response is expected to value: a string representing the response. The response is expected to
contain highlighted sections in the format of *highlighted*. contain highlighted sections in the format of *highlighted*.
Returns: Returns:
True if the actual number of highlighted sections in the format of True if the actual number of highlighted sections in the format of
*highlighed sections* meets the minimum requirement; otherwise False. *highlighted sections* meets the minimum requirement; otherwise False.
""" """
num_highlights = 0 num_highlights = 0
highlights = re.findall(r"\*[^\n\*]*\*", value) highlights = re.findall(r"\*[^\n\*]*\*", value)
...@@ -529,7 +529,7 @@ class SectionChecker(Instruction): ...@@ -529,7 +529,7 @@ class SectionChecker(Instruction):
) )
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return { return {
"section_spliter": self._section_spliter, "section_spliter": self._section_spliter,
"num_sections": self._num_sections, "num_sections": self._num_sections,
...@@ -582,7 +582,7 @@ class ParagraphChecker(Instruction): ...@@ -582,7 +582,7 @@ class ParagraphChecker(Instruction):
return self._description_pattern.format(num_paragraphs=self._num_paragraphs) return self._description_pattern.format(num_paragraphs=self._num_paragraphs)
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return {"num_paragraphs": self._num_paragraphs} return {"num_paragraphs": self._num_paragraphs}
def get_instruction_args_keys(self): def get_instruction_args_keys(self):
...@@ -642,7 +642,7 @@ class PostscriptChecker(Instruction): ...@@ -642,7 +642,7 @@ class PostscriptChecker(Instruction):
return self._description_pattern.format(postscript=self._postscript_marker) return self._description_pattern.format(postscript=self._postscript_marker)
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return {"postscript_marker": self._postscript_marker} return {"postscript_marker": self._postscript_marker}
def get_instruction_args_keys(self): def get_instruction_args_keys(self):
...@@ -672,7 +672,7 @@ class PostscriptChecker(Instruction): ...@@ -672,7 +672,7 @@ class PostscriptChecker(Instruction):
class RephraseChecker(Instruction): class RephraseChecker(Instruction):
"""Checks the repharse.""" """Checks the rephrase."""
def build_description(self, *, original_message): def build_description(self, *, original_message):
"""Build the instruction description. """Build the instruction description.
...@@ -701,7 +701,7 @@ class RephraseChecker(Instruction): ...@@ -701,7 +701,7 @@ class RephraseChecker(Instruction):
return self._description return self._description
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return {"original_message": self._reference_without_change} return {"original_message": self._reference_without_change}
def get_instruction_args_keys(self): def get_instruction_args_keys(self):
...@@ -766,7 +766,7 @@ class KeywordChecker(Instruction): ...@@ -766,7 +766,7 @@ class KeywordChecker(Instruction):
return self._description_pattern.format(keywords=self._keywords) return self._description_pattern.format(keywords=self._keywords)
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return {"keywords": self._keywords} return {"keywords": self._keywords}
def get_instruction_args_keys(self): def get_instruction_args_keys(self):
...@@ -831,7 +831,7 @@ class KeywordFrequencyChecker(Instruction): ...@@ -831,7 +831,7 @@ class KeywordFrequencyChecker(Instruction):
) )
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return { return {
"keyword": self._keyword, "keyword": self._keyword,
"frequency": self._frequency, "frequency": self._frequency,
...@@ -894,7 +894,7 @@ class NumberOfWords(Instruction): ...@@ -894,7 +894,7 @@ class NumberOfWords(Instruction):
) )
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return {"num_words": self._num_words, "relation": self._comparison_relation} return {"num_words": self._num_words, "relation": self._comparison_relation}
def get_instruction_args_keys(self): def get_instruction_args_keys(self):
...@@ -922,7 +922,7 @@ class JsonFormat(Instruction): ...@@ -922,7 +922,7 @@ class JsonFormat(Instruction):
return self._description_pattern return self._description_pattern
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return None return None
def get_instruction_args_keys(self): def get_instruction_args_keys(self):
...@@ -996,7 +996,7 @@ class ParagraphFirstWordCheck(Instruction): ...@@ -996,7 +996,7 @@ class ParagraphFirstWordCheck(Instruction):
) )
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return { return {
"num_paragraphs": self._num_paragraphs, "num_paragraphs": self._num_paragraphs,
"nth_paragraph": self._nth_paragraph, "nth_paragraph": self._nth_paragraph,
...@@ -1089,7 +1089,7 @@ class KeySentenceChecker(Instruction): ...@@ -1089,7 +1089,7 @@ class KeySentenceChecker(Instruction):
) )
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return { return {
"num_sentences": self._num_sentences, "num_sentences": self._num_sentences,
"key_sentences": list(self._key_sentences), "key_sentences": list(self._key_sentences),
...@@ -1117,7 +1117,7 @@ class ForbiddenWords(Instruction): ...@@ -1117,7 +1117,7 @@ class ForbiddenWords(Instruction):
"""Build the instruction description. """Build the instruction description.
Args: Args:
forbidden_words: A sequences of strings respresenting words that are not forbidden_words: A sequences of strings representing words that are not
allowed in the response. allowed in the response.
Returns: Returns:
...@@ -1138,7 +1138,7 @@ class ForbiddenWords(Instruction): ...@@ -1138,7 +1138,7 @@ class ForbiddenWords(Instruction):
return self._description_pattern.format(forbidden_words=self._forbidden_words) return self._description_pattern.format(forbidden_words=self._forbidden_words)
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return {"forbidden_words": self._forbidden_words} return {"forbidden_words": self._forbidden_words}
def get_instruction_args_keys(self): def get_instruction_args_keys(self):
...@@ -1188,7 +1188,7 @@ class RephraseParagraph(Instruction): ...@@ -1188,7 +1188,7 @@ class RephraseParagraph(Instruction):
) )
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return { return {
"original_paragraph": self._original_paragraph, "original_paragraph": self._original_paragraph,
"low": self._low, "low": self._low,
...@@ -1225,7 +1225,7 @@ class TwoResponsesChecker(Instruction): ...@@ -1225,7 +1225,7 @@ class TwoResponsesChecker(Instruction):
return self._description_pattern return self._description_pattern
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return None return None
def get_instruction_args_keys(self): def get_instruction_args_keys(self):
......
...@@ -78,7 +78,7 @@ _ENDING_OPTIONS = ("Any other questions?", "Is there anything else I can help wi ...@@ -78,7 +78,7 @@ _ENDING_OPTIONS = ("Any other questions?", "Is there anything else I can help wi
# The number of highlighted sections. # The number of highlighted sections.
_NUM_HIGHLIGHTED_SECTIONS = 4 _NUM_HIGHLIGHTED_SECTIONS = 4
# The section spliter. # The section splitter.
_SECTION_SPLITER = ("Section", "SECTION") _SECTION_SPLITER = ("Section", "SECTION")
# The number of sections. # The number of sections.
...@@ -153,7 +153,7 @@ class ResponseLanguageChecker(Instruction): ...@@ -153,7 +153,7 @@ class ResponseLanguageChecker(Instruction):
return self._description_pattern.format(language=_LANGUAGES[self._language]) return self._description_pattern.format(language=_LANGUAGES[self._language])
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return {"language": self._language} return {"language": self._language}
def get_instruction_args_keys(self): def get_instruction_args_keys(self):
...@@ -223,7 +223,7 @@ class NumberOfSentences(Instruction): ...@@ -223,7 +223,7 @@ class NumberOfSentences(Instruction):
) )
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return { return {
"num_sentences": self._num_sentences_threshold, "num_sentences": self._num_sentences_threshold,
"relation": self._comparison_relation, "relation": self._comparison_relation,
...@@ -276,7 +276,7 @@ class PlaceholderChecker(Instruction): ...@@ -276,7 +276,7 @@ class PlaceholderChecker(Instruction):
return self._description_pattern.format(num_placeholders=self._num_placeholders) return self._description_pattern.format(num_placeholders=self._num_placeholders)
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return {"num_placeholders": self._num_placeholders} return {"num_placeholders": self._num_placeholders}
def get_instruction_args_keys(self): def get_instruction_args_keys(self):
...@@ -323,7 +323,7 @@ class BulletListChecker(Instruction): ...@@ -323,7 +323,7 @@ class BulletListChecker(Instruction):
return self._description_pattern.format(num_bullets=self._num_bullets) return self._description_pattern.format(num_bullets=self._num_bullets)
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return {"num_bullets": self._num_bullets} return {"num_bullets": self._num_bullets}
def get_instruction_args_keys(self): def get_instruction_args_keys(self):
...@@ -362,7 +362,7 @@ class ConstrainedResponseChecker(Instruction): ...@@ -362,7 +362,7 @@ class ConstrainedResponseChecker(Instruction):
) )
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return None return None
def get_instruction_args_keys(self): def get_instruction_args_keys(self):
...@@ -393,7 +393,7 @@ class ConstrainedStartChecker(Instruction): ...@@ -393,7 +393,7 @@ class ConstrainedStartChecker(Instruction):
"""Build the instruction description. """Build the instruction description.
Args: Args:
starter: A string representing the keyward that the response should start starter: A string representing the keyword that the response should start
with. with.
Returns: Returns:
...@@ -409,7 +409,7 @@ class ConstrainedStartChecker(Instruction): ...@@ -409,7 +409,7 @@ class ConstrainedStartChecker(Instruction):
return self._description_pattern.format(starter=self._starter) return self._description_pattern.format(starter=self._starter)
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return {"starter": self._starter} return {"starter": self._starter}
def get_instruction_args_keys(self): def get_instruction_args_keys(self):
...@@ -458,7 +458,7 @@ class HighlightSectionChecker(Instruction): ...@@ -458,7 +458,7 @@ class HighlightSectionChecker(Instruction):
return self._description_pattern.format(num_highlights=self._num_highlights) return self._description_pattern.format(num_highlights=self._num_highlights)
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return {"num_highlights": self._num_highlights} return {"num_highlights": self._num_highlights}
def get_instruction_args_keys(self): def get_instruction_args_keys(self):
...@@ -469,12 +469,12 @@ class HighlightSectionChecker(Instruction): ...@@ -469,12 +469,12 @@ class HighlightSectionChecker(Instruction):
"""Checks if the number of highlighted sections meets the requirement. """Checks if the number of highlighted sections meets the requirement.
Args: Args:
value: a string repesenting the response. The response is expected to value: a string representing the response. The response is expected to
contain highlighted sections in the format of *highlighted*. contain highlighted sections in the format of *highlighted*.
Returns: Returns:
True if the actual number of highlighted sections in the format of True if the actual number of highlighted sections in the format of
*highlighed sections* meets the minimum requirement; otherwise False. *highlighted sections* meets the minimum requirement; otherwise False.
""" """
num_highlights = 0 num_highlights = 0
highlights = re.findall(r"\*[^\n\*]*\*", value) highlights = re.findall(r"\*[^\n\*]*\*", value)
...@@ -529,7 +529,7 @@ class SectionChecker(Instruction): ...@@ -529,7 +529,7 @@ class SectionChecker(Instruction):
) )
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return { return {
"section_spliter": self._section_spliter, "section_spliter": self._section_spliter,
"num_sections": self._num_sections, "num_sections": self._num_sections,
...@@ -582,7 +582,7 @@ class ParagraphChecker(Instruction): ...@@ -582,7 +582,7 @@ class ParagraphChecker(Instruction):
return self._description_pattern.format(num_paragraphs=self._num_paragraphs) return self._description_pattern.format(num_paragraphs=self._num_paragraphs)
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return {"num_paragraphs": self._num_paragraphs} return {"num_paragraphs": self._num_paragraphs}
def get_instruction_args_keys(self): def get_instruction_args_keys(self):
...@@ -642,7 +642,7 @@ class PostscriptChecker(Instruction): ...@@ -642,7 +642,7 @@ class PostscriptChecker(Instruction):
return self._description_pattern.format(postscript=self._postscript_marker) return self._description_pattern.format(postscript=self._postscript_marker)
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return {"postscript_marker": self._postscript_marker} return {"postscript_marker": self._postscript_marker}
def get_instruction_args_keys(self): def get_instruction_args_keys(self):
...@@ -672,7 +672,7 @@ class PostscriptChecker(Instruction): ...@@ -672,7 +672,7 @@ class PostscriptChecker(Instruction):
class RephraseChecker(Instruction): class RephraseChecker(Instruction):
"""Checks the repharse.""" """Checks the rephrase."""
def build_description(self, *, original_message): def build_description(self, *, original_message):
"""Build the instruction description. """Build the instruction description.
...@@ -701,7 +701,7 @@ class RephraseChecker(Instruction): ...@@ -701,7 +701,7 @@ class RephraseChecker(Instruction):
return self._description return self._description
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return {"original_message": self._reference_without_change} return {"original_message": self._reference_without_change}
def get_instruction_args_keys(self): def get_instruction_args_keys(self):
...@@ -766,7 +766,7 @@ class KeywordChecker(Instruction): ...@@ -766,7 +766,7 @@ class KeywordChecker(Instruction):
return self._description_pattern.format(keywords=self._keywords) return self._description_pattern.format(keywords=self._keywords)
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return {"keywords": self._keywords} return {"keywords": self._keywords}
def get_instruction_args_keys(self): def get_instruction_args_keys(self):
...@@ -831,7 +831,7 @@ class KeywordFrequencyChecker(Instruction): ...@@ -831,7 +831,7 @@ class KeywordFrequencyChecker(Instruction):
) )
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return { return {
"keyword": self._keyword, "keyword": self._keyword,
"frequency": self._frequency, "frequency": self._frequency,
...@@ -894,7 +894,7 @@ class NumberOfWords(Instruction): ...@@ -894,7 +894,7 @@ class NumberOfWords(Instruction):
) )
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return {"num_words": self._num_words, "relation": self._comparison_relation} return {"num_words": self._num_words, "relation": self._comparison_relation}
def get_instruction_args_keys(self): def get_instruction_args_keys(self):
...@@ -922,7 +922,7 @@ class JsonFormat(Instruction): ...@@ -922,7 +922,7 @@ class JsonFormat(Instruction):
return self._description_pattern return self._description_pattern
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return None return None
def get_instruction_args_keys(self): def get_instruction_args_keys(self):
...@@ -996,7 +996,7 @@ class ParagraphFirstWordCheck(Instruction): ...@@ -996,7 +996,7 @@ class ParagraphFirstWordCheck(Instruction):
) )
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return { return {
"num_paragraphs": self._num_paragraphs, "num_paragraphs": self._num_paragraphs,
"nth_paragraph": self._nth_paragraph, "nth_paragraph": self._nth_paragraph,
...@@ -1089,7 +1089,7 @@ class KeySentenceChecker(Instruction): ...@@ -1089,7 +1089,7 @@ class KeySentenceChecker(Instruction):
) )
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return { return {
"num_sentences": self._num_sentences, "num_sentences": self._num_sentences,
"key_sentences": list(self._key_sentences), "key_sentences": list(self._key_sentences),
...@@ -1117,7 +1117,7 @@ class ForbiddenWords(Instruction): ...@@ -1117,7 +1117,7 @@ class ForbiddenWords(Instruction):
"""Build the instruction description. """Build the instruction description.
Args: Args:
forbidden_words: A sequences of strings respresenting words that are not forbidden_words: A sequences of strings representing words that are not
allowed in the response. allowed in the response.
Returns: Returns:
...@@ -1138,7 +1138,7 @@ class ForbiddenWords(Instruction): ...@@ -1138,7 +1138,7 @@ class ForbiddenWords(Instruction):
return self._description_pattern.format(forbidden_words=self._forbidden_words) return self._description_pattern.format(forbidden_words=self._forbidden_words)
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return {"forbidden_words": self._forbidden_words} return {"forbidden_words": self._forbidden_words}
def get_instruction_args_keys(self): def get_instruction_args_keys(self):
...@@ -1188,7 +1188,7 @@ class RephraseParagraph(Instruction): ...@@ -1188,7 +1188,7 @@ class RephraseParagraph(Instruction):
) )
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return { return {
"original_paragraph": self._original_paragraph, "original_paragraph": self._original_paragraph,
"low": self._low, "low": self._low,
...@@ -1225,7 +1225,7 @@ class TwoResponsesChecker(Instruction): ...@@ -1225,7 +1225,7 @@ class TwoResponsesChecker(Instruction):
return self._description_pattern return self._description_pattern
def get_instruction_args(self): def get_instruction_args(self):
"""Returns the keyward args of `build_description`.""" """Returns the keyword args of `build_description`."""
return None return None
def get_instruction_args_keys(self): def get_instruction_args_keys(self):
......
...@@ -11,7 +11,7 @@ Homepage: https://www.scrolls-benchmark.com/ ...@@ -11,7 +11,7 @@ Homepage: https://www.scrolls-benchmark.com/
Since SCROLLS tasks are generally longer than the maximum sequence length of many models, Since SCROLLS tasks are generally longer than the maximum sequence length of many models,
it is possible to create "subset" tasks that contain only those samples whose tokenized length it is possible to create "subset" tasks that contain only those samples whose tokenized length
is less than some pre-defined limit. For example, to create a subset of "Qasper" that would is less than some pre-defined limit. For example, to create a subset of "Qasper" that would
be suitable for a model using the GPTNeoX tokenizer and a 4K maximium sequence length: be suitable for a model using the GPTNeoX tokenizer and a 4K maximum sequence length:
``` ```
class QasperGPTNeoX4K(Qasper): class QasperGPTNeoX4K(Qasper):
......
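The idea behind such "subset" tasks, keeping only samples whose tokenized length is below a limit, can be sketched independently of the harness. Everything here is hypothetical: `filter_by_token_length` is an illustrative helper, and whitespace splitting stands in for a real tokenizer such as GPTNeoX's. It does not reproduce how the SCROLLS task classes implement the subset.

```python
from typing import Callable, List, Sequence


def filter_by_token_length(
    samples: Sequence[str],
    tokenize: Callable[[str], List[str]],
    max_tokens: int,
) -> List[str]:
    """Keep only the samples whose tokenized length is below `max_tokens`."""
    return [s for s in samples if len(tokenize(s)) < max_tokens]


samples = [
    "short input",
    "a much longer input with many more words in it",
]

# Whitespace splitting is a stand-in tokenizer (assumption); a real subset
# would count tokens with the model's own tokenizer and a limit such as 4096.
kept = filter_by_token_length(samples, str.split, max_tokens=4)
print(kept)  # ['short input']
```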
...@@ -439,7 +439,7 @@ class GovReport(_SCROLLSSummaryTask): ...@@ -439,7 +439,7 @@ class GovReport(_SCROLLSSummaryTask):
Note: The average length of the reference summaries is ~3,000 Note: The average length of the reference summaries is ~3,000
characters, or ~600 tokens as tokenized by GPT-NeoX. For causal models, characters, or ~600 tokens as tokenized by GPT-NeoX. For causal models,
it is recommended to set `max_gen_toks` sufficently large (e.g. 1024) it is recommended to set `max_gen_toks` sufficiently large (e.g. 1024)
to allow a full summary to be generated. to allow a full summary to be generated.
""" """
......
...@@ -11,7 +11,7 @@ from tqdm import tqdm ...@@ -11,7 +11,7 @@ from tqdm import tqdm
# Copy from https://github.com/iKala/ievals/blob/main/ievals/settings.py # Copy from https://github.com/iKala/ievals/blob/main/ievals/settings.py
# from TMMLU+ offical example # from TMMLU+ official example
categories = { categories = {
"STEM": [ "STEM": [
"physics", "physics",
......