Early stop bug of greedy_until (primary_until should be a list of str)

I discovered that the accuracy of all models (e.g., llama7b, llama13b, starcoder) in the 'gsm8k-cot' task was 0%. After a thorough investigation, I realized that the generated text for each question was causing an early stop, preventing the 'regex_pattern' from finding any answers. This issue was caused by an incorrect assignment of the 'primary_until' variable in the 'greedy_until' function. Specifically, 'primary_until' should be a list of strings instead of a single string, as the 'stop_sequences' parameter in the 'stop_sequences_criteria' function requires a List[str]. Once I assigned 'primary_until' to '[until[0]]', the accuracy of llama7b in the 'gsm8k-cot' task increased to 1.67%.

Early stop bug of greedy_until (primary_until should be a list of str)
I discovered that the accuracy of all models (e.g., llama7b, llama13b, starcoder) in the 'gsm8k-cot' task was 0%. After a thorough investigation, I realized that the generated text for each question was causing an early stop, preventing the 'regex_pattern' from finding any answers. This issue was caused by an incorrect assignment of the 'primary_until' variable in the 'greedy_until' function. Specifically, 'primary_until' should be a list of strings instead of a single string, as the 'stop_sequences' parameter in the 'stop_sequences_criteria' function requires a List[str]. Once I assigned 'primary_until' to '[until[0]]', the accuracy of llama7b in the 'gsm8k-cot' task increased to 1.67%.
984f8793 · ZZR0 · GitHub · 2820042d · 984f8793
Unverified Commit 984f8793 authored Jul 24, 2023 by ZZR0 Committed by GitHub Jul 24, 2023
Hide whitespace changes
Inline Side-by-side

Showing with 1 addition and 1 deletion

lm_eval/models/huggingface.py lm_eval/models/huggingface.py +1 -1

No files found.
--- a/lm_eval/models/huggingface.py
+++ b/lm_eval/models/huggingface.py
@@ -723,7 +723,7 @@ class HFLM(LM):
                else:
                    max_gen_toks = self.max_gen_toks
                # first stop sequence is used to halt generation upon encountering
-                (primary_until) = until[0]
+                primary_until = [until[0]]

                # set the max length in tokens of inputs ("context_enc")
                if self.AUTO_MODEL_CLASS == transformers.AutoModelForCausalLM: