gaoqiong / lm-evaluation-harness
Commit 9baa294c
Authored May 11, 2022 by jon-tow

Merge branch 'master' of https://github.com/EleutherAI/lm-evaluation-harness

Parents: 4587b718, e0396a4e
Showing 2 changed files with 4 additions and 5 deletions

lm_eval/evaluator.py (+3 -4)
scripts/clean_training_data/janitor_util.cpp (+1 -1)
lm_eval/evaluator.py
import collections
import itertools
import pathlib
import numpy as np
import random
import lm_eval.metrics
import lm_eval.models
import lm_eval.tasks
import lm_eval.base
import lm_eval.decontamination
import numpy as np
from lm_eval.utils import positional_deprecated, run_task_tests
from lm_eval.decontamination.decontaminate import get_train_overlap
@positional_deprecated
...
@@ -229,6 +226,8 @@ def evaluate(
    # Compare all tasks/sets at once to ensure a single training set scan
    if decontaminate:
        from lm_eval.decontamination.decontaminate import get_train_overlap
        print("Finding train/test overlap, please wait...")
        overlaps = get_train_overlap(
            docs_for_decontamination, decontamination_ngrams_path, limit
...
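
The hunk above places the `get_train_overlap` import inside the `if decontaminate:` branch. A minimal Python sketch of that deferred-import pattern follows. It is an illustration under stated assumptions, not the harness code: `evaluate_sketch` and its threshold are hypothetical, and the standard-library `difflib` stands in for the optional decontamination dependency.

```python
def evaluate_sketch(docs, train_text, decontaminate=False):
    """Hypothetical stand-in for an evaluation entry point."""
    overlaps = None
    if decontaminate:
        # Deferred import: the module is only loaded when this branch runs,
        # so callers that never decontaminate do not pay for (or need) it.
        # `difflib` is an assumption standing in for the real dependency.
        import difflib

        print("Finding train/test overlap, please wait...")
        overlaps = [
            doc
            for doc in docs
            if difflib.SequenceMatcher(None, doc, train_text).ratio() > 0.8
        ]
    return overlaps


if __name__ == "__main__":
    docs = ["train text", "other"]
    print(evaluate_sketch(docs, "train text"))
    print(evaluate_sketch(docs, "train text", decontaminate=True))
```

With this layout, a run with `decontaminate=False` never touches the stand-in module, which is presumably the reason an import would be scoped to the branch.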
scripts/clean_training_data/janitor_util.cpp
...
@@ -176,7 +176,7 @@ clean_ngram_with_indices(std::string const &input, std::string const &ignore,
    }
    // Skip ignored characters
  } else if (ignore.find(*iter) != std::string::npos) {
  } else if (ignore.find(ch) != std::string::npos) {
    continue;
    // If it is a non-ignored character, add it to the ngram and update the
...
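
The two variants of the condition in this hunk differ only in which value is tested against `ignore`: the raw character `*iter` or the local `ch`. The Python sketch below illustrates why that choice can matter. It assumes `ch` is a normalized (here, lowercased) copy of the raw character, which the hunk does not show, and `clean_tokens` is a hypothetical name rather than a function from the repository.

```python
def clean_tokens(text, ignore):
    """Split text into lowercase tokens, dropping characters in `ignore`."""
    tokens = []
    current = []
    for raw in text:
        ch = raw.lower()  # assumption: `ch` is a normalized copy of the raw character
        if ch.isspace():
            # Whitespace ends the current token, if there is one.
            if current:
                tokens.append("".join(current))
                current = []
        elif ch in ignore:
            # Skip ignored characters, testing the same value that would be appended.
            continue
        else:
            # A non-ignored character is added to the current token.
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


if __name__ == "__main__":
    print(clean_tokens("Hello, World!", ",!"))
    print(clean_tokens("aXb", "x"))
```

Under that assumption, testing the normalized `ch` means an uppercase `X` is skipped when `ignore` contains `x`. Testing the raw character would let it through and append its lowercased form.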