chenpangpang / transformers / Commits / 13842e41
"vscode:/vscode.git/clone" did not exist on "8d602649d008fc20e793a26a66a5aac8a79fb02a"
Unverified commit 13842e41, authored Oct 20, 2020 by Joe Davison, committed by GitHub on Oct 20, 2020

PPL guide minor code snippet fix (#7938)
parent 0e24e4c1
Showing 1 changed file with 6 additions and 5 deletions.

docs/source/perplexity.rst (+6, -5), view file @ 13842e41
@@ -125,18 +125,19 @@ are 512 preceding tokens available to condition on).
 lls = []
 for i in tqdm(range(0, encodings.input_ids.size(1), stride)):
     begin_loc = max(i + stride - max_length, 0)
-    end_loc = i + stride
+    end_loc = min(i + stride, encodings.input_ids.size(1))
+    trg_len = end_loc - i  # may be different from stride on last loop
     input_ids = encodings.input_ids[:,begin_loc:end_loc].to(device)
     target_ids = input_ids.clone()
-    target_ids[:,:-stride] = -100
+    target_ids[:,:-trg_len] = -100

     with torch.no_grad():
         outputs = model(input_ids, labels=target_ids)
-        log_likelihood = outputs[0] * stride
+        log_likelihood = outputs[0] * trg_len

     lls.append(log_likelihood)

-ppl = torch.exp(torch.stack(lls).sum() / i)
+ppl = torch.exp(torch.stack(lls).sum() / end_loc)

 Running this with the stride length equal to the max input length is
 equivalent to the suboptimal, non-sliding-window strategy we discussed above.
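For reference, below is a runnable sketch of the snippet as it reads after this fix. The setup lines (a GPT-2 checkpoint, WikiText-2 loaded via the datasets library, a stride of 512) are assumptions based on the surrounding perplexity guide and are not part of this diff:

# Sketch of the corrected sliding-window perplexity loop after this fix.
# The GPT-2 / WikiText-2 setup below is assumed context, not part of the diff.
import torch
from tqdm import tqdm
from datasets import load_dataset
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

device = "cuda" if torch.cuda.is_available() else "cpu"
model_id = "gpt2"  # assumed checkpoint; the guide may use a larger one
model = GPT2LMHeadModel.from_pretrained(model_id).to(device)
tokenizer = GPT2TokenizerFast.from_pretrained(model_id)

test = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
encodings = tokenizer("\n\n".join(test["text"]), return_tensors="pt")

max_length = model.config.n_positions  # 1024 for GPT-2
stride = 512

lls = []
for i in tqdm(range(0, encodings.input_ids.size(1), stride)):
    begin_loc = max(i + stride - max_length, 0)
    end_loc = min(i + stride, encodings.input_ids.size(1))
    trg_len = end_loc - i  # may be different from stride on last loop
    input_ids = encodings.input_ids[:, begin_loc:end_loc].to(device)
    target_ids = input_ids.clone()
    target_ids[:, :-trg_len] = -100  # only the last trg_len tokens are scored

    with torch.no_grad():
        outputs = model(input_ids, labels=target_ids)
        # outputs[0] is the average negative log-likelihood over the scored
        # tokens; multiplying by trg_len approximates the summed NLL for
        # this window.
        log_likelihood = outputs[0] * trg_len

    lls.append(log_likelihood)

# Normalize by the total number of tokens seen (end_loc), not the loop index.
ppl = torch.exp(torch.stack(lls).sum() / end_loc)
print(ppl)

Setting stride equal to max_length reproduces the non-sliding-window evaluation the guide calls suboptimal; smaller strides give each window more preceding context to condition on.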