Unverified Commit 4f403ea8 authored by Daniel Erenrich, committed by GitHub

Fix grammar in tokenizer_summary (#15614)

"to make ensure" is redundant.
parent 7a32e472
@@ -219,7 +219,7 @@ equivalent to finding the symbol pair, whose probability divided by the probabil
 its second symbol is the greatest among all symbol pairs. *E.g.* `"u"`, followed by `"g"` would have only been
 merged if the probability of `"ug"` divided by `"u"`, `"g"` would have been greater than for any other symbol
 pair. Intuitively, WordPiece is slightly different to BPE in that it evaluates what it _loses_ by merging two symbols
-to make ensure it's _worth it_.
+to ensure it's _worth it_.

 <a id='unigram'></a>
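For context on the scoring rule this hunk describes, here is a minimal sketch, not the actual `tokenizers` implementation: the `wordpiece_pair_scores` helper and the toy corpus below are made up for illustration. It scores each adjacent symbol pair by the probability of the pair divided by the product of the probabilities of its two symbols, which is the quantity the surrounding paragraph says WordPiece maximizes when choosing a merge.

```python
from collections import Counter


def wordpiece_pair_scores(words):
    """Score every adjacent symbol pair with P(pair) / (P(first) * P(second)).

    `words` is a list of words, each given as a list of symbols.
    The pair with the highest score is the one this criterion would merge next.
    """
    symbol_counts = Counter()
    pair_counts = Counter()
    for word in words:
        symbol_counts.update(word)
        pair_counts.update(zip(word, word[1:]))
    total_symbols = sum(symbol_counts.values())
    total_pairs = sum(pair_counts.values())

    scores = {}
    for (first, second), count in pair_counts.items():
        p_pair = count / total_pairs
        p_first = symbol_counts[first] / total_symbols
        p_second = symbol_counts[second] / total_symbols
        scores[(first, second)] = p_pair / (p_first * p_second)
    return scores


# Toy corpus: ("u", "g") is only merged if its score beats every other pair.
corpus = [list(w) for w in ["hug", "pug", "pun", "bun", "hugs"]]
for pair, score in sorted(wordpiece_pair_scores(corpus).items(),
                          key=lambda item: item[1], reverse=True):
    print(pair, round(score, 3))
```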