Commit 342ff6eb (unverified)
Authored Mar 28, 2022 by Jia; committed by GitHub on Mar 28, 2022

Update comments in class BatchEncoding (#15932)

Parent: e02f95b2
Showing 1 changed file with 5 additions and 4 deletions.
src/transformers/tokenization_utils_base.py (+5, -4)
@@ -160,16 +160,17 @@ class TokenSpan(NamedTuple):
 class BatchEncoding(UserDict):
     """
-    Holds the output of the [`~tokenization_utils_base.PreTrainedTokenizerBase.encode_plus`] and
-    [`~tokenization_utils_base.PreTrainedTokenizerBase.batch_encode`] methods (tokens, attention_masks, etc).
+    Holds the output of the [`~tokenization_utils_base.PreTrainedTokenizerBase.__call__`],
+    [`~tokenization_utils_base.PreTrainedTokenizerBase.encode_plus`] and
+    [`~tokenization_utils_base.PreTrainedTokenizerBase.batch_encode_plus`] methods (tokens, attention_masks, etc).

     This class is derived from a python dictionary and can be used as a dictionary. In addition, this class exposes
     utility methods to map from word/character space to token space.

     Args:
         data (`dict`):
-            Dictionary of lists/arrays/tensors returned by the encode/batch_encode methods ('input_ids',
-            'attention_mask', etc.).
+            Dictionary of lists/arrays/tensors returned by the `__call__`/`encode_plus`/`batch_encode_plus` methods
+            ('input_ids', 'attention_mask', etc.).
         encoding (`tokenizers.Encoding` or `Sequence[tokenizers.Encoding]`, *optional*):
             If the tokenizer is a fast tokenizer which outputs additional information like mapping from word/character
             space to token space the `tokenizers.Encoding` instance or list of instance (for batches) hold this
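For context, a minimal sketch of the behaviour the updated docstring describes: calling a tokenizer (via `__call__`, `encode_plus`, or `batch_encode_plus`) returns a `BatchEncoding`, which acts as a dictionary and, with a fast tokenizer, also exposes word/character-to-token mapping helpers. The `bert-base-uncased` checkpoint is only an illustrative choice here; any fast tokenizer works the same way.

```python
from transformers import AutoTokenizer

# Illustrative checkpoint (not part of this commit); any fast tokenizer behaves the same.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# `__call__`, `encode_plus`, and `batch_encode_plus` all return a BatchEncoding.
encoding = tokenizer("Hello world")

# BatchEncoding is derived from a python dictionary and can be used as one.
print(encoding["input_ids"])       # list of token ids
print(encoding["attention_mask"])  # list of 1s for non-padded tokens

# With a fast tokenizer, the underlying `tokenizers.Encoding` enables mapping
# between word/character space and token space.
print(encoding.tokens())           # subword tokens, including special tokens
print(encoding.word_to_tokens(0))  # TokenSpan covering the word "Hello"
print(encoding.char_to_token(6))   # index of the token covering the character "w"
```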