tokenization_utils.rst 1.61 KB
Newer Older
Sylvain Gugger's avatar
Sylvain Gugger committed
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
Utilities for Tokenizers
-----------------------------------------------------------------------------------------------------------------------

This page lists all the utility functions used by the tokenizers, mainly the class
:class:`~transformers.tokenization_utils_base.PreTrainedTokenizerBase` that implements the common methods between
:class:`~transformers.PreTrainedTokenizer` and :class:`~transformers.PreTrainedTokenizerFast` and the mixin
:class:`~transformers.tokenization_utils_base.SpecialTokensMixin`.

Most of those are only useful if you are studying the code of the tokenizers in the library.

PreTrainedTokenizerBase
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.tokenization_utils_base.PreTrainedTokenizerBase
    :special-members: __call__
    :members:


SpecialTokensMixin
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.tokenization_utils_base.SpecialTokensMixin
    :members:


Enums and namedtuples
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Sylvain Gugger's avatar
Sylvain Gugger committed
28

Sylvain Gugger's avatar
Sylvain Gugger committed
29
30
31
32
33
34
35
36
37
38
39
.. autoclass:: transformers.tokenization_utils_base.ExplicitEnum

.. autoclass:: transformers.tokenization_utils_base.PaddingStrategy

.. autoclass:: transformers.tokenization_utils_base.TensorType

.. autoclass:: transformers.tokenization_utils_base.TruncationStrategy

.. autoclass:: transformers.tokenization_utils_base.CharSpan

.. autoclass:: transformers.tokenization_utils_base.TokenSpan