"examples/vscode:/vscode.git/clone" did not exist on "57c965a8f1000b4016ad219e616f509d8af3f5b5"
model.rst 4.7 KB
Newer Older
1
..
    Copyright 2020 The HuggingFace Team. All rights reserved.

    Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
    the License. You may obtain a copy of the License at

        http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
    an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
    specific language governing permissions and limitations under the License.

Models
-----------------------------------------------------------------------------------------------------------------------

The base classes :class:`~transformers.PreTrainedModel`, :class:`~transformers.TFPreTrainedModel`, and
:class:`~transformers.FlaxPreTrainedModel` implement the common methods for loading/saving a model either from a local
file or directory, or from a pretrained model configuration provided by the library (downloaded from HuggingFace's AWS
S3 repository).
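
For example, loading a checkpoint and saving it back to disk each take a single call (a minimal sketch;
``bert-base-uncased`` is just an example checkpoint and the save directory is arbitrary):

.. code-block:: python

    from transformers import BertModel

    # Download (and cache) the pretrained weights and configuration
    model = BertModel.from_pretrained("bert-base-uncased")

    # Save the weights and configuration to a local directory...
    model.save_pretrained("./my-bert")

    # ...from which the model can be reloaded later
    model = BertModel.from_pretrained("./my-bert")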

:class:`~transformers.PreTrainedModel` and :class:`~transformers.TFPreTrainedModel` also implement a few methods that
are common to all the models, to:

- resize the input token embeddings when new tokens are added to the vocabulary,
- prune the attention heads of the model.
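
Both operations might look like this (a minimal sketch; the checkpoint name, the new vocabulary size and the pruned
heads are arbitrary examples):

.. code-block:: python

    from transformers import BertModel

    model = BertModel.from_pretrained("bert-base-uncased")

    # Grow the input token embedding matrix, e.g. after adding new tokens to the tokenizer
    model.resize_token_embeddings(new_num_tokens=30525)

    # Prune heads 0 and 2 of layer 1 ({layer index: list of head indices to prune})
    model.prune_heads({1: [0, 2]})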

The other methods that are common to each model are defined in :class:`~transformers.modeling_utils.ModuleUtilsMixin`
(for the PyTorch models) and :class:`~transformers.modeling_tf_utils.TFModelUtilsMixin` (for the TensorFlow models).
For text generation, they are defined in :class:`~transformers.generation_utils.GenerationMixin` (for the PyTorch
models), :class:`~transformers.generation_tf_utils.TFGenerationMixin` (for the TensorFlow models) and
:class:`~transformers.generation_flax_utils.FlaxGenerationMixin` (for the Flax/JAX models).
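
Models that support text generation thus inherit a ``generate()`` method. For instance (a minimal sketch;
``t5-small`` is just an example checkpoint):

.. code-block:: python

    from transformers import T5ForConditionalGeneration, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained("t5-small")
    model = T5ForConditionalGeneration.from_pretrained("t5-small")

    # generate() comes from GenerationMixin, inherited by T5ForConditionalGeneration
    inputs = tokenizer("translate English to German: Hello, how are you?", return_tensors="pt")
    outputs = model.generate(**inputs, max_length=40)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))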


PreTrainedModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.PreTrainedModel
    :special-members: push_to_hub
    :members:


.. _from_pretrained-torch-dtype:

Model Instantiation dtype
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Under PyTorch, a model normally gets instantiated with the ``torch.float32`` dtype. This can be an issue if one tries
to load a model whose weights are stored in fp16, since it would then require twice as much memory. To overcome this
limitation, you can either explicitly pass the desired ``dtype`` using the ``torch_dtype`` argument:

.. code-block:: python

    import torch
    from transformers import T5ForConditionalGeneration

    model = T5ForConditionalGeneration.from_pretrained("t5", torch_dtype=torch.float16)

or, if you want the model to always load with the optimal memory usage, you can use the special value ``"auto"``, in
which case ``dtype`` is derived automatically from the model's weights:

.. code-block:: python

    model = T5ForConditionalGeneration.from_pretrained("t5", torch_dtype="auto")

Models instantiated from scratch can also be told which ``dtype`` to use:

.. code-block:: python

    from transformers import AutoModel, T5Config

    config = T5Config.from_pretrained("t5")
    model = AutoModel.from_config(config, torch_dtype=torch.float16)

Due to PyTorch's design, this functionality is only available for floating-point dtypes.
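
A quick way to check which ``dtype`` a loaded model ended up with (a minimal sketch; ``t5-small`` is just an example
checkpoint):

.. code-block:: python

    import torch
    from transformers import T5ForConditionalGeneration

    model = T5ForConditionalGeneration.from_pretrained("t5-small", torch_dtype=torch.float16)

    # The weights were loaded directly in half precision
    assert next(model.parameters()).dtype == torch.float16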



ModuleUtilsMixin
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.modeling_utils.ModuleUtilsMixin
    :members:


TFPreTrainedModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFPreTrainedModel
    :special-members: push_to_hub
    :members:


TFModelUtilsMixin
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.modeling_tf_utils.TFModelUtilsMixin
    :members:


FlaxPreTrainedModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.FlaxPreTrainedModel
    :special-members: push_to_hub
    :members:


Generation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.generation_utils.GenerationMixin
    :members:

.. autoclass:: transformers.generation_tf_utils.TFGenerationMixin
    :members:

.. autoclass:: transformers.generation_flax_utils.FlaxGenerationMixin
    :members:


Pushing to the Hub
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.file_utils.PushToHubMixin
    :members:
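
For example, uploading a model to the Hub could look like this (a minimal sketch; the repository name is arbitrary
and you need to be authenticated first, e.g. via ``huggingface-cli login``):

.. code-block:: python

    from transformers import BertModel

    model = BertModel.from_pretrained("bert-base-uncased")

    # Create (or reuse) a repository under your namespace and upload the weights and configuration
    model.push_to_hub("my-fine-tuned-bert")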