"...git@developer.sourcefind.cn:chenpangpang/transformers.git" did not exist on "ff8870350151091d3d8b2af4c1c0fa3ebcc1052a"
Unverified Commit 7ceff67e authored by Hamel Husain, committed by GitHub

Finish Making Quick Tour respect the model object (#11467)



* finish quicktour

* fix import

* fix print

* explain config default better

* Update docs/source/quicktour.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
parent 88ac60f7
@@ -285,16 +285,24 @@ We can see we get the numbers from before:
tensor([[2.2043e-04, 9.9978e-01],
        [5.3086e-01, 4.6914e-01]], grad_fn=<SoftmaxBackward>)
If you provide the model with labels in addition to inputs, the model output object will also contain a ``loss``
attribute:
.. code-block::
>>> ## PYTORCH CODE
>>> import torch
>>> pt_outputs = pt_model(**pt_batch, labels = torch.tensor([1, 0]))
>>> print(pt_outputs)
SequenceClassifierOutput(loss=tensor(0.3167, grad_fn=<NllLossBackward>), logits=tensor([[-4.0833, 4.3364],
[ 0.0818, -0.0418]], grad_fn=<AddmmBackward>), hidden_states=None, attentions=None)
>>> ## TENSORFLOW CODE
>>> import tensorflow as tf
>>> tf_outputs = tf_model(tf_batch, labels = tf.constant([1, 0]))
>>> print(tf_outputs)
TFSequenceClassifierOutput(loss=<tf.Tensor: shape=(2,), dtype=float32, numpy=array([2.2051287e-04, 6.3326043e-01], dtype=float32)>, logits=<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[-4.0832963 , 4.3364143 ],
[ 0.081807 , -0.04178282]], dtype=float32)>, hidden_states=None, attentions=None)
Models are standard `torch.nn.Module <https://pytorch.org/docs/stable/nn.html#torch.nn.Module>`__ or `tf.keras.Model
<https://www.tensorflow.org/api_docs/python/tf/keras/Model>`__ so you can use them in your usual training loop. 🤗
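For instance, a bare-bones fine-tuning step in PyTorch might look like the sketch below. This is not part of the
original quick tour: the optimizer choice and learning rate are placeholders, and ``pt_model``, ``pt_batch`` and the
labels are reused from the snippets above.

.. code-block::

    import torch
    from torch.optim import AdamW

    # Minimal illustrative loop: treat pt_model as a plain torch.nn.Module.
    optimizer = AdamW(pt_model.parameters(), lr=5e-5)
    labels = torch.tensor([1, 0])

    pt_model.train()
    for _ in range(3):  # a few passes over the same toy batch
        outputs = pt_model(**pt_batch, labels=labels)
        loss = outputs.loss      # the same loss attribute shown above
        loss.backward()          # standard backprop, as with any nn.Module
        optimizer.step()
        optimizer.zero_grad()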
@@ -322,6 +330,7 @@ loading a saved PyTorch model in a TensorFlow model, use :func:`~transformers.TF
.. code-block::
from transformers import TFAutoModel
tokenizer = AutoTokenizer.from_pretrained(save_directory)
model = TFAutoModel.from_pretrained(save_directory, from_pt=True)
@@ -329,6 +338,7 @@ and if you are loading a saved TensorFlow model in a PyTorch model, you should u
.. code-block::
from transformers import AutoModel
tokenizer = AutoTokenizer.from_pretrained(save_directory)
model = AutoModel.from_pretrained(save_directory, from_tf=True)
@@ -339,10 +349,12 @@ Lastly, you can also ask the model to return all hidden states and all attention
>>> ## PYTORCH CODE
>>> pt_outputs = pt_model(**pt_batch, output_hidden_states=True, output_attentions=True)
>>> all_hidden_states = pt_outputs.hidden_states
>>> all_attentions = pt_outputs.attentions
>>> ## TENSORFLOW CODE
>>> tf_outputs = tf_model(tf_batch, output_hidden_states=True, output_attentions=True)
>>> all_hidden_states = tf_outputs.hidden_states
>>> all_attentions = tf_outputs.attentions
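As a quick check (not shown in the original document), you can inspect how many tensors come back and their shapes;
the exact values depend on the checkpoint and on the tokenized batch, so no output is reproduced here.

.. code-block::

    # hidden_states is a tuple with one tensor per layer plus the embedding output,
    # each of shape (batch_size, sequence_length, hidden_size); attentions holds the
    # per-layer attention weights of shape (batch_size, num_heads, seq_len, seq_len).
    print(len(all_hidden_states), all_hidden_states[0].shape)
    print(len(all_attentions), all_attentions[0].shape)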
Accessing the code
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -375,16 +387,16 @@ directly instantiate model and tokenizer without the auto magic:
Customizing the model
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If you want to change how the model itself is built, you can define a custom configuration class. Each architecture
comes with its own relevant configuration. For example, :class:`~transformers.DistilBertConfig` allows you to specify
parameters such as the hidden dimension, dropout rate, etc., for DistilBERT. If you do core modifications, like changing
the hidden size, you won't be able to use a pretrained model anymore and will need to train from scratch. You would
then instantiate the model directly from this configuration.

Below, we load a predefined vocabulary for a tokenizer with the
:func:`~transformers.DistilBertTokenizer.from_pretrained` method. However, unlike the tokenizer, we wish to initialize
the model from scratch. Therefore, we instantiate the model from a configuration instead of using the
:func:`~transformers.DistilBertForSequenceClassification.from_pretrained` method.
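A minimal sketch of this pattern, assuming the DistilBERT classes named above; the configuration values here are only
illustrative placeholders:

.. code-block::

    from transformers import DistilBertConfig, DistilBertTokenizer, DistilBertForSequenceClassification

    # Build a randomly initialized model from a custom configuration, while reusing
    # the pretrained DistilBERT vocabulary for the tokenizer.
    config = DistilBertConfig(n_heads=8, dim=512, hidden_dim=4 * 512)
    tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")
    model = DistilBertForSequenceClassification(config)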
.. code-block::
@@ -401,9 +413,9 @@ instantiate the model from the configuration instead of using the
For something that only changes the head of the model (for instance, the number of labels), you can still use a
pretrained model for the body. For instance, let's define a classifier for 10 different labels using a pretrained body.
Instead of creating a new configuration with all the default values just to change the number of labels, we can
directly pass any argument a configuration would take to the :func:`from_pretrained` method and it will update the
default configuration appropriately:
.. code-block::
...
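A short sketch of that head-only change, assuming ``distilbert-base-uncased`` as the pretrained checkpoint:

.. code-block::

    from transformers import DistilBertTokenizer, DistilBertForSequenceClassification

    # Keep the pretrained body and build a fresh classification head with 10 labels
    # by overriding that single configuration value at load time.
    model_name = "distilbert-base-uncased"
    model = DistilBertForSequenceClassification.from_pretrained(model_name, num_labels=10)
    tokenizer = DistilBertTokenizer.from_pretrained(model_name)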