Improve documentation of pooler_output in ModelOutput (#13228)

* update documentation of pooler_output in modeling_outputs, making it more clear and available for generic usage
* Update src/transformers/modeling_outputs.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/modeling_outputs.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* run make style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Commit ef83dc4f · authored Aug 30, 2021 by Navjot · committed by GitHub Aug 30, 2021

Showing with 8 additions and 6 deletions

src/transformers/modeling_outputs.py +8 -6

--- a/src/transformers/modeling_outputs.py
+++ b/src/transformers/modeling_outputs.py
@@ -55,9 +55,10 @@ class BaseModelOutputWithPooling(ModelOutput):
        last_hidden_state (:obj:`torch.FloatTensor` of shape :obj:`(batch_size, sequence_length, hidden_size)`):
            Sequence of hidden-states at the output of the last layer of the model.
        pooler_output (:obj:`torch.FloatTensor` of shape :obj:`(batch_size, hidden_size)`):
-            Last layer hidden-state of the first token of the sequence (classification token) further processed by a
-            Linear layer and a Tanh activation function. The Linear layer weights are trained from the next sentence
-            prediction (classification) objective during pretraining.
+            Last layer hidden-state of the first token of the sequence (classification token) after further processing
+            through the layers used for the auxiliary pretraining task. E.g. for BERT-family of models, this returns
+            the classification token after processing through a linear layer and a tanh activation function. The linear
+            layer weights are trained from the next sentence prediction (classification) objective during pretraining.
        hidden_states (:obj:`tuple(torch.FloatTensor)`, `optional`, returned when ``output_hidden_states=True`` is passed or when ``config.output_hidden_states=True``):
            Tuple of :obj:`torch.FloatTensor` (one for the output of the embeddings + one for the output of each layer)
            of shape :obj:`(batch_size, sequence_length, hidden_size)`.
@@ -158,9 +159,10 @@ class BaseModelOutputWithPoolingAndCrossAttentions(ModelOutput):
        last_hidden_state (:obj:`torch.FloatTensor` of shape :obj:`(batch_size, sequence_length, hidden_size)`):
            Sequence of hidden-states at the output of the last layer of the model.
        pooler_output (:obj:`torch.FloatTensor` of shape :obj:`(batch_size, hidden_size)`):
-            Last layer hidden-state of the first token of the sequence (classification token) further processed by a
-            Linear layer and a Tanh activation function. The Linear layer weights are trained from the next sentence
-            prediction (classification) objective during pretraining.
+            Last layer hidden-state of the first token of the sequence (classification token) after further processing
+            through the layers used for the auxiliary pretraining task. E.g. for BERT-family of models, this returns
+            the classification token after processing through a linear layer and a tanh activation function. The linear
+            layer weights are trained from the next sentence prediction (classification) objective during pretraining.
        hidden_states (:obj:`tuple(torch.FloatTensor)`, `optional`, returned when ``output_hidden_states=True`` is passed or when ``config.output_hidden_states=True``):
            Tuple of :obj:`torch.FloatTensor` (one for the output of the embeddings + one for the output of each layer)
            of shape :obj:`(batch_size, sequence_length, hidden_size)`.
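
For readers of the updated docstring, the BERT-style case it describes can be sketched without any dependencies. This is only an illustration of the computation (first-token hidden state, then a linear layer, then tanh), not the library's implementation: the function name `bert_style_pool`, the nested-list tensors, and the toy weights are all assumptions made for this example. Other model families may use different pooling layers, which is the point of the more generic wording.

```python
import math


def bert_style_pool(last_hidden_state, weight, bias):
    """Hypothetical helper: pool a batch the way the docstring describes for BERT.

    last_hidden_state: nested list of shape [batch_size][sequence_length][hidden_size]
    weight:            nested list of shape [hidden_size][hidden_size] (linear layer)
    bias:              list of length hidden_size
    Returns a nested list of shape [batch_size][hidden_size].
    """
    pooled = []
    for sequence in last_hidden_state:
        # Only the first token (the classification token) is used.
        first_token = sequence[0]
        # Linear layer followed by a tanh activation.
        pooled.append(
            [
                math.tanh(sum(w * x for w, x in zip(w_row, first_token)) + b)
                for w_row, b in zip(weight, bias)
            ]
        )
    return pooled


# Toy input: batch_size=2, sequence_length=3, hidden_size=2 (made-up values).
last_hidden_state = [
    [[1.0, 0.0], [5.0, 5.0], [9.0, 9.0]],
    [[0.0, -1.0], [7.0, 7.0], [3.0, 3.0]],
]
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity weights, chosen for readability
bias = [0.0, 0.0]

pooler_output = bert_style_pool(last_hidden_state, weight, bias)
```

The result has shape `(batch_size, hidden_size)`, matching the documented shape of `pooler_output`; the tokens after the first one do not affect it directly.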