Unverified commit 3b44aa93 authored by Sylvain Gugger, committed by GitHub

Model utils doc (#6005)

* Document TF modeling utils

* Document all model utils
parent a5404052
@@ -177,9 +177,9 @@ conversion utilities for the following models:
     main_classes/model
     main_classes/tokenizer
     main_classes/pipelines
-    main_classes/trainer
     main_classes/optimizer_schedules
     main_classes/processors
+    main_classes/trainer
     model_doc/auto
     model_doc/encoderdecoder
     model_doc/bert
@@ -205,3 +205,4 @@ conversion utilities for the following models:
     model_doc/retribert
     model_doc/mobilebert
     model_doc/dpr
+    internal/modeling_utils
Custom Layers and Utilities
---------------------------

This page lists all the custom layers used by the library, as well as the utility functions it provides for modeling.
Most of those are only useful if you are studying the code of the models in the library.

``PyTorch custom modules``
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.modeling_utils.Conv1D

.. autoclass:: transformers.modeling_utils.PoolerStartLogits
    :members: forward

.. autoclass:: transformers.modeling_utils.PoolerEndLogits
    :members: forward

.. autoclass:: transformers.modeling_utils.PoolerAnswerClass
    :members: forward

.. autoclass:: transformers.modeling_utils.SquadHeadOutput

.. autoclass:: transformers.modeling_utils.SQuADHead
    :members: forward

.. autoclass:: transformers.modeling_utils.SequenceSummary
    :members: forward
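Despite its name, ``Conv1D`` (used for instance by GPT-2) is just a linear projection whose weight is stored transposed: ``y = x @ W + b`` with ``W`` of shape ``(nx, nf)``. A minimal plain-Python sketch of that computation, with no torch dependency (the function name ``conv1d_sketch`` is illustrative, not part of the library):

```python
def conv1d_sketch(x, W, b):
    """Apply y = x @ W + b.

    x: list of token rows, each of length nx
    W: weight of shape nx x nf (note: transposed vs. nn.Linear)
    b: bias of length nf
    """
    nf = len(b)
    out = []
    for row in x:
        out.append([sum(row[i] * W[i][j] for i in range(len(row))) + b[j]
                    for j in range(nf)])
    return out

x = [[1.0, 2.0]]          # one token, nx = 2
W = [[1.0, 0.0, 1.0],     # nx x nf = 2 x 3
     [0.0, 1.0, 1.0]]
b = [0.5, 0.5, 0.5]
print(conv1d_sketch(x, W, b))  # -> [[1.5, 2.5, 3.5]]
```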

``PyTorch Helper Functions``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autofunction:: transformers.apply_chunking_to_forward

.. autofunction:: transformers.modeling_utils.find_pruneable_heads_and_indices

.. autofunction:: transformers.modeling_utils.prune_layer

.. autofunction:: transformers.modeling_utils.prune_conv1d_layer

.. autofunction:: transformers.modeling_utils.prune_linear_layer
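The core bookkeeping behind head pruning is computing, from a set of heads to remove, the flat indices of the hidden dimensions to keep; those indices are then used by the ``prune_*_layer`` functions to slice the weights. A simplified plain-Python sketch of that index computation (``kept_indices_sketch`` is a hypothetical name; the real ``find_pruneable_heads_and_indices`` also accounts for heads pruned in earlier calls and returns a ``torch.LongTensor``):

```python
def kept_indices_sketch(heads_to_prune, n_heads, head_size):
    """Flat indices of the attention dimensions that survive pruning.

    Each head h owns dimensions [h * head_size, (h + 1) * head_size).
    """
    return [h * head_size + d
            for h in range(n_heads) if h not in heads_to_prune
            for d in range(head_size)]

# keep heads 0, 2 and 3 of a 4-head layer with head_size 2
print(kept_indices_sketch({1}, n_heads=4, head_size=2))
# -> [0, 1, 4, 5, 6, 7]
```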

``TensorFlow custom layers``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.modeling_tf_utils.TFConv1D

.. autoclass:: transformers.modeling_tf_utils.TFSharedEmbeddings
    :members: call

.. autoclass:: transformers.modeling_tf_utils.TFSequenceSummary
    :members: call

``TensorFlow loss functions``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.modeling_tf_utils.TFCausalLanguageModelingLoss
    :members:

.. autoclass:: transformers.modeling_tf_utils.TFMaskedLanguageModelingLoss
    :members:

.. autoclass:: transformers.modeling_tf_utils.TFMultipleChoiceLoss
    :members:

.. autoclass:: transformers.modeling_tf_utils.TFQuestionAnsweringLoss
    :members:

.. autoclass:: transformers.modeling_tf_utils.TFSequenceClassificationLoss
    :members:

.. autoclass:: transformers.modeling_tf_utils.TFTokenClassificationLoss
    :members:

``TensorFlow Helper Functions``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autofunction:: transformers.modeling_tf_utils.cast_bool_to_primitive

.. autofunction:: transformers.modeling_tf_utils.get_initializer

.. autofunction:: transformers.modeling_tf_utils.keras_serializable

.. autofunction:: transformers.modeling_tf_utils.shape_list
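The idea behind ``shape_list`` is that in TensorFlow graph mode some dimensions are only known at runtime (they show up as ``None`` in the static shape), so the helper merges the static shape with the dynamic one to always return concrete values. A plain-Python sketch of just that merging rule, without TensorFlow (``merge_shapes_sketch`` is an illustrative name, not the library function):

```python
def merge_shapes_sketch(static_shape, dynamic_shape):
    """Prefer the statically known dimension; fall back to the dynamic one."""
    return [dyn if st is None else st
            for st, dyn in zip(static_shape, dynamic_shape)]

# batch size unknown at graph-build time, filled in from the dynamic shape
print(merge_shapes_sketch([None, 128, 768], [32, 128, 768]))
# -> [32, 128, 768]
```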
 Models
 ----------------------------------------------------
 
-The base class :class:`~transformers.PreTrainedModel` implements the common methods for loading/saving a model either
-from a local file or directory, or from a pretrained model configuration provided by the library (downloaded from
-HuggingFace's AWS S3 repository).
+The base classes :class:`~transformers.PreTrainedModel` and :class:`~transformers.TFPreTrainedModel` implement the
+common methods for loading/saving a model either from a local file or directory, or from a pretrained model
+configuration provided by the library (downloaded from HuggingFace's AWS S3 repository).
 
-:class:`~transformers.PreTrainedModel` also implements a few methods which are common among all the models to:
+:class:`~transformers.PreTrainedModel` and :class:`~transformers.TFPreTrainedModel` also implement a few methods which
+are common among all the models to:
 
 - resize the input token embeddings when new tokens are added to the vocabulary
 - prune the attention heads of the model.
 
+The other methods that are common to each model are defined in :class:`~transformers.modeling_utils.ModuleUtilsMixin`
+(for the PyTorch models) and :class:`~transformers.modeling_tf_utils.TFModelUtilsMixin` (for the TensorFlow models).
+
 ``PreTrainedModel``
 ~~~~~~~~~~~~~~~~~~~~~
 
 .. autoclass:: transformers.PreTrainedModel
     :members:
 
-``Helper Functions``
-~~~~~~~~~~~~~~~~~~~~~
-
-.. autofunction:: transformers.apply_chunking_to_forward
+``ModuleUtilsMixin``
+~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: transformers.modeling_utils.ModuleUtilsMixin
+    :members:
 
 ``TFPreTrainedModel``
 ~~~~~~~~~~~~~~~~~~~~~
 
 .. autoclass:: transformers.TFPreTrainedModel
     :members:
+
+``TFModelUtilsMixin``
+~~~~~~~~~~~~~~~~~~~~~
+
+.. autoclass:: transformers.modeling_tf_utils.TFModelUtilsMixin
+    :members:
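One of the common methods mentioned above is resizing the input token embeddings when tokens are added to the vocabulary. In the library this is done with ``PreTrainedModel.resize_token_embeddings``; the core bookkeeping (copy the old rows, create the missing ones) can be sketched in plain Python. This is an illustrative simplification: here the new rows are zero-initialized, whereas the real implementation uses the model's weight initializer.

```python
def resize_embeddings_sketch(embeddings, new_vocab_size, dim):
    """Grow (or shrink) an embedding matrix to new_vocab_size rows."""
    old = embeddings[:new_vocab_size]            # truncate if shrinking
    new_rows = [[0.0] * dim for _ in range(new_vocab_size - len(old))]
    return old + new_rows

emb = [[0.1, 0.2], [0.3, 0.4]]   # vocab of 2, embedding dim 2
print(resize_embeddings_sketch(emb, 4, 2))
# -> [[0.1, 0.2], [0.3, 0.4], [0.0, 0.0], [0.0, 0.0]]
```

In practice this is typically called as ``model.resize_token_embeddings(len(tokenizer))`` after adding tokens to the tokenizer.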
@@ -43,5 +43,5 @@ multi_line_output = 3
 use_parentheses = True
 
 [flake8]
-ignore = E203, E501, E741, W503
+ignore = E203, E501, E741, W503, W605
 max-line-length = 119
@@ -100,7 +100,7 @@ class PretrainedConfig(object):
             method of the model.
 
         Parameters for fine-tuning tasks
-            - **architectures** (:obj:List[`str`], `optional`) -- Model architectures that can be used with the
+            - **architectures** (:obj:`List[str]`, `optional`) -- Model architectures that can be used with the
               model pretrained weights.
             - **finetuning_task** (:obj:`str`, `optional`) -- Name of the task used to fine-tune the model. This can be
               used when converting from an original (TensorFlow or PyTorch) checkpoint.
...