Commit 6c32d8bb
Authored Jan 14, 2020 by Lysandre, committed by Lysandre Debut on Jan 14, 2020
Size > Dimensionality + Remove final TODOs
parent 760164d6

Showing 8 changed files with 14 additions and 13 deletions (+14, -13)
src/transformers/configuration_albert.py      +3 -3
src/transformers/configuration_bert.py        +2 -2
src/transformers/configuration_ctrl.py        +2 -2
src/transformers/configuration_distilbert.py  +1 -1
src/transformers/configuration_gpt2.py        +1 -1
src/transformers/configuration_openai.py      +1 -1
src/transformers/configuration_xlm.py         +2 -1
src/transformers/configuration_xlnet.py       +2 -2
src/transformers/configuration_albert.py
@@ -47,9 +47,9 @@ class AlbertConfig(PretrainedConfig):
            Vocabulary size of the ALBERT model. Defines the different tokens that
            can be represented by the `inputs_ids` passed to the forward method of :class:`~transformers.AlbertModel`.
        embedding_size (:obj:`int`, optional, defaults to 128):
-            Size of vocabulary embeddings.
+            Dimensionality of vocabulary embeddings.
        hidden_size (:obj:`int`, optional, defaults to 4096):
-            Size of the encoder layers and the pooler layer.
+            Dimensionality of the encoder layers and the pooler layer.
        num_hidden_layers (:obj:`int`, optional, defaults to 12):
            Number of hidden layers in the Transformer encoder.
        num_hidden_groups (:obj:`int`, optional, defaults to 1):
@@ -57,7 +57,7 @@ class AlbertConfig(PretrainedConfig):
        num_attention_heads (:obj:`int`, optional, defaults to 64):
            Number of attention heads for each attention layer in the Transformer encoder.
        intermediate_size (:obj:`int`, optional, defaults to 16384):
-            The size of the "intermediate" (i.e., feed-forward) layer in the Transformer encoder.
+            The dimensionality of the "intermediate" (i.e., feed-forward) layer in the Transformer encoder.
        inner_group_num (:obj:`int`, optional, defaults to 1):
            The number of inner repetition of attention and ffn.
        hidden_act (:obj:`str` or :obj:`function`, optional, defaults to "gelu_new"):
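As a side note on the ALBERT fields touched here (not part of the commit; a minimal sketch assuming the transformers API of early 2020), the factorized embedding is visible directly on the config object:

from transformers import AlbertConfig

# ALBERT factorizes the embedding matrix: the vocabulary embedding dimensionality
# (embedding_size) is much smaller than the hidden dimensionality (hidden_size).
config = AlbertConfig()  # defaults documented above: 128 / 4096 / 16384
print(config.embedding_size, config.hidden_size, config.intermediate_size)  # 128 4096 16384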
src/transformers/configuration_bert.py
@@ -65,13 +65,13 @@ class BertConfig(PretrainedConfig):
            Vocabulary size of the BERT model. Defines the different tokens that
            can be represented by the `inputs_ids` passed to the forward method of :class:`~transformers.BertModel`.
        hidden_size (:obj:`int`, optional, defaults to 768):
-            Size of the encoder layers and the pooler layer.
+            Dimensionality of the encoder layers and the pooler layer.
        num_hidden_layers (:obj:`int`, optional, defaults to 12):
            Number of hidden layers in the Transformer encoder.
        num_attention_heads (:obj:`int`, optional, defaults to 12):
            Number of attention heads for each attention layer in the Transformer encoder.
        intermediate_size (:obj:`int`, optional, defaults to 3072):
-            The size of the "intermediate" (i.e., feed-forward) layer in the Transformer encoder.
+            Dimensionality of the "intermediate" (i.e., feed-forward) layer in the Transformer encoder.
        hidden_act (:obj:`str` or :obj:`function`, optional, defaults to "gelu"):
            The non-linear activation function (function or string) in the encoder and pooler.
            If string, "gelu", "relu", "swish" and "gelu_new" are supported.
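A similar quick check for BERT (editorial sketch, assuming stock BertConfig defaults rather than anything added by this commit):

from transformers import BertConfig

config = BertConfig()  # bert-base-style defaults described above
# The feed-forward ("intermediate") dimensionality is the conventional 4x hidden dimensionality.
assert config.intermediate_size == 4 * config.hidden_size  # 3072 == 4 * 768
print(config.num_hidden_layers, config.num_attention_heads)  # 12 12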
src/transformers/configuration_ctrl.py
@@ -44,11 +44,11 @@ class CTRLConfig(PretrainedConfig):
            The maximum sequence length that this model might ever be used with.
            Typically set this to something large just in case (e.g., 512 or 1024 or 2048).
        n_ctx (:obj:`int`, optional, defaults to 256):
-            Size of the causal mask (usually same as n_positions).
+            Dimensionality of the causal mask (usually same as n_positions).
        n_embd (:obj:`int`, optional, defaults to 1280):
            Dimensionality of the embeddings and hidden states.
        dff (:obj:`int`, optional, defaults to 8192):
-            Size of the inner dimension of the FFN.
+            Dimensionality of the inner dimension of the FFN.
        n_layer (:obj:`int`, optional, defaults to 48):
            Number of hidden layers in the Transformer encoder.
        n_head (:obj:`int`, optional, defaults to 16):
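For CTRL, a sketch only (assuming n_ctx and dff are config attributes in the transformers release this commit targets):

from transformers import CTRLConfig

config = CTRLConfig()
# The docstring notes that n_ctx (causal-mask size) usually matches n_positions.
print(config.n_ctx, config.n_positions)  # 256 256
print(config.n_embd, config.dff)         # 1280 8192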
src/transformers/configuration_distilbert.py
@@ -56,7 +56,7 @@ class DistilBertConfig(PretrainedConfig):
        n_heads (:obj:`int`, optional, defaults to 12):
            Number of attention heads for each attention layer in the Transformer encoder.
        dim (:obj:`int`, optional, defaults to 768):
-            Size of the encoder layers and the pooler layer.
+            Dimensionality of the encoder layers and the pooler layer.
        intermediate_size (:obj:`int`, optional, defaults to 3072):
            The size of the "intermediate" (i.e., feed-forward) layer in the Transformer encoder.
        dropout (:obj:`float`, optional, defaults to 0.1):
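For DistilBERT, the documented dim and n_heads determine the per-head width; a minimal sketch, assuming the standard DistilBertConfig API:

from transformers import DistilBertConfig

config = DistilBertConfig()  # distilbert-base-style defaults
head_dim = config.dim // config.n_heads  # encoder dimensionality split across attention heads
print(config.dim, config.n_heads, head_dim)  # 768 12 64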
src/transformers/configuration_gpt2.py
@@ -52,7 +52,7 @@ class GPT2Config(PretrainedConfig):
            The maximum sequence length that this model might ever be used with.
            Typically set this to something large just in case (e.g., 512 or 1024 or 2048).
        n_ctx (:obj:`int`, optional, defaults to 1024):
-            Size of the causal mask (usually same as n_positions).
+            Dimensionality of the causal mask (usually same as n_positions).
        n_embd (:obj:`int`, optional, defaults to 768):
            Dimensionality of the embeddings and hidden states.
        n_layer (:obj:`int`, optional, defaults to 12):
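For GPT-2, a hedged example of keeping n_positions and n_ctx in step, as the docstring suggests (assumes the GPT2Config signature of this transformers release):

from transformers import GPT2Config

# Growing the context window: keep the causal-mask size n_ctx in step with n_positions.
config = GPT2Config(n_positions=2048, n_ctx=2048)
print(config.n_embd, config.n_layer, config.n_head)  # 768 12 12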
src/transformers/configuration_openai.py
@@ -47,7 +47,7 @@ class OpenAIGPTConfig(PretrainedConfig):
            The maximum sequence length that this model might ever be used with.
            Typically set this to something large just in case (e.g., 512 or 1024 or 2048).
        n_ctx (:obj:`int`, optional, defaults to 512):
-            Size of the causal mask (usually same as n_positions).
+            Dimensionality of the causal mask (usually same as n_positions).
        n_embd (:obj:`int`, optional, defaults to 768):
            Dimensionality of the embeddings and hidden states.
        n_layer (:obj:`int`, optional, defaults to 12):
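The original GPT config mirrors GPT-2 but with a 512-token default context; a small illustrative check, assuming the OpenAIGPTConfig defaults documented above:

from transformers import OpenAIGPTConfig

config = OpenAIGPTConfig()  # defaults documented above
print(config.n_ctx, config.n_positions)  # 512 512
print(config.n_embd)                     # 768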
src/transformers/configuration_xlm.py
@@ -72,7 +72,8 @@ class XLMConfig(PretrainedConfig):
            Causal models use a triangular attention mask in order to only attend to the left-side context instead
            if a bidirectional context.
        asm (:obj:`boolean`, optional, defaults to :obj:`False`):
-            TODO
+            Whether to use an adaptive log softmax projection layer instead of a linear layer for the prediction
+            layer.
        n_langs (:obj:`int`, optional, defaults to 1):
            The number of languages the model handles. Set to 1 for monolingual models.
        use_lang_emb (:obj:`boolean`, optional, defaults to :obj:`True`)
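For XLM, the newly documented asm flag sits alongside the multilingual options; a sketch assuming the XLMConfig keyword arguments of this era:

from transformers import XLMConfig

# asm switches the prediction head to an adaptive log softmax projection;
# n_langs > 1 together with use_lang_emb enables language embeddings.
config = XLMConfig(asm=False, n_langs=2, use_lang_emb=True)
print(config.asm, config.n_langs, config.use_lang_emb)  # False 2 True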
src/transformers/configuration_xlnet.py
@@ -45,13 +45,13 @@ class XLNetConfig(PretrainedConfig):
            Vocabulary size of the XLNet model. Defines the different tokens that
            can be represented by the `inputs_ids` passed to the forward method of :class:`~transformers.XLNetModel`.
        d_model (:obj:`int`, optional, defaults to 1024):
-            Size of the encoder layers and the pooler layer.
+            Dimensionality of the encoder layers and the pooler layer.
        n_layer (:obj:`int`, optional, defaults to 24):
            Number of hidden layers in the Transformer encoder.
        n_head (:obj:`int`, optional, defaults to 16):
            Number of attention heads for each attention layer in the Transformer encoder.
        d_inner (:obj:`int`, optional, defaults to 4096):
-            The size of the "intermediate" (i.e., feed-forward) layer in the Transformer encoder.
+            Dimensionality of the "intermediate" (i.e., feed-forward) layer in the Transformer encoder.
        ff_activation (:obj:`string`, optional, defaults to "gelu"):
            The non-linear activation function (function or string) in the
            encoder and pooler. If string, "gelu", "relu" and "swish" are supported.
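Finally, for XLNet the renamed "dimensionality" fields relate in the usual way; a minimal sketch assuming stock XLNetConfig defaults:

from transformers import XLNetConfig

config = XLNetConfig()  # xlnet-large-style defaults documented above
# d_inner is the feed-forward dimensionality, 4x d_model here; d_model / n_head gives the per-head width.
print(config.d_model, config.d_inner)   # 1024 4096
print(config.n_layer, config.n_head)    # 24 16
print(config.d_model // config.n_head)  # 64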