chenpangpang / transformers
Commit 6c32d8bb, authored Jan 14, 2020 by Lysandre (committed by Lysandre Debut, Jan 14, 2020)
Size > Dimensionality + Remove final TODOs
Parent: 760164d6
Showing 8 changed files with 14 additions and 13 deletions
src/transformers/configuration_albert.py +3 -3
src/transformers/configuration_bert.py +2 -2
src/transformers/configuration_ctrl.py +2 -2
src/transformers/configuration_distilbert.py +1 -1
src/transformers/configuration_gpt2.py +1 -1
src/transformers/configuration_openai.py +1 -1
src/transformers/configuration_xlm.py +2 -1
src/transformers/configuration_xlnet.py +2 -2
src/transformers/configuration_albert.py
...
...
@@ -47,9 +47,9 @@ class AlbertConfig(PretrainedConfig):
Vocabulary size of the ALBERT model. Defines the different tokens that
can be represented by the `inputs_ids` passed to the forward method of :class:`~transformers.AlbertModel`.
embedding_size (:obj:`int`, optional, defaults to 128):
-Size of vocabulary embeddings.
+Dimensionality of vocabulary embeddings.
hidden_size (:obj:`int`, optional, defaults to 4096):
-Size of the encoder layers and the pooler layer.
+Dimensionality of the encoder layers and the pooler layer.
num_hidden_layers (:obj:`int`, optional, defaults to 12):
Number of hidden layers in the Transformer encoder.
num_hidden_groups (:obj:`int`, optional, defaults to 1):
...
...
@@ -57,7 +57,7 @@ class AlbertConfig(PretrainedConfig):
num_attention_heads (:obj:`int`, optional, defaults to 64):
Number of attention heads for each attention layer in the Transformer encoder.
intermediate_size (:obj:`int`, optional, defaults to 16384):
-The size of the "intermediate" (i.e., feed-forward) layer in the Transformer encoder.
+The dimensionality of the "intermediate" (i.e., feed-forward) layer in the Transformer encoder.
inner_group_num (:obj:`int`, optional, defaults to 1):
The number of inner repetition of attention and ffn.
hidden_act (:obj:`str` or :obj:`function`, optional, defaults to "gelu_new"):
...
...
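As a rough illustration of why `embedding_size` and `hidden_size` are documented as separate dimensionalities, the sketch below compares embedding parameter counts with and without an intermediate embedding dimension. It uses the defaults from the docstring above (128 and 4096). The `vocab_size` of 30000 and both helper names are assumptions made for this example, not part of the commit, and the count covers weights only (no bias, position, or token-type embeddings).

```python
# Hypothetical helpers for illustration; they are not part of the transformers library.


def factorized_embedding_params(vocab_size: int, embedding_size: int, hidden_size: int) -> int:
    """Weights for a vocab -> embedding_size lookup plus an embedding_size -> hidden_size projection."""
    return vocab_size * embedding_size + embedding_size * hidden_size


def direct_embedding_params(vocab_size: int, hidden_size: int) -> int:
    """Weights for a lookup table that maps tokens straight to hidden_size."""
    return vocab_size * hidden_size


VOCAB_SIZE = 30000     # assumed value, not shown in the hunk above
EMBEDDING_SIZE = 128   # docstring default
HIDDEN_SIZE = 4096     # docstring default

factorized = factorized_embedding_params(VOCAB_SIZE, EMBEDDING_SIZE, HIDDEN_SIZE)
direct = direct_embedding_params(VOCAB_SIZE, HIDDEN_SIZE)

print(factorized)  # 4364288
print(direct)      # 122880000
```

Under these assumptions the factorized layout needs roughly 28 times fewer embedding weights, which is why the two values are configured independently.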
src/transformers/configuration_bert.py
...
...
@@ -65,13 +65,13 @@ class BertConfig(PretrainedConfig):
Vocabulary size of the BERT model. Defines the different tokens that
can be represented by the `inputs_ids` passed to the forward method of :class:`~transformers.BertModel`.
hidden_size (:obj:`int`, optional, defaults to 768):
-Size of the encoder layers and the pooler layer.
+Dimensionality of the encoder layers and the pooler layer.
num_hidden_layers (:obj:`int`, optional, defaults to 12):
Number of hidden layers in the Transformer encoder.
num_attention_heads (:obj:`int`, optional, defaults to 12):
Number of attention heads for each attention layer in the Transformer encoder.
intermediate_size (:obj:`int`, optional, defaults to 3072):
-The size of the "intermediate" (i.e., feed-forward) layer in the Transformer encoder.
+Dimensionality of the "intermediate" (i.e., feed-forward) layer in the Transformer encoder.
hidden_act (:obj:`str` or :obj:`function`, optional, defaults to "gelu"):
The non-linear activation function (function or string) in the encoder and pooler.
If string, "gelu", "relu", "swish" and "gelu_new" are supported.
...
...
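The relationship between the three BERT defaults documented above can be sketched in plain Python. This is an illustrative sketch only: the helper names are hypothetical, and the shapes are written as `(in_features, out_features)` pairs rather than taken from the library's actual layer objects.

```python
# Hypothetical helpers for illustration; they are not part of the transformers library.

HIDDEN_SIZE = 768          # docstring default
NUM_ATTENTION_HEADS = 12   # docstring default
INTERMEDIATE_SIZE = 3072   # docstring default


def attention_head_size(hidden_size: int, num_attention_heads: int) -> int:
    """Dimensionality handled by each attention head when hidden_size is split evenly."""
    if hidden_size % num_attention_heads != 0:
        raise ValueError("hidden_size must be a multiple of num_attention_heads")
    return hidden_size // num_attention_heads


def feed_forward_shapes(hidden_size: int, intermediate_size: int) -> list:
    """(in_features, out_features) of the two linear maps in a feed-forward block."""
    return [(hidden_size, intermediate_size), (intermediate_size, hidden_size)]


head_size = attention_head_size(HIDDEN_SIZE, NUM_ATTENTION_HEADS)
ffn_shapes = feed_forward_shapes(HIDDEN_SIZE, INTERMEDIATE_SIZE)

print(head_size)   # 64
print(ffn_shapes)  # [(768, 3072), (3072, 768)]
```

The feed-forward block expands each position from `hidden_size` to `intermediate_size` and projects back, so both parameters describe vector dimensionalities rather than layer counts.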
src/transformers/configuration_ctrl.py
...
...
@@ -44,11 +44,11 @@ class CTRLConfig(PretrainedConfig):
The maximum sequence length that this model might ever be used with.
Typically set this to something large just in case (e.g., 512 or 1024 or 2048).
n_ctx (:obj:`int`, optional, defaults to 256):
-Size of the causal mask (usually same as n_positions).
+Dimensionality of the causal mask (usually same as n_positions).
n_embd (:obj:`int`, optional, defaults to 1280):
Dimensionality of the embeddings and hidden states.
dff (:obj:`int`, optional, defaults to 8192):
-Size of the inner dimension of the FFN.
+Dimensionality of the inner dimension of the FFN.
n_layer (:obj:`int`, optional, defaults to 48):
Number of hidden layers in the Transformer encoder.
n_head (:obj:`int`, optional, defaults to 16):
...
...
src/transformers/configuration_distilbert.py
...
...
@@ -56,7 +56,7 @@ class DistilBertConfig(PretrainedConfig):
n_heads (:obj:`int`, optional, defaults to 12):
Number of attention heads for each attention layer in the Transformer encoder.
dim (:obj:`int`, optional, defaults to 768):
-Size of the encoder layers and the pooler layer.
+Dimensionality of the encoder layers and the pooler layer.
intermediate_size (:obj:`int`, optional, defaults to 3072):
The size of the "intermediate" (i.e., feed-forward) layer in the Transformer encoder.
dropout (:obj:`float`, optional, defaults to 0.1):
...
...
src/transformers/configuration_gpt2.py
...
...
@@ -52,7 +52,7 @@ class GPT2Config(PretrainedConfig):
The maximum sequence length that this model might ever be used with.
Typically set this to something large just in case (e.g., 512 or 1024 or 2048).
n_ctx (:obj:`int`, optional, defaults to 1024):
-Size of the causal mask (usually same as n_positions).
+Dimensionality of the causal mask (usually same as n_positions).
n_embd (:obj:`int`, optional, defaults to 768):
Dimensionality of the embeddings and hidden states.
n_layer (:obj:`int`, optional, defaults to 12):
...
...
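To show what `n_ctx` controls, here is a minimal pure-Python sketch of a causal mask: a square lower-triangular matrix with `n_ctx` rows and columns, where position `i` may attend only to positions `0..i`. The function name and the nested-list representation are assumptions for illustration; the library builds its mask as a tensor.

```python
# Hypothetical helper for illustration; not the library's implementation.


def causal_mask(n_ctx: int) -> list:
    """Return an n_ctx x n_ctx lower-triangular mask (1 = may attend, 0 = masked)."""
    return [[1 if key <= query else 0 for key in range(n_ctx)] for query in range(n_ctx)]


mask = causal_mask(4)  # small n_ctx so the result is easy to read
for row in mask:
    print(row)
# [1, 0, 0, 0]
# [1, 1, 0, 0]
# [1, 1, 1, 0]
# [1, 1, 1, 1]
```

Because the mask is square with side `n_ctx`, sequences longer than `n_ctx` cannot be masked correctly, which is why the docstring notes that it usually matches `n_positions`.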
src/transformers/configuration_openai.py
...
...
@@ -47,7 +47,7 @@ class OpenAIGPTConfig(PretrainedConfig):
The maximum sequence length that this model might ever be used with.
Typically set this to something large just in case (e.g., 512 or 1024 or 2048).
n_ctx (:obj:`int`, optional, defaults to 512):
-Size of the causal mask (usually same as n_positions).
+Dimensionality of the causal mask (usually same as n_positions).
n_embd (:obj:`int`, optional, defaults to 768):
Dimensionality of the embeddings and hidden states.
n_layer (:obj:`int`, optional, defaults to 12):
...
...
src/transformers/configuration_xlm.py
...
...
@@ -72,7 +72,8 @@ class XLMConfig(PretrainedConfig):
Causal models use a triangular attention mask in order to only attend to the left-side context instead
if a bidirectional context.
asm (:obj:`boolean`, optional, defaults to :obj:`False`):
-TODO
+Whether to use an adaptive log softmax projection layer instead of a linear layer for the prediction
+layer.
n_langs (:obj:`int`, optional, defaults to 1):
The number of languages the model handles. Set to 1 for monolingual models.
use_lang_emb (:obj:`boolean`, optional, defaults to :obj:`True`)
...
...
src/transformers/configuration_xlnet.py
...
...
@@ -45,13 +45,13 @@ class XLNetConfig(PretrainedConfig):
Vocabulary size of the XLNet model. Defines the different tokens that
can be represented by the `inputs_ids` passed to the forward method of :class:`~transformers.XLNetModel`.
d_model (:obj:`int`, optional, defaults to 1024):
-Size of the encoder layers and the pooler layer.
+Dimensionality of the encoder layers and the pooler layer.
n_layer (:obj:`int`, optional, defaults to 24):
Number of hidden layers in the Transformer encoder.
n_head (:obj:`int`, optional, defaults to 16):
Number of attention heads for each attention layer in the Transformer encoder.
d_inner (:obj:`int`, optional, defaults to 4096):
-The size of the "intermediate" (i.e., feed-forward) layer in the Transformer encoder.
+Dimensionality of the "intermediate" (i.e., feed-forward) layer in the Transformer encoder.
ff_activation (:obj:`string`, optional, defaults to "gelu"):
The non-linear activation function (function or string) in the
encoder and pooler. If string, "gelu", "relu" and "swish" are supported.
...
...