Unverified commit e1f3156b — Fix many typos (#8708)
Authored Nov 22, 2020 by Santiago Castro; committed by GitHub Nov 21, 2020
Parent: 9c0afdaf
Changes: 35 · Showing 15 changed files on this page, with 21 additions and 21 deletions (+21 −21)
model_cards/mrm8488/t5-base-finetuned-wikiSQL-sql-to-en/README.md (+1 −1)
model_cards/mrm8488/t5-base-finetuned-wikiSQL/README.md (+1 −1)
model_cards/mrm8488/t5-small-finetuned-quora-for-paraphrasing/README.md (+1 −1)
model_cards/mrm8488/t5-small-finetuned-squadv1/README.md (+1 −1)
model_cards/mrm8488/t5-small-finetuned-squadv2/README.md (+1 −1)
model_cards/mrm8488/t5-small-finetuned-wikiSQL/README.md (+1 −1)
src/transformers/modeling_tf_pytorch_utils.py (+1 −1)
src/transformers/models/fsmt/modeling_fsmt.py (+1 −1)
src/transformers/models/t5/modeling_tf_t5.py (+1 −1)
src/transformers/models/transfo_xl/modeling_transfo_xl.py (+3 −3)
src/transformers/models/transfo_xl/modeling_transfo_xl_utilities.py (+2 −2)
src/transformers/models/xlm/modeling_tf_xlm.py (+2 −2)
src/transformers/models/xlm_roberta/tokenization_xlm_roberta.py (+1 −1)
src/transformers/models/xlnet/modeling_xlnet.py (+2 −2)
src/transformers/optimization_tf.py (+2 −2)
model_cards/mrm8488/t5-base-finetuned-wikiSQL-sql-to-en/README.md

@@ -19,7 +19,7 @@ Transfer learning, where a model is first pre-trained on a data-rich task before
 ## Details of the Dataset 📚
-Dataset ID: ```wikisql``` from [HugginFace/NLP](https://huggingface.co/nlp/viewer/?dataset=wikisql)
+Dataset ID: ```wikisql``` from [Huggingface/NLP](https://huggingface.co/nlp/viewer/?dataset=wikisql)
 | Dataset  | Split | # samples |
 | -------- | ----- | --------- |
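The corrected link points at the Huggingface NLP dataset viewer for the `wikisql` dataset ID used by these model cards. As a minimal sketch (assuming the `nlp` library the card links to, since renamed to `datasets`, is installed; the printed field names follow the WikiSQL schema on the Hub and should be treated as an assumption), the referenced dataset can be loaded like this:

```python
# Minimal sketch: load the dataset named by the model card's Dataset ID.
# The `nlp` library has since been renamed to `datasets`; both expose load_dataset.
from datasets import load_dataset

wikisql = load_dataset("wikisql")            # splits: train / validation / test
example = wikisql["train"][0]
print(example["question"])                   # natural-language question
print(example["sql"]["human_readable"])      # paired SQL query (field name per the Hub schema)
```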
model_cards/mrm8488/t5-base-finetuned-wikiSQL/README.md

@@ -19,7 +19,7 @@ Transfer learning, where a model is first pre-trained on a data-rich task before
 ## Details of the Dataset 📚
-Dataset ID: ```wikisql``` from [HugginFace/NLP](https://huggingface.co/nlp/viewer/?dataset=wikisql)
+Dataset ID: ```wikisql``` from [Huggingface/NLP](https://huggingface.co/nlp/viewer/?dataset=wikisql)
 | Dataset  | Split | # samples |
 | -------- | ----- | --------- |
model_cards/mrm8488/t5-small-finetuned-quora-for-paraphrasing/README.md

@@ -19,7 +19,7 @@ Transfer learning, where a model is first pre-trained on a data-rich task before
 ## Details of the downstream task (Question Paraphrasing) - Dataset 📚❓↔️❓
-Dataset ID: ```quora``` from [HugginFace/NLP](https://github.com/huggingface/nlp)
+Dataset ID: ```quora``` from [Huggingface/NLP](https://github.com/huggingface/nlp)
 | Dataset  | Split | # samples |
 | -------- | ----- | --------- |
model_cards/mrm8488/t5-small-finetuned-squadv1/README.md

@@ -19,7 +19,7 @@ Transfer learning, where a model is first pre-trained on a data-rich task before
 ## Details of the downstream task (Q&A) - Dataset 📚 🧐 ❓
-Dataset ID: ```squad``` from [HugginFace/NLP](https://github.com/huggingface/nlp)
+Dataset ID: ```squad``` from [Huggingface/NLP](https://github.com/huggingface/nlp)
 | Dataset  | Split | # samples |
 | -------- | ----- | --------- |
model_cards/mrm8488/t5-small-finetuned-squadv2/README.md

@@ -19,7 +19,7 @@ Transfer learning, where a model is first pre-trained on a data-rich task before
 ## Details of the downstream task (Q&A) - Dataset 📚 🧐 ❓
-Dataset ID: ```squad_v2``` from [HugginFace/NLP](https://github.com/huggingface/nlp)
+Dataset ID: ```squad_v2``` from [Huggingface/NLP](https://github.com/huggingface/nlp)
 | Dataset  | Split | # samples |
 | -------- | ----- | --------- |
model_cards/mrm8488/t5-small-finetuned-wikiSQL/README.md

@@ -19,7 +19,7 @@ Transfer learning, where a model is first pre-trained on a data-rich task before
 ## Details of the Dataset 📚
-Dataset ID: ```wikisql``` from [HugginFace/NLP](https://huggingface.co/nlp/viewer/?dataset=wikisql)
+Dataset ID: ```wikisql``` from [Huggingface/NLP](https://huggingface.co/nlp/viewer/?dataset=wikisql)
 | Dataset  | Split | # samples |
 | -------- | ----- | --------- |
src/transformers/modeling_tf_pytorch_utils.py

@@ -39,7 +39,7 @@ def convert_tf_weight_name_to_pt_weight_name(tf_name, start_prefix_to_remove="")
     return tuple with:

         - pytorch model weight name
-        - transpose: boolean indicating wether TF2.0 and PyTorch weights matrices are transposed with regards to each
+        - transpose: boolean indicating whether TF2.0 and PyTorch weights matrices are transposed with regards to each
           other
     """
     tf_name = tf_name.replace(":0", "")  # device ids
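The fixed docstring spells out the helper's contract: it returns the PyTorch weight name plus a boolean saying whether the TF2.0 and PyTorch weight matrices are transposed relative to each other. A minimal usage sketch of that contract, assuming the internal import path implied by the file name and the signature shown in the hunk (the example TF variable name is hypothetical):

```python
# Minimal sketch of the documented contract: the helper maps a TF2.0 variable
# name to (pytorch_weight_name, transpose_flag). This is an internal transformers
# utility; path and signature here follow the hunk above and may vary by version.
from transformers.modeling_tf_pytorch_utils import convert_tf_weight_name_to_pt_weight_name

pt_name, transpose = convert_tf_weight_name_to_pt_weight_name(
    "tf_bert_model/bert/encoder/layer_._0/attention/self/query/kernel:0"  # hypothetical TF name
)
print(pt_name)    # e.g. a dotted PyTorch parameter name ending in ".weight"
print(transpose)  # True when the TF kernel is stored transposed w.r.t. the PyTorch weight
```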
src/transformers/models/fsmt/modeling_fsmt.py

@@ -951,7 +951,7 @@ class FSMTModel(PretrainedFSMTModel):
                 output_hidden_states=output_hidden_states,
                 return_dict=return_dict,
             )
-        # If the user passed a tuple for encoder_outputs, we wrap it in a BaseModelOuput when return_dict=False
+        # If the user passed a tuple for encoder_outputs, we wrap it in a BaseModelOutput when return_dict=False
         elif return_dict and not isinstance(encoder_outputs, BaseModelOutput):
             encoder_outputs = BaseModelOutput(
                 last_hidden_state=encoder_outputs[0],
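The corrected comment describes a normalization step: encoder outputs supplied as a plain tuple are rewrapped in a `BaseModelOutput` so later code can rely on attribute access. A standalone sketch of that pattern, assuming `BaseModelOutput` is imported from `transformers.modeling_outputs` (the tensor shapes are illustrative):

```python
# Standalone sketch of the wrapping pattern described by the comment above.
# BaseModelOutput is the real transformers class; the tensors are illustrative only.
import torch
from transformers.modeling_outputs import BaseModelOutput

encoder_outputs = (torch.zeros(1, 5, 16),)  # user-supplied tuple: (last_hidden_state,)

if not isinstance(encoder_outputs, BaseModelOutput):
    encoder_outputs = BaseModelOutput(last_hidden_state=encoder_outputs[0])

print(type(encoder_outputs).__name__)            # BaseModelOutput
print(encoder_outputs.last_hidden_state.shape)   # torch.Size([1, 5, 16])
```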
src/transformers/models/t5/modeling_tf_t5.py

@@ -642,7 +642,7 @@ class TFT5MainLayer(tf.keras.layers.Layer):
             raise ValueError(f"You have to specify either {err_msg_prefix}inputs or {err_msg_prefix}inputs_embeds")

         if inputs_embeds is None:
-            assert self.embed_tokens is not None, "You have to intialize the model with valid token embeddings"
+            assert self.embed_tokens is not None, "You have to initialize the model with valid token embeddings"
             inputs_embeds = self.embed_tokens(input_ids)

         batch_size, seq_length = input_shape
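The hunk enforces a simple input contract: the caller must provide token ids or precomputed `inputs_embeds`, and ids can only be used if the layer was initialized with a token-embedding table. A toy, self-contained sketch of the same contract (it mirrors the logic above; it is not the actual `TFT5MainLayer` code):

```python
# Toy mirror of the check in the hunk: need token ids or embeddings, and ids
# additionally require an embedding table to look them up in.
import tensorflow as tf

def resolve_inputs_embeds(inputs=None, inputs_embeds=None, embed_tokens=None, err_msg_prefix=""):
    if inputs is None and inputs_embeds is None:
        raise ValueError(f"You have to specify either {err_msg_prefix}inputs or {err_msg_prefix}inputs_embeds")
    if inputs_embeds is None:
        assert embed_tokens is not None, "You have to initialize the model with valid token embeddings"
        inputs_embeds = embed_tokens(inputs)
    return inputs_embeds

embed = tf.keras.layers.Embedding(input_dim=128, output_dim=8)   # stand-in for self.embed_tokens
token_ids = tf.constant([[5, 7, 9]])
print(resolve_inputs_embeds(inputs=token_ids, embed_tokens=embed).shape)  # (1, 3, 8)
```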
src/transformers/models/transfo_xl/modeling_transfo_xl.py

@@ -667,9 +667,9 @@ class TransfoXLLMHeadModelOutput(ModelOutput):

     @property
     def logits(self):
-        # prediciton scores are the output of the adaptive softmax, see
+        # prediction scores are the output of the adaptive softmax, see
         # the file `modeling_transfo_xl_utilities`. Since the adaptive
-        # softmax returns the log softmax value, `self.prediciton_scores`
+        # softmax returns the log softmax value, `self.prediction_scores`
         # are strictly speaking not exactly `logits`, but behave the same
         # way logits do.
         return self.prediction_scores

@@ -886,7 +886,7 @@ class TransfoXLModel(TransfoXLPreTrainedModel):
             head_mask = head_mask.unsqueeze(1).unsqueeze(1).unsqueeze(1)
             head_mask = head_mask.to(
                 dtype=next(self.parameters()).dtype
-            )  # switch to fload if need + fp16 compatibility
+            )  # switch to float if need + fp16 compatibility
         else:
             head_mask = [None] * self.n_layer
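The second hunk prepares `head_mask` for broadcasting and casts it to the model's parameter dtype, which is what the corrected comment about fp16 compatibility refers to. An illustrative sketch of that pattern with made-up sizes:

```python
# Illustrative sketch of the head_mask preparation in the hunk above: a
# (num_layers, num_heads) mask gets singleton dims inserted so it broadcasts
# over the attention tensors, then is cast to the parameter dtype for fp16 runs.
import torch

n_layer, n_head = 4, 8
head_mask = torch.ones(n_layer, n_head)                        # 1 = keep head, 0 = prune head
head_mask = head_mask.unsqueeze(1).unsqueeze(1).unsqueeze(1)   # -> (n_layer, 1, 1, 1, n_head)

param_dtype = torch.float16                                    # stands in for next(self.parameters()).dtype
head_mask = head_mask.to(dtype=param_dtype)                    # switch to float if needed + fp16 compatibility
print(head_mask.shape, head_mask.dtype)                        # torch.Size([4, 1, 1, 1, 8]) torch.float16
```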
src/transformers/models/transfo_xl/modeling_transfo_xl_utilities.py

@@ -91,8 +91,8 @@ class ProjectedAdaptiveLogSoftmax(nn.Module):
         Return:
             if labels is None: out :: [len*bsz x n_tokens] log probabilities of tokens over the vocabulary else: out ::
-            [(len-1)*bsz] Negative log likelihood We could replace this implementation by the native PyTorch one if
-            their's had an option to set bias on all clusters in the native one. here:
+            [(len-1)*bsz] Negative log likelihood. We could replace this implementation by the native PyTorch one if
+            theirs had an option to set bias on all clusters in the native one. here:
             https://github.com/pytorch/pytorch/blob/dbe6a7a9ff1a364a8706bf5df58a1ca96d2fd9da/torch/nn/modules/adaptive.py#L138
         """
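The fixed docstring contrasts this implementation with the native PyTorch adaptive softmax, which it could replace if that one allowed a bias on every cluster. For reference, a small sketch of the native `nn.AdaptiveLogSoftmaxWithLoss` and its single `head_bias` option (the sizes are illustrative):

```python
# Sketch of the "native PyTorch one" the docstring links to. Note the single
# `head_bias` flag: there is no option to put a bias on every cluster, which is
# why transformers keeps its own ProjectedAdaptiveLogSoftmax.
import torch
import torch.nn as nn

adaptive = nn.AdaptiveLogSoftmaxWithLoss(
    in_features=64, n_classes=1000, cutoffs=[100, 500], head_bias=True
)
hidden = torch.randn(32, 64)              # [len*bsz, in_features]
labels = torch.randint(0, 1000, (32,))

out = adaptive(hidden, labels)            # namedtuple with per-sample output and mean NLL loss
log_probs = adaptive.log_prob(hidden)     # [len*bsz, n_classes] log probabilities over the vocabulary
print(out.loss.item(), log_probs.shape)
```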
src/transformers/models/xlm/modeling_tf_xlm.py

@@ -633,11 +633,11 @@ XLM_INPUTS_DOCSTRING = r"""
             A parallel sequence of tokens to be used to indicate the language of each token in the input. Indices are
             languages ids which can be obtained from the language names by using two conversion mappings provided in
             the configuration of the model (only provided for multilingual models). More precisely, the `language name
-            to language id` mapping is in :obj:`model.config.lang2id` (which is a dictionary strring to int) and the
+            to language id` mapping is in :obj:`model.config.lang2id` (which is a dictionary string to int) and the
             `language id to language name` mapping is in :obj:`model.config.id2lang` (dictionary int to string).

             See usage examples detailed in the :doc:`multilingual documentation <../multilingual>`.
-t        token_type_ids (:obj:`Numpy array` or :obj:`tf.Tensor` of shape :obj:`({0})`, `optional`):
+        token_type_ids (:obj:`Numpy array` or :obj:`tf.Tensor` of shape :obj:`({0})`, `optional`):
             Segment token indices to indicate first and second portions of the inputs. Indices are selected in ``[0,
             1]``:
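The docstring describes how the `langs` tensor is built from `model.config.lang2id` (string to int) and read back through `model.config.id2lang`. A minimal sketch of that usage, assuming a multilingual XLM checkpoint such as `xlm-mlm-enfr-1024` and the standard `langs` keyword on the TF model call:

```python
# Sketch of building the `langs` input described above: every position gets the
# id of its language, taken from model.config.lang2id. The checkpoint name is an
# example of a multilingual XLM model; argument names follow recent transformers versions.
import tensorflow as tf
from transformers import TFXLMModel, XLMTokenizer

tokenizer = XLMTokenizer.from_pretrained("xlm-mlm-enfr-1024")
model = TFXLMModel.from_pretrained("xlm-mlm-enfr-1024")

inputs = tokenizer("Hello, my dog is cute", return_tensors="tf")
english_id = model.config.lang2id["en"]                      # language name -> language id
langs = tf.fill(tf.shape(inputs["input_ids"]), english_id)   # same shape as input_ids

outputs = model(inputs["input_ids"], langs=langs)
print(sorted(model.config.lang2id))                          # e.g. ['en', 'fr'] for this bilingual checkpoint
```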
src/transformers/models/xlm_roberta/tokenization_xlm_roberta.py

@@ -54,7 +54,7 @@ PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES = {
 class XLMRobertaTokenizer(PreTrainedTokenizer):
     """
-    Adapted from :class:`~transfomers.RobertaTokenizer` and class:`~transfomers.XLNetTokenizer`. Based on
+    Adapted from :class:`~transformers.RobertaTokenizer` and class:`~transformers.XLNetTokenizer`. Based on
     `SentencePiece <https://github.com/google/sentencepiece>`__.

     This tokenizer inherits from :class:`~transformers.PreTrainedTokenizer` which contains most of the main methods.
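A minimal sketch of the SentencePiece-based tokenizer this docstring describes, exercised through the inherited `PreTrainedTokenizer` interface (`xlm-roberta-base` is the standard checkpoint; the printed pieces are illustrative):

```python
# Minimal sketch: the tokenizer segments text into SentencePiece pieces and maps
# them to ids via the methods inherited from PreTrainedTokenizer.
from transformers import XLMRobertaTokenizer

tokenizer = XLMRobertaTokenizer.from_pretrained("xlm-roberta-base")
print(tokenizer.tokenize("Hello world!"))   # SentencePiece pieces, e.g. ['▁Hello', '▁world', '!']
encoded = tokenizer("Hello world!")
print(encoded["input_ids"])                 # ids including the special <s> ... </s> tokens
```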
src/transformers/models/xlnet/modeling_xlnet.py

@@ -904,7 +904,7 @@ XLNET_INPUTS_DOCSTRING = r"""
             Mask values selected in ``[0, 1]``:

             - 1 for tokens that are **masked**,
-            - 0 for tokens that are **not maked**.
+            - 0 for tokens that are **not masked**.

             You can only uses one of :obj:`input_mask` and :obj:`attention_mask`.
         head_mask (:obj:`torch.FloatTensor` of shape :obj:`(num_heads,)` or :obj:`(num_layers, num_heads)`, `optional`):

@@ -1211,7 +1211,7 @@ class XLNetModel(XLNetPreTrainedModel):
             head_mask = head_mask.unsqueeze(1).unsqueeze(1).unsqueeze(1)
             head_mask = head_mask.to(
                 dtype=next(self.parameters()).dtype
-            )  # switch to fload if need + fp16 compatibility
+            )  # switch to float if need + fp16 compatibility
         else:
             head_mask = [None] * self.n_layer
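The first hunk documents `input_mask`, where 1 marks a masked (padding) token and 0 a real one, and notes that it is mutually exclusive with `attention_mask`. Assuming the usual transformers convention that `attention_mask` uses the opposite polarity, the two relate as in this sketch (`xlnet_model` is a placeholder for a loaded XLNet model):

```python
# Sketch of the two mutually exclusive padding masks mentioned in the docstring:
# `input_mask` flags padded positions with 1, while the usual `attention_mask`
# flags real tokens with 1, so one is the complement of the other.
import torch

attention_mask = torch.tensor([[1, 1, 1, 0, 0]])   # 1 = real token, 0 = padding
input_mask = 1 - attention_mask                     # 1 = masked (padding), 0 = not masked
print(input_mask)                                   # tensor([[0, 0, 0, 1, 1]])

# Pass exactly one of them to the model, e.g.:
# outputs = xlnet_model(input_ids, attention_mask=attention_mask)
# outputs = xlnet_model(input_ids, input_mask=input_mask)
```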
src/transformers/optimization_tf.py

@@ -167,9 +167,9 @@ class AdamWeightDecay(tf.keras.optimizers.Adam):
         beta_2 (:obj:`float`, `optional`, defaults to 0.999):
             The beta2 parameter in Adam, which is the exponential decay rate for the 2nd momentum estimates.
         epsilon (:obj:`float`, `optional`, defaults to 1e-7):
-            The epsilon paramenter in Adam, which is a small constant for numerical stability.
+            The epsilon parameter in Adam, which is a small constant for numerical stability.
         amsgrad (:obj:`bool`, `optional`, default to `False`):
-            Whether to apply AMSGrad varient of this algorithm or not, see `On the Convergence of Adam and Beyond
+            Whether to apply AMSGrad variant of this algorithm or not, see `On the Convergence of Adam and Beyond
             <https://arxiv.org/abs/1904.09237>`__.
         weight_decay_rate (:obj:`float`, `optional`, defaults to 0):
             The weight decay to apply.
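The fixed docstring belongs to `AdamWeightDecay`, the Adam subclass with decoupled weight decay. A minimal construction sketch using only the parameters that docstring documents (the values are illustrative, not recommendations):

```python
# Minimal sketch of instantiating the optimizer whose docstring is fixed above.
import tensorflow as tf
from transformers import AdamWeightDecay

optimizer = AdamWeightDecay(
    learning_rate=3e-5,
    beta_2=0.999,            # exponential decay rate for the 2nd momentum estimates
    epsilon=1e-7,            # small constant for numerical stability
    amsgrad=False,           # whether to apply the AMSGrad variant
    weight_decay_rate=0.01,  # decoupled weight decay to apply
)

model = tf.keras.Sequential([tf.keras.layers.Dense(2)])
model.compile(optimizer=optimizer, loss="mse")
```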