Commit 58b170cd (unverified)
Authored Apr 11, 2024 by hugehope; committed by GitHub on Apr 11, 2024

chore: remove repetitive words (#30174)

Signed-off-by: hugehope <cmm7@sina.cn>

Parent: e50be9a0
Showing 4 changed files, with 4 additions and 4 deletions (+4, -4)
src/transformers/models/canine/modeling_canine.py      +1 -1
src/transformers/models/mamba/configuration_mamba.py   +1 -1
src/transformers/models/rwkv/configuration_rwkv.py     +1 -1
src/transformers/optimization.py                       +1 -1
src/transformers/models/canine/modeling_canine.py

@@ -608,7 +608,7 @@ class CanineAttention(nn.Module):
                 chunk_end = min(from_seq_length, chunk_start + self.attend_from_chunk_width)
                 from_chunks.append((chunk_start, chunk_end))
 
-            # Determine the chunks (windows) that will will attend *to*.
+            # Determine the chunks (windows) that will attend *to*.
             to_chunks = []
             if self.first_position_attends_to_all:
                 to_chunks.append((0, to_seq_length))
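For orientation, the fixed comment lives in CANINE's local-attention setup, which slices the sequence into fixed-width windows. Below is a rough, self-contained Python sketch of that windowing idea. It is simplified from CanineAttention (the real forward pass handles additional cases when building the *from* windows), and every width, stride, and length here is an illustrative assumption, not the library's values.

# Simplified sketch of CANINE-style attention windows; not the library code.
from_seq_length = 16
to_seq_length = 16
attend_from_chunk_stride = 4   # hypothetical values for illustration
attend_from_chunk_width = 4
attend_to_chunk_stride = 4
attend_to_chunk_width = 4
first_position_attends_to_all = True

# Chunks (windows) that attend *from*: slide a fixed-width window over
# the source positions, clipping the last window at the sequence end.
from_chunks = []
for chunk_start in range(0, from_seq_length, attend_from_chunk_stride):
    chunk_end = min(from_seq_length, chunk_start + attend_from_chunk_width)
    from_chunks.append((chunk_start, chunk_end))

# Chunks (windows) that will attend *to* (the line the commit fixes):
# optionally one global window first, then the same sliding pattern.
to_chunks = []
if first_position_attends_to_all:
    to_chunks.append((0, to_seq_length))
for chunk_start in range(0, to_seq_length, attend_to_chunk_stride):
    chunk_end = min(to_seq_length, chunk_start + attend_to_chunk_width)
    to_chunks.append((chunk_start, chunk_end))

print(from_chunks)  # [(0, 4), (4, 8), (8, 12), (12, 16)]
print(to_chunks)    # [(0, 16), (0, 4), (4, 8), (8, 12), (12, 16)]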
src/transformers/models/mamba/configuration_mamba.py

@@ -67,7 +67,7 @@ class MambaConfig(PretrainedConfig):
         residual_in_fp32 (`bool`, *optional*, defaults to `True`):
             Whether or not residuals should be in `float32`. If set to `False` residuals will keep the same `dtype` as the rest of the model
         time_step_rank (`Union[int,str]`, *optional*, defaults to `"auto"`):
-            Rank of the the discretization projection matrix. `"auto"` means that it will default to `math.ceil(self.hidden_size / 16)`
+            Rank of the discretization projection matrix. `"auto"` means that it will default to `math.ceil(self.hidden_size / 16)`
         time_step_scale (`float`, *optional*, defaults to 1.0):
             Scale used used to scale `dt_proj.bias`.
         time_step_min (`float`, *optional*, defaults to 0.001):
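As a quick sanity check of the corrected line, here is a hedged sketch of how `time_step_rank="auto"` is expected to resolve, assuming (per the docstring above) that `"auto"` defaults to `math.ceil(hidden_size / 16)` when the config is constructed.

import math

from transformers import MambaConfig

# Assumption: MambaConfig resolves "auto" at construction time, as its
# docstring describes; hidden_size=768 is an arbitrary example value.
config = MambaConfig(hidden_size=768, time_step_rank="auto")
print(config.time_step_rank)               # expected: 48
print(math.ceil(config.hidden_size / 16))  # 48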
src/transformers/models/rwkv/configuration_rwkv.py

@@ -41,7 +41,7 @@ class RwkvConfig(PretrainedConfig):
             Vocabulary size of the RWKV model. Defines the number of different tokens that can be represented by the
             `inputs_ids` passed when calling [`RwkvModel`].
         context_length (`int`, *optional*, defaults to 1024):
-            The maximum sequence length that this model can be be used with in a single forward (using it in RNN mode
+            The maximum sequence length that this model can be used with in a single forward (using it in RNN mode
             lets use any sequence length).
         hidden_size (`int`, *optional*, defaults to 4096):
             Dimensionality of the embeddings and hidden states.
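The fixed sentence distinguishes a bounded single forward pass from unbounded RNN-mode use. A minimal sketch of setting that bound follows; the parameter name is taken from the docstring above, and the RNN-mode behavior is what the docstring describes rather than something the config object enforces.

from transformers import RwkvConfig

# context_length caps one transformer-style forward pass; RNN-mode
# generation can run to any length, per the docstring above.
config = RwkvConfig(context_length=2048)
print(config.context_length)  # 2048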
src/transformers/optimization.py

@@ -273,7 +273,7 @@ def get_polynomial_decay_schedule_with_warmup(
     lr_init = optimizer.defaults["lr"]
     if not (lr_init > lr_end):
-        raise ValueError(f"lr_end ({lr_end}) must be be smaller than initial lr ({lr_init})")
+        raise ValueError(f"lr_end ({lr_end}) must be smaller than initial lr ({lr_init})")
 
     lr_lambda = partial(
         _get_polynomial_decay_schedule_with_warmup_lr_lambda,
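The corrected error message is easy to trigger. A minimal sketch, assuming the public signature get_polynomial_decay_schedule_with_warmup(optimizer, num_warmup_steps, num_training_steps, lr_end=1e-7, ...): setting the optimizer's lr equal to lr_end makes the `lr_init > lr_end` check fail, so the ValueError fires with the de-duplicated wording.

import torch
from transformers import get_polynomial_decay_schedule_with_warmup

model = torch.nn.Linear(4, 4)
# lr equals lr_end below (both 1e-7), so lr_init > lr_end is False
# and the guard in the hunk above raises.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-7)

try:
    get_polynomial_decay_schedule_with_warmup(
        optimizer, num_warmup_steps=10, num_training_steps=100, lr_end=1e-7
    )
except ValueError as err:
    print(err)  # lr_end (1e-07) must be smaller than initial lr (1e-07)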