Unverified Commit 36ade4a3 authored by Max Strobel, committed by GitHub

fix(PatchTST): Wrong dropout used for PretrainHead (#31117)



* fix(PatchTST): Wrong dropout used for PretrainHead

* feat(PatchTST): remove unused config.dropout

---------
Co-authored-by: Strobel Maximilian (IFAG PSS SIS SCE ACM) <Maximilian.Strobel@infineon.com>
parent e83cf581
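
In practical terms, the mask-pretraining head now takes its dropout rate from config.head_dropout, and the unused config.dropout argument is removed from PatchTSTConfig. A minimal sketch of exercising the fixed path (PatchTSTConfig and PatchTSTForPretraining are the public transformers entry points; the 0.2 rate, batch size, and default-sized config are illustrative):

import torch
from transformers import PatchTSTConfig, PatchTSTForPretraining

# head_dropout now also controls the dropout inside the mask-pretraining head;
# the removed `dropout` argument is no longer part of the config signature.
config = PatchTSTConfig(head_dropout=0.2)  # 0.2 is an illustrative rate

model = PatchTSTForPretraining(config)

# Dummy input of shape (batch, sequence_length, num_input_channels), using the config defaults.
past_values = torch.randn(4, config.context_length, config.num_input_channels)
outputs = model(past_values=past_values)
print(outputs.loss)  # masked-patch reconstruction loss
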
@@ -67,8 +67,6 @@ class PatchTSTConfig(PretrainedConfig):
             A value added to the denominator for numerical stability of normalization.
         attention_dropout (`float`, *optional*, defaults to 0.0):
             The dropout probability for the attention probabilities.
-        dropout (`float`, *optional*, defaults to 0.0):
-            The dropout probability for all fully connected layers in the Transformer.
         positional_dropout (`float`, *optional*, defaults to 0.0):
             The dropout probability in the positional embedding layer.
         path_dropout (`float`, *optional*, defaults to 0.0):
@@ -167,7 +165,6 @@ class PatchTSTConfig(PretrainedConfig):
         norm_type: str = "batchnorm",
         norm_eps: float = 1e-05,
         attention_dropout: float = 0.0,
-        dropout: float = 0.0,
         positional_dropout: float = 0.0,
         path_dropout: float = 0.0,
         ff_dropout: float = 0.0,
@@ -209,7 +206,6 @@ class PatchTSTConfig(PretrainedConfig):
         self.num_attention_heads = num_attention_heads
         self.ffn_dim = ffn_dim
         self.num_hidden_layers = num_hidden_layers
-        self.dropout = dropout
         self.attention_dropout = attention_dropout
         self.share_embedding = share_embedding
         self.channel_attention = channel_attention
@@ -1262,7 +1262,7 @@ class PatchTSTMaskPretrainHead(nn.Module):
     def __init__(self, config: PatchTSTConfig):
         super().__init__()
-        self.dropout = nn.Dropout(config.dropout)
+        self.dropout = nn.Dropout(config.head_dropout) if config.head_dropout > 0 else nn.Identity()
         self.linear = nn.Linear(config.d_model, config.patch_length)
         self.use_cls_token = config.use_cls_token
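
The new right-hand side also guards the dropout: when head_dropout is 0 the head gets an nn.Identity() instead of a Dropout layer, the same construct the other PatchTST heads already use for head_dropout. A standalone sketch of the pattern in plain PyTorch (the head_dropout value here is hypothetical):

import torch
from torch import nn

head_dropout = 0.0  # hypothetical value taken from a config

# Dropout(p=0.0) already passes tensors through unchanged, but swapping in
# Identity makes the no-op explicit and keeps the printed module tree clean.
dropout = nn.Dropout(head_dropout) if head_dropout > 0 else nn.Identity()

x = torch.randn(2, 8, 128)          # (batch, num_patches, d_model)
assert torch.equal(dropout(x), x)   # exact pass-through when head_dropout == 0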