"git@developer.sourcefind.cn:chenpangpang/transformers.git" did not exist on "0e774e57a67654ccb13e8684a2c08f7b11da9fb0"
Commit da2d8ca2 authored by lukovnikov


fix for negative learning rate with warmup_linear in BertAdam (happens when t_total is specified incorrectly)
+ copied BERT optimization warmup functions to OpenAI optimization file + added comments
parent e04bab59
@@ -37,7 +37,7 @@ def warmup_linear(x, warmup=0.002):
         After `t_total`-th training step, learning rate is zero. """
     if x < warmup:
         return x/warmup
-    return max(1.0 - x, 0)
+    return max((x-1.)/(warmup-1.), 0)
 
 SCHEDULES = {
     'warmup_cosine':warmup_cosine,
...
The same change is applied to the warmup functions copied into the OpenAI optimization file:

@@ -37,7 +37,7 @@ def warmup_linear(x, warmup=0.002):
         After `t_total`-th training step, learning rate is zero. """
     if x < warmup:
         return x/warmup
-    return max(1.0 - x, 0)
+    return max((x-1.)/(warmup-1.), 0)
 
 SCHEDULES = {
     'warmup_cosine':warmup_cosine,
...
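For context, here is a minimal standalone sketch (not part of the commit) comparing the schedule before and after this change. The names warmup_linear_old/warmup_linear_new and the base learning rate, warmup fraction, and t_total values are made up for illustration; x is the fraction step/t_total of training completed.

def warmup_linear_old(x, warmup=0.002):
    # Pre-commit behaviour: after warmup the decay starts from 1.0 - warmup
    # rather than exactly 1.0, leaving a small jump at x == warmup.
    if x < warmup:
        return x / warmup
    return max(1.0 - x, 0)

def warmup_linear_new(x, warmup=0.002):
    # Post-commit behaviour: with warmup < 1 the denominator (warmup - 1.)
    # is negative, so (x - 1.)/(warmup - 1.) decays linearly from exactly
    # 1.0 at x == warmup down to 0.0 at x == 1; max(..., 0) keeps it at
    # zero for x > 1 instead of letting it go negative.
    if x < warmup:
        return x / warmup
    return max((x - 1.) / (warmup - 1.), 0)

lr_base, warmup, t_total = 3e-5, 0.1, 1000  # made-up illustration values

for step in (50, 100, 500, 999, 1500):      # 1500 overshoots t_total
    x = step / t_total
    # When t_total is specified too small, x exceeds 1; without the clamp
    # the post-warmup expression turns negative, which is the negative
    # learning rate the commit message refers to.
    print(step, lr_base * warmup_linear_old(x, warmup),
          lr_base * warmup_linear_new(x, warmup))

The multiplier reaching exactly 0.0 at x = 1 matches the docstring above: "After `t_total`-th training step, learning rate is zero."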