Commit da2d8ca2 authored by lukovnikov

fix for negative learning rate with warmup_linear in BertAdam (happens when t_total is specified incorrectly)
+ copied BERT optimization warmup functions to OpenAI optimization file + added comments
parent e04bab59
@@ -37,7 +37,7 @@ def warmup_linear(x, warmup=0.002):
         After `t_total`-th training step, learning rate is zero. """
     if x < warmup:
         return x/warmup
-    return max(1.0 - x, 0)
+    return max((x-1.)/(warmup-1.), 0)
 SCHEDULES = {
     'warmup_cosine':warmup_cosine,
@@ -37,7 +37,7 @@ def warmup_linear(x, warmup=0.002):
         After `t_total`-th training step, learning rate is zero. """
     if x < warmup:
         return x/warmup
-    return max(1.0 - x, 0)
+    return max((x-1.)/(warmup-1.), 0)
 SCHEDULES = {
     'warmup_cosine':warmup_cosine,
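
For context, a minimal standalone sketch (not part of the commit) of the schedule behaviour: x is the training progress global_step / t_total, so a too-small t_total pushes x past 1.0, and an un-clamped linear decay such as 1.0 - x then turns negative, which is the situation the commit message describes. The new expression clamps at zero and also rescales the decay so the multiplier is exactly 1.0 right after warmup. The function names below are illustrative only, not from the repository.

# Illustrative sketch only, not code from this commit; names are made up.

def warmup_linear_fixed(x, warmup=0.002):
    # New expression from the diff: linear warmup to 1.0, then linear decay
    # to 0.0, clamped so the learning-rate multiplier never goes below zero.
    if x < warmup:
        return x / warmup
    return max((x - 1.) / (warmup - 1.), 0)

def linear_decay_unclamped(x, warmup=0.002):
    # Un-clamped decay of the kind the commit message refers to (assumed form):
    # negative once x > 1.0, i.e. once training runs past the declared t_total.
    if x < warmup:
        return x / warmup
    return 1.0 - x

for x in (0.001, 0.002, 0.5, 1.2):  # x = global_step / t_total
    print("x=%.3f  unclamped=%+.3f  fixed=%+.3f"
          % (x, linear_decay_unclamped(x), warmup_linear_fixed(x)))
# x=1.200 prints unclamped=-0.200 (a negative multiplier) but fixed=+0.000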