Commit e00b4ff1 authored by thomwolf

fix #1017

parent 2f939713
@@ -393,8 +393,8 @@ for batch in train_data:
     loss = model(batch)
     loss.backward()
     torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)  # Gradient clipping is not in AdamW anymore (so you can use amp without issue)
-    scheduler.step()
     optimizer.step()
+    scheduler.step()
     optimizer.zero_grad()
 ```
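For readers updating their own training loops, below is a minimal, self-contained sketch of the corrected ordering. It uses a dummy model and data, with plain `torch.optim.AdamW` and a `LambdaLR` schedule standing in for the repository's optimizer and schedule classes, so treat it as an illustration of the step order rather than the library's exact API: since PyTorch 1.1, `optimizer.step()` should be called before `scheduler.step()`.

```python
import torch
import torch.nn.functional as F

# Hypothetical model, data and hyper-parameters, for illustration only.
model = torch.nn.Linear(10, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)   # stand-in for the library's AdamW
scheduler = torch.optim.lr_scheduler.LambdaLR(               # stand-in for a warmup/decay schedule
    optimizer, lr_lambda=lambda step: max(0.0, 1.0 - step / 100)
)
max_grad_norm = 1.0
train_data = [(torch.randn(4, 10), torch.randint(0, 2, (4,))) for _ in range(100)]

for inputs, labels in train_data:
    loss = F.cross_entropy(model(inputs), labels)
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)  # clip manually; not done inside AdamW
    optimizer.step()       # update the parameters first...
    scheduler.step()       # ...then advance the learning-rate schedule (PyTorch >= 1.1 ordering)
    optimizer.zero_grad()  # reset gradients for the next batch
```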