chenpangpang / transformers
Commit 4cdbf63c (Unverified)
Authored Jul 09, 2021 by Stas Bekman; committed by GitHub on Jul 09, 2021
[debugging utils] minor doc improvements (#12525)
Parent: fb65f65e
Showing 1 changed file with 16 additions and 6 deletions
src/transformers/debug_utils.py (+16, -6)
@@ -87,6 +87,9 @@ class DebugUnderflowOverflow:
         debug_overflow = DebugUnderflowOverflow(model, max_frames_to_save=100)
+    To validate that you have set up this debugging feature correctly, and you intend to use it in a training that may
+    take hours to complete, first run it with normal tracing enabled for one of a few batches as explained in the next
+    section.
     Mode 2. Specific batch absolute min/max tracing without detection
@@ -104,12 +107,19 @@ class DebugUnderflowOverflow:
     fast-forward right to that area.
+    Early stopping:
     You can also specify the batch number after which to stop the training, with ::
         debug_overflow = DebugUnderflowOverflow(model, trace_batch_nums=[1,3], abort_after_batch_num=3)
-    This feature is mainly useful in the tracing mode, but you can use it for any more.
+    This feature is mainly useful in the tracing mode, but you can use it for any mode.
+    **Performance**:
+    As this module measures absolute ``min``/``max`` of each weight of the model on every forward it'll slow the
+    training down. Therefore remember to turn it off once the debugging needs have been met.
     Args:
         model (:obj:`nn.Module`):
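The early-stopping and batch-tracing options described in this hunk can be pictured with a small, library-free sketch. This is not the `DebugUnderflowOverflow` implementation: the `BatchGate` class, its method names, the `run` driver, and the zero-based batch counter are all assumptions made for illustration. Only the parameter names `trace_batch_nums` and `abort_after_batch_num` come from the docstring.

```python
class BatchGate:
    """Hypothetical helper: decides which batches to trace and when to stop."""

    def __init__(self, trace_batch_nums=(), abort_after_batch_num=None):
        self.trace_batch_nums = set(trace_batch_nums)
        self.abort_after_batch_num = abort_after_batch_num
        self.batch_number = 0  # assumption: batches are counted from zero

    def should_trace(self):
        # Trace only the batches that were explicitly requested.
        return self.batch_number in self.trace_batch_nums

    def end_batch(self):
        """Advance the counter; return True when training should stop."""
        finished = self.batch_number
        self.batch_number += 1
        return (
            self.abort_after_batch_num is not None
            and finished >= self.abort_after_batch_num
        )


def run(num_batches, gate):
    """Hypothetical training loop: returns (traced batch numbers, batches processed)."""
    traced = []
    for _ in range(num_batches):
        if gate.should_trace():
            traced.append(gate.batch_number)
        if gate.end_batch():
            break
    return traced, gate.batch_number
```

Under these assumptions, `run(10, BatchGate(trace_batch_nums=[1, 3], abort_after_batch_num=3))` traces batches 1 and 3 and stops after the fourth batch has been processed, which keeps a short validation run from continuing into hours of slowed-down training.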
@@ -277,20 +287,20 @@ def get_abs_min_max(var, ctx):
 def detect_overflow(var, ctx):
     """
-    Report of the tensor contains any ``nan`` and ``inf`` entries.
+    Report whether the tensor contains any ``nan`` or ``inf`` entries.
     This is useful for detecting overflows/underflows and best to call right after the function that did some math that
-    modified the variable in question.
+    modified the tensor in question.
-    The function contains a few other helper features that you can enable and tweak directly if you want to track
+    This function contains a few other helper features that you can enable and tweak directly if you want to track
     various other things.
     Args:
-        var: tensor variable to check
+        var: the tensor variable to check
         ctx: the message to print as a context
     Return:
-        True if ``inf`` or ``nan`` was detected, False otherwise
+        :obj:`True` if ``inf`` or ``nan`` was detected, :obj:`False` otherwise
     """
     detected = False
     if torch.isnan(var).any().item():
...
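The corrected docstring says the function reports whether the tensor contains any ``nan`` or ``inf`` entries, so either condition alone is enough to return :obj:`True`. A minimal plain-Python analogue of that contract is sketched below. It works on an iterable of floats with the standard `math` module instead of a `torch` tensor, and the name `detect_overflow_sketch` and the printed messages are assumptions for illustration, not the code in `debug_utils.py`.

```python
import math


def detect_overflow_sketch(values, ctx):
    """Return True if any value is nan or inf; print ctx for each kind found."""
    values = list(values)  # allow generators to be scanned twice
    detected = False
    if any(math.isnan(v) for v in values):
        detected = True
        print(f"{ctx} has nans")
    if any(math.isinf(v) for v in values):
        detected = True
        print(f"{ctx} has infs")
    return detected
```

As the docstring recommends for the real function, a check like this is most informative when called right after the computation that modified the values, for example `detect_overflow_sketch(outputs, "after scaling")`, so that `ctx` points at the step that produced the bad entries.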