OpenDAS / nni
Commit a272da9e (unverified)
Authored Mar 10, 2021 by J-shang
Committed by GitHub, Mar 10, 2021
hotfix unhandled `TrainingService is not assigned` and extend exec time in pipeline (#3442)
Parent: 62af469b

Showing 3 changed files with 20 additions and 16 deletions:
  test/config/assessors/curvefitting.yml  (+1 -1)
  test/nni_test/nnitest/run_tests.py      (+2 -2)
  ts/nni_manager/core/nnimanager.ts       (+17 -13)

test/config/assessors/curvefitting.yml
 authorName: nni
 experimentName: default_test
-maxExecDuration: 5m
+maxExecDuration: 10m
 maxTrialNum: 8
 trialConcurrency: 8
 searchSpacePath: ../naive_trial/search_space.json
 ...

test/nni_test/nnitest/run_tests.py
@@ -260,9 +260,9 @@ def run(args):
             continue
         # remote mode need more time to cleanup
         if args.ts == 'remote':
-            wait_for_port_available(8080, 180)
+            wait_for_port_available(8080, 240)
         else:
-            wait_for_port_available(8080, 30)
+            wait_for_port_available(8080, 60)
         # adl mode need more time to cleanup PVC
         if args.ts == 'adl' and name == 'nnictl-resume-2':
 ...

ts/nni_manager/core/nnimanager.ts
@@ -326,22 +326,26 @@ class NNIManager implements Manager {
     }

     public async stopExperimentBottomHalf(): Promise<void> {
-        const trialJobList: TrialJobDetail[] = await this.trainingService.listTrialJobs();
+        try {
+            const trialJobList: TrialJobDetail[] = await this.trainingService.listTrialJobs();

-        // DON'T try to make it in parallel, the training service may not handle it well.
-        // If there is performance concern, consider to support batch cancellation on training service.
-        for (const trialJob of trialJobList) {
-            if (trialJob.status === 'RUNNING' ||
-                trialJob.status === 'WAITING') {
-                try {
-                    this.log.info(`cancelTrialJob: ${trialJob.id}`);
-                    await this.trainingService.cancelTrialJob(trialJob.id);
-                } catch (error) {
-                    this.log.debug(`ignorable error on canceling trial ${trialJob.id}. ${error}`);
+            // DON'T try to make it in parallel, the training service may not handle it well.
+            // If there is performance concern, consider to support batch cancellation on training service.
+            for (const trialJob of trialJobList) {
+                if (trialJob.status === 'RUNNING' ||
+                    trialJob.status === 'WAITING') {
+                    try {
+                        this.log.info(`cancelTrialJob: ${trialJob.id}`);
+                        await this.trainingService.cancelTrialJob(trialJob.id);
+                    } catch (error) {
+                        this.log.debug(`ignorable error on canceling trial ${trialJob.id}. ${error}`);
+                    }
                 }
             }
-        }
-        await this.trainingService.cleanUp();
+            await this.trainingService.cleanUp();
+        } catch (err) {
+            this.log.error(`${err.stack}`);
+        }
         if (this.experimentProfile.endTime === undefined) {
             this.setEndtime();
         }
 ...
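
The effect of the nnimanager.ts change can be illustrated with a small standalone sketch. Everything below is hypothetical and simplified: `SketchManager`, `requireService`, the `errors` array, and the reduced `TrainingService` interface are stand-ins for illustration, not NNI's real classes. It also assumes the `TrainingService is not assigned` error is thrown when the service is accessed before one has been set. The point it shows is that wrapping the whole cancel-and-clean-up sequence in an outer try/catch lets the end time be recorded even when that error occurs.

```typescript
// Hypothetical, simplified stand-ins for NNI's types (not the real interfaces).
type TrialStatus = "RUNNING" | "WAITING" | "SUCCEEDED";

interface TrialJob {
    id: string;
    status: TrialStatus;
}

interface TrainingService {
    listTrialJobs(): Promise<TrialJob[]>;
    cancelTrialJob(id: string): Promise<void>;
    cleanUp(): Promise<void>;
}

class SketchManager {
    public endTime: number | undefined = undefined;
    // Stand-in for a logger: collects the messages of caught errors.
    public readonly errors: string[] = [];
    private readonly service: TrainingService | undefined;

    constructor(service?: TrainingService) {
        this.service = service;
    }

    // Assumption: accessing an unset service throws this error.
    private requireService(): TrainingService {
        if (this.service === undefined) {
            throw new Error("TrainingService is not assigned");
        }
        return this.service;
    }

    public async stopBottomHalf(): Promise<void> {
        try {
            const jobs = await this.requireService().listTrialJobs();
            // Cancel one job at a time, as the original comment advises.
            for (const job of jobs) {
                if (job.status === "RUNNING" || job.status === "WAITING") {
                    try {
                        await this.requireService().cancelTrialJob(job.id);
                    } catch {
                        // A failed cancellation is ignored so the loop continues.
                    }
                }
            }
            await this.requireService().cleanUp();
        } catch (err) {
            // Record the failure instead of letting it escape the method.
            this.errors.push(err instanceof Error ? err.message : String(err));
        }
        // Reached on both the success and the failure path.
        if (this.endTime === undefined) {
            this.endTime = Date.now();
        }
    }
}
```

With this structure, calling `stopBottomHalf()` on a `SketchManager` constructed without a service resolves normally, records one error message, and still sets `endTime`. Without the outer try/catch, the same call would reject before the end time was written.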