ModelZoo / ResNet50_tensorflow / Commits

Commit 356c98bd, authored Aug 07, 2020 by Kaushik Shivakumar

Merge remote-tracking branch 'upstream/master' into detr-push-3

Parents: d31aba8a, b9785623

Changes: 360
Showing 20 changed files with 0 additions and 2996 deletions:

research/brain_coder/single_task/pg_train.py (+0, -782)
research/brain_coder/single_task/pg_train_test.py (+0, -87)
research/brain_coder/single_task/results_lib.py (+0, -155)
research/brain_coder/single_task/results_lib_test.py (+0, -84)
research/brain_coder/single_task/run.py (+0, -142)
research/brain_coder/single_task/run_eval_tasks.py (+0, -296)
research/brain_coder/single_task/test_tasks.py (+0, -127)
research/brain_coder/single_task/test_tasks_test.py (+0, -63)
research/brain_coder/single_task/tune.py (+0, -262)
research/cognitive_mapping_and_planning/.gitignore (+0, -4)
research/cognitive_mapping_and_planning/README.md (+0, -127)
research/cognitive_mapping_and_planning/__init__.py (+0, -0)
research/cognitive_mapping_and_planning/cfgs/__init__.py (+0, -0)
research/cognitive_mapping_and_planning/cfgs/config_cmp.py (+0, -283)
research/cognitive_mapping_and_planning/cfgs/config_common.py (+0, -261)
research/cognitive_mapping_and_planning/cfgs/config_distill.py (+0, -114)
research/cognitive_mapping_and_planning/cfgs/config_vision_baseline.py (+0, -173)
research/cognitive_mapping_and_planning/data/.gitignore (+0, -3)
research/cognitive_mapping_and_planning/data/README.md (+0, -33)
research/cognitive_mapping_and_planning/datasets/__init__.py (+0, -0)
Too many changes to show. To preserve performance, only 360 of 360+ files are displayed.
research/brain_coder/single_task/pg_train.py (deleted, 100644 → 0)

```python
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

r"""Train RL agent on coding tasks."""

import contextlib
import cPickle
import cProfile
import marshal
import os
import time

from absl import flags
from absl import logging
import tensorflow as tf

# internal session lib import

from single_task import data  # brain coder
from single_task import defaults  # brain coder
from single_task import pg_agent as agent_lib  # brain coder
from single_task import results_lib  # brain coder

FLAGS = flags.FLAGS
flags.DEFINE_string('master', '', 'URL of the TensorFlow master to use.')
flags.DEFINE_integer(
    'ps_tasks', 0,
    'Number of parameter server tasks. Only set to 0 for '
    'single worker training.')
flags.DEFINE_integer('summary_interval', 10, 'How often to write summaries.')
flags.DEFINE_integer(
    'summary_tasks', 16,
    'If greater than 0 only tasks 0 through summary_tasks - 1 '
    'will write summaries. If 0, all tasks will write '
    'summaries.')
flags.DEFINE_bool(
    'stop_on_success', True,
    'If True, training will stop as soon as a solution is found. '
    'If False, training will continue indefinitely until another '
    'stopping condition is reached.')
flags.DEFINE_bool(
    'do_profiling', False,
    'If True, cProfile profiler will run and results will be '
    'written to logdir. WARNING: Results will not be written if '
    'the code crashes. Make sure it exits successfully.')
flags.DEFINE_integer('model_v', 0, 'Model verbosity level.')
flags.DEFINE_bool(
    'delayed_graph_cleanup', True,
    'If true, container for n-th run will not be reset until the (n+1)-th run '
    'is complete. This greatly reduces the chance that a worker is still '
    'using the n-th container when it is cleared.')


def define_tuner_hparam_space(hparam_space_type):
  """Define tunable hparams for grid search."""
  if hparam_space_type not in ('pg', 'pg-topk', 'topk', 'is'):
    raise ValueError('Hparam space is not valid: "%s"' % hparam_space_type)

  # Discrete hparam space is stored as a dict from hparam name to discrete
  # values.
  hparam_space = {}

  if hparam_space_type in ('pg', 'pg-topk', 'is'):
    # Add a floating point parameter named learning rate.
    hparam_space['lr'] = [1e-5, 1e-4, 1e-3]
    hparam_space['entropy_beta'] = [0.005, 0.01, 0.05, 0.10]
  else:  # 'topk'
    # Add a floating point parameter named learning rate.
    hparam_space['lr'] = [1e-5, 1e-4, 1e-3]
    hparam_space['entropy_beta'] = [0.0, 0.005, 0.01, 0.05, 0.10]

  if hparam_space_type in ('topk', 'pg-topk'):
    # topk tuning will be enabled.
    hparam_space['topk'] = [10]
    hparam_space['topk_loss_hparam'] = [1.0, 10.0, 50.0, 200.0]
  elif hparam_space_type == 'is':
    # importance sampling tuning will be enabled.
    hparam_space['replay_temperature'] = [0.25, 0.5, 1.0, 2.0]
    hparam_space['alpha'] = [0.5, 0.75, 63/64.]

  return hparam_space


def write_hparams_to_config(config, hparams, hparam_space_type):
  """Write hparams given by the tuner into the Config object."""
  if hparam_space_type not in ('pg', 'pg-topk', 'topk', 'is'):
    raise ValueError('Hparam space is not valid: "%s"' % hparam_space_type)

  config.agent.lr = hparams.lr
  config.agent.entropy_beta = hparams.entropy_beta

  if hparam_space_type in ('topk', 'pg-topk'):
    # topk tuning will be enabled.
    config.agent.topk = hparams.topk
    config.agent.topk_loss_hparam = hparams.topk_loss_hparam
  elif hparam_space_type == 'is':
    # importance sampling tuning will be enabled.
    config.agent.replay_temperature = hparams.replay_temperature
    config.agent.alpha = hparams.alpha


def make_initialized_variable(value, name, shape=None, dtype=tf.float32):
  """Create a tf.Variable with a constant initializer.

  Args:
    value: Constant value to initialize the variable with. This is the value
        that the variable starts with.
    name: Name of the variable in the TF graph.
    shape: Shape of the variable. If None, variable will be a scalar.
    dtype: Data type of the variable. Should be a TF dtype. Defaults to
        tf.float32.

  Returns:
    tf.Variable instance.
  """
  if shape is None:
    shape = []
  return tf.get_variable(
      name=name, shape=shape, initializer=tf.constant_initializer(value),
      dtype=dtype, trainable=False)


class AsyncTrainer(object):
  """Manages graph creation and training.

  This async trainer creates a global model on the parameter server, and a
  local model (for this worker). Gradient updates are sent to the global
  model, and the updated weights are synced to the local copy.
  """

  def __init__(self, config, task_id, ps_tasks, num_workers, is_chief=True,
               summary_writer=None,
               dtype=tf.float32,
               summary_interval=1,
               run_number=0,
               logging_dir='/tmp', model_v=0):
    self.config = config
    self.data_manager = data.DataManager(
        config, run_number=run_number,
        do_code_simplification=not FLAGS.stop_on_success)
    self.task_id = task_id
    self.ps_tasks = ps_tasks
    self.is_chief = is_chief
    if ps_tasks == 0:
      assert task_id == 0, 'No parameter servers specified. Expecting 1 task.'
      assert num_workers == 1, (
          'No parameter servers specified. Expecting 1 task.')
      worker_device = '/job:localhost/replica:%d/task:0/cpu:0' % task_id
      # worker_device = '/cpu:0'
      # ps_device = '/cpu:0'
    else:
      assert num_workers > 0, 'There must be at least 1 training worker.'
      worker_device = '/job:worker/replica:%d/task:0/cpu:0' % task_id
      # ps_device = '/job:ps/replica:0/task:0/cpu:0'
    logging.info('worker_device: %s', worker_device)

    logging_file = os.path.join(
        logging_dir, 'solutions_%d.txt' % task_id)
    experience_replay_file = os.path.join(
        logging_dir, 'replay_buffer_%d.pickle' % task_id)
    self.topk_file = os.path.join(
        logging_dir, 'topk_buffer_%d.pickle' % task_id)

    tf.get_variable_scope().set_use_resource(True)

    # global model
    with tf.device(tf.train.replica_device_setter(
        ps_tasks,
        ps_device='/job:ps/replica:0',
        worker_device=worker_device)):
      with tf.variable_scope('global'):
        global_model = agent_lib.LMAgent(config, dtype=dtype, is_local=False)
        global_params_dict = {p.name: p
                              for p in global_model.sync_variables}
        self.global_model = global_model
        self.global_step = make_initialized_variable(
            0, 'global_step', dtype=tf.int64)

        self.global_best_reward = make_initialized_variable(
            -10.0, 'global_best_reward', dtype=tf.float64)
        self.is_best_model = make_initialized_variable(
            False, 'is_best_model', dtype=tf.bool)
        self.reset_is_best_model = self.is_best_model.assign(False)
        self.global_best_reward_placeholder = tf.placeholder(
            tf.float64, [], name='global_best_reward_placeholder')
        self.assign_global_best_reward_op = tf.group(
            self.global_best_reward.assign(
                self.global_best_reward_placeholder),
            self.is_best_model.assign(True))

        def assign_global_best_reward_fn(session, reward):
          reward = round(reward, 10)
          best_reward = round(session.run(self.global_best_reward), 10)
          is_best = reward > best_reward
          if is_best:
            session.run(self.assign_global_best_reward_op,
                        {self.global_best_reward_placeholder: reward})
          return is_best
        self.assign_global_best_reward_fn = assign_global_best_reward_fn

        # Any worker will set to true when it finds a solution.
        self.found_solution_flag = make_initialized_variable(
            False, 'found_solution_flag', dtype=tf.bool)
        self.found_solution_op = self.found_solution_flag.assign(True)

        self.run_number = make_initialized_variable(
            run_number, 'run_number', dtype=tf.int32)

        # Store a solution when found.
        self.code_solution_variable = tf.get_variable(
            'code_solution', [], tf.string,
            initializer=tf.constant_initializer(''))
        self.code_solution_ph = tf.placeholder(
            tf.string, [], name='code_solution_ph')
        self.code_solution_assign_op = self.code_solution_variable.assign(
            self.code_solution_ph)

        def assign_code_solution_fn(session, code_solution_string):
          session.run(self.code_solution_assign_op,
                      {self.code_solution_ph: code_solution_string})
        self.assign_code_solution_fn = assign_code_solution_fn

        # Count all programs sampled from policy. This does not include
        # programs sampled from replay buffer.
        # This equals NPE (number of programs executed). Only programs sampled
        # from the policy need to be executed.
        self.program_count = make_initialized_variable(
            0, 'program_count', dtype=tf.int64)

    # local model
    with tf.device(worker_device):
      with tf.variable_scope('local'):
        self.model = model = agent_lib.LMAgent(
            config,
            task_id=task_id,
            logging_file=logging_file,
            experience_replay_file=experience_replay_file,
            dtype=dtype,
            global_best_reward_fn=self.assign_global_best_reward_fn,
            found_solution_op=self.found_solution_op,
            assign_code_solution_fn=self.assign_code_solution_fn,
            program_count=self.program_count,
            stop_on_success=FLAGS.stop_on_success,
            verbose_level=model_v)
        local_params = model.trainable_variables
        local_params_dict = {p.name: p for p in local_params}

      # Pull global params to local model.
      def _global_to_local_scope(name):
        assert name.startswith('global/')
        return 'local' + name[6:]
      sync_dict = {
          local_params_dict[_global_to_local_scope(p_name)]: p
          for p_name, p in global_params_dict.items()}
      self.sync_op = tf.group(*[v_local.assign(v_global)
                                for v_local, v_global
                                in sync_dict.items()])

      # Pair local gradients with global params.
      grad_var_dict = {
          gradient: sync_dict[local_var]
          for local_var, gradient in model.gradients_dict.items()}

      # local model
      model.make_summary_ops()  # Don't put summaries under 'local' scope.
      with tf.variable_scope('local'):
        self.train_op = model.optimizer.apply_gradients(
            grad_var_dict.items(), global_step=self.global_step)
        self.local_init_op = tf.variables_initializer(
            tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES,
                              tf.get_variable_scope().name))

    self.local_step = 0
    self.last_summary_time = time.time()
    self.summary_interval = summary_interval
    self.summary_writer = summary_writer
    self.cached_global_step = -1
    self.cached_global_npe = -1

    logging.info('summary_interval: %d', self.summary_interval)

    # Load top-k buffer.
    if self.model.top_episodes is not None and tf.gfile.Exists(
        self.topk_file):
      try:
        with tf.gfile.FastGFile(self.topk_file, 'r') as f:
          self.model.top_episodes = cPickle.loads(f.read())
        logging.info(
            'Loaded top-k buffer from disk with %d items. Location: "%s"',
            len(self.model.top_episodes), self.topk_file)
      except (cPickle.UnpicklingError, EOFError) as e:
        logging.warn(
            'Failed to load existing top-k buffer from disk. Removing bad '
            'file.\nLocation: "%s"\nException: %s', self.topk_file, str(e))
        tf.gfile.Remove(self.topk_file)

  def initialize(self, session):
    """Run initialization ops."""
    session.run(self.local_init_op)
    session.run(self.sync_op)
    self.cached_global_step, self.cached_global_npe = session.run(
        [self.global_step, self.program_count])

  def update_global_model(self, session):
    """Run an update step.

    1) Asynchronously copy global weights to local model.
    2) Call into local model's update_step method, which does the following:
        a) Sample batch of programs from policy.
        b) Compute rewards.
        c) Compute gradients and update the global model asynchronously.
    3) Write tensorboard summaries to disk.

    Args:
      session: tf.Session instance.
    """
    session.run(self.sync_op)  # Copy weights from global to local.

    with session.as_default():
      result = self.model.update_step(
          session, self.data_manager.sample_rl_batch(), self.train_op,
          self.global_step)
      global_step = result.global_step
      global_npe = result.global_npe
      summaries = result.summaries_list
    self.cached_global_step = global_step
    self.cached_global_npe = global_npe
    self.local_step += 1

    if self.summary_writer and self.local_step % self.summary_interval == 0:
      if not isinstance(summaries, (tuple, list)):
        summaries = [summaries]
      summaries.append(self._local_step_summary())
      if self.is_chief:
        (global_best_reward,
         found_solution_flag,
         program_count) = session.run(
             [self.global_best_reward,
              self.found_solution_flag,
              self.program_count])
        summaries.append(
            tf.Summary(
                value=[tf.Summary.Value(
                    tag='model/best_reward',
                    simple_value=global_best_reward)]))
        summaries.append(
            tf.Summary(
                value=[tf.Summary.Value(
                    tag='model/solution_found',
                    simple_value=int(found_solution_flag))]))
        summaries.append(
            tf.Summary(
                value=[tf.Summary.Value(
                    tag='model/program_count',
                    simple_value=program_count)]))
      for s in summaries:
        self.summary_writer.add_summary(s, global_step)
      self.last_summary_time = time.time()

  def _local_step_summary(self):
    """Compute number of local steps per time increment."""
    dt = time.time() - self.last_summary_time
    steps_per_time = self.summary_interval / float(dt)
    return tf.Summary(value=[
        tf.Summary.Value(
            tag='local_step/per_sec',
            simple_value=steps_per_time),
        tf.Summary.Value(
            tag='local_step/step',
            simple_value=self.local_step)])

  def maybe_save_best_model(self, session, saver, checkpoint_file):
    """Check if this model got the highest reward and save to disk if so."""
    if self.is_chief and session.run(self.is_best_model):
      logging.info('Saving best model to "%s"', checkpoint_file)
      saver.save(session, checkpoint_file)
      session.run(self.reset_is_best_model)

  def save_replay_buffer(self):
    """Save replay buffer to disk.

    Call this periodically so that training can recover if jobs go down.
    """
    if self.model.experience_replay is not None:
      logging.info('Saving experience replay buffer to "%s".',
                   self.model.experience_replay.save_file)
      self.model.experience_replay.incremental_save(True)

  def delete_replay_buffer(self):
    """Delete replay buffer from disk.

    Call this at the end of training to clean up. Replay buffer can get very
    large.
    """
    if self.model.experience_replay is not None:
      logging.info('Deleting experience replay buffer at "%s".',
                   self.model.experience_replay.save_file)
      tf.gfile.Remove(self.model.experience_replay.save_file)

  def save_topk_buffer(self):
    """Save top-k buffer to disk.

    Call this periodically so that training can recover if jobs go down.
    """
    if self.model.top_episodes is not None:
      logging.info('Saving top-k buffer to "%s".', self.topk_file)
      # Overwrite previous data each time.
      with tf.gfile.FastGFile(self.topk_file, 'w') as f:
        f.write(cPickle.dumps(self.model.top_episodes))


@contextlib.contextmanager
def managed_session(sv, master='', config=None,
                    start_standard_services=True,
                    close_summary_writer=True,
                    max_wait_secs=7200):
  # Same as Supervisor.managed_session, but with configurable timeout.
  try:
    sess = sv.prepare_or_wait_for_session(
        master=master, config=config,
        start_standard_services=start_standard_services,
        max_wait_secs=max_wait_secs)
    yield sess
  except tf.errors.DeadlineExceededError:
    raise
  except Exception as e:  # pylint: disable=broad-except
    sv.request_stop(e)
  finally:
    try:
      # Request all the threads to stop and wait for them to do so. Any
      # exception raised by the threads is raised again from stop().
      # Passing stop_grace_period_secs is for blocked enqueue/dequeue
      # threads which are not checking for `should_stop()`. They
      # will be stopped when we close the session further down.
      sv.stop(close_summary_writer=close_summary_writer)
    finally:
      # Close the session to finish up all pending calls. We do not care
      # about exceptions raised when closing. This takes care of
      # blocked enqueue/dequeue calls.
      try:
        sess.close()
      except Exception:  # pylint: disable=broad-except
        # Silently ignore exceptions raised by close().
        pass


def train(config, is_chief, tuner=None, run_dir=None, run_number=0,
          results_writer=None):
  """Run training loop.

  Args:
    config: config_lib.Config instance containing global config (agent and
        env).
    is_chief: True if this worker is chief. Chief worker manages writing some
        data to disk and initialization of the global model.
    tuner: A tuner instance. If not tuning, leave as None.
    run_dir: Directory where all data for this run will be written. If None,
        run_dir = FLAGS.logdir. Set this argument when doing multiple runs.
    run_number: Which run this is.
    results_writer: Manages writing training results to disk. Results are a
        dict of metric names and values.

  Returns:
    The trainer object used to run training updates.
  """
  logging.info('Will run asynchronous training.')

  if run_dir is None:
    run_dir = FLAGS.logdir
  train_dir = os.path.join(run_dir, 'train')
  best_model_checkpoint = os.path.join(train_dir, 'best.ckpt')
  events_dir = '%s/events_%d' % (run_dir, FLAGS.task_id)
  logging.info('Events directory: %s', events_dir)

  logging_dir = os.path.join(run_dir, 'logs')
  if not tf.gfile.Exists(logging_dir):
    tf.gfile.MakeDirs(logging_dir)
  status_file = os.path.join(logging_dir, 'status.txt')

  if FLAGS.summary_tasks and FLAGS.task_id < FLAGS.summary_tasks:
    summary_writer = tf.summary.FileWriter(events_dir)
  else:
    summary_writer = None

  # Only profile task 0.
  if FLAGS.do_profiling:
    logging.info('Profiling enabled')
    profiler = cProfile.Profile()
    profiler.enable()
  else:
    profiler = None

  trainer = AsyncTrainer(
      config, FLAGS.task_id, FLAGS.ps_tasks, FLAGS.num_workers,
      is_chief=is_chief,
      summary_interval=FLAGS.summary_interval,
      summary_writer=summary_writer,
      logging_dir=logging_dir,
      run_number=run_number,
      model_v=FLAGS.model_v)

  variables_to_save = [v for v in tf.global_variables()
                       if v.name.startswith('global')]
  global_init_op = tf.variables_initializer(variables_to_save)
  saver = tf.train.Saver(variables_to_save)

  var_list = tf.get_collection(
      tf.GraphKeys.TRAINABLE_VARIABLES,
      tf.get_variable_scope().name)
  logging.info('Trainable vars:')
  for v in var_list:
    logging.info('  %s, %s, %s', v.name, v.device, v.get_shape())

  logging.info('All vars:')
  for v in tf.global_variables():
    logging.info('  %s, %s, %s', v.name, v.device, v.get_shape())

  def init_fn(unused_sess):
    logging.info('No checkpoint found. Initialized global params.')

  sv = tf.train.Supervisor(
      is_chief=is_chief,
      logdir=train_dir,
      saver=saver,
      summary_op=None,
      init_op=global_init_op,
      init_fn=init_fn,
      summary_writer=summary_writer,
      ready_op=tf.report_uninitialized_variables(variables_to_save),
      ready_for_local_init_op=None,
      global_step=trainer.global_step,
      save_model_secs=30,
      save_summaries_secs=30)

  # Add a thread that periodically checks if this Trial should stop
  # based on an early stopping policy.
  if tuner:
    sv.Loop(60, tuner.check_for_stop, (sv.coord,))

  last_replay_save_time = time.time()

  global_step = -1
  logging.info(
      'Starting session. '
      'If this hangs, we\'re most likely waiting to connect '
      'to the parameter server. One common cause is that the parameter '
      'server DNS name isn\'t resolving yet, or is misspecified.')
  should_retry = True
  supervisor_deadline_exceeded = False
  while should_retry:
    try:
      with managed_session(
          sv, FLAGS.master,
          max_wait_secs=60) as session, session.as_default():
        should_retry = False
        do_training = True

        try:
          trainer.initialize(session)
          if session.run(trainer.run_number) != run_number:
            # If we loaded existing model from disk, and the saved run number
            # is different, throw an exception.
            raise RuntimeError(
                'Expecting to be on run %d, but is actually on run %d. '
                'run_dir: "%s"'
                % (run_number, session.run(trainer.run_number), run_dir))
          global_step = trainer.cached_global_step
          logging.info('Starting training at step=%d', global_step)
          while do_training:
            trainer.update_global_model(session)

            if is_chief:
              trainer.maybe_save_best_model(
                  session, saver, best_model_checkpoint)
            global_step = trainer.cached_global_step
            global_npe = trainer.cached_global_npe

            if time.time() - last_replay_save_time >= 30:
              trainer.save_replay_buffer()
              trainer.save_topk_buffer()
              last_replay_save_time = time.time()

            # Stopping conditions.
            if tuner and tuner.should_trial_stop():
              logging.info('Tuner requested early stopping. Finishing.')
              do_training = False
            if is_chief and FLAGS.stop_on_success:
              found_solution = session.run(trainer.found_solution_flag)
              if found_solution:
                do_training = False
                logging.info('Solution found. Finishing.')
            if FLAGS.max_npe and global_npe >= FLAGS.max_npe:
              # Max NPE (number of programs executed) reached.
              logging.info('Max NPE reached. Finishing.')
              do_training = False
            if sv.should_stop():
              logging.info('Supervisor issued stop. Finishing.')
              do_training = False

        except tf.errors.NotFoundError:
          # Catch "Error while reading resource variable".
          # The chief worker likely destroyed the container, so do not retry.
          logging.info('Caught NotFoundError. Quitting.')
          do_training = False
          should_retry = False
          break
        except tf.errors.InternalError as e:
          # Catch "Invalid variable reference."
          if str(e).startswith('Invalid variable reference.'):
            # The chief worker likely destroyed the container, so do not
            # retry.
            logging.info(
                'Caught "InternalError: Invalid variable reference.". '
                'Quitting.')
            do_training = False
            should_retry = False
            break
          else:
            # Pass exception through.
            raise

        # Exited training loop. Write results to disk.
        if is_chief and results_writer:
          assert not should_retry
          with tf.gfile.FastGFile(status_file, 'w') as f:
            f.write('done')
          (program_count,
           found_solution,
           code_solution,
           best_reward,
           global_step) = session.run(
               [trainer.program_count,
                trainer.found_solution_flag,
                trainer.code_solution_variable,
                trainer.global_best_reward,
                trainer.global_step])
          results_dict = {
              'max_npe': FLAGS.max_npe,
              'batch_size': config.batch_size,
              'max_batches': FLAGS.max_npe // config.batch_size,
              'npe': program_count,
              'max_global_repetitions': FLAGS.num_repetitions,
              'max_local_repetitions': FLAGS.num_repetitions,
              'code_solution': code_solution,
              'best_reward': best_reward,
              'num_batches': global_step,
              'found_solution': found_solution,
              'task': trainer.data_manager.task_name,
              'global_rep': run_number}
          logging.info('results_dict: %s', results_dict)
          results_writer.append(results_dict)

    except tf.errors.AbortedError:
      # Catch "Graph handle is not found" error due to preempted jobs.
      logging.info('Caught AbortedError. Retrying.')
      should_retry = True
    except tf.errors.DeadlineExceededError:
      supervisor_deadline_exceeded = True
      should_retry = False

  if is_chief:
    logging.info('This is chief worker. Stopping all workers.')
    sv.stop()

  if supervisor_deadline_exceeded:
    logging.info('Supervisor timed out. Quitting.')
  else:
    logging.info('Reached %s steps. Worker stopped.', global_step)

  # Dump profiling.
  """
  How to use profiling data.

  Download the profiler dump to your local machine, say to PROF_FILE_PATH.
  In a separate script, run something like the following:

  import pstats
  p = pstats.Stats(PROF_FILE_PATH)
  p.strip_dirs().sort_stats('cumtime').print_stats()

  This will sort by 'cumtime', which "is the cumulative time spent in this and
  all subfunctions (from invocation till exit)."
  https://docs.python.org/2/library/profile.html#instant-user-s-manual
  """  # pylint: disable=pointless-string-statement
  if profiler:
    prof_file = os.path.join(run_dir, 'task_%d.prof' % FLAGS.task_id)
    logging.info('Done profiling.\nDumping to "%s".', prof_file)
    profiler.create_stats()
    with tf.gfile.Open(prof_file, 'w') as f:
      f.write(marshal.dumps(profiler.stats))

  return trainer


def run_training(config=None, tuner=None, logdir=None, trial_name=None,
                 is_chief=True):
  """Do all training runs.

  This is the top level training function for policy gradient based models.
  Run this from the main function.

  Args:
    config: config_lib.Config instance containing global config (agent and
        environment hparams). If None, config will be parsed from FLAGS.config.
    tuner: A tuner instance. Leave as None if not tuning.
    logdir: Parent directory where all data from all runs will be written. If
        None, FLAGS.logdir will be used.
    trial_name: If tuning, set this to a unique string that identifies this
        trial. If `tuner` is not None, this also must be set.
    is_chief: True if this worker is the chief.

  Returns:
    List of results dicts which were written to disk. Each training run gets a
    results dict. Results dict contains metrics, i.e. (name, value) pairs
    which give information about the training run.

  Raises:
    ValueError: If results dicts read from disk contain invalid data.
  """
  if not config:
    # If custom config is not given, get it from flags.
    config = defaults.default_config_with_updates(FLAGS.config)
  if not logdir:
    logdir = FLAGS.logdir
  if not tf.gfile.Exists(logdir):
    tf.gfile.MakeDirs(logdir)
  assert FLAGS.num_repetitions > 0
  results = results_lib.Results(logdir)
  results_list, _ = results.read_all()

  logging.info('Starting experiment. Directory: "%s"', logdir)

  if results_list:
    if results_list[0]['max_npe'] != FLAGS.max_npe:
      raise ValueError(
          'Cannot resume training. Max-NPE changed. Was %s, now %s',
          results_list[0]['max_npe'], FLAGS.max_npe)
    if results_list[0]['max_global_repetitions'] != FLAGS.num_repetitions:
      raise ValueError(
          'Cannot resume training. Number of repetitions changed. Was %s, '
          'now %s',
          results_list[0]['max_global_repetitions'],
          FLAGS.num_repetitions)

  while len(results_list) < FLAGS.num_repetitions:
    run_number = len(results_list)
    rep_container_name = trial_name if trial_name else 'container'
    if FLAGS.num_repetitions > 1:
      rep_dir = os.path.join(logdir, 'run_%d' % run_number)
      rep_container_name = rep_container_name + '_run_' + str(run_number)
    else:
      rep_dir = logdir

    logging.info(
        'Starting repetition %d (%d out of %d)', run_number, run_number + 1,
        FLAGS.num_repetitions)

    # Train will write result to disk.
    with tf.container(rep_container_name):
      trainer = train(config, is_chief, tuner, rep_dir, run_number, results)
    logging.info('Done training.')

    if is_chief:
      # Destroy current container immediately (clears current graph).
      logging.info('Clearing shared variables.')
      tf.Session.reset(FLAGS.master, containers=[rep_container_name])
      logging.info('Shared variables cleared.')

      # Delete replay buffer on disk.
      assert trainer
      trainer.delete_replay_buffer()
    else:
      # Give chief worker time to clean up.
      sleep_sec = 30.0
      logging.info('Sleeping for %s sec.', sleep_sec)
      time.sleep(sleep_sec)
    tf.reset_default_graph()
    logging.info('Default graph reset.')

    # Expecting that train wrote new result to disk before returning.
    results_list, _ = results.read_all()

  return results_list
```
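The hparam spaces above are plain dicts of candidate values, meant to be expanded into a grid by whatever tuner drives `define_tuner_hparam_space` and `write_hparams_to_config`. The tuner itself is not part of this diff, so the following is only a standalone sketch of how such a space expands into trials via a Cartesian product:

```python
import itertools

# Same shape as the 'pg-topk' space built by define_tuner_hparam_space above.
hparam_space = {
    'lr': [1e-5, 1e-4, 1e-3],
    'entropy_beta': [0.005, 0.01, 0.05, 0.10],
    'topk': [10],
    'topk_loss_hparam': [1.0, 10.0, 50.0, 200.0],
}

def grid_trials(space):
    """Yield one {name: value} dict per point of the grid."""
    names = sorted(space)
    for values in itertools.product(*(space[n] for n in names)):
        yield dict(zip(names, values))

trials = list(grid_trials(hparam_space))
print(len(trials))  # 3 * 4 * 1 * 4 = 48 candidate hparam settings
```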
research/brain_coder/single_task/pg_train_test.py (deleted, 100644 → 0)

```python
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

"""Tests for pg_train.

These tests exercise code paths available through configuration options.
Training will be run for just a few steps with the goal being to check that
nothing crashes.
"""

from absl import flags
import tensorflow as tf

from single_task import defaults  # brain coder
from single_task import run  # brain coder

FLAGS = flags.FLAGS


class TrainTest(tf.test.TestCase):

  def RunTrainingSteps(self, config_string, num_steps=10):
    """Run a few training steps with the given config.

    Just check that nothing crashes.

    Args:
      config_string: Config encoded in a string. See
          $REPO_PATH/common/config_lib.py
      num_steps: Number of training steps to run. Defaults to 10.
    """
    config = defaults.default_config_with_updates(config_string)
    FLAGS.master = ''
    FLAGS.max_npe = num_steps * config.batch_size
    FLAGS.summary_interval = 1
    FLAGS.logdir = tf.test.get_temp_dir()
    FLAGS.config = config_string
    tf.reset_default_graph()
    run.main(None)

  def testVanillaPolicyGradient(self):
    self.RunTrainingSteps(
        'env=c(task="reverse"),'
        'agent=c(algorithm="pg"),'
        'timestep_limit=90,batch_size=64')

  def testVanillaPolicyGradient_VariableLengthSequences(self):
    self.RunTrainingSteps(
        'env=c(task="reverse"),'
        'agent=c(algorithm="pg",eos_token=False),'
        'timestep_limit=90,batch_size=64')

  def testVanillaActorCritic(self):
    self.RunTrainingSteps(
        'env=c(task="reverse"),'
        'agent=c(algorithm="pg",ema_baseline_decay=0.0),'
        'timestep_limit=90,batch_size=64')

  def testPolicyGradientWithTopK(self):
    self.RunTrainingSteps(
        'env=c(task="reverse"),'
        'agent=c(algorithm="pg",topk_loss_hparam=1.0,topk=10),'
        'timestep_limit=90,batch_size=64')

  def testVanillaActorCriticWithTopK(self):
    self.RunTrainingSteps(
        'env=c(task="reverse"),'
        'agent=c(algorithm="pg",ema_baseline_decay=0.0,topk_loss_hparam=1.0,'
        'topk=10),'
        'timestep_limit=90,batch_size=64')

  def testPolicyGradientWithTopK_VariableLengthSequences(self):
    self.RunTrainingSteps(
        'env=c(task="reverse"),'
        'agent=c(algorithm="pg",topk_loss_hparam=1.0,topk=10,eos_token=False),'
        'timestep_limit=90,batch_size=64')

  def testPolicyGradientWithImportanceSampling(self):
    self.RunTrainingSteps(
        'env=c(task="reverse"),'
        'agent=c(algorithm="pg",alpha=0.5),'
        'timestep_limit=90,batch_size=64')


if __name__ == '__main__':
  tf.test.main()
```
research/brain_coder/single_task/results_lib.py (deleted, 100644 → 0)

```python
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

"""Results object manages distributed reading and writing of results to disk."""

import ast
from collections import namedtuple
import os
import re
from six.moves import xrange
import tensorflow as tf


ShardStats = namedtuple(
    'ShardStats',
    ['num_local_reps_completed', 'max_local_reps', 'finished'])


def ge_non_zero(a, b):
  return a >= b and b > 0


def get_shard_id(file_name):
  assert file_name[-4:].lower() == '.txt'
  return int(file_name[file_name.rfind('_') + 1: -4])


class Results(object):
  """Manages reading and writing training results to disk asynchronously.

  Each worker writes to its own file, so that there are no race conditions
  when writing happens. However any worker may read any file, as is the case
  for `read_all`. Writes are expected to be atomic so that workers will never
  read incomplete data, and this is likely to be the case on Unix systems.
  Reading out of date data is fine, as workers calling `read_all` will wait
  until data from every worker has been written before proceeding.
  """
  file_template = 'experiment_results_{0}.txt'
  search_regex = r'^experiment_results_([0-9])+\.txt$'

  def __init__(self, log_dir, shard_id=0):
    """Construct `Results` instance.

    Args:
      log_dir: Where to write results files.
      shard_id: Unique id for this file (i.e. shard). Each worker that will
          be writing results should use a different shard id. If there are
          N shards, each shard should be numbered 0 through N-1.
    """
    # Use different files for workers so that they can write to disk async.
    assert 0 <= shard_id
    self.file_name = self.file_template.format(shard_id)
    self.log_dir = log_dir
    self.results_file = os.path.join(self.log_dir, self.file_name)

  def append(self, metrics):
    """Append results to results list on disk."""
    with tf.gfile.FastGFile(self.results_file, 'a') as writer:
      writer.write(str(metrics) + '\n')

  def read_this_shard(self):
    """Read only from this shard."""
    return self._read_shard(self.results_file)

  def _read_shard(self, results_file):
    """Read only from the given shard file."""
    try:
      with tf.gfile.FastGFile(results_file, 'r') as reader:
        results = [ast.literal_eval(entry) for entry in reader]
    except tf.errors.NotFoundError:
      # No results written to disk yet. Return empty list.
      return []
    return results

  def _get_max_local_reps(self, shard_results):
    """Get maximum number of repetitions the given shard needs to complete.

    Worker working on each shard needs to complete a certain number of runs
    before it finishes. This method will return that number so that we can
    determine which shards are still not done.

    We assume that workers are including a 'max_local_repetitions' value in
    their results, which should be the total number of repetitions it needs
    to run.

    Args:
      shard_results: Dict mapping metric names to values. This should be read
          from a shard on disk.

    Returns:
      Maximum number of repetitions the given shard needs to complete.
    """
    mlrs = [r['max_local_repetitions'] for r in shard_results]
    if not mlrs:
      return 0
    for n in mlrs[1:]:
      assert n == mlrs[0], 'Some reps have different max rep.'
    return mlrs[0]

  def read_all(self, num_shards=None):
    """Read results across all shards, i.e. get global results list.

    Args:
      num_shards: (optional) specifies total number of shards. If the caller
          wants information about which shards are incomplete, provide this
          argument (so that shards which have yet to be created are still
          counted as incomplete shards). Otherwise, no information about
          incomplete shards will be returned.

    Returns:
      aggregate: Global list of results (across all shards).
      shard_stats: List of ShardStats instances, one for each shard. Or None
          if `num_shards` is None.
    """
    try:
      all_children = tf.gfile.ListDirectory(self.log_dir)
    except tf.errors.NotFoundError:
      if num_shards is None:
        return [], None
      return [], [[] for _ in xrange(num_shards)]
    shard_ids = {
        get_shard_id(fname): fname
        for fname in all_children if re.search(self.search_regex, fname)}

    if num_shards is None:
      aggregate = []
      shard_stats = None
      for results_file in shard_ids.values():
        aggregate.extend(self._read_shard(
            os.path.join(self.log_dir, results_file)))
    else:
      results_per_shard = [None] * num_shards
      for shard_id in xrange(num_shards):
        if shard_id in shard_ids:
          results_file = shard_ids[shard_id]
          results_per_shard[shard_id] = self._read_shard(
              os.path.join(self.log_dir, results_file))
        else:
          results_per_shard[shard_id] = []

      # Compute shard stats.
      shard_stats = []
      for shard_results in results_per_shard:
        max_local_reps = self._get_max_local_reps(shard_results)
        shard_stats.append(ShardStats(
            num_local_reps_completed=len(shard_results),
            max_local_reps=max_local_reps,
            finished=ge_non_zero(len(shard_results), max_local_reps)))

      # Compute aggregate.
      aggregate = [
          r for shard_results in results_per_shard for r in shard_results]

    return aggregate, shard_stats
```
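The shard-file format managed by `Results` is one Python-literal dict per line, written via `str()` and parsed back with `ast.literal_eval`. A minimal standalone sketch of the same round-trip, using plain `open()` in place of `tf.gfile.FastGFile` (assumed equivalent here for illustration):

```python
import ast
import os
import tempfile

log_dir = tempfile.mkdtemp()
shard_file = os.path.join(log_dir, 'experiment_results_0.txt')

# Append results the way Results.append does: str(dict) plus a newline.
with open(shard_file, 'a') as f:
    f.write(str({'best_reward': 0.5, 'found_solution': False}) + '\n')
    f.write(str({'best_reward': 1.0, 'found_solution': True}) + '\n')

# Read them back the way Results._read_shard does.
with open(shard_file, 'r') as f:
    results = [ast.literal_eval(line) for line in f]

assert results[1]['found_solution'] is True
print(results)
```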
research/brain_coder/single_task/results_lib_test.py (deleted, 100644 → 0)

```python
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

"""Tests for results_lib."""

import contextlib
import os
import shutil
import tempfile
from six.moves import xrange
import tensorflow as tf

from single_task import results_lib  # brain coder


@contextlib.contextmanager
def temporary_directory(suffix='', prefix='tmp', base_path=None):
  """A context manager to create a temporary directory and clean up on exit.

  The parameters are the same ones expected by tempfile.mkdtemp.
  The directory will be securely and atomically created.
  Everything under it will be removed when exiting the context.

  Args:
    suffix: optional suffix.
    prefix: optional prefix.
    base_path: the base path under which to create the temporary directory.

  Yields:
    The absolute path of the new temporary directory.
  """
  temp_dir_path = tempfile.mkdtemp(suffix, prefix, base_path)
  try:
    yield temp_dir_path
  finally:
    try:
      shutil.rmtree(temp_dir_path)
    except OSError as e:
      if e.message == 'Cannot call rmtree on a symbolic link':
        # Interesting synthetic exception made up by shutil.rmtree.
        # Means we received a symlink from mkdtemp.
        # Also means must clean up the symlink instead.
        os.unlink(temp_dir_path)
      else:
        raise


def freeze(dictionary):
  """Convert dict to hashable frozenset."""
  return frozenset(dictionary.iteritems())


class ResultsLibTest(tf.test.TestCase):

  def testResults(self):
    with temporary_directory() as logdir:
      results_obj = results_lib.Results(logdir)
      self.assertEqual(results_obj.read_this_shard(), [])
      results_obj.append(
          {'foo': 1.5, 'bar': 2.5, 'baz': 0})
      results_obj.append(
          {'foo': 5.5, 'bar': -1, 'baz': 2})
      self.assertEqual(
          results_obj.read_this_shard(),
          [{'foo': 1.5, 'bar': 2.5, 'baz': 0},
           {'foo': 5.5, 'bar': -1, 'baz': 2}])

  def testShardedResults(self):
    with temporary_directory() as logdir:
      n = 4  # Number of shards.
      results_objs = [
          results_lib.Results(logdir, shard_id=i) for i in xrange(n)]
      for i, robj in enumerate(results_objs):
        robj.append({'foo': i, 'bar': 1 + i * 2})
      results_list, _ = results_objs[0].read_all()

      # Check results. Order does not matter here.
      self.assertEqual(
          set(freeze(r) for r in results_list),
          set(freeze({'foo': i, 'bar': 1 + i * 2}) for i in xrange(n)))


if __name__ == '__main__':
  tf.test.main()
```
research/brain_coder/single_task/run.py (deleted, 100644 → 0)

```python
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

r"""Run training.

Choose training algorithm and task(s) and follow these examples.

Run synchronous policy gradient training locally:

CONFIG="agent=c(algorithm='pg'),env=c(task='reverse')"
OUT_DIR="/tmp/bf_pg_local"
rm -rf $OUT_DIR
bazel run -c opt single_task:run -- \
    --alsologtostderr \
    --config="$CONFIG" \
    --max_npe=0 \
    --logdir="$OUT_DIR" \
    --summary_interval=1 \
    --model_v=0
learning/brain/tensorboard/tensorboard.sh --port 12345 --logdir "$OUT_DIR"

Run genetic algorithm locally:

CONFIG="agent=c(algorithm='ga'),env=c(task='reverse')"
OUT_DIR="/tmp/bf_ga_local"
rm -rf $OUT_DIR
bazel run -c opt single_task:run -- \
    --alsologtostderr \
    --config="$CONFIG" \
    --max_npe=0 \
    --logdir="$OUT_DIR"

Run uniform random search locally:

CONFIG="agent=c(algorithm='rand'),env=c(task='reverse')"
OUT_DIR="/tmp/bf_rand_local"
rm -rf $OUT_DIR
bazel run -c opt single_task:run -- \
    --alsologtostderr \
    --config="$CONFIG" \
    --max_npe=0 \
    --logdir="$OUT_DIR"
"""

from absl import app
from absl import flags
from absl import logging

from single_task import defaults  # brain coder
from single_task import ga_train  # brain coder
from single_task import pg_train  # brain coder

FLAGS = flags.FLAGS
flags.DEFINE_string('config', '', 'Configuration.')
flags.DEFINE_string(
    'logdir', None, 'Absolute path where to write results.')
flags.DEFINE_integer('task_id', 0, 'ID for this worker.')
flags.DEFINE_integer('num_workers', 1, 'How many workers there are.')
flags.DEFINE_integer(
    'max_npe', 0,
    'NPE = number of programs executed. Maximum number of programs to '
    'execute in each run. Training will complete when this threshold is '
    'reached. Set to 0 for unlimited training.')
flags.DEFINE_integer(
    'num_repetitions', 1,
    'Number of times the same experiment will be run (globally across all '
    'workers). Each run is independent.')
flags.DEFINE_string(
    'log_level', 'INFO',
    'The threshold for what messages will be logged. One of DEBUG, INFO, '
    'WARN, ERROR, or FATAL.')


# To register an algorithm:
# 1) Add dependency in the BUILD file to this build rule.
# 2) Import the algorithm's module at the top of this file.
# 3) Add a new entry in the following dict. The key is the algorithm name
#    (used to select the algorithm in the config). The value is the module
#    defining the expected functions for training and tuning. See the
#    docstring for `get_namespace` for further details.
ALGORITHM_REGISTRATION = {
    'pg': pg_train,
    'ga': ga_train,
    'rand': ga_train,
}


def get_namespace(config_string):
  """Get namespace for the selected algorithm.

  Users who want to add additional algorithm types should modify this
  function. The algorithm's namespace should contain the following functions:
    run_training: Run the main training loop.
    define_tuner_hparam_space: Return the hparam tuning space for the algo.
    write_hparams_to_config: Helper for tuning. Write hparams chosen for
        tuning to the Config object.
  Look at pg_train.py and ga_train.py for function signatures and
  implementations.

  Args:
    config_string: String representation of a Config object. This will get
        parsed into a Config in order to determine what algorithm to use.

  Returns:
    algorithm_namespace: The module corresponding to the algorithm given in
        the config.
    config: The Config object resulting from parsing `config_string`.

  Raises:
    ValueError: If config.agent.algorithm is not one of the registered
        algorithms.
  """
  config = defaults.default_config_with_updates(config_string)
  if config.agent.algorithm not in ALGORITHM_REGISTRATION:
    raise ValueError('Unknown algorithm type "%s"' % (config.agent.algorithm,))
  else:
    return ALGORITHM_REGISTRATION[config.agent.algorithm], config


def main(argv):
  del argv  # Unused.

  logging.set_verbosity(FLAGS.log_level)

  flags.mark_flag_as_required('logdir')
  if FLAGS.num_workers <= 0:
    raise ValueError('num_workers flag must be greater than 0.')
  if FLAGS.task_id < 0:
    raise ValueError('task_id flag must be greater than or equal to 0.')
  if FLAGS.task_id >= FLAGS.num_workers:
    raise ValueError(
        'task_id flag must be strictly less than num_workers flag.')

  ns, _ = get_namespace(FLAGS.config)
  ns.run_training(is_chief=FLAGS.task_id == 0)


if __name__ == '__main__':
  app.run(main)
```
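The algorithm registry in run.py is an ordinary dict from algorithm name to module namespace, and `get_namespace` is a lookup with validation. A standalone sketch of the same dispatch pattern, with hypothetical stub namespaces standing in for the `pg_train` and `ga_train` modules:

```python
# Stub stand-ins for the pg_train / ga_train modules (illustration only).
class _StubNamespace(object):
    def __init__(self, label):
        self.label = label

    def run_training(self, is_chief=True):
        print('training with %s (is_chief=%s)' % (self.label, is_chief))

ALGORITHM_REGISTRATION = {
    'pg': _StubNamespace('policy gradient'),
    'ga': _StubNamespace('genetic algorithm'),
    'rand': _StubNamespace('random search'),  # 'rand' shares ga_train above.
}

def get_namespace(algorithm):
    # Lookup plus validation, mirroring run.get_namespace.
    if algorithm not in ALGORITHM_REGISTRATION:
        raise ValueError('Unknown algorithm type "%s"' % (algorithm,))
    return ALGORITHM_REGISTRATION[algorithm]

get_namespace('pg').run_training(is_chief=True)
```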
research/brain_coder/single_task/run_eval_tasks.py (deleted, 100755 → 0)

```python
#!/usr/bin/env python
from __future__ import print_function

r"""This script can launch any eval experiments from the paper.

This is a script. Run with python, not bazel.

Usage:
./single_task/run_eval_tasks.py \
    --exp EXP --desc DESC [--tuning_tasks] [--iclr_tasks] [--task TASK] \
    [--tasks TASK1 TASK2 ...]

where EXP is one of the keys in `experiments`,
and DESC is a string description of the set of experiments (such as "v0").

Set only one of these flags:
--tuning_tasks flag only runs tuning tasks.
--iclr_tasks flag only runs the tasks included in the paper.
--regression_tests flag runs tasks which function as regression tests.
--task flag manually selects a single task to run.
--tasks flag takes a custom list of tasks.

Other flags:
--reps N specifies N repetitions per experiment. Default is 25.
--training_replicas R specifies that R workers will be launched to train one
    task (for neural network algorithms). These workers will update a global
    model stored on a parameter server. Defaults to 1. If R > 1, a parameter
    server will also be launched.

Run everything:
exps=( pg-20M pg-topk-20M topk-20M ga-20M rand-20M )
BIN_DIR="single_task"
for exp in "${exps[@]}"
do
  ./$BIN_DIR/run_eval_tasks.py \
      --exp "$exp" --iclr_tasks
done
"""

import argparse
from collections import namedtuple
import subprocess


S = namedtuple('S', ['length'])
default_length = 100


iclr_tasks = [
    'reverse', 'remove-char', 'count-char', 'add', 'bool-logic',
    'print-hello', 'echo-twice', 'echo-thrice', 'copy-reverse',
    'zero-cascade', 'cascade', 'shift-left', 'shift-right', 'riffle',
    'unriffle', 'middle-char', 'remove-last', 'remove-last-two',
    'echo-alternating', 'echo-half', 'length', 'echo-second-seq',
    'echo-nth-seq', 'substring', 'divide-2', 'dedup']


regression_test_tasks = ['reverse', 'test-hill-climb']


E = namedtuple(
    'E',
    ['name', 'method_type', 'config', 'simplify', 'batch_size', 'max_npe'])


def make_experiment_settings(name, **kwargs):
  # Unpack experiment info from name.
  def split_last(string, char):
    i = string.rindex(char)
    return string[:i], string[i+1:]
  def si_to_int(si_string):
    return int(
        si_string.upper().replace('K', '0' * 3).replace('M', '0' * 6)
        .replace('G', '0' * 9))
  method_type, max_npe = split_last(name, '-')
  assert method_type
  assert max_npe
  return E(
      name=name, method_type=method_type, max_npe=si_to_int(max_npe),
      **kwargs)


experiments_set = {
    make_experiment_settings(
        'pg-20M',
        config='entropy_beta=0.05,lr=0.0001,topk_loss_hparam=0.0,topk=0,'
               'pi_loss_hparam=1.0,alpha=0.0',
        simplify=False,
        batch_size=64),
    make_experiment_settings(
        'pg-topk-20M',
        config='entropy_beta=0.01,lr=0.0001,topk_loss_hparam=50.0,topk=10,'
               'pi_loss_hparam=1.0,alpha=0.0',
        simplify=False,
        batch_size=64),
    make_experiment_settings(
        'topk-20M',
        config='entropy_beta=0.01,lr=0.0001,topk_loss_hparam=200.0,topk=10,'
               'pi_loss_hparam=0.0,alpha=0.0',
        simplify=False,
        batch_size=64),
    make_experiment_settings(
        'topk-0ent-20M',
        config='entropy_beta=0.000,lr=0.0001,topk_loss_hparam=200.0,topk=10,'
               'pi_loss_hparam=0.0,alpha=0.0',
        simplify=False,
        batch_size=64),
    make_experiment_settings(
        'ga-20M',
        config='crossover_rate=0.95,mutation_rate=0.15',
        simplify=False,
        batch_size=100),  # Population size.
    make_experiment_settings(
        'rand-20M',
        config='',
        simplify=False,
        batch_size=1),
    make_experiment_settings(
        'simpl-500M',
        config='entropy_beta=0.05,lr=0.0001,topk_loss_hparam=0.5,topk=10,'
               'pi_loss_hparam=1.0,alpha=0.0',
        simplify=True,
        batch_size=64),
}

experiments = {e.name: e for e in experiments_set}


# pylint: disable=redefined-outer-name
def parse_args(extra_args=()):
  """Parse arguments and extract task and experiment info."""
  parser = argparse.ArgumentParser(description='Run all eval tasks.')
  parser.add_argument('--exp', required=True)
  parser.add_argument('--tuning_tasks', action='store_true')
  parser.add_argument('--iclr_tasks', action='store_true')
  parser.add_argument('--regression_tests', action='store_true')
  parser.add_argument('--desc', default='v0')
  parser.add_argument('--reps', default=25)
  parser.add_argument('--task')
  parser.add_argument('--tasks', nargs='+')
  for arg_string, default in extra_args:
    parser.add_argument(arg_string, default=default)
  args = parser.parse_args()

  print('Running experiment: %s' % (args.exp,))
  if args.desc:
    print('Extra description: "%s"' % (args.desc,))
  if args.exp not in experiments:
    raise ValueError('Experiment name is not valid')
  experiment_name = args.exp
  experiment_settings = experiments[experiment_name]
  assert experiment_settings.name == experiment_name

  if args.tasks:
    print('Launching tasks from args: %s' % (args.tasks,))
    tasks = {t: S(length=default_length) for t in args.tasks}
  elif args.task:
    print('Launching single task "%s"' % args.task)
    tasks = {args.task: S(length=default_length)}
  elif args.tuning_tasks:
    print('Only running tuning tasks')
    tasks = {name: S(length=default_length)
             for name in ['reverse-tune', 'remove-char-tune']}
  elif args.iclr_tasks:
    print('Running eval tasks from ICLR paper.')
    tasks = {name: S(length=default_length) for name in iclr_tasks}
  elif args.regression_tests:
    tasks = {name: S(length=default_length)
             for name in regression_test_tasks}

  print('Tasks: %s' % tasks.keys())
  print('reps = %d' % (int(args.reps),))
  return args, tasks, experiment_settings


def run(command_string):
  subprocess.call(command_string, shell=True)


if __name__ == '__main__':
  LAUNCH_TRAINING_COMMAND = 'single_task/launch_training.sh'
  COMPILE_COMMAND = 'bazel build -c opt single_task:run.par'

  args, tasks, experiment_settings = parse_args(
      extra_args=(('--training_replicas', 1),))

  if experiment_settings.method_type in (
      'pg', 'pg-topk', 'topk', 'topk-0ent', 'simpl'):
    # Runs PG and TopK.

    def make_run_cmd(job_name, task, max_npe, num_reps, code_length,
                     batch_size, do_simplify, custom_config_str):
      """Constructs terminal command for launching NN based algorithms.

      The arguments to this function will be used to create config for the
      experiment.

      Args:
        job_name: Name of the job to launch. Should uniquely identify this
            experiment run.
        task: Name of the coding task to solve.
        max_npe: Maximum number of programs executed. An integer.
        num_reps: Number of times to run the experiment. An integer.
        code_length: Maximum allowed length of synthesized code.
        batch_size: Minibatch size for gradient descent.
        do_simplify: Whether to run the experiment in code simplification
            mode. A bool.
        custom_config_str: Additional config for the model config string.

      Returns:
        The terminal command that launches the specified experiment.
      """
      config = """
          env=c(task='{0}',correct_syntax=False),
          agent=c(
          algorithm='pg',
          policy_lstm_sizes=[35,35],value_lstm_sizes=[35,35],
          grad_clip_threshold=50.0,param_init_factor=0.5,regularizer=0.0,
          softmax_tr=1.0,optimizer='rmsprop',ema_baseline_decay=0.99,
          eos_token={3},{4}),
          timestep_limit={1},batch_size={2}
      """.replace(' ', '').replace('\n', '').format(
          task, code_length, batch_size, do_simplify, custom_config_str)
      num_ps = 0 if args.training_replicas == 1 else 1
      return (
          r'{0} --job_name={1} --config="{2}" --max_npe={3} '
          '--num_repetitions={4} --num_workers={5} --num_ps={6} '
          '--stop_on_success={7}'.format(
              LAUNCH_TRAINING_COMMAND, job_name, config, max_npe, num_reps,
              args.training_replicas, num_ps,
              str(not do_simplify).lower()))

  else:
    # Runs GA and Rand.
    assert experiment_settings.method_type in ('ga', 'rand')

    def make_run_cmd(job_name, task, max_npe, num_reps, code_length,
                     batch_size, do_simplify, custom_config_str):
      """Constructs terminal command for launching GA or uniform random
      search.

      The arguments to this function will be used to create config for the
      experiment.

      Args:
        job_name: Name of the job to launch. Should uniquely identify this
            experiment run.
        task: Name of the coding task to solve.
        max_npe: Maximum number of programs executed. An integer.
        num_reps: Number of times to run the experiment. An integer.
        code_length: Maximum allowed length of synthesized code.
        batch_size: Minibatch size for gradient descent.
        do_simplify: Whether to run the experiment in code simplification
            mode. A bool.
        custom_config_str: Additional config for the model config string.

      Returns:
        The terminal command that launches the specified experiment.
      """
      assert not do_simplify
      if custom_config_str:
        custom_config_str = ',' + custom_config_str
      config = """
          env=c(task='{0}',correct_syntax=False),
          agent=c(
          algorithm='{4}'
          {3}),
          timestep_limit={1},batch_size={2}
      """.replace(' ', '').replace('\n', '').format(
          task, code_length, batch_size, custom_config_str,
          experiment_settings.method_type)
      num_workers = num_reps  # Do each rep in parallel.
      return (
          r'{0} --job_name={1} --config="{2}" --max_npe={3} '
          '--num_repetitions={4} --num_workers={5} --num_ps={6} '
          '--stop_on_success={7}'.format(
              LAUNCH_TRAINING_COMMAND, job_name, config, max_npe, num_reps,
              num_workers, 0,
              str(not do_simplify).lower()))

  print('Compiling...')
  run(COMPILE_COMMAND)

  print('Launching %d coding tasks...' % len(tasks))
  for task, task_settings in tasks.iteritems():
    name = 'bf_rl_iclr'
    desc = '{0}.{1}_{2}'.format(args.desc, experiment_settings.name, task)
    job_name = '{}.{}'.format(name, desc)
    print('Job name: %s' % job_name)
    reps = int(args.reps) if not experiment_settings.simplify else 1
    run_cmd = make_run_cmd(
        job_name, task, experiment_settings.max_npe, reps,
        task_settings.length, experiment_settings.batch_size,
        experiment_settings.simplify, experiment_settings.config)
    print('Running command:\n' + run_cmd)
    run(run_cmd)

  print('Done.')
  # pylint: enable=redefined-outer-name
```
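`si_to_int` in run_eval_tasks.py decodes the NPE budget suffix in experiment names ('20M', '500M') by textual substitution. The same logic, copied out so its behavior can be checked standalone:

```python
def si_to_int(si_string):
    # 'K' -> '000', 'M' -> '000000', 'G' -> '000000000'.
    return int(si_string.upper()
               .replace('K', '0' * 3)
               .replace('M', '0' * 6)
               .replace('G', '0' * 9))

assert si_to_int('20M') == 20000000    # budget of pg-20M, ga-20M, etc.
assert si_to_int('500M') == 500000000  # budget of simpl-500M
assert si_to_int('5k') == 5000
print(si_to_int('20M'))
```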
research/brain_coder/single_task/test_tasks.py
deleted
100644 → 0
View file @
d31aba8a
from
__future__
import
absolute_import
from
__future__
import
division
from
__future__
import
print_function
"""Tasks that test correctness of algorithms."""
from
six.moves
import
xrange
from
common
import
reward
as
reward_lib
# brain coder
from
single_task
import
misc
# brain coder
class
BasicTaskManager
(
object
):
"""Wraps a generic reward function."""
def
__init__
(
self
,
reward_fn
):
self
.
reward_fn
=
reward_fn
self
.
good_reward
=
1.0
def
_score_string
(
self
,
string
):
actions
=
misc
.
bf_string_to_tokens
(
string
)
reward
,
correct
=
self
.
reward_fn
(
actions
)
return
misc
.
RewardInfo
(
episode_rewards
=
[
0.0
]
*
(
len
(
string
)
-
1
)
+
[
reward
],
input_case
=
None
,
correct_output
=
None
,
code_output
=
actions
,
input_type
=
None
,
output_type
=
misc
.
IOType
.
integer
,
reason
=
'correct'
if
correct
else
'wrong'
)
def
rl_batch
(
self
,
batch_size
):
reward_fns
=
[
self
.
_score_string
]
*
batch_size
return
reward_fns
class Trie(object):
  """Trie for sequences."""

  EOS = ()

  def __init__(self):
    self.trie = {}

  def insert(self, sequence):
    d = self.trie
    for e in sequence:
      if e not in d:
        d[e] = {}
      d = d[e]
    d[self.EOS] = True  # Terminate sequence.

  def prefix_match(self, sequence):
    """Return prefix of `sequence` which exists in the trie."""
    d = self.trie
    index = 0
    for i, e in enumerate(sequence + [self.EOS]):
      index = i
      if e in d:
        d = d[e]
        if e == self.EOS:
          return sequence, True
      else:
        break
    return sequence[:index], False

  def next_choices(self, sequence):
    d = self.trie
    for e in sequence:
      if e in d:
        d = d[e]
      else:
        raise ValueError('Sequence not a prefix: %s' % (sequence,))
    return d.keys()
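
# Illustrative usage of Trie (example values, not part of the original file):
#   t = Trie()
#   t.insert([(1, 2, 3), (4, 5, 6)])
#   t.prefix_match([(1, 2, 3)])             # -> ([(1, 2, 3)], False): prefix, not terminated.
#   t.prefix_match([(1, 2, 3), (4, 5, 6)])  # -> (full sequence, True): a complete path.
#   t.next_choices([(1, 2, 3)])             # -> [(4, 5, 6)]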
class HillClimbingTask(object):
  """Simple task that tests reward hill climbing ability.

  There are a set of paths (sequences of tokens) which are rewarded. The total
  reward for a path is proportional to its length, so the longest path is the
  target. Shorter paths can be dead ends.
  """

  def __init__(self):
    # Paths are sequences of sub-sequences. Here we form unique sub-sequences
    # out of 3 arbitrary ints. We use sub-sequences instead of single entities
    # to make the task harder by making the episodes last longer, i.e. more
    # for the agent to remember.
    a = (1, 2, 3)
    b = (4, 5, 6)
    c = (7, 8, 7)
    d = (6, 5, 4)
    e = (3, 2, 1)
    f = (8, 5, 1)
    g = (6, 4, 2)
    h = (1, 8, 3)
    self.paths = Trie()
    self.paths.insert([a, b, h])
    self.paths.insert([a, b, c, d, e, f, g, h])
    self.paths.insert([a, b, c, d, e, b, a])
    self.paths.insert([a, b, g, h])
    self.paths.insert([a, e, f, g])
    self.correct_sequence = misc.flatten([a, b, c, d, e, f, g, h])

    def distance_fn(a, b):
      len_diff = abs(len(a) - len(b))
      return sum(reward_lib.mod_abs_diff(ai - 1, bi - 1, 8)
                 for ai, bi in zip(a, b)) + len_diff * 4  # 8 / 2 = 4
    self.distance_fn = distance_fn

  def __call__(self, actions):
    # Compute reward for action sequence.
    actions = [a for a in actions if a > 0]
    sequence = [tuple(actions[i: i + 3]) for i in xrange(0, len(actions), 3)]
    prefix, complete = self.paths.prefix_match(sequence)
    if complete:
      return float(len(prefix)), actions == self.correct_sequence
    if len(prefix) == len(sequence):
      return float(len(prefix)), False
    next_pred = sequence[len(prefix)]
    choices = self.paths.next_choices(prefix)
    if choices == [()]:
      return (len(prefix) - len(next_pred) / 3.0), False
    min_dist = min(self.distance_fn(c, next_pred) for c in choices)
    # +1 reward for each element in the sequence correct, plus fraction towards
    # closest next element.
    # Maximum distance possible is num_actions * base / 2 = 3 * 8 / 2 = 12
    return (len(prefix) + (1 - min_dist / 12.0)), False
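
# A worked example of the reward arithmetic above (hypothetical episode; it
# matches the first case exercised in test_tasks_test.py). Tokens [1, 2, 0]:
# the 0 is dropped, leaving the partial sub-sequence (1, 2). Nothing in the
# trie is matched yet, so the reward is the fraction of the way towards the
# only legal next element a = (1, 2, 3):
#   min_dist = mod_abs_diff(0, 0, 8) + mod_abs_diff(1, 1, 8) + 1 * 4 = 4
#   reward   = len(prefix) + (1 - min_dist / 12.0) = 0 + 8 / 12.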
research/brain_coder/single_task/test_tasks_test.py
deleted
100644 → 0
View file @
d31aba8a
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

"""Tests for test_tasks."""

import numpy as np
import tensorflow as tf

from single_task import misc  # brain coder
from single_task import test_tasks  # brain coder


def get_reward(reward_fn, candidate):
  return sum(reward_fn(misc.bf_tokens_to_string(candidate)).episode_rewards)


class TestTasksTest(tf.test.TestCase):

  def testHillClimbingTask(self):
    task = test_tasks.BasicTaskManager(test_tasks.HillClimbingTask())
    reward_fns = task.rl_batch(1)
    reward_fn = reward_fns[0]
    self.assertTrue(np.isclose(
        get_reward(reward_fn, [1, 2, 0]), 8 / 12.))
    self.assertTrue(np.isclose(
        get_reward(reward_fn, [1, 2, 2, 0]), 11 / 12.))
    self.assertTrue(np.isclose(
        get_reward(reward_fn, [1, 2, 3, 0]), 1.0))
    self.assertTrue(np.isclose(
        get_reward(reward_fn, [1, 2, 3, 4, 5, 2, 0]), 1. + 8 / 12.))
    self.assertTrue(np.isclose(
        get_reward(reward_fn, [1, 2, 3, 4, 5, 6, 0]), 2.0))
    self.assertTrue(np.isclose(
        get_reward(reward_fn, [1, 2, 3, 4, 5, 6, 1, 8, 3, 0]), 3.0))
    self.assertTrue(np.isclose(
        get_reward(reward_fn, [1, 2, 3, 4, 5, 6, 7, 8, 7, 0]), 3.0))
    self.assertTrue(np.isclose(
        get_reward(reward_fn, [1, 2, 3, 4, 5, 6, 1, 8, 3, 1, 0]),
        3.0 - 4 / 12.))
    self.assertTrue(np.isclose(
        get_reward(reward_fn, [1, 2, 3, 4, 5, 6, 1, 8, 3, 1, 1, 1, 1, 0]),
        2.0))
    self.assertTrue(np.isclose(
        get_reward(reward_fn, [1, 2, 3, 4, 5, 6, 7, 8, 7, 3, 0]),
        3.0 + 1 / 12.))
    self.assertTrue(np.isclose(
        get_reward(reward_fn,
                   [1, 2, 3, 4, 5, 6, 7, 8, 7, 6, 5, 4, 3, 2, 1, 8, 5, 1,
                    6, 4, 2, 1, 8, 3, 0]),
        8.0))
    self.assertTrue(np.isclose(
        get_reward(reward_fn,
                   [1, 2, 3, 4, 5, 6, 7, 8, 7, 6, 5, 4, 3, 2, 1, 8, 5, 1,
                    6, 4, 2, 1, 8, 3, 1, 1, 0]),
        8.0 - 8 / 12.))
    self.assertTrue(np.isclose(
        get_reward(reward_fn,
                   [1, 2, 3, 4, 5, 6, 7, 8, 7, 6, 5, 4, 3, 2, 1, 8, 5, 1,
                    6, 4, 2, 1, 8, 3, 1, 1, 1, 1, 1, 1, 1, 0]),
        7.0))


if __name__ == '__main__':
  tf.test.main()
research/brain_coder/single_task/tune.py
deleted
100644 → 0
View file @
d31aba8a
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

r"""Run grid search.
Look at launch_tuning.sh for details on how to tune at scale.
Usage example:
Tune with one worker on the local machine.
CONFIG="agent=c(algorithm='pg'),"
CONFIG+="env=c(task_cycle=['reverse-tune', 'remove-tune'])"
HPARAM_SPACE_TYPE="pg"
OUT_DIR="/tmp/bf_pg_tune"
MAX_NPE=5000000
NUM_REPETITIONS=50
rm -rf $OUT_DIR
mkdir $OUT_DIR
bazel run -c opt single_task:tune -- \
--alsologtostderr \
--config="$CONFIG" \
--max_npe="$MAX_NPE" \
--num_repetitions="$NUM_REPETITIONS" \
--logdir="$OUT_DIR" \
--summary_interval=1 \
--model_v=0 \
--hparam_space="$HPARAM_SPACE_TYPE" \
--tuner_id=0 \
--num_tuners=1 \
2>&1 >"$OUT_DIR/tuner_0.log"
learning/brain/tensorboard/tensorboard.sh --port 12345 --logdir "$OUT_DIR"
"""
import ast
import os
from absl import app
from absl import flags
from absl import logging
import numpy as np
from six.moves import xrange
import tensorflow as tf

from single_task import defaults  # brain coder
from single_task import run as run_lib  # brain coder

FLAGS = flags.FLAGS
flags.DEFINE_integer('tuner_id', 0, 'The unique ID for this tuning worker.')
flags.DEFINE_integer('num_tuners', 1, 'How many tuners are there.')
flags.DEFINE_string(
    'hparam_space', 'default',
    'String name which denotes the hparam space to tune over. This is '
    'algorithm dependent.')
flags.DEFINE_string(
    'fixed_hparams', '',
    'HParams string. Used to fix hparams during tuning.')
flags.DEFINE_float(
    'success_rate_objective_weight', 1.0,
    'How much to weight success rate vs num programs seen. By default, only '
    'success rate is optimized (this is the setting used in the paper).')
def parse_hparams_string(hparams_str):
  hparams = {}
  for term in hparams_str.split(','):
    if not term:
      continue
    name, value = term.split('=')
    hparams[name.strip()] = ast.literal_eval(value)
  return hparams


def int_to_multibase(n, bases):
  digits = [0] * len(bases)
  for i, b in enumerate(bases):
    n, d = divmod(n, b)
    digits[i] = d
  return digits


def hparams_for_index(index, tuning_space):
  keys = sorted(tuning_space.keys())
  indices = int_to_multibase(index, [len(tuning_space[k]) for k in keys])
  return tf.contrib.training.HParams(
      **{k: tuning_space[k][i] for k, i in zip(keys, indices)})
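
# Illustrative only (hypothetical hparam names, not part of the original
# file): with a 2x3 grid, each trial index picks one digit per hparam, so
# indices 0..5 cover the full grid exactly once.
#   space = {'lr': [1e-3, 1e-4], 'entropy': [0.0, 0.01, 0.1]}
#   int_to_multibase(5, [3, 2])  # -> [2, 1], i.e. entropy=0.1, lr=1e-4
#   parse_hparams_string('lr=0.001,entropy=0.01')
#   # -> {'lr': 0.001, 'entropy': 0.01}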
def run_tuner_loop(ns):
  """Run tuning loop for this worker."""
  is_chief = FLAGS.task_id == 0
  tuning_space = ns.define_tuner_hparam_space(
      hparam_space_type=FLAGS.hparam_space)
  fixed_hparams = parse_hparams_string(FLAGS.fixed_hparams)
  for name, value in fixed_hparams.iteritems():
    tuning_space[name] = [value]
  tuning_space_size = np.prod(
      [len(values) for values in tuning_space.values()])
  num_local_trials, remainder = divmod(tuning_space_size, FLAGS.num_tuners)
  if FLAGS.tuner_id < remainder:
    num_local_trials += 1
  starting_trial_id = (
      num_local_trials * FLAGS.tuner_id + min(remainder, FLAGS.tuner_id))

  logging.info('tuning_space_size: %d', tuning_space_size)
  logging.info('num_local_trials: %d', num_local_trials)
  logging.info('starting_trial_id: %d', starting_trial_id)

  for local_trial_index in xrange(num_local_trials):
    trial_config = defaults.default_config_with_updates(FLAGS.config)
    global_trial_index = local_trial_index + starting_trial_id
    trial_name = 'trial_' + str(global_trial_index)
    trial_dir = os.path.join(FLAGS.logdir, trial_name)
    hparams = hparams_for_index(global_trial_index, tuning_space)
    ns.write_hparams_to_config(
        trial_config, hparams, hparam_space_type=FLAGS.hparam_space)

    results_list = ns.run_training(
        config=trial_config, tuner=None, logdir=trial_dir, is_chief=is_chief,
        trial_name=trial_name)

    if not is_chief:
      # Only chief worker needs to write tuning results to disk.
      continue

    objective, metrics = compute_tuning_objective(
        results_list, hparams, trial_name, num_trials=tuning_space_size)
    logging.info('metrics:\n%s', metrics)
    logging.info('objective: %s', objective)
    logging.info('programs_seen_fraction: %s',
                 metrics['programs_seen_fraction'])
    logging.info('success_rate: %s', metrics['success_rate'])
    logging.info('success_rate_objective_weight: %s',
                 FLAGS.success_rate_objective_weight)

    tuning_results_file = os.path.join(trial_dir, 'tuning_results.txt')
    with tf.gfile.FastGFile(tuning_results_file, 'a') as writer:
      writer.write(str(metrics) + '\n')

    logging.info('Trial %s complete.', trial_name)
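
# A standalone sketch of the trial sharding above (illustrative, not part of
# the original file); it mirrors the arithmetic in run_tuner_loop. With 10
# trials across 3 tuners it yields (start, count) pairs (0, 4), (4, 3),
# (7, 3), covering each trial once:
#   def shard(num_trials, num_tuners, tuner_id):
#     n, rem = divmod(num_trials, num_tuners)
#     if tuner_id < rem:
#       n += 1
#     return n * tuner_id + min(rem, tuner_id), n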
def compute_tuning_objective(results_list, hparams, trial_name, num_trials):
  """Compute tuning objective and metrics given results and trial information.

  Args:
    results_list: List of results dicts read from disk. These are written by
        workers.
    hparams: tf.contrib.training.HParams instance containing the hparams used
        in this trial (only the hparams which are being tuned).
    trial_name: Name of this trial. Used to create a trial directory.
    num_trials: Total number of trials that need to be run. This is saved in
        the metrics dict for future reference.

  Returns:
    objective: The objective computed for this trial. Choose the hparams for
        the trial with the largest objective value.
    metrics: Information about this trial. A dict.
  """
  found_solution = [r['found_solution'] for r in results_list]
  successful_program_counts = [
      r['npe'] for r in results_list if r['found_solution']]

  success_rate = sum(found_solution) / float(len(results_list))

  max_programs = FLAGS.max_npe  # Per run.
  all_program_counts = [
      r['npe'] if r['found_solution'] else max_programs
      for r in results_list]
  programs_seen_fraction = (
      float(sum(all_program_counts))
      / (max_programs * len(all_program_counts)))

  # min/max/avg stats are over successful runs.
  metrics = {
      'num_runs': len(results_list),
      'num_succeeded': sum(found_solution),
      'success_rate': success_rate,
      'programs_seen_fraction': programs_seen_fraction,
      'avg_programs': np.mean(successful_program_counts),
      'max_possible_programs_per_run': max_programs,
      'global_step': sum([r['num_batches'] for r in results_list]),
      'hparams': hparams.values(),
      'trial_name': trial_name,
      'num_trials': num_trials}

  # Report stats per task.
  tasks = [r['task'] for r in results_list]
  for task in set(tasks):
    task_list = [r for r in results_list if r['task'] == task]
    found_solution = [r['found_solution'] for r in task_list]
    successful_rewards = [
        r['best_reward'] for r in task_list if r['found_solution']]
    successful_num_batches = [
        r['num_batches'] for r in task_list if r['found_solution']]
    successful_program_counts = [
        r['npe'] for r in task_list if r['found_solution']]
    metrics_append = {
        task + '__num_runs': len(task_list),
        task + '__num_succeeded': sum(found_solution),
        task + '__success_rate': (
            sum(found_solution) / float(len(task_list)))}
    metrics.update(metrics_append)
    if any(found_solution):
      metrics_append = {
          task + '__min_reward': min(successful_rewards),
          task + '__max_reward': max(successful_rewards),
          task + '__avg_reward': np.median(successful_rewards),
          task + '__min_programs': min(successful_program_counts),
          task + '__max_programs': max(successful_program_counts),
          task + '__avg_programs': np.mean(successful_program_counts),
          task + '__min_batches': min(successful_num_batches),
          task + '__max_batches': max(successful_num_batches),
          task + '__avg_batches': np.mean(successful_num_batches)}
      metrics.update(metrics_append)

  # Objective will be maximized.
  # Maximize success rate, minimize num programs seen.
  # Max objective is always 1.
  weight = FLAGS.success_rate_objective_weight
  objective = (
      weight * success_rate
      + (1 - weight) * (1 - programs_seen_fraction))
  metrics['objective'] = objective

  return objective, metrics
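
# Worked example of the objective above (hypothetical numbers): with
# success_rate_objective_weight = 0.5, a 0.6 success rate and 0.3 of the
# total program budget consumed,
#   objective = 0.5 * 0.6 + (1 - 0.5) * (1 - 0.3) = 0.65.
# With the default weight of 1.0 the objective reduces to the success rate.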
def main(argv):
  del argv
  logging.set_verbosity(FLAGS.log_level)

  if not FLAGS.logdir:
    raise ValueError('logdir flag must be provided.')
  if FLAGS.num_workers <= 0:
    raise ValueError('num_workers flag must be greater than 0.')
  if FLAGS.task_id < 0:
    raise ValueError('task_id flag must be greater than or equal to 0.')
  if FLAGS.task_id >= FLAGS.num_workers:
    raise ValueError('task_id flag must be strictly less than num_workers flag.')
  if FLAGS.num_tuners <= 0:
    raise ValueError('num_tuners flag must be greater than 0.')
  if FLAGS.tuner_id < 0:
    raise ValueError('tuner_id flag must be greater than or equal to 0.')
  if FLAGS.tuner_id >= FLAGS.num_tuners:
    raise ValueError('tuner_id flag must be strictly less than num_tuners flag.')

  ns, _ = run_lib.get_namespace(FLAGS.config)
  run_tuner_loop(ns)


if __name__ == '__main__':
  app.run(main)
research/cognitive_mapping_and_planning/.gitignore
deleted
100644 → 0
View file @
d31aba8a
deps
*.pyc
lib*.so
lib*.so*
research/cognitive_mapping_and_planning/README.md
deleted
100644 → 0
View file @
d31aba8a



# Cognitive Mapping and Planning for Visual Navigation
**Saurabh Gupta, James Davidson, Sergey Levine, Rahul Sukthankar, Jitendra Malik**
**Computer Vision and Pattern Recognition (CVPR) 2017.**
**[ArXiv](https://arxiv.org/abs/1702.03920), [Project Website](https://sites.google.com/corp/view/cognitive-mapping-and-planning/)**
### Citing
If you find this code base and models useful in your research, please consider
citing the following paper:
```
@inproceedings{gupta2017cognitive,
  title={Cognitive Mapping and Planning for Visual Navigation},
  author={Gupta, Saurabh and Davidson, James and Levine, Sergey and
    Sukthankar, Rahul and Malik, Jitendra},
  booktitle={CVPR},
  year={2017}
}
```
### Contents
1. [Requirements: software](#requirements-software)
2. [Requirements: data](#requirements-data)
3. [Test Pre-trained Models](#test-pre-trained-models)
4. [Train your Own Models](#train-your-own-models)
### Requirements: software
1. Python Virtual Env Setup: All code is implemented in Python but depends on a
   small number of python packages and a couple of C libraries. We recommend
   using a virtual environment for installing these python packages and python
   bindings for these C libraries.
   ```Shell
   VENV_DIR=venv
   pip install virtualenv
   virtualenv $VENV_DIR
   source $VENV_DIR/bin/activate
   # You may need to upgrade pip for installing opencv-python.
   pip install --upgrade pip
   # Install simple dependencies.
   pip install -r requirements.txt
   # Patch bugs in dependencies.
   sh patches/apply_patches.sh
   ```
2. Install [Tensorflow](https://www.tensorflow.org/) inside this virtual
   environment. You will need to use one of the latest nightly builds
   (see instructions [here](https://github.com/tensorflow/tensorflow#installation)).
3. Swiftshader: We use [Swiftshader](https://github.com/google/swiftshader.git),
   a CPU based renderer, to render the meshes. It is possible to use other
   renderers; replace `SwiftshaderRenderer` in `render/swiftshader_renderer.py`
   with bindings to your renderer.
   ```Shell
   mkdir -p deps
   git clone --recursive https://github.com/google/swiftshader.git deps/swiftshader-src
   cd deps/swiftshader-src && git checkout 91da6b00584afd7dcaed66da88e2b617429b3950
   git submodule update
   mkdir build && cd build && cmake .. && make -j 16 libEGL libGLESv2
   cd ../../../
   cp deps/swiftshader-src/build/libEGL* libEGL.so.1
   cp deps/swiftshader-src/build/libGLESv2* libGLESv2.so.2
   ```
4. PyAssimp: We use [PyAssimp](https://github.com/assimp/assimp.git) to load
   meshes. It is possible to use other libraries to load meshes; replace
   `Shape` in `render/swiftshader_renderer.py` with bindings to your library
   for loading meshes.
   ```Shell
   mkdir -p deps
   git clone https://github.com/assimp/assimp.git deps/assimp-src
   cd deps/assimp-src
   git checkout 2afeddd5cb63d14bc77b53740b38a54a97d94ee8
   cmake CMakeLists.txt -G 'Unix Makefiles' && make -j 16
   cd port/PyAssimp && python setup.py install
   cd ../../../..
   cp deps/assimp-src/lib/libassimp* .
   ```
5. graph-tool: We use the [graph-tool](https://git.skewed.de/count0/graph-tool)
   library for graph processing.
   ```Shell
   mkdir -p deps
   # If the following git clone command fails, you can also download the source
   # from https://downloads.skewed.de/graph-tool/graph-tool-2.2.44.tar.bz2
   git clone https://git.skewed.de/count0/graph-tool deps/graph-tool-src
   cd deps/graph-tool-src && git checkout 178add3a571feb6666f4f119027705d95d2951ab
   bash autogen.sh
   ./configure --disable-cairo --disable-sparsehash --prefix=$HOME/.local
   make -j 16
   make install
   cd ../../
   ```
### Requirements: data
1. Download the Stanford 3D Indoor Spaces Dataset (S3DIS Dataset) and ImageNet
   pre-trained models for initializing different models. Follow the
   instructions in `data/README.md`.

### Test Pre-trained Models
1. Download pre-trained models. See `output/README.md`.
2. Test models using `scripts/script_test_pretrained_models.sh`.

### Train Your Own Models
All models were trained asynchronously with 16 workers, each worker using data
from a single floor. The default hyper-parameters correspond to this setting.
See [distributed training with
Tensorflow](https://www.tensorflow.org/deploy/distributed) for setting up
distributed training. Training with a single worker is possible with the
current code base but will require some minor changes to allow each worker to
load all training environments.

### Contact
For questions or issues open an issue on the tensorflow/models [issues
tracker](https://github.com/tensorflow/models/issues). Please assign issues to
@s-gupta.

### Credits
This code was written by Saurabh Gupta (@s-gupta).
research/cognitive_mapping_and_planning/__init__.py
deleted
100644 → 0
View file @
d31aba8a
research/cognitive_mapping_and_planning/cfgs/__init__.py
deleted
100644 → 0
View file @
d31aba8a
research/cognitive_mapping_and_planning/cfgs/config_cmp.py
deleted
100644 → 0
View file @
d31aba8a
# Copyright 2016 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
import os, sys
import numpy as np
from tensorflow.python.platform import app
from tensorflow.python.platform import flags
import logging
import src.utils as utils
import cfgs.config_common as cc

import tensorflow as tf

rgb_resnet_v2_50_path = 'data/init_models/resnet_v2_50/model.ckpt-5136169'
d_resnet_v2_50_path = 'data/init_models/distill_rgb_to_d_resnet_v2_50/model.ckpt-120002'
def get_default_args():
  summary_args = utils.Foo(display_interval=1, test_iters=26,
                           arop_full_summary_iters=14)
  control_args = utils.Foo(train=False, test=False,
                           force_batchnorm_is_training_at_test=False,
                           reset_rng_seed=False, only_eval_when_done=False,
                           test_mode=None)
  return summary_args, control_args


def get_default_cmp_args():
  batch_norm_param = {'center': True, 'scale': True,
                      'activation_fn': tf.nn.relu}

  mapper_arch_args = utils.Foo(
      dim_reduce_neurons=64,
      fc_neurons=[1024, 1024],
      fc_out_size=8,
      fc_out_neurons=64,
      encoder='resnet_v2_50',
      deconv_neurons=[64, 32, 16, 8, 4, 2],
      deconv_strides=[2, 2, 2, 2, 2, 2],
      deconv_layers_per_block=2,
      deconv_kernel_size=4,
      fc_dropout=0.5,
      combine_type='wt_avg_logits',
      batch_norm_param=batch_norm_param)

  readout_maps_arch_args = utils.Foo(
      num_neurons=[],
      strides=[],
      kernel_size=None,
      layers_per_block=None)

  arch_args = utils.Foo(
      vin_val_neurons=8, vin_action_neurons=8, vin_ks=3, vin_share_wts=False,
      pred_neurons=[64, 64], pred_batch_norm_param=batch_norm_param,
      conv_on_value_map=0, fr_neurons=16, fr_ver='v2', fr_inside_neurons=64,
      fr_stride=1, crop_remove_each=30, value_crop_size=4,
      action_sample_type='sample', action_sample_combine_type='one_or_other',
      sample_gt_prob_type='inverse_sigmoid_decay', dagger_sample_bn_false=True,
      vin_num_iters=36, isd_k=750., use_agent_loc=False, multi_scale=True,
      readout_maps=False, rom_arch=readout_maps_arch_args)

  return arch_args, mapper_arch_args
def get_arch_vars(arch_str):
  if arch_str == '':
    vals = []
  else:
    vals = arch_str.split('_')
  ks = ['var1', 'var2', 'var3']
  ks = ks[:len(vals)]

  # Exp Ver.
  if len(vals) == 0: ks.append('var1'); vals.append('v0')
  # custom arch.
  if len(vals) == 1: ks.append('var2'); vals.append('')
  # map scale for projection baseline.
  if len(vals) == 2: ks.append('var3'); vals.append('fr2')

  assert(len(vals) == 3)

  vars = utils.Foo()
  for k, v in zip(ks, vals):
    setattr(vars, k, v)

  logging.error('arch_vars: %s', vars)
  return vars
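
# For example (hypothetical config fragment): the arch string 'lmap_Msc'
# parses to var1='lmap', var2='Msc' and the default var3='fr2', which selects
# the multi-scale learned-map planner in process_arch_learned_map below.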
def process_arch_str(args, arch_str):
  # This function modifies args.
  args.arch, args.mapper_arch = get_default_cmp_args()

  arch_vars = get_arch_vars(arch_str)

  args.navtask.task_params.outputs.ego_maps = True
  args.navtask.task_params.outputs.ego_goal_imgs = True
  args.navtask.task_params.outputs.egomotion = True
  args.navtask.task_params.toy_problem = False

  if arch_vars.var1 == 'lmap':
    args = process_arch_learned_map(args, arch_vars)
  elif arch_vars.var1 == 'pmap':
    args = process_arch_projected_map(args, arch_vars)
  else:
    logging.fatal('arch_vars.var1 should be lmap or pmap, but is %s',
                  arch_vars.var1)
    assert(False)

  return args
def process_arch_learned_map(args, arch_vars):
  # Multiscale vision based system.
  args.navtask.task_params.input_type = 'vision'
  args.navtask.task_params.outputs.images = True

  if args.navtask.camera_param.modalities[0] == 'rgb':
    args.solver.pretrained_path = rgb_resnet_v2_50_path
  elif args.navtask.camera_param.modalities[0] == 'depth':
    args.solver.pretrained_path = d_resnet_v2_50_path

  if arch_vars.var2 == 'Ssc':
    sc = 1. / args.navtask.task_params.step_size
    args.arch.vin_num_iters = 40
    args.navtask.task_params.map_scales = [sc]
    max_dist = args.navtask.task_params.max_dist * \
        args.navtask.task_params.num_goals
    args.navtask.task_params.map_crop_sizes = [2 * max_dist]

    args.arch.fr_stride = 1
    args.arch.vin_action_neurons = 8
    args.arch.vin_val_neurons = 3
    args.arch.fr_inside_neurons = 32

    args.mapper_arch.pad_map_with_zeros_each = [24]
    args.mapper_arch.deconv_neurons = [64, 32, 16]
    args.mapper_arch.deconv_strides = [1, 2, 1]

  elif (arch_vars.var2 == 'Msc' or arch_vars.var2 == 'MscROMms' or
        arch_vars.var2 == 'MscROMss' or arch_vars.var2 == 'MscNoVin'):
    # Code for multi-scale planner.
    args.arch.vin_num_iters = 8
    args.arch.crop_remove_each = 4
    args.arch.value_crop_size = 8

    sc = 1. / args.navtask.task_params.step_size
    max_dist = args.navtask.task_params.max_dist * \
        args.navtask.task_params.num_goals
    n_scales = np.log2(float(max_dist) / float(args.arch.vin_num_iters))
    n_scales = int(np.ceil(n_scales) + 1)

    args.navtask.task_params.map_scales = \
        list(sc * (0.5 ** (np.arange(n_scales))[::-1]))
    args.navtask.task_params.map_crop_sizes = [16 for x in range(n_scales)]

    args.arch.fr_stride = 1
    args.arch.vin_action_neurons = 8
    args.arch.vin_val_neurons = 3
    args.arch.fr_inside_neurons = 32

    args.mapper_arch.pad_map_with_zeros_each = [0 for _ in range(n_scales)]
    args.mapper_arch.deconv_neurons = [64 * n_scales, 32 * n_scales,
                                       16 * n_scales]
    args.mapper_arch.deconv_strides = [1, 2, 1]

    if arch_vars.var2 == 'MscNoVin':
      # No planning version.
      args.arch.fr_stride = [1, 2, 1, 2]
      args.arch.vin_action_neurons = None
      args.arch.vin_val_neurons = 16
      args.arch.fr_inside_neurons = 32

      args.arch.crop_remove_each = 0
      args.arch.value_crop_size = 4
      args.arch.vin_num_iters = 0

    elif arch_vars.var2 == 'MscROMms' or arch_vars.var2 == 'MscROMss':
      # Code with read outs. MscROMms flattens and reads out,
      # MscROMss does not flatten and produces output at multiple scales.
      args.navtask.task_params.outputs.readout_maps = True
      args.navtask.task_params.map_resize_method = 'antialiasing'
      args.arch.readout_maps = True

      if arch_vars.var2 == 'MscROMms':
        args.arch.rom_arch.num_neurons = [64, 1]
        args.arch.rom_arch.kernel_size = 4
        args.arch.rom_arch.strides = [2, 2]
        args.arch.rom_arch.layers_per_block = 2

        args.navtask.task_params.readout_maps_crop_sizes = [64]
        args.navtask.task_params.readout_maps_scales = [sc]

      elif arch_vars.var2 == 'MscROMss':
        args.arch.rom_arch.num_neurons = \
            [64, len(args.navtask.task_params.map_scales)]
        args.arch.rom_arch.kernel_size = 4
        args.arch.rom_arch.strides = [1, 1]
        args.arch.rom_arch.layers_per_block = 1

        args.navtask.task_params.readout_maps_crop_sizes = \
            args.navtask.task_params.map_crop_sizes
        args.navtask.task_params.readout_maps_scales = \
            args.navtask.task_params.map_scales

  else:
    logging.fatal(
        'arch_vars.var2 not one of Ssc, Msc, MscROMms, MscROMss, MscNoVin.')
    assert(False)

  map_channels = args.mapper_arch.deconv_neurons[-1] / \
      (2 * len(args.navtask.task_params.map_scales))
  args.navtask.task_params.map_channels = map_channels

  return args
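
# Worked example of the multi-scale geometry above (hypothetical task
# parameters): with step_size=8, max_dist=32 and num_goals=1,
#   sc = 1/8, n_scales = int(ceil(log2(32 / 8))) + 1 = 3,
#   map_scales = [sc/4, sc/2, sc] = [1/32, 1/16, 1/8],
# i.e. three 16x16 map crops at successively finer resolutions.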
def process_arch_projected_map(args, arch_vars):
  # Single scale vision based system which does not use a mapper but instead
  # uses an analytically estimated map.
  ds = int(arch_vars.var3[2])
  args.navtask.task_params.input_type = 'analytical_counts'
  args.navtask.task_params.outputs.analytical_counts = True

  assert(args.navtask.task_params.modalities[0] == 'depth')
  args.navtask.camera_param.img_channels = None

  analytical_counts = utils.Foo(
      map_sizes=[512 / ds],
      xy_resolution=[5. * ds],
      z_bins=[[-10, 10, 150, 200]],
      non_linearity=[arch_vars.var2])
  args.navtask.task_params.analytical_counts = analytical_counts

  sc = 1. / ds
  args.arch.vin_num_iters = 36
  args.navtask.task_params.map_scales = [sc]
  args.navtask.task_params.map_crop_sizes = [512 / ds]

  args.arch.fr_stride = [1, 2]
  args.arch.vin_action_neurons = 8
  args.arch.vin_val_neurons = 3
  args.arch.fr_inside_neurons = 32

  map_channels = len(analytical_counts.z_bins[0]) + 1
  args.navtask.task_params.map_channels = map_channels
  args.solver.freeze_conv = False

  return args
def get_args_for_config(config_name):
  args = utils.Foo()

  args.summary, args.control = get_default_args()

  exp_name, mode_str = config_name.split('+')
  arch_str, solver_str, navtask_str = exp_name.split('.')
  logging.error('config_name: %s', config_name)
  logging.error('arch_str: %s', arch_str)
  logging.error('navtask_str: %s', navtask_str)
  logging.error('solver_str: %s', solver_str)
  logging.error('mode_str: %s', mode_str)

  args.solver = cc.process_solver_str(solver_str)
  args.navtask = cc.process_navtask_str(navtask_str)

  args = process_arch_str(args, arch_str)
  args.arch.isd_k = args.solver.isd_k

  # Train, test, etc.
  mode, imset = mode_str.split('_')
  args = cc.adjust_args_for_mode(args, mode)
  args.navtask.building_names = args.navtask.dataset.get_split(imset)
  args.control.test_name = '{:s}_on_{:s}'.format(mode, imset)

  # Log the arguments
  logging.error('%s', args)
  return args
research/cognitive_mapping_and_planning/cfgs/config_common.py
deleted
100644 → 0
View file @
d31aba8a
# Copyright 2016 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
import os
import numpy as np
import logging
import src.utils as utils
import datasets.nav_env_config as nec
from datasets import factory
def adjust_args_for_mode(args, mode):
  if mode == 'train':
    args.control.train = True

  elif mode == 'val1':
    # Same settings as for training, to make sure nothing wonky is happening
    # there.
    args.control.test = True
    args.control.test_mode = 'val'
    args.navtask.task_params.batch_size = 32

  elif mode == 'val2':
    # No data augmentation, not sampling but taking the argmax action, not
    # sampling from the ground truth at all.
    args.control.test = True
    args.arch.action_sample_type = 'argmax'
    args.arch.sample_gt_prob_type = 'zero'
    args.navtask.task_params.data_augment = \
        utils.Foo(lr_flip=0, delta_angle=0, delta_xy=0, relight=False,
                  relight_fast=False, structured=False)
    args.control.test_mode = 'val'
    args.navtask.task_params.batch_size = 32

  elif mode == 'bench':
    # Actually testing the agent in settings that are kept same between
    # different runs.
    args.navtask.task_params.batch_size = 16
    args.control.test = True
    args.arch.action_sample_type = 'argmax'
    args.arch.sample_gt_prob_type = 'zero'
    args.navtask.task_params.data_augment = \
        utils.Foo(lr_flip=0, delta_angle=0, delta_xy=0, relight=False,
                  relight_fast=False, structured=False)
    args.summary.test_iters = 250
    args.control.only_eval_when_done = True
    args.control.reset_rng_seed = True
    args.control.test_mode = 'test'

  else:
    logging.fatal('Unknown mode: %s.', mode)
    assert(False)
  return args
def get_solver_vars(solver_str):
  if solver_str == '':
    vals = []
  else:
    vals = solver_str.split('_')
  ks = ['clip', 'dlw', 'long', 'typ', 'rlw', 'isdk', 'adam_eps', 'init_lr']
  ks = ks[:len(vals)]

  # Gradient clipping or not.
  if len(vals) == 0: ks.append('clip'); vals.append('noclip')
  # data loss weight.
  if len(vals) == 1: ks.append('dlw'); vals.append('dlw20')
  # how long to train for.
  if len(vals) == 2: ks.append('long'); vals.append('nolong')
  # Adam
  if len(vals) == 3: ks.append('typ'); vals.append('adam2')
  # reg loss wt
  if len(vals) == 4: ks.append('rlw'); vals.append('rlw1')
  # isd_k
  if len(vals) == 5: ks.append('isdk'); vals.append('isdk415')  # 415, inflexion at 2.5k.
  # adam eps
  if len(vals) == 6: ks.append('adam_eps'); vals.append('aeps1en8')
  # init lr
  if len(vals) == 7: ks.append('init_lr'); vals.append('lr1en3')

  assert(len(vals) == 8)

  vars = utils.Foo()
  for k, v in zip(ks, vals):
    setattr(vars, k, v)
  logging.error('solver_vars: %s', vars)
  return vars
def process_solver_str(solver_str):
  solver = utils.Foo(
      seed=0, learning_rate_decay=None, clip_gradient_norm=None,
      max_steps=None, initial_learning_rate=None, momentum=None,
      steps_per_decay=None, logdir=None, sync=False, adjust_lr_sync=True,
      wt_decay=0.0001, data_loss_wt=None, reg_loss_wt=None, freeze_conv=True,
      num_workers=1, task=0, ps_tasks=0, master='local', typ=None,
      momentum2=None, adam_eps=None)

  # Clobber with overrides from solver str.
  solver_vars = get_solver_vars(solver_str)

  solver.data_loss_wt = float(solver_vars.dlw[3:].replace('x', '.'))
  solver.adam_eps = float(
      solver_vars.adam_eps[4:].replace('x', '.').replace('n', '-'))
  solver.initial_learning_rate = float(
      solver_vars.init_lr[2:].replace('x', '.').replace('n', '-'))
  solver.reg_loss_wt = float(solver_vars.rlw[3:].replace('x', '.'))
  solver.isd_k = float(solver_vars.isdk[4:].replace('x', '.'))

  long = solver_vars.long
  if long == 'long':
    solver.steps_per_decay = 40000
    solver.max_steps = 120000
  elif long == 'long2':
    solver.steps_per_decay = 80000
    solver.max_steps = 120000
  elif long == 'nolong' or long == 'nol':
    solver.steps_per_decay = 20000
    solver.max_steps = 60000
  else:
    logging.fatal('solver_vars.long should be long, long2, nolong or nol.')
    assert(False)

  clip = solver_vars.clip
  if clip == 'noclip' or clip == 'nocl':
    solver.clip_gradient_norm = 0
  elif clip[:4] == 'clip':
    solver.clip_gradient_norm = float(clip[4:].replace('x', '.'))
  else:
    logging.fatal('Unknown solver_vars.clip: %s', clip)
    assert(False)

  typ = solver_vars.typ
  if typ == 'adam':
    solver.typ = 'adam'
    solver.momentum = 0.9
    solver.momentum2 = 0.999
    solver.learning_rate_decay = 1.0
  elif typ == 'adam2':
    solver.typ = 'adam'
    solver.momentum = 0.9
    solver.momentum2 = 0.999
    solver.learning_rate_decay = 0.1
  elif typ == 'sgd':
    solver.typ = 'sgd'
    solver.momentum = 0.99
    solver.momentum2 = None
    solver.learning_rate_decay = 0.1
  else:
    logging.fatal('Unknown solver_vars.typ: %s', typ)
    assert(False)

  logging.error('solver: %s', solver)
  return solver
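
# For example (hypothetical solver string, not from the original file):
# 'clip2_dlw20_long_adam2_rlw1_isdk415_aeps1en8_lr1en3' decodes to
# clip_gradient_norm=2.0, data_loss_wt=20.0, 120k steps with decay every 40k,
# Adam with learning_rate_decay=0.1, reg_loss_wt=1.0, isd_k=415.0,
# adam_eps=1e-8 and initial_learning_rate=1e-3 ('x' encodes '.', 'n' encodes '-').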
def get_navtask_vars(navtask_str):
  if navtask_str == '':
    vals = []
  else:
    vals = navtask_str.split('_')

  ks_all = ['dataset_name', 'modality', 'task', 'history', 'max_dist',
            'num_steps', 'step_size', 'n_ori', 'aux_views', 'data_aug']
  ks = ks_all[:len(vals)]

  # All data or not.
  if len(vals) == 0: ks.append('dataset_name'); vals.append('sbpd')
  # modality
  if len(vals) == 1: ks.append('modality'); vals.append('rgb')
  # semantic task?
  if len(vals) == 2: ks.append('task'); vals.append('r2r')
  # number of history frames.
  if len(vals) == 3: ks.append('history'); vals.append('h0')
  # max steps
  if len(vals) == 4: ks.append('max_dist'); vals.append('32')
  # num steps
  if len(vals) == 5: ks.append('num_steps'); vals.append('40')
  # step size
  if len(vals) == 6: ks.append('step_size'); vals.append('8')
  # n_ori
  if len(vals) == 7: ks.append('n_ori'); vals.append('4')
  # Auxiliary views.
  if len(vals) == 8: ks.append('aux_views'); vals.append('nv0')
  # Normal data augmentation as opposed to structured data augmentation (if
  # set to straug).
  if len(vals) == 9: ks.append('data_aug'); vals.append('straug')

  assert(len(vals) == 10)
  for i in range(len(ks)):
    assert(ks[i] == ks_all[i])

  vars = utils.Foo()
  for k, v in zip(ks, vals):
    setattr(vars, k, v)
  logging.error('navtask_vars: %s', vals)
  return vars
def process_navtask_str(navtask_str):
  navtask = nec.nav_env_base_config()

  # Clobber with overrides from strings.
  navtask_vars = get_navtask_vars(navtask_str)

  navtask.task_params.n_ori = int(navtask_vars.n_ori)
  navtask.task_params.max_dist = int(navtask_vars.max_dist)
  navtask.task_params.num_steps = int(navtask_vars.num_steps)
  navtask.task_params.step_size = int(navtask_vars.step_size)
  navtask.task_params.data_augment.delta_xy = int(navtask_vars.step_size) / 2.

  n_aux_views_each = int(navtask_vars.aux_views[2])
  aux_delta_thetas = np.concatenate((np.arange(n_aux_views_each) + 1,
                                     -1 - np.arange(n_aux_views_each)))
  aux_delta_thetas = aux_delta_thetas * np.deg2rad(navtask.camera_param.fov)
  navtask.task_params.aux_delta_thetas = aux_delta_thetas

  if navtask_vars.data_aug == 'aug':
    navtask.task_params.data_augment.structured = False
  elif navtask_vars.data_aug == 'straug':
    navtask.task_params.data_augment.structured = True
  else:
    logging.fatal('Unknown navtask_vars.data_aug %s.', navtask_vars.data_aug)
    assert(False)

  navtask.task_params.num_history_frames = int(navtask_vars.history[1:])
  navtask.task_params.n_views = 1 + navtask.task_params.num_history_frames

  navtask.task_params.goal_channels = int(navtask_vars.n_ori)

  if navtask_vars.task == 'hard':
    navtask.task_params.type = 'rng_rejection_sampling_many'
    navtask.task_params.rejection_sampling_M = 2000
    navtask.task_params.min_dist = 10
  elif navtask_vars.task == 'r2r':
    navtask.task_params.type = 'room_to_room_many'
  elif navtask_vars.task == 'ST':
    # Semantic task at hand.
    navtask.task_params.goal_channels = \
        len(navtask.task_params.semantic_task.class_map_names)
    navtask.task_params.rel_goal_loc_dim = \
        len(navtask.task_params.semantic_task.class_map_names)
    navtask.task_params.type = 'to_nearest_obj_acc'
  else:
    logging.fatal('navtask_vars.task should be hard, r2r or ST.')
    assert(False)

  if navtask_vars.modality == 'rgb':
    navtask.camera_param.modalities = ['rgb']
    navtask.camera_param.img_channels = 3
  elif navtask_vars.modality == 'd':
    navtask.camera_param.modalities = ['depth']
    navtask.camera_param.img_channels = 2

  navtask.task_params.img_height = navtask.camera_param.height
  navtask.task_params.img_width = navtask.camera_param.width
  navtask.task_params.modalities = navtask.camera_param.modalities
  navtask.task_params.img_channels = navtask.camera_param.img_channels
  navtask.task_params.img_fov = navtask.camera_param.fov

  navtask.dataset = factory.get_dataset(navtask_vars.dataset_name)
  return navtask
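
# For example (hypothetical navtask string, not from the original file):
# 'sbpd_rgb_r2r_h0_32_40_8_4_nv0_straug' decodes to the sbpd dataset, rgb
# input, the room-to-room task, no history frames, max_dist=32, num_steps=40,
# step_size=8, n_ori=4, no auxiliary views and structured data augmentation.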
research/cognitive_mapping_and_planning/cfgs/config_distill.py
deleted
100644 → 0
View file @
d31aba8a
# Copyright 2016 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
import pprint
import copy
import os
from tensorflow.python.platform import app
from tensorflow.python.platform import flags
import logging
import src.utils as utils
import cfgs.config_common as cc

import tensorflow as tf

rgb_resnet_v2_50_path = 'cache/resnet_v2_50_inception_preprocessed/model.ckpt-5136169'


def get_default_args():
  robot = utils.Foo(radius=15, base=10, height=140, sensor_height=120,
                    camera_elevation_degree=-15)

  camera_param = utils.Foo(width=225, height=225, z_near=0.05, z_far=20.0,
                           fov=60., modalities=['rgb', 'depth'])

  env = utils.Foo(padding=10, resolution=5, num_point_threshold=2,
                  valid_min=-10, valid_max=200, n_samples_per_face=200)

  data_augment = utils.Foo(lr_flip=0, delta_angle=1, delta_xy=4,
                           relight=False, relight_fast=False,
                           structured=False)

  task_params = utils.Foo(num_actions=4, step_size=4, num_steps=0,
                          batch_size=32, room_seed=0, base_class='Building',
                          task='mapping', n_ori=6, data_augment=data_augment,
                          output_transform_to_global_map=False,
                          output_canonical_map=False,
                          output_incremental_transform=False,
                          output_free_space=False, move_type='shortest_path',
                          toy_problem=0)

  buildinger_args = utils.Foo(
      building_names=['area1_gates_wingA_floor1_westpart'],
      env_class=None, robot=robot, task_params=task_params, env=env,
      camera_param=camera_param)

  solver_args = utils.Foo(seed=0, learning_rate_decay=0.1,
                          clip_gradient_norm=0, max_steps=120000,
                          initial_learning_rate=0.001, momentum=0.99,
                          steps_per_decay=40000, logdir=None, sync=False,
                          adjust_lr_sync=True, wt_decay=0.0001,
                          data_loss_wt=1.0, reg_loss_wt=1.0, num_workers=1,
                          task=0, ps_tasks=0, master='local')

  summary_args = utils.Foo(display_interval=1, test_iters=100)

  control_args = utils.Foo(train=False, test=False,
                           force_batchnorm_is_training_at_test=False)

  arch_args = utils.Foo(rgb_encoder='resnet_v2_50', d_encoder='resnet_v2_50')

  return utils.Foo(solver=solver_args, summary=summary_args,
                   control=control_args, arch=arch_args,
                   buildinger=buildinger_args)


def get_vars(config_name):
  vars = config_name.split('_')
  if len(vars) == 1:  # All data or not.
    vars.append('noall')
  if len(vars) == 2:  # n_ori
    vars.append('4')
  logging.error('vars: %s', vars)
  return vars


def get_args_for_config(config_name):
  args = get_default_args()
  config_name, mode = config_name.split('+')
  vars = get_vars(config_name)

  logging.info('config_name: %s, mode: %s', config_name, mode)

  args.buildinger.task_params.n_ori = int(vars[2])
  args.solver.freeze_conv = True
  args.solver.pretrained_path = rgb_resnet_v2_50_path
  args.buildinger.task_params.img_channels = 5
  args.solver.data_loss_wt = 0.00001

  if vars[0] == 'v0':
    pass
  else:
    logging.error('config_name: %s undefined', config_name)

  args.buildinger.task_params.height = args.buildinger.camera_param.height
  args.buildinger.task_params.width = args.buildinger.camera_param.width
  args.buildinger.task_params.modalities = \
      args.buildinger.camera_param.modalities

  if vars[1] == 'all':
    args = cc.get_args_for_mode_building_all(args, mode)
  elif vars[1] == 'noall':
    args = cc.get_args_for_mode_building(args, mode)

  # Log the arguments
  logging.error('%s', args)
  return args
research/cognitive_mapping_and_planning/cfgs/config_vision_baseline.py
deleted
100644 → 0
View file @
d31aba8a
# Copyright 2016 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
import pprint
import os
import numpy as np
from tensorflow.python.platform import app
from tensorflow.python.platform import flags
import logging
import src.utils as utils
import cfgs.config_common as cc
import datasets.nav_env_config as nec

import tensorflow as tf

FLAGS = flags.FLAGS

get_solver_vars = cc.get_solver_vars
get_navtask_vars = cc.get_navtask_vars

rgb_resnet_v2_50_path = 'data/init_models/resnet_v2_50/model.ckpt-5136169'
d_resnet_v2_50_path = 'data/init_models/distill_rgb_to_d_resnet_v2_50/model.ckpt-120002'


def get_default_args():
  summary_args = utils.Foo(display_interval=1, test_iters=26,
                           arop_full_summary_iters=14)
  control_args = utils.Foo(train=False, test=False,
                           force_batchnorm_is_training_at_test=False,
                           reset_rng_seed=False, only_eval_when_done=False,
                           test_mode=None)
  return summary_args, control_args


def get_default_baseline_args():
  batch_norm_param = {'center': True, 'scale': True,
                      'activation_fn': tf.nn.relu}
  arch_args = utils.Foo(
      pred_neurons=[], goal_embed_neurons=[], img_embed_neurons=[],
      batch_norm_param=batch_norm_param, dim_reduce_neurons=64,
      combine_type='', encoder='resnet_v2_50', action_sample_type='sample',
      action_sample_combine_type='one_or_other',
      sample_gt_prob_type='inverse_sigmoid_decay',
      dagger_sample_bn_false=True, isd_k=750., use_visit_count=False,
      lstm_output=False, lstm_ego=False, lstm_img=False, fc_dropout=0.0,
      embed_goal_for_state=False, lstm_output_init_state_from_goal=False)
  return arch_args


def get_arch_vars(arch_str):
  if arch_str == '':
    vals = []
  else:
    vals = arch_str.split('_')
  ks = ['ver', 'lstm_dim', 'dropout']

  # Exp Ver
  if len(vals) == 0: vals.append('v0')
  # LSTM dimensions
  if len(vals) == 1: vals.append('lstm2048')
  # Dropout
  if len(vals) == 2: vals.append('noDO')

  assert(len(vals) == 3)

  vars = utils.Foo()
  for k, v in zip(ks, vals):
    setattr(vars, k, v)
  logging.error('arch_vars: %s', vars)
  return vars


def process_arch_str(args, arch_str):
  # This function modifies args.
  args.arch = get_default_baseline_args()
  arch_vars = get_arch_vars(arch_str)

  args.navtask.task_params.outputs.rel_goal_loc = True
  args.navtask.task_params.input_type = 'vision'
  args.navtask.task_params.outputs.images = True

  if args.navtask.camera_param.modalities[0] == 'rgb':
    args.solver.pretrained_path = rgb_resnet_v2_50_path
  elif args.navtask.camera_param.modalities[0] == 'depth':
    args.solver.pretrained_path = d_resnet_v2_50_path
  else:
    logging.fatal('Neither rgb nor d.')

  if arch_vars.dropout == 'DO':
    args.arch.fc_dropout = 0.5

  args.tfcode = 'B'

  exp_ver = arch_vars.ver
  if exp_ver == 'v0':
    # Multiplicative interaction between goal loc and image features.
    args.arch.combine_type = 'multiply'
    args.arch.pred_neurons = [256, 256]
    args.arch.goal_embed_neurons = [64, 8]
    args.arch.img_embed_neurons = [1024, 512, 256 * 8]

  elif exp_ver == 'v1':
    # Additive interaction between goal and image features.
    args.arch.combine_type = 'add'
    args.arch.pred_neurons = [256, 256]
    args.arch.goal_embed_neurons = [64, 256]
    args.arch.img_embed_neurons = [1024, 512, 256]

  elif exp_ver == 'v2':
    # LSTM at the output on top of multiplicative interactions.
    args.arch.combine_type = 'multiply'
    args.arch.goal_embed_neurons = [64, 8]
    args.arch.img_embed_neurons = [1024, 512, 256 * 8]
    args.arch.lstm_output = True
    args.arch.lstm_output_dim = int(arch_vars.lstm_dim[4:])
    args.arch.pred_neurons = [256]  # The other is inside the LSTM.

  elif exp_ver == 'v0blind':
    # LSTM only on the goal location.
    args.arch.combine_type = 'goalonly'
    args.arch.goal_embed_neurons = [64, 256]
    args.arch.img_embed_neurons = [2]  # I don't know what it will do otherwise.
    args.arch.lstm_output = True
    args.arch.lstm_output_dim = 256
    args.arch.pred_neurons = [256]  # The other is inside the LSTM.

  else:
    logging.fatal('exp_ver: %s undefined', exp_ver)
    assert(False)

  # Log the arguments
  logging.error('%s', args)
  return args
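
# For example (hypothetical arch string, not from the original file):
# 'v2_lstm2048_DO' selects the multiplicative-interaction baseline with an
# LSTM at the output (lstm_output_dim=2048) and fc_dropout=0.5.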
def get_args_for_config(config_name):
  args = utils.Foo()

  args.summary, args.control = get_default_args()

  exp_name, mode_str = config_name.split('+')
  arch_str, solver_str, navtask_str = exp_name.split('.')
  logging.error('config_name: %s', config_name)
  logging.error('arch_str: %s', arch_str)
  logging.error('navtask_str: %s', navtask_str)
  logging.error('solver_str: %s', solver_str)
  logging.error('mode_str: %s', mode_str)

  args.solver = cc.process_solver_str(solver_str)
  args.navtask = cc.process_navtask_str(navtask_str)

  args = process_arch_str(args, arch_str)
  args.arch.isd_k = args.solver.isd_k

  # Train, test, etc.
  mode, imset = mode_str.split('_')
  args = cc.adjust_args_for_mode(args, mode)
  args.navtask.building_names = args.navtask.dataset.get_split(imset)
  args.control.test_name = '{:s}_on_{:s}'.format(mode, imset)

  # Log the arguments
  logging.error('%s', args)
  return args
research/cognitive_mapping_and_planning/data/.gitignore
deleted
100644 → 0
View file @
d31aba8a
stanford_building_parser_dataset_raw
stanford_building_parser_dataset
init_models
research/cognitive_mapping_and_planning/data/README.md
deleted
100644 → 0
View file @
d31aba8a
This directory contains the data needed for training and benchmarking various
navigation models.

1. Download the data from the
   [dataset website](http://buildingparser.stanford.edu/dataset.html).
   1. [Raw meshes](https://goo.gl/forms/2YSPaO2UKmn5Td5m2). We need the meshes
      which are in the noXYZ folder. Download the tar files and place them in
      the `stanford_building_parser_dataset_raw` folder. You need to download
      `area_1_noXYZ.tar`, `area_3_noXYZ.tar`, `area_5a_noXYZ.tar`,
      `area_5b_noXYZ.tar`, `area_6_noXYZ.tar` for training and
      `area_4_noXYZ.tar` for evaluation.
   2. [Annotations](https://goo.gl/forms/4SoGp4KtH1jfRqEj2) for setting up
      tasks. We will need the file called `Stanford3dDataset_v1.2.zip`. Place
      the file in the directory `stanford_building_parser_dataset_raw`.
2. Preprocess the data.
   1. Extract meshes using `scripts/script_preprocess_meshes_S3DIS.sh`. After
      this, `ls data/stanford_building_parser_dataset/mesh` should show 6
      folders `area1`, `area3`, `area4`, `area5a`, `area5b`, `area6`, with
      textures and obj files within each directory.
   2. Extract room information and semantics from the zip file using
      `scripts/script_preprocess_annoations_S3DIS.sh`. After this there should
      be `room-dimension` and `class-maps` folders in
      `data/stanford_building_parser_dataset`. (If you find this script
      crashing because of an exception in np.loadtxt while processing
      `Area_5/office_19/Annotations/ceiling_1.txt`, there is a special
      character on line 323474 that should be removed manually.)
3. Download ImageNet pre-trained models. We used ResNet-v2-50 for representing
   images. For RGB images this is pre-trained on ImageNet. For depth images we
   [distill](https://arxiv.org/abs/1507.00448) the RGB model to depth images
   using paired RGB-D images. Both these models are available through
   `scripts/script_download_init_models.sh`.
research/cognitive_mapping_and_planning/datasets/__init__.py
deleted
100644 → 0
View file @
d31aba8a