ModelZoo / ResNet50_tensorflow · Commit 5b9d9097
Authored Apr 19, 2017 by Saurabh Gupta
Implementation for Cognitive Mapping and Planning paper.
Parent: c136af63

Showing 20 of 51 changed files, with 2763 additions and 0 deletions (+2763, −0).
- cognitive_mapping_and_planning/.gitignore (+4, −0)
- cognitive_mapping_and_planning/README.md (+122, −0)
- cognitive_mapping_and_planning/__init__.py (+0, −0)
- cognitive_mapping_and_planning/cfgs/__init__.py (+0, −0)
- cognitive_mapping_and_planning/cfgs/config_cmp.py (+283, −0)
- cognitive_mapping_and_planning/cfgs/config_common.py (+261, −0)
- cognitive_mapping_and_planning/cfgs/config_distill.py (+114, −0)
- cognitive_mapping_and_planning/cfgs/config_vision_baseline.py (+173, −0)
- cognitive_mapping_and_planning/data/.gitignore (+3, −0)
- cognitive_mapping_and_planning/data/README.md (+33, −0)
- cognitive_mapping_and_planning/datasets/__init__.py (+0, −0)
- cognitive_mapping_and_planning/datasets/factory.py (+113, −0)
- cognitive_mapping_and_planning/datasets/nav_env.py (+1465, −0)
- cognitive_mapping_and_planning/datasets/nav_env_config.py (+127, −0)
- cognitive_mapping_and_planning/matplotlibrc (+1, −0)
- cognitive_mapping_and_planning/output/.gitignore (+1, −0)
- cognitive_mapping_and_planning/output/README.md (+16, −0)
- cognitive_mapping_and_planning/patches/GLES2_2_0.py.patch (+14, −0)
- cognitive_mapping_and_planning/patches/apply_patches.sh (+18, −0)
- cognitive_mapping_and_planning/patches/ctypesloader.py.patch (+15, −0)
cognitive_mapping_and_planning/.gitignore (new file, 0 → 100644)

```
deps
*.pyc
lib*.so
lib*.so*
```
cognitive_mapping_and_planning/README.md (new file, 0 → 100644)

# Cognitive Mapping and Planning for Visual Navigation

**Saurabh Gupta, James Davidson, Sergey Levine, Rahul Sukthankar, Jitendra Malik**

**Computer Vision and Pattern Recognition (CVPR) 2017.**

**[ArXiv](https://arxiv.org/abs/1702.03920), [Project Website](https://sites.google.com/corp/view/cognitive-mapping-and-planning/)**

### Citing
If you find this code base and models useful in your research, please consider
citing the following paper:

```
@inproceedings{gupta2017cognitive,
  title={Cognitive Mapping and Planning for Visual Navigation},
  author={Gupta, Saurabh and Davidson, James and Levine, Sergey and
          Sukthankar, Rahul and Malik, Jitendra},
  booktitle={CVPR},
  year={2017}
}
```

### Contents
1. [Requirements: software](#requirements-software)
2. [Requirements: data](#requirements-data)
3. [Test Pre-trained Models](#test-pre-trained-models)
4. [Train Your Own Models](#train-your-own-models)

### Requirements: software
1. Python virtual env setup: All code is implemented in Python but depends on a
small number of Python packages and a couple of C libraries. We recommend
using a virtual environment for installing these Python packages and the
Python bindings for these C libraries.

  ```Shell
  VENV_DIR=venv
  pip install virtualenv
  virtualenv $VENV_DIR
  source $VENV_DIR/bin/activate

  # You may need to upgrade pip for installing opencv-python.
  pip install --upgrade pip

  # Install simple dependencies.
  pip install -r requirements.txt

  # Patch bugs in dependencies.
  sh patches/apply_patches.sh
  ```

2. Install [TensorFlow](https://www.tensorflow.org/) inside this virtual
environment, typically with `pip install --upgrade tensorflow-gpu`.

3. Swiftshader: We use [Swiftshader](https://github.com/google/swiftshader.git),
a CPU-based renderer, to render the meshes. It is possible to use other
renderers; to do so, replace `SwiftshaderRenderer` in
`render/swiftshader_renderer.py` with bindings to your renderer.

  ```Shell
  mkdir -p deps
  git clone --recursive https://github.com/google/swiftshader.git deps/swiftshader-src
  cd deps/swiftshader-src && git checkout 91da6b00584afd7dcaed66da88e2b617429b3950
  mkdir build && cd build && cmake .. && make -j 16 libEGL libGLESv2
  cd ../../../
  cp deps/swiftshader-src/build/libEGL* libEGL.so.1
  cp deps/swiftshader-src/build/libGLESv2* libGLESv2.so.2
  ```

4. PyAssimp: We use [PyAssimp](https://github.com/assimp/assimp.git) to load
meshes. It is possible to use other libraries to load meshes; to do so,
replace `Shape` in `render/swiftshader_renderer.py` with bindings to your
mesh-loading library.

  ```Shell
  mkdir -p deps
  git clone https://github.com/assimp/assimp.git deps/assimp-src
  cd deps/assimp-src
  git checkout 2afeddd5cb63d14bc77b53740b38a54a97d94ee8
  cmake CMakeLists.txt -G 'Unix Makefiles' && make -j 16
  cd port/PyAssimp && python setup.py install
  cd ../../../..
  cp deps/assimp-src/lib/libassimp* .
  ```

5. graph-tool: We use the [graph-tool](https://git.skewed.de/count0/graph-tool)
library for graph processing.

  ```Shell
  mkdir -p deps
  # If the following git clone command fails, you can also download the source
  # from https://downloads.skewed.de/graph-tool/graph-tool-2.2.44.tar.bz2
  git clone https://git.skewed.de/count0/graph-tool deps/graph-tool-src
  cd deps/graph-tool-src && git checkout 178add3a571feb6666f4f119027705d95d2951ab
  bash autogen.sh
  ./configure --disable-cairo --disable-sparsehash --prefix=$HOME/.local
  make -j 16
  make install
  cd ../../
  ```

### Requirements: data
1. Download the Stanford 3D Indoor Spaces Dataset (S3DIS Dataset) and the
ImageNet pre-trained models used for initializing the different models. Follow
the instructions in `data/README.md`.

### Test Pre-trained Models
1. Download pre-trained models using
`scripts/scripts_download_pretrained_models.sh`.
2. Test models using `scripts/script_test_pretrained_models.sh`.

### Train Your Own Models
All models were trained asynchronously with 16 workers, each worker using data
from a single floor. The default hyper-parameters correspond to this setting.
See [distributed training with
TensorFlow](https://www.tensorflow.org/deploy/distributed) for setting up
distributed training. Training with a single worker is possible with the
current code base but will require some minor changes to allow each worker to
load all training environments. A minimal sketch of the cluster setup is shown
below.
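The launcher scripts are not part of this commit, so the following is only a
minimal sketch of how a 16-worker asynchronous setup could be wired together
with TensorFlow's distributed runtime; the host:port pairs, the task index,
and the model-building stub are placeholders, not this repository's actual
entry point.

```python
# Hypothetical sketch: one parameter server plus 16 asynchronous workers.
import tensorflow as tf

cluster = tf.train.ClusterSpec({
    'ps': ['ps0.example.com:2222'],
    'worker': ['worker%d.example.com:2222' % i for i in range(16)],
})

# Each process is launched with its own job_name/task_index, e.g.
#   python trainer.py --job_name=worker --task_index=3
server = tf.train.Server(cluster, job_name='worker', task_index=3)

with tf.device(tf.train.replica_device_setter(
    worker_device='/job:worker/task:3', cluster=cluster)):
  # Build the model here; each worker would load data from a single floor
  # and apply its gradients asynchronously to the shared parameters.
  pass
```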
### Contact
For questions or issues open an issue on the tensorflow/models
[issues tracker](https://github.com/tensorflow/models/issues). Please assign
issues to @s-gupta.

### Credits
This code was written by Saurabh Gupta (@s-gupta).
cognitive_mapping_and_planning/__init__.py (new file, 0 → 100644; empty)

cognitive_mapping_and_planning/cfgs/__init__.py (new file, 0 → 100644; empty)
cognitive_mapping_and_planning/cfgs/config_cmp.py (new file, 0 → 100644)

```python
# Copyright 2016 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

import os, sys
import numpy as np
from tensorflow.python.platform import app
from tensorflow.python.platform import flags
import logging
import src.utils as utils
import cfgs.config_common as cc

import tensorflow as tf

rgb_resnet_v2_50_path = 'data/init_models/resnet_v2_50/model.ckpt-5136169'
d_resnet_v2_50_path = 'data/init_models/distill_rgb_to_d_resnet_v2_50/model.ckpt-120002'

def get_default_args():
  summary_args = utils.Foo(display_interval=1, test_iters=26,
                           arop_full_summary_iters=14)
  control_args = utils.Foo(train=False, test=False,
                           force_batchnorm_is_training_at_test=False,
                           reset_rng_seed=False, only_eval_when_done=False,
                           test_mode=None)
  return summary_args, control_args

def get_default_cmp_args():
  batch_norm_param = {'center': True, 'scale': True,
                      'activation_fn': tf.nn.relu}

  mapper_arch_args = utils.Foo(
      dim_reduce_neurons=64,
      fc_neurons=[1024, 1024],
      fc_out_size=8,
      fc_out_neurons=64,
      encoder='resnet_v2_50',
      deconv_neurons=[64, 32, 16, 8, 4, 2],
      deconv_strides=[2, 2, 2, 2, 2, 2],
      deconv_layers_per_block=2,
      deconv_kernel_size=4,
      fc_dropout=0.5,
      combine_type='wt_avg_logits',
      batch_norm_param=batch_norm_param)

  readout_maps_arch_args = utils.Foo(
      num_neurons=[],
      strides=[],
      kernel_size=None,
      layers_per_block=None)

  arch_args = utils.Foo(
      vin_val_neurons=8, vin_action_neurons=8, vin_ks=3, vin_share_wts=False,
      pred_neurons=[64, 64], pred_batch_norm_param=batch_norm_param,
      conv_on_value_map=0, fr_neurons=16, fr_ver='v2', fr_inside_neurons=64,
      fr_stride=1, crop_remove_each=30, value_crop_size=4,
      action_sample_type='sample', action_sample_combine_type='one_or_other',
      sample_gt_prob_type='inverse_sigmoid_decay', dagger_sample_bn_false=True,
      vin_num_iters=36, isd_k=750., use_agent_loc=False, multi_scale=True,
      readout_maps=False, rom_arch=readout_maps_arch_args)

  return arch_args, mapper_arch_args

def get_arch_vars(arch_str):
  if arch_str == '':
    vals = []
  else:
    vals = arch_str.split('_')
  ks = ['var1', 'var2', 'var3']
  ks = ks[:len(vals)]

  # Exp Ver.
  if len(vals) == 0: ks.append('var1'); vals.append('v0')
  # custom arch.
  if len(vals) == 1: ks.append('var2'); vals.append('')
  # map scale for projection baseline.
  if len(vals) == 2: ks.append('var3'); vals.append('fr2')

  assert(len(vals) == 3)

  vars = utils.Foo()
  for k, v in zip(ks, vals):
    setattr(vars, k, v)
  logging.error('arch_vars: %s', vars)
  return vars

def process_arch_str(args, arch_str):
  # This function modifies args.
  args.arch, args.mapper_arch = get_default_cmp_args()

  arch_vars = get_arch_vars(arch_str)

  args.navtask.task_params.outputs.ego_maps = True
  args.navtask.task_params.outputs.ego_goal_imgs = True
  args.navtask.task_params.outputs.egomotion = True
  args.navtask.task_params.toy_problem = False

  if arch_vars.var1 == 'lmap':
    args = process_arch_learned_map(args, arch_vars)
  elif arch_vars.var1 == 'pmap':
    args = process_arch_projected_map(args, arch_vars)
  else:
    logging.fatal('arch_vars.var1 should be lmap or pmap, but is %s',
                  arch_vars.var1)
    assert(False)

  return args

def process_arch_learned_map(args, arch_vars):
  # Multiscale vision based system.
  args.navtask.task_params.input_type = 'vision'
  args.navtask.task_params.outputs.images = True

  if args.navtask.camera_param.modalities[0] == 'rgb':
    args.solver.pretrained_path = rgb_resnet_v2_50_path
  elif args.navtask.camera_param.modalities[0] == 'depth':
    args.solver.pretrained_path = d_resnet_v2_50_path

  if arch_vars.var2 == 'Ssc':
    sc = 1. / args.navtask.task_params.step_size
    args.arch.vin_num_iters = 40
    args.navtask.task_params.map_scales = [sc]
    max_dist = args.navtask.task_params.max_dist * \
        args.navtask.task_params.num_goals
    args.navtask.task_params.map_crop_sizes = [2 * max_dist]

    args.arch.fr_stride = 1
    args.arch.vin_action_neurons = 8
    args.arch.vin_val_neurons = 3
    args.arch.fr_inside_neurons = 32

    args.mapper_arch.pad_map_with_zeros_each = [24]
    args.mapper_arch.deconv_neurons = [64, 32, 16]
    args.mapper_arch.deconv_strides = [1, 2, 1]

  elif (arch_vars.var2 == 'Msc' or arch_vars.var2 == 'MscROMms' or
        arch_vars.var2 == 'MscROMss' or arch_vars.var2 == 'MscNoVin'):
    # Code for multi-scale planner.
    args.arch.vin_num_iters = 8
    args.arch.crop_remove_each = 4
    args.arch.value_crop_size = 8

    sc = 1. / args.navtask.task_params.step_size
    max_dist = args.navtask.task_params.max_dist * \
        args.navtask.task_params.num_goals
    n_scales = np.log2(float(max_dist) / float(args.arch.vin_num_iters))
    n_scales = int(np.ceil(n_scales) + 1)

    args.navtask.task_params.map_scales = \
        list(sc * (0.5 ** (np.arange(n_scales))[::-1]))
    args.navtask.task_params.map_crop_sizes = [16 for x in range(n_scales)]

    args.arch.fr_stride = 1
    args.arch.vin_action_neurons = 8
    args.arch.vin_val_neurons = 3
    args.arch.fr_inside_neurons = 32

    args.mapper_arch.pad_map_with_zeros_each = [0 for _ in range(n_scales)]
    args.mapper_arch.deconv_neurons = [64 * n_scales, 32 * n_scales,
                                       16 * n_scales]
    args.mapper_arch.deconv_strides = [1, 2, 1]

    if arch_vars.var2 == 'MscNoVin':
      # No planning version.
      args.arch.fr_stride = [1, 2, 1, 2]
      args.arch.vin_action_neurons = None
      args.arch.vin_val_neurons = 16
      args.arch.fr_inside_neurons = 32

      args.arch.crop_remove_each = 0
      args.arch.value_crop_size = 4
      args.arch.vin_num_iters = 0

    elif arch_vars.var2 == 'MscROMms' or arch_vars.var2 == 'MscROMss':
      # Code with read outs, MscROMms flattens and reads out,
      # MscROMss does not flatten and produces output at multiple scales.
      args.navtask.task_params.outputs.readout_maps = True
      args.navtask.task_params.map_resize_method = 'antialiasing'
      args.arch.readout_maps = True

      if arch_vars.var2 == 'MscROMms':
        args.arch.rom_arch.num_neurons = [64, 1]
        args.arch.rom_arch.kernel_size = 4
        args.arch.rom_arch.strides = [2, 2]
        args.arch.rom_arch.layers_per_block = 2

        args.navtask.task_params.readout_maps_crop_sizes = [64]
        args.navtask.task_params.readout_maps_scales = [sc]

      elif arch_vars.var2 == 'MscROMss':
        args.arch.rom_arch.num_neurons = \
            [64, len(args.navtask.task_params.map_scales)]
        args.arch.rom_arch.kernel_size = 4
        args.arch.rom_arch.strides = [1, 1]
        args.arch.rom_arch.layers_per_block = 1

        args.navtask.task_params.readout_maps_crop_sizes = \
            args.navtask.task_params.map_crop_sizes
        args.navtask.task_params.readout_maps_scales = \
            args.navtask.task_params.map_scales

  else:
    logging.fatal('arch_vars.var2 not one of Msc, MscROMms, MscROMss, MscNoVin.')
    assert(False)

  map_channels = args.mapper_arch.deconv_neurons[-1] / \
      (2 * len(args.navtask.task_params.map_scales))
  args.navtask.task_params.map_channels = map_channels

  return args

def process_arch_projected_map(args, arch_vars):
  # Single scale vision based system which does not use a mapper but instead
  # uses an analytically estimated map.
  ds = int(arch_vars.var3[2])
  args.navtask.task_params.input_type = 'analytical_counts'
  args.navtask.task_params.outputs.analytical_counts = True

  assert(args.navtask.task_params.modalities[0] == 'depth')
  args.navtask.camera_param.img_channels = None

  analytical_counts = utils.Foo(map_sizes=[512 / ds],
                                xy_resolution=[5. * ds],
                                z_bins=[[-10, 10, 150, 200]],
                                non_linearity=[arch_vars.var2])
  args.navtask.task_params.analytical_counts = analytical_counts

  sc = 1. / ds
  args.arch.vin_num_iters = 36
  args.navtask.task_params.map_scales = [sc]
  args.navtask.task_params.map_crop_sizes = [512 / ds]

  args.arch.fr_stride = [1, 2]
  args.arch.vin_action_neurons = 8
  args.arch.vin_val_neurons = 3
  args.arch.fr_inside_neurons = 32

  map_channels = len(analytical_counts.z_bins[0]) + 1
  args.navtask.task_params.map_channels = map_channels
  args.solver.freeze_conv = False

  return args

def get_args_for_config(config_name):
  args = utils.Foo()

  args.summary, args.control = get_default_args()

  exp_name, mode_str = config_name.split('+')
  arch_str, solver_str, navtask_str = exp_name.split('.')
  logging.error('config_name: %s', config_name)
  logging.error('arch_str: %s', arch_str)
  logging.error('navtask_str: %s', navtask_str)
  logging.error('solver_str: %s', solver_str)
  logging.error('mode_str: %s', mode_str)

  args.solver = cc.process_solver_str(solver_str)
  args.navtask = cc.process_navtask_str(navtask_str)

  args = process_arch_str(args, arch_str)
  args.arch.isd_k = args.solver.isd_k

  # Train, test, etc.
  mode, imset = mode_str.split('_')
  args = cc.adjust_args_for_mode(args, mode)
  args.navtask.building_names = args.navtask.dataset.get_split(imset)
  args.control.test_name = '{:s}_on_{:s}'.format(mode, imset)

  # Log the arguments
  logging.error('%s', args)
  return args
```
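As a worked illustration (not part of the file): `get_args_for_config` above
decomposes a config name of the form `arch.solver.navtask+mode_imset`. The
`cmp.` prefix seen on the pre-trained checkpoint names in `output/README.md`
is assumed to be consumed by the experiment launcher (not included in this
commit) before the name reaches this function.

```python
# Hypothetical walk-through of the config-name convention parsed above.
config_name = 'lmap_Msc.clip5.sbpd_d_r2r+train_train'

exp_name, mode_str = config_name.split('+')
arch_str, solver_str, navtask_str = exp_name.split('.')
# arch_str    == 'lmap_Msc'   -> var1='lmap' (learned map), var2='Msc'
#                                (multi-scale planner), var3 defaults to 'fr2'
# solver_str  == 'clip5'      -> gradient clipping at norm 5; all other solver
#                                knobs fall back to their defaults
# navtask_str == 'sbpd_d_r2r' -> S3DIS dataset, depth modality, room-to-room task

mode, imset = mode_str.split('_')
# mode == 'train' selects training behavior in adjust_args_for_mode;
# imset == 'train' selects the building split via dataset.get_split('train').
```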
cognitive_mapping_and_planning/cfgs/config_common.py (new file, 0 → 100644)

```python
# Copyright 2016 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

import os
import numpy as np
import logging
import src.utils as utils
import datasets.nav_env_config as nec
from datasets import factory

def adjust_args_for_mode(args, mode):
  if mode == 'train':
    args.control.train = True

  elif mode == 'val1':
    # Same settings as for training, to make sure nothing wonky is happening
    # there.
    args.control.test = True
    args.control.test_mode = 'val'
    args.navtask.task_params.batch_size = 32

  elif mode == 'val2':
    # No data augmentation, not sampling but taking the argmax action, not
    # sampling from the ground truth at all.
    args.control.test = True
    args.arch.action_sample_type = 'argmax'
    args.arch.sample_gt_prob_type = 'zero'
    args.navtask.task_params.data_augment = \
        utils.Foo(lr_flip=0, delta_angle=0, delta_xy=0, relight=False,
                  relight_fast=False, structured=False)
    args.control.test_mode = 'val'
    args.navtask.task_params.batch_size = 32

  elif mode == 'bench':
    # Actually testing the agent in settings that are kept same between
    # different runs.
    args.navtask.task_params.batch_size = 16
    args.control.test = True
    args.arch.action_sample_type = 'argmax'
    args.arch.sample_gt_prob_type = 'zero'
    args.navtask.task_params.data_augment = \
        utils.Foo(lr_flip=0, delta_angle=0, delta_xy=0, relight=False,
                  relight_fast=False, structured=False)
    args.summary.test_iters = 250
    args.control.only_eval_when_done = True
    args.control.reset_rng_seed = True
    args.control.test_mode = 'test'
  else:
    logging.fatal('Unknown mode: %s.', mode)
    assert(False)
  return args

def get_solver_vars(solver_str):
  if solver_str == '':
    vals = []
  else:
    vals = solver_str.split('_')
  ks = ['clip', 'dlw', 'long', 'typ', 'isdk', 'adam_eps', 'init_lr']
  ks = ks[:len(vals)]

  # Gradient clipping or not.
  if len(vals) == 0: ks.append('clip'); vals.append('noclip')
  # data loss weight.
  if len(vals) == 1: ks.append('dlw'); vals.append('dlw20')
  # how long to train for.
  if len(vals) == 2: ks.append('long'); vals.append('nolong')
  # Adam
  if len(vals) == 3: ks.append('typ'); vals.append('adam2')
  # reg loss wt
  if len(vals) == 4: ks.append('rlw'); vals.append('rlw1')
  # isd_k
  if len(vals) == 5: ks.append('isdk'); vals.append('isdk415')  # 415, inflexion at 2.5k.
  # adam eps
  if len(vals) == 6: ks.append('adam_eps'); vals.append('aeps1en8')
  # init lr
  if len(vals) == 7: ks.append('init_lr'); vals.append('lr1en3')

  assert(len(vals) == 8)

  vars = utils.Foo()
  for k, v in zip(ks, vals):
    setattr(vars, k, v)
  logging.error('solver_vars: %s', vars)
  return vars

def process_solver_str(solver_str):
  solver = utils.Foo(
      seed=0, learning_rate_decay=None, clip_gradient_norm=None,
      max_steps=None, initial_learning_rate=None, momentum=None,
      steps_per_decay=None, logdir=None, sync=False, adjust_lr_sync=True,
      wt_decay=0.0001, data_loss_wt=None, reg_loss_wt=None, freeze_conv=True,
      num_workers=1, task=0, ps_tasks=0, master='local', typ=None,
      momentum2=None, adam_eps=None)

  # Clobber with overrides from solver str.
  solver_vars = get_solver_vars(solver_str)
  solver.data_loss_wt = float(solver_vars.dlw[3:].replace('x', '.'))
  solver.adam_eps = float(solver_vars.adam_eps[4:].replace('x', '.')
                          .replace('n', '-'))
  solver.initial_learning_rate = float(solver_vars.init_lr[2:]
                                       .replace('x', '.').replace('n', '-'))
  solver.reg_loss_wt = float(solver_vars.rlw[3:].replace('x', '.'))
  solver.isd_k = float(solver_vars.isdk[4:].replace('x', '.'))

  long = solver_vars.long
  if long == 'long':
    solver.steps_per_decay = 40000
    solver.max_steps = 120000
  elif long == 'long2':
    solver.steps_per_decay = 80000
    solver.max_steps = 120000
  elif long == 'nolong' or long == 'nol':
    solver.steps_per_decay = 20000
    solver.max_steps = 60000
  else:
    logging.fatal('solver_vars.long should be long, long2, nolong or nol.')
    assert(False)

  clip = solver_vars.clip
  if clip == 'noclip' or clip == 'nocl':
    solver.clip_gradient_norm = 0
  elif clip[:4] == 'clip':
    solver.clip_gradient_norm = float(clip[4:].replace('x', '.'))
  else:
    logging.fatal('Unknown solver_vars.clip: %s', clip)
    assert(False)

  typ = solver_vars.typ
  if typ == 'adam':
    solver.typ = 'adam'
    solver.momentum = 0.9
    solver.momentum2 = 0.999
    solver.learning_rate_decay = 1.0
  elif typ == 'adam2':
    solver.typ = 'adam'
    solver.momentum = 0.9
    solver.momentum2 = 0.999
    solver.learning_rate_decay = 0.1
  elif typ == 'sgd':
    solver.typ = 'sgd'
    solver.momentum = 0.99
    solver.momentum2 = None
    solver.learning_rate_decay = 0.1
  else:
    logging.fatal('Unknown solver_vars.typ: %s', typ)
    assert(False)

  logging.error('solver: %s', solver)
  return solver

def get_navtask_vars(navtask_str):
  if navtask_str == '':
    vals = []
  else:
    vals = navtask_str.split('_')

  ks_all = ['dataset_name', 'modality', 'task', 'history', 'max_dist',
            'num_steps', 'step_size', 'n_ori', 'aux_views', 'data_aug']
  ks = ks_all[:len(vals)]

  # All data or not.
  if len(vals) == 0: ks.append('dataset_name'); vals.append('sbpd')
  # modality
  if len(vals) == 1: ks.append('modality'); vals.append('rgb')
  # semantic task?
  if len(vals) == 2: ks.append('task'); vals.append('r2r')
  # number of history frames.
  if len(vals) == 3: ks.append('history'); vals.append('h0')
  # max steps
  if len(vals) == 4: ks.append('max_dist'); vals.append('32')
  # num steps
  if len(vals) == 5: ks.append('num_steps'); vals.append('40')
  # step size
  if len(vals) == 6: ks.append('step_size'); vals.append('8')
  # n_ori
  if len(vals) == 7: ks.append('n_ori'); vals.append('4')
  # Auxiliary views.
  if len(vals) == 8: ks.append('aux_views'); vals.append('nv0')
  # Normal data augmentation as opposed to structured data augmentation (if set
  # to straug).
  if len(vals) == 9: ks.append('data_aug'); vals.append('straug')

  assert(len(vals) == 10)
  for i in range(len(ks)):
    assert(ks[i] == ks_all[i])

  vars = utils.Foo()
  for k, v in zip(ks, vals):
    setattr(vars, k, v)
  logging.error('navtask_vars: %s', vals)
  return vars

def process_navtask_str(navtask_str):
  navtask = nec.nav_env_base_config()

  # Clobber with overrides from strings.
  navtask_vars = get_navtask_vars(navtask_str)

  navtask.task_params.n_ori = int(navtask_vars.n_ori)
  navtask.task_params.max_dist = int(navtask_vars.max_dist)
  navtask.task_params.num_steps = int(navtask_vars.num_steps)
  navtask.task_params.step_size = int(navtask_vars.step_size)
  navtask.task_params.data_augment.delta_xy = int(navtask_vars.step_size) / 2.

  n_aux_views_each = int(navtask_vars.aux_views[2])
  aux_delta_thetas = np.concatenate((np.arange(n_aux_views_each) + 1,
                                     -1 - np.arange(n_aux_views_each)))
  aux_delta_thetas = aux_delta_thetas * np.deg2rad(navtask.camera_param.fov)
  navtask.task_params.aux_delta_thetas = aux_delta_thetas

  if navtask_vars.data_aug == 'aug':
    navtask.task_params.data_augment.structured = False
  elif navtask_vars.data_aug == 'straug':
    navtask.task_params.data_augment.structured = True
  else:
    logging.fatal('Unknown navtask_vars.data_aug %s.', navtask_vars.data_aug)
    assert(False)

  navtask.task_params.num_history_frames = int(navtask_vars.history[1:])
  navtask.task_params.n_views = 1 + navtask.task_params.num_history_frames

  navtask.task_params.goal_channels = int(navtask_vars.n_ori)

  if navtask_vars.task == 'hard':
    navtask.task_params.type = 'rng_rejection_sampling_many'
    navtask.task_params.rejection_sampling_M = 2000
    navtask.task_params.min_dist = 10
  elif navtask_vars.task == 'r2r':
    navtask.task_params.type = 'room_to_room_many'
  elif navtask_vars.task == 'ST':
    # Semantic task at hand.
    navtask.task_params.goal_channels = \
        len(navtask.task_params.semantic_task.class_map_names)
    navtask.task_params.rel_goal_loc_dim = \
        len(navtask.task_params.semantic_task.class_map_names)
    navtask.task_params.type = 'to_nearest_obj_acc'
  else:
    logging.fatal('navtask_vars.task: should be hard or r2r, ST')
    assert(False)

  if navtask_vars.modality == 'rgb':
    navtask.camera_param.modalities = ['rgb']
    navtask.camera_param.img_channels = 3
  elif navtask_vars.modality == 'd':
    navtask.camera_param.modalities = ['depth']
    navtask.camera_param.img_channels = 2

  navtask.task_params.img_height = navtask.camera_param.height
  navtask.task_params.img_width = navtask.camera_param.width
  navtask.task_params.modalities = navtask.camera_param.modalities
  navtask.task_params.img_channels = navtask.camera_param.img_channels
  navtask.task_params.img_fov = navtask.camera_param.fov

  navtask.dataset = factory.get_dataset(navtask_vars.dataset_name)
  return navtask
```
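A hedged illustration of the left-to-right defaulting above (comments only,
not part of the file); the values come from the default branches of
`get_solver_vars` and `get_navtask_vars`:

```python
# get_solver_vars('clip5') pads the missing fields with defaults:
#   clip='clip5', dlw='dlw20', long='nolong', typ='adam2',
#   rlw='rlw1', isdk='isdk415', adam_eps='aeps1en8', init_lr='lr1en3'
# process_solver_str then decodes the numeric suffixes ('n' encodes '-'):
#   'dlw20'    -> data_loss_wt = 20.0
#   'lr1en3'   -> initial_learning_rate = 1e-3
#   'aeps1en8' -> adam_eps = 1e-8
#
# Likewise, get_navtask_vars('sbpd_d_r2r') expands to:
#   dataset_name='sbpd', modality='d', task='r2r', history='h0',
#   max_dist='32', num_steps='40', step_size='8', n_ori='4',
#   aux_views='nv0', data_aug='straug'
```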
cognitive_mapping_and_planning/cfgs/config_distill.py (new file, 0 → 100644)

```python
# Copyright 2016 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

import pprint
import copy
import os
from tensorflow.python.platform import app
from tensorflow.python.platform import flags
import logging
import src.utils as utils
import cfgs.config_common as cc

import tensorflow as tf

rgb_resnet_v2_50_path = 'cache/resnet_v2_50_inception_preprocessed/model.ckpt-5136169'

def get_default_args():
  robot = utils.Foo(radius=15, base=10, height=140, sensor_height=120,
                    camera_elevation_degree=-15)

  camera_param = utils.Foo(width=225, height=225, z_near=0.05, z_far=20.0,
                           fov=60., modalities=['rgb', 'depth'])

  env = utils.Foo(padding=10, resolution=5, num_point_threshold=2,
                  valid_min=-10, valid_max=200, n_samples_per_face=200)

  data_augment = utils.Foo(lr_flip=0, delta_angle=1, delta_xy=4,
                           relight=False, relight_fast=False,
                           structured=False)

  task_params = utils.Foo(num_actions=4, step_size=4, num_steps=0,
                          batch_size=32, room_seed=0, base_class='Building',
                          task='mapping', n_ori=6, data_augment=data_augment,
                          output_transform_to_global_map=False,
                          output_canonical_map=False,
                          output_incremental_transform=False,
                          output_free_space=False, move_type='shortest_path',
                          toy_problem=0)

  buildinger_args = utils.Foo(
      building_names=['area1_gates_wingA_floor1_westpart'],
      env_class=None, robot=robot, task_params=task_params,
      env=env, camera_param=camera_param)

  solver_args = utils.Foo(seed=0, learning_rate_decay=0.1,
                          clip_gradient_norm=0, max_steps=120000,
                          initial_learning_rate=0.001, momentum=0.99,
                          steps_per_decay=40000, logdir=None, sync=False,
                          adjust_lr_sync=True, wt_decay=0.0001,
                          data_loss_wt=1.0, reg_loss_wt=1.0,
                          num_workers=1, task=0, ps_tasks=0, master='local')

  summary_args = utils.Foo(display_interval=1, test_iters=100)

  control_args = utils.Foo(train=False, test=False,
                           force_batchnorm_is_training_at_test=False)

  arch_args = utils.Foo(rgb_encoder='resnet_v2_50', d_encoder='resnet_v2_50')

  return utils.Foo(solver=solver_args, summary=summary_args,
                   control=control_args, arch=arch_args,
                   buildinger=buildinger_args)

def get_vars(config_name):
  vars = config_name.split('_')
  if len(vars) == 1:  # All data or not.
    vars.append('noall')
  if len(vars) == 2:  # n_ori
    vars.append('4')
  logging.error('vars: %s', vars)
  return vars

def get_args_for_config(config_name):
  args = get_default_args()
  config_name, mode = config_name.split('+')
  vars = get_vars(config_name)

  logging.info('config_name: %s, mode: %s', config_name, mode)

  args.buildinger.task_params.n_ori = int(vars[2])
  args.solver.freeze_conv = True
  args.solver.pretrained_path = rgb_resnet_v2_50_path
  args.buildinger.task_params.img_channels = 5
  args.solver.data_loss_wt = 0.00001

  if vars[0] == 'v0':
    pass
  else:
    logging.error('config_name: %s undefined', config_name)

  args.buildinger.task_params.height = args.buildinger.camera_param.height
  args.buildinger.task_params.width = args.buildinger.camera_param.width
  args.buildinger.task_params.modalities = \
      args.buildinger.camera_param.modalities

  if vars[1] == 'all':
    args = cc.get_args_for_mode_building_all(args, mode)
  elif vars[1] == 'noall':
    args = cc.get_args_for_mode_building(args, mode)

  # Log the arguments
  logging.error('%s', args)
  return args
```
cognitive_mapping_and_planning/cfgs/config_vision_baseline.py (new file, 0 → 100644)

```python
# Copyright 2016 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

import pprint
import os
import numpy as np
from tensorflow.python.platform import app
from tensorflow.python.platform import flags
import logging
import src.utils as utils
import cfgs.config_common as cc
import datasets.nav_env_config as nec

import tensorflow as tf

FLAGS = flags.FLAGS

get_solver_vars = cc.get_solver_vars
get_navtask_vars = cc.get_navtask_vars

rgb_resnet_v2_50_path = 'data/init_models/resnet_v2_50/model.ckpt-5136169'
d_resnet_v2_50_path = 'data/init_models/distill_rgb_to_d_resnet_v2_50/model.ckpt-120002'

def get_default_args():
  summary_args = utils.Foo(display_interval=1, test_iters=26,
                           arop_full_summary_iters=14)
  control_args = utils.Foo(train=False, test=False,
                           force_batchnorm_is_training_at_test=False,
                           reset_rng_seed=False, only_eval_when_done=False,
                           test_mode=None)
  return summary_args, control_args

def get_default_baseline_args():
  batch_norm_param = {'center': True, 'scale': True,
                      'activation_fn': tf.nn.relu}
  arch_args = utils.Foo(
      pred_neurons=[], goal_embed_neurons=[], img_embed_neurons=[],
      batch_norm_param=batch_norm_param, dim_reduce_neurons=64,
      combine_type='', encoder='resnet_v2_50', action_sample_type='sample',
      action_sample_combine_type='one_or_other',
      sample_gt_prob_type='inverse_sigmoid_decay',
      dagger_sample_bn_false=True, isd_k=750., use_visit_count=False,
      lstm_output=False, lstm_ego=False, lstm_img=False, fc_dropout=0.0,
      embed_goal_for_state=False, lstm_output_init_state_from_goal=False)
  return arch_args

def get_arch_vars(arch_str):
  if arch_str == '':
    vals = []
  else:
    vals = arch_str.split('_')

  ks = ['ver', 'lstm_dim', 'dropout']

  # Exp Ver
  if len(vals) == 0: vals.append('v0')
  # LSTM dimensions
  if len(vals) == 1: vals.append('lstm2048')
  # Dropout
  if len(vals) == 2: vals.append('noDO')

  assert(len(vals) == 3)

  vars = utils.Foo()
  for k, v in zip(ks, vals):
    setattr(vars, k, v)
  logging.error('arch_vars: %s', vars)
  return vars

def process_arch_str(args, arch_str):
  # This function modifies args.
  args.arch = get_default_baseline_args()
  arch_vars = get_arch_vars(arch_str)

  args.navtask.task_params.outputs.rel_goal_loc = True
  args.navtask.task_params.input_type = 'vision'
  args.navtask.task_params.outputs.images = True

  if args.navtask.camera_param.modalities[0] == 'rgb':
    args.solver.pretrained_path = rgb_resnet_v2_50_path
  elif args.navtask.camera_param.modalities[0] == 'depth':
    args.solver.pretrained_path = d_resnet_v2_50_path
  else:
    logging.fatal('Neither of rgb or d')

  if arch_vars.dropout == 'DO':
    args.arch.fc_dropout = 0.5

  args.tfcode = 'B'

  exp_ver = arch_vars.ver
  if exp_ver == 'v0':
    # Multiplicative interaction between goal loc and image features.
    args.arch.combine_type = 'multiply'
    args.arch.pred_neurons = [256, 256]
    args.arch.goal_embed_neurons = [64, 8]
    args.arch.img_embed_neurons = [1024, 512, 256 * 8]

  elif exp_ver == 'v1':
    # Additive interaction between goal and image features.
    args.arch.combine_type = 'add'
    args.arch.pred_neurons = [256, 256]
    args.arch.goal_embed_neurons = [64, 256]
    args.arch.img_embed_neurons = [1024, 512, 256]

  elif exp_ver == 'v2':
    # LSTM at the output on top of multiple interactions.
    args.arch.combine_type = 'multiply'
    args.arch.goal_embed_neurons = [64, 8]
    args.arch.img_embed_neurons = [1024, 512, 256 * 8]
    args.arch.lstm_output = True
    args.arch.lstm_output_dim = int(arch_vars.lstm_dim[4:])
    args.arch.pred_neurons = [256]  # The other is inside the LSTM.

  elif exp_ver == 'v0blind':
    # LSTM only on the goal location.
    args.arch.combine_type = 'goalonly'
    args.arch.goal_embed_neurons = [64, 256]
    args.arch.img_embed_neurons = [2]  # I don't know what it will do otherwise.
    args.arch.lstm_output = True
    args.arch.lstm_output_dim = 256
    args.arch.pred_neurons = [256]  # The other is inside the LSTM.

  else:
    logging.fatal('exp_ver: %s undefined', exp_ver)
    assert(False)

  # Log the arguments
  logging.error('%s', args)
  return args

def get_args_for_config(config_name):
  args = utils.Foo()

  args.summary, args.control = get_default_args()

  exp_name, mode_str = config_name.split('+')
  arch_str, solver_str, navtask_str = exp_name.split('.')
  logging.error('config_name: %s', config_name)
  logging.error('arch_str: %s', arch_str)
  logging.error('navtask_str: %s', navtask_str)
  logging.error('solver_str: %s', solver_str)
  logging.error('mode_str: %s', mode_str)

  args.solver = cc.process_solver_str(solver_str)
  args.navtask = cc.process_navtask_str(navtask_str)

  args = process_arch_str(args, arch_str)
  args.arch.isd_k = args.solver.isd_k

  # Train, test, etc.
  mode, imset = mode_str.split('_')
  args = cc.adjust_args_for_mode(args, mode)
  args.navtask.building_names = args.navtask.dataset.get_split(imset)
  args.control.test_name = '{:s}_on_{:s}'.format(mode, imset)

  # Log the arguments
  logging.error('%s', args)
  return args
```
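A matching illustration for the baseline configs (again assuming the `bl.`
prefix on the checkpoint names in `output/README.md` selects this module):

```python
# get_arch_vars('v2') pads the missing fields with defaults, giving
#   ver='v2', lstm_dim='lstm2048', dropout='noDO'
# so process_arch_str builds the multiplicative-interaction model with an
# output LSTM of dimension int('lstm2048'[4:]) == 2048 and no
# fully-connected dropout.
```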
cognitive_mapping_and_planning/data/.gitignore (new file, 0 → 100644)

```
stanford_building_parser_dataset_raw
stanford_building_parser_dataset
init_models
```
cognitive_mapping_and_planning/data/README.md (new file, 0 → 100644)

This directory contains the data needed for training and benchmarking various
navigation models.

1. Download the data from the
   [dataset website](http://buildingparser.stanford.edu/dataset.html).
   1. [Raw meshes](https://goo.gl/forms/2YSPaO2UKmn5Td5m2). We need the meshes
      which are in the noXYZ folder. Download the tar files and place them in
      the `stanford_building_parser_dataset_raw` folder. You need to download
      `area_1_noXYZ.tar`, `area_3_noXYZ.tar`, `area_5a_noXYZ.tar`,
      `area_5b_noXYZ.tar`, `area_6_noXYZ.tar` for training and
      `area_4_noXYZ.tar` for evaluation.
   2. [Annotations](https://goo.gl/forms/4SoGp4KtH1jfRqEj2) for setting up
      tasks. We will need the file called `Stanford3dDataset_v1.2.zip`. Place
      the file in the directory `stanford_building_parser_dataset_raw`.
2. Preprocess the data.
   1. Extract meshes using `scripts/script_preprocess_meshes_S3DIS.sh`. After
      this, `ls data/stanford_building_parser_dataset/mesh` should show 6
      folders, `area1`, `area3`, `area4`, `area5a`, `area5b`, `area6`, with
      textures and obj files within each directory.
   2. Extract room information and semantics from the zip file using
      `scripts/script_preprocess_annoations_S3DIS.sh`. After this there should
      be `room-dimension` and `class-maps` folders in
      `data/stanford_building_parser_dataset`. (If you find this script
      crashing because of an exception in np.loadtxt while processing
      `Area_5/office_19/Annotations/ceiling_1.txt`, there is a special
      character on line 323474 that should be removed manually.)
3. Download ImageNet pre-trained models. We used ResNet-v2-50 for representing
   images. For RGB images this model is pre-trained on ImageNet. For depth
   images we [distill](https://arxiv.org/abs/1507.00448) the RGB model to
   depth images using paired RGB-D images. Both these models are available
   through `scripts/script_download_init_models.sh`.
cognitive_mapping_and_planning/datasets/__init__.py (new file, 0 → 100644; empty)
cognitive_mapping_and_planning/datasets/factory.py (new file, 0 → 100644)

```python
# Copyright 2016 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

r"""Wrapper for selecting the navigation environment that we want to train and
test on.
"""
import numpy as np
import os, glob
import platform
import logging
from tensorflow.python.platform import app
from tensorflow.python.platform import flags

import render.swiftshader_renderer as renderer
import src.file_utils as fu
import src.utils as utils

def get_dataset(dataset_name):
  if dataset_name == 'sbpd':
    dataset = StanfordBuildingParserDataset(dataset_name)
  else:
    logging.fatal('Not one of sbpd')
  return dataset

class Loader():
  def get_data_dir(self):
    pass

  def get_meta_data(self, file_name, data_dir=None):
    if data_dir is None:
      data_dir = self.get_data_dir()
    full_file_name = os.path.join(data_dir, 'meta', file_name)
    assert(fu.exists(full_file_name)), \
        '{:s} does not exist'.format(full_file_name)
    ext = os.path.splitext(full_file_name)[1]
    if ext == '.txt':
      ls = []
      with fu.fopen(full_file_name, 'r') as f:
        for l in f:
          ls.append(l.rstrip())
    elif ext == '.pkl':
      ls = utils.load_variables(full_file_name)
    return ls

  def load_building(self, name, data_dir=None):
    if data_dir is None:
      data_dir = self.get_data_dir()
    out = {}
    out['name'] = name
    out['data_dir'] = data_dir
    out['room_dimension_file'] = os.path.join(data_dir, 'room-dimension',
                                              name + '.pkl')
    out['class_map_folder'] = os.path.join(data_dir, 'class-maps')
    return out

  def load_building_meshes(self, building):
    dir_name = os.path.join(building['data_dir'], 'mesh', building['name'])
    mesh_file_name = glob.glob1(dir_name, '*.obj')[0]
    mesh_file_name_full = os.path.join(dir_name, mesh_file_name)
    logging.error('Loading building from obj file: %s', mesh_file_name_full)
    shape = renderer.Shape(mesh_file_name_full, load_materials=True,
                           name_prefix=building['name'] + '_')
    return [shape]

class StanfordBuildingParserDataset(Loader):
  def __init__(self, ver):
    self.ver = ver
    self.data_dir = None

  def get_data_dir(self):
    if self.data_dir is None:
      self.data_dir = 'data/stanford_building_parser_dataset/'
    return self.data_dir

  def get_benchmark_sets(self):
    return self._get_benchmark_sets()

  def get_split(self, split_name):
    if self.ver == 'sbpd':
      return self._get_split(split_name)
    else:
      logging.fatal('Unknown version.')

  def _get_benchmark_sets(self):
    sets = ['train1', 'val', 'test']
    return sets

  def _get_split(self, split_name):
    train = ['area1', 'area5a', 'area5b', 'area6']
    train1 = ['area1']
    val = ['area3']
    test = ['area4']
    sets = {}
    sets['train'] = train
    sets['train1'] = train1
    sets['val'] = val
    sets['test'] = test
    sets['all'] = sorted(list(set(train + val + test)))
    return sets[split_name]
```
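A short usage sketch for the factory above (illustrative only):

```python
# Illustrative usage of the dataset factory defined above.
from datasets.factory import get_dataset

dataset = get_dataset('sbpd')
print(dataset.get_split('train'))  # ['area1', 'area5a', 'area5b', 'area6']
print(dataset.get_split('val'))    # ['area3']
print(dataset.get_split('test'))   # ['area4']
```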
cognitive_mapping_and_planning/datasets/nav_env.py (new file, 0 → 100644)

(This 1465-line diff was collapsed on the original page; its contents are not shown here.)
cognitive_mapping_and_planning/datasets/nav_env_config.py (new file, 0 → 100644)

```python
# Copyright 2016 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

"""Configs for stanford navigation environment.

Base config for stanford navigation environment.
"""
import numpy as np
import src.utils as utils
import datasets.nav_env as nav_env

def nav_env_base_config():
  """Returns the base config for stanford navigation environment.

  Returns:
    Base config for stanford navigation environment.
  """
  robot = utils.Foo(radius=15, base=10, height=140, sensor_height=120,
                    camera_elevation_degree=-15)

  env = utils.Foo(padding=10, resolution=5, num_point_threshold=2,
                  valid_min=-10, valid_max=200, n_samples_per_face=200)

  camera_param = utils.Foo(width=225, height=225, z_near=0.05, z_far=20.0,
                           fov=60., modalities=['rgb'], img_channels=3)

  data_augment = utils.Foo(lr_flip=0, delta_angle=0.5, delta_xy=4,
                           relight=True, relight_fast=False,
                           structured=False)  # if True, uses the same perturb
                                              # for the whole episode.

  outputs = utils.Foo(images=True, rel_goal_loc=False, loc_on_map=True,
                      gt_dist_to_goal=True, ego_maps=False,
                      ego_goal_imgs=False, egomotion=False, visit_count=False,
                      analytical_counts=False, node_ids=True,
                      readout_maps=False)

  # class_map_names=['board', 'chair', 'door', 'sofa', 'table']
  class_map_names = ['chair', 'door', 'table']
  semantic_task = utils.Foo(class_map_names=class_map_names, pix_distance=16,
                            sampling='uniform')

  # time per iteration for cmp is 0.82 seconds per episode with 3.4s overhead
  # per batch.
  task_params = utils.Foo(max_dist=32, step_size=8, num_steps=40,
                          num_actions=4, batch_size=4, building_seed=0,
                          num_goals=1, img_height=None, img_width=None,
                          img_channels=None, modalities=None, outputs=outputs,
                          map_scales=[1.], map_crop_sizes=[64],
                          rel_goal_loc_dim=4, base_class='Building',
                          task='map+plan', n_ori=4,
                          type='room_to_room_many',
                          data_augment=data_augment,
                          room_regex='^((?!hallway).)*$',
                          toy_problem=False, map_channels=1,
                          gt_coverage=False, input_type='maps',
                          full_information=False, aux_delta_thetas=[],
                          semantic_task=semantic_task, num_history_frames=0,
                          node_ids_dim=1, perturbs_dim=4,
                          map_resize_method='linear_noantialiasing',
                          readout_maps_channels=1, readout_maps_scales=[],
                          readout_maps_crop_sizes=[], n_views=1,
                          reward_time_penalty=0.1, reward_at_goal=1.,
                          discount_factor=0.99, rejection_sampling_M=100,
                          min_dist=None)

  navtask_args = utils.Foo(
      building_names=['area1_gates_wingA_floor1_westpart'],
      env_class=nav_env.VisualNavigationEnv,
      robot=robot,
      task_params=task_params,
      env=env,
      camera_param=camera_param,
      cache_rooms=True)
  return navtask_args
```
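For reference, `process_navtask_str` in `cfgs/config_common.py` above consumes
this base config by fetching it and then clobbering individual fields; a
minimal sketch of that pattern:

```python
# Illustrative only: fetch the base config, then override fields, as
# cfgs/config_common.py does when decoding a navtask string.
import datasets.nav_env_config as nec

navtask = nec.nav_env_base_config()
navtask.task_params.max_dist = 32            # from the navtask string
navtask.camera_param.modalities = ['depth']  # the 'd' modality
navtask.camera_param.img_channels = 2
```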
cognitive_mapping_and_planning/matplotlibrc (new file, 0 → 100644)

```
backend : agg
```
cognitive_mapping_and_planning/output/.gitignore (new file, 0 → 100644)

```
*
```
cognitive_mapping_and_planning/output/README.md (new file, 0 → 100644)

### Pre-Trained Models
We provide the following pre-trained models:

Config Name | Checkpoint | Mean Dist. | 50%ile Dist. | 75%ile Dist. | Success %age |
:-: | :-: | :-: | :-: | :-: | :-: |
cmp.lmap_Msc.clip5.sbpd_d_r2r | [ckpt](http://download.tensorflow.org/models/cognitive_mapping_and_planning/2017_04_16/cmp.lmap_Msc.clip5.sbpd_d_r2r.tar) | 4.79 | 0 | 1 | 78.9 |
cmp.lmap_Msc.clip5.sbpd_rgb_r2r | [ckpt](http://download.tensorflow.org/models/cognitive_mapping_and_planning/2017_04_16/cmp.lmap_Msc.clip5.sbpd_rgb_r2r.tar) | 7.74 | 0 | 14 | 62.4 |
cmp.lmap_Msc.clip5.sbpd_d_ST | [ckpt](http://download.tensorflow.org/models/cognitive_mapping_and_planning/2017_04_16/cmp.lmap_Msc.clip5.sbpd_d_ST.tar) | 10.67 | 9 | 19 | 39.7 |
cmp.lmap_Msc.clip5.sbpd_rgb_ST | [ckpt](http://download.tensorflow.org/models/cognitive_mapping_and_planning/2017_04_16/cmp.lmap_Msc.clip5.sbpd_rgb_ST.tar) | 11.27 | 10 | 19 | 35.6 |
cmp.lmap_Msc.clip5.sbpd_d_r2r_h0_64_80 | [ckpt](http://download.tensorflow.org/models/cognitive_mapping_and_planning/2017_04_16/cmp.lmap_Msc.clip5.sbpd_d_r2r_h0_64_80.tar) | 11.6 | 0 | 19 | 66.9 |
bl.v2.noclip.sbpd_d_r2r | [ckpt](http://download.tensorflow.org/models/cognitive_mapping_and_planning/2017_04_16/bl.v2.noclip.sbpd_d_r2r.tar) | 5.90 | 0 | 6 | 71.2 |
bl.v2.noclip.sbpd_rgb_r2r | [ckpt](http://download.tensorflow.org/models/cognitive_mapping_and_planning/2017_04_16/bl.v2.noclip.sbpd_rgb_r2r.tar) | 10.21 | 1 | 21 | 53.4 |
bl.v2.noclip.sbpd_d_ST | [ckpt](http://download.tensorflow.org/models/cognitive_mapping_and_planning/2017_04_16/bl.v2.noclip.sbpd_d_ST.tar) | 13.29 | 14 | 23 | 28.0 |
bl.v2.noclip.sbpd_rgb_ST | [ckpt](http://download.tensorflow.org/models/cognitive_mapping_and_planning/2017_04_16/bl.v2.noclip.sbpd_rgb_ST.tar) | 13.37 | 13 | 20 | 24.2 |
bl.v2.noclip.sbpd_d_r2r_h0_64_80 | [ckpt](http://download.tensorflow.org/models/cognitive_mapping_and_planning/2017_04_16/bl.v2.noclip.sbpd_d_r2r_h0_64_80.tar) | 15.30 | 0 | 29 | 57.9 |
cognitive_mapping_and_planning/patches/GLES2_2_0.py.patch (new file, 0 → 100644)

```
10c10
< from OpenGL import platform, constant, arrays
---
> from OpenGL import platform, constant, arrays, contextdata
249a250
> from OpenGL._bytes import _NULL_8_BYTE
399c400
< array = ArrayDatatype.asArray( pointer, type )
---
> array = arrays.ArrayDatatype.asArray( pointer, type )
405c406
< ArrayDatatype.voidDataPointer( array )
---
> arrays.ArrayDatatype.voidDataPointer( array )
```
cognitive_mapping_and_planning/patches/apply_patches.sh (new file, 0 → 100644)

```Shell
# Copyright 2016 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
echo $VIRTUAL_ENV
patch $VIRTUAL_ENV/local/lib/python2.7/site-packages/OpenGL/GLES2/VERSION/GLES2_2_0.py patches/GLES2_2_0.py.patch
patch $VIRTUAL_ENV/local/lib/python2.7/site-packages/OpenGL/platform/ctypesloader.py patches/ctypesloader.py.patch
```
cognitive_mapping_and_planning/patches/ctypesloader.py.patch (new file, 0 → 100644)

```
45c45,46
< return dllType( name, mode )
---
> print './' + name
> return dllType( './' + name, mode )
47,48c48,53
< err.args += (name,fullName)
< raise
---
> try:
>     print name
>     return dllType( name, mode )
> except:
>     err.args += (name,fullName)
>     raise
```