ModelZoo / ResNet50_tensorflow / Commits

Commit 3e6b2f20, authored Apr 21, 2017 by Lukasz Kaiser, committed by GitHub on Apr 21, 2017

Merge pull request #1375 from s-gupta/cmp

Implementation for Cognitive Mapping and Planning paper.

Parents: c5a5d558, 5b9d9097

Changes: 51. Showing 20 changed files with 2763 additions and 0 deletions (+2763, −0).
cognitive_mapping_and_planning/.gitignore                       +4    −0
cognitive_mapping_and_planning/README.md                        +122  −0
cognitive_mapping_and_planning/__init__.py                      +0    −0
cognitive_mapping_and_planning/cfgs/__init__.py                 +0    −0
cognitive_mapping_and_planning/cfgs/config_cmp.py               +283  −0
cognitive_mapping_and_planning/cfgs/config_common.py            +261  −0
cognitive_mapping_and_planning/cfgs/config_distill.py           +114  −0
cognitive_mapping_and_planning/cfgs/config_vision_baseline.py   +173  −0
cognitive_mapping_and_planning/data/.gitignore                  +3    −0
cognitive_mapping_and_planning/data/README.md                   +33   −0
cognitive_mapping_and_planning/datasets/__init__.py             +0    −0
cognitive_mapping_and_planning/datasets/factory.py              +113  −0
cognitive_mapping_and_planning/datasets/nav_env.py              +1465 −0
cognitive_mapping_and_planning/datasets/nav_env_config.py       +127  −0
cognitive_mapping_and_planning/matplotlibrc                     +1    −0
cognitive_mapping_and_planning/output/.gitignore                +1    −0
cognitive_mapping_and_planning/output/README.md                 +16   −0
cognitive_mapping_and_planning/patches/GLES2_2_0.py.patch       +14   −0
cognitive_mapping_and_planning/patches/apply_patches.sh         +18   −0
cognitive_mapping_and_planning/patches/ctypesloader.py.patch    +15   −0
cognitive_mapping_and_planning/.gitignore (new file, mode 100644)
deps
*.pyc
lib*.so
lib*.so*
cognitive_mapping_and_planning/README.md (new file, mode 100644)
# Cognitive Mapping and Planning for Visual Navigation
**Saurabh Gupta, James Davidson, Sergey Levine, Rahul Sukthankar, Jitendra Malik**

**Computer Vision and Pattern Recognition (CVPR) 2017.**

**[ArXiv](https://arxiv.org/abs/1702.03920), [Project Website](https://sites.google.com/corp/view/cognitive-mapping-and-planning/)**

### Citing
If you find this code base and models useful in your research, please consider
citing the following paper:
```
@inproceedings{gupta2017cognitive,
  title={Cognitive Mapping and Planning for Visual Navigation},
  author={Gupta, Saurabh and Davidson, James and Levine, Sergey and
    Sukthankar, Rahul and Malik, Jitendra},
  booktitle={CVPR},
  year={2017}
}
```

### Contents
1. [Requirements: software](#requirements-software)
2. [Requirements: data](#requirements-data)
3. [Test Pre-trained Models](#test-pre_trained-models)
4. [Train your Own Models](#train-your-own-models)

### Requirements: software
1. Python Virtual Env Setup: All code is implemented in Python but depends on a
small number of Python packages and a couple of C libraries. We recommend
using a virtual environment for installing these Python packages and the
Python bindings for these C libraries.
   ```Shell
   VENV_DIR=venv
   pip install virtualenv
   virtualenv $VENV_DIR
   source $VENV_DIR/bin/activate

   # You may need to upgrade pip for installing opencv-python.
   pip install --upgrade pip

   # Install simple dependencies.
   pip install -r requirements.txt

   # Patch bugs in dependencies.
   sh patches/apply_patches.sh
   ```
2. Install [TensorFlow](https://www.tensorflow.org/) inside this virtual
environment. Typically done with `pip install --upgrade tensorflow-gpu`.
3. Swiftshader: We use [Swiftshader](https://github.com/google/swiftshader.git),
a CPU-based renderer, to render the meshes. It is possible to use other
renderers; replace `SwiftshaderRenderer` in `render/swiftshader_renderer.py`
with bindings to your renderer.
   ```Shell
   mkdir -p deps
   git clone --recursive https://github.com/google/swiftshader.git deps/swiftshader-src
   cd deps/swiftshader-src && git checkout 91da6b00584afd7dcaed66da88e2b617429b3950
   mkdir build && cd build && cmake .. && make -j 16 libEGL libGLESv2
   cd ../../../
   cp deps/swiftshader-src/build/libEGL* libEGL.so.1
   cp deps/swiftshader-src/build/libGLESv2* libGLESv2.so.2
   ```
4. PyAssimp: We use [PyAssimp](https://github.com/assimp/assimp.git) to load
meshes. It is possible to use other libraries to load meshes; replace `Shape`
in `render/swiftshader_renderer.py` with bindings to your library for loading
meshes.
   ```Shell
   mkdir -p deps
   git clone https://github.com/assimp/assimp.git deps/assimp-src
   cd deps/assimp-src
   git checkout 2afeddd5cb63d14bc77b53740b38a54a97d94ee8
   cmake CMakeLists.txt -G 'Unix Makefiles' && make -j 16
   cd port/PyAssimp && python setup.py install
   cd ../../../..
   cp deps/assimp-src/lib/libassimp* .
   ```
5. graph-tool: We use the [graph-tool](https://git.skewed.de/count0/graph-tool)
library for graph processing.
   ```Shell
   mkdir -p deps
   # If the following git clone command fails, you can also download the source
   # from https://downloads.skewed.de/graph-tool/graph-tool-2.2.44.tar.bz2
   git clone https://git.skewed.de/count0/graph-tool deps/graph-tool-src
   cd deps/graph-tool-src && git checkout 178add3a571feb6666f4f119027705d95d2951ab
   bash autogen.sh
   ./configure --disable-cairo --disable-sparsehash --prefix=$HOME/.local
   make -j 16
   make install
   cd ../../
   ```

### Requirements: data
1. Download the Stanford 3D Indoor Spaces Dataset (S3DIS Dataset) and ImageNet
pre-trained models for initializing different models. Follow the instructions
in `data/README.md`.

### Test Pre-trained Models
1. Download pre-trained models using
`scripts/scripts_download_pretrained_models.sh`.
2. Test models using `scripts/script_test_pretrained_models.sh`.

### Train Your Own Models
All models were trained asynchronously with 16 workers, each worker using data
from a single floor. The default hyper-parameters correspond to this setting.
See [distributed training with
TensorFlow](https://www.tensorflow.org/deploy/distributed) for setting up
distributed training. Training with a single worker is possible with the
current code base but will require some minor changes to allow each worker to
load all training environments.

### Contact
For questions or issues open an issue on the tensorflow/models [issues
tracker](https://github.com/tensorflow/models/issues). Please assign issues to
@s-gupta.

### Credits
This code was written by Saurabh Gupta (@s-gupta).
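The asynchronous 16-worker setup described above can be sketched as a between-graph cluster description. This is a minimal illustration only; the host names, ports, and helper function below are hypothetical placeholders, not part of this repository:

```python
# Sketch of a cluster description for 16 asynchronous workers, one per floor.
# Host addresses and ports are hypothetical placeholders.
NUM_WORKERS = 16

def make_cluster_spec(num_workers, base_port=2222):
    """Build a dict in the shape a TensorFlow ClusterSpec accepts:
    one parameter server and `num_workers` worker tasks."""
    return {
        'ps': ['localhost:%d' % base_port],
        'worker': ['localhost:%d' % (base_port + 1 + i)
                   for i in range(num_workers)],
    }

cluster = make_cluster_spec(NUM_WORKERS)
# Each worker would then be launched with its own task index (0..15) and,
# in the setting described above, read training data for a single floor.
```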
cognitive_mapping_and_planning/__init__.py (new file, mode 100644)
cognitive_mapping_and_planning/cfgs/__init__.py (new file, mode 100644)
cognitive_mapping_and_planning/cfgs/config_cmp.py (new file, mode 100644)
# Copyright 2016 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
import os, sys
import numpy as np
from tensorflow.python.platform import app
from tensorflow.python.platform import flags
import logging
import src.utils as utils
import cfgs.config_common as cc

import tensorflow as tf

rgb_resnet_v2_50_path = 'data/init_models/resnet_v2_50/model.ckpt-5136169'
d_resnet_v2_50_path = 'data/init_models/distill_rgb_to_d_resnet_v2_50/model.ckpt-120002'

def get_default_args():
  summary_args = utils.Foo(display_interval=1, test_iters=26,
                           arop_full_summary_iters=14)

  control_args = utils.Foo(train=False, test=False,
                           force_batchnorm_is_training_at_test=False,
                           reset_rng_seed=False, only_eval_when_done=False,
                           test_mode=None)
  return summary_args, control_args

def get_default_cmp_args():
  batch_norm_param = {'center': True, 'scale': True,
                      'activation_fn': tf.nn.relu}

  mapper_arch_args = utils.Foo(
      dim_reduce_neurons=64,
      fc_neurons=[1024, 1024],
      fc_out_size=8,
      fc_out_neurons=64,
      encoder='resnet_v2_50',
      deconv_neurons=[64, 32, 16, 8, 4, 2],
      deconv_strides=[2, 2, 2, 2, 2, 2],
      deconv_layers_per_block=2,
      deconv_kernel_size=4,
      fc_dropout=0.5,
      combine_type='wt_avg_logits',
      batch_norm_param=batch_norm_param)

  readout_maps_arch_args = utils.Foo(
      num_neurons=[],
      strides=[],
      kernel_size=None,
      layers_per_block=None)

  arch_args = utils.Foo(
      vin_val_neurons=8, vin_action_neurons=8, vin_ks=3, vin_share_wts=False,
      pred_neurons=[64, 64], pred_batch_norm_param=batch_norm_param,
      conv_on_value_map=0, fr_neurons=16, fr_ver='v2', fr_inside_neurons=64,
      fr_stride=1, crop_remove_each=30, value_crop_size=4,
      action_sample_type='sample', action_sample_combine_type='one_or_other',
      sample_gt_prob_type='inverse_sigmoid_decay', dagger_sample_bn_false=True,
      vin_num_iters=36, isd_k=750., use_agent_loc=False, multi_scale=True,
      readout_maps=False, rom_arch=readout_maps_arch_args)

  return arch_args, mapper_arch_args

def get_arch_vars(arch_str):
  if arch_str == '':
    vals = []
  else:
    vals = arch_str.split('_')
  ks = ['var1', 'var2', 'var3']
  ks = ks[:len(vals)]

  # Exp Ver.
  if len(vals) == 0: ks.append('var1'); vals.append('v0')
  # custom arch.
  if len(vals) == 1: ks.append('var2'); vals.append('')
  # map scale for projection baseline.
  if len(vals) == 2: ks.append('var3'); vals.append('fr2')

  assert(len(vals) == 3)

  vars = utils.Foo()
  for k, v in zip(ks, vals):
    setattr(vars, k, v)
  logging.error('arch_vars: %s', vars)
  return vars

def process_arch_str(args, arch_str):
  # This function modifies args.
  args.arch, args.mapper_arch = get_default_cmp_args()

  arch_vars = get_arch_vars(arch_str)

  args.navtask.task_params.outputs.ego_maps = True
  args.navtask.task_params.outputs.ego_goal_imgs = True
  args.navtask.task_params.outputs.egomotion = True
  args.navtask.task_params.toy_problem = False

  if arch_vars.var1 == 'lmap':
    args = process_arch_learned_map(args, arch_vars)
  elif arch_vars.var1 == 'pmap':
    args = process_arch_projected_map(args, arch_vars)
  else:
    logging.fatal('arch_vars.var1 should be lmap or pmap, but is %s',
                  arch_vars.var1)
    assert(False)

  return args

def process_arch_learned_map(args, arch_vars):
  # Multiscale vision based system.
  args.navtask.task_params.input_type = 'vision'
  args.navtask.task_params.outputs.images = True

  if args.navtask.camera_param.modalities[0] == 'rgb':
    args.solver.pretrained_path = rgb_resnet_v2_50_path
  elif args.navtask.camera_param.modalities[0] == 'depth':
    args.solver.pretrained_path = d_resnet_v2_50_path

  if arch_vars.var2 == 'Ssc':
    sc = 1. / args.navtask.task_params.step_size
    args.arch.vin_num_iters = 40
    args.navtask.task_params.map_scales = [sc]
    max_dist = args.navtask.task_params.max_dist * \
        args.navtask.task_params.num_goals
    args.navtask.task_params.map_crop_sizes = [2 * max_dist]

    args.arch.fr_stride = 1
    args.arch.vin_action_neurons = 8
    args.arch.vin_val_neurons = 3
    args.arch.fr_inside_neurons = 32

    args.mapper_arch.pad_map_with_zeros_each = [24]
    args.mapper_arch.deconv_neurons = [64, 32, 16]
    args.mapper_arch.deconv_strides = [1, 2, 1]

  elif (arch_vars.var2 == 'Msc' or arch_vars.var2 == 'MscROMms' or
        arch_vars.var2 == 'MscROMss' or arch_vars.var2 == 'MscNoVin'):
    # Code for multi-scale planner.
    args.arch.vin_num_iters = 8
    args.arch.crop_remove_each = 4
    args.arch.value_crop_size = 8

    sc = 1. / args.navtask.task_params.step_size
    max_dist = args.navtask.task_params.max_dist * \
        args.navtask.task_params.num_goals
    n_scales = np.log2(float(max_dist) / float(args.arch.vin_num_iters))
    n_scales = int(np.ceil(n_scales) + 1)

    args.navtask.task_params.map_scales = \
        list(sc * (0.5 ** (np.arange(n_scales))[::-1]))
    args.navtask.task_params.map_crop_sizes = [16 for x in range(n_scales)]

    args.arch.fr_stride = 1
    args.arch.vin_action_neurons = 8
    args.arch.vin_val_neurons = 3
    args.arch.fr_inside_neurons = 32

    args.mapper_arch.pad_map_with_zeros_each = [0 for _ in range(n_scales)]
    args.mapper_arch.deconv_neurons = [64 * n_scales, 32 * n_scales,
                                       16 * n_scales]
    args.mapper_arch.deconv_strides = [1, 2, 1]

    if arch_vars.var2 == 'MscNoVin':
      # No planning version.
      args.arch.fr_stride = [1, 2, 1, 2]
      args.arch.vin_action_neurons = None
      args.arch.vin_val_neurons = 16
      args.arch.fr_inside_neurons = 32

      args.arch.crop_remove_each = 0
      args.arch.value_crop_size = 4
      args.arch.vin_num_iters = 0

    elif arch_vars.var2 == 'MscROMms' or arch_vars.var2 == 'MscROMss':
      # Code with read outs, MscROMms flattens and reads out,
      # MscROMss does not flatten and produces output at multiple scales.
      args.navtask.task_params.outputs.readout_maps = True
      args.navtask.task_params.map_resize_method = 'antialiasing'
      args.arch.readout_maps = True

      if arch_vars.var2 == 'MscROMms':
        args.arch.rom_arch.num_neurons = [64, 1]
        args.arch.rom_arch.kernel_size = 4
        args.arch.rom_arch.strides = [2, 2]
        args.arch.rom_arch.layers_per_block = 2

        args.navtask.task_params.readout_maps_crop_sizes = [64]
        args.navtask.task_params.readout_maps_scales = [sc]

      elif arch_vars.var2 == 'MscROMss':
        args.arch.rom_arch.num_neurons = \
            [64, len(args.navtask.task_params.map_scales)]
        args.arch.rom_arch.kernel_size = 4
        args.arch.rom_arch.strides = [1, 1]
        args.arch.rom_arch.layers_per_block = 1

        args.navtask.task_params.readout_maps_crop_sizes = \
            args.navtask.task_params.map_crop_sizes
        args.navtask.task_params.readout_maps_scales = \
            args.navtask.task_params.map_scales

  else:
    logging.fatal('arch_vars.var2 not one of Msc, MscROMms, MscROMss, MscNoVin.')
    assert(False)

  map_channels = args.mapper_arch.deconv_neurons[-1] / \
      (2 * len(args.navtask.task_params.map_scales))
  args.navtask.task_params.map_channels = map_channels

  return args

def process_arch_projected_map(args, arch_vars):
  # Single scale vision based system which does not use a mapper but instead
  # uses an analytically estimated map.
  ds = int(arch_vars.var3[2])
  args.navtask.task_params.input_type = 'analytical_counts'
  args.navtask.task_params.outputs.analytical_counts = True

  assert(args.navtask.task_params.modalities[0] == 'depth')
  args.navtask.camera_param.img_channels = None

  analytical_counts = utils.Foo(map_sizes=[512 / ds],
                                xy_resolution=[5. * ds],
                                z_bins=[[-10, 10, 150, 200]],
                                non_linearity=[arch_vars.var2])
  args.navtask.task_params.analytical_counts = analytical_counts

  sc = 1. / ds
  args.arch.vin_num_iters = 36
  args.navtask.task_params.map_scales = [sc]
  args.navtask.task_params.map_crop_sizes = [512 / ds]

  args.arch.fr_stride = [1, 2]
  args.arch.vin_action_neurons = 8
  args.arch.vin_val_neurons = 3
  args.arch.fr_inside_neurons = 32

  map_channels = len(analytical_counts.z_bins[0]) + 1
  args.navtask.task_params.map_channels = map_channels
  args.solver.freeze_conv = False

  return args

def get_args_for_config(config_name):
  args = utils.Foo()

  args.summary, args.control = get_default_args()

  exp_name, mode_str = config_name.split('+')
  arch_str, solver_str, navtask_str = exp_name.split('.')
  logging.error('config_name: %s', config_name)
  logging.error('arch_str: %s', arch_str)
  logging.error('navtask_str: %s', navtask_str)
  logging.error('solver_str: %s', solver_str)
  logging.error('mode_str: %s', mode_str)

  args.solver = cc.process_solver_str(solver_str)
  args.navtask = cc.process_navtask_str(navtask_str)

  args = process_arch_str(args, arch_str)
  args.arch.isd_k = args.solver.isd_k

  # Train, test, etc.
  mode, imset = mode_str.split('_')
  args = cc.adjust_args_for_mode(args, mode)
  args.navtask.building_names = args.navtask.dataset.get_split(imset)
  args.control.test_name = '{:s}_on_{:s}'.format(mode, imset)

  # Log the arguments
  logging.error('%s', args)
  return args
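`get_args_for_config` above expects config names of the form `<arch>.<solver>.<navtask>+<mode>_<imageset>`. A minimal sketch of that split, using a made-up config name purely for illustration:

```python
# Hypothetical config name, illustrating only the naming convention that
# get_args_for_config above parses; the component values are invented.
config_name = 'cmp.dlw20.sbpd_rgb+train_train'

# Split the experiment description from the run mode.
exp_name, mode_str = config_name.split('+')
# The experiment name packs architecture, solver, and navigation-task strings.
arch_str, solver_str, navtask_str = exp_name.split('.')
# The mode string packs the mode (train/val1/val2/bench) and the image set.
mode, imset = mode_str.split('_')

print(arch_str, solver_str, navtask_str, mode, imset)
# → cmp dlw20 sbpd_rgb train train
```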
cognitive_mapping_and_planning/cfgs/config_common.py (new file, mode 100644)
# Copyright 2016 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
import os
import numpy as np
import logging
import src.utils as utils
import datasets.nav_env_config as nec
from datasets import factory

def adjust_args_for_mode(args, mode):
  if mode == 'train':
    args.control.train = True

  elif mode == 'val1':
    # Same settings as for training, to make sure nothing wonky is happening
    # there.
    args.control.test = True
    args.control.test_mode = 'val'
    args.navtask.task_params.batch_size = 32

  elif mode == 'val2':
    # No data augmentation, not sampling but taking the argmax action, not
    # sampling from the ground truth at all.
    args.control.test = True
    args.arch.action_sample_type = 'argmax'
    args.arch.sample_gt_prob_type = 'zero'
    args.navtask.task_params.data_augment = \
        utils.Foo(lr_flip=0, delta_angle=0, delta_xy=0, relight=False,
                  relight_fast=False, structured=False)
    args.control.test_mode = 'val'
    args.navtask.task_params.batch_size = 32

  elif mode == 'bench':
    # Actually testing the agent in settings that are kept same between
    # different runs.
    args.navtask.task_params.batch_size = 16
    args.control.test = True
    args.arch.action_sample_type = 'argmax'
    args.arch.sample_gt_prob_type = 'zero'
    args.navtask.task_params.data_augment = \
        utils.Foo(lr_flip=0, delta_angle=0, delta_xy=0, relight=False,
                  relight_fast=False, structured=False)
    args.summary.test_iters = 250
    args.control.only_eval_when_done = True
    args.control.reset_rng_seed = True
    args.control.test_mode = 'test'
  else:
    logging.fatal('Unknown mode: %s.', mode)
    assert(False)
  return args

def get_solver_vars(solver_str):
  if solver_str == '':
    vals = []
  else:
    vals = solver_str.split('_')
  ks = ['clip', 'dlw', 'long', 'typ', 'rlw', 'isdk', 'adam_eps', 'init_lr']
  ks = ks[:len(vals)]

  # Gradient clipping or not.
  if len(vals) == 0: ks.append('clip'); vals.append('noclip')
  # data loss weight.
  if len(vals) == 1: ks.append('dlw'); vals.append('dlw20')
  # how long to train for.
  if len(vals) == 2: ks.append('long'); vals.append('nolong')
  # Adam
  if len(vals) == 3: ks.append('typ'); vals.append('adam2')
  # reg loss wt
  if len(vals) == 4: ks.append('rlw'); vals.append('rlw1')
  # isd_k
  if len(vals) == 5: ks.append('isdk'); vals.append('isdk415')  # 415, inflexion at 2.5k.
  # adam eps
  if len(vals) == 6: ks.append('adam_eps'); vals.append('aeps1en8')
  # init lr
  if len(vals) == 7: ks.append('init_lr'); vals.append('lr1en3')

  assert(len(vals) == 8)

  vars = utils.Foo()
  for k, v in zip(ks, vals):
    setattr(vars, k, v)
  logging.error('solver_vars: %s', vars)
  return vars

def process_solver_str(solver_str):
  solver = utils.Foo(
      seed=0, learning_rate_decay=None, clip_gradient_norm=None,
      max_steps=None, initial_learning_rate=None, momentum=None,
      steps_per_decay=None, logdir=None, sync=False, adjust_lr_sync=True,
      wt_decay=0.0001, data_loss_wt=None, reg_loss_wt=None, freeze_conv=True,
      num_workers=1, task=0, ps_tasks=0, master='local', typ=None,
      momentum2=None, adam_eps=None)

  # Clobber with overrides from solver str.
  solver_vars = get_solver_vars(solver_str)

  solver.data_loss_wt = float(solver_vars.dlw[3:].replace('x', '.'))
  solver.adam_eps = float(
      solver_vars.adam_eps[4:].replace('x', '.').replace('n', '-'))
  solver.initial_learning_rate = float(
      solver_vars.init_lr[2:].replace('x', '.').replace('n', '-'))
  solver.reg_loss_wt = float(solver_vars.rlw[3:].replace('x', '.'))
  solver.isd_k = float(solver_vars.isdk[4:].replace('x', '.'))

  long = solver_vars.long
  if long == 'long':
    solver.steps_per_decay = 40000
    solver.max_steps = 120000
  elif long == 'long2':
    solver.steps_per_decay = 80000
    solver.max_steps = 120000
  elif long == 'nolong' or long == 'nol':
    solver.steps_per_decay = 20000
    solver.max_steps = 60000
  else:
    logging.fatal('solver_vars.long should be long, long2, nolong or nol.')
    assert(False)

  clip = solver_vars.clip
  if clip == 'noclip' or clip == 'nocl':
    solver.clip_gradient_norm = 0
  elif clip[:4] == 'clip':
    solver.clip_gradient_norm = float(clip[4:].replace('x', '.'))
  else:
    logging.fatal('Unknown solver_vars.clip: %s', clip)
    assert(False)

  typ = solver_vars.typ
  if typ == 'adam':
    solver.typ = 'adam'
    solver.momentum = 0.9
    solver.momentum2 = 0.999
    solver.learning_rate_decay = 1.0
  elif typ == 'adam2':
    solver.typ = 'adam'
    solver.momentum = 0.9
    solver.momentum2 = 0.999
    solver.learning_rate_decay = 0.1
  elif typ == 'sgd':
    solver.typ = 'sgd'
    solver.momentum = 0.99
    solver.momentum2 = None
    solver.learning_rate_decay = 0.1
  else:
    logging.fatal('Unknown solver_vars.typ: %s', typ)
    assert(False)

  logging.error('solver: %s', solver)
  return solver

def get_navtask_vars(navtask_str):
  if navtask_str == '':
    vals = []
  else:
    vals = navtask_str.split('_')

  ks_all = ['dataset_name', 'modality', 'task', 'history', 'max_dist',
            'num_steps', 'step_size', 'n_ori', 'aux_views', 'data_aug']
  ks = ks_all[:len(vals)]

  # All data or not.
  if len(vals) == 0: ks.append('dataset_name'); vals.append('sbpd')
  # modality
  if len(vals) == 1: ks.append('modality'); vals.append('rgb')
  # semantic task?
  if len(vals) == 2: ks.append('task'); vals.append('r2r')
  # number of history frames.
  if len(vals) == 3: ks.append('history'); vals.append('h0')
  # max steps
  if len(vals) == 4: ks.append('max_dist'); vals.append('32')
  # num steps
  if len(vals) == 5: ks.append('num_steps'); vals.append('40')
  # step size
  if len(vals) == 6: ks.append('step_size'); vals.append('8')
  # n_ori
  if len(vals) == 7: ks.append('n_ori'); vals.append('4')
  # Auxiliary views.
  if len(vals) == 8: ks.append('aux_views'); vals.append('nv0')
  # Normal data augmentation as opposed to structured data augmentation (if set
  # to straug).
  if len(vals) == 9: ks.append('data_aug'); vals.append('straug')

  assert(len(vals) == 10)
  for i in range(len(ks)):
    assert(ks[i] == ks_all[i])

  vars = utils.Foo()
  for k, v in zip(ks, vals):
    setattr(vars, k, v)
  logging.error('navtask_vars: %s', vals)
  return vars

def process_navtask_str(navtask_str):
  navtask = nec.nav_env_base_config()

  # Clobber with overrides from strings.
  navtask_vars = get_navtask_vars(navtask_str)

  navtask.task_params.n_ori = int(navtask_vars.n_ori)
  navtask.task_params.max_dist = int(navtask_vars.max_dist)
  navtask.task_params.num_steps = int(navtask_vars.num_steps)
  navtask.task_params.step_size = int(navtask_vars.step_size)
  navtask.task_params.data_augment.delta_xy = int(navtask_vars.step_size) / 2.

  n_aux_views_each = int(navtask_vars.aux_views[2])
  aux_delta_thetas = np.concatenate((np.arange(n_aux_views_each) + 1,
                                     -1 - np.arange(n_aux_views_each)))
  aux_delta_thetas = aux_delta_thetas * np.deg2rad(navtask.camera_param.fov)
  navtask.task_params.aux_delta_thetas = aux_delta_thetas

  if navtask_vars.data_aug == 'aug':
    navtask.task_params.data_augment.structured = False
  elif navtask_vars.data_aug == 'straug':
    navtask.task_params.data_augment.structured = True
  else:
    logging.fatal('Unknown navtask_vars.data_aug %s.', navtask_vars.data_aug)
    assert(False)

  navtask.task_params.num_history_frames = int(navtask_vars.history[1:])
  navtask.task_params.n_views = 1 + navtask.task_params.num_history_frames

  navtask.task_params.goal_channels = int(navtask_vars.n_ori)

  if navtask_vars.task == 'hard':
    navtask.task_params.type = 'rng_rejection_sampling_many'
    navtask.task_params.rejection_sampling_M = 2000
    navtask.task_params.min_dist = 10
  elif navtask_vars.task == 'r2r':
    navtask.task_params.type = 'room_to_room_many'
  elif navtask_vars.task == 'ST':
    # Semantic task at hand.
    navtask.task_params.goal_channels = \
        len(navtask.task_params.semantic_task.class_map_names)
    navtask.task_params.rel_goal_loc_dim = \
        len(navtask.task_params.semantic_task.class_map_names)
    navtask.task_params.type = 'to_nearest_obj_acc'
  else:
    logging.fatal('navtask_vars.task: should be hard, r2r or ST.')
    assert(False)

  if navtask_vars.modality == 'rgb':
    navtask.camera_param.modalities = ['rgb']
    navtask.camera_param.img_channels = 3
  elif navtask_vars.modality == 'd':
    navtask.camera_param.modalities = ['depth']
    navtask.camera_param.img_channels = 2

  navtask.task_params.img_height = navtask.camera_param.height
  navtask.task_params.img_width = navtask.camera_param.width
  navtask.task_params.modalities = navtask.camera_param.modalities
  navtask.task_params.img_channels = navtask.camera_param.img_channels
  navtask.task_params.img_fov = navtask.camera_param.fov

  navtask.dataset = factory.get_dataset(navtask_vars.dataset_name)
  return navtask
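Both `get_solver_vars` and `get_navtask_vars` follow the same pattern: split the spec string on `_`, then pad the unspecified trailing fields with defaults. A standalone sketch of that pattern (the key names and default values here are illustrative, not the repository's full lists):

```python
def parse_with_defaults(spec, keys, defaults):
    """Split `spec` on '_' and fill unspecified trailing fields with defaults.

    Mirrors the structure of get_solver_vars/get_navtask_vars above:
    earlier fields can be overridden, later fields fall back to defaults.
    """
    vals = spec.split('_') if spec else []
    assert len(vals) <= len(keys)
    vals = vals + list(defaults[len(vals):])
    return dict(zip(keys, vals))

# Illustrative keys and defaults (a small subset of the real ones).
keys = ['dataset_name', 'modality', 'task']
defaults = ['sbpd', 'rgb', 'r2r']

# Override the first two fields; the task falls back to its default.
print(parse_with_defaults('sbpd_d', keys, defaults))
# → {'dataset_name': 'sbpd', 'modality': 'd', 'task': 'r2r'}
```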
cognitive_mapping_and_planning/cfgs/config_distill.py (new file, mode 100644)
# Copyright 2016 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
import pprint
import copy
import os
from tensorflow.python.platform import app
from tensorflow.python.platform import flags
import logging
import src.utils as utils
import cfgs.config_common as cc

import tensorflow as tf

rgb_resnet_v2_50_path = 'cache/resnet_v2_50_inception_preprocessed/model.ckpt-5136169'

def get_default_args():
  robot = utils.Foo(radius=15, base=10, height=140, sensor_height=120,
                    camera_elevation_degree=-15)

  camera_param = utils.Foo(width=225, height=225, z_near=0.05, z_far=20.0,
                           fov=60., modalities=['rgb', 'depth'])

  env = utils.Foo(padding=10, resolution=5, num_point_threshold=2,
                  valid_min=-10, valid_max=200, n_samples_per_face=200)

  data_augment = utils.Foo(lr_flip=0, delta_angle=1, delta_xy=4, relight=False,
                           relight_fast=False, structured=False)

  task_params = utils.Foo(num_actions=4, step_size=4, num_steps=0,
                          batch_size=32, room_seed=0, base_class='Building',
                          task='mapping', n_ori=6, data_augment=data_augment,
                          output_transform_to_global_map=False,
                          output_canonical_map=False,
                          output_incremental_transform=False,
                          output_free_space=False, move_type='shortest_path',
                          toy_problem=0)

  buildinger_args = utils.Foo(
      building_names=['area1_gates_wingA_floor1_westpart'],
      env_class=None, robot=robot, task_params=task_params, env=env,
      camera_param=camera_param)

  solver_args = utils.Foo(seed=0, learning_rate_decay=0.1,
                          clip_gradient_norm=0, max_steps=120000,
                          initial_learning_rate=0.001, momentum=0.99,
                          steps_per_decay=40000, logdir=None, sync=False,
                          adjust_lr_sync=True, wt_decay=0.0001,
                          data_loss_wt=1.0, reg_loss_wt=1.0, num_workers=1,
                          task=0, ps_tasks=0, master='local')

  summary_args = utils.Foo(display_interval=1, test_iters=100)

  control_args = utils.Foo(train=False, test=False,
                           force_batchnorm_is_training_at_test=False)

  arch_args = utils.Foo(rgb_encoder='resnet_v2_50', d_encoder='resnet_v2_50')

  return utils.Foo(solver=solver_args, summary=summary_args,
                   control=control_args, arch=arch_args,
                   buildinger=buildinger_args)

def get_vars(config_name):
  vars = config_name.split('_')
  if len(vars) == 1:
    # All data or not.
    vars.append('noall')
  if len(vars) == 2:
    # n_ori
    vars.append('4')
  logging.error('vars: %s', vars)
  return vars

def get_args_for_config(config_name):
  args = get_default_args()
  config_name, mode = config_name.split('+')
  vars = get_vars(config_name)

  logging.info('config_name: %s, mode: %s', config_name, mode)

  args.buildinger.task_params.n_ori = int(vars[2])
  args.solver.freeze_conv = True
  args.solver.pretrained_path = rgb_resnet_v2_50_path
  args.buildinger.task_params.img_channels = 5
  args.solver.data_loss_wt = 0.00001

  if vars[0] == 'v0':
    pass
  else:
    logging.error('config_name: %s undefined', config_name)

  args.buildinger.task_params.height = args.buildinger.camera_param.height
  args.buildinger.task_params.width = args.buildinger.camera_param.width
  args.buildinger.task_params.modalities = \
      args.buildinger.camera_param.modalities

  if vars[1] == 'all':
    args = cc.get_args_for_mode_building_all(args, mode)
  elif vars[1] == 'noall':
    args = cc.get_args_for_mode_building(args, mode)

  # Log the arguments
  logging.error('%s', args)
  return args
cognitive_mapping_and_planning/cfgs/config_vision_baseline.py
# Copyright 2016 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
import pprint
import os
import logging
import numpy as np
from tensorflow.python.platform import app
from tensorflow.python.platform import flags
import tensorflow as tf
import src.utils as utils
import cfgs.config_common as cc
import datasets.nav_env_config as nec

FLAGS = flags.FLAGS

get_solver_vars = cc.get_solver_vars
get_navtask_vars = cc.get_navtask_vars

rgb_resnet_v2_50_path = 'data/init_models/resnet_v2_50/model.ckpt-5136169'
d_resnet_v2_50_path = 'data/init_models/distill_rgb_to_d_resnet_v2_50/model.ckpt-120002'
def get_default_args():
  summary_args = utils.Foo(display_interval=1, test_iters=26,
                           arop_full_summary_iters=14)

  control_args = utils.Foo(train=False, test=False,
                           force_batchnorm_is_training_at_test=False,
                           reset_rng_seed=False, only_eval_when_done=False,
                           test_mode=None)
  return summary_args, control_args
def get_default_baseline_args():
  batch_norm_param = {'center': True, 'scale': True,
                      'activation_fn': tf.nn.relu}
  arch_args = utils.Foo(
      pred_neurons=[], goal_embed_neurons=[], img_embed_neurons=[],
      batch_norm_param=batch_norm_param, dim_reduce_neurons=64,
      combine_type='', encoder='resnet_v2_50', action_sample_type='sample',
      action_sample_combine_type='one_or_other',
      sample_gt_prob_type='inverse_sigmoid_decay',
      dagger_sample_bn_false=True, isd_k=750., use_visit_count=False,
      lstm_output=False, lstm_ego=False, lstm_img=False, fc_dropout=0.0,
      embed_goal_for_state=False, lstm_output_init_state_from_goal=False)
  return arch_args
def get_arch_vars(arch_str):
  if arch_str == '':
    vals = []
  else:
    vals = arch_str.split('_')

  ks = ['ver', 'lstm_dim', 'dropout']

  # Experiment version.
  if len(vals) == 0: vals.append('v0')
  # LSTM dimensions.
  if len(vals) == 1: vals.append('lstm2048')
  # Dropout.
  if len(vals) == 2: vals.append('noDO')

  assert(len(vals) == 3)

  vars = utils.Foo()
  for k, v in zip(ks, vals):
    setattr(vars, k, v)
  logging.error('arch_vars: %s', vars)
  return vars
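As a standalone sketch (re-implemented for illustration), the parsing above maps an architecture string to named fields, with omitted trailing fields falling back to defaults; the `lstmNNNN` field is later turned into an integer by stripping the `lstm` prefix, as in `int(arch_vars.lstm_dim[4:])`:

```python
def parse_arch_str(arch_str):
  # Fields are ver, lstm_dim, dropout; missing ones take defaults.
  vals = arch_str.split('_') if arch_str else []
  defaults = ['v0', 'lstm2048', 'noDO']
  vals = vals + defaults[len(vals):]
  return dict(zip(['ver', 'lstm_dim', 'dropout'], vals))

print(parse_arch_str('v2_lstm1024'))  # {'ver': 'v2', 'lstm_dim': 'lstm1024', 'dropout': 'noDO'}
print(int('lstm1024'[4:]))            # 1024, the LSTM output dimension
```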
def process_arch_str(args, arch_str):
  # This function modifies args in place.
  args.arch = get_default_baseline_args()
  arch_vars = get_arch_vars(arch_str)

  args.navtask.task_params.outputs.rel_goal_loc = True
  args.navtask.task_params.input_type = 'vision'
  args.navtask.task_params.outputs.images = True

  if args.navtask.camera_param.modalities[0] == 'rgb':
    args.solver.pretrained_path = rgb_resnet_v2_50_path
  elif args.navtask.camera_param.modalities[0] == 'depth':
    args.solver.pretrained_path = d_resnet_v2_50_path
  else:
    logging.fatal('Modality is neither rgb nor depth.')

  if arch_vars.dropout == 'DO':
    args.arch.fc_dropout = 0.5

  args.tfcode = 'B'

  exp_ver = arch_vars.ver
  if exp_ver == 'v0':
    # Multiplicative interaction between goal location and image features.
    args.arch.combine_type = 'multiply'
    args.arch.pred_neurons = [256, 256]
    args.arch.goal_embed_neurons = [64, 8]
    args.arch.img_embed_neurons = [1024, 512, 256 * 8]

  elif exp_ver == 'v1':
    # Additive interaction between goal and image features.
    args.arch.combine_type = 'add'
    args.arch.pred_neurons = [256, 256]
    args.arch.goal_embed_neurons = [64, 256]
    args.arch.img_embed_neurons = [1024, 512, 256]

  elif exp_ver == 'v2':
    # LSTM at the output, on top of multiplicative interactions.
    args.arch.combine_type = 'multiply'
    args.arch.goal_embed_neurons = [64, 8]
    args.arch.img_embed_neurons = [1024, 512, 256 * 8]
    args.arch.lstm_output = True
    args.arch.lstm_output_dim = int(arch_vars.lstm_dim[4:])
    args.arch.pred_neurons = [256]  # The other layer is inside the LSTM.

  elif exp_ver == 'v0blind':
    # LSTM on the goal location only.
    args.arch.combine_type = 'goalonly'
    args.arch.goal_embed_neurons = [64, 256]
    args.arch.img_embed_neurons = [2]  # Tiny embedding; images are unused.
    args.arch.lstm_output = True
    args.arch.lstm_output_dim = 256
    args.arch.pred_neurons = [256]  # The other layer is inside the LSTM.

  else:
    logging.fatal('exp_ver: %s undefined', exp_ver)
    assert(False)

  # Log the arguments.
  logging.error('%s', args)
  return args
def get_args_for_config(config_name):
  args = utils.Foo()
  args.summary, args.control = get_default_args()

  exp_name, mode_str = config_name.split('+')
  arch_str, solver_str, navtask_str = exp_name.split('.')
  logging.error('config_name: %s', config_name)
  logging.error('arch_str: %s', arch_str)
  logging.error('navtask_str: %s', navtask_str)
  logging.error('solver_str: %s', solver_str)
  logging.error('mode_str: %s', mode_str)

  args.solver = cc.process_solver_str(solver_str)
  args.navtask = cc.process_navtask_str(navtask_str)
  args = process_arch_str(args, arch_str)
  args.arch.isd_k = args.solver.isd_k

  # Train, test, etc.
  mode, imset = mode_str.split('_')
  args = cc.adjust_args_for_mode(args, mode)
  args.navtask.building_names = args.navtask.dataset.get_split(imset)
  args.control.test_name = '{:s}_on_{:s}'.format(mode, imset)

  # Log the arguments.
  logging.error('%s', args)
  return args
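A full config name here is a dotted experiment name plus a `+mode_imset` suffix. A standalone sketch of the decomposition (the example name is hypothetical, chosen only to match the naming pattern; real solver strings may differ):

```python
def split_config_name(config_name):
  # '<arch>.<solver>.<navtask>+<mode>_<imset>'
  exp_name, mode_str = config_name.split('+')
  arch_str, solver_str, navtask_str = exp_name.split('.')
  mode, imset = mode_str.split('_')
  return arch_str, solver_str, navtask_str, mode, imset

print(split_config_name('v0.clip5.sbpd_d_r2r+train_train1'))
# ('v0', 'clip5', 'sbpd_d_r2r', 'train', 'train1')
```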
cognitive_mapping_and_planning/data/.gitignore
stanford_building_parser_dataset_raw
stanford_building_parser_dataset
init_models
cognitive_mapping_and_planning/data/README.md
This directory contains the data needed for training and benchmarking various
navigation models.

1. Download the data from the
   [dataset website](http://buildingparser.stanford.edu/dataset.html).
   1. [Raw meshes](https://goo.gl/forms/2YSPaO2UKmn5Td5m2). We need the
      meshes in the noXYZ folder. Download the tar files and place them in
      the `stanford_building_parser_dataset_raw` folder. You need
      `area_1_noXYZ.tar`, `area_3_noXYZ.tar`, `area_5a_noXYZ.tar`,
      `area_5b_noXYZ.tar` and `area_6_noXYZ.tar` for training, and
      `area_4_noXYZ.tar` for evaluation.
   2. [Annotations](https://goo.gl/forms/4SoGp4KtH1jfRqEj2) for setting up
      tasks. We need the file called `Stanford3dDataset_v1.2.zip`. Place
      it in the `stanford_building_parser_dataset_raw` directory.
2. Preprocess the data.
   1. Extract meshes using `scripts/script_preprocess_meshes_S3DIS.sh`.
      After this, `ls data/stanford_building_parser_dataset/mesh` should
      show six folders, `area1`, `area3`, `area4`, `area5a`, `area5b` and
      `area6`, with textures and obj files within each directory.
   2. Extract room information and semantics from the zip file using
      `scripts/script_preprocess_annoations_S3DIS.sh`. After this there
      should be `room-dimension` and `class-maps` folders in
      `data/stanford_building_parser_dataset`. (If this script crashes with
      an exception in np.loadtxt while processing
      `Area_5/office_19/Annotations/ceiling_1.txt`, there is a special
      character on line 323474 that should be removed manually.)
3. Download ImageNet pre-trained models. We use ResNet-v2-50 for
   representing images. For RGB images the model is pre-trained on
   ImageNet; for depth images we [distill](https://arxiv.org/abs/1507.00448)
   the RGB model to depth images using paired RGB-D images. Both models are
   available through `scripts/script_download_init_models.sh`.
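After preprocessing, a quick sanity check along these lines (a hypothetical helper, not part of the repository) can confirm the expected mesh folders exist:

```python
import os

def missing_mesh_areas(mesh_dir,
                       expected=('area1', 'area3', 'area4',
                                 'area5a', 'area5b', 'area6')):
  # Return the area folders that are not present under mesh_dir.
  return [a for a in expected
          if not os.path.isdir(os.path.join(mesh_dir, a))]

# e.g. missing_mesh_areas('data/stanford_building_parser_dataset/mesh')
# should return [] once extraction has finished.
```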
cognitive_mapping_and_planning/datasets/__init__.py
cognitive_mapping_and_planning/datasets/factory.py
# Copyright 2016 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
r"""Wrapper for selecting the navigation environment that we want to train and
test on.
"""
import os
import glob
import platform
import logging
import numpy as np
from tensorflow.python.platform import app
from tensorflow.python.platform import flags
import render.swiftshader_renderer as renderer
import src.file_utils as fu
import src.utils as utils
def get_dataset(dataset_name):
  if dataset_name == 'sbpd':
    dataset = StanfordBuildingParserDataset(dataset_name)
  else:
    logging.fatal('Unknown dataset_name %s; expected sbpd.', dataset_name)
  return dataset
class Loader():
  def get_data_dir(self):
    pass

  def get_meta_data(self, file_name, data_dir=None):
    if data_dir is None:
      data_dir = self.get_data_dir()
    full_file_name = os.path.join(data_dir, 'meta', file_name)
    assert(fu.exists(full_file_name)), \
        '{:s} does not exist'.format(full_file_name)
    ext = os.path.splitext(full_file_name)[1]
    if ext == '.txt':
      ls = []
      with fu.fopen(full_file_name, 'r') as f:
        for l in f:
          ls.append(l.rstrip())
    elif ext == '.pkl':
      ls = utils.load_variables(full_file_name)
    return ls

  def load_building(self, name, data_dir=None):
    if data_dir is None:
      data_dir = self.get_data_dir()
    out = {}
    out['name'] = name
    out['data_dir'] = data_dir
    out['room_dimension_file'] = os.path.join(data_dir, 'room-dimension',
                                              name + '.pkl')
    out['class_map_folder'] = os.path.join(data_dir, 'class-maps')
    return out

  def load_building_meshes(self, building):
    dir_name = os.path.join(building['data_dir'], 'mesh', building['name'])
    mesh_file_name = glob.glob1(dir_name, '*.obj')[0]
    mesh_file_name_full = os.path.join(dir_name, mesh_file_name)
    logging.error('Loading building from obj file: %s', mesh_file_name_full)
    shape = renderer.Shape(mesh_file_name_full, load_materials=True,
                           name_prefix=building['name'] + '_')
    return [shape]
class StanfordBuildingParserDataset(Loader):
  def __init__(self, ver):
    self.ver = ver
    self.data_dir = None

  def get_data_dir(self):
    if self.data_dir is None:
      self.data_dir = 'data/stanford_building_parser_dataset/'
    return self.data_dir

  def get_benchmark_sets(self):
    return self._get_benchmark_sets()

  def get_split(self, split_name):
    if self.ver == 'sbpd':
      return self._get_split(split_name)
    else:
      logging.fatal('Unknown version.')

  def _get_benchmark_sets(self):
    sets = ['train1', 'val', 'test']
    return sets

  def _get_split(self, split_name):
    train = ['area1', 'area5a', 'area5b', 'area6']
    train1 = ['area1']
    val = ['area3']
    test = ['area4']
    sets = {}
    sets['train'] = train
    sets['train1'] = train1
    sets['val'] = val
    sets['test'] = test
    sets['all'] = sorted(list(set(train + val + test)))
    return sets[split_name]
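For illustration, here is the split table from `_get_split` reproduced as a standalone sketch, together with the expected lookups:

```python
def get_sbpd_split(split_name):
  # Area-level splits used by the Stanford Building Parser dataset.
  train = ['area1', 'area5a', 'area5b', 'area6']
  sets = {'train': train, 'train1': ['area1'],
          'val': ['area3'], 'test': ['area4']}
  sets['all'] = sorted(set(train + sets['val'] + sets['test']))
  return sets[split_name]

print(get_sbpd_split('train1'))  # ['area1']
print(get_sbpd_split('all'))
# ['area1', 'area3', 'area4', 'area5a', 'area5b', 'area6']
```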
cognitive_mapping_and_planning/datasets/nav_env.py
This diff is collapsed.
cognitive_mapping_and_planning/datasets/nav_env_config.py
# Copyright 2016 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Configs for the Stanford navigation environment.

Base config for the Stanford navigation environment.
"""
import numpy as np
import src.utils as utils
import datasets.nav_env as nav_env
def nav_env_base_config():
  """Returns the base config for the Stanford navigation environment.

  Returns:
    Base config for the Stanford navigation environment.
  """
  robot = utils.Foo(radius=15, base=10, height=140, sensor_height=120,
                    camera_elevation_degree=-15)

  env = utils.Foo(padding=10, resolution=5, num_point_threshold=2,
                  valid_min=-10, valid_max=200, n_samples_per_face=200)

  camera_param = utils.Foo(width=225, height=225, z_near=0.05, z_far=20.0,
                           fov=60., modalities=['rgb'], img_channels=3)

  # structured=True uses the same perturbation for the whole episode.
  data_augment = utils.Foo(lr_flip=0, delta_angle=0.5, delta_xy=4,
                           relight=True, relight_fast=False,
                           structured=False)

  outputs = utils.Foo(images=True, rel_goal_loc=False, loc_on_map=True,
                      gt_dist_to_goal=True, ego_maps=False,
                      ego_goal_imgs=False, egomotion=False,
                      visit_count=False, analytical_counts=False,
                      node_ids=True, readout_maps=False)

  # class_map_names = ['board', 'chair', 'door', 'sofa', 'table']
  class_map_names = ['chair', 'door', 'table']
  semantic_task = utils.Foo(class_map_names=class_map_names, pix_distance=16,
                            sampling='uniform')

  # Time per iteration for CMP is 0.82 seconds per episode, with a 3.4 s
  # overhead per batch.
  task_params = utils.Foo(max_dist=32, step_size=8, num_steps=40,
                          num_actions=4, batch_size=4, building_seed=0,
                          num_goals=1, img_height=None, img_width=None,
                          img_channels=None, modalities=None,
                          outputs=outputs, map_scales=[1.],
                          map_crop_sizes=[64], rel_goal_loc_dim=4,
                          base_class='Building', task='map+plan', n_ori=4,
                          type='room_to_room_many',
                          data_augment=data_augment,
                          room_regex='^((?!hallway).)*$', toy_problem=False,
                          map_channels=1, gt_coverage=False,
                          input_type='maps', full_information=False,
                          aux_delta_thetas=[], semantic_task=semantic_task,
                          num_history_frames=0, node_ids_dim=1,
                          perturbs_dim=4,
                          map_resize_method='linear_noantialiasing',
                          readout_maps_channels=1, readout_maps_scales=[],
                          readout_maps_crop_sizes=[], n_views=1,
                          reward_time_penalty=0.1, reward_at_goal=1.,
                          discount_factor=0.99, rejection_sampling_M=100,
                          min_dist=None)

  navtask_args = utils.Foo(
      building_names=['area1_gates_wingA_floor1_westpart'],
      env_class=nav_env.VisualNavigationEnv, robot=robot,
      task_params=task_params, env=env, camera_param=camera_param,
      cache_rooms=True)
  return navtask_args
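Note that `img_height`, `img_width`, `img_channels`, and `modalities` are left as `None` in the base config; the config-processing code fills them in from `camera_param` (as `get_args_for_config` does above). A self-contained sketch of that copy step, using a stand-in attribute container (an assumption about `src.utils.Foo`):

```python
class Foo(object):
  # Stand-in for src.utils.Foo (assumed to be a keyword-attribute bag).
  def __init__(self, **kwargs):
    self.__dict__.update(kwargs)

camera_param = Foo(width=225, height=225, modalities=['rgb'],
                   img_channels=3)
task_params = Foo(img_height=None, img_width=None, modalities=None)

# The config code copies camera geometry into the task parameters.
task_params.img_height = camera_param.height
task_params.img_width = camera_param.width
task_params.modalities = camera_param.modalities
```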
cognitive_mapping_and_planning/matplotlibrc
backend : agg
cognitive_mapping_and_planning/output/.gitignore
*
cognitive_mapping_and_planning/output/README.md
### Pre-Trained Models

We provide the following pre-trained models:

Config Name | Checkpoint | Mean Dist. | 50%ile Dist. | 75%ile Dist. | Success %age |
:-: | :-: | :-: | :-: | :-: | :-: |
cmp.lmap_Msc.clip5.sbpd_d_r2r | [ckpt](http://download.tensorflow.org/models/cognitive_mapping_and_planning/2017_04_16/cmp.lmap_Msc.clip5.sbpd_d_r2r.tar) | 4.79 | 0 | 1 | 78.9 |
cmp.lmap_Msc.clip5.sbpd_rgb_r2r | [ckpt](http://download.tensorflow.org/models/cognitive_mapping_and_planning/2017_04_16/cmp.lmap_Msc.clip5.sbpd_rgb_r2r.tar) | 7.74 | 0 | 14 | 62.4 |
cmp.lmap_Msc.clip5.sbpd_d_ST | [ckpt](http://download.tensorflow.org/models/cognitive_mapping_and_planning/2017_04_16/cmp.lmap_Msc.clip5.sbpd_d_ST.tar) | 10.67 | 9 | 19 | 39.7 |
cmp.lmap_Msc.clip5.sbpd_rgb_ST | [ckpt](http://download.tensorflow.org/models/cognitive_mapping_and_planning/2017_04_16/cmp.lmap_Msc.clip5.sbpd_rgb_ST.tar) | 11.27 | 10 | 19 | 35.6 |
cmp.lmap_Msc.clip5.sbpd_d_r2r_h0_64_80 | [ckpt](http://download.tensorflow.org/models/cognitive_mapping_and_planning/2017_04_16/cmp.lmap_Msc.clip5.sbpd_d_r2r_h0_64_80.tar) | 11.6 | 0 | 19 | 66.9 |
bl.v2.noclip.sbpd_d_r2r | [ckpt](http://download.tensorflow.org/models/cognitive_mapping_and_planning/2017_04_16/bl.v2.noclip.sbpd_d_r2r.tar) | 5.90 | 0 | 6 | 71.2 |
bl.v2.noclip.sbpd_rgb_r2r | [ckpt](http://download.tensorflow.org/models/cognitive_mapping_and_planning/2017_04_16/bl.v2.noclip.sbpd_rgb_r2r.tar) | 10.21 | 1 | 21 | 53.4 |
bl.v2.noclip.sbpd_d_ST | [ckpt](http://download.tensorflow.org/models/cognitive_mapping_and_planning/2017_04_16/bl.v2.noclip.sbpd_d_ST.tar) | 13.29 | 14 | 23 | 28.0 |
bl.v2.noclip.sbpd_rgb_ST | [ckpt](http://download.tensorflow.org/models/cognitive_mapping_and_planning/2017_04_16/bl.v2.noclip.sbpd_rgb_ST.tar) | 13.37 | 13 | 20 | 24.2 |
bl.v2.noclip.sbpd_d_r2r_h0_64_80 | [ckpt](http://download.tensorflow.org/models/cognitive_mapping_and_planning/2017_04_16/bl.v2.noclip.sbpd_d_r2r_h0_64_80.tar) | 15.30 | 0 | 29 | 57.9 |
cognitive_mapping_and_planning/patches/GLES2_2_0.py.patch
10c10
< from OpenGL import platform, constant, arrays
---
> from OpenGL import platform, constant, arrays, contextdata
249a250
> from OpenGL._bytes import _NULL_8_BYTE
399c400
< array = ArrayDatatype.asArray( pointer, type )
---
> array = arrays.ArrayDatatype.asArray( pointer, type )
405c406
< ArrayDatatype.voidDataPointer( array )
---
> arrays.ArrayDatatype.voidDataPointer( array )
cognitive_mapping_and_planning/patches/apply_patches.sh
# Copyright 2016 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
echo $VIRTUAL_ENV
patch $VIRTUAL_ENV/local/lib/python2.7/site-packages/OpenGL/GLES2/VERSION/GLES2_2_0.py patches/GLES2_2_0.py.patch
patch $VIRTUAL_ENV/local/lib/python2.7/site-packages/OpenGL/platform/ctypesloader.py patches/ctypesloader.py.patch
cognitive_mapping_and_planning/patches/ctypesloader.py.patch
45c45,46
< return dllType( name, mode )
---
> print './' + name
> return dllType( './' + name, mode )
47,48c48,53
< err.args += (name,fullName)
< raise
---
> try:
> print name
> return dllType( name, mode )
> except:
> err.args += (name,fullName)
> raise