# ModelZoo / ResNet50_tensorflow, commit 9beaea41

Authored May 01, 2017 by Alexander Gorban

Merge remote-tracking branch 'upstream/master'

Parents: 6159b593, 3a3c5b9d

Changes: 13 changed files with 439 additions and 91 deletions (+439, −91)
* adversarial_crypto/README.md (+56, −0)
* adversarial_crypto/train_eval.py (+274, −0)
* differential_privacy/dp_sgd/README.md (+13, −13)
* inception/README.md (+8, −1)
* inception/inception/imagenet_distributed_train.py (+2, −1)
* inception/inception/inception_distributed_train.py (+3, −0)
* lm_1b/README.md (+27, −27)
* next_frame_prediction/README.md (+14, −14)
* resnet/README.md (+25, −25)
* resnet/resnet_model.py (+1, −1)
* slim/README.md (+14, −7)
* textsum/README.md (+1, −1)
* tutorials/rnn/ptb/ptb_word_lm.py (+1, −1)
## adversarial_crypto/README.md (new file, mode 100644)
# Learning to Protect Communications with Adversarial Neural Cryptography
This is a slightly-updated model used for the paper
["Learning to Protect Communications with Adversarial Neural
Cryptography"](https://arxiv.org/abs/1610.06918).
> We ask whether neural networks can learn to use secret keys to protect
> information from other neural networks. Specifically, we focus on ensuring
> confidentiality properties in a multiagent system, and we specify those
> properties in terms of an adversary. Thus, a system may consist of neural
> networks named Alice and Bob, and we aim to limit what a third neural
> network named Eve learns from eavesdropping on the communication between
> Alice and Bob. We do not prescribe specific cryptographic algorithms to
> these neural networks; instead, we train end-to-end, adversarially.
> We demonstrate that the neural networks can learn how to perform forms of
> encryption and decryption, and also how to apply these operations
> selectively in order to meet confidentiality goals.
This code allows you to train an encoder/decoder/adversary triplet
and evaluate their effectiveness on randomly generated input and key
pairs.
## Prerequisites
The only software requirement for running the encoder and decoder is an
installation of TensorFlow, r0.12 or later.
## Training and evaluating
After installing TensorFlow and ensuring that your paths are configured
appropriately:

```shell
python train_eval.py
```

This will begin training a fresh model. If and when the model becomes
sufficiently well-trained, it will reset the Eve model multiple times
and retrain it from scratch, outputting the accuracy thus obtained
in each run.
## Model differences from the paper
The model has been simplified slightly from the one described in
the paper: the convolutional layer width was reduced by a factor
of two. In the version in the paper, there was a nonlinear unit
after the fully-connected layer; that nonlinearity has been removed
here. These changes improve the robustness of training. The
initializer for the convolution layers has been switched to the
tf.contrib.layers default of xavier_initializer instead of
a simpler truncated_normal.
## Contact information
This model repository is maintained by David G. Andersen
([dave-andersen](https://github.com/dave-andersen)).
## adversarial_crypto/train_eval.py (new file, mode 100644)
```python
# Copyright 2016 The TensorFlow Authors All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

"""Adversarial training to learn trivial encryption functions,
from the paper "Learning to Protect Communications with
Adversarial Neural Cryptography", Abadi & Andersen, 2016.

https://arxiv.org/abs/1610.06918

This program creates and trains three neural networks,
termed Alice, Bob, and Eve.  Alice takes inputs
in_m (message), in_k (key) and outputs 'ciphertext'.

Bob takes inputs in_k, ciphertext and tries to reconstruct
the message.

Eve is an adversarial network that takes input ciphertext
and also tries to reconstruct the message.

The main function attempts to train these networks and then
evaluates them, all on random plaintext and key values.
"""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import signal
import sys
from six.moves import xrange  # pylint: disable=redefined-builtin
import tensorflow as tf

flags = tf.app.flags

flags.DEFINE_float('learning_rate', 0.0008, 'Constant learning rate')
flags.DEFINE_integer('batch_size', 4096, 'Batch size')

FLAGS = flags.FLAGS

# Input and output configuration.
TEXT_SIZE = 16
KEY_SIZE = 16

# Training parameters.
ITERS_PER_ACTOR = 1
EVE_MULTIPLIER = 2  # Train Eve 2x for every step of Alice/Bob.

# Train until either max loops or Alice/Bob "good enough":
MAX_TRAINING_LOOPS = 850000
BOB_LOSS_THRESH = 0.02  # Exit when Bob loss < 0.02 and Eve > 7.7 bits.
EVE_LOSS_THRESH = 7.7

# Logging and evaluation.
PRINT_EVERY = 200  # In training, log every 200 steps.
EVE_EXTRA_ROUNDS = 2000  # At end, train Eve a bit more.
RETRAIN_EVE_ITERS = 10000  # Retrain Eve up to ITERS*LOOPS times.
RETRAIN_EVE_LOOPS = 25  # With an evaluation each loop.
NUMBER_OF_EVE_RESETS = 5  # And do this up to 5 times with a fresh Eve.
# Use EVAL_BATCHES samples each time we check accuracy.
EVAL_BATCHES = 1


def batch_of_random_bools(batch_size, n):
  """Return a batch of random "boolean" numbers.

  Args:
    batch_size: Batch size dimension of returned tensor.
    n: number of entries per batch.

  Returns:
    A [batch_size, n] tensor of "boolean" numbers, where each number is
    represented as -1 or 1.
  """
  as_int = tf.random_uniform(
      [batch_size, n], minval=0, maxval=2, dtype=tf.int32)
  expanded_range = (as_int * 2) - 1
  return tf.cast(expanded_range, tf.float32)
```
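The {0, 1} → {−1, 1} rescaling done by `batch_of_random_bools` can be checked with a small NumPy sketch (a standalone illustration, not part of the repository file):

```python
import numpy as np

rng = np.random.default_rng(0)

def batch_of_random_bools_np(batch_size, n):
    """Mirror of the TF helper: draw ints in {0, 1}, rescale to {-1.0, 1.0}."""
    as_int = rng.integers(0, 2, size=(batch_size, n))
    return (as_int * 2 - 1).astype(np.float32)

batch = batch_of_random_bools_np(4, 16)
# Only the values -1.0 and 1.0 can appear, by construction.
print(sorted(np.unique(batch).tolist()))
```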
```python
class AdversarialCrypto(object):
  """Primary model implementation class for Adversarial Neural Crypto.

  This class contains the code for the model itself,
  and when created, plumbs the pathways from Alice to Bob and
  Eve, creates the optimizers and loss functions, etc.

  Attributes:
    eve_loss: Eve's loss function.
    bob_loss: Bob's loss function.  Different units from eve_loss.
    eve_optimizer: A tf op that runs Eve's optimizer.
    bob_optimizer: A tf op that runs Bob's optimizer.
    bob_reconstruction_loss: Bob's message reconstruction loss,
      which is comparable to eve_loss.
    reset_eve_vars: Execute this op to completely reset Eve.
  """

  def get_message_and_key(self):
    """Generate random pseudo-boolean key and message values."""

    batch_size = tf.placeholder_with_default(FLAGS.batch_size, shape=[])

    in_m = batch_of_random_bools(batch_size, TEXT_SIZE)
    in_k = batch_of_random_bools(batch_size, KEY_SIZE)
    return in_m, in_k

  def model(self, collection, message, key=None):
    """The model for Alice, Bob, and Eve.  If key=None, the first
    fully-connected layer takes only the message as input.  Otherwise,
    it uses both the key and the message.

    Args:
      collection: The graph keys collection to add new vars to.
      message: The input message to process.
      key: The input key (if any) to use.
    """

    if key is not None:
      combined_message = tf.concat(1, [message, key])
    else:
      combined_message = message

    # Ensure that all variables created are in the specified collection.
    with tf.contrib.framework.arg_scope(
        [tf.contrib.layers.fully_connected, tf.contrib.layers.convolution],
        variables_collections=[collection]):

      fc = tf.contrib.layers.fully_connected(
          combined_message,
          TEXT_SIZE + KEY_SIZE,
          biases_initializer=tf.constant_initializer(0.0),
          activation_fn=None)

      # Perform a sequence of 1D convolutions (by expanding the message
      # out to 2D and then squeezing it back down).
      fc = tf.expand_dims(fc, 2)

      # 2,1 -> 1,2
      conv = tf.contrib.layers.convolution(
          fc, 2, 2, 2, 'SAME', activation_fn=tf.nn.sigmoid)
      # 1,2 -> 1,2
      conv = tf.contrib.layers.convolution(
          conv, 2, 1, 1, 'SAME', activation_fn=tf.nn.sigmoid)
      # 1,2 -> 1,1
      conv = tf.contrib.layers.convolution(
          conv, 1, 1, 1, 'SAME', activation_fn=tf.nn.tanh)
      conv = tf.squeeze(conv, 2)
      return conv

  def __init__(self):
    in_m, in_k = self.get_message_and_key()
    encrypted = self.model('alice', in_m, in_k)
    decrypted = self.model('bob', encrypted, in_k)
    eve_out = self.model('eve', encrypted, None)

    self.reset_eve_vars = tf.group(
        *[w.initializer for w in tf.get_collection('eve')])

    optimizer = tf.train.AdamOptimizer(learning_rate=FLAGS.learning_rate)

    # Eve's goal is to decrypt the entire message:
    eve_bits_wrong = tf.reduce_sum(
        tf.abs((eve_out + 1.0) / 2.0 - (in_m + 1.0) / 2.0), [1])
    self.eve_loss = tf.reduce_sum(eve_bits_wrong)
    self.eve_optimizer = optimizer.minimize(
        self.eve_loss, var_list=tf.get_collection('eve'))
```
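The "bits wrong" expression above relies on `(x + 1) / 2` mapping the ±1 encoding back to {0, 1}, so the absolute difference sums to the number of mismatched bits per example. A NumPy sketch of the same computation (illustration only, not part of the file):

```python
import numpy as np

def bits_wrong(guess, truth):
    """Per-example count of mismatched bits for +/-1 encoded messages,
    mirroring tf.reduce_sum(tf.abs((guess + 1)/2 - (truth + 1)/2), [1])."""
    return np.sum(np.abs((guess + 1.0) / 2.0 - (truth + 1.0) / 2.0), axis=1)

truth = np.array([[1, -1, 1, -1]], dtype=np.float32)
guess = np.array([[1, 1, 1, -1]], dtype=np.float32)  # one bit flipped
print(bits_wrong(guess, truth))  # [1.]
```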
```python
    # Alice and Bob want to be accurate...
    self.bob_bits_wrong = tf.reduce_sum(
        tf.abs((decrypted + 1.0) / 2.0 - (in_m + 1.0) / 2.0), [1])
    # ... and to not let Eve do better than guessing.
    self.bob_reconstruction_loss = tf.reduce_sum(self.bob_bits_wrong)
    bob_eve_error_deviation = tf.abs(float(TEXT_SIZE) / 2.0 - eve_bits_wrong)
    # 7-9 bits wrong is OK too, so we squish the error function a bit.
    # Without doing this, we often tend to hang out at 0.25 / 7.5 error,
    # and it seems bad to have continued, high communication error.
    bob_eve_loss = tf.reduce_sum(
        tf.square(bob_eve_error_deviation) / (TEXT_SIZE / 2)**2)

    # Rescale the losses to [0, 1] per example and combine.
    self.bob_loss = (self.bob_reconstruction_loss / TEXT_SIZE + bob_eve_loss)

    self.bob_optimizer = optimizer.minimize(
        self.bob_loss,
        var_list=(tf.get_collection('alice') + tf.get_collection('bob')))
```
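The "squished" deviation term charges Alice/Bob only for how far Eve strays from chance level (TEXT_SIZE/2 bits wrong), and does so quadratically, so being a fraction of a bit off costs almost nothing. A NumPy-free sketch of that penalty (illustration only):

```python
TEXT_SIZE = 16

def eve_deviation_penalty(eve_bits_wrong):
    """Penalty Alice/Bob pay for Eve deviating from chance level (N/2 bits),
    mirroring square(|N/2 - eve_bits_wrong|) / (N/2)**2."""
    deviation = abs(TEXT_SIZE / 2.0 - eve_bits_wrong)
    return deviation ** 2 / (TEXT_SIZE / 2.0) ** 2

print(eve_deviation_penalty(8.0))  # 0.0 -- Eve exactly at guessing level
print(eve_deviation_penalty(0.0))  # 1.0 -- Eve decrypts everything
print(eve_deviation_penalty(7.5))  # 0.00390625 -- tiny, "squished" penalty
```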
```python
def doeval(s, ac, n, itercount):
  """Evaluate the current network on n batches of random examples.

  Args:
    s: The current TensorFlow session
    ac: an instance of the AdversarialCrypto class
    n: The number of iterations to run.
    itercount: Iteration count label for logging.

  Returns:
    Bob and Eve's loss, as a fraction of bits incorrect.
  """
  bob_loss_accum = 0
  eve_loss_accum = 0
  for _ in xrange(n):
    bl, el = s.run([ac.bob_reconstruction_loss, ac.eve_loss])
    bob_loss_accum += bl
    eve_loss_accum += el
  bob_loss_percent = bob_loss_accum / (n * FLAGS.batch_size)
  eve_loss_percent = eve_loss_accum / (n * FLAGS.batch_size)
  print('%d %.2f %.2f' % (itercount, bob_loss_percent, eve_loss_percent))
  sys.stdout.flush()
  return bob_loss_percent, eve_loss_percent


def train_until_thresh(s, ac):
  for j in xrange(MAX_TRAINING_LOOPS):
    for _ in xrange(ITERS_PER_ACTOR):
      s.run(ac.bob_optimizer)
    for _ in xrange(ITERS_PER_ACTOR * EVE_MULTIPLIER):
      s.run(ac.eve_optimizer)
    if j % PRINT_EVERY == 0:
      bob_avg_loss, eve_avg_loss = doeval(s, ac, EVAL_BATCHES, j)
      if (bob_avg_loss < BOB_LOSS_THRESH and eve_avg_loss > EVE_LOSS_THRESH):
        print('Target losses achieved.')
        return True
  return False
```
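The alternating schedule inside `train_until_thresh` (ITERS_PER_ACTOR Alice/Bob steps, then ITERS_PER_ACTOR * EVE_MULTIPLIER Eve steps per outer loop) can be sketched with stub step functions standing in for the real optimizer ops (a standalone illustration, not part of the file):

```python
ITERS_PER_ACTOR = 1
EVE_MULTIPLIER = 2  # Eve gets twice as many updates as Alice/Bob.

log = []

def bob_step():
    log.append('bob')  # stand-in for s.run(ac.bob_optimizer)

def eve_step():
    log.append('eve')  # stand-in for s.run(ac.eve_optimizer)

# Three outer training loops of the alternating schedule.
for _ in range(3):
    for _ in range(ITERS_PER_ACTOR):
        bob_step()
    for _ in range(ITERS_PER_ACTOR * EVE_MULTIPLIER):
        eve_step()

print(log)  # ['bob', 'eve', 'eve', 'bob', 'eve', 'eve', 'bob', 'eve', 'eve']
```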
```python
def train_and_evaluate():
  """Run the full training and evaluation loop."""

  ac = AdversarialCrypto()
  init = tf.global_variables_initializer()

  with tf.Session() as s:
    s.run(init)
    print('# Batch size: ', FLAGS.batch_size)
    print('# Iter Bob_Recon_Error Eve_Recon_Error')

    if train_until_thresh(s, ac):
      for _ in xrange(EVE_EXTRA_ROUNDS):
        s.run(ac.eve_optimizer)
      print('Loss after eve extra training:')
      doeval(s, ac, EVAL_BATCHES * 2, 0)
      for _ in xrange(NUMBER_OF_EVE_RESETS):
        print('Resetting Eve')
        s.run(ac.reset_eve_vars)
        eve_counter = 0
        for _ in xrange(RETRAIN_EVE_LOOPS):
          for _ in xrange(RETRAIN_EVE_ITERS):
            eve_counter += 1
            s.run(ac.eve_optimizer)
          doeval(s, ac, EVAL_BATCHES, eve_counter)
        doeval(s, ac, EVAL_BATCHES, eve_counter)


def main(unused_argv):
  # Exit more quietly with Ctrl-C.
  signal.signal(signal.SIGINT, signal.SIG_DFL)
  train_and_evaluate()


if __name__ == '__main__':
  tf.app.run()
```
## differential_privacy/dp_sgd/README.md

```diff
@@ -46,7 +46,7 @@ https://github.com/panyx0718/models/tree/master/slim
 # Download the data to the data/ directory.
 # List the codes.
-ls -R differential_privacy/
+$ ls -R differential_privacy/
 differential_privacy/:
 dp_sgd  __init__.py  privacy_accountant  README.md
@@ -72,16 +72,16 @@ differential_privacy/privacy_accountant/tf:
 accountant.py  accountant_test.py  BUILD
 # List the data.
-ls -R data/
+$ ls -R data/
 ./data:
 mnist_test.tfrecord  mnist_train.tfrecord
 # Build the codes.
-bazel build -c opt differential_privacy/...
+$ bazel build -c opt differential_privacy/...
 # Run the mnist differential privacy training codes.
-bazel-bin/differential_privacy/dp_sgd/dp_mnist/dp_mnist \
+$ bazel-bin/differential_privacy/dp_sgd/dp_mnist/dp_mnist \
     --training_data_path=data/mnist_train.tfrecord \
     --eval_data_path=data/mnist_test.tfrecord \
     --save_path=/tmp/mnist_dir
@@ -102,6 +102,6 @@ train_accuracy: 0.53
 eval_accuracy: 0.53
 ...
-ls /tmp/mnist_dir/
+$ ls /tmp/mnist_dir/
 checkpoint  ckpt  ckpt.meta  results-0.json
```
## inception/README.md

````diff
@@ -367,6 +367,13 @@ I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:206] Initialize HostPo
 I tensorflow/core/distributed_runtime/rpc/grpc_server_lib.cc:202] Started server with target: grpc://localhost:2222
 ```
 
+If you compiled TensorFlow (from v1.1-rc3) with VERBS support and you have the
+required device and IB verbs SW stack, you can specify --protocol='grpc+verbs'
+in order to use Verbs RDMA for tensor passing between workers and ps.
+The --protocol flag needs to be added in all tasks (ps and workers).
+The default protocol is the TensorFlow default protocol of grpc.
+
 [Congratulations!](https://www.youtube.com/watch?v=9bZkp7q19f0) You are now
 training Inception in a distributed manner.
@@ -749,7 +756,7 @@ batch-splitting the model across multiple GPUs.
   permit training the model with higher learning rates.
 
 * Often the GPU memory is a bottleneck that prevents employing larger batch
-  sizes. Employing more GPUs allows one to use r larger batch sizes because
+  sizes. Employing more GPUs allows one to use larger batch sizes because
   this model splits the batch across the GPUs.
 
 **NOTE** If one wishes to train this model with *asynchronous* gradient updates,
````
## inception/inception/imagenet_distributed_train.py

```diff
@@ -45,7 +45,8 @@ def main(unused_args):
                                        {'ps': ps_hosts,
                                         'worker': worker_hosts},
                                        job_name=FLAGS.job_name,
-                                       task_index=FLAGS.task_id)
+                                       task_index=FLAGS.task_id,
+                                       protocol=FLAGS.protocol)
 
   if FLAGS.job_name == 'ps':
     # `ps` jobs wait for incoming connections from the workers.
```
## inception/inception/inception_distributed_train.py

```diff
@@ -42,6 +42,9 @@ tf.app.flags.DEFINE_string('worker_hosts', '',
                            """Comma-separated list of hostname:port for the """
                            """worker jobs. e.g. """
                            """'machine1:2222,machine2:1111,machine2:2222'""")
+tf.app.flags.DEFINE_string('protocol', 'grpc',
+                           """Communication protocol to use in distributed """
+                           """execution (default grpc) """)
 
 tf.app.flags.DEFINE_string('train_dir', '/tmp/imagenet_train',
                            """Directory where to write event logs """
```
## lm_1b/README.md

````diff
@@ -73,7 +73,7 @@ LSTM-8192-2048 (50\% Dropout) | 32.2 | 3.3
 <b>How To Run</b>
 
-Pre-requesite:
+Prerequisites:
 
 * Install TensorFlow.
 * Install Bazel.
@@ -97,7 +97,7 @@ Pre-requesite:
   [link](http://download.tensorflow.org/models/LM_LSTM_CNN/vocab-2016-09-10.txt)
 * test dataset: [link](http://download.tensorflow.org/models/LM_LSTM_CNN/test/news.en.heldout-00000-of-00050)
-* It is recommended to run on modern desktop instead of laptop.
+* It is recommended to run on a modern desktop instead of a laptop.
 
 ```shell
 # 1. Clone the code to your workspace.
@@ -105,7 +105,7 @@ Prerequisites:
 # 3. Create an empty WORKSPACE file in your workspace.
 # 4. Create an empty output directory in your workspace.
 # Example directory structure below:
-ls -R
+$ ls -R
 .:
 data  lm_1b  output  WORKSPACE
@@ -121,9 +121,9 @@ BUILD  data_utils.py  lm_1b_eval.py  README.md
 ./output:
 
 # Build the codes.
-bazel build -c opt lm_1b/...
+$ bazel build -c opt lm_1b/...
 # Run sample mode:
-bazel-bin/lm_1b/lm_1b_eval --mode sample \
+$ bazel-bin/lm_1b/lm_1b_eval --mode sample \
     --prefix "I love that I" \
     --pbtxt data/graph-2016-09-10.pbtxt \
     --vocab_file data/vocab-2016-09-10.txt \
@@ -138,7 +138,7 @@ I love that I find that amazing
 ... (omitted)
 # Run eval mode:
-bazel-bin/lm_1b/lm_1b_eval --mode eval \
+$ bazel-bin/lm_1b/lm_1b_eval --mode eval \
     --pbtxt data/graph-2016-09-10.pbtxt \
     --vocab_file data/vocab-2016-09-10.txt \
     --input_data data/news.en.heldout-00000-of-00050 \
@@ -166,7 +166,7 @@ Eval Step: 4531, Average Perplexity: 29.285674.
 ... (omitted. At convergence, it should be around 30.)
 # Run dump_emb mode:
-bazel-bin/lm_1b/lm_1b_eval --mode dump_emb \
+$ bazel-bin/lm_1b/lm_1b_eval --mode dump_emb \
     --pbtxt data/graph-2016-09-10.pbtxt \
     --vocab_file data/vocab-2016-09-10.txt \
     --ckpt 'data/ckpt-*' \
@@ -177,17 +177,17 @@ Finished word embedding 0/793471
 Finished word embedding 1/793471
 Finished word embedding 2/793471
 ... (omitted)
-ls output/
+$ ls output/
 embeddings_softmax.npy ...
 # Run dump_lstm_emb mode:
-bazel-bin/lm_1b/lm_1b_eval --mode dump_lstm_emb \
+$ bazel-bin/lm_1b/lm_1b_eval --mode dump_lstm_emb \
     --pbtxt data/graph-2016-09-10.pbtxt \
     --vocab_file data/vocab-2016-09-10.txt \
     --ckpt 'data/ckpt-*' \
     --sentence "I love who I am ." \
     --save_dir output
-ls output/
+$ ls output/
 lstm_emb_step_0.npy  lstm_emb_step_2.npy  lstm_emb_step_4.npy
 lstm_emb_step_6.npy  lstm_emb_step_1.npy  lstm_emb_step_3.npy
 lstm_emb_step_5.npy
````
## next_frame_prediction/README.md

````diff
@@ -34,7 +34,7 @@ to tf.SequenceExample.
 <b>How to run:</b>
 
 ```shell
-ls -R
+$ ls -R
 .:
 data  next_frame_prediction  WORKSPACE
@@ -52,14 +52,14 @@ cross_conv2.png cross_conv3.png cross_conv.png
 # Build everything.
-bazel build -c opt next_frame_prediction/...
+$ bazel build -c opt next_frame_prediction/...
 
 # The following example runs the generated 2d objects.
 # For Sprites dataset, image_size should be 60, norm_scale should be 255.0.
 # Batch size is normally 16~64, depending on your memory size.
 #
 # Run training.
-bazel-bin/next_frame_prediction/cross_conv/train \
+$ bazel-bin/next_frame_prediction/cross_conv/train \
     --batch_size=1 \
     --data_filepattern=data/tfrecords \
     --image_size=64 \
@@ -75,9 +75,9 @@ step: 7, loss: 1.747665
 step: 8, loss: 1.572436
 step: 9, loss: 1.586816
 step: 10, loss: 1.434191
 #
 # Run eval.
-bazel-bin/next_frame_prediction/cross_conv/eval \
+$ bazel-bin/next_frame_prediction/cross_conv/eval \
     --batch_size=1 \
     --data_filepattern=data/tfrecords_test \
     --image_size=64 \
````
## resnet/README.md

````diff
@@ -23,7 +23,7 @@ https://arxiv.org/pdf/1605.07146v1.pdf
 <b>Settings:</b>
 
 * Random split 50k training set into 45k/5k train/eval split.
-* Pad to 36x36 and random crop. Horizontal flip. Per-image whitenting.
+* Pad to 36x36 and random crop. Horizontal flip. Per-image whitening.
 * Momentum optimizer 0.9.
 * Learning rate schedule: 0.1 (40k), 0.01 (60k), 0.001 (>60k).
 * L2 weight decay: 0.002.
@@ -65,37 +65,37 @@ curl -o cifar-100-binary.tar.gz https://www.cs.toronto.edu/~kriz/cifar-100-binar
 <b>How to run:</b>
 
 ```shell
-# cd to the your workspace.
-# It contains an empty WORKSPACE file, resnet codes and cifar10 dataset.
-# Note: User can split 5k from train set for eval set.
-ls -R
+# cd to the models repository and run with bash. Expected command output shown.
+# The directory should contain an empty WORKSPACE file, the resnet code, and
+# the cifar10 dataset.
+# Note: The user can split 5k from train set for eval set.
+$ ls -R
 .:
 cifar10  resnet  WORKSPACE
 
 ./cifar10:
 data_batch_1.bin  data_batch_2.bin  data_batch_3.bin  data_batch_4.bin
 data_batch_5.bin  test_batch.bin
 
 ./resnet:
 BUILD  cifar_input.py  g3doc  README.md  resnet_main.py  resnet_model.py
 
 # Build everything for GPU.
-bazel build -c opt --config=cuda resnet/...
+$ bazel build -c opt --config=cuda resnet/...
 
 # Train the model.
-bazel-bin/resnet/resnet_main --train_data_path=cifar10/data_batch* \
+$ bazel-bin/resnet/resnet_main --train_data_path=cifar10/data_batch* \
     --log_root=/tmp/resnet_model \
    --train_dir=/tmp/resnet_model/train \
    --dataset='cifar10' \
    --num_gpus=1
 
 # While the model is training, you can also check on its progress using tensorboard:
-tensorboard --logdir=/tmp/resnet_model
+$ tensorboard --logdir=/tmp/resnet_model
 
 # Evaluate the model.
 # Avoid running on the same GPU as the training job at the same time,
 # otherwise, you might run out of memory.
-bazel-bin/resnet/resnet_main --eval_data_path=cifar10/test_batch.bin \
+$ bazel-bin/resnet/resnet_main --eval_data_path=cifar10/test_batch.bin \
    --log_root=/tmp/resnet_model \
    --eval_dir=/tmp/resnet_model/test \
    --mode=eval \
````
## resnet/resnet_model.py

```diff
@@ -85,7 +85,7 @@ class ResNet(object):
       # comparably good performance.
       # https://arxiv.org/pdf/1605.07146v1.pdf
       # filters = [16, 160, 320, 640]
-      # Update hps.num_residual_units to 9
+      # Update hps.num_residual_units to 4
 
     with tf.variable_scope('unit_1_0'):
       x = res_func(x, filters[0], filters[1], self._stride_arr(strides[0]),
```
## slim/README.md

````diff
@@ -178,12 +178,12 @@ image classification dataset.
 In the table below, we list each model, the corresponding
 TensorFlow model file, the link to the model checkpoint, and the top 1 and top 5
 accuracy (on the imagenet test set).
-Note that the VGG and ResNet parameters have been converted from their original
+Note that the VGG and ResNet V1 parameters have been converted from their original
 caffe formats
 ([here](https://github.com/BVLC/caffe/wiki/Model-Zoo#models-used-by-the-vgg-team-in-ilsvrc-2014) and
 [here](https://github.com/KaimingHe/deep-residual-networks)),
-whereas the Inception parameters have been trained internally at
+whereas the Inception and ResNet V2 parameters have been trained internally at
 Google. Also be aware that these accuracies were computed by evaluating using a
 single image crop. Some academic papers report higher accuracy by using multiple
 crops at multiple scales.
@@ -195,12 +195,19 @@ Model | TF-Slim File | Checkpoint | Top-1 Accuracy| Top-5 Accuracy |
 [Inception V3](http://arxiv.org/abs/1512.00567)|[Code](https://github.com/tensorflow/models/blob/master/slim/nets/inception_v3.py)|[inception_v3_2016_08_28.tar.gz](http://download.tensorflow.org/models/inception_v3_2016_08_28.tar.gz)|78.0|93.9|
 [Inception V4](http://arxiv.org/abs/1602.07261)|[Code](https://github.com/tensorflow/models/blob/master/slim/nets/inception_v4.py)|[inception_v4_2016_09_09.tar.gz](http://download.tensorflow.org/models/inception_v4_2016_09_09.tar.gz)|80.2|95.2|
 [Inception-ResNet-v2](http://arxiv.org/abs/1602.07261)|[Code](https://github.com/tensorflow/models/blob/master/slim/nets/inception_resnet_v2.py)|[inception_resnet_v2.tar.gz](http://download.tensorflow.org/models/inception_resnet_v2_2016_08_30.tar.gz)|80.4|95.3|
-[ResNet 50](https://arxiv.org/abs/1512.03385)|[Code](https://github.com/tensorflow/models/blob/master/slim/nets/resnet_v1.py)|[resnet_v1_50.tar.gz](http://download.tensorflow.org/models/resnet_v1_50_2016_08_28.tar.gz)|75.2|92.2|
-[ResNet 101](https://arxiv.org/abs/1512.03385)|[Code](https://github.com/tensorflow/models/blob/master/slim/nets/resnet_v1.py)|[resnet_v1_101.tar.gz](http://download.tensorflow.org/models/resnet_v1_101_2016_08_28.tar.gz)|76.4|92.9|
-[ResNet 152](https://arxiv.org/abs/1512.03385)|[Code](https://github.com/tensorflow/models/blob/master/slim/nets/resnet_v1.py)|[resnet_v1_152.tar.gz](http://download.tensorflow.org/models/resnet_v1_152_2016_08_28.tar.gz)|76.8|93.2|
+[ResNet V1 50](https://arxiv.org/abs/1512.03385)|[Code](https://github.com/tensorflow/models/blob/master/slim/nets/resnet_v1.py)|[resnet_v1_50.tar.gz](http://download.tensorflow.org/models/resnet_v1_50_2016_08_28.tar.gz)|75.2|92.2|
+[ResNet V1 101](https://arxiv.org/abs/1512.03385)|[Code](https://github.com/tensorflow/models/blob/master/slim/nets/resnet_v1.py)|[resnet_v1_101.tar.gz](http://download.tensorflow.org/models/resnet_v1_101_2016_08_28.tar.gz)|76.4|92.9|
+[ResNet V1 152](https://arxiv.org/abs/1512.03385)|[Code](https://github.com/tensorflow/models/blob/master/slim/nets/resnet_v1.py)|[resnet_v1_152.tar.gz](http://download.tensorflow.org/models/resnet_v1_152_2016_08_28.tar.gz)|76.8|93.2|
+[ResNet V2 50](https://arxiv.org/abs/1603.05027)^|[Code](https://github.com/tensorflow/models/blob/master/slim/nets/resnet_v2.py)|[resnet_v2_50.tar.gz](http://download.tensorflow.org/models/resnet_v2_50_2017_04_14.tar.gz)|75.6|92.8|
+[ResNet V2 101](https://arxiv.org/abs/1603.05027)^|[Code](https://github.com/tensorflow/models/blob/master/slim/nets/resnet_v2.py)|[resnet_v2_101.tar.gz](http://download.tensorflow.org/models/resnet_v2_101_2017_04_14.tar.gz)|77.0|93.7|
+[ResNet V2 152](https://arxiv.org/abs/1603.05027)^|[Code](https://github.com/tensorflow/models/blob/master/slim/nets/resnet_v2.py)|[resnet_v2_152.tar.gz](http://download.tensorflow.org/models/resnet_v2_152_2017_04_14.tar.gz)|77.8|94.1|
 [VGG 16](http://arxiv.org/abs/1409.1556.pdf)|[Code](https://github.com/tensorflow/models/blob/master/slim/nets/vgg.py)|[vgg_16.tar.gz](http://download.tensorflow.org/models/vgg_16_2016_08_28.tar.gz)|71.5|89.8|
 [VGG 19](http://arxiv.org/abs/1409.1556.pdf)|[Code](https://github.com/tensorflow/models/blob/master/slim/nets/vgg.py)|[vgg_19.tar.gz](http://download.tensorflow.org/models/vgg_19_2016_08_28.tar.gz)|71.1|89.8|
+
+^ ResNet V2 models use Inception pre-processing and input image size of 299 (use
+`--preprocessing_name inception --eval_image_size 299` when using
+`eval_image_classifier.py`). Performance numbers for ResNet V2 models are
+reported on the ImageNet validation set.
 
 Here is an example of how to download the Inception V3 checkpoint:
@@ -344,10 +351,10 @@ following error:
 ```bash
 InvalidArgumentError: Assign requires shapes of both tensors to match. lhs
 shape= [1001] rhs shape= [1000]
 ```
-This is due to the fact that the VGG and ResNet final layers have only 1000
+This is due to the fact that the VGG and ResNet V1 final layers have only 1000
 outputs rather than 1001.
-To fix this issue, you can set the `--labels_offsets=1` flag. This results in
+To fix this issue, you can set the `--labels_offset=1` flag. This results in
 the ImageNet labels being shifted down by one:
````
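What the `--labels_offset=1` flag does can be sketched conceptually (hypothetical label values; the real flag is applied inside the slim training/eval scripts): the 1001-way ImageNet label space used by the Inception checkpoints reserves index 0 for a background class, while VGG and ResNet V1 heads emit only 1000 logits, so labels are shifted down by one to align the two spaces.

```python
labels_offset = 1
# Hypothetical ground-truth labels in the 1001-class space (0 = background).
labels_1001 = [1, 5, 1000]
# Shift down to align with a 1000-output VGG / ResNet V1 final layer.
labels_1000 = [label - labels_offset for label in labels_1001]
print(labels_1000)  # [0, 4, 999]
```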
## textsum/README.md

```diff
@@ -16,7 +16,7 @@ The results described below are based on model trained on multi-gpu and
 multi-machine settings. It has been simplified to run on only one machine
 for open source purpose.
 
-<b>DataSet</b>
+<b>Dataset</b>
 
 We used the Gigaword dataset described in
 [Rush et al. A Neural Attention Model for Sentence Summarization](https://arxiv.org/abs/1509.00685).
```
## tutorials/rnn/ptb/ptb_word_lm.py

```diff
@@ -157,7 +157,7 @@ class PTBModel(object):
         (cell_output, state) = cell(inputs[:, time_step, :], state)
         outputs.append(cell_output)
 
-    output = tf.reshape(tf.concat(axis=1, values=outputs), [-1, size])
+    output = tf.reshape(tf.stack(axis=1, values=outputs), [-1, size])
     softmax_w = tf.get_variable(
         "softmax_w", [size, vocab_size], dtype=data_type())
     softmax_b = tf.get_variable("softmax_b", [vocab_size], dtype=data_type())
```