Commit e46cbff3 authored by Matt Rickard's avatar Matt Rickard Committed by Neal Wu
Browse files

Revert merge #1292

This broke the bazel build of inception/download_and_preprocess_flowers
The way that this script is written doesn't actually allow it to be ran
outside bazel, some refactoring would be needed if you want to run it
standalone.

It should be ran using

```
bazel build inception/download_and_preprocess_flowers

bazel-bin/inception/download_and_preprocess_flowers
"${FLOWERS_DATA_DIR}"
```
parent 405bb623
......@@ -198,7 +198,7 @@ def _process_image(filename, coder):
width: integer, image width in pixels.
"""
# Read the image file.
with tf.gfile.FastGFile(filename, 'rb') as f:
with tf.gfile.FastGFile(filename, 'r') as f:
image_data = f.read()
# Convert any PNG to JPEG's for consistency.
......
......@@ -41,32 +41,31 @@ fi
# Create the output and temporary directories.
DATA_DIR="${1%/}"
SCRATCH_DIR="${DATA_DIR}/raw-data"
SCRATCH_DIR="${DATA_DIR}/raw-data/"
mkdir -p "${DATA_DIR}"
mkdir -p "${SCRATCH_DIR}"
# http://stackoverflow.com/questions/59895/getting-the-source-directory-of-a-bash-script-from-within
WORK_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
WORK_DIR="$0.runfiles/inception/inception"
# Download the flowers data.
DATA_URL="http://download.tensorflow.org/example_images/flower_photos.tgz"
CURRENT_DIR=$(pwd)
cd "${DATA_DIR}"
TARBALL="flower_photos.tgz"
if [ ! -f ${TARBALL} ]; then
echo "Downloading flower data set."
curl -o ${DATA_DIR}/${TARBALL} "${DATA_URL}"
curl -o ${TARBALL} "${DATA_URL}"
else
echo "Skipping download of flower data."
fi
# Note the locations of the train and validation data.
TRAIN_DIRECTORY="${SCRATCH_DIR}/train"
VALIDATION_DIRECTORY="${SCRATCH_DIR}/validation"
TRAIN_DIRECTORY="${SCRATCH_DIR}train/"
VALIDATION_DIRECTORY="${SCRATCH_DIR}validation/"
# Expands the data into the flower_photos/ directory and rename it as the
# train directory.
tar xf ${DATA_DIR}/flower_photos.tgz
tar xf flower_photos.tgz
rm -rf "${TRAIN_DIRECTORY}" "${VALIDATION_DIRECTORY}"
mkdir -p "${TRAIN_DIRECTORY}"
mv flower_photos "${TRAIN_DIRECTORY}"
# Generate a list of 5 labels: daisy, dandelion, roses, sunflowers, tulips
......@@ -75,22 +74,22 @@ ls -1 "${TRAIN_DIRECTORY}" | grep -v 'LICENSE' | sed 's/\///' | sort > "${LABELS
# Generate the validation data set.
while read LABEL; do
VALIDATION_DIR_FOR_LABEL="${VALIDATION_DIRECTORY}/${LABEL}"
TRAIN_DIR_FOR_LABEL="${TRAIN_DIRECTORY}/${LABEL}"
VALIDATION_DIR_FOR_LABEL="${VALIDATION_DIRECTORY}${LABEL}"
TRAIN_DIR_FOR_LABEL="${TRAIN_DIRECTORY}${LABEL}"
# Move the first randomly selected 100 images to the validation set.
mkdir -p "${VALIDATION_DIR_FOR_LABEL}"
VALIDATION_IMAGES=$(ls -1 "${TRAIN_DIR_FOR_LABEL}" | shuf | head -100)
for IMAGE in ${VALIDATION_IMAGES}; do
mv -f "${TRAIN_DIRECTORY}/${LABEL}/${IMAGE}" "${VALIDATION_DIR_FOR_LABEL}"
mv -f "${TRAIN_DIRECTORY}${LABEL}/${IMAGE}" "${VALIDATION_DIR_FOR_LABEL}"
done
done < "${LABELS_FILE}"
# Build the TFRecords version of the image data.
cd "${CURRENT_DIR}"
BUILD_SCRIPT="${WORK_DIR}/build_image_data.py"
BUILD_SCRIPT="${WORK_DIR}/build_image_data"
OUTPUT_DIRECTORY="${DATA_DIR}"
python "${BUILD_SCRIPT}" \
"${BUILD_SCRIPT}" \
--train_directory="${TRAIN_DIRECTORY}" \
--validation_directory="${VALIDATION_DIRECTORY}" \
--output_directory="${OUTPUT_DIRECTORY}" \
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment