ModelZoo / ResNet50_pytorch
Commit 754be2d7
authored Apr 03, 2023 by panning
Add data processing script
parent e397f630
Showing 1 changed file with 80 additions and 0 deletions
scrips/extract_ILSVRC.sh
0 → 100644
#!/bin/bash
#
# script to extract ImageNet dataset
# ILSVRC2012_img_train.tar (about 138 GB)
# ILSVRC2012_img_val.tar (about 6.3 GB)
# Make sure ILSVRC2012_img_train.tar and ILSVRC2012_img_val.tar are in your current directory
#
# Adapted from:
# https://github.com/facebook/fb.resnet.torch/blob/master/INSTALL.md
# https://gist.github.com/BIGBALLON/8a71d225eff18d88e469e6ea9b39cef4
#
# imagenet/train/
# ├── n01440764
# │ ├── n01440764_10026.JPEG
# │ ├── n01440764_10027.JPEG
# │ ├── ......
# ├── ......
# imagenet/val/
# ├── n01440764
# │ ├── ILSVRC2012_val_00000293.JPEG
# │ ├── ILSVRC2012_val_00002138.JPEG
# │ ├── ......
# ├── ......
#
#
# Make imagenet directory
#
mkdir imagenet
#
# Extract the training data:
#
# Create train directory; move .tar file; change directory
mkdir imagenet/train && mv ILSVRC2012_img_train.tar imagenet/train/ && cd imagenet/train
# Extract training set; remove compressed file
tar -xvf ILSVRC2012_img_train.tar && rm -f ILSVRC2012_img_train.tar
#
# At this stage imagenet/train will contain 1000 compressed .tar files, one for each category
#
# For each .tar file:
# 1. create directory with same name as .tar file
# 2. extract and copy contents of .tar file into directory
# 3. remove .tar file
find . -name "*.tar" | while read NAME ; do mkdir -p "${NAME%.tar}"; tar -xvf "${NAME}" -C "${NAME%.tar}"; rm -f "${NAME}"; done
#
# This results in a training directory like so:
#
# imagenet/train/
# ├── n01440764
# │ ├── n01440764_10026.JPEG
# │ ├── n01440764_10027.JPEG
# │ ├── ......
# ├── ......
#
# Change back to original directory
cd ../..
#
# Extract the validation data and move images to subfolders:
#
# Create validation directory; move .tar file; change directory; extract validation .tar; remove compressed file
mkdir imagenet/val && mv ILSVRC2012_img_val.tar imagenet/val/ && cd imagenet/val && tar -xvf ILSVRC2012_img_val.tar && rm -f ILSVRC2012_img_val.tar
# get script from soumith and run; this script creates all class directories and moves images into corresponding directories
wget -qO- https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh | bash
#
# This results in a validation directory like so:
#
# imagenet/val/
# ├── n01440764
# │ ├── ILSVRC2012_val_00000293.JPEG
# │ ├── ILSVRC2012_val_00002138.JPEG
# │ ├── ......
# ├── ......
#
#
# Check the total number of files after extraction
#
# $ find train/ -name "*.JPEG" | wc -l
# 1281167
# $ find val/ -name "*.JPEG" | wc -l
# 50000
#
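
The closing comments of the script give the expected file counts (1281167 training images, 50000 validation images) but leave the check as a manual step. A minimal sketch of how that check could be automated follows. The helper names `count_jpegs` and `check_count` are hypothetical and not part of the commit; the expected numbers are taken from the comments above.

```shell
#!/bin/bash
# Sketch only: count_jpegs and check_count are hypothetical helpers,
# not part of scrips/extract_ILSVRC.sh.

# Print the number of regular files ending in .JPEG under the given directory.
count_jpegs() {
    find "$1" -type f -name "*.JPEG" | wc -l | tr -d '[:space:]'
}

# Compare the actual count with an expected count.
# Prints a status line and returns non-zero on a mismatch.
check_count() {
    local dir="$1" expected="$2" actual
    actual=$(count_jpegs "$dir")
    if [ "$actual" -eq "$expected" ]; then
        echo "OK: $dir contains $actual JPEG files"
    else
        echo "MISMATCH: $dir contains $actual JPEG files, expected $expected" >&2
        return 1
    fi
}

# Usage, run from inside imagenet/ after extraction:
#   check_count train 1281167
#   check_count val 50000
```

Returning a non-zero status on mismatch lets a caller chain the check with `&&` before starting a training run, in the same style the script itself uses.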