OpenDAS / OpenFold
Commit 8881b210, authored Aug 09, 2022 by Gustaf Ahdritz
Add RODA flattening script
Parent: 7e3cd77f

Showing 2 changed files with 46 additions and 1 deletion
README.md (+2, -1)
scripts/flatten_roda.sh (+44, -0)

README.md
@@ -326,7 +326,8 @@ script.
 If you're using your own MSAs or MSAs from the RODA repository, make sure that
 the `alignment_dir` contains one directory per chain and that each of these
-contains alignments (.sto, .a3m, and .hhr) corresponding to that chain.
+contains alignments (.sto, .a3m, and .hhr) corresponding to that chain. You
+can use `scripts/flatten_roda.sh` to reformat RODA downloads in this way.
 
 Note that, despite its variable name, `mmcif_dir` can also contain PDB files
 or even ProteinNet .core files. To emulate the AlphaFold training procedure,

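The layout the README asks for (one directory per chain under `alignment_dir`, each holding .sto, .a3m, or .hhr files) can be sanity-checked before training. The following is a minimal sketch in POSIX sh; the function name `check_alignment_dir` is hypothetical and is not part of OpenFold or of this commit.

```shell
#!/usr/bin/env sh
# Hypothetical helper (not part of OpenFold): reports every chain directory
# under the given alignment_dir that holds no .sto, .a3m, or .hhr file.
# The body runs in a subshell, so its variables do not leak to the caller.
check_alignment_dir() (
    alignment_dir=$1
    status=0
    for chain_dir in "${alignment_dir}"/*/; do
        # An unmatched glob stays literal; skip anything that is not a directory.
        [ -d "${chain_dir}" ] || continue
        found=0
        for f in "${chain_dir}"*.sto "${chain_dir}"*.a3m "${chain_dir}"*.hhr; do
            if [ -f "${f}" ]; then
                found=1
                break
            fi
        done
        if [ "${found}" -eq 0 ]; then
            echo "no alignments in: ${chain_dir%/}"
            status=1
        fi
    done
    # This exits only the function's subshell, not the calling shell.
    exit "${status}"
)
```

Calling `check_alignment_dir path/to/alignment_dir` returns 0 when every chain directory contains at least one alignment file and 1 otherwise, printing the directories that come up empty.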
scripts/flatten_roda.sh
new file, mode 0 → 100755

#!/usr/bin/env sh
#
# Flattens a downloaded RODA database into the format expected by OpenFold
# Args:
#     roda_dir:
#         The path to the database you want to flatten. E.g. "roda/pdb"
#         or "roda/uniclust30". Note that, to save space, this script
#         will empty this directory.
#     output_dir:
#         The directory in which to construct the reformatted data

if [[ $# != 2 ]]; then
    echo "usage: ./flatten_roda.sh <roda_dir> <output_dir>"
    exit 1
fi

RODA_DIR=$1
OUTPUT_DIR=$2

DATA_DIR="${OUTPUT_DIR}/data"
ALIGNMENT_DIR="${OUTPUT_DIR}/alignments"

mkdir -p "${DATA_DIR}"
mkdir -p "${ALIGNMENT_DIR}"

for chain_dir in $(ls "${RODA_DIR}"); do
    CHAIN_DIR_PATH="${RODA_DIR}/${chain_dir}"
    for subdir in $(ls "${CHAIN_DIR_PATH}"); do
        if [[ $subdir = "pdb" ]] || [[ $subdir = "cif" ]]; then
            CHAIN_DATA_DIR="${DATA_DIR}/${chain_dir}"
            mkdir -p "${CHAIN_DATA_DIR}"
            mv "${CHAIN_DIR_PATH}/${subdir}"/* "${CHAIN_DATA_DIR}"
        else
            CHAIN_ALIGNMENT_DIR="${ALIGNMENT_DIR}/${chain_dir}"
            mkdir -p "${CHAIN_ALIGNMENT_DIR}"
            mv "${CHAIN_DIR_PATH}/${subdir}"/* "${CHAIN_ALIGNMENT_DIR}"
        fi
    done
done

NO_DATA_FILES=$(find "${DATA_DIR}" -type f | wc -l)
if [[ $NO_DATA_FILES = 0 ]]; then
    rm -rf ${DATA_DIR}
fi
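
One portability caveat with the script as committed: `[[ ... ]]` is a bash/ksh extension, while the shebang requests `sh`, which on some systems (dash on Debian and Ubuntu, for example) does not support it. Iterating over `$(ls ...)` also splits names on whitespace, and the final `rm -rf ${DATA_DIR}` is unquoted. The following is a sketch of the same flattening logic in POSIX sh, wrapped in a function. The name `flatten_roda` and the choice to skip non-directory entries are assumptions of this sketch, not part of the commit.

```shell
#!/usr/bin/env sh
# Sketch of a POSIX-sh variant of the flattening logic (not the committed
# script). The body runs in a subshell so its variables stay local.
flatten_roda() (
    roda_dir=$1
    output_dir=$2
    data_dir="${output_dir}/data"
    alignment_dir="${output_dir}/alignments"
    mkdir -p "${data_dir}" "${alignment_dir}"

    # Globs replace `ls` parsing, so names with spaces are handled.
    for chain_path in "${roda_dir}"/*/; do
        [ -d "${chain_path}" ] || continue
        chain=$(basename "${chain_path}")
        for sub_path in "${chain_path}"*/; do
            [ -d "${sub_path}" ] || continue
            # Structure files go to data/, everything else to alignments/.
            case $(basename "${sub_path}") in
                pdb|cif) dest="${data_dir}/${chain}" ;;
                *)       dest="${alignment_dir}/${chain}" ;;
            esac
            mkdir -p "${dest}"
            # Move entries one by one so an empty subdirectory is a no-op.
            for f in "${sub_path}"*; do
                [ -e "${f}" ] || continue
                mv "${f}" "${dest}/"
            done
        done
    done

    # Remove the data directory if no structure files ended up in it.
    if [ -z "$(find "${data_dir}" -type f | head -n 1)" ]; then
        rm -rf "${data_dir}"
    fi
)
```

A call such as `flatten_roda roda/pdb flattened` would, under these assumptions, produce `flattened/alignments/<chain>/` and, when structure files are present, `flattened/data/<chain>/`. As with the original script, files are moved rather than copied, so the source directory is emptied.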