Commit 04fa3548 authored by sft-managed

minor fixes

parent 6dc8aa7f
...@@ -39,13 +39,13 @@ scripts/install_third_party_dependencies.sh ...@@ -39,13 +39,13 @@ scripts/install_third_party_dependencies.sh
To activate the environment, run: To activate the environment, run:
```bash ```bash
source scripts/activate_conda_venv.sh source scripts/activate_conda_env.sh
``` ```
To deactivate it, run: To deactivate it, run:
```bash ```bash
source scripts/deactivate_conda_venv.sh source scripts/deactivate_conda_env.sh
``` ```
## Usage ## Usage
...@@ -60,8 +60,7 @@ This script depends on `aria2c`. ...@@ -60,8 +60,7 @@ This script depends on `aria2c`.
### Inference ### Inference
To run inference on a sequence using a set of DeepMind's pretrained parameters, To run inference on a sequence `target.fasta` (e.g., `wget https://www.rcsb.org/fasta/entry/4DSN`) using a set of DeepMind's pretrained parameters, run e.g.
run e.g.
```bash ```bash
python3 run_pretrained_openfold.py \ python3 run_pretrained_openfold.py \
...@@ -73,15 +72,22 @@ python3 run_pretrained_openfold.py \ ...@@ -73,15 +72,22 @@ python3 run_pretrained_openfold.py \
data/uniclust30/uniclust30_2018_08/uniclust30_2018_08 \ data/uniclust30/uniclust30_2018_08/uniclust30_2018_08 \
--output_dir ./ \ --output_dir ./ \
--bfd_database_path data/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt \ --bfd_database_path data/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt \
--device cuda:1 --device cuda:1 \
--jackhmmer_binary_path lib/conda/envs/openfold_venv/bin/jackhmmer \
--hhblits_binary_path lib/conda/envs/openfold_venv/bin/hhblits \
--hhsearch_binary_path lib/conda/envs/openfold_venv/bin/hhsearch \
--kalign_binary_path lib/conda/envs/openfold_venv/bin/kalign
``` ```
where `data` is the same directory as in the previous step. where `data` is the same directory as in the previous step. If `jackhmmer`, `hhblits`, `hhsearch` and `kalign` are available at the default path of `/usr/bin`, their `binary_path` command-line arguments can be dropped.
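
As a rough illustration of the note above, the `binary_path` flags could be assembled automatically instead of typed by hand. This is only a sketch and not part of OpenFold: the helper name `binary_path_flags` is hypothetical, and it assumes the four tools are discoverable on `PATH` (for example, after activating the conda environment). Only the flag names and the `/usr/bin` default come from the README text.

```python
import os
import shutil
from typing import Callable, List, Optional, Sequence

# Tool names taken from the README's --*_binary_path flags.
TOOLS = ("jackhmmer", "hhblits", "hhsearch", "kalign")


def binary_path_flags(
    tools: Sequence[str] = TOOLS,
    default_dir: str = "/usr/bin",
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return --<tool>_binary_path flags for tools that live outside default_dir.

    Hypothetical helper: tools found in default_dir need no flag, per the README.
    """
    flags: List[str] = []
    for tool in tools:
        found = which(tool)
        if found is None:
            raise FileNotFoundError(f"{tool} was not found on PATH")
        # Only emit a flag when the binary is not at the default location.
        if os.path.dirname(found) != default_dir:
            flags += [f"--{tool}_binary_path", found]
    return flags
```

Calling `binary_path_flags()` inside the activated environment should yield a list that can be appended to the `run_pretrained_openfold.py` argument list; it returns an empty list when every tool is in `/usr/bin`.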
### Training ### Training
To train the model, you will first need to precompute protein alignments. After After activating the OpenFold environment with `source scripts/activate_conda_env.sh`, install OpenFold by running
installing OpenFold using `setup.py`, do so with: ```bash
python setup.py install
```
To train the model, you will first need to precompute protein alignments. Create `mmcif_dir/` and download `.cif` files from the PDB (e.g., `wget https://files.rcsb.org/download/4DSN.cif`). Then run:
```bash ```bash
python3 scripts/precompute_alignments.py mmcif_dir/ alignment_dir/ \ python3 scripts/precompute_alignments.py mmcif_dir/ alignment_dir/ \
...@@ -91,9 +97,13 @@ python3 scripts/precompute_alignments.py mmcif_dir/ alignment_dir/ \ ...@@ -91,9 +97,13 @@ python3 scripts/precompute_alignments.py mmcif_dir/ alignment_dir/ \
data/pdb_mmcif/mmcif_files/ \ data/pdb_mmcif/mmcif_files/ \
data/uniclust30/uniclust30_2018_08/uniclust30_2018_08 \ data/uniclust30/uniclust30_2018_08/uniclust30_2018_08 \
--bfd_database_path data/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt \ --bfd_database_path data/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt \
--cpus 16 --cpus 16 \
--jackhmmer_binary_path lib/conda/envs/openfold_venv/bin/jackhmmer \
--hhblits_binary_path lib/conda/envs/openfold_venv/bin/hhblits \
--hhsearch_binary_path lib/conda/envs/openfold_venv/bin/hhsearch \
--kalign_binary_path lib/conda/envs/openfold_venv/bin/kalign
``` ```
As noted before, you can skip the `binary_path` arguments if these binaries are at `/usr/bin`.
Expect this step to take a very long time, even for small numbers of proteins. Expect this step to take a very long time, even for small numbers of proteins.
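
The alignment step above reads `.cif` files from `mmcif_dir/`. For more than a handful of entries, the `wget` call from the README could be scripted. The sketch below is an assumption-laden illustration, not an OpenFold utility: `cif_url` and `download_cifs` are hypothetical names, the URL pattern is generalized from the single `4DSN.cif` example in the text, and whether the alignment script cares about file-name case is not stated in the source.

```python
import os
import urllib.request
from typing import Iterable, List

# URL pattern generalized from the README's 4DSN example (an assumption).
RCSB_DOWNLOAD_URL = "https://files.rcsb.org/download/{pdb_id}.cif"


def cif_url(pdb_id: str) -> str:
    """Build the download URL for one PDB entry, e.g. '4dsn' -> .../4DSN.cif."""
    return RCSB_DOWNLOAD_URL.format(pdb_id=pdb_id.strip().upper())


def download_cifs(pdb_ids: Iterable[str], out_dir: str = "mmcif_dir") -> List[str]:
    """Download each entry into out_dir, skipping files that already exist."""
    os.makedirs(out_dir, exist_ok=True)
    paths: List[str] = []
    for pdb_id in pdb_ids:
        url = cif_url(pdb_id)
        dest = os.path.join(out_dir, os.path.basename(url))
        if not os.path.exists(dest):
            # Network access happens here; errors propagate to the caller.
            urllib.request.urlretrieve(url, dest)
        paths.append(dest)
    return paths
```

For example, `download_cifs(["4DSN"])` would create `mmcif_dir/4DSN.cif` if it is not already present.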
Next, generate a cache of certain datapoints in the mmCIF files: Next, generate a cache of certain datapoints in the mmCIF files:
......
...@@ -4,6 +4,7 @@ import logging ...@@ -4,6 +4,7 @@ import logging
from multiprocessing import Pool from multiprocessing import Pool
import os import os
import sys import sys
import json
sys.path.append(".") # an innocent hack to get this to run from the top level sys.path.append(".") # an innocent hack to get this to run from the top level
from tqdm import tqdm from tqdm import tqdm
......
...@@ -21,6 +21,10 @@ conda update -qy conda \ ...@@ -21,6 +21,10 @@ conda update -qy conda \
openmm=7.5.1 \ openmm=7.5.1 \
pdbfixer pdbfixer
# Comment out if you have these already installed on your system, for example in /usr/bin/
conda install -c bioconda aria2
conda install -y -c bioconda hmmer==3.3.2 hhsuite==3.3.0 kalign2==2.04
# Install DeepMind's OpenMM patch # Install DeepMind's OpenMM patch
OPENFOLD_DIR=$PWD OPENFOLD_DIR=$PWD
pushd lib/conda/envs/$ENV_NAME/lib/python3.7/site-packages/ \ pushd lib/conda/envs/$ENV_NAME/lib/python3.7/site-packages/ \
......
...@@ -3,9 +3,14 @@ import logging ...@@ -3,9 +3,14 @@ import logging
import os import os
import tempfile import tempfile
import openfold.features.mmcif_parsing as mmcif_parsing import openfold.data.mmcif_parsing as mmcif_parsing
from openfold.features.data_pipeline import AlignmentRunner from openfold.data.data_pipeline import AlignmentRunner
from scripts.utils import add_data_args
from utils import add_data_args
#python3 scripts/precompute_alignments.py mmcif_dir/ alignment_dir/ data/uniref90/uniref90.fasta data/mgnify/mgy_clusters_2018_12.fa data/pdb70/pdb70 data/pdb_mmcif/mmcif_files/ data/uniclust30/uniclust30_2018_08/uniclust30_2018_08 --bfd_database_path data/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt --cpus 16 --jackhmmer_binary_path /home/u00u98too4mkqFBu8M357/openfold/lib/conda/envs/openfold_venv/bin/jackhmmer --hhblits_binary_path /home/u00u98too4mkqFBu8M357/openfold/lib/conda/envs/openfold_venv/bin/hhblits --hhsearch_binary_path /home/u00u98too4mkqFBu8M357/openfold/lib/conda/envs/openfold_venv/bin/hhsearch --kalign_binary_path /home/u00u98too4mkqFBu8M357/openfold/lib/conda/envs/openfold_venv/bin/kalign
logging.basicConfig(level=logging.DEBUG)
def main(args): def main(args):
......