Commit 998155e9 authored by Gustaf Ahdritz

Merge branch 'main' of github.com:aqlaboratory/openfold into chunking_experiment

parents 676b6668 c9e0f894
name: Docker Image CI
on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v2
    - name: Build the Docker image
      run: docker build . --file Dockerfile --tag openfold:$(date +%s)
\ No newline at end of file
FROM nvidia/cuda:11.0-cudnn8-runtime-ubuntu18.04
# I'm not sure why i needed both opencl and cuda here, but the relax phase of the script needed opencl
RUN apt-get update && apt-get install -y wget cuda-minimal-build-11-0 nvidia-opencl-dev git
RUN wget -P /tmp \
"https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh" \
&& bash /tmp/Miniconda3-latest-Linux-x86_64.sh -b -p /opt/conda \
&& rm /tmp/Miniconda3-latest-Linux-x86_64.sh
ENV PATH /opt/conda/bin:$PATH
COPY environment.yml /opt/openfold/environment.yml
# installing into the base environment since the docker container wont do anything other than run openfold
RUN conda env update -n base --file /opt/openfold/environment.yml && conda clean --all
COPY openfold /opt/openfold/openfold
COPY scripts /opt/openfold/scripts
COPY run_pretrained_openfold.py /opt/openfold/run_pretrained_openfold.py
COPY train_openfold.py /opt/openfold/train_openfold.py
COPY setup.py /opt/openfold/setup.py
COPY lib/openmm.patch /opt/openfold/lib/openmm.patch
RUN wget -q -P /opt/openfold/openfold/resources \
https://git.scicore.unibas.ch/schwede/openstructure/-/raw/7102c63615b64735c4941278d92b554ec94415f8/modules/mol/alg/src/stereo_chemical_props.txt
RUN patch -p0 -d /opt/conda/lib/python3.7/site-packages/ < /opt/openfold/lib/openmm.patch
RUN python3 /opt/openfold/setup.py install
@@ -23,12 +23,10 @@ or `bfloat16` half-precision.
 ## Installation (Linux)
-Python dependencies available through `pip` are provided in `requirements.txt`.
-OpenFold depends on `openmm==7.5.1` and `pdbfixer`, which are only available
-via `conda`. For producing sequence alignments, you'll also need
-`kalign`, the [HH-suite](https://github.com/soedinglab/hh-suite), and one of
-{`jackhmmer`, [MMseqs2](https://github.com/soedinglab/mmseqs2) (nightly build)} installed on
-on your system. Finally, some download scripts require `aria2c`.
+All Python dependencies are specified in `environment.yml`. For producing sequence
+alignments, you'll also need `kalign`, the [HH-suite](https://github.com/soedinglab/hh-suite),
+and one of {`jackhmmer`, [MMseqs2](https://github.com/soedinglab/mmseqs2) (nightly build)}
+installed on your system. Finally, some download scripts require `aria2c`.
 For convenience, we provide a script that installs Miniconda locally, creates a
 `conda` virtual environment, installs all Python dependencies, and downloads
@@ -235,13 +233,63 @@ environment. These run components of AlphaFold and OpenFold side by side and
 ensure that output activations are adequately similar. For most modules, we
 target a maximum pointwise difference of `1e-4`.
+## Building and using the Docker container
+### Building the Docker image
+OpenFold can be built as a Docker image using the included Dockerfile. To build it, run the following command from the root of this repository:
+```bash
+docker build -t openfold .
+```
+### Running the Docker container
+The built container contains both `run_pretrained_openfold.py` and `train_openfold.py` as well as all necessary software dependencies. It does not contain the model parameters or the sequence and structural databases. These should be downloaded to the host machine following the instructions in the Usage section above.
+The Docker container installs all conda components to the base conda environment in `/opt/conda`, and installs OpenFold itself in `/opt/openfold`.
+Before running the Docker container, you can verify that your Docker installation is able to properly communicate with your GPU by running the following command:
+```bash
+docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
+```
+Note the `--gpus all` option passed to `docker run`. This option is necessary in order for the container to use the GPUs on the host machine.
+To run the inference code under Docker, you can use a command like the one below. In this example, the parameters and sequence databases from the AlphaFold dataset are located at `/mnt/alphafold_database` on the host machine, and the input files are located in the current working directory. You can adjust the volume mount locations as needed to reflect the locations of your data.
+```bash
+docker run \
+--gpus all \
+-v $PWD/:/data \
+-v /mnt/alphafold_database/:/database \
+-ti openfold:latest \
+python3 /opt/openfold/run_pretrained_openfold.py \
+/data/input.fasta \
+/database/uniref90/uniref90.fasta \
+/database/mgnify/mgy_clusters_2018_12.fa \
+/database/pdb70/pdb70 \
+/database/pdb_mmcif/mmcif_files/ \
+/database/uniclust30/uniclust30_2018_08/uniclust30_2018_08 \
+--output_dir /data \
+--bfd_database_path /database/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt \
+--model_device cuda:0 \
+--jackhmmer_binary_path /opt/conda/bin/jackhmmer \
+--hhblits_binary_path /opt/conda/bin/hhblits \
+--hhsearch_binary_path /opt/conda/bin/hhsearch \
+--kalign_binary_path /opt/conda/bin/kalign \
+--param_path /database/params/params_model_1.npz
+```
 ## Copyright notice
 While AlphaFold's and, by extension, OpenFold's source code is licensed under
 the permissive Apache Licence, Version 2.0, DeepMind's pretrained parameters
-remain under the more restrictive CC BY-NC 4.0 license, a copy of which is
-downloaded to `openfold/resources/params` by the installation script. They are
-thereby made unavailable for commercial use.
+fall under the CC BY 4.0 license, a copy of which is downloaded to
+`openfold/resources/params` by the installation script. Note that the latter
+replaced the original, more restrictive CC BY-NC 4.0 license as of January 2022.
 ## Contributing
......
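The `docker run` invocation added to the README maps two host directories to `/data` and `/database` inside the container and then passes container-side paths to `run_pretrained_openfold.py`. A minimal Python sketch of that mapping follows, assuming the same database layout as the README example. The helper name `build_docker_command` and the host path `/home/user/job` are hypothetical and not part of the repository; the flags and container paths are copied from the README command.

```python
import shlex
from pathlib import PurePosixPath
from typing import List


def build_docker_command(
    input_dir: PurePosixPath,
    database_dir: PurePosixPath,
    fasta_name: str = "input.fasta",
    image: str = "openfold:latest",
) -> List[str]:
    """Assemble the README's `docker run` invocation as an argument list.

    Hypothetical helper for illustration; it only builds the command and
    never runs it.
    """
    data = "/data"          # container-side mount for inputs and outputs
    db = "/database"        # container-side mount for the databases
    conda_bin = "/opt/conda/bin"
    return [
        "docker", "run",
        "--gpus", "all",
        "-v", f"{input_dir}:{data}",
        "-v", f"{database_dir}:{db}",
        "-ti", image,
        "python3", "/opt/openfold/run_pretrained_openfold.py",
        f"{data}/{fasta_name}",
        f"{db}/uniref90/uniref90.fasta",
        f"{db}/mgnify/mgy_clusters_2018_12.fa",
        f"{db}/pdb70/pdb70",
        f"{db}/pdb_mmcif/mmcif_files/",
        f"{db}/uniclust30/uniclust30_2018_08/uniclust30_2018_08",
        "--output_dir", data,
        "--bfd_database_path",
        f"{db}/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt",
        "--model_device", "cuda:0",
        "--jackhmmer_binary_path", f"{conda_bin}/jackhmmer",
        "--hhblits_binary_path", f"{conda_bin}/hhblits",
        "--hhsearch_binary_path", f"{conda_bin}/hhsearch",
        "--kalign_binary_path", f"{conda_bin}/kalign",
        "--param_path", f"{db}/params/params_model_1.npz",
    ]


command = build_docker_command(
    PurePosixPath("/home/user/job"),            # hypothetical host directory
    PurePosixPath("/mnt/alphafold_database"),   # host path from the README
)
# Print a copy-pasteable shell line instead of executing anything.
print(" ".join(shlex.quote(arg) for arg in command))
```

Only the two `-v` arguments depend on the host; every path after the image name is resolved inside the container, which is why the README example can be reused by changing the mounts alone.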
@@ -16,15 +16,13 @@ dependencies:
   - tqdm==4.62.2
   - typing-extensions==3.10.0.2
   - pytorch_lightning==1.5.0
-  - nvidia-pyindex
-  - nvidia-dllogger
+  - git+https://github.com/NVIDIA/dllogger.git
   - pytorch::pytorch=1.10.*
   - conda-forge::python=3.7
   - conda-forge::setuptools=59.5.0
   - conda-forge::pip
   - conda-forge::openmm=7.5.1
   - conda-forge::pdbfixer
-  - bioconda::aria2
   - bioconda::hmmer==3.3.2
   - bioconda::hhsuite==3.3.0
   - bioconda::kalign2==2.04
@@ -101,11 +101,6 @@
 "\n",
 " PATH=%env PATH\n",
 " %env PATH=/opt/conda/bin:{PATH}\n",
-" %shell conda update -qy conda \\\n",
-" && conda install -qy -c conda-forge \\\n",
-" python=3.7 \\\n",
-" openmm=7.5.1 \\\n",
-" pdbfixer\n",
 " pbar.update(80)\n",
 "\n",
 " # Create a ramdisk to store a database chunk to make Jackhmmer run fast.\n",
@@ -148,7 +143,7 @@
 " %shell git clone {GIT_REPO} openfold\n",
 " pbar.update(8)\n",
 " # Install the required versions of all dependencies.\n",
-" %shell pip3 install -r ./openfold/requirements.txt\n",
+" %shell conda env update -n base --file openfold/environment.yml\n",
 " # Run setup.py to install only Openfold.\n",
 " %shell pip3 install --no-dependencies ./openfold\n",
 " pbar.update(10)\n",
@@ -171,13 +166,7 @@
 " pbar.update(55)\n",
 "except subprocess.CalledProcessError:\n",
 " print(captured)\n",
-" raise\n",
-"\n",
-"import jax\n",
-"if jax.local_devices()[0].platform == 'tpu':\n",
-" raise RuntimeError('Colab TPU runtime not supported. Change it to GPU via Runtime -> Change Runtime Type -> Hardware accelerator -> GPU.')\n",
-"elif jax.local_devices()[0].platform == 'cpu':\n",
-" raise RuntimeError('Colab CPU runtime not supported. Change it to GPU via Runtime -> Change Runtime Type -> Hardware accelerator -> GPU.')"
+" raise"
 ],
 "execution_count": null,
 "outputs": []
......
@@ -18,6 +18,7 @@
 import collections
 import functools
 from typing import Mapping, List, Tuple
+from importlib import resources
 import numpy as np
 import tree
@@ -452,9 +453,8 @@ def load_stereo_chemical_props() -> Tuple[
     residue_bond_angles: dict that maps resname --> list of BondAngle tuples
     """
     # TODO: this file should be downloaded in a setup script
-    stereo_chemical_props_path = "openfold/resources/stereo_chemical_props.txt"
-    with open(stereo_chemical_props_path, "rt") as f:
-        stereo_chemical_props = f.read()
+    stereo_chemical_props = resources.read_text("openfold.resources", "stereo_chemical_props.txt")
     lines_iter = iter(stereo_chemical_props.splitlines())
     # Load bond lengths.
     residue_bonds = {}
......
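The change to `load_stereo_chemical_props` replaces a path that only resolves when the working directory is the repository root with a lookup relative to the installed package, which is what lets the Docker image run the script from any directory. The sketch below illustrates the same idea on a throwaway package, under stated assumptions: the package name `demo_pkg_resources_example`, the file `props.txt`, and the helper `read_packaged_text` are hypothetical and exist only for this example. The commit itself calls `resources.read_text` directly; the `files()` branch is the newer spelling available on Python 3.9+.

```python
import importlib
import sys
import tempfile
from importlib import resources
from pathlib import Path


def read_packaged_text(package: str, filename: str) -> str:
    """Read a text file that ships inside an importable package."""
    if hasattr(resources, "files"):  # Python 3.9 and later
        return resources.files(package).joinpath(filename).read_text(encoding="utf-8")
    # Older interpreters: the call used in the commit.
    return resources.read_text(package, filename)


def make_demo_package(root: Path) -> str:
    """Create a tiny package with one data file and return its name."""
    name = "demo_pkg_resources_example"  # hypothetical package name
    pkg_dir = root / name
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("")
    (pkg_dir / "props.txt").write_text("Bond\tResidue\n")
    return name


# Build the package in a temporary directory and make it importable.
_tmp_root = tempfile.mkdtemp()
sys.path.insert(0, _tmp_root)
DEMO_PACKAGE = make_demo_package(Path(_tmp_root))
importlib.invalidate_caches()

# The lookup depends on where the package lives, not on the working directory.
DEMO_TEXT = read_packaged_text(DEMO_PACKAGE, "props.txt")
print(repr(DEMO_TEXT))
```

For this to work on an installed copy, the data file has to be shipped with the package, which is what the `include_package_data` and `package_data` entries added to `setup.py` in this commit are for.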
@@ -31,7 +31,7 @@ fi
 DOWNLOAD_DIR="$1"
 ROOT_DIR="${DOWNLOAD_DIR}/params"
-SOURCE_URL="https://storage.googleapis.com/alphafold/alphafold_params_2021-07-14.tar"
+SOURCE_URL="https://storage.googleapis.com/alphafold/alphafold_params_2022-01-19.tar"
 BASENAME=$(basename "${SOURCE_URL}")
 mkdir --parents "${ROOT_DIR}"
......
@@ -13,6 +13,7 @@ wget -P /tmp \
 # Grab conda-only packages
 export PATH=lib/conda/bin:$PATH
+lib/conda/bin/python3 -m pip install nvidia-pyindex
 conda env create --name=${ENV_NAME} -f environment.yml
 source activate ${ENV_NAME}
......
@@ -24,6 +24,8 @@ setup(
     license='Apache License, Version 2.0',
     url='https://github.com/aqlaboratory/openfold',
     packages=find_packages(exclude=["tests", "scripts"]),
+    include_package_data=True,
+    package_data={"": ["resources/stereo_chemical_props.txt"]},
     install_requires=[
         'torch',
         'deepspeed',
......
../../../../openfold/resources/stereo_chemical_props.txt
\ No newline at end of file