OpenDAS / OpenFold
Commit bd0c4d48, authored Oct 21, 2021 by Gustaf Ahdritz
Update documentation
Parent: 2d6ef1fb
Showing 1 changed file with 64 additions and 5 deletions
README.md (+64, -5)
@@ -12,12 +12,15 @@ DeepMind experiments. It is omitted here for the sake of reducing clutter. In
 cases where the Nature paper differs from the source, we always defer to the
 latter.
+OpenFold is built to support inference with AlphaFold's original JAX weights.
+Try it out with our [Colab notebook](https://colab.research.google.com/github/aqlaboratory/openfold/blob/main/notebooks/OpenFold.ipynb).
 Unlike DeepMind's public code, OpenFold is also trainable. It can be trained
 with or without [DeepSpeed](https://github.com/microsoft/deepspeed) and with
 mixed precision. bfloat16 training is not currently supported, but will be
-soon.
+in the future.
-## Installation
+## Installation (Linux)
 Python dependencies available through `pip` are provided in `requirements.txt`.
 OpenFold also depends on `openmm==7.5.1` and `pdbfixer`, which are only
@@ -45,6 +48,14 @@ source scripts/deactivate_conda_venv.sh
 ## Usage
+To download the genetic databases used by AlphaFold/OpenFold, run:
+```bash
+scripts/download_all_data.sh data/
+```
+This script depends on `aria2c`.
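Since the download script depends on `aria2c`, it may be worth confirming that the tool is on `PATH` before starting a long download. The following is a minimal sketch using only the Python standard library; the helper name `missing_tools` is hypothetical and is not part of OpenFold.

```python
import shutil


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# Hypothetical pre-flight check before running scripts/download_all_data.sh.
missing = missing_tools(["aria2c"])
print("missing tools:", ", ".join(missing) if missing else "none")
```

If `aria2c` is reported as missing, install it through the system package manager before running the download script.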
 To run inference on a sequence using a set of DeepMind's pretrained parameters,
 run e.g.
@@ -61,7 +72,7 @@ python3 run_pretrained_openfold.py \
 --device cuda:1
 ```
-where `data` is a directory populated by `scripts/download_all_data.sh`. Run
+where `data` is the same directory as in the previous step. Run
 ```bash
 python3 run_pretrained_openfold.py --help
@@ -69,6 +80,50 @@ python3 run_pretrained_openfold.py --help
 for a full list of options.
+To train the model, you will first need to precompute protein alignments. After
+installing OpenFold using `setup.py`, do so with:
+```bash
+python3 scripts/precompute_alignments.py mmcif_dir/ alignment_dir/ \
+    data/uniref90/uniref90.fasta \
+    data/mgnify/mgy_clusters_2018_12.fa \
+    data/pdb70/pdb70 \
+    data/pdb_mmcif/mmcif_files/ \
+    data/uniclust30/uniclust30_2018_08/uniclust30_2018_08 \
+    --bfd_database_path data/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt \
+    --cpus 16
+```
+Expect this step to take a very long time, even for small numbers of proteins.
+Next, generate a cache of certain datapoints in the mmCIF files as follows:
+```bash
+python3 scripts/generate_mmcif_cache.py mmcif_dir/ mmcif_cache.json --no_workers 16
+```
+This cache is used to minimize the number of mmCIF parses performed during
+training-time data preprocessing. Finally, call the training script:
+```bash
+python3 train_openfold.py mmcif_dir/ alignment_dir/ template_mmcif_dir/ \
+    2021-10-10 \
+    --template_release_dates_cache_path mmcif_cache.json \
+    --precision 16 \
+    --gpus 8 --replace_sampler_ddp=True \
+    --accelerator ddp \
+    --seed 42 \ # in multi-gpu settings, the seed must be specified
+    --deepspeed_config_path deepspeed_config.json
+```
+where `--template_release_dates_cache_path` is a path to the `.json` file
+generated in the previous step. A suitable DeepSpeed configuration file can be
+generated with `scripts/build_deepspeed_config.py`. The training script is
+written with [PyTorch Lightning](https://github.com/PyTorchLightning/pytorch-lightning)
+and supports the full range of training options that entails, including
+multi-node distributed training. For more information, consult PyTorch
+Lightning documentation and the `--help` flag of the training script.
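The training invocation above can also be assembled programmatically. The sketch below builds the same argument list in Python so that the note about the seed lives in a Python comment. In a POSIX shell, a `#` comment ends the command on that line, so a flag written on the following line would not be passed to the script. The function name `build_train_command` and its parameter names are hypothetical. The README does not name the positional date argument, so `cutoff_date` is only a placeholder.

```python
import shlex


def build_train_command(mmcif_dir, alignment_dir, template_mmcif_dir,
                        cutoff_date, cache_path, deepspeed_config,
                        gpus=8, seed=42):
    """Assemble the train_openfold.py invocation from the README as an argv list."""
    return [
        "python3", "train_openfold.py",
        mmcif_dir, alignment_dir, template_mmcif_dir,
        cutoff_date,  # placeholder name; the README gives only the value 2021-10-10
        "--template_release_dates_cache_path", cache_path,
        "--precision", "16",
        "--gpus", str(gpus),
        "--replace_sampler_ddp=True",
        "--accelerator", "ddp",
        # In multi-GPU settings, the seed must be specified.
        "--seed", str(seed),
        "--deepspeed_config_path", deepspeed_config,
    ]


cmd = build_train_command(
    "mmcif_dir/", "alignment_dir/", "template_mmcif_dir/",
    "2021-10-10", "mmcif_cache.json", "deepspeed_config.json",
)
# Print a shell-quoted form for inspection; pass `cmd` to subprocess.run to launch.
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would launch training without any shell parsing of the arguments.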
 ## Testing
 To run unit tests, use
@@ -87,7 +142,7 @@ scripts/run_unit_tests.sh -v tests.test_model
 Certain tests require that AlphaFold be installed in the same Python
 environment. These run components of AlphaFold and OpenFold side by side and
 ensure that output activations are adequately similar. For most modules, we
-target a maximum difference of 1e-4.
+target a maximum difference of `1e-4`.
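To illustrate what a maximum-difference target of `1e-4` means in practice, here is a minimal sketch that compares two flat sequences of activations. It is not the project's test code: the real tests compare model outputs, whereas this version uses plain Python floats, and the function names and the choice of `<=` rather than `<` are assumptions.

```python
def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def adequately_similar(a, b, tol=1e-4):
    """True when no pair of elements differs by more than `tol` (assumed inclusive)."""
    return max_abs_diff(a, b) <= tol


# Hypothetical activations from the two implementations.
reference = [0.5, -1.25, 3.0]
candidate = [0.50005, -1.25, 3.0]
print(adequately_similar(reference, candidate))
```

A single element that deviates by more than the tolerance is enough to fail the check, which is why the maximum is used rather than the mean.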
 ## Copyright notice
@@ -96,3 +151,7 @@ the permissive Apache Licence, Version 2.0, DeepMind's pretrained parameters
 remain under the more restrictive CC BY-NC 4.0 license, a copy of which is
 downloaded to `openfold/resources/params` by the installation script. They are
 thereby made unavailable for commercial use.
+## Contributing
+If you encounter problems using OpenFold, feel free to create an issue!