Unverified commit 930a58ad authored by oahzxl, committed by GitHub

Update readme and docker for v0.2.0 (#100)



* update readme and setup, not finished

* update readme and dockerfile
Co-authored-by: lc_pro <gyang_lu@foxmail.com>
parent 8a599895
@@ -18,7 +18,8 @@ FastFold provides a **high-performance implementation of Evoformer** with the following characteristics:
3. Ease of use
* Huge performance gains with only a few lines changed
* You don't need to worry about how the parallelism is implemented
-4. Faster data processing, about 3x times faster than the original way
+4. Faster data processing: about 3x faster than the original pipeline on monomers, and about 3Nx faster on multimers with N sequences
+5. Great reduction in GPU memory, enabling inference on sequences containing more than **10000** residues
## Installation
@@ -42,9 +43,24 @@ conda activate fastfold
python setup.py install
```
#### Advanced
To leverage the full power of FastFold, we recommend building [Triton](https://github.com/openai/triton) from source.
**[NVIDIA CUDA](https://developer.nvidia.com/cuda-downloads) 11.4 or above is needed.**
```bash
git clone https://github.com/openai/triton.git ~/triton
cd ~/triton/python
pip install -e .
```
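If the build succeeds, a quick sanity check (a minimal sketch; it assumes `nvcc` is on your `PATH`):

```bash
# confirm that Triton imports and report its version
python -c "import triton; print(triton.__version__)"
# confirm the CUDA toolkit version (11.4 or above is required)
nvcc --version
```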
### Using PyPI
You can install FastFold with pre-built CUDA extensions.
Note that only stable versions are available.
```shell
pip install fastfold -f https://release.colossalai.org/fastfold
```
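To verify that the prebuilt package loads, a minimal check (assuming the wheel installed into the active environment):

```shell
# print where the installed package lives; an ImportError means the install failed
python -c "import fastfold; print(fastfold.__file__)"
```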
@@ -147,7 +163,9 @@ python inference.py target.fasta data/pdb_mmcif/mmcif_files/ \
AlphaFold's embedding representations take up a lot of memory as the sequence length increases. To reduce memory usage,
add the parameters `--chunk_size [N]` and `--inplace` to the command line or to the shell script `./inference.sh`.
The smaller you set N, the less memory is used, at some cost in speed. We can run inference on
-a sequence of length 7000 in fp32 on a 80G A100.
+a sequence of length 10000 in bf16 with 61GB of memory on an NVIDIA A100 (80GB). For fp32, the maximum length is 8000.
+> You need to set `PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:15000` to run inference on such an extremely long sequence.
```shell
python inference.py target.fasta data/pdb_mmcif/mmcif_files/ \
--output_dir ./ \
    ...
```
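Putting the options above together, a sketch of a long-sequence run (the `--chunk_size` value of 1 is illustrative, and any other required arguments of `inference.py` are elided, as in the snippet above):

```shell
# let the allocator keep large blocks for extremely long sequences
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:15000
# a smaller --chunk_size lowers peak memory at some cost in speed;
# --inplace reuses buffers to reduce memory further
python inference.py target.fasta data/pdb_mmcif/mmcif_files/ \
    --output_dir ./ \
    --chunk_size 1 \
    --inplace
```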
-FROM hpcaitech/colossalai:0.1.8
+FROM hpcaitech/pytorch-cuda:1.12.0-11.3.0
RUN conda install openmm=7.7.0 pdbfixer -c conda-forge -y \
&& conda install hmmer==3.3.2 hhsuite=3.3.0 kalign2=2.04 -c bioconda -y
@@ -6,7 +6,12 @@ RUN conda install openmm=7.7.0 pdbfixer -c conda-forge -y \
RUN pip install biopython==1.79 dm-tree==0.1.6 ml-collections==0.1.0 \
scipy==1.7.1 ray pyarrow pandas einops
# prepare environment
-Run git clone https://github.com/hpcaitech/FastFold.git\
+RUN pip install colossalai==0.1.10+torch1.12cu11.3 -f https://release.colossalai.org
+RUN git clone https://github.com/openai/triton.git ~/triton \
+    && cd ~/triton/python \
+    && pip install -e .
+RUN git clone https://github.com/hpcaitech/FastFold.git \
    && cd ./FastFold \
    && python setup.py install
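With this Dockerfile, a typical build-and-run sequence looks roughly like the following (the image tag is illustrative; GPU access requires the NVIDIA Container Toolkit):

```shell
# build the image from the directory containing the Dockerfile
docker build -t fastfold:0.2.0 .
# start an interactive shell with all GPUs visible to the container
docker run --gpus all -it fastfold:0.2.0 bash
```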
@@ -129,7 +129,7 @@ else:
setup(
name='fastfold',
-version='0.1.0',
+version='0.2.0',
packages=find_packages(exclude=(
'assets',
'benchmark',
@@ -140,5 +140,5 @@ setup(
ext_modules=ext_modules,
package_data={'fastfold': ['model/fastnn/kernel/cuda_native/csrc/*']},
cmdclass={'build_ext': BuildExtension} if ext_modules else {},
-install_requires=['einops'],
+install_requires=['einops', 'colossalai'],
)
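After installing, the version bump and the new `colossalai` dependency can be checked as follows (a sketch; it assumes `pip` and `python` point at the same environment and that `colossalai` exposes `__version__`):

```shell
pip show fastfold   # should report Version: 0.2.0
python -c "import colossalai; print(colossalai.__version__)"
```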