Commit 67decff0 authored by chenych

Merge branch 'icon' into 'master'

Add icon and SCNET

See merge request !1
parents f9fd6dcb c5587ebf
# Uni-Fold
## Paper
`Uni-Fold: An Open-Source Platform for Developing Protein Folding Models beyond AlphaFold`
- https://www.biorxiv.org/content/biorxiv/early/2022/08/06/2022.08.04.502811.full.pdf
## Model Architecture
The core of the model is a Transformer-based neural network with two main components, a Sequence to Sequence Model and a Structure Model. The two components are optimized through iterative training to improve prediction accuracy.
@@ -15,7 +15,6 @@ https://www.biorxiv.org/content/biorxiv/early/2022/08/06/2022.08.04.502811.full.
![img](./alphafold2_1.png)
## Environment Setup
A Docker image for training can be pulled from [SourceFind (光源)](https://www.sourcefind.cn/#/service-details):
```
@@ -24,7 +23,9 @@ docker run -it -v /path/your_code_data/:/path/your_code_data/ --shm-size=32G --p
cd /root/Uni-Fold-main
```
Install the tools listed in requirement.txt. They are already installed in the image and can be loaded as follows:
```
export PATH=/root/software/hmmer/bin${PATH:+:${PATH}}
@@ -34,55 +35,57 @@ export PATH=/root/software/kalign/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/root/software/hh-suite-master/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
```
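The `${PATH:+:${PATH}}` form in the export lines above appends a colon and the old value only when the variable is already non-empty, which avoids a stray trailing `:`. The sketch below illustrates that expansion and a simple availability check. Both `prepend_path` and `missing_tools` are hypothetical helper names introduced here for illustration; they are not part of the repository.

```shell
# Hypothetical helper: build "DIR[:CURRENT]" the same way the export lines do.
prepend_path() {
    dir=$1
    current=$2
    # ${current:+:${current}} expands to ":<current>" only if current is non-empty.
    printf '%s\n' "${dir}${current:+:${current}}"
}

# Hypothetical helper: print every named command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

For example, `missing_tools jackhmmer hhblits hhsearch kalign` prints nothing when all four are reachable. Those binary names are an assumption based on the usual HMMER, HH-suite, and Kalign distributions and may differ in the image.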
## Datasets
We recommend the open-source datasets used by [AlphaFold2](http://113.200.138.88:18080/aidatasets/project-dependency/alphafold), including BFD, MGnify, PDB70, Uniclust, and Uniref90. The datasets total about 2.62 TB. The directory layout is as follows:
```
$DOWNLOAD_DIR/
    bfd/
        bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_hhm.ffindex
        bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_hhm.ffdata
        bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt_cs219.ffindex
        ...
    mgnify/
        mgy_clusters_2022_05.fa
    params/
        params_model_1.npz
        params_model_2.npz
        params_model_3.npz
        ...
    pdb70/
        pdb_filter.dat
        pdb70_hhm.ffindex
        pdb70_hhm.ffdata
        ...
    pdb_mmcif/
        mmcif_files/
            100d.cif
            101d.cif
            101m.cif
            ...
        obsolete.dat
    pdb_seqres/
        pdb_seqres.txt
    small_bfd/
        bfd-first_non_consensus_sequences.fasta
    uniref30/
        UniRef30_2021_03_hhm.ffindex
        UniRef30_2021_03_hhm.ffdata
        UniRef30_2021_03_cs219.ffindex
        ...
    uniprot/
        uniprot.fasta
    uniref90/
        uniref90.fasta
```
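A quick way to confirm that a download directory matches the layout above is to check for the expected subdirectories. The following is a minimal sketch, not a script shipped with the repository: `missing_dataset_dirs` is a hypothetical helper name, and the directory list is copied from the tree shown above.

```shell
# Hypothetical helper: print each expected subdirectory missing under ROOT.
# The list mirrors the directory tree shown above.
missing_dataset_dirs() {
    root=$1
    for sub in bfd mgnify params pdb70 pdb_mmcif/mmcif_files \
               pdb_seqres small_bfd uniref30 uniprot uniref90; do
        [ -d "$root/$sub" ] || printf '%s\n' "$sub"
    done
}
```

Calling `missing_dataset_dirs "$DOWNLOAD_DIR"` prints nothing when every directory is present. It checks directory names only and does not validate the files inside them.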
The script download_all_data.sh is provided to download the datasets and model files:
```
bash scripts/download/download_all_data.sh /path/to/database/directory
```
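Because the datasets are large, it can help to check free disk space before starting the download. The sketch below is one possible pre-flight check under stated assumptions: `has_free_space` is a hypothetical helper, and it relies on the POSIX output format of `df -Pk`, where the fourth field of the second line is the available space in 1024-byte blocks.

```shell
# Hypothetical helper: succeed if DIR has at least REQUIRED_KB KiB available.
has_free_space() {
    dir=$1
    required_kb=$2
    # df -P gives one portable line per filesystem; -k reports 1024-byte blocks.
    available_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$available_kb" -ge "$required_kb" ]
}
```

As a usage example, `has_free_space /path/to/database/directory 2558593750 && bash scripts/download/download_all_data.sh /path/to/database/directory` runs the download only when roughly 2.62 TB is free. The threshold assumes decimal terabytes and treats the stated dataset size as a lower bound; compressed archives may need additional space during extraction.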
## Inference
### Installation
#### Install Uni-Core-main (not needed when using the image)
```
@@ -92,35 +95,38 @@ export CUDA_HOME=/opt/dtk-22.04.2
python3 setup.py install
```
#### Install Uni-Fold-main (not needed when using the image)
```
pip install -e .
```
### Multi-GPU Testing
#### Multimer reference script (adjust the path settings to your environment)
```
sh run_multimer.sh
```
#### Monomer reference script (adjust the path settings to your environment)
```
sh run_monomer.sh
```
## Result
![img](./result_pdb.png)
### Accuracy
## Application Scenarios
### Algorithm Category
Protein structure prediction
### Key Application Industries
Healthcare, scientific research, education
## Source Repository and Issue Feedback
- https://developer.hpccube.com/codes/modelzoo/uni-fold
## References
- https://github.com/dptech-corp/Uni-Fold
icon.png

47 KB
