Commit 2590be89 authored by zhuwenwen

use precomputed msas and features.pkl

parent 5ecff046
@@ -2,7 +2,7 @@
 * @Author: zhuww
 * @email: zhuww@sugon.com
 * @Date: 2023-04-06 18:04:07
-* @LastEditTime: 2023-11-15 17:30:01
+* @LastEditTime: 2023-11-23 16:01:01
 -->
 # AF2
 ## Paper
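@@ -96,7 +96,7 @@ $DOWNLOAD_DIR/
 ```bash
 ./run_monomer.sh
 ```
-Monomer inference parameters: download_dir is the dataset download directory, and monomer.fasta is the monomer sequence to run inference on; `--output_dir` is the output directory; `model_names` is the name of the model used for inference; `--model_preset=monomer` selects the monomer model configuration; `--run_relax=true` enables the relax step; `--use_gpu_relax=true` runs relax on the GPU (faster, but possibly less stable), and `--use_gpu_relax=false` runs relax on the CPU (slower, but stable); adding use_precomputed_msas=true loads sequences that have already been searched and aligned, otherwise the search and alignment run by default
+Monomer inference parameters: download_dir is the dataset download directory, and monomer.fasta is the monomer sequence to run inference on; `--output_dir` is the output directory; `model_names` is the name of the model used for inference; `--model_preset=monomer` selects the monomer model configuration; `--run_relax=true` enables the relax step; `--use_gpu_relax=true` runs relax on the GPU (faster, but possibly less stable), and `--use_gpu_relax=false` runs relax on the CPU (slower, but stable).
 ### Multimer
 ```bash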
@@ -129,17 +129,14 @@ $DOWNLOAD_DIR/
 Test data: [casp14](https://www.predictioncenter.org/casp14/targetlist.cgi) [uniprot](https://www.uniprot.org/)
 Accelerator card used: 1 × Z100L-32G
-1. lddt
-`plddts` in <target_name>/ranking_debug.json
-2. Other accuracy metrics: [https://zhanggroup.org/TM-score/](https://zhanggroup.org/TM-score/)
+plddts: see `plddts` in <target_name>/ranking_debug.json
 Accuracy data:
-| Data type | Sequence type | Sequence label | Sequence length | GDT-TS | GDT-HA | LDDT | TM score | MaxSub | RMSD |
-| :------: | :------: | :------: | :------: |:------: |:------: | :------: | :------: | :------: |:------: |
-| fp32 | Monomer | T1026 | 172 | 0.849 | 0.658 | 75.050 | 0.901 | 0.851 | 1.6 |
-| fp32 | Monomer | T1053 | 580 | 0.941 | 0.789 | 92.316 | 0.985 | 0.935 | 1.1 |
-| fp32 | Monomer | T1091 | 863 | 0.492 | 0.332 | 85.083 | 0.740 | 0.388 | 6.7 |
+| Data type | Sequence type | Sequence label | Sequence length | LDDT |
+| :------: | :------: | :------: | :------: |:------: |
+| fp32 | Monomer | T1026 | 172 | 75.050 |
+| fp32 | Monomer | T1053 | 580 | 92.316 |
+| fp32 | Monomer | T1091 | 863 | 85.083 |
 ## Application scenarios
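The README hunk above points readers to the `plddts` entry of `<target_name>/ranking_debug.json` for the reported LDDT values. A minimal sketch of reading that entry follows. It assumes the file is a JSON object whose `plddts` key maps model names to scores; the helper names `read_plddts` and `best_model` are hypothetical and not part of this repository.

```python
import json
from pathlib import Path


def read_plddts(target_dir):
    """Return {model_name: score} from <target_dir>/ranking_debug.json.

    Assumes the JSON object has a "plddts" key mapping model names to scores.
    """
    ranking_path = Path(target_dir) / "ranking_debug.json"
    with open(ranking_path, "r", encoding="utf-8") as f:
        ranking = json.load(f)
    return {name: float(score) for name, score in ranking["plddts"].items()}


def best_model(plddts):
    """Return the (model_name, score) pair with the highest score."""
    return max(plddts.items(), key=lambda item: item[1])
```

With `--output_dir=./` and `monomer.fasta`, as in the run script, the target directory would presumably be `./monomer`, so `best_model(read_plddts("./monomer"))` would return the top-scoring model for that run.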
......
@@ -194,13 +194,18 @@ def predict_structure(
   # Get features.
   t_0 = time.time()
-  feature_dict = data_pipeline.process(
-      input_fasta_path=fasta_path,
-      msa_output_dir=msa_output_dir)
+  features_output_path = os.path.join(output_dir, 'features.pkl')
+  if os.path.exists(features_output_path):
+    feature_dict = pickle.load(open(features_output_path, 'rb'))
+  else:
+    feature_dict = data_pipeline.process(
+        input_fasta_path=fasta_path,
+        msa_output_dir=msa_output_dir)
   timings['features'] = time.time() - t_0
   # Write out features as a pickled dictionary.
-  features_output_path = os.path.join(output_dir, 'features.pkl')
+  # features_output_path = os.path.join(output_dir, 'features.pkl')
   with open(features_output_path, 'wb') as f:
     pickle.dump(feature_dict, f, protocol=4)
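Review note: the hunk above reuses `features.pkl` when it already exists, but it opens the file without closing it and rewrites the pickle even when the features were just loaded from it. The sketch below shows the same load-or-compute pattern with both points addressed. It is not the commit's code: `load_or_compute_features` is a hypothetical helper, and `compute_fn` is an assumed zero-argument callable standing in for the `data_pipeline.process(...)` call.

```python
import os
import pickle


def load_or_compute_features(features_path, compute_fn):
    """Return (features, was_cached); compute and save only on a cache miss."""
    if os.path.exists(features_path):
        # The context manager closes the file even if unpickling fails.
        with open(features_path, "rb") as f:
            return pickle.load(f), True

    features = compute_fn()
    # Write to a temporary file first, then rename, so an interrupted run
    # cannot leave a truncated features.pkl for a later run to load.
    tmp_path = features_path + ".tmp"
    with open(tmp_path, "wb") as f:
        pickle.dump(features, f, protocol=4)
    os.replace(tmp_path, features_path)
    return features, False
```

Two caveats apply to any cache of this kind. An existing `features.pkl` is reused even if the input FASTA has changed since it was written, and `pickle.load` can execute arbitrary code, so the output directory should only contain files from trusted runs.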
......
@@ -2,7 +2,6 @@
 python3 run_alphafold.py \
 --fasta_paths=monomer.fasta \
 --output_dir=./ \
---use_precomputed_msas=false \
 --data_dir=$download_dir \
 --model_names="model_1" \
 --uniref90_database_path=$download_dir/uniref90/uniref90.fasta \
......
@@ -3,7 +3,6 @@ python3 run_alphafold.py \
 --fasta_paths=multimer.fasta \
 --output_dir=./ \
 --num_multimer_predictions_per_model=1 \
---use_precomputed_msas=false \
 --data_dir=$download_dir \
 --model_names="model_1_multimer_v3" \
 --uniref90_database_path=$download_dir/uniref90/uniref90.fasta \
......