# Higashi

## Paper

https://doi.org/10.1038/s41587-021-01034-y

## Model Architecture


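Higashi uses a hypergraph neural network to uncover high-order interaction patterns in a hypergraph constructed from scHi-C data. It produces embeddings of scHi-C data for downstream analysis, and it imputes single-cell Hi-C contact maps, enabling detailed characterization of 3D genome features, such as TAD-like domain boundaries and A/B compartment scores, at single-cell resolution.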

![Alt text](./image/image.png)


## Algorithm

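The key algorithmic design of Higashi is to transform scHi-C data into a hypergraph. This transformation preserves the single-cell resolution and the 3D genome features of the scHi-C contact maps. Specifically, embedding the scHi-C data becomes equivalent to learning node embeddings of the hypergraph, and imputing the scHi-C contact maps becomes predicting missing hyperedges in the hypergraph. Higashi builds on the Hyper-SAGNN architecture (reference 22 in the paper), a generic hypergraph representation learning framework, with substantial new development specifically for scHi-C analysis.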

![Alt text](./image/image-1.png)
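
The hypergraph transformation described in this section can be sketched in a few lines. This is only an illustration under stated assumptions, not Higashi's actual implementation: the function names, the `(cell_id, bin_i, bin_j, count)` record layout, and the toy data are all hypothetical. Each non-zero contact becomes a hyperedge joining one cell node and two genomic-bin nodes, and imputation corresponds to scoring the triplets that are absent from that set.

```python
from collections import defaultdict


def contacts_to_hyperedges(contacts):
    """Turn (cell_id, bin_i, bin_j, count) records into weighted hyperedges.

    Each hyperedge joins one cell node and two genomic-bin nodes.
    The record layout is an assumption made for this sketch.
    """
    hyperedges = defaultdict(int)
    for cell_id, bin_i, bin_j, count in contacts:
        if count <= 0:
            continue  # only non-zero contacts become hyperedges
        lo, hi = sorted((bin_i, bin_j))  # contact maps are symmetric
        hyperedges[(("cell", cell_id), ("bin", lo), ("bin", hi))] += count
    return dict(hyperedges)


def candidate_missing_hyperedges(hyperedges, cell_id, bins):
    """List the (cell, bin, bin) triplets not observed for one cell.

    A trained model would score these candidates; here they are only enumerated.
    """
    bins = sorted(bins)
    missing = []
    for a in range(len(bins)):
        for b in range(a + 1, len(bins)):
            edge = (("cell", cell_id), ("bin", bins[a]), ("bin", bins[b]))
            if edge not in hyperedges:
                missing.append(edge)
    return missing


# Hypothetical toy data: two cells, three bins on one chromosome.
toy_contacts = [
    (0, 0, 1, 2),
    (0, 1, 0, 1),  # same bin pair as above, given in reversed order
    (0, 1, 2, 3),
    (1, 0, 2, 1),
]
edges = contacts_to_hyperedges(toy_contacts)
missing = candidate_missing_hyperedges(edges, cell_id=0, bins=[0, 1, 2])
print(len(edges), missing)
```

Keying each hyperedge by a sorted bin pair keeps the two symmetric entries of a contact map from being counted as separate hyperedges.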


## Environment Setup
### Docker (Option 1)
Running in Docker is recommended. Pull the provided image and start a container:

```
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dtk24.04.1-py3.10
docker run -dit --shm-size 80g --network=host --name=higashi --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -u root -v /opt/hyhal/:/opt/hyhal/:ro image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dtk24.04.1-py3.10 /bin/bash
docker exec -it higashi /bin/bash
```

Install the dependencies that are not included in the Docker image:

```
pip install -r requirements.txt  -i http://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com
python setup.py install
```

### Dockerfile (Option 2)

```
docker build -t higashi:latest .
docker run -dit --shm-size 80g --network=host --name=higashi --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -u root -v /opt/hyhal/:/opt/hyhal/:ro higashi:latest /bin/bash
docker exec -it higashi  /bin/bash

```

Install the dependencies that are not included in the Docker image:

```
pip install -r requirements.txt  -i http://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com
python setup.py install
```

### Conda (Option 3)

1. Create a conda virtual environment:

```
conda create -n higashi python=3.10
conda activate higashi 
```

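2. The toolkits and deep learning libraries that this project needs for DCU accelerator cards can be downloaded from the Guanghe Developer Community (光合开发者社区):
- [DTK 24.04.1](https://cancon.hpccube.com:65024/directlink/1/DTK-24.04.1/Ubuntu20.04.1/DTK-24.04.1-Ubuntu20.04.1-x86_64.tar.gz)
- [PyTorch 2.1](https://cancon.hpccube.com:65024/directlink/4/pytorch/DAS1.2/torch-2.1.0+das.opt1.dtk24042-cp310-cp310-manylinux_2_28_x86_64.whl)


Tips: The versions of the DTK driver, torch, and the other tools above must match one another exactly.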
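
One part of that version matching can be checked mechanically: the Python tag in the wheel file name has to agree with the interpreter in the conda environment. The sketch below splits the wheel name from the link above using the standard wheel naming convention. The helper names are hypothetical, it assumes the name carries no optional build tag, and it does not verify the DTK side of the match.

```python
import sys

# File name taken from the PyTorch download link above.
WHEEL_NAME = "torch-2.1.0+das.opt1.dtk24042-cp310-cp310-manylinux_2_28_x86_64.whl"


def parse_wheel_name(filename):
    """Split a wheel file name into its dash-separated fields.

    Assumes the five-field form name-version-python-abi-platform
    (no optional build tag).
    """
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name")
    fields = filename[: -len(".whl")].split("-")
    if len(fields) != 5:
        raise ValueError("expected name-version-python-abi-platform")
    name, version, python_tag, abi_tag, platform_tag = fields
    public_version, _, local_label = version.partition("+")
    return {
        "name": name,
        "version": public_version,
        "local": local_label,  # vendor-specific local version label
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


def matches_interpreter(parsed):
    """True if the wheel's CPython tag equals the running interpreter's."""
    current = f"cp{sys.version_info.major}{sys.version_info.minor}"
    return parsed["python_tag"] == current


parsed = parse_wheel_name(WHEEL_NAME)
print(parsed["version"], parsed["python_tag"], matches_interpreter(parsed))
```

Running this inside the `higashi` environment created in step 1 should report a match, since that environment uses Python 3.10 and the wheel is tagged `cp310`.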


3. Install the remaining dependencies listed in requirements.txt:
```
python setup.py install
pip install -r requirements.txt -i http://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com
```
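
After any of the three setup options, a quick sanity check is to confirm that PyTorch imports and can see the accelerator. The sketch below is a hedged example: `describe_accelerator` is a hypothetical helper, and it assumes the DTK build of PyTorch exposes DCU cards through the usual `torch.cuda` API. If that assumption does not hold for your build, the device list will simply be empty.

```python
import importlib.util


def describe_accelerator():
    """Report whether PyTorch is installed and which devices it can see."""
    info = {"torch_installed": False, "version": None, "devices": []}
    if importlib.util.find_spec("torch") is None:
        return info  # PyTorch is not installed in this environment

    import torch

    info["torch_installed"] = True
    info["version"] = torch.__version__
    # Assumption: the DTK build reports DCU cards via the torch.cuda API.
    if torch.cuda.is_available():
        info["devices"] = [
            torch.cuda.get_device_name(i)
            for i in range(torch.cuda.device_count())
        ]
    return info


print(describe_accelerator())
```

An empty `devices` list inside the container usually points at the device mappings (`--device=/dev/kfd --device=/dev/dri`) or at a DTK and torch version mismatch.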

## Dataset

Install git-lfs, create the working directories, and download the hg19 reference files and the ramani dataset:

```
apt-get update 
apt-get install git-lfs
mkdir -p /work/magroup/ruochiz/Data/scHiC_collection/ramani
mkdir -p /work/magroup/ruochiz/Higashi/Temp/ramani
wget -P  /work/magroup/ruochiz/Higashi/ https://mirror.ghproxy.com/https://raw.githubusercontent.com/hanfang/Topsorter/refs/heads/master/data/hg19.chrom.sizes.txt
wget -P  /work/magroup/ruochiz/Higashi/   https://hgdownload.cse.ucsc.edu/goldenpath/hg19/database/cytoBand.txt.gz
git lfs  clone http://113.200.138.88:18080/aidatasets/project-dependency/ramani-et-ai.git
cp -r ramani-et-ai/* /work/magroup/ruochiz/Data/scHiC_collection/ramani/ 
```
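
Because the commands above write to fixed absolute paths, it can help to confirm that everything landed where expected before training. The following sketch checks for the two downloaded hg19 files, the temp directory, and a non-empty data directory. `missing_inputs` is a hypothetical helper, and whether `train.py` needs exactly these paths is an assumption drawn from the commands above rather than from the training code.

```python
from pathlib import Path


def missing_inputs(higashi_root, data_dir):
    """Return the expected files and directories that are absent or empty."""
    root = Path(higashi_root)
    data = Path(data_dir)
    expected = [
        root / "hg19.chrom.sizes.txt",  # saved by the first wget
        root / "cytoBand.txt.gz",       # saved by the second wget
        root / "Temp" / "ramani",       # created by mkdir -p
        data,                           # target of the cp -r step
    ]
    missing = [str(path) for path in expected if not path.exists()]
    # An existing but empty data directory means the copy step did nothing.
    if data.is_dir() and not any(data.iterdir()):
        missing.append(f"{data} (empty)")
    return missing


problems = missing_inputs(
    "/work/magroup/ruochiz/Higashi",
    "/work/magroup/ruochiz/Data/scHiC_collection/ramani",
)
print(problems if problems else "all expected inputs are present")
```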


## Training

Combine the test data with the Higashi model to build a demo that supports hypergraph analysis and contact-map embedding:

```
python train.py 
```

## Inference

## Results

### Accuracy

bce:  0.5046, mse:  0.7233,  acc: 86.692 %, pearson: 0.590, spearman: 0.514, elapse: 27.894 s 
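
To compare runs, a line like the one above can be parsed into numbers. This is a small sketch that assumes the training script prints its metrics in exactly this `name: value` form. `parse_metrics` is a hypothetical helper, and the units (`%`, `s`) are dropped during parsing.

```python
import re

# The result line reported above.
LOG_LINE = (
    "bce:  0.5046, mse:  0.7233,  acc: 86.692 %, "
    "pearson: 0.590, spearman: 0.514, elapse: 27.894 s"
)

# A metric name, a colon, optional spaces, then a decimal number.
METRIC_PATTERN = re.compile(r"(\w+):\s*([-+]?\d+(?:\.\d+)?)")


def parse_metrics(line):
    """Extract `name: number` pairs from one log line into a dict of floats."""
    return {name: float(value) for name, value in METRIC_PATTERN.findall(line)}


metrics = parse_metrics(LOG_LINE)
print(metrics)
```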

## Application Scenarios
### Algorithm Category

AI for Science

### Popular Application Domains

Scientific research, single-cell prediction, gene prediction

## Source Repository and Issue Feedback

http://developer.sourcefind.cn/codes/modelzoo/higashi.git

## References

https://github.com/ma-compbio/Higashi/