# CLIP

## Paper

`Learning Transferable Visual Models From Natural Language Supervision`

* https://arxiv.org/abs/2103.00020

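## Model Architecture

The CLIP model has two main components: a text encoder and an image encoder. The text encoder is a `Transformer`; the image encoder is either a `ResNet` or a `Vision Transformer (ViT)`.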

![alt text](readme_imgs/model.png)

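## Algorithm

CLIP trains an image encoder and a text encoder jointly by maximizing the matching score of correct `text-image` pairs in a batch while minimizing it for incorrect pairings.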

![alt text](readme_imgs/alg.png)
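The objective in this section can be sketched as the symmetric cross-entropy loss described in the CLIP paper. The following is a minimal, dependency-free illustration in plain Python, not this repository's training code. The function names (`normalize`, `cross_entropy`, `clip_loss`) and the fixed `logit_scale` are assumptions made for the example; in the paper the temperature is a learned parameter.

```python
import math


def normalize(vec):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cross_entropy(logits, target):
    """Cross-entropy of one row of logits against the index of the true class."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[target]


def clip_loss(image_embs, text_embs, logit_scale=100.0):
    """Symmetric contrastive loss over a batch of matched (image, text) pairs.

    image_embs[i] and text_embs[i] are assumed to describe the same pair.
    logit_scale is fixed here for illustration only.
    """
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    n = len(imgs)
    # logits[i][j] = scaled cosine similarity between image i and text j
    logits = [
        [logit_scale * sum(a * b for a, b in zip(img, txt)) for txt in txts]
        for img in imgs
    ]
    # Image-to-text direction: each row should select its own column.
    loss_i = sum(cross_entropy(logits[i], i) for i in range(n)) / n
    # Text-to-image direction: each column should select its own row.
    columns = [list(col) for col in zip(*logits)]
    loss_t = sum(cross_entropy(columns[j], j) for j in range(n)) / n
    return (loss_i + loss_t) / 2
```

With correctly paired embeddings the loss approaches zero; swapping the texts so that every pair is wrong makes it large, which is the signal that pulls matching pairs together during training.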

## Environment Setup

### Docker (Option 1)
    
    docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-ubuntu20.04-dtk24.04.1-py3.10

    docker run --shm-size 50g --network=host --name=clip --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v <absolute path to project>:/home/ -v /opt/hyhal:/opt/hyhal:ro -it <your IMAGE ID> bash

    python setup.py install

### Dockerfile (Option 2)

    docker build -t <IMAGE_NAME>:<TAG> .

    docker run --shm-size 50g --network=host --name=clip --privileged --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -v <absolute path to project>:/home/ -v /opt/hyhal:/opt/hyhal:ro -it <your IMAGE ID> bash

    python setup.py install

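### Anaconda (Option 3)

1. The DCU-specific deep learning libraries this project requires can be downloaded and installed from the Guanghe (光合) developer community: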
https://developer.sourcefind.cn/tool/

    DTK driver: dtk24.04.1
    python: python3.10
    torch: 2.1.0
    torchvision: 0.16.0

Tips: the versions of the DTK driver, python, torch, and the other DCU-related tools listed above must match one another exactly.

2. Install the remaining general-purpose libraries according to requirements.txt:

    python setup.py install
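Because the versions listed in this section must match exactly, a quick check before running anything can help catch a mismatched environment early. The following is a minimal sketch and not part of the repository. The `EXPECTED` table and the helpers `find_mismatches` and `installed_versions` are illustrative names, and accepting a local build suffix after `+` is an assumption about how the DCU builds might be versioned.

```python
import sys
from importlib import metadata

# Versions listed in this README.
EXPECTED = {"torch": "2.1.0", "torchvision": "0.16.0"}


def find_mismatches(installed, expected=EXPECTED):
    """Return {package: (found, wanted)} for every package that does not match.

    `installed` maps package names to version strings (None if not installed).
    A local build suffix such as "2.1.0+<tag>" is accepted (assumption).
    """
    problems = {}
    for name, wanted in expected.items():
        found = installed.get(name)
        ok = found is not None and (found == wanted or found.startswith(wanted + "+"))
        if not ok:
            problems[name] = (found, wanted)
    return problems


def installed_versions(names):
    """Look up installed package versions via importlib.metadata."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    if sys.version_info[:2] != (3, 10):
        print(f"python: found {sys.version_info.major}.{sys.version_info.minor}, wanted 3.10")
    for pkg, (found, wanted) in find_mismatches(installed_versions(EXPECTED)).items():
        print(f"{pkg}: found {found}, wanted {wanted}")
```

Running the script prints nothing when the interpreter and both packages match the table, and one line per mismatch otherwise.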


## Dataset


## Training


## Inference

    python tests/simple_test.py --pt <model_name or model_name.pt>

    python tests/zero_shot_prediction.py --pt <model_name or model_name.pt>

    python tests/linear_probe.py --pt <model_name or model_name.pt>

Note: passing `model_name.pt` loads the already-downloaded model from the `pretrained_models` folder, while passing a model name `model_name` downloads the model automatically.
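The rule for the `--pt` argument can be sketched as follows. This is a hypothetical illustration of the behaviour described here, not the code in `tests/`; the helper name `resolve_pt` and its return format are assumptions.

```python
from pathlib import Path


def resolve_pt(pt, models_dir="pretrained_models"):
    """Interpret a --pt value (hypothetical helper).

    A value ending in ".pt" refers to a file inside models_dir;
    any other value is treated as a model name to download.
    """
    if pt.endswith(".pt"):
        return "local", Path(models_dir) / pt
    return "download", pt
```

For example, `resolve_pt("ViT-B-32.pt")` points at `pretrained_models/ViT-B-32.pt`, while `resolve_pt("RN50")` signals that the model should be fetched by name.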

## Results

`python tests/zero_shot_prediction.py --pt ViT-B-32.pt`

![alt text](readme_imgs/r.png)
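Zero-shot prediction compares one image embedding against the text embeddings of the candidate class prompts and takes a softmax over the scaled cosine similarities. The sketch below shows only that scoring step on plain Python lists. It is not the content of `tests/zero_shot_prediction.py`; the embeddings are made-up toy vectors, and `zero_shot_probs` and the scale of 100 are illustrative assumptions.

```python
import math


def zero_shot_probs(image_emb, class_text_embs, scale=100.0):
    """Softmax over scaled cosine similarities between one image and each class prompt."""

    def unit(v):
        n = math.sqrt(sum(x * x for x in v))
        return [x / n for x in v]

    img = unit(image_emb)
    sims = [scale * sum(a * b for a, b in zip(img, unit(t))) for t in class_text_embs]
    m = max(sims)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example: the image embedding points mostly along the first class direction.
labels = ["cat", "dog", "car"]
probs = zero_shot_probs([0.9, 0.1, 0.0], [[1, 0, 0], [0, 1, 0], [0, 0, 1]])
best = labels[probs.index(max(probs))]
print(best)
```

In a real run the vectors would come from the image and text encoders, with one text embedding per class prompt.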

### Accuracy


## Application Scenarios

### Algorithm Category

`Image classification`

### Key Application Industries

`E-commerce, painting, transportation`

## Pretrained Weights

After downloading a model, place it in the `pretrained_models` folder (you need to create this folder yourself).

Original links

[RN50](https://openaipublic.azureedge.net/clip/models/afeb0e10f9e5a86da6080e35cf09123aca3b358a0c3e3b6c78a7b63bc04b6762/RN50.pt)
/ [RN101](https://openaipublic.azureedge.net/clip/models/afeb0e10f9e5a86da6080e35cf09123aca3b358a0c3e3b6c78a7b63bc04b6762/RN101.pt)
/ [RN50x4](https://openaipublic.azureedge.net/clip/models/7e526bd135e493cef0776de27d5f42653e6b4c8bf9e0f653bb11773263205fdd/RN50x4.pt)
/ [RN50x16](https://openaipublic.azureedge.net/clip/models/52378b407f34354e150460fe41077663dd5b39c54cd0bfd2b27167a4a06ec9aa/RN50x16.pt)
/ [RN50x64](https://openaipublic.azureedge.net/clip/models/be1cfb55d75a9666199fb2206c106743da0f6468c9d327f3e0d0a543a9919d9c/RN50x64.pt)
/ [ViT-B/32](https://openaipublic.azureedge.net/clip/models/40d365715913c9da98579312b702a82c18be219cc2a73407c4526f58eba950af/ViT-B-32.pt)
/ [ViT-B/16](https://openaipublic.azureedge.net/clip/models/5806e77cd80f8b59890b7e101eabd078d9fb84e6937f9e85e4ecb61988df416f/ViT-B-16.pt)
/ [ViT-L/14](https://openaipublic.azureedge.net/clip/models/b8cca3fd41ae0c99ba7e8951adf17d267cdb84cd88be6f7c2e0eca1737a03836/ViT-L-14.pt)
/ [ViT-L/14@336px](https://openaipublic.azureedge.net/clip/models/3035c92b350959924f9f00213499208652fc7ea050643e8b385c2dac08641f02/ViT-L-14-336px.pt)
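In these links, the path segment before the file name is a 64-character hex string; the upstream openai/CLIP loader treats it as the expected SHA256 checksum of the file. Assuming that convention, a manually downloaded checkpoint can be verified as in the sketch below. The helper names are illustrative, and the snippet is not part of this repository.

```python
import hashlib


def expected_sha256(url):
    """Return the checksum segment that precedes the file name in the URL."""
    return url.rstrip("/").split("/")[-2]


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large checkpoints need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, url):
    """True if the file at path matches the checksum embedded in url."""
    return file_sha256(path) == expected_sha256(url)
```

A call such as `is_intact("pretrained_models/ViT-B-32.pt", url)`, with `url` set to the corresponding link, returns `False` for a truncated or corrupted download.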


## Source Repository and Issue Feedback

* https://developer.sourcefind.cn/codes/modelzoo/clip_pytorch

## References

* https://github.com/openai/CLIP