".github/git@developer.sourcefind.cn:orangecat/ollama.git" did not exist on "d8be22e47d460d1483846e2effb9b67fbfce1c0b"
README.md 13.8 KB
Newer Older
chenzk's avatar
v1.0  
chenzk committed
1
2
3
4
5
6
7
8
# Search-o1
动态获取和整合外部知识,无需训练即可赋予开源模型CoT“慢思考”能力,属于推理版o1。

## 论文
`Search-o1: Agentic Search-Enhanced Large Reasoning Models`
- https://arxiv.org/abs/2501.05366

## 模型结构
chenzk's avatar
v1.0.2  
chenzk committed
9
本项目实验效果时LLM模型采用Qwen2.5 作为示例,模型结构类似Llama系列,采用极简Decoder-only结构,Llama源自基本的transformer结构,主体为attention(QKV自点积)+ffn(全连接),最后外加一个softmax进行概率转换输出即可,为了使数据分布归一化方便训练收敛,在attention、ffn、softmax前分别再加一个RMS Norm。
chenzk's avatar
v1.0  
chenzk committed
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
<div align=center>
    <img src="./doc/llama3.png"/>
</div>

## 算法原理
通过集成自主检索增强生成机制和文档内推理模块,实现了在推理过程中动态获取和整合外部知识的能力,同时,确保推理过程的连贯性和逻辑一致性。
<div align=center>
    <img src="./doc/algorithm.png"/>
</div>

## 环境配置
```
mv Search-o1_pytorch Search-o1 # 去框架名后缀
```

### Docker(方法一)
```
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.3.0-py3.10-dtk24.04.3-ubuntu20.04
# <your IMAGE ID>为以上拉取的docker的镜像ID替换,本镜像为:b272aae8ec72
docker run -it --shm-size=64G -v $PWD/Search-o1:/home/Search-o1 -v /opt/hyhal:/opt/hyhal:ro --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name searcho1 <your IMAGE ID> bash
cd /home/Search-o1
pip install -r requirements.txt
pip install whl/lmslim-0.1.2+das.dtk24043-cp310-cp310-linux_x86_64.whl # 安装lmslim==0.1.2
pip install whl/vllm-0.6.2+das.opt1.cd549d3.dtk24043-cp310-cp310-linux_x86_64.whl # 安装vllm==0.6.2
```
### Dockerfile(方法二)
```
cd cd /home/Search-o1/docker
docker build --no-cache -t searcho1:latest .
docker run --shm-size=64G --name searcho1 -v /opt/hyhal:/opt/hyhal:ro --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video -v $PWD/../../Search-o1:/home/Search-o1 -it searcho1 bash
# 若遇到Dockerfile启动的方式安装环境需要长时间等待,可注释掉里面的pip安装,启动容器后再安装python库:pip install -r requirements.txt。
cd /home/Search-o1
pip install whl/lmslim-0.1.2+das.dtk24043-cp310-cp310-linux_x86_64.whl # 安装lmslim==0.1.2
pip install whl/vllm-0.6.2+das.opt1.cd549d3.dtk24043-cp310-cp310-linux_x86_64.whl # 安装vllm==0.6.2
```
### Anaconda(方法三)
1、关于本项目DCU显卡所需的特殊深度学习库可从光合开发者社区下载安装:
chenzk's avatar
chenzk committed
47
- https://developer.sourcefind.cn/tool/
chenzk's avatar
v1.0  
chenzk committed
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
```
DTK驱动:dtk24.04.3
python:python3.10
torch:2.3.0
torchvision:0.18.1
torchaudio:2.1.2
triton:2.1.0
vllm:0.6.2
flash-attn:2.6.1
deepspeed:0.14.2
apex:1.3.0
xformers:0.0.25
transformers:4.48.0
```

`Tips:以上dtk驱动、python、torch等DCU相关工具版本需要严格一一对应。`

2、其它非特殊库参照requirements.txt安装
```
cd /home/Search-o1
pip install -r requirements.txt
pip install whl/lmslim-0.1.2+das.dtk24043-cp310-cp310-linux_x86_64.whl # 安装lmslim==0.1.2
pip install whl/vllm-0.6.2+das.opt1.cd549d3.dtk24043-cp310-cp310-linux_x86_64.whl # 安装vllm==0.6.2
```

## 数据集
项目中提供实验性的迷你数据集`AIME``GPQA`(两个数据集的问题都难度很大)可直接使用,预处理代码参照][`data_pre_precess.ipynb`](./data/data_pre_precess.ipynb),数据集格式处理成`*.json`

数据的完整目录结构如下:
```
/home/Search-o1/data
    ├── AIME
        ├── test.json
        ...
    ├── GPQA
        ├── diamond.json
        ...
```

chenzk's avatar
v1.0.2  
chenzk committed
87
88
89
## 训练


chenzk's avatar
v1.0  
chenzk committed
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
## 推理
### 单机多卡
**Search-o1 (Ours)**
```bash
python scripts/run_search_o1.py \
    --dataset_name aime \
    --split test \
    --max_search_limit 5 \
    --max_turn 10 \
    --top_k 10 \
    --max_doc_len 3000 \
    --use_jina True \
    --model_path "YOUR_MODEL_PATH" \
    --jina_api_key "YOUR_JINA_API_KEY" \
    --bing_subscription_key "YOUR_BING_SUBSCRIPTION_KEY"
```
以上命令为参考命令,无法直接运行,完整使用`Search-o1`的功能需要花钱购买jina和bing搜索引擎的api_key,如有需求可自行购买服务,额外费用与本算法无关,本步骤不含外部搜索的可用api_key,仅供使用方法示例,故无外网搜索功能:
```
sh searcho1_gen.sh # 项目中采用Qwen2.5-72B-Instruct进行示例,建议不要采用小型模型,效果不如大型模型。
# 注:项目中默认的bing_subscription_key为无效api_key,故无法获取网络数据,仅供参考。
```

更多资料可参考源项目的[`README_origin`](./README_origin.md)

## result
由于无有效api_key,无法获取网络数据,此推理效果仅供参考示例,以方便读者了解项目使用方法:

`输入: `
[`test.json`](./data/AIME/test.json)
```
{
    "id": 0,
    "Problem_ID": 60,
    "Question": "Every morning Aya goes for a $9$-kilometer-long walk and stops at a coffee shop afterwards. When she walks at a constant speed of $s$ kilometers per hour, the walk takes her 4 hours, including $t$ minutes spent in the coffee shop. When she walks $s+2$ kilometers per hour, the walk takes her 2 hours and 24 minutes, including $t$ minutes spent in the coffee shop. Suppose Aya walks at $s+\\frac{1}{2}$ kilometers per hour. Find the number of minutes the walk takes her, including the $t$ minutes spent in the coffee shop.",
    "Solution": "$\\frac{9}{s} + t = 4$ in hours and $\\frac{9}{s+2} + t = 2.4$ in hours.\nSubtracting the second equation from the first, we get, \n$\\frac{9}{s} - \\frac{9}{s+2} = 1.6$\nMultiplying by $(s)(s+2)$, we get \n$9s+18-9s=18=1.6s^{2} + 3.2s$\nMultiplying by 5/2 on both sides, we get\n$0 = 4s^{2} + 8s - 45$\nFactoring gives us \n$(2s-5)(2s+9) = 0$, of which the solution we want is $s=2.5$.\nSubstituting this back to the first equation, we can find that $t = 0.4$ hours.\nLastly, $s + \\frac{1}{2} = 3$ kilometers per hour, so\n$\\frac{9}{3} + 0.4 = 3.4$ hours, or $\\framebox{204}$ minutes\n-Failure.net\nThe amount of hours spent while walking on the first travel is $\\frac{240-t}{6}$. Thus, we have the equation $(240-t)(s) = 540$, and by the same logic, the second equation yields $(144-t)(s+2) = 540$. We have $240s-st = 540$, and $288+144s-2t-st = 540$. We subtract the two equations to get $96s+2t-288 = 0$, so we have $48s+t = 144$, so $t = 144-48s$, and now we have $(96+48s)(s) = 540$. The numerator of $s$ must evenly divide 540, however, $s$ must be less than 3. We can guess that $s = 2.5$. Now, $2.5+0.5 = 3$. Taking $\\frac{9}{3} = 3$, we find that it will take three hours for the 9 kilometers to be traveled. The t minutes spent at the coffeeshop can be written as $144-48(2.5)$, so t = 24. $180 + 24 = 204$. -sepehr2010",
    "answer": "204"
},
...
```
参考`outputs/runs.baselines/aime.qwen2.5-72b.search_o1/test.*.json`

`输出:`
```
{
    "id": 0,
    "Problem_ID": 60,
    "Question": "<|im_start|>system\nYou are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>\n<|im_start|>user\nYou are a reasoning assistant with the ability to perform web searches to help you answer the user's question accurately. You have special tools:\n\n- To perform a search: write <|begin_search_query|> your query here <|end_search_query|>.\nThen, the system will search and analyze relevant web pages, then provide you with helpful information in the format <|begin_search_result|> ...search results... <|end_search_result|>.\n\nYou can repeat the search process multiple times if necessary. The maximum number of search attempts is limited to 5.\n\nOnce you have all the information you need, continue your reasoning.\n\nExample:\nQuestion: \"How do you compute the integral of e^(x^2) dx?\"\nAssistant thinking steps:\n- I might need to look up techniques for integrating e^(x^2).\n\nAssistant:\n<|begin_search_query|>methods to integrate e^(x^2)<|end_search_query|>\n\n(System returns processed information from relevant web pages)\n\nAssistant continues reasoning with the new information...\n\nRemember:\n- Use <|begin_search_query|> to request a web search and end with <|end_search_query|>.\n- When done searching, continue your reasoning.\n\nPlease answer the following math question. You should think step by step to solve it.\n\nProvide your final answer in the format \\boxed{YOUR_ANSWER}.\n\nQuestion:\nEvery morning Aya goes for a $9$-kilometer-long walk and stops at a coffee shop afterwards. When she walks at a constant speed of $s$ kilometers per hour, the walk takes her 4 hours, including $t$ minutes spent in the coffee shop. When she walks $s+2$ kilometers per hour, the walk takes her 2 hours and 24 minutes, including $t$ minutes spent in the coffee shop. Suppose Aya walks at $s+\\frac{1}{2}$ kilometers per hour. Find the number of minutes the walk takes her, including the $t$ minutes spent in the coffee shop.\n\n<|im_end|>\n<|im_start|>assistant\n",
    "Solution": "$\\frac{9}{s} + t = 4$ in hours and $\\frac{9}{s+2} + t = 2.4$ in hours.\nSubtracting the second equation from the first, we get, \n$\\frac{9}{s} - \\frac{9}{s+2} = 1.6$\nMultiplying by $(s)(s+2)$, we get \n$9s+18-9s=18=1.6s^{2} + 3.2s$\nMultiplying by 5/2 on both sides, we get\n$0 = 4s^{2} + 8s - 45$\nFactoring gives us \n$(2s-5)(2s+9) = 0$, of which the solution we want is $s=2.5$.\nSubstituting this back to the first equation, we can find that $t = 0.4$ hours.\nLastly, $s + \\frac{1}{2} = 3$ kilometers per hour, so\n$\\frac{9}{3} + 0.4 = 3.4$ hours, or $\\framebox{204}$ minutes\n-Failure.net\nThe amount of hours spent while walking on the first travel is $\\frac{240-t}{6}$. Thus, we have the equation $(240-t)(s) = 540$, and by the same logic, the second equation yields $(144-t)(s+2) = 540$. We have $240s-st = 540$, and $288+144s-2t-st = 540$. We subtract the two equations to get $96s+2t-288 = 0$, so we have $48s+t = 144$, so $t = 144-48s$, and now we have $(96+48s)(s) = 540$. The numerator of $s$ must evenly divide 540, however, $s$ must be less than 3. We can guess that $s = 2.5$. Now, $2.5+0.5 = 3$. Taking $\\frac{9}{3} = 3$, we find that it will take three hours for the 9 kilometers to be traveled. The t minutes spent at the coffeeshop can be written as $144-48(2.5)$, so t = 24. $180 + 24 = 204$. -sepehr2010",
    "answer": "204",
    "Output": "To solve the problem, we need to determine the value of \\( t \\) and then use it to find the total time Aya spends walking and in the coffee shop when she walks at \\( s + \\frac{1}{2} \\) kilometers per hour.\n\nFirst, let's set up the equations based on the information given:\n\n1. When Aya walks at a speed of \\( s \\) kilometers per hour, the walk takes her 4 hours, including \\( t \\) minutes spent in the coffee shop.\n2. When Aya walks at a speed of \\( s + 2 \\) kilometers per hour, the walk takes her 2 hours and 24 minutes, including \\( t \\) minutes spent in the coffee shop.\n\n### Step 1: Convert all times to hours\n- 4 hours is already in hours.\n- 2 hours and 24 minutes can be converted to hours: \\( 2 + \\frac{24}{60} = 2.4 \\) hours.\n\n### Step 2: Set up the equations\nLet \\( T_1 \\) be the time spent walking at speed \\( s \\) and \\( T_2 \\) be the time spent walking at speed \\( s + 2 \\).\n\nFrom the first condition:\n\\[ T_1 + \\frac{t}{60} = 4 \\]\nSince \\( T_1 = \\frac{9}{s} \\):\n\\[ \\frac{9}{s} + \\frac{t}{60} = 4 \\]\n\nFrom the second condition:\n\\[ T_2 + \\frac{t}{60} = 2.4 \\]\nSince \\( T_2 = \\frac{9}{s + 2} \\):\n\\[ \\frac{9}{s + 2} + \\frac{t}{60} = 2.4 \\]\n\n### Step 3: Solve the system of equations\nWe have:\n\\[ \\frac{9}{s} + \\frac{t}{60} = 4 \\]\n\\[ \\frac{9}{s + 2} + \\frac{t}{60} = 2.4 \\]\n\nSubtract the second equation from the first:\n\\[ \\left( \\frac{9}{s} + \\frac{t}{60} \\right) - \\left( \\frac{9}{s + 2} + \\frac{t}{60} \\right) = 4 - 2.4 \\]\n\\[ \\frac{9}{s} - \\frac{9}{s + 2} = 1.6 \\]\n\n### Step 4: Simplify the equation\n\\[ 9 \\left( \\frac{1}{s} - \\frac{1}{s + 2} \\right) = 1.6 \\]\n\\[ 9 \\left( \\frac{(s + 2) - s}{s(s + 2)} \\right) = 1.6 \\]\n\\[ 9 \\left( \\frac{2}{s(s + 2)} \\right) = 1.6 \\]\n\\[ \\frac{18}{s(s + 2)} = 1.6 \\]\n\\[ 18 = 1.6s(s + 2) \\]\n\\[ 18 = 1.6s^2 + 3.2s \\]\n\\[ 1.6s^2 + 3.2s - 18 = 0 \\]\n\n### Step 5: Solve the quadratic equation\nDivide the entire equation by 1.6:\n\\[ s^2 + 2s - 11.25 = 0 \\]\n\nUse the quadratic formula \\( s = \\frac{-b \\pm \\sqrt{b^2 - 4ac}}{2a} \\):\n\\[ s = \\frac{-2 \\pm \\sqrt{2^2 - 4 \\cdot 1 \\cdot (-11.25)}}{2 \\cdot 1} \\]\n\\[ s = \\frac{-2 \\pm \\sqrt{4 + 45}}{2} \\]\n\\[ s = \\frac{-2 \\pm \\sqrt{49}}{2} \\]\n\\[ s = \\frac{-2 \\pm 7}{2} \\]\n\nThis gives us two solutions:\n\\[ s = \\frac{5}{2} = 2.5 \\]\n\\[ s = \\frac{-9}{2} = -4.5 \\]\n\nSince speed cannot be negative, we have:\n\\[ s = 2.5 \\]\n\n### Step 6: Find \\( t \\)\nSubstitute \\( s = 2.5 \\) back into one of the original equations:\n\\[ \\frac{9}{2.5} + \\frac{t}{60} = 4 \\]\n\\[ 3.6 + \\frac{t}{60} = 4 \\]\n\\[ \\frac{t}{60} = 0.4 \\]\n\\[ t = 0.4 \\times 60 \\]\n\\[ t = 24 \\]\n\n### Step 7: Calculate the total time when \\( s = 2.5 + 0.5 = 3 \\)\n\\[ T = \\frac{9}{3} + \\frac{24}{60} \\]\n\\[ T = 3 + 0.4 \\]\n\\[ T = 3.4 \\]\n\nConvert 3.4 hours to minutes:\n\\[ 3.4 \\times 60 = 204 \\]\n\nThus, the total time Aya spends walking and in the coffee shop when she walks at \\( s + \\frac{1}{2} \\) kilometers per hour is \\(\\boxed{204}\\) minutes.",
    "Pred_Answer": "204",
    "Metrics": {
        "is_valid_answer": true,
        "acc": 1,
        "em": 1,
        "f1": 1.0,
        "math_equal": true
    }
},
...
```

### 精度
DCU与GPU精度一致,推理框架:pytorch。

## 应用场景
### 算法类别
`对话问答`
### 热点应用行业
`制造,广媒,金融,能源,医疗,家居,教育`
## 预训练权重
Hugging Face下载地址为:[Qwen/Qwen2.5-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct)
## 源码仓库及问题反馈
- http://developer.sourcefind.cn/codes/modelzoo/search-o1_pytorch.git
## 参考资料
- https://github.com/sunnynexus/Search-o1.git