# Search-o1 动态获取和整合外部知识,无需训练即可赋予开源模型CoT“慢思考”能力,属于推理版o1。 ## 论文 `Search-o1: Agentic Search-Enhanced Large Reasoning Models` - https://arxiv.org/abs/2501.05366 ## 模型结构 本项目实验效果时LLM模型采用Qwen2.5 作为示例,模型结构类似Llama系列,采用极简Decoder-only结构,Llama源自基本的transformer结构,主体为attention(QKV自点积)+ffn(全连接),最后外加一个softmax进行概率转换输出即可,为了使数据分布归一化方便训练收敛,在attention、ffn、softmax前分别再加一个RMS Norm。
## 算法原理 通过集成自主检索增强生成机制和文档内推理模块,实现了在推理过程中动态获取和整合外部知识的能力,同时,确保推理过程的连贯性和逻辑一致性。
## 环境配置 ``` mv Search-o1_pytorch Search-o1 # 去框架名后缀 ``` ### Docker(方法一) ``` docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.3.0-py3.10-dtk24.04.3-ubuntu20.04 # 为以上拉取的docker的镜像ID替换,本镜像为:b272aae8ec72 docker run -it --shm-size=64G -v $PWD/Search-o1:/home/Search-o1 -v /opt/hyhal:/opt/hyhal:ro --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name searcho1 bash cd /home/Search-o1 pip install -r requirements.txt pip install whl/lmslim-0.1.2+das.dtk24043-cp310-cp310-linux_x86_64.whl # 安装lmslim==0.1.2 pip install whl/vllm-0.6.2+das.opt1.cd549d3.dtk24043-cp310-cp310-linux_x86_64.whl # 安装vllm==0.6.2 ``` ### Dockerfile(方法二) ``` cd cd /home/Search-o1/docker docker build --no-cache -t searcho1:latest . docker run --shm-size=64G --name searcho1 -v /opt/hyhal:/opt/hyhal:ro --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video -v $PWD/../../Search-o1:/home/Search-o1 -it searcho1 bash # 若遇到Dockerfile启动的方式安装环境需要长时间等待,可注释掉里面的pip安装,启动容器后再安装python库:pip install -r requirements.txt。 cd /home/Search-o1 pip install whl/lmslim-0.1.2+das.dtk24043-cp310-cp310-linux_x86_64.whl # 安装lmslim==0.1.2 pip install whl/vllm-0.6.2+das.opt1.cd549d3.dtk24043-cp310-cp310-linux_x86_64.whl # 安装vllm==0.6.2 ``` ### Anaconda(方法三) 1、关于本项目DCU显卡所需的特殊深度学习库可从光合开发者社区下载安装: - https://developer.sourcefind.cn/tool/ ``` DTK驱动:dtk24.04.3 python:python3.10 torch:2.3.0 torchvision:0.18.1 torchaudio:2.1.2 triton:2.1.0 vllm:0.6.2 flash-attn:2.6.1 deepspeed:0.14.2 apex:1.3.0 xformers:0.0.25 transformers:4.48.0 ``` `Tips:以上dtk驱动、python、torch等DCU相关工具版本需要严格一一对应。` 2、其它非特殊库参照requirements.txt安装 ``` cd /home/Search-o1 pip install -r requirements.txt pip install whl/lmslim-0.1.2+das.dtk24043-cp310-cp310-linux_x86_64.whl # 安装lmslim==0.1.2 pip install whl/vllm-0.6.2+das.opt1.cd549d3.dtk24043-cp310-cp310-linux_x86_64.whl # 安装vllm==0.6.2 ``` ## 数据集 项目中提供实验性的迷你数据集`AIME`与`GPQA`(两个数据集的问题都难度很大)可直接使用,预处理代码参照][`data_pre_precess.ipynb`](./data/data_pre_precess.ipynb),数据集格式处理成`*.json`。 数据的完整目录结构如下: ``` /home/Search-o1/data ├── AIME ├── test.json ... ├── GPQA ├── diamond.json ... ``` ## 训练 无 ## 推理 ### 单机多卡 **Search-o1 (Ours)** ```bash python scripts/run_search_o1.py \ --dataset_name aime \ --split test \ --max_search_limit 5 \ --max_turn 10 \ --top_k 10 \ --max_doc_len 3000 \ --use_jina True \ --model_path "YOUR_MODEL_PATH" \ --jina_api_key "YOUR_JINA_API_KEY" \ --bing_subscription_key "YOUR_BING_SUBSCRIPTION_KEY" ``` 以上命令为参考命令,无法直接运行,完整使用`Search-o1`的功能需要花钱购买jina和bing搜索引擎的api_key,如有需求可自行购买服务,额外费用与本算法无关,本步骤不含外部搜索的可用api_key,仅供使用方法示例,故无外网搜索功能: ``` sh searcho1_gen.sh # 项目中采用Qwen2.5-72B-Instruct进行示例,建议不要采用小型模型,效果不如大型模型。 # 注:项目中默认的bing_subscription_key为无效api_key,故无法获取网络数据,仅供参考。 ``` 更多资料可参考源项目的[`README_origin`](./README_origin.md) ## result 由于无有效api_key,无法获取网络数据,此推理效果仅供参考示例,以方便读者了解项目使用方法: `输入: ` [`test.json`](./data/AIME/test.json) ``` { "id": 0, "Problem_ID": 60, "Question": "Every morning Aya goes for a $9$-kilometer-long walk and stops at a coffee shop afterwards. When she walks at a constant speed of $s$ kilometers per hour, the walk takes her 4 hours, including $t$ minutes spent in the coffee shop. When she walks $s+2$ kilometers per hour, the walk takes her 2 hours and 24 minutes, including $t$ minutes spent in the coffee shop. Suppose Aya walks at $s+\\frac{1}{2}$ kilometers per hour. Find the number of minutes the walk takes her, including the $t$ minutes spent in the coffee shop.", "Solution": "$\\frac{9}{s} + t = 4$ in hours and $\\frac{9}{s+2} + t = 2.4$ in hours.\nSubtracting the second equation from the first, we get, \n$\\frac{9}{s} - \\frac{9}{s+2} = 1.6$\nMultiplying by $(s)(s+2)$, we get \n$9s+18-9s=18=1.6s^{2} + 3.2s$\nMultiplying by 5/2 on both sides, we get\n$0 = 4s^{2} + 8s - 45$\nFactoring gives us \n$(2s-5)(2s+9) = 0$, of which the solution we want is $s=2.5$.\nSubstituting this back to the first equation, we can find that $t = 0.4$ hours.\nLastly, $s + \\frac{1}{2} = 3$ kilometers per hour, so\n$\\frac{9}{3} + 0.4 = 3.4$ hours, or $\\framebox{204}$ minutes\n-Failure.net\nThe amount of hours spent while walking on the first travel is $\\frac{240-t}{6}$. Thus, we have the equation $(240-t)(s) = 540$, and by the same logic, the second equation yields $(144-t)(s+2) = 540$. We have $240s-st = 540$, and $288+144s-2t-st = 540$. We subtract the two equations to get $96s+2t-288 = 0$, so we have $48s+t = 144$, so $t = 144-48s$, and now we have $(96+48s)(s) = 540$. The numerator of $s$ must evenly divide 540, however, $s$ must be less than 3. We can guess that $s = 2.5$. Now, $2.5+0.5 = 3$. Taking $\\frac{9}{3} = 3$, we find that it will take three hours for the 9 kilometers to be traveled. The t minutes spent at the coffeeshop can be written as $144-48(2.5)$, so t = 24. $180 + 24 = 204$. -sepehr2010", "answer": "204" }, ... ``` 参考`outputs/runs.baselines/aime.qwen2.5-72b.search_o1/test.*.json` `输出:` ``` { "id": 0, "Problem_ID": 60, "Question": "<|im_start|>system\nYou are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>\n<|im_start|>user\nYou are a reasoning assistant with the ability to perform web searches to help you answer the user's question accurately. You have special tools:\n\n- To perform a search: write <|begin_search_query|> your query here <|end_search_query|>.\nThen, the system will search and analyze relevant web pages, then provide you with helpful information in the format <|begin_search_result|> ...search results... <|end_search_result|>.\n\nYou can repeat the search process multiple times if necessary. The maximum number of search attempts is limited to 5.\n\nOnce you have all the information you need, continue your reasoning.\n\nExample:\nQuestion: \"How do you compute the integral of e^(x^2) dx?\"\nAssistant thinking steps:\n- I might need to look up techniques for integrating e^(x^2).\n\nAssistant:\n<|begin_search_query|>methods to integrate e^(x^2)<|end_search_query|>\n\n(System returns processed information from relevant web pages)\n\nAssistant continues reasoning with the new information...\n\nRemember:\n- Use <|begin_search_query|> to request a web search and end with <|end_search_query|>.\n- When done searching, continue your reasoning.\n\nPlease answer the following math question. You should think step by step to solve it.\n\nProvide your final answer in the format \\boxed{YOUR_ANSWER}.\n\nQuestion:\nEvery morning Aya goes for a $9$-kilometer-long walk and stops at a coffee shop afterwards. When she walks at a constant speed of $s$ kilometers per hour, the walk takes her 4 hours, including $t$ minutes spent in the coffee shop. When she walks $s+2$ kilometers per hour, the walk takes her 2 hours and 24 minutes, including $t$ minutes spent in the coffee shop. Suppose Aya walks at $s+\\frac{1}{2}$ kilometers per hour. Find the number of minutes the walk takes her, including the $t$ minutes spent in the coffee shop.\n\n<|im_end|>\n<|im_start|>assistant\n", "Solution": "$\\frac{9}{s} + t = 4$ in hours and $\\frac{9}{s+2} + t = 2.4$ in hours.\nSubtracting the second equation from the first, we get, \n$\\frac{9}{s} - \\frac{9}{s+2} = 1.6$\nMultiplying by $(s)(s+2)$, we get \n$9s+18-9s=18=1.6s^{2} + 3.2s$\nMultiplying by 5/2 on both sides, we get\n$0 = 4s^{2} + 8s - 45$\nFactoring gives us \n$(2s-5)(2s+9) = 0$, of which the solution we want is $s=2.5$.\nSubstituting this back to the first equation, we can find that $t = 0.4$ hours.\nLastly, $s + \\frac{1}{2} = 3$ kilometers per hour, so\n$\\frac{9}{3} + 0.4 = 3.4$ hours, or $\\framebox{204}$ minutes\n-Failure.net\nThe amount of hours spent while walking on the first travel is $\\frac{240-t}{6}$. Thus, we have the equation $(240-t)(s) = 540$, and by the same logic, the second equation yields $(144-t)(s+2) = 540$. We have $240s-st = 540$, and $288+144s-2t-st = 540$. We subtract the two equations to get $96s+2t-288 = 0$, so we have $48s+t = 144$, so $t = 144-48s$, and now we have $(96+48s)(s) = 540$. The numerator of $s$ must evenly divide 540, however, $s$ must be less than 3. We can guess that $s = 2.5$. Now, $2.5+0.5 = 3$. Taking $\\frac{9}{3} = 3$, we find that it will take three hours for the 9 kilometers to be traveled. The t minutes spent at the coffeeshop can be written as $144-48(2.5)$, so t = 24. $180 + 24 = 204$. -sepehr2010", "answer": "204", "Output": "To solve the problem, we need to determine the value of \\( t \\) and then use it to find the total time Aya spends walking and in the coffee shop when she walks at \\( s + \\frac{1}{2} \\) kilometers per hour.\n\nFirst, let's set up the equations based on the information given:\n\n1. When Aya walks at a speed of \\( s \\) kilometers per hour, the walk takes her 4 hours, including \\( t \\) minutes spent in the coffee shop.\n2. When Aya walks at a speed of \\( s + 2 \\) kilometers per hour, the walk takes her 2 hours and 24 minutes, including \\( t \\) minutes spent in the coffee shop.\n\n### Step 1: Convert all times to hours\n- 4 hours is already in hours.\n- 2 hours and 24 minutes can be converted to hours: \\( 2 + \\frac{24}{60} = 2.4 \\) hours.\n\n### Step 2: Set up the equations\nLet \\( T_1 \\) be the time spent walking at speed \\( s \\) and \\( T_2 \\) be the time spent walking at speed \\( s + 2 \\).\n\nFrom the first condition:\n\\[ T_1 + \\frac{t}{60} = 4 \\]\nSince \\( T_1 = \\frac{9}{s} \\):\n\\[ \\frac{9}{s} + \\frac{t}{60} = 4 \\]\n\nFrom the second condition:\n\\[ T_2 + \\frac{t}{60} = 2.4 \\]\nSince \\( T_2 = \\frac{9}{s + 2} \\):\n\\[ \\frac{9}{s + 2} + \\frac{t}{60} = 2.4 \\]\n\n### Step 3: Solve the system of equations\nWe have:\n\\[ \\frac{9}{s} + \\frac{t}{60} = 4 \\]\n\\[ \\frac{9}{s + 2} + \\frac{t}{60} = 2.4 \\]\n\nSubtract the second equation from the first:\n\\[ \\left( \\frac{9}{s} + \\frac{t}{60} \\right) - \\left( \\frac{9}{s + 2} + \\frac{t}{60} \\right) = 4 - 2.4 \\]\n\\[ \\frac{9}{s} - \\frac{9}{s + 2} = 1.6 \\]\n\n### Step 4: Simplify the equation\n\\[ 9 \\left( \\frac{1}{s} - \\frac{1}{s + 2} \\right) = 1.6 \\]\n\\[ 9 \\left( \\frac{(s + 2) - s}{s(s + 2)} \\right) = 1.6 \\]\n\\[ 9 \\left( \\frac{2}{s(s + 2)} \\right) = 1.6 \\]\n\\[ \\frac{18}{s(s + 2)} = 1.6 \\]\n\\[ 18 = 1.6s(s + 2) \\]\n\\[ 18 = 1.6s^2 + 3.2s \\]\n\\[ 1.6s^2 + 3.2s - 18 = 0 \\]\n\n### Step 5: Solve the quadratic equation\nDivide the entire equation by 1.6:\n\\[ s^2 + 2s - 11.25 = 0 \\]\n\nUse the quadratic formula \\( s = \\frac{-b \\pm \\sqrt{b^2 - 4ac}}{2a} \\):\n\\[ s = \\frac{-2 \\pm \\sqrt{2^2 - 4 \\cdot 1 \\cdot (-11.25)}}{2 \\cdot 1} \\]\n\\[ s = \\frac{-2 \\pm \\sqrt{4 + 45}}{2} \\]\n\\[ s = \\frac{-2 \\pm \\sqrt{49}}{2} \\]\n\\[ s = \\frac{-2 \\pm 7}{2} \\]\n\nThis gives us two solutions:\n\\[ s = \\frac{5}{2} = 2.5 \\]\n\\[ s = \\frac{-9}{2} = -4.5 \\]\n\nSince speed cannot be negative, we have:\n\\[ s = 2.5 \\]\n\n### Step 6: Find \\( t \\)\nSubstitute \\( s = 2.5 \\) back into one of the original equations:\n\\[ \\frac{9}{2.5} + \\frac{t}{60} = 4 \\]\n\\[ 3.6 + \\frac{t}{60} = 4 \\]\n\\[ \\frac{t}{60} = 0.4 \\]\n\\[ t = 0.4 \\times 60 \\]\n\\[ t = 24 \\]\n\n### Step 7: Calculate the total time when \\( s = 2.5 + 0.5 = 3 \\)\n\\[ T = \\frac{9}{3} + \\frac{24}{60} \\]\n\\[ T = 3 + 0.4 \\]\n\\[ T = 3.4 \\]\n\nConvert 3.4 hours to minutes:\n\\[ 3.4 \\times 60 = 204 \\]\n\nThus, the total time Aya spends walking and in the coffee shop when she walks at \\( s + \\frac{1}{2} \\) kilometers per hour is \\(\\boxed{204}\\) minutes.", "Pred_Answer": "204", "Metrics": { "is_valid_answer": true, "acc": 1, "em": 1, "f1": 1.0, "math_equal": true } }, ... ``` ### 精度 DCU与GPU精度一致,推理框架:pytorch。 ## 应用场景 ### 算法类别 `对话问答` ### 热点应用行业 `制造,广媒,金融,能源,医疗,家居,教育` ## 预训练权重 Hugging Face下载地址为:[Qwen/Qwen2.5-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct) ## 源码仓库及问题反馈 - http://developer.sourcefind.cn/codes/modelzoo/search-o1_pytorch.git ## 参考资料 - https://github.com/sunnynexus/Search-o1.git