Modified MIT License
Copyright (c) 2026 Moonshot AI
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the “Software”), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
Our only modification is that, if the Software (or any derivative works
thereof) is used for any of your commercial products or services that have
more than 100 million monthly active users, or more than 20 million US dollars
(or equivalent in other currencies) in monthly revenue, you shall prominently
display "Kimi K2.5" on the user interface of such product or service.
# Kimi-K2.5
## Paper
[Kimi-K2.5](https://arxiv.org/abs/2602.02276)
## Model Overview
Kimi K2.5 is an open-source, natively multimodal agentic model. It is built on Kimi-K2-Base through continued pretraining on roughly 15 trillion mixed vision and text tokens, and seamlessly integrates vision and language understanding with advanced agentic capabilities, instant and thinking modes, and both chat and agentic paradigms.
**Key Features**
- Native multimodality: K2.5 is pretrained on vision-language tokens and excels at visual knowledge, cross-modal reasoning, and agentic tool use grounded in visual inputs.
- Visual coding: K2.5 generates code from visual specifications (UI designs, video workflows) and autonomously orchestrates tools for visual data processing.
- Agent swarm: K2.5 moves beyond single-agent scaling to self-directed, coordinated swarm-style execution. It decomposes complex tasks into parallel subtasks carried out by dynamically instantiated, domain-specific agents.
<div align=center>
<img src="./doc/kimi-bar-chart.png"/>
</div>
## Environment Dependencies
| Software | Version |
| :------: |:-----------------------------------------:|
| DTK | 26.04 |
| python | 3.10.12 |
| transformers | 4.57.6 |
| vllm | 0.15.1+das.opt1.alpha.dtk2604.20260220.g2799735a |
| triton | 3.3.0+das.opt2.dtk2604.torch291.20260210.g1329924c |
| torch | 2.9.0+das.opt1.dtk2604.20260206.g275d08c2 |
Currently only the following image is supported: harbor.sourcefind.cn:5443/dcu/admin/base/custom:vllm0.15.1-ubuntu22.04-dtk26.04-0130-py3.10-20260220
**Note**: This image version is not yet stable. It is recommended for trial use only; deployment in production environments is not recommended.
- Adjust the `-v` mount paths to match where your model and code actually live.
```bash
docker run -it \
--shm-size 200g \
--network=host \
--name Kimi-K2.5 \
--privileged \
--device=/dev/kfd \
--device=/dev/dri \
--device=/dev/mkfd \
--group-add video \
--cap-add=SYS_PTRACE \
--security-opt seccomp=unconfined \
-u root \
-v /opt/hyhal/:/opt/hyhal/:ro \
-v /path/your_code_data/:/path/your_code_data/ \
harbor.sourcefind.cn:5443/dcu/admin/base/custom:vllm0.15.1-ubuntu22.04-dtk26.04-0130-py3.10-20260220 bash
```
More images are available for download from [光源](https://sourcefind.cn/#/service-list).
The special deep-learning libraries required by the DCU cards in this project can be downloaded and installed from the [光合](https://developer.sourcefind.cn/tool/) developer community. The pycountry library must be installed separately:
```bash
pip install pycountry
```
## Dataset
None at the moment.
## Training
None at the moment.
## Inference
### vllm
#### Multi-node Inference
1. Set environment variables
> Please note:
> Write the environment variables for every node into a `.sh` file; after saving, `source` that `.sh` file on each compute node.
>
> VLLM_HOST_IP: IP of the node's local communication interface. Prefer the IP of an IB NIC to **avoid RCCL timeout issues**.
>
> NCCL_SOCKET_IFNAME and GLOO_SOCKET_IFNAME: the interface name that corresponds to the node's local communication IP.
>
> To find the communication interface and its IP, use `ifconfig` (see the sketch after this note).
>
> To check IB port status, use `ibstat`. A port is usable only when its state is Active, and all nodes should be configured consistently.
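Before filling in the variables below, you can look up the IB interface name, its IP, and the port state roughly as follows (a minimal sketch; interface and HCA names such as `ib0` or `mlx5_0` vary from machine to machine):
```bash
# List interfaces whose names start with "ib" and show the line with their IP
ifconfig | grep -A 1 "^ib"
# Show HCA names, port state, and rate; the port must report "State: Active"
ibstat | grep -E "CA '|State:|Rate:"
```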
```bash
export ALLREDUCE_STREAM_WITH_COMPUTE=1
export VLLM_HOST_IP=x.x.x.x # IP of this compute node; use the address bound to the IB interface named in SOCKET_IFNAME
export NCCL_SOCKET_IFNAME=ibxxxx
export GLOO_SOCKET_IFNAME=ibxxxx
export NCCL_IB_HCA=mlx5_0:1 # name of the IB HCA in this environment
unset NCCL_ALGO
export NCCL_MIN_NCHANNELS=16
export NCCL_MAX_NCHANNELS=16
export NCCL_NET_GDR_READ=1
export HIP_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
export VLLM_SPEC_DECODE_EAGER=1
export VLLM_MLA_DISABLE=0
export VLLM_USE_FLASH_MLA=1
export VLLM_RPC_TIMEOUT=1800000
# NUMA/core binding for Hygon CPUs
export VLLM_NUMA_BIND=1
export VLLM_RANK0_NUMA=0
export VLLM_RANK1_NUMA=1
export VLLM_RANK2_NUMA=2
export VLLM_RANK3_NUMA=3
export VLLM_RANK4_NUMA=4
export VLLM_RANK5_NUMA=5
export VLLM_RANK6_NUMA=6
export VLLM_RANK7_NUMA=7
```
2. Start the Ray cluster
> x.x.x.x corresponds to VLLM_HOST_IP from step 1
```bash
# run on the head node
ray start --head --node-ip-address=x.x.x.x --port=6379 --num-gpus=8 --num-cpus=32
# run on each worker node
ray start --address='x.x.x.x:6379' --num-gpus=8 --num-cpus=32
```
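Once both commands have returned, you can check from the head node that every node and all of its GPUs have registered with the cluster before starting the server:
```bash
# Should report the expected number of nodes, GPUs, and CPUs in the cluster
ray status
```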
3. Start the vLLM server
```bash
## launch the server
vllm serve moonshotai/Kimi-K2.5 \
-tp 32 \
--distributed-executor-backend ray \
--mm-encoder-tp-mode data \
--trust-remote-code \
--tool-call-parser kimi_k2 \
--reasoning-parser kimi_k2
## client request (the server listens on port 8000 by default, since no --port is set above)
curl http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "moonshotai/Kimi-K2.5",
"messages": [
{"role": "user", "content": "which one is bigger, 9.11 or 9.9? think carefully"}
],
"temperature": 0.6
}'
```
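Because K2.5 is natively multimodal, the same OpenAI-compatible endpoint can also take image inputs using the `image_url` content type. A minimal sketch (the image URL is a placeholder, and vision requests assume the server was started as above):
```bash
curl http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "moonshotai/Kimi-K2.5",
        "messages": [
            {"role": "user", "content": [
                {"type": "image_url", "image_url": {"url": "https://example.com/sample.png"}},
                {"type": "text", "text": "Describe this image."}
            ]}
        ],
        "temperature": 0.6
    }'
```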
## Results
<div align=center>
<img src="./doc/result-dcu.png"/>
</div>
### Accuracy
Accuracy on DCU matches that on GPU. Inference framework: vLLM.
## Pretrained Weights
| Model | Weight Size | DCU Model | Minimum Number of Cards | Download |
|:------:|:----:|:----------:|:------:|:---------------------:|
| Kimi-K2.5 | 1.1T | BW1000 | 32 | [Hugging Face](https://huggingface.co/moonshotai/Kimi-K2.5) |
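The weights are roughly 1.1 TB, so make sure the target disk has enough free space. One way to fetch them is the Hugging Face CLI; a minimal sketch, with the local directory as a placeholder:
```bash
# Install the Hugging Face Hub client if it is not already in the image
pip install -U huggingface_hub
# Download the full repository to a local directory (path is a placeholder)
huggingface-cli download moonshotai/Kimi-K2.5 --local-dir /path/your_code_data/Kimi-K2.5
```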
## Source Repository and Issue Feedback
- https://developer.sourcefind.cn/codes/modelzoo/kimi-k2.5_vllm
## References
- https://github.com/moonshotai/Kimi-K2.5
# Unique model identifier
modelCode=2153
# Model name
modelName=kimi-k2.5_vllm
# Model description
modelDescription=As a natively multimodal model, Kimi K2.5 delivers state-of-the-art coding and vision capabilities together with an autonomous agent-swarm paradigm.
# Run type (推理 = inference)
processType=推理
# Application category (对话问答 = dialogue / Q&A)
appCategory=对话问答
# Framework type
frameType=vllm
# Accelerator card type
accelerateType=BW1000