Commit bc137ddf authored by chenych's avatar chenych

Update

parent 1ff71e49
```bash
docker pull image.sourcefind.cn:5000/dcu/admin/base/custom:vllm0.8.5-ubuntu22.04
docker run -it --shm-size 200g --network=host --name {docker_name} --privileged --device=/dev/kfd --device=/dev/dri --device=/dev/mkfd --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -u root -v /path/your_code_data/:/path/your_code_data/ -v /opt/hyhal/:/opt/hyhal/:ro {imageID} bash
cd /your_code_path/mistral_pytorch
```
### Dockerfile (Method 2)
```bash
docker build --no-cache -t mistral:latest .
docker run -it --shm-size 200g --network=host --name {docker_name} --privileged --device=/dev/kfd --device=/dev/dri --device=/dev/mkfd --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -u root -v /path/your_code_data/:/path/your_code_data/ -v /opt/hyhal/:/opt/hyhal/:ro {imageID} bash
cd /your_code_path/mistral_pytorch
```
### Anaconda (Method 3)
The special deep learning libraries this project requires for DCU GPUs can be downloaded from the [光合](https://developer.sourcefind.cn/tool/) developer community.
```bash
DTK: 25.04
python: 3.10
vllm: 0.8.5
torch: 2.4.1+das.opt2.dtk2504
deepspeed: 0.14.2+das.opt2.dtk2504
```
`Tips: the versions of the DTK driver, python, torch, and the other DCU-related tools above must correspond strictly one-to-one.`
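Since the tip above requires the DTK-related wheels to match exactly, one quick sanity check is to compare the local-version suffix (the part after `+`) of each installed package. `dtk_suffix` below is a hypothetical helper for illustration, not part of this repo:

```python
# Hypothetical helper (not from this repo): extract the local-version suffix
# (the part after '+') so DCU wheels such as torch 2.4.1+das.opt2.dtk2504 and
# deepspeed 0.14.2+das.opt2.dtk2504 can be compared for an exact DTK match.
def dtk_suffix(version: str) -> str:
    """Return the local-version segment after '+', or '' if there is none."""
    _, _, local = version.partition("+")
    return local

# Both wheels listed above carry the same dtk2504 suffix.
print(dtk_suffix("2.4.1+das.opt2.dtk2504"))  # das.opt2.dtk2504
print(dtk_suffix("0.14.2+das.opt2.dtk2504") == dtk_suffix("2.4.1+das.opt2.dtk2504"))  # True
```

In practice you would feed it `torch.__version__` and `deepspeed.__version__` from inside the environment.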
## Dataset
For SFT training script examples, refer to the corresponding yaml files under `llama-factory/train_lora`.
Parameter descriptions are the same as in [full-parameter fine-tuning](#全参微调).
## Inference
### vllm
#### offline
```bash
python infer_vllm.py --model_name_or_path /path_of/mistralai/Mistral-7B-Instruct-v0.3
```
#### server
1. Start the server
```bash
vllm serve mistralai/Mistral-7B-Instruct-v0.3 --tokenizer_mode mistral --config_format mistral --load_format mistral --served-model-name Mistral-7B-Instruct --trust-remote-code --enforce-eager
```
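If you prefer starting the server from a Python supervisor script, the shell command above can be expressed as an argv list; this is an illustrative sketch, not code from the repo:

```python
# Illustrative sketch (not from this repo): the same `vllm serve` invocation
# as above, expressed as an argv list so it can be launched via subprocess.
import subprocess  # used only by the commented-out launch line below

serve_cmd = [
    "vllm", "serve", "mistralai/Mistral-7B-Instruct-v0.3",
    "--tokenizer_mode", "mistral",
    "--config_format", "mistral",
    "--load_format", "mistral",
    "--served-model-name", "Mistral-7B-Instruct",
    "--trust-remote-code",
    "--enforce-eager",
]
print(" ".join(serve_cmd))
# subprocess.run(serve_cmd)  # uncomment to actually start the server
```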
2. Test with a client
```bash
curl http://<your-node-url>:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "Mistral-7B-Instruct",
"messages": [
{
"role": "user",
"content": "Explain Machine Learning to me in a nutshell."
}
],
"temperature": 0.15
}'
```
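The same request can be driven from Python with only the standard library; the sketch below mirrors the curl body above and assumes the server from step 1 is listening on localhost:8000 (`BASE_URL` is an assumption, point it at your own node):

```python
# Stdlib-only sketch of the curl request above (BASE_URL is an assumption;
# replace it with your node's address). The payload mirrors the curl -d body.
import json
from urllib import request

BASE_URL = "http://localhost:8000"  # assumed address of the vllm server

payload = {
    "model": "Mistral-7B-Instruct",  # must match --served-model-name
    "messages": [
        {"role": "user", "content": "Explain Machine Learning to me in a nutshell."}
    ],
    "temperature": 0.15,
}

def build_request(base_url: str, body: dict) -> request.Request:
    """Build the POST request for the OpenAI-compatible chat endpoint."""
    return request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_request(BASE_URL, payload)
print(req.full_url)  # http://localhost:8000/v1/chat/completions
# with request.urlopen(req) as resp:  # uncomment once the server is running
#     print(json.load(resp)["choices"][0]["message"]["content"])
```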
## result
<div align=center>
    <img src="./doc/results.png"/>
</div>
### Accuracy
DCU accuracy is consistent with GPU; inference framework: pytorch.
## Application Scenarios
### Algorithm Category
```python
# infer_mistral.py: offline inference via mistral_inference
import argparse

from mistral_inference.transformer import Transformer
from mistral_inference.generate import generate
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest

parser = argparse.ArgumentParser()
parser.add_argument("--user_prompt", type=str, default="Explain Machine Learning to me in a nutshell.")
parser.add_argument("--model_name_or_path", type=str, default="mistralai/Mistral-7B-Instruct-v0.3")
args = parser.parse_args()

# Load the v3 tokenizer and the model weights from the local checkpoint folder.
tokenizer = MistralTokenizer.from_file(f"{args.model_name_or_path}/tokenizer.model.v3")
model = Transformer.from_folder(args.model_name_or_path)

# Encode the chat request, generate greedily, and decode the reply.
completion_request = ChatCompletionRequest(messages=[UserMessage(content=args.user_prompt)])
tokens = tokenizer.encode_chat_completion(completion_request).tokens
out_tokens, _ = generate([tokens], model, max_tokens=64, temperature=0.0, eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id)
result = tokenizer.instruct_tokenizer.tokenizer.decode(out_tokens[0])
print(result)
```
```python
# infer_vllm.py: offline inference via vllm
import argparse

from vllm import LLM, SamplingParams

parser = argparse.ArgumentParser()
parser.add_argument("--user_prompt", type=str, default="Explain Machine Learning to me in a nutshell.")
parser.add_argument("--model_name_or_path", type=str, default="mistralai/Mistral-7B-Instruct-v0.3")
args = parser.parse_args()

sampling_params = SamplingParams(max_tokens=8192)

# To split the model across multiple devices, add e.g. `tensor_parallel_size=2`.
llm = LLM(model=args.model_name_or_path, tokenizer_mode="mistral", config_format="mistral", load_format="mistral")

messages = [
    {
        "role": "user",
        "content": args.user_prompt,
    },
]

outputs = llm.chat(messages, sampling_params=sampling_params)
print("output:", outputs[0].outputs[0].text)
```