# DeepSeek-R1-Distill

## Paper

`DeepSeek-R1`

* https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSeek_R1.pdf

## Model Architecture

The distilled models are built on three base model families: Llama 3.1, Llama 3.3, and Qwen2.5. All three are decoder-only architectures.

![alt text](readme_imgs/arch.png)

## Algorithm

The DeepSeek-R1-Distill models are obtained by supervised fine-tuning (SFT) of strong existing open-source models on high-quality data generated by `DeepSeek-R1`.

## Environment Setup

### Docker (Option 1)

Replace `<absolute_project_path>` with the absolute path of this project on the host, and `<image_id>` with the ID of the image pulled in the first command.

    docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.3.0-ubuntu22.04-dtk24.04.3-py3.10
    docker run --shm-size 500g --network=host --name=dpskv3 --privileged \
        --device=/dev/kfd --device=/dev/dri --group-add video \
        --cap-add=SYS_PTRACE --security-opt seccomp=unconfined \
        -v <absolute_project_path>:/home/ -v /opt/hyhal:/opt/hyhal:ro \
        -it <image_id> bash
    pip install https://download.sourcefind.cn:65024/directlink/4/lmslim/DAS1.3/lmslim-0.1.2+das.dtk24043-cp310-cp310-manylinux_2_28_x86_64.whl
    pip install https://download.sourcefind.cn:65024/directlink/4/vllm/DAS1.3/vllm-0.6.2+das.opt1.dtk24043-cp310-cp310-manylinux_2_28_x86_64.whl

### Dockerfile (Option 2)

Replace `<image_name>:<tag>` with a name and tag of your choice, and use the same value (or the resulting image ID) for `<image_id>`.

    docker build -t <image_name>:<tag> .
    docker run --shm-size 500g --network=host --name=dpskv3 --privileged \
        --device=/dev/kfd --device=/dev/dri --group-add video \
        --cap-add=SYS_PTRACE --security-opt seccomp=unconfined \
        -v <absolute_project_path>:/home/ -v /opt/hyhal:/opt/hyhal:ro \
        -it <image_id> bash
    pip install https://download.sourcefind.cn:65024/directlink/4/lmslim/DAS1.3/lmslim-0.1.2+das.dtk24043-cp310-cp310-manylinux_2_28_x86_64.whl
    pip install https://download.sourcefind.cn:65024/directlink/4/vllm/DAS1.3/vllm-0.6.2+das.opt1.dtk24043-cp310-cp310-manylinux_2_28_x86_64.whl

Environment variables:

    export ALLREDUCE_STREAM_WITH_COMPUTE=1
    export VLLM_NUMA_BIND=1
    export VLLM_RANK0_NUMA=0
    export VLLM_RANK1_NUMA=1
    export VLLM_RANK2_NUMA=2
    export VLLM_RANK3_NUMA=3
    export VLLM_RANK4_NUMA=4
    export VLLM_RANK5_NUMA=5
    export VLLM_RANK6_NUMA=6
    export VLLM_RANK7_NUMA=7

## Dataset

Training data must be generated with `DeepSeek-R1`. This project provides a sample dataset for testing; see `examples/toy.json`.

## Training

Training can be done with [LLaMA-Factory](https://developer.sourcefind.cn/codes/OpenDAS/llama-factory), installed as follows:

```bash
git clone http://developer.sourcefind.cn/codes/OpenDAS/llama-factory.git
cd llama-factory && pip install -e ".[torch,metrics]"
```

### SFT

deepseek_r1_distill.yaml

```yaml
# Single-node, N-card training config (adjust as needed)
model_name_or_path: /path/to/your/model

stage: sft
do_train: true
finetuning_type: full
deepspeed: examples/deepspeed/ds_z3_config.json

dataset: deepseek-r1_distill
template: qwen
cutoff_len: 2048
max_samples: 5000
overwrite_cache: true
preprocessing_num_workers: 16

output_dir: /path/to/save/checkpoints
logging_steps: 10
save_steps: 500
overwrite_output_dir: true

per_device_train_batch_size: 1
gradient_accumulation_steps: 4
learning_rate: 1.0e-4
num_train_epochs: 3.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: true
ddp_timeout: 1800

val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 500
```

```bash
cd llama-factory
llamafactory-cli train /path/to/deepseek_r1_distill.yaml
```

## Inference

### vLLM Service

Start the server:

```bash
vllm serve /path/to/distill_model --tensor-parallel-size 2 --max-model-len 32768 --enforce-eager
```

Send a completion request:

```bash
curl http://localhost:8000/v1/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "model_id",
        "prompt": "your prompt",
        "max_tokens": 512,
        "temperature": 0
    }'
```

## Result

The request below asks, in Chinese: "Classes A and B have 98 students in total, and class A has 6 more students than class B. How many students are in each class?"

```bash
curl http://localhost:8000/v1/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "/home/modelzoo/DeepSeek-R1-Distill-Qwen-14B/",
        "prompt": "甲乙两班共有学生98人,甲班比乙班多6人,求两班各有多少人?",
        "max_tokens": 300,
        "temperature": 0
    }'
```

The server response (the final boxed answer is 52 students in class A and 46 in class B):

```json
{"id":"cmpl-5473237b46054a98ba27906a4b099e33","object":"text_completion","created":1737515343,"model":"/home/modelzoo/DeepSeek-R1-Distill-Qwen-14B/","choices":[{"index":0,"text":"(用方程解)\n\n首先,设乙班有x人,那么甲班就有x + 6人。\n\n根据总人数,可以列出方程:x + (x + 6) = 98。\n\n解这个方程,得到x = 41。\n\n因此,乙班有41人,甲班有47人。\n\n\n**解答:**\n\n设乙班有 \\( x \\) 人,则甲班有 \\( x + 6 \\) 人。\n\n根据题意,两班共有学生98人,可以列出方程:\n\n\\[\nx + (x + 6) = 98\n\\]\n\n解方程:\n\n\\[\n2x + 6 = 98\n\\]\n\n\\[\n2x = 98 - 6\n\\]\n\n\\[\n2x = 92\n\\]\n\n\\[\nx = 46\n\\]\n\n因此,乙班有46人,甲班有:\n\n\\[\nx + 6 = 46 + 6 = 52\n\\]\n\n**答案:**\n\n甲班有 \\(\\boxed{52}\\) 人,乙班有 \\(\\boxed{46}\\) 人。","logprobs":null,"finish_reason":"stop","stop_reason":null,"prompt_logprobs":null}],"usage":{"prompt_tokens":27,"total_tokens":285,"completion_tokens":258}}
```

### Accuracy

Consistent with results on NVIDIA GPUs.

## Application Scenarios

### Algorithm Category

`Conversational Q&A`

### Key Application Industries

`E-commerce, education, broadcast media`

## Pretrained Weights

|Model|Download|
|:---:|:---:|
|DeepSeek-R1-Distill-Qwen-1.5B| [huggingface](https://hf-mirror.com/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) \| [SCNet] |
|DeepSeek-R1-Distill-Qwen-7B| [huggingface](https://hf-mirror.com/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B) \| [SCNet] |
|DeepSeek-R1-Distill-Llama-8B| [huggingface](https://hf-mirror.com/deepseek-ai/DeepSeek-R1-Distill-Llama-8B) \| [SCNet] |
|DeepSeek-R1-Distill-Qwen-14B| [huggingface](https://hf-mirror.com/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B) \| [SCNet] |
|DeepSeek-R1-Distill-Qwen-32B| [huggingface](https://hf-mirror.com/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B) \| [SCNet] |
|DeepSeek-R1-Distill-Llama-70B| [huggingface](https://hf-mirror.com/deepseek-ai/DeepSeek-R1-Distill-Llama-70B) \| [SCNet] |

## Source Repository and Issue Feedback

* https://developer.sourcefind.cn/codes/modelzoo/deepseek-r1-distill_vllm

## References

* https://github.com/deepseek-ai/DeepSeek-R1
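
## Appendix: Python Client Sketch

The `curl` requests in the inference section can also be issued from Python. The following is a minimal sketch, not part of this repository: it assumes the server started by `vllm serve` is reachable at `http://localhost:8000` and returns a body shaped like the sample response shown in this README (`choices[0].text`). The function names `build_payload`, `extract_text`, and `complete` are hypothetical helpers introduced only for illustration, and only the Python standard library is used.

```python
import json
import urllib.request


def build_payload(model, prompt, max_tokens=512, temperature=0.0):
    """Build the JSON body for POST /v1/completions (same fields as the curl example)."""
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def extract_text(response):
    """Return the completion text of the first choice in a parsed response."""
    return response["choices"][0]["text"]


def complete(base_url, model, prompt, **kwargs):
    """Send one completion request and return the generated text.

    Assumes a server is already running at base_url; raises urllib errors otherwise.
    """
    body = json.dumps(build_payload(model, prompt, **kwargs)).encode("utf-8")
    request = urllib.request.Request(
        base_url.rstrip("/") + "/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=300) as reply:
        return extract_text(json.loads(reply.read().decode("utf-8")))


# Example call (requires a running server; the model path is the one used in this README):
# print(complete("http://localhost:8000",
#                "/home/modelzoo/DeepSeek-R1-Distill-Qwen-14B/",
#                "your prompt", max_tokens=300))
```

Keeping payload construction and response parsing in separate functions makes both easy to check without a live server; only `complete` touches the network.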