"third_party/METIS/GKlib/README.md" did not exist on "f2c80b440e80226441dc6c11a95ade10defaaf11"
Commit a52e53db authored by chenzk's avatar chenzk
Browse files

v1.0

parents
Pipeline #2680 canceled with stages
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2024, Qwen Team
# This file is distributed under the same license as the Qwen package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2024.
#
msgid ""
msgstr ""
"Project-Id-Version: Qwen \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-04-28 19:42+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../Qwen/source/framework/qwen_agent.rst:2
#: aaed24d3edd64e6ab1f20188f3d5ba24
msgid "Qwen-Agent"
msgstr "Qwen-Agent"
#: ../../Qwen/source/framework/qwen_agent.rst:5
#: 1cbbb8d342f243c58e0d66a3e44daac8
msgid "To be updated for Qwen3."
msgstr "仍需为Qwen3更新。"
#: ../../Qwen/source/framework/qwen_agent.rst:7
#: 3e1dbee121bc4a6c91a26618e27c0d86
msgid "`Qwen-Agent <https://github.com/QwenLM/Qwen-Agent>`__ is a framework for developing LLM applications based on the instruction following, tool usage, planning, and memory capabilities of Qwen. It also comes with example applications such as Browser Assistant, Code Interpreter, and Custom Assistant."
msgstr "`Qwen-Agent <https://github.com/QwenLM/Qwen-Agent>`__ 是一个基于 Qwen 的指令跟随、工具使用、计划和记忆能力来开发 LLM 应用程序的框架。它还附带了一些示例应用程序,例如浏览器助手、代码解释器和自定义助手。"
#: ../../Qwen/source/framework/qwen_agent.rst:14
#: f180730da09640169fb93950a2e8cb5f
msgid "Installation"
msgstr "安装"
#: ../../Qwen/source/framework/qwen_agent.rst:23
#: 89f39ac4160d49fba7f9d52dce6527c3
msgid "Developing Your Own Agent"
msgstr "开发您自己的智能体"
#: ../../Qwen/source/framework/qwen_agent.rst:25
#: 307456721ed7469eb7b8f636483188f4
msgid "Qwen-Agent provides atomic components such as LLMs and prompts, as well as high-level components such as Agents. The example below uses the Assistant component as an illustration, demonstrating how to add custom tools and quickly develop an agent that uses tools."
msgstr "Qwen-Agent 提供包括语言模型和提示词等原子级组件,及智能体等高级组件在内的多种组件。以下示例选取助理组件进行展示,阐述了如何整合自定义工具以及如何迅速开发出一个能够应用这些工具的代理程序。"
#: ../../Qwen/source/framework/qwen_agent.rst:94
#: 13034806dd414e19a5f53ece31d0fa16
msgid "The framework also provides more atomic components for developers to combine. For additional showcases, please refer to `examples <https://github.com/QwenLM/Qwen-Agent/tree/main/examples>`__."
msgstr "该框架还为开发者提供了更多的原子组件以供组合使用。欲了解更多示例,请参见 `examples <https://github.com/QwenLM/Qwen-Agent/tree/main/examples>`__。"
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2024, Qwen Team
# This file is distributed under the same license as the Qwen package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2024.
#
msgid ""
msgstr ""
"Project-Id-Version: Qwen \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-04-28 19:42+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:2
#: 6d4d3bb3020f4e4d8dba0ca5778cdcae
msgid "Performance of Quantized Models"
msgstr "量化模型效果评估"
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:5
#: 3a541cd8cba74edf9b06b46f59eaaf38
msgid "To be updated for Qwen3."
msgstr "仍需为Qwen3更新。"
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:7
#: 3a95fc299de141dea4fc729ef907ce17
msgid "This section reports the generation performance of quantized models (including GPTQ and AWQ) of the Qwen2 series. Specifically, we report:"
msgstr "本部分介绍Qwen2量化模型(包括GPTQ与AWQ量化方案)的效果评估,有以下数据集"
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:11
#: 9386a3b95eb340568185da78224a1ccd
msgid "MMLU (Accuracy)"
msgstr "MMLU (准确率)"
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:12
#: 3cd93b881c90488895c61298104bc7fb
msgid "C-Eval (Accuracy)"
msgstr "C-Eval (准确率)"
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:13
#: 7ac4bb515b0a49699d4eb95fc433bb51
msgid "IFEval (Strict Prompt-Level Accuracy)"
msgstr "IFEval (提示词级的严格准确率,Strict Prompt-Level Accuracy)"
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:15
#: 08e3f35820344c93877618815650b866
msgid "We use greedy decoding in evaluating all models."
msgstr "所有模型均使用贪心解码。"
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:18
#: 9aec40221219455d8fc4e473e5acf09c
msgid "Quantization"
msgstr "量化模型"
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:18
#: 93f274f4751f445d85f04937b25c7f7d
msgid "Average"
msgstr "平均"
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:18
#: 776612f5dd4a40d98976bdfe4896508c
msgid "MMLU"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:18
#: f6e8014116cf4179a934d601ee61d04d
msgid "C-Eval"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:18
#: 0c40e96c4a3b4cdeaaf1a95ff1aa8f98
msgid "IFEval"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:20
#: 773ccb0f10bd4cf690e819af51c40e76
msgid "Qwen2-72B-Instruct"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:20
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:28
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:36
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:44
#: 71e180f75e624b738d56ec2a1fad253c 7ebe73a2e96445c4bb733845c3190240
#: bd5a3b8861d646fa9e8d8bc51bb1b80c cc79a78b34f94c18b7bdaf1bfcc8824d
msgid "BF16"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:20
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:22
#: 08517ffc3e6e4ceb812c3d8710307266 2e879d3d1fef4c878b097550d745e7ae
msgid "81.3"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:20
#: f795aa42cf7d42ccb5a573a5f44be79f
msgid "82.3"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:20
#: 01c54f3da3454e178a07a9f88ed5302b
msgid "83.8"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:20
#: 7651df5ccaa14b11a3a89827a5265ae8
msgid "77.6"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:22
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:30
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:38
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:46
#: 04de04c9ff3640f096301e76fdd291de 301aa8e494ff4fe4aefcc8cfb7a4c065
#: d395be41cf144318a1faeccc6f6965c8 ec513d10a75d44b8bd134287a57b5cdd
msgid "GPTQ-Int8"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:22
#: 411166db878d4d8f8515e9f5d78a651c
msgid "80.7"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:22
#: e63ce8a2f1cc4cec9b52521015e2aebe
msgid "83.4"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:22
#: e6be6c30e0d740d39c6c8807e2d4f5f8
msgid "77.5"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:24
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:32
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:40
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:48
#: 21720ff324814b2b865f37a40c3586b5 4644a49bcdfd457b84eb5b2771177d78
#: 560dcb4bfa6e45088faefdb504d629a5 7044a0d2dd6945138ea385287ab5bf33
msgid "GPTQ-Int4"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:24
#: 1cb55cd40b3c484d8213c15375b2ad68
msgid "81.2"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:24
#: 32b889d9ef014f2ab6be6881e20d40ae
msgid "80.8"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:24
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:26
#: ba86de9eb27b40e0ba6a57580aed89c3 eed2e99c0edc426e81ec24e961fe971e
msgid "83.9"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:24
#: ee3a3132082048d5b79721fa84f6f816
msgid "78.9"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:26
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:34
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:42
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:50
#: 632f832fc1f249fa92764538b698550d 8c7ccf4f75f44b27bb1b5aac544836cb
#: b473937c2be94c3490483bb5a820e2fe bc1abd77dd27412992d21bda1831a2a8
msgid "AWQ"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:26
#: 2711a3f907224e51ba30818b2e730a30
msgid "80.4"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:26
#: ca9624c0258b425ba53f024b086c173a
msgid "80.5"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:26
#: 2f4b57d4394c4cb187407145ce8d5f1e
msgid "76.9"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:28
#: 48cc75ed7bf04778b327c7b03d418e37
msgid "Qwen2-7B-Instruct"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:28
#: 75182905b74a41099ff859fb86752e99
msgid "66.9"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:28
#: 80cda712e9dc482fac24952d3bb27b28
msgid "70.5"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:28
#: 0701d66bc3084aef8937e4b687705f37
msgid "77.2"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:28
#: 8efb5c133644420c808dfd78f8fcde2f
msgid "53.1"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:30
#: 2076e02516bd4ff1856bc12a8d6bd320
msgid "66.2"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:30
#: 588f4ad13845491d9589ea094265d532
msgid "69.1"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:30
#: 0c79963a231a402eb6db1671e851be38
msgid "76.7"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:30
#: 5d525163672f456289990489459466ae
msgid "52.9"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:32
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:34
#: 9283ca6491194b59a5edf57228f9b5af a4123c0691a442f6850ae25615c108af
msgid "64.1"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:32
#: 9e7ffb49aac34129894b0582c0d8aba1
msgid "67.8"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:32
#: 7c2fc310e5764b7fbf6034ffd3a5d26d
msgid "75.2"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:32
#: 33e6b6e590a64c08adccf0bb161c1046
msgid "49.4"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:34
#: b3cbe7665bdf4f4388f015fb6606540e
msgid "67.4"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:34
#: a47d3b52e80249f986c4339b9d3fff10
msgid "73.6"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:34
#: d76543cff2df434185fbe51712024679
msgid "51.4"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:36
#: cee2c965036d41c6a93ffbf9a9788e4b
msgid "Qwen2-1.5B-Instruct"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:36
#: 8c9d1cd8fb5a4d75b85d0edcb9ed69df
msgid "48.4"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:36
#: f5e05b0942a24e2b9cac753932ad51c4
msgid "52.4"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:36
#: c6f81ec529004598aa14c55228ff9538
msgid "63.8"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:36
#: 5b2b4092d04f4d02a56bd0df5807e2c5
msgid "29.0"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:38
#: 08d2bf82e83f4a889d622c72c1e1b3b2
msgid "48.1"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:38
#: 3d8ea738153f467ba55d50e6bf0f84c0
msgid "53.0"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:38
#: 8755d6c4c1e64cd38122f08a92bd90ca
msgid "62.5"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:38
#: 1c403dbb3692472a88706cb4b4a1f0f3
msgid "28.8"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:40
#: f3f43ea77edc4ff0969e2466e6fe13e1
msgid "45.0"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:40
#: 9d070c4b9f3e4fceb27b29ecdf90eb41
msgid "50.7"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:40
#: 24ff991704c440deb34b92512f89c371
msgid "57.4"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:40
#: b4645b7317a44cb795fc4190149dd0e0
msgid "27.0"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:42
#: eeee44d1d65647569999de94e72c00cb
msgid "46.5"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:42
#: 41630bee9142494c801083cd5d213dc0
msgid "51.6"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:42
#: 762395735fb34bccbc4d057968bbfbf1
msgid "58.1"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:42
#: f5915835bcb24051bebed452fc398728
msgid "29.9"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:44
#: 39108e2a66444ca780a720f115251308
msgid "Qwen2-0.5B-Instruct"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:44
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:50
#: 2795adace57c401cb8bacc00082dfd53 a59271d53e434d17a8a0a19529158f2c
msgid "34.4"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:44
#: c93982789e4e453eb5a02d64f02cb74f
msgid "37.9"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:44
#: 213dfd43b2254a2caec1d4b1d231ed55
msgid "45.2"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:44
#: 11de22e2a04a4c04b0b91d09d028b853
msgid "20.0"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:46
#: 84b6570bcc8d4c6598336d5bc9b9d36a
msgid "32.6"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:46
#: b79e88232d114f43a179dcc5b0477c97
msgid "35.6"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:46
#: 1166b675e1e64e18a82c3219f321e248
msgid "43.9"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:46
#: fdf340d39b074778b55d36f477f8dc0a
msgid "18.1"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:48
#: ed930e1b13dd4c5caf80b2a180a1bcc3
msgid "29.7"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:48
#: c3d5617389634f7e96c66b4f869379a9
msgid "33.0"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:48
#: 4573b471c48d4028ad6fb378e75f40aa
msgid "39.2"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:48
#: c867c42e916f493b9715b1adf656ddcb
msgid "16.8"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:50
#: 20d4c89c335648bb93f07ebfb8ce9fce
msgid "31.1"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:50
#: 25400aeaf79d49cb914ffa5ff26bfe03
msgid "42.1"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:50
#: d15e246b65b0427d970b78deffd8c2bc
msgid "16.7"
msgstr ""
# Copyright (C) 2024, Qwen Team, Alibaba Group.
# This file is distributed under the same license as the Qwen package.
#
msgid ""
msgstr ""
"Project-Id-Version: Qwen \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-04-28 19:42+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../Qwen/source/getting_started/quickstart.md:1
#: 595827c46f2e4884b69954cf22e0e957
msgid "Quickstart"
msgstr "快速开始"
#: ../../Qwen/source/getting_started/quickstart.md:3
#: 725288359306417a943352cef10f831c
msgid "This guide helps you quickly start using Qwen3. We provide examples of [Hugging Face Transformers](https://github.com/huggingface/transformers) as well as [ModelScope](https://github.com/modelscope/modelscope), and [vLLM](https://github.com/vllm-project/vllm) for deployment."
msgstr "本指南帮助您快速上手 Qwen3 的使用,并提供了如下示例: [Hugging Face Transformers](https://github.com/huggingface/transformers) 以及 [ModelScope](https://github.com/modelscope/modelscope) 和 [vLLM](https://github.com/vllm-project/vllm>) 在部署时的应用实例。"
#: ../../Qwen/source/getting_started/quickstart.md:6
#: 6bfc020002af4b4eaad8adf3902e30ac
msgid "You can find Qwen3 models in [the Qwen3 collection](https://huggingface.co/collections/Qwen/qwen3-67dd247413f0e2e4f653967f) at HuggingFace Hub and [the Qwen3 collection](https://www.modelscope.cn/collections/Qwen3-9743180bdc6b48) at ModelScope."
msgstr "你可以在 HuggingFace Hub 的 [Qwen3 collection](https://huggingface.co/collections/Qwen/qwen3-67dd247413f0e2e4f653967f) 或 ModelScope 的 [Qwen3 collection](https://www.modelscope.cn/collections/Qwen3-9743180bdc6b48) 中寻找 Qwen3 模型。"
#: ../../Qwen/source/getting_started/quickstart.md:8
#: 1dbf0833f8a5407b8d00056d029eb9d8
msgid "Transformers"
msgstr "Transformers"
#: ../../Qwen/source/getting_started/quickstart.md:10
#: cbe2f022b0b54729a1d3627cb19ad99f
msgid "To get a quick start with Qwen3, you can try the inference with `transformers` first. Make sure that you have installed `transformers>=4.51.0`. We advise you to use Python 3.10 or higher, and PyTorch 2.6 or higher."
msgstr "要快速上手 Qwen3 ,我们建议您首先尝试使用 `transformers` 进行推理。请确保已安装了 `transformers>=4.51.0` 版本。我们建议您使用 Python 3.10 或以上版本, PyTorch 2.6 或以上版本。"
#: ../../Qwen/source/getting_started/quickstart.md:14
#: bd305d4f44484f75bfe0c02a9eda68c4
msgid "The following is a very simple code snippet showing how to run Qwen3-8B:"
msgstr "以下是一个非常简单的代码片段示例,展示如何运行 Qwen3 模型:"
#: ../../Qwen/source/getting_started/quickstart.md:63
#: 0bb48ceb71854514be78497721308702
msgid "Qwen3 will think before respond, similar to QwQ models. This means the model will use its reasoning abilities to enhance the quality of generated responses. The model will first generate thinking content wrapped in a `<think>...</think>` block, followed by the final response."
msgstr "Qwen3 将在实际回复前思考,与 QwQ 模型类似。这意味着模型将运用其推理能力来提升生成回复的质量。模型会首先生成包含在 `<think>...</think>` 块中的思考内容,随后给出最终回复。"
#: ../../Qwen/source/getting_started/quickstart.md:67
#: d110ccfe1d834169992f03bcf932e250
msgid "Hard Switch: To strictly disable the model's thinking behavior, aligning its functionality with the previous Qwen2.5-Instruct models, you can set `enable_thinking=False` when formatting the text."
msgstr "硬开关:为了严格禁用模型的思考行为,使其功能与之前的Qwen2.5-Instruct模型保持一致,您可以在格式化文本时设置`enable_thinking=False`。"
#: ../../Qwen/source/getting_started/quickstart.md:77
#: 4bceeb7e0179470f88620507ade7915b
msgid "It can be particularly useful in scenarios where disabling thinking is essential for enhancing efficiency."
msgstr "在某些需要通过禁用思考来提升效率的场景中,这一功能尤其有用。"
#: ../../Qwen/source/getting_started/quickstart.md:79
#: 16b4b43b7a7b43a698118e17d778a6dd
msgid "Soft Switch: Qwen3 also understands the user's instruction on its thinking behaviour, in particular, the soft switch `/think` and `/no_think`. You can add them to user prompts or system messages to switch the model's thinking mode from turn to turn. The model will follow the most recent instruction in multi-turn conversations."
msgstr "软开关:Qwen3 还能够理解用户对其思考行为的指令,特别是软开关 `/think` 和 `/no_think`。您可以将这些指令添加到用户 (user) 或系统 (system) 消息中,以在对话轮次之间灵活切换模型的思考模式。在多轮对话中,模型将遵循最近的指令。"
#: ../../Qwen/source/getting_started/quickstart.md:85
#: 518d0395430f4920973e6da2753c1507
msgid "For thinking mode, use Temperature=0.6, TopP=0.95, TopK=20, and MinP=0 (the default setting in `generation_config.json`). DO NOT use greedy decoding, as it can lead to performance degradation and endless repetitions. For more detailed guidance, please refer to the Best Practices section."
msgstr "对于思考模式,使用 Temperature=0.6,TopP=0.95,TopK=20,以及 MinP=0(`generation_config.json` 中的默认设置)。不要使用贪婪解码,因为它可能导致性能下降和无尽的重复。更多详细指导,请参阅最佳实践部分。"
#: ../../Qwen/source/getting_started/quickstart.md:89
#: 80bf598dfdf048a791d05c6a21ccd425
msgid "For non-thinking mode, we suggest using Temperature=0.7, TopP=0.8, TopK=20, and MinP=0."
msgstr "对于非思考模式,我们建议使用 Temperature=0.7,TopP=0.8,TopK=20,以及 MinP=0。"
#: ../../Qwen/source/getting_started/quickstart.md:93
#: 7a585706796a4db9a9f34ec1241135b5
msgid "ModelScope"
msgstr "魔搭 (ModelScope)"
#: ../../Qwen/source/getting_started/quickstart.md:95
#: fbf6acee0f534a3d9197221626ce79e4
msgid "To tackle with downloading issues, we advise you to try [ModelScope](https://github.com/modelscope/modelscope). Before starting, you need to install `modelscope` with `pip`."
msgstr "为了解决下载问题,我们建议您尝试从 [ModelScope](https://github.com/modelscope/modelscope) 进行下载。开始之前,需要使用 `pip` 安装 `modelscope` 。"
#: ../../Qwen/source/getting_started/quickstart.md:98
#: e29964895f744793a18058022ad578b8
msgid "`modelscope` adopts a programmatic interface similar (but not identical) to `transformers`. For basic usage, you can simply change the first line of code above to the following:"
msgstr "`modelscope` 采用了与 `transformers` 类似(但不完全一致)的编程接口。对于基础使用,仅需将上面代码第一行做如下修改:"
#: ../../Qwen/source/getting_started/quickstart.md:105
#: 2686cab2a6f54fe7ae813a0aeeb04d14
msgid "For more information, please refer to [the documentation of `modelscope`](https://www.modelscope.cn/docs)."
msgstr "欲获取更多信息,请参考 [`modelscope` 文档](https://www.modelscope.cn/docs)。"
#: ../../Qwen/source/getting_started/quickstart.md:107
#: ce23fee238f8458599cc4d7e16a2e509
msgid "vLLM"
msgstr ""
#: ../../Qwen/source/getting_started/quickstart.md:109
#: cf0e10035e954a328775205ff39e9687
msgid "To deploy Qwen3, we advise you to use vLLM. vLLM is a fast and easy-to-use framework for LLM inference and serving. In the following, we demonstrate how to build a OpenAI-API compatible API service with vLLM."
msgstr "要部署 Qwen3 ,我们建议您使用 vLLM 。 vLLM 是一个用于 LLM 推理和服务的快速且易于使用的框架。以下,我们将展示如何使用 vLLM 构建一个与 OpenAI 兼容的 API 服务。"
#: ../../Qwen/source/getting_started/quickstart.md:113
#: 925651cdb57d478884f151b52834ab3c
msgid "First, make sure you have installed `vllm>=0.8.5`."
msgstr "首先,确保你已经安装 `vLLM>=0.8.5` :"
#: ../../Qwen/source/getting_started/quickstart.md:115
#: 4cb0c9b830984fafa3f5ee2e74dea6dc
msgid "Run the following code to build up a vLLM service. Here we take Qwen3-8B as an example:"
msgstr "运行以下代码以构建 vLLM 服务。此处我们以 Qwen3-8B 为例:"
#: ../../Qwen/source/getting_started/quickstart.md:122
#: c7b58160d10d43a2bb6e63572dbeff46
msgid "Then, you can use the [create chat interface](https://platform.openai.com/docs/api-reference/chat/completions/create) to communicate with Qwen:"
msgstr "然后,可以使用 [\"create chat\" interface](https://platform.openai.com/docs/api-reference/chat/completions/create>) 来与 Qwen 进行交流:"
#: ../../Qwen/source/getting_started/quickstart.md
#: 8f4c1e3692a34137ad9fbf6d7a50969c c685b92ca0ea49c0b3925b24cd43317c
msgid "curl"
msgstr ""
#: ../../Qwen/source/getting_started/quickstart.md
#: 147be07b6f3141c08f8c707a9f06403c ffc3d81775264a00ad0d7bcb85ff6caf
msgid "Python"
msgstr ""
#: ../../Qwen/source/getting_started/quickstart.md:142
#: ../../Qwen/source/getting_started/quickstart.md:192
#: 9a1026d8cf10458b8a3e717e105e8d5e ed7621681c36472a90b4be9c1fe98355
msgid "You can use the API client with the `openai` Python SDK as shown below:"
msgstr "您可以按照下面所示的方式,使用 `openai` Python SDK中的客户端:"
#: ../../Qwen/source/getting_started/quickstart.md:169
#: a5ae1f193b044cb982e5ea4d98b30afb
msgid "While the soft switch is always available, the hard switch is also availabe in vLLM through the following configuration to the API call. To disable thinking, use"
msgstr "虽然软开关始终可用,但硬开关也可以通过以下 API 调用配置在 vLLM 中使用。要禁用思考,请使用"
#: ../../Qwen/source/getting_started/quickstart.md:221
#: a200dc6f700d40f89e22d7745a5f01f0
msgid "Next Step"
msgstr "下一步"
#: ../../Qwen/source/getting_started/quickstart.md:223
#: e22d4b679b36490fb4877ae01bfb515a
msgid "Now, you can have fun with Qwen3 models. Would love to know more about its usage? Feel free to check other documents in this documentation."
msgstr "现在,您可以尽情探索 Qwen3 模型的各种用途。若想了解更多,请随时查阅本文档中的其他内容。"
#~ msgid "Hugging Face Transformers & ModelScope"
#~ msgstr ""
#~ msgid "Install with `pip`:"
#~ msgstr "使用 `pip` 安装:"
#~ msgid "Install with `conda`:"
#~ msgstr "使用 `conda` 安装:"
#~ msgid "Install from source:"
#~ msgstr "从源代码安装:"
#~ msgid "As you can see, it's just standard usage for casual LMs in `transformers`!"
#~ msgstr "如您所见,与 `transformers` 的常规使用方式无二!"
#~ msgid "Streaming Generation"
#~ msgstr "流式生成"
#~ msgid "Streaming mode for model chat is simple with the help of `TextStreamer`. Below we show you an example of how to use it:"
#~ msgstr "借助 `TextStreamer` , 模型生成的流式模式变得非常简单。下面我们将展示一个如何使用它的示例:"
#~ msgid "It will print the text to the console or the terminal as being generated."
#~ msgstr "命令行或终端中将屏显生成的文本。"
#~ msgid "vLLM for Deployment"
#~ msgstr "使用vLLM部署"
#~ msgid "with `vllm>=0.5.3`, you can also use"
#~ msgstr "如 `vllm>=0.5.3` ,也可以如下启动:"
#~ msgid "For more information, please refer to [the documentation of `vllm`](https://docs.vllm.ai/en/stable/)."
#~ msgstr "欲获取更多信息,请参考 [`vllm` 文档](https://docs.vllm.ai/en/stable/)。"
# Copyright (C) 2024, Qwen Team, Alibaba Group.
# This file is distributed under the same license as the Qwen package.
#
msgid ""
msgstr ""
"Project-Id-Version: Qwen \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-04-28 19:42+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../Qwen/source/index.rst:34
msgid "Getting Started"
msgstr "快速开始"
#: ../../Qwen/source/index.rst:44
msgid "Inference"
msgstr "推理"
#: ../../Qwen/source/index.rst:51
msgid "Run Locally"
msgstr "本地运行"
#: ../../Qwen/source/index.rst:60
msgid "Deployment"
msgstr "部署"
#: ../../Qwen/source/index.rst:71
msgid "Quantization"
msgstr "量化"
#: ../../Qwen/source/index.rst:80
msgid "Training"
msgstr "训练"
#: ../../Qwen/source/index.rst:87
msgid "Framework"
msgstr "框架"
#: ../../Qwen/source/index.rst:2 6e52d3a497924f828d4c6b9dd59370d5
msgid "Welcome to Qwen!"
msgstr "欢迎来到Qwen"
#: ../../Qwen/source/index.rst:4 235805a6d4a34184821c0f4f81020ef1
msgid "Qwen3"
msgstr ""
#: ../../Qwen/source/index.rst:11 b8a3aa3f31594232959a08d89e9dc7db
msgid "Qwen is the large language model and large multimodal model series of the Qwen Team, Alibaba Group. Both language models and multimodal models are pretrained on large-scale multilingual and multimodal data and post-trained on quality data for aligning to human preferences. Qwen is capable of natural language understanding, text generation, vision understanding, audio understanding, tool use, role play, playing as AI agent, etc."
msgstr "Qwen是阿里巴巴集团Qwen团队研发的大语言模型和大型多模态模型系列。无论是语言模型还是多模态模型,均在大规模多语言和多模态数据上进行预训练,并通过高质量数据进行后期微调以贴近人类偏好。Qwen具备自然语言理解、文本生成、视觉理解、音频理解、工具使用、角色扮演、作为AI Agent进行互动等多种能力。"
#: ../../Qwen/source/index.rst:14 8735c67355064a97b2793b721a701b21
msgid "The latest version, Qwen3, has the following features:"
msgstr "最新版本Qwen3有以下特点:"
#: ../../Qwen/source/index.rst:16 1956d75084244379aad9503fcc572f00
msgid "**Dense and Mixture-of-Experts (MoE) models**, available in 0.6B, 1.7B, 4B, 8B, 14B, 32B and 30B-A3B, 235B-A22B."
msgstr "**全尺寸稠密与混合专家模型**:0.6B, 1.7B, 4B, 8B, 14B, 32B and 30B-A3B, 235B-A22B"
#: ../../Qwen/source/index.rst:17 1fdf12161cd14663b67b2c08f9219ddb
msgid "**Seamless switching between thinking mode** (for complex logical reasoning, math, and coding) and **non-thinking mode** (for efficient, general-purpose chat) **within a single model**, ensuring optimal performance across various scenarios."
msgstr "支持在**思考模式**(用于复杂逻辑推理、数学和编码)和 **非思考模式** (用于高效通用对话)之间**无缝切换**,确保在各种场景下的最佳性能。"
#: ../../Qwen/source/index.rst:18 189ff2a03ad249ef88202c34e9f8aa86
msgid "**Significantly enhancement in reasoning capabilities**, surpassing previous QwQ (in thinking mode) and Qwen2.5 instruct models (in non-thinking mode) on mathematics, code generation, and commonsense logical reasoning."
msgstr "**显著增强的推理能力**,在数学、代码生成和常识逻辑推理方面超越了之前的 QwQ(在思考模式下)和 Qwen2.5 指令模型(在非思考模式下)。"
#: ../../Qwen/source/index.rst:19 64ebcda0381148cb8edf8d92b49469ea
msgid "**Superior human preference alignment**, excelling in creative writing, role-playing, multi-turn dialogues, and instruction following, to deliver a more natural, engaging, and immersive conversational experience."
msgstr "**卓越的人类偏好对齐**,在创意写作、角色扮演、多轮对话和指令跟随方面表现出色,提供更自然、更吸引人和更具沉浸感的对话体验。"
#: ../../Qwen/source/index.rst:20 ec0ebb91f1ed491f8672aefef6307d85
msgid "**Expertise in agent capabilities**, enabling precise integration with external tools in both thinking and unthinking modes and achieving leading performance among open-source models in complex agent-based tasks."
msgstr "**擅长智能体能力**,可以在思考和非思考模式下精确集成外部工具,在复杂的基于代理的任务中在开源模型中表现领先。"
#: ../../Qwen/source/index.rst:21 526b161edf284e1b913aabc7e7fcc77c
msgid "**Support of 100+ languages and dialects** with strong capabilities for **multilingual instruction following** and **translation**."
msgstr "**支持 100 多种语言和方言**,具有强大的多语言理解、推理、指令跟随和生成能力。"
#: ../../Qwen/source/index.rst:23 79ed3f0e7da043bb8b53f510ed244814
msgid "For more information, please visit our:"
msgstr "想了解更多信息,欢迎访问:"
#: ../../Qwen/source/index.rst:25 b2e579ae57de4d2985ab1c350fdf2458
msgid "`Blog <https://qwenlm.github.io/>`__"
msgstr "`博客 <https://qwenlm.github.io/>`__"
#: ../../Qwen/source/index.rst:26 406389fe90064e879bd28665a021ee7e
msgid "`GitHub <https://github.com/QwenLM>`__"
msgstr "`GitHub <https://github.com/QwenLM>`__"
#: ../../Qwen/source/index.rst:27 714c64df6aed4e608571de0155199fef
msgid "`Hugging Face <https://huggingface.co/Qwen>`__"
msgstr "`Hugging Face <https://huggingface.co/Qwen>`__"
#: ../../Qwen/source/index.rst:28 214e12e0b1c04b268582b2c46d22334d
msgid "`ModelScope <https://modelscope.cn/organization/qwen>`__"
msgstr "`ModelScope <https://modelscope.cn/organization/qwen>`__"
#: ../../Qwen/source/index.rst:29 9c64e461dc3a440ab92d94887fe3d2d8
msgid "`Qwen3 Collection <https://huggingface.co/collections/Qwen/qwen3-67dd247413f0e2e4f653967f>`__"
msgstr ""
#: ../../Qwen/source/index.rst:31 c6056edc8a3a4a12bd3a75eeb210f7a2
msgid "Join our community by joining our `Discord <https://discord.gg/yPEP2vHTu4>`__ and `WeChat <https://github.com/QwenLM/Qwen/blob/main/assets/wechat.png>`__ group. We are looking forward to seeing you there!"
msgstr "加入社区,加入 `Discord <https://discord.gg/yPEP2vHTu4>`__ 和 `微信群 <https://github.com/QwenLM/Qwen/blob/main/assets/wechat.png>`__ 。很期待见到你们!"
#~ msgid "Web UI"
#~ msgstr "Web UI"
#~ msgid "Benchmark"
#~ msgstr "评测"
#~ msgid "Qwen2.5"
#~ msgstr ""
#~ msgid "Dense, easy-to-use, decoder-only language models, available in **0.5B**, **1.5B**, **3B**, **7B**, **14B**, **32B**, and **72B** sizes, and base and instruct variants."
#~ msgstr "易于使用的仅解码器稠密语言模型,提供 **0.5B** 、**1.5B** 、**3B** 、**7B** 、**14B** 、**32B** 和 **72B** 共7种参数规模的模型,并且有基模型和指令微调模型两种变体(其中“ B ”表示“十亿”, 72B 即为 720 亿)"
#~ msgid "Pretrained on our latest large-scale dataset, encompassing up to **18T** tokens."
#~ msgstr "利用我们最新的数据集进行预训练,包含多达 18T tokens (其中“ T ”表示“万亿”, 18T 即为 18 万亿)"
#~ msgid "Significant improvements in instruction following, generating long texts (over 8K tokens), understanding structured data (e.g, tables), and generating structured outputs especially JSON."
#~ msgstr "在遵循指令、生成长文本(超过 8K tokens )、理解结构化数据(例如,表格)以及生成结构化输出特别是 JSON 方面有了显著改进"
#~ msgid "More resilient to the diversity of system prompts, enhancing role-play implementation and condition-setting for chatbots."
#~ msgstr "更加适应多样化的系统提示,增强了角色扮演的实现和聊天机器人的背景设置。"
#~ msgid "Context length support up to **128K** tokens and can generate up to **8K** tokens."
#~ msgstr "支持最多达 **128K** tokens 的上下文长度,并能生成多达 **8K** tokens 的文本。"
#~ msgid "Multilingual support for over **29** languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more."
#~ msgstr "支持超过 **29** 种语言,包括中文、英文、法文、西班牙文、葡萄牙文、德文、意大利文、俄文、日文、韩文、越南文、泰文、阿拉伯文等。"
#~ msgid "`Qwen2.5 Collection <https://huggingface.co/collections/Qwen/qwen25-66e81a666513e518adb90d9e>`__"
#~ msgstr ""
# Copyright (C) 2024, Qwen Team, Alibaba Group.
# This file is distributed under the same license as the Qwen package.
#
msgid ""
msgstr ""
"Project-Id-Version: Qwen \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-04-28 19:42+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../Qwen/source/quantization/awq.md:1 363514c3e24c4d2aa54832e85acf34ef
msgid "AWQ"
msgstr "AWQ"
#: ../../Qwen/source/quantization/awq.md:4 36b5c0de1013499f9f1e41edf8fa28ca
msgid "To be updated for Qwen3."
msgstr "仍需为Qwen3更新。"
#: ../../Qwen/source/quantization/awq.md:7 9d6a80a82b044628bc9c911785ac9160
msgid "For quantized models, one of our recommendations is the usage of [AWQ](https://arxiv.org/abs/2306.00978) with [AutoAWQ](https://github.com/casper-hansen/AutoAWQ)."
msgstr "对于量化模型,我们推荐使用 [AWQ](https://arxiv.org/abs/2306.00978) 结合 [AutoAWQ](https://github.com/casper-hansen/AutoAWQ) "
#: ../../Qwen/source/quantization/awq.md:9 139542ed4b414cfb834b3fd81ea88d51
msgid "**AWQ** refers to Activation-aware Weight Quantization, a hardware-friendly approach for LLM low-bit weight-only quantization."
msgstr "**AWQ**即激活值感知的权重量化(Activation-aware Weight Quantization),是一种针对LLM的低比特权重量化的硬件友好方法。"
#: ../../Qwen/source/quantization/awq.md:11 9a2959bb9f984e36a299bc40abca9402
msgid "**AutoAWQ** is an easy-to-use Python library for 4-bit quantized models. AutoAWQ speeds up models by 3x and reduces memory requirements by 3x compared to FP16. AutoAWQ implements the Activation-aware Weight Quantization (AWQ) algorithm for quantizing LLMs."
msgstr "**AutoAWQ**是一个易于使用的工具包,用于4比特量化模型。相较于FP16,AutoAWQ能够将模型的运行速度提升3倍,并将内存需求降低至原来的三分之一。AutoAWQ实现了AWQ算法,可用于LLM的量化处理。"
#: ../../Qwen/source/quantization/awq.md:15 4f9fcd93d1f44b48869224c0f4e8b76a
msgid "In this document, we show you how to use the quantized model with Hugging Face `transformers` and also how to quantize your own model."
msgstr "在本文档中,我们将向您展示如何在Hugging Face `transformers`框架下使用量化模型,以及如何对您自己的模型进行量化"
#: ../../Qwen/source/quantization/awq.md:17 870ebc162f3749b48fe454df85aaaf4b
msgid "Usage of AWQ Models with Hugging Face transformers"
msgstr "在Hugging Face transformers中使用AWQ量化模型"
#: ../../Qwen/source/quantization/awq.md:19 cc7bd785c7ac45a4980fbda683699e43
msgid "Now, `transformers` has officially supported AutoAWQ, which means that you can directly use the quantized model with `transformers`. The following is a very simple code snippet showing how to run `Qwen2.5-7B-Instruct-AWQ` with the quantized model:"
msgstr "现在,`transformers`已经正式支持AutoAWQ,这意味着您可以直接在`transformers`中使用AWQ量化模型。以下是一个非常简单的代码片段,展示如何运行量化模型 `Qwen2.5-7B-Instruct-AWQ` :"
#: ../../Qwen/source/quantization/awq.md:56 47826d51abf54ad8a89ef9b91127a700
msgid "Usage of AWQ Models with vLLM"
msgstr "在vLLM中使用AWQ量化模型"
#: ../../Qwen/source/quantization/awq.md:58 b7235ae8f8344dd4a3d2029bbe7a40fc
msgid "vLLM has supported AWQ, which means that you can directly use our provided AWQ models or those quantized with `AutoAWQ` with vLLM. We recommend using the latest version of vLLM (`vllm>=0.6.1`) which brings performance improvements to AWQ models; otherwise, the performance might not be well-optimized."
msgstr "vLLM已支持AWQ,您可以直接使用我们提供的AWQ量化模型或使用`AutoAWQ`量化的模型。我们建议使用最新版的vLLM (`vllm>=0.6.1`),新版为AWQ量化模型提升了效率提;不然推理效率可能并为被良好优化(即效率可能较非量化模型低)。"
#: ../../Qwen/source/quantization/awq.md:61 940ce8fdb5da442b99af2bc1739911c6
msgid "Actually, the usage is the same with the basic usage of vLLM. We provide a simple example of how to launch OpenAI-API compatible API with vLLM and `Qwen2.5-7B-Instruct-AWQ`:"
msgstr "实际上,使用AWQ模型与vLLM的基本用法相同。我们提供了一个简单的示例,展示了如何通过vLLM启动与OpenAI API兼容的接口,并使用 `Qwen2.5-7B-Instruct-AWQ` 模型:"
#: ../../Qwen/source/quantization/awq.md:64 2d249915352049a6a8d5a06e1f4682ee
msgid "Run the following in a shell to start an OpenAI-compatible API service:"
msgstr "在终端中运行以下命令以开启OpenAI兼容API:"
#: ../../Qwen/source/quantization/awq.md:70 be7bfbb81698429cbfcbcd24d062fc08
msgid "Then, you can call the API as"
msgstr "随后,您可以这样调用API:"
#: ../../Qwen/source/quantization/awq.md:86 0dff7d5c7b044548a82e0ba68a043d80
msgid "or you can use the API client with the `openai` Python package as shown below:"
msgstr "或者你可以按照下面所示的方式,使用 `openai` Python包中的API客户端:"
#: ../../Qwen/source/quantization/awq.md:115 65f4d60502ee486382e9bda9a5a826bb
msgid "Quantize Your Own Model with AutoAWQ"
msgstr "使用AutoAWQ量化你的模型"
#: ../../Qwen/source/quantization/awq.md:117 c7c42af91c1a419194d65200bcfa2f26
msgid "If you want to quantize your own model to AWQ quantized models, we advise you to use AutoAWQ."
msgstr "如果您希望将自定义模型量化为AWQ量化模型,我们建议您使用AutoAWQ。"
#: ../../Qwen/source/quantization/awq.md:123 232e94883d044030b2193392788b9314
msgid "Suppose you have finetuned a model based on `Qwen2.5-7B`, which is named `Qwen2.5-7B-finetuned`, with your own dataset, e.g., Alpaca. To build your own AWQ quantized model, you need to use the training data for calibration. Below, we provide a simple demonstration for you to run:"
msgstr "假设你已经基于 `Qwen2.5-7B` 模型进行了微调,并将其命名为 `Qwen2.5-7B-finetuned` ,且使用的是你自己的数据集,比如Alpaca。若要构建你自己的AWQ量化模型,你需要使用训练数据进行校准。以下,我们将为你提供一个简单的演示示例以便运行:"
#: ../../Qwen/source/quantization/awq.md:141 5162195f32ee4ecba229aa137da1aba4
msgid "Then you need to prepare your data for calibration. What you need to do is just put samples into a list, each of which is a text. As we directly use our finetuning data for calibration, we first format it with ChatML template. For example,"
msgstr "接下来,您需要准备数据以进行校准。您需要做的就是将样本放入一个列表中,其中每个样本都是一段文本。由于我们直接使用微调数据来进行校准,所以我们首先使用ChatML模板对其进行格式化。例如:"
#: ../../Qwen/source/quantization/awq.md:153 0d4736e90e0242a8be15533de3aab6ff
msgid "where each `msg` is a typical chat message as shown below:"
msgstr "其中每个 `msg` 是一个典型的聊天消息,如下所示:"
#: ../../Qwen/source/quantization/awq.md:163 79d86630600945ac85dbe13d07987016
msgid "Then just run the calibration process by one line of code:"
msgstr "然后只需通过一行代码运行校准过程:"
#: ../../Qwen/source/quantization/awq.md:169 1ae219a50508465b98e3b3398e631681
msgid "Finally, save the quantized model:"
msgstr "最后,保存量化模型:"
#: ../../Qwen/source/quantization/awq.md:176 58316c1a4172418aba9f37925963e17f
msgid "Then you can obtain your own AWQ quantized model for deployment. Enjoy!"
msgstr "然后你就可以得到一个可以用于部署的AWQ量化模型。玩得开心!"
# Copyright (C) 2024, Qwen Team, Alibaba Group.
# This file is distributed under the same license as the Qwen package.
#
msgid ""
msgstr ""
"Project-Id-Version: Qwen \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-04-28 19:42+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../Qwen/source/quantization/gptq.md:1 c90397f810fb44a0abba8dd02f998f1c
msgid "GPTQ"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:4 b79afc46b0f9474fb0c83751625aefc5
msgid "To be updated for Qwen3."
msgstr "仍需为Qwen3更新。"
#: ../../Qwen/source/quantization/gptq.md:7 898494af2a944193880f27e2f90db4f4
msgid "[GPTQ](https://arxiv.org/abs/2210.17323) is a quantization method for GPT-like LLMs, which uses one-shot weight quantization based on approximate second-order information. In this document, we show you how to use the quantized model with Hugging Face `transformers` and also how to quantize your own model with [AutoGPTQ](https://github.com/AutoGPTQ/AutoGPTQ)."
msgstr "[GPTQ](https://arxiv.org/abs/2210.17323)是一种针对类GPT大型语言模型的量化方法,它基于近似二阶信息进行一次性权重量化。在本文档中,我们将向您展示如何使用 `transformers` 库加载并应用量化后的模型,同时也会指导您如何通过[AutoGPTQ](https://github.com/AutoGPTQ/AutoGPTQ)来对您自己的模型进行量化处理。"
#: ../../Qwen/source/quantization/gptq.md:10 11b82020735d4828a4182cefbf98aeb1
msgid "Usage of GPTQ Models with Hugging Face transformers"
msgstr "在Hugging Face transformers中使用GPTQ模型"
#: ../../Qwen/source/quantization/gptq.md:14 2e9481d850954772949dd33897e0b06b
msgid "To use the official Qwen2.5 GPTQ models with `transformers`, please ensure that `optimum>=1.20.0` and compatible versions of `transformers` and `auto_gptq` are installed."
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:16 fe6662a312184d40b07d957f4c0888cc
msgid "You can do that by"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:22 9f0ad8e2a26145cf8bd9d60305566771
msgid "Now, `transformers` has officially supported AutoGPTQ, which means that you can directly use the quantized model with `transformers`. For each size of Qwen2.5, we provide both Int4 and Int8 GPTQ quantized models. The following is a very simple code snippet showing how to run `Qwen2.5-7B-Instruct-GPTQ-Int4`:"
msgstr "现在,`transformers` 正式支持了AutoGPTQ,这意味着您能够直接在`transformers`中使用量化后的模型。以下是一个非常简单的代码片段示例,展示如何运行 `Qwen2.5-7B-Instruct-GPTQ-Int4` (请注意,对于每种大小的Qwen2.5模型,我们都提供了Int4和Int8两种量化版本):"
#: ../../Qwen/source/quantization/gptq.md:60 855686b8990f403bba151d8498947f23
msgid "Usage of GPTQ Models with vLLM"
msgstr "在vLLM中使用GPTQ模型"
#: ../../Qwen/source/quantization/gptq.md:62 ad572c30a0904598b3cbeba7c38a607a
msgid "vLLM has supported GPTQ, which means that you can directly use our provided GPTQ models or those trained with `AutoGPTQ` with vLLM. If possible, it will automatically use the GPTQ Marlin kernel, which is more efficient."
msgstr "vLLM已支持GPTQ,您可以直接使用我们提供的GPTQ量化模型或使用`AutoGPTQ`量化的模型。我们建议使用最新版的vLLM。如有可能,其会自动使用效率更好的GPTQ Marlin实现。"
#: ../../Qwen/source/quantization/gptq.md:65 09050876d2c04aee9b619d28d4f5589c
msgid "Actually, the usage is the same with the basic usage of vLLM. We provide a simple example of how to launch OpenAI-API compatible API with vLLM and `Qwen2.5-7B-Instruct-GPTQ-Int4`:"
msgstr "实际上,使用GPTQ模型与vLLM的基本用法相同。我们提供了一个简单的示例,展示了如何通过vLLM启动与OpenAI API兼容的接口,并使用 `Qwen2.5-7B-Instruct-GPTQ-Int4` 模型:"
#: ../../Qwen/source/quantization/gptq.md:68 a31dd879cc444b5da8d16fb1705585a6
msgid "Run the following in a shell to start an OpenAI-compatible API service:"
msgstr "在终端中运行以下命令以开启OpenAI兼容API:"
#: ../../Qwen/source/quantization/gptq.md:74 9dfb41e03089473792928b05b1225de4
msgid "Then, you can call the API as"
msgstr "随后,您可以这样调用API:"
#: ../../Qwen/source/quantization/gptq.md:90 6b440bebe0d84118bb63ed9a7c169ab5
msgid "or you can use the API client with the `openai` Python package as shown below:"
msgstr "或者你可以按照下面所示的方式,使用 `openai` Python包中的API客户端:"
#: ../../Qwen/source/quantization/gptq.md:119 7ffaa1ca8b4740b98dc3f804348da523
msgid "Quantize Your Own Model with AutoGPTQ"
msgstr "使用AutoGPTQ量化你的模型"
#: ../../Qwen/source/quantization/gptq.md:121 40bd0b11507c4f06be5a5918d0dc3bdb
msgid "If you want to quantize your own model to GPTQ quantized models, we advise you to use AutoGPTQ. It is suggested installing the latest version of the package by installing from source code:"
msgstr "如果你想将自定义模型量化为GPTQ量化模型,我们建议你使用AutoGPTQ工具。推荐通过安装源代码的方式获取并安装最新版本的该软件包。"
#: ../../Qwen/source/quantization/gptq.md:130 d6ebb03d51bf4e0686ae17ce3f0a34db
msgid "Suppose you have finetuned a model based on `Qwen2.5-7B`, which is named `Qwen2.5-7B-finetuned`, with your own dataset, e.g., Alpaca. To build your own GPTQ quantized model, you need to use the training data for calibration. Below, we provide a simple demonstration for you to run:"
msgstr "假设你已经基于 `Qwen2.5-7B` 模型进行了微调,并将该微调后的模型命名为 `Qwen2.5-7B-finetuned` ,且使用的是自己的数据集,比如Alpaca。要构建你自己的GPTQ量化模型,你需要使用训练数据进行校准。以下是一个简单的演示示例,供你参考运行:"
#: ../../Qwen/source/quantization/gptq.md:161 9c1b27cc38764332891a8a13175663fc
msgid "However, if you would like to load the model on multiple GPUs, you need to use `max_memory` instead of `device_map`. Here is an example:"
msgstr "但是,如果你想使用多GPU来读取模型,你需要使用 `max_memory` 而不是 `device_map`。下面是一段示例代码:"
#: ../../Qwen/source/quantization/gptq.md:172 c2a9a50734854c19acf3e623597aee80
msgid "Then you need to prepare your data for calibration. What you need to do is just put samples into a list, each of which is a text. As we directly use our finetuning data for calibration, we first format it with ChatML template. For example,"
msgstr "接下来,你需要准备数据进行校准。你需要做的是将样本放入一个列表中,其中每个样本都是一段文本。由于我们直接使用微调数据进行校准,所以我们首先使用ChatML模板对它进行格式化处理。例如:"
#: ../../Qwen/source/quantization/gptq.md:188 7621f73d34d04dd791d2eda03edb0d06
msgid "where each `msg` is a typical chat message as shown below:"
msgstr "其中每个 `msg` 是一个典型的聊天消息,如下所示:"
#: ../../Qwen/source/quantization/gptq.md:198 293efa14ece74a0aa9cbf32ef21e6bcd
msgid "Then just run the calibration process by one line of code:"
msgstr "然后只需通过一行代码运行校准过程:"
#: ../../Qwen/source/quantization/gptq.md:209 919d7a77cc4a4ef084ee8e2240ff1797
msgid "Finally, save the quantized model:"
msgstr "最后,保存量化模型:"
#: ../../Qwen/source/quantization/gptq.md:216 b353bdf12d6148fdb0a77662f795ae7e
msgid "It is unfortunate that the `save_quantized` method does not support sharding. For sharding, you need to load the model and use `save_pretrained` from transformers to save and shard the model. Except for this, everything is so simple. Enjoy!"
msgstr "很遗憾, `save_quantized` 方法不支持模型分片。若要实现模型分片,您需要先加载模型,然后使用来自 `transformers` 库的 `save_pretrained` 方法来保存并分片模型。除此之外,一切操作都非常简单。祝您使用愉快!"
#: ../../Qwen/source/quantization/gptq.md:222 caea6f76804e40daa394ae2e2d52a6ce
msgid "Known Issues"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:224 07df69bd48d4445887b5c1fa09f2f0fb
msgid "Qwen2.5-72B-Instruct-GPTQ-Int4 cannot stop generation properly"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:226
#: ../../Qwen/source/quantization/gptq.md:235 a4f1c7b0cb5d49f2929ba5d1246e885d
#: d2dbf88d06974152943e6ec405419390
msgid "Model"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:226 cb9c0be91ecc46c3b6ecfa97a0a37dd7
msgid "Qwen2.5-72B-Instruct-GPTQ-Int4"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:227
#: ../../Qwen/source/quantization/gptq.md:236 c1fe04754a0642fa82ed425d6abaa487
#: f3ff85cbbc47459fb36b5ad0e38b4a1b
msgid "Framework"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:227 8a5a4fe9d7634cb1ac65025565c3593a
msgid "vLLM, AutoGPTQ (including Hugging Face transformers)"
msgstr "vLLM、AutoGPTQ(包括 Hugging Face transformers)"
#: ../../Qwen/source/quantization/gptq.md:228
#: ../../Qwen/source/quantization/gptq.md:237 320d56294cc4490f8b30ac523388bc44
#: c04326d003f949a7b2b63c6c6cb20ac3
msgid "Description"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:228 22f80d0679dc426dbbfb21b90b993a27
msgid "Generation cannot stop properly. Continual generation after where it should stop, then repeated texts, either single character, a phrase, or paragraphs, are generated."
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:229
#: ../../Qwen/source/quantization/gptq.md:238 255a7a8ac98b4d2da51f79f207be5901
#: 673d23bf488840a2a32a18cd657f334f
msgid "Workaround"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:229 c2171874ed804ffb826ac686128d7bff
msgid "The following workaround could be considered"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:230 a59d6759991640609371bf7afd81e0b8
msgid "Using the original model in 16-bit floating point"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:231 97134ed43ee3414199928d755c24544e
msgid "Using the AWQ variants or llama.cpp-based models for reduced chances of abnormal generation"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:233 7c30819dea6c4cfb8eee98d0dd217bf9
msgid "Qwen2.5-32B-Instruct-GPTQ-Int4 broken with vLLM on multiple GPUs"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:235 a4a641abd99a47049c1fd172e9cfa2be
msgid "Qwen2.5-32B-Instruct-GPTQ-Int4"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:236 70216327dda349cabf03412f5fbe3114
msgid "vLLM"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:237 8edf21882ff24358b736c73477cfba9d
msgid "Deployment on multiple GPUs and only garbled text like `!!!!!!!!!!!!!!!!!!` could be generated."
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:238 10d9d8b3d8e74afea5ccd79bc698fb7c
msgid "Each of the following workaround could be considered"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:239 33d1632f26f9423c847d06af7a5d107d
msgid "Using the AWQ or GPTQ-Int8 variants"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:240 b27f1f32637349d09b8c74a2041a4d9b
msgid "Using a single GPU"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:241 fc27883584a04682b9e28b2ccf51dc0e
msgid "Using Hugging Face `transformers` if latency and throughput are not major concerns"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:244 5664e5bd63c845d49e8cfa75e789dfa3
msgid "Troubleshooting"
msgstr "问题排查"
#: ../../Qwen/source/quantization/gptq.md 06f2358881134920ab43f4256ad6300e
msgid "With `transformers` and `auto_gptq`, the logs suggest `CUDA extension not installed.` and the inference is slow."
msgstr "在使用 `transformers` 和 `auto_gptq` 时,日志提示 `CUDA extension not installed.` 并且推理速度缓慢。"
#: ../../Qwen/source/quantization/gptq.md:248 2d57d681b2d74c27b60523fa86676b6f
msgid "`auto_gptq` fails to find a fused CUDA kernel compatible with your environment and falls back to a plain implementation. Follow its [installation guide](https://github.com/AutoGPTQ/AutoGPTQ/blob/main/docs/INSTALLATION.md) to install a pre-built wheel or try installing `auto_gptq` from source."
msgstr "`auto_gptq` 未能找到与您的环境兼容的融合CUDA算子,因此退回到基础实现。请遵循其 [安装指南](https://github.com/AutoGPTQ/AutoGPTQ/blob/main/docs/INSTALLATION.md) 来安装预构建的 wheel 或尝试从源代码安装 `auto_gptq` 。"
#: ../../Qwen/source/quantization/gptq.md 95b57d1a962c4dc7aa02a69a403e2376
msgid "Self-quantized Qwen2.5-72B-Instruct-GPTQ with `vllm`, `ValueError: ... must be divisible by ...` is raised. The intermediate size of the self-quantized model is different from the official Qwen2.5-72B-Instruct-GPTQ models."
msgstr "`vllm` 使用自行量化的 Qwen2.5-72B-Instruct-GPTQ 时,会引发 `ValueError: ... must be divisible by ...` 错误。自量化的模型的 intermediate size 与官方的 Qwen2.5-72B-Instruct-GPTQ 模型不同。"
#: ../../Qwen/source/quantization/gptq.md:255 ecd9b51a549045949ff18fdb6226ddc8
#, python-brace-format
msgid "After quantization the size of the quantized weights are divided by the group size, which is typically 128. The intermediate size for the FFN blocks in Qwen2.5-72B is 29568. Unfortunately, {math}`29568 \\div 128 = 231`. Since the number of attention heads and the dimensions of the weights must be divisible by the tensor parallel size, it means you can only run the quantized model with `tensor_parallel_size=1`, i.e., one GPU card."
msgstr "量化后,量化权重的大小将被 group size(通常为128)整除。Qwen2-72B 中FFN块的中间大小为29568。不幸的是, {math}`29568 \\div 128 = 231` 。由于注意力头的数量和权重的维度必须能够被张量并行大小整除,这意味着你只能使用 `tensor_parallel_size=1` ,即一张 GPU 卡,来运行量化的模型。"
#: ../../Qwen/source/quantization/gptq.md:260 8b1c5e3934654679a2d85e3287cf9309
#, python-brace-format
msgid "A workaround is to make the intermediate size divisible by {math}`128 \\times 8 = 1024`. To achieve that, the weights should be padded with zeros. While it is mathematically equivalent before and after zero-padding the weights, the results may be slightly different in reality."
msgstr "一个解决方案是使中间大小能够被 {math}`128 \\times 8 = 1024` 整除。为了达到这一目的,应该使用零值对权重进行填充。虽然在数学上,在对权重进行零填充前后是等价的,但在现实中结果可能会略有不同。"
#: ../../Qwen/source/quantization/gptq.md:264 ae904f7ab91340c4a6831aef4de643ba
msgid "Try the following:"
msgstr "尝试以下方法:"
#: ../../Qwen/source/quantization/gptq.md:297 4cf8c516a2324e618d25333c84be9e6b
msgid "This will save the padded checkpoint to the specified directory. Then, copy other files from the original checkpoint to the new directory and modify the `intermediate_size` in `config.json` to `29696`. Finally, you can quantize the saved model checkpoint."
msgstr "这将会把填充后的检查点保存到指定的目录。然后,你需要从原始检查点复制其他文件到新目录,并将 `config.json` 中的 `intermediate_size` 修改为 `29696` 。最后,你可以量化保存的模型检查点。"
furo
myst-parser==4.0.0
sphinx<8,>4.5.0
sphinx-copybutton
sphinx-design>=0.6.0