"third_party/METIS/GKlib/README.md" did not exist on "f2c80b440e80226441dc6c11a95ade10defaaf11"
Commit a52e53db authored by chenzk's avatar chenzk
Browse files

v1.0

parents
Pipeline #2680 canceled with stages
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2024, Qwen Team
# This file is distributed under the same license as the Qwen package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2024.
#
msgid ""
msgstr ""
"Project-Id-Version: Qwen \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-04-28 19:42+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../Qwen/source/framework/qwen_agent.rst:2
#: aaed24d3edd64e6ab1f20188f3d5ba24
msgid "Qwen-Agent"
msgstr "Qwen-Agent"
#: ../../Qwen/source/framework/qwen_agent.rst:5
#: 1cbbb8d342f243c58e0d66a3e44daac8
msgid "To be updated for Qwen3."
msgstr "仍需为Qwen3更新。"
#: ../../Qwen/source/framework/qwen_agent.rst:7
#: 3e1dbee121bc4a6c91a26618e27c0d86
msgid "`Qwen-Agent <https://github.com/QwenLM/Qwen-Agent>`__ is a framework for developing LLM applications based on the instruction following, tool usage, planning, and memory capabilities of Qwen. It also comes with example applications such as Browser Assistant, Code Interpreter, and Custom Assistant."
msgstr "`Qwen-Agent <https://github.com/QwenLM/Qwen-Agent>`__ 是一个基于 Qwen 的指令跟随、工具使用、计划和记忆能力来开发 LLM 应用程序的框架。它还附带了一些示例应用程序,例如浏览器助手、代码解释器和自定义助手。"
#: ../../Qwen/source/framework/qwen_agent.rst:14
#: f180730da09640169fb93950a2e8cb5f
msgid "Installation"
msgstr "安装"
#: ../../Qwen/source/framework/qwen_agent.rst:23
#: 89f39ac4160d49fba7f9d52dce6527c3
msgid "Developing Your Own Agent"
msgstr "开发您自己的智能体"
#: ../../Qwen/source/framework/qwen_agent.rst:25
#: 307456721ed7469eb7b8f636483188f4
msgid "Qwen-Agent provides atomic components such as LLMs and prompts, as well as high-level components such as Agents. The example below uses the Assistant component as an illustration, demonstrating how to add custom tools and quickly develop an agent that uses tools."
msgstr "Qwen-Agent 提供包括语言模型和提示词等原子级组件,及智能体等高级组件在内的多种组件。以下示例选取助理组件进行展示,阐述了如何整合自定义工具以及如何迅速开发出一个能够应用这些工具的代理程序。"
#: ../../Qwen/source/framework/qwen_agent.rst:94
#: 13034806dd414e19a5f53ece31d0fa16
msgid "The framework also provides more atomic components for developers to combine. For additional showcases, please refer to `examples <https://github.com/QwenLM/Qwen-Agent/tree/main/examples>`__."
msgstr "该框架还为开发者提供了更多的原子组件以供组合使用。欲了解更多示例,请参见 `examples <https://github.com/QwenLM/Qwen-Agent/tree/main/examples>`__。"
# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2024, Qwen Team
# This file is distributed under the same license as the Qwen package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2024.
#
msgid ""
msgstr ""
"Project-Id-Version: Qwen \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-04-28 19:42+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:2
#: 6d4d3bb3020f4e4d8dba0ca5778cdcae
msgid "Performance of Quantized Models"
msgstr "量化模型效果评估"
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:5
#: 3a541cd8cba74edf9b06b46f59eaaf38
msgid "To be updated for Qwen3."
msgstr "仍需为Qwen3更新。"
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:7
#: 3a95fc299de141dea4fc729ef907ce17
msgid "This section reports the generation performance of quantized models (including GPTQ and AWQ) of the Qwen2 series. Specifically, we report:"
msgstr "本部分介绍Qwen2量化模型(包括GPTQ与AWQ量化方案)的效果评估,有以下数据集"
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:11
#: 9386a3b95eb340568185da78224a1ccd
msgid "MMLU (Accuracy)"
msgstr "MMLU (准确率)"
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:12
#: 3cd93b881c90488895c61298104bc7fb
msgid "C-Eval (Accuracy)"
msgstr "C-Eval (准确率)"
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:13
#: 7ac4bb515b0a49699d4eb95fc433bb51
msgid "IFEval (Strict Prompt-Level Accuracy)"
msgstr "IFEval (提示词级的严格准确率,Strict Prompt-Level Accuracy)"
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:15
#: 08e3f35820344c93877618815650b866
msgid "We use greedy decoding in evaluating all models."
msgstr "所有模型均使用贪心解码。"
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:18
#: 9aec40221219455d8fc4e473e5acf09c
msgid "Quantization"
msgstr "量化模型"
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:18
#: 93f274f4751f445d85f04937b25c7f7d
msgid "Average"
msgstr "平均"
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:18
#: 776612f5dd4a40d98976bdfe4896508c
msgid "MMLU"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:18
#: f6e8014116cf4179a934d601ee61d04d
msgid "C-Eval"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:18
#: 0c40e96c4a3b4cdeaaf1a95ff1aa8f98
msgid "IFEval"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:20
#: 773ccb0f10bd4cf690e819af51c40e76
msgid "Qwen2-72B-Instruct"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:20
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:28
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:36
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:44
#: 71e180f75e624b738d56ec2a1fad253c 7ebe73a2e96445c4bb733845c3190240
#: bd5a3b8861d646fa9e8d8bc51bb1b80c cc79a78b34f94c18b7bdaf1bfcc8824d
msgid "BF16"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:20
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:22
#: 08517ffc3e6e4ceb812c3d8710307266 2e879d3d1fef4c878b097550d745e7ae
msgid "81.3"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:20
#: f795aa42cf7d42ccb5a573a5f44be79f
msgid "82.3"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:20
#: 01c54f3da3454e178a07a9f88ed5302b
msgid "83.8"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:20
#: 7651df5ccaa14b11a3a89827a5265ae8
msgid "77.6"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:22
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:30
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:38
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:46
#: 04de04c9ff3640f096301e76fdd291de 301aa8e494ff4fe4aefcc8cfb7a4c065
#: d395be41cf144318a1faeccc6f6965c8 ec513d10a75d44b8bd134287a57b5cdd
msgid "GPTQ-Int8"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:22
#: 411166db878d4d8f8515e9f5d78a651c
msgid "80.7"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:22
#: e63ce8a2f1cc4cec9b52521015e2aebe
msgid "83.4"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:22
#: e6be6c30e0d740d39c6c8807e2d4f5f8
msgid "77.5"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:24
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:32
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:40
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:48
#: 21720ff324814b2b865f37a40c3586b5 4644a49bcdfd457b84eb5b2771177d78
#: 560dcb4bfa6e45088faefdb504d629a5 7044a0d2dd6945138ea385287ab5bf33
msgid "GPTQ-Int4"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:24
#: 1cb55cd40b3c484d8213c15375b2ad68
msgid "81.2"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:24
#: 32b889d9ef014f2ab6be6881e20d40ae
msgid "80.8"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:24
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:26
#: ba86de9eb27b40e0ba6a57580aed89c3 eed2e99c0edc426e81ec24e961fe971e
msgid "83.9"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:24
#: ee3a3132082048d5b79721fa84f6f816
msgid "78.9"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:26
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:34
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:42
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:50
#: 632f832fc1f249fa92764538b698550d 8c7ccf4f75f44b27bb1b5aac544836cb
#: b473937c2be94c3490483bb5a820e2fe bc1abd77dd27412992d21bda1831a2a8
msgid "AWQ"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:26
#: 2711a3f907224e51ba30818b2e730a30
msgid "80.4"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:26
#: ca9624c0258b425ba53f024b086c173a
msgid "80.5"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:26
#: 2f4b57d4394c4cb187407145ce8d5f1e
msgid "76.9"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:28
#: 48cc75ed7bf04778b327c7b03d418e37
msgid "Qwen2-7B-Instruct"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:28
#: 75182905b74a41099ff859fb86752e99
msgid "66.9"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:28
#: 80cda712e9dc482fac24952d3bb27b28
msgid "70.5"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:28
#: 0701d66bc3084aef8937e4b687705f37
msgid "77.2"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:28
#: 8efb5c133644420c808dfd78f8fcde2f
msgid "53.1"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:30
#: 2076e02516bd4ff1856bc12a8d6bd320
msgid "66.2"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:30
#: 588f4ad13845491d9589ea094265d532
msgid "69.1"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:30
#: 0c79963a231a402eb6db1671e851be38
msgid "76.7"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:30
#: 5d525163672f456289990489459466ae
msgid "52.9"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:32
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:34
#: 9283ca6491194b59a5edf57228f9b5af a4123c0691a442f6850ae25615c108af
msgid "64.1"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:32
#: 9e7ffb49aac34129894b0582c0d8aba1
msgid "67.8"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:32
#: 7c2fc310e5764b7fbf6034ffd3a5d26d
msgid "75.2"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:32
#: 33e6b6e590a64c08adccf0bb161c1046
msgid "49.4"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:34
#: b3cbe7665bdf4f4388f015fb6606540e
msgid "67.4"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:34
#: a47d3b52e80249f986c4339b9d3fff10
msgid "73.6"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:34
#: d76543cff2df434185fbe51712024679
msgid "51.4"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:36
#: cee2c965036d41c6a93ffbf9a9788e4b
msgid "Qwen2-1.5B-Instruct"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:36
#: 8c9d1cd8fb5a4d75b85d0edcb9ed69df
msgid "48.4"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:36
#: f5e05b0942a24e2b9cac753932ad51c4
msgid "52.4"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:36
#: c6f81ec529004598aa14c55228ff9538
msgid "63.8"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:36
#: 5b2b4092d04f4d02a56bd0df5807e2c5
msgid "29.0"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:38
#: 08d2bf82e83f4a889d622c72c1e1b3b2
msgid "48.1"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:38
#: 3d8ea738153f467ba55d50e6bf0f84c0
msgid "53.0"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:38
#: 8755d6c4c1e64cd38122f08a92bd90ca
msgid "62.5"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:38
#: 1c403dbb3692472a88706cb4b4a1f0f3
msgid "28.8"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:40
#: f3f43ea77edc4ff0969e2466e6fe13e1
msgid "45.0"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:40
#: 9d070c4b9f3e4fceb27b29ecdf90eb41
msgid "50.7"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:40
#: 24ff991704c440deb34b92512f89c371
msgid "57.4"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:40
#: b4645b7317a44cb795fc4190149dd0e0
msgid "27.0"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:42
#: eeee44d1d65647569999de94e72c00cb
msgid "46.5"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:42
#: 41630bee9142494c801083cd5d213dc0
msgid "51.6"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:42
#: 762395735fb34bccbc4d057968bbfbf1
msgid "58.1"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:42
#: f5915835bcb24051bebed452fc398728
msgid "29.9"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:44
#: 39108e2a66444ca780a720f115251308
msgid "Qwen2-0.5B-Instruct"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:44
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:50
#: 2795adace57c401cb8bacc00082dfd53 a59271d53e434d17a8a0a19529158f2c
msgid "34.4"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:44
#: c93982789e4e453eb5a02d64f02cb74f
msgid "37.9"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:44
#: 213dfd43b2254a2caec1d4b1d231ed55
msgid "45.2"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:44
#: 11de22e2a04a4c04b0b91d09d028b853
msgid "20.0"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:46
#: 84b6570bcc8d4c6598336d5bc9b9d36a
msgid "32.6"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:46
#: b79e88232d114f43a179dcc5b0477c97
msgid "35.6"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:46
#: 1166b675e1e64e18a82c3219f321e248
msgid "43.9"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:46
#: fdf340d39b074778b55d36f477f8dc0a
msgid "18.1"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:48
#: ed930e1b13dd4c5caf80b2a180a1bcc3
msgid "29.7"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:48
#: c3d5617389634f7e96c66b4f869379a9
msgid "33.0"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:48
#: 4573b471c48d4028ad6fb378e75f40aa
msgid "39.2"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:48
#: c867c42e916f493b9715b1adf656ddcb
msgid "16.8"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:50
#: 20d4c89c335648bb93f07ebfb8ce9fce
msgid "31.1"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:50
#: 25400aeaf79d49cb914ffa5ff26bfe03
msgid "42.1"
msgstr ""
#: ../../Qwen/source/getting_started/quantization_benchmark.rst:50
#: d15e246b65b0427d970b78deffd8c2bc
msgid "16.7"
msgstr ""
# Copyright (C) 2024, Qwen Team, Alibaba Group.
# This file is distributed under the same license as the Qwen package.
#
msgid ""
msgstr ""
"Project-Id-Version: Qwen \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-04-28 19:42+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../Qwen/source/getting_started/quickstart.md:1
#: 595827c46f2e4884b69954cf22e0e957
msgid "Quickstart"
msgstr "快速开始"
#: ../../Qwen/source/getting_started/quickstart.md:3
#: 725288359306417a943352cef10f831c
msgid "This guide helps you quickly start using Qwen3. We provide examples of [Hugging Face Transformers](https://github.com/huggingface/transformers) as well as [ModelScope](https://github.com/modelscope/modelscope), and [vLLM](https://github.com/vllm-project/vllm) for deployment."
msgstr "本指南帮助您快速上手 Qwen3 的使用,并提供了如下示例: [Hugging Face Transformers](https://github.com/huggingface/transformers) 以及 [ModelScope](https://github.com/modelscope/modelscope) 和 [vLLM](https://github.com/vllm-project/vllm>) 在部署时的应用实例。"
#: ../../Qwen/source/getting_started/quickstart.md:6
#: 6bfc020002af4b4eaad8adf3902e30ac
msgid "You can find Qwen3 models in [the Qwen3 collection](https://huggingface.co/collections/Qwen/qwen3-67dd247413f0e2e4f653967f) at HuggingFace Hub and [the Qwen3 collection](https://www.modelscope.cn/collections/Qwen3-9743180bdc6b48) at ModelScope."
msgstr "你可以在 HuggingFace Hub 的 [Qwen3 collection](https://huggingface.co/collections/Qwen/qwen3-67dd247413f0e2e4f653967f) 或 ModelScope 的 [Qwen3 collection](https://www.modelscope.cn/collections/Qwen3-9743180bdc6b48) 中寻找 Qwen3 模型。"
#: ../../Qwen/source/getting_started/quickstart.md:8
#: 1dbf0833f8a5407b8d00056d029eb9d8
msgid "Transformers"
msgstr "Transformers"
#: ../../Qwen/source/getting_started/quickstart.md:10
#: cbe2f022b0b54729a1d3627cb19ad99f
msgid "To get a quick start with Qwen3, you can try the inference with `transformers` first. Make sure that you have installed `transformers>=4.51.0`. We advise you to use Python 3.10 or higher, and PyTorch 2.6 or higher."
msgstr "要快速上手 Qwen3 ,我们建议您首先尝试使用 `transformers` 进行推理。请确保已安装了 `transformers>=4.51.0` 版本。我们建议您使用 Python 3.10 或以上版本, PyTorch 2.6 或以上版本。"
#: ../../Qwen/source/getting_started/quickstart.md:14
#: bd305d4f44484f75bfe0c02a9eda68c4
msgid "The following is a very simple code snippet showing how to run Qwen3-8B:"
msgstr "以下是一个非常简单的代码片段示例,展示如何运行 Qwen3 模型:"
#: ../../Qwen/source/getting_started/quickstart.md:63
#: 0bb48ceb71854514be78497721308702
msgid "Qwen3 will think before respond, similar to QwQ models. This means the model will use its reasoning abilities to enhance the quality of generated responses. The model will first generate thinking content wrapped in a `<think>...</think>` block, followed by the final response."
msgstr "Qwen3 将在实际回复前思考,与 QwQ 模型类似。这意味着模型将运用其推理能力来提升生成回复的质量。模型会首先生成包含在 `<think>...</think>` 块中的思考内容,随后给出最终回复。"
#: ../../Qwen/source/getting_started/quickstart.md:67
#: d110ccfe1d834169992f03bcf932e250
msgid "Hard Switch: To strictly disable the model's thinking behavior, aligning its functionality with the previous Qwen2.5-Instruct models, you can set `enable_thinking=False` when formatting the text."
msgstr "硬开关:为了严格禁用模型的思考行为,使其功能与之前的Qwen2.5-Instruct模型保持一致,您可以在格式化文本时设置`enable_thinking=False`。"
#: ../../Qwen/source/getting_started/quickstart.md:77
#: 4bceeb7e0179470f88620507ade7915b
msgid "It can be particularly useful in scenarios where disabling thinking is essential for enhancing efficiency."
msgstr "在某些需要通过禁用思考来提升效率的场景中,这一功能尤其有用。"
#: ../../Qwen/source/getting_started/quickstart.md:79
#: 16b4b43b7a7b43a698118e17d778a6dd
msgid "Soft Switch: Qwen3 also understands the user's instruction on its thinking behaviour, in particular, the soft switch `/think` and `/no_think`. You can add them to user prompts or system messages to switch the model's thinking mode from turn to turn. The model will follow the most recent instruction in multi-turn conversations."
msgstr "软开关:Qwen3 还能够理解用户对其思考行为的指令,特别是软开关 `/think` 和 `/no_think`。您可以将这些指令添加到用户 (user) 或系统 (system) 消息中,以在对话轮次之间灵活切换模型的思考模式。在多轮对话中,模型将遵循最近的指令。"
#: ../../Qwen/source/getting_started/quickstart.md:85
#: 518d0395430f4920973e6da2753c1507
msgid "For thinking mode, use Temperature=0.6, TopP=0.95, TopK=20, and MinP=0 (the default setting in `generation_config.json`). DO NOT use greedy decoding, as it can lead to performance degradation and endless repetitions. For more detailed guidance, please refer to the Best Practices section."
msgstr "对于思考模式,使用 Temperature=0.6,TopP=0.95,TopK=20,以及 MinP=0(`generation_config.json` 中的默认设置)。不要使用贪婪解码,因为它可能导致性能下降和无尽的重复。更多详细指导,请参阅最佳实践部分。"
#: ../../Qwen/source/getting_started/quickstart.md:89
#: 80bf598dfdf048a791d05c6a21ccd425
msgid "For non-thinking mode, we suggest using Temperature=0.7, TopP=0.8, TopK=20, and MinP=0."
msgstr "对于非思考模式,我们建议使用 Temperature=0.7,TopP=0.8,TopK=20,以及 MinP=0。"
#: ../../Qwen/source/getting_started/quickstart.md:93
#: 7a585706796a4db9a9f34ec1241135b5
msgid "ModelScope"
msgstr "魔搭 (ModelScope)"
#: ../../Qwen/source/getting_started/quickstart.md:95
#: fbf6acee0f534a3d9197221626ce79e4
msgid "To tackle with downloading issues, we advise you to try [ModelScope](https://github.com/modelscope/modelscope). Before starting, you need to install `modelscope` with `pip`."
msgstr "为了解决下载问题,我们建议您尝试从 [ModelScope](https://github.com/modelscope/modelscope) 进行下载。开始之前,需要使用 `pip` 安装 `modelscope` 。"
#: ../../Qwen/source/getting_started/quickstart.md:98
#: e29964895f744793a18058022ad578b8
msgid "`modelscope` adopts a programmatic interface similar (but not identical) to `transformers`. For basic usage, you can simply change the first line of code above to the following:"
msgstr "`modelscope` 采用了与 `transformers` 类似(但不完全一致)的编程接口。对于基础使用,仅需将上面代码第一行做如下修改:"
#: ../../Qwen/source/getting_started/quickstart.md:105
#: 2686cab2a6f54fe7ae813a0aeeb04d14
msgid "For more information, please refer to [the documentation of `modelscope`](https://www.modelscope.cn/docs)."
msgstr "欲获取更多信息,请参考 [`modelscope` 文档](https://www.modelscope.cn/docs)。"
#: ../../Qwen/source/getting_started/quickstart.md:107
#: ce23fee238f8458599cc4d7e16a2e509
msgid "vLLM"
msgstr ""
#: ../../Qwen/source/getting_started/quickstart.md:109
#: cf0e10035e954a328775205ff39e9687
msgid "To deploy Qwen3, we advise you to use vLLM. vLLM is a fast and easy-to-use framework for LLM inference and serving. In the following, we demonstrate how to build a OpenAI-API compatible API service with vLLM."
msgstr "要部署 Qwen3 ,我们建议您使用 vLLM 。 vLLM 是一个用于 LLM 推理和服务的快速且易于使用的框架。以下,我们将展示如何使用 vLLM 构建一个与 OpenAI 兼容的 API 服务。"
#: ../../Qwen/source/getting_started/quickstart.md:113
#: 925651cdb57d478884f151b52834ab3c
msgid "First, make sure you have installed `vllm>=0.8.5`."
msgstr "首先,确保你已经安装 `vLLM>=0.8.5` :"
#: ../../Qwen/source/getting_started/quickstart.md:115
#: 4cb0c9b830984fafa3f5ee2e74dea6dc
msgid "Run the following code to build up a vLLM service. Here we take Qwen3-8B as an example:"
msgstr "运行以下代码以构建 vLLM 服务。此处我们以 Qwen3-8B 为例:"
#: ../../Qwen/source/getting_started/quickstart.md:122
#: c7b58160d10d43a2bb6e63572dbeff46
msgid "Then, you can use the [create chat interface](https://platform.openai.com/docs/api-reference/chat/completions/create) to communicate with Qwen:"
msgstr "然后,可以使用 [\"create chat\" interface](https://platform.openai.com/docs/api-reference/chat/completions/create>) 来与 Qwen 进行交流:"
#: ../../Qwen/source/getting_started/quickstart.md
#: 8f4c1e3692a34137ad9fbf6d7a50969c c685b92ca0ea49c0b3925b24cd43317c
msgid "curl"
msgstr ""
#: ../../Qwen/source/getting_started/quickstart.md
#: 147be07b6f3141c08f8c707a9f06403c ffc3d81775264a00ad0d7bcb85ff6caf
msgid "Python"
msgstr ""
#: ../../Qwen/source/getting_started/quickstart.md:142
#: ../../Qwen/source/getting_started/quickstart.md:192
#: 9a1026d8cf10458b8a3e717e105e8d5e ed7621681c36472a90b4be9c1fe98355
msgid "You can use the API client with the `openai` Python SDK as shown below:"
msgstr "您可以按照下面所示的方式,使用 `openai` Python SDK中的客户端:"
#: ../../Qwen/source/getting_started/quickstart.md:169
#: a5ae1f193b044cb982e5ea4d98b30afb
msgid "While the soft switch is always available, the hard switch is also availabe in vLLM through the following configuration to the API call. To disable thinking, use"
msgstr "虽然软开关始终可用,但硬开关也可以通过以下 API 调用配置在 vLLM 中使用。要禁用思考,请使用"
#: ../../Qwen/source/getting_started/quickstart.md:221
#: a200dc6f700d40f89e22d7745a5f01f0
msgid "Next Step"
msgstr "下一步"
#: ../../Qwen/source/getting_started/quickstart.md:223
#: e22d4b679b36490fb4877ae01bfb515a
msgid "Now, you can have fun with Qwen3 models. Would love to know more about its usage? Feel free to check other documents in this documentation."
msgstr "现在,您可以尽情探索 Qwen3 模型的各种用途。若想了解更多,请随时查阅本文档中的其他内容。"
#~ msgid "Hugging Face Transformers & ModelScope"
#~ msgstr ""
#~ msgid "Install with `pip`:"
#~ msgstr "使用 `pip` 安装:"
#~ msgid "Install with `conda`:"
#~ msgstr "使用 `conda` 安装:"
#~ msgid "Install from source:"
#~ msgstr "从源代码安装:"
#~ msgid "As you can see, it's just standard usage for casual LMs in `transformers`!"
#~ msgstr "如您所见,与 `transformers` 的常规使用方式无二!"
#~ msgid "Streaming Generation"
#~ msgstr "流式生成"
#~ msgid "Streaming mode for model chat is simple with the help of `TextStreamer`. Below we show you an example of how to use it:"
#~ msgstr "借助 `TextStreamer` , 模型生成的流式模式变得非常简单。下面我们将展示一个如何使用它的示例:"
#~ msgid "It will print the text to the console or the terminal as being generated."
#~ msgstr "命令行或终端中将屏显生成的文本。"
#~ msgid "vLLM for Deployment"
#~ msgstr "使用vLLM部署"
#~ msgid "with `vllm>=0.5.3`, you can also use"
#~ msgstr "如 `vllm>=0.5.3` ,也可以如下启动:"
#~ msgid "For more information, please refer to [the documentation of `vllm`](https://docs.vllm.ai/en/stable/)."
#~ msgstr "欲获取更多信息,请参考 [`vllm` 文档](https://docs.vllm.ai/en/stable/)。"
# Copyright (C) 2024, Qwen Team, Alibaba Group.
# This file is distributed under the same license as the Qwen package.
#
msgid ""
msgstr ""
"Project-Id-Version: Qwen \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-04-28 19:42+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../Qwen/source/index.rst:34
msgid "Getting Started"
msgstr "快速开始"
#: ../../Qwen/source/index.rst:44
msgid "Inference"
msgstr "推理"
#: ../../Qwen/source/index.rst:51
msgid "Run Locally"
msgstr "本地运行"
#: ../../Qwen/source/index.rst:60
msgid "Deployment"
msgstr "部署"
#: ../../Qwen/source/index.rst:71
msgid "Quantization"
msgstr "量化"
#: ../../Qwen/source/index.rst:80
msgid "Training"
msgstr "训练"
#: ../../Qwen/source/index.rst:87
msgid "Framework"
msgstr "框架"
#: ../../Qwen/source/index.rst:2 6e52d3a497924f828d4c6b9dd59370d5
msgid "Welcome to Qwen!"
msgstr "欢迎来到Qwen"
#: ../../Qwen/source/index.rst:4 235805a6d4a34184821c0f4f81020ef1
msgid "Qwen3"
msgstr ""
#: ../../Qwen/source/index.rst:11 b8a3aa3f31594232959a08d89e9dc7db
msgid "Qwen is the large language model and large multimodal model series of the Qwen Team, Alibaba Group. Both language models and multimodal models are pretrained on large-scale multilingual and multimodal data and post-trained on quality data for aligning to human preferences. Qwen is capable of natural language understanding, text generation, vision understanding, audio understanding, tool use, role play, playing as AI agent, etc."
msgstr "Qwen是阿里巴巴集团Qwen团队研发的大语言模型和大型多模态模型系列。无论是语言模型还是多模态模型,均在大规模多语言和多模态数据上进行预训练,并通过高质量数据进行后期微调以贴近人类偏好。Qwen具备自然语言理解、文本生成、视觉理解、音频理解、工具使用、角色扮演、作为AI Agent进行互动等多种能力。"
#: ../../Qwen/source/index.rst:14 8735c67355064a97b2793b721a701b21
msgid "The latest version, Qwen3, has the following features:"
msgstr "最新版本Qwen3有以下特点:"
#: ../../Qwen/source/index.rst:16 1956d75084244379aad9503fcc572f00
msgid "**Dense and Mixture-of-Experts (MoE) models**, available in 0.6B, 1.7B, 4B, 8B, 14B, 32B and 30B-A3B, 235B-A22B."
msgstr "**全尺寸稠密与混合专家模型**:0.6B, 1.7B, 4B, 8B, 14B, 32B and 30B-A3B, 235B-A22B"
#: ../../Qwen/source/index.rst:17 1fdf12161cd14663b67b2c08f9219ddb
msgid "**Seamless switching between thinking mode** (for complex logical reasoning, math, and coding) and **non-thinking mode** (for efficient, general-purpose chat) **within a single model**, ensuring optimal performance across various scenarios."
msgstr "支持在**思考模式**(用于复杂逻辑推理、数学和编码)和 **非思考模式** (用于高效通用对话)之间**无缝切换**,确保在各种场景下的最佳性能。"
#: ../../Qwen/source/index.rst:18 189ff2a03ad249ef88202c34e9f8aa86
msgid "**Significantly enhancement in reasoning capabilities**, surpassing previous QwQ (in thinking mode) and Qwen2.5 instruct models (in non-thinking mode) on mathematics, code generation, and commonsense logical reasoning."
msgstr "**显著增强的推理能力**,在数学、代码生成和常识逻辑推理方面超越了之前的 QwQ(在思考模式下)和 Qwen2.5 指令模型(在非思考模式下)。"
#: ../../Qwen/source/index.rst:19 64ebcda0381148cb8edf8d92b49469ea
msgid "**Superior human preference alignment**, excelling in creative writing, role-playing, multi-turn dialogues, and instruction following, to deliver a more natural, engaging, and immersive conversational experience."
msgstr "**卓越的人类偏好对齐**,在创意写作、角色扮演、多轮对话和指令跟随方面表现出色,提供更自然、更吸引人和更具沉浸感的对话体验。"
#: ../../Qwen/source/index.rst:20 ec0ebb91f1ed491f8672aefef6307d85
msgid "**Expertise in agent capabilities**, enabling precise integration with external tools in both thinking and unthinking modes and achieving leading performance among open-source models in complex agent-based tasks."
msgstr "**擅长智能体能力**,可以在思考和非思考模式下精确集成外部工具,在复杂的基于代理的任务中在开源模型中表现领先。"
#: ../../Qwen/source/index.rst:21 526b161edf284e1b913aabc7e7fcc77c
msgid "**Support of 100+ languages and dialects** with strong capabilities for **multilingual instruction following** and **translation**."
msgstr "**支持 100 多种语言和方言**,具有强大的多语言理解、推理、指令跟随和生成能力。"
#: ../../Qwen/source/index.rst:23 79ed3f0e7da043bb8b53f510ed244814
msgid "For more information, please visit our:"
msgstr "想了解更多信息,欢迎访问:"
#: ../../Qwen/source/index.rst:25 b2e579ae57de4d2985ab1c350fdf2458
msgid "`Blog <https://qwenlm.github.io/>`__"
msgstr "`博客 <https://qwenlm.github.io/>`__"
#: ../../Qwen/source/index.rst:26 406389fe90064e879bd28665a021ee7e
msgid "`GitHub <https://github.com/QwenLM>`__"
msgstr "`GitHub <https://github.com/QwenLM>`__"
#: ../../Qwen/source/index.rst:27 714c64df6aed4e608571de0155199fef
msgid "`Hugging Face <https://huggingface.co/Qwen>`__"
msgstr "`Hugging Face <https://huggingface.co/Qwen>`__"
#: ../../Qwen/source/index.rst:28 214e12e0b1c04b268582b2c46d22334d
msgid "`ModelScope <https://modelscope.cn/organization/qwen>`__"
msgstr "`ModelScope <https://modelscope.cn/organization/qwen>`__"
#: ../../Qwen/source/index.rst:29 9c64e461dc3a440ab92d94887fe3d2d8
msgid "`Qwen3 Collection <https://huggingface.co/collections/Qwen/qwen3-67dd247413f0e2e4f653967f>`__"
msgstr ""
#: ../../Qwen/source/index.rst:31 c6056edc8a3a4a12bd3a75eeb210f7a2
msgid "Join our community by joining our `Discord <https://discord.gg/yPEP2vHTu4>`__ and `WeChat <https://github.com/QwenLM/Qwen/blob/main/assets/wechat.png>`__ group. We are looking forward to seeing you there!"
msgstr "加入社区,加入 `Discord <https://discord.gg/yPEP2vHTu4>`__ 和 `微信群 <https://github.com/QwenLM/Qwen/blob/main/assets/wechat.png>`__ 。很期待见到你们!"
#~ msgid "Web UI"
#~ msgstr "Web UI"
#~ msgid "Benchmark"
#~ msgstr "评测"
#~ msgid "Qwen2.5"
#~ msgstr ""
#~ msgid "Dense, easy-to-use, decoder-only language models, available in **0.5B**, **1.5B**, **3B**, **7B**, **14B**, **32B**, and **72B** sizes, and base and instruct variants."
#~ msgstr "易于使用的仅解码器稠密语言模型,提供 **0.5B** 、**1.5B** 、**3B** 、**7B** 、**14B** 、**32B** 和 **72B** 共7种参数规模的模型,并且有基模型和指令微调模型两种变体(其中“ B ”表示“十亿”, 72B 即为 720 亿)"
#~ msgid "Pretrained on our latest large-scale dataset, encompassing up to **18T** tokens."
#~ msgstr "利用我们最新的数据集进行预训练,包含多达 18T tokens (其中“ T ”表示“万亿”, 18T 即为 18 万亿)"
#~ msgid "Significant improvements in instruction following, generating long texts (over 8K tokens), understanding structured data (e.g, tables), and generating structured outputs especially JSON."
#~ msgstr "在遵循指令、生成长文本(超过 8K tokens )、理解结构化数据(例如,表格)以及生成结构化输出特别是 JSON 方面有了显著改进"
#~ msgid "More resilient to the diversity of system prompts, enhancing role-play implementation and condition-setting for chatbots."
#~ msgstr "更加适应多样化的系统提示,增强了角色扮演的实现和聊天机器人的背景设置。"
#~ msgid "Context length support up to **128K** tokens and can generate up to **8K** tokens."
#~ msgstr "支持最多达 **128K** tokens 的上下文长度,并能生成多达 **8K** tokens 的文本。"
#~ msgid "Multilingual support for over **29** languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more."
#~ msgstr "支持超过 **29** 种语言,包括中文、英文、法文、西班牙文、葡萄牙文、德文、意大利文、俄文、日文、韩文、越南文、泰文、阿拉伯文等。"
#~ msgid "`Qwen2.5 Collection <https://huggingface.co/collections/Qwen/qwen25-66e81a666513e518adb90d9e>`__"
#~ msgstr ""
# Copyright (C) 2024, Qwen Team, Alibaba Group.
# This file is distributed under the same license as the Qwen package.
#
msgid ""
msgstr ""
"Project-Id-Version: Qwen \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-04-28 19:42+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../Qwen/source/quantization/awq.md:1 363514c3e24c4d2aa54832e85acf34ef
msgid "AWQ"
msgstr "AWQ"
#: ../../Qwen/source/quantization/awq.md:4 36b5c0de1013499f9f1e41edf8fa28ca
msgid "To be updated for Qwen3."
msgstr "仍需为Qwen3更新。"
#: ../../Qwen/source/quantization/awq.md:7 9d6a80a82b044628bc9c911785ac9160
msgid "For quantized models, one of our recommendations is the usage of [AWQ](https://arxiv.org/abs/2306.00978) with [AutoAWQ](https://github.com/casper-hansen/AutoAWQ)."
msgstr "对于量化模型,我们推荐使用 [AWQ](https://arxiv.org/abs/2306.00978) 结合 [AutoAWQ](https://github.com/casper-hansen/AutoAWQ) "
#: ../../Qwen/source/quantization/awq.md:9 139542ed4b414cfb834b3fd81ea88d51
msgid "**AWQ** refers to Activation-aware Weight Quantization, a hardware-friendly approach for LLM low-bit weight-only quantization."
msgstr "**AWQ**即激活值感知的权重量化(Activation-aware Weight Quantization),是一种针对LLM的低比特权重量化的硬件友好方法。"
#: ../../Qwen/source/quantization/awq.md:11 9a2959bb9f984e36a299bc40abca9402
msgid "**AutoAWQ** is an easy-to-use Python library for 4-bit quantized models. AutoAWQ speeds up models by 3x and reduces memory requirements by 3x compared to FP16. AutoAWQ implements the Activation-aware Weight Quantization (AWQ) algorithm for quantizing LLMs."
msgstr "**AutoAWQ**是一个易于使用的工具包,用于4比特量化模型。相较于FP16,AutoAWQ能够将模型的运行速度提升3倍,并将内存需求降低至原来的三分之一。AutoAWQ实现了AWQ算法,可用于LLM的量化处理。"
#: ../../Qwen/source/quantization/awq.md:15 4f9fcd93d1f44b48869224c0f4e8b76a
msgid "In this document, we show you how to use the quantized model with Hugging Face `transformers` and also how to quantize your own model."
msgstr "在本文档中,我们将向您展示如何在Hugging Face `transformers`框架下使用量化模型,以及如何对您自己的模型进行量化"
#: ../../Qwen/source/quantization/awq.md:17 870ebc162f3749b48fe454df85aaaf4b
msgid "Usage of AWQ Models with Hugging Face transformers"
msgstr "在Hugging Face transformers中使用AWQ量化模型"
#: ../../Qwen/source/quantization/awq.md:19 cc7bd785c7ac45a4980fbda683699e43
msgid "Now, `transformers` has officially supported AutoAWQ, which means that you can directly use the quantized model with `transformers`. The following is a very simple code snippet showing how to run `Qwen2.5-7B-Instruct-AWQ` with the quantized model:"
msgstr "现在,`transformers`已经正式支持AutoAWQ,这意味着您可以直接在`transformers`中使用AWQ量化模型。以下是一个非常简单的代码片段,展示如何运行量化模型 `Qwen2.5-7B-Instruct-AWQ` :"
#: ../../Qwen/source/quantization/awq.md:56 47826d51abf54ad8a89ef9b91127a700
msgid "Usage of AWQ Models with vLLM"
msgstr "在vLLM中使用AWQ量化模型"
#: ../../Qwen/source/quantization/awq.md:58 b7235ae8f8344dd4a3d2029bbe7a40fc
msgid "vLLM has supported AWQ, which means that you can directly use our provided AWQ models or those quantized with `AutoAWQ` with vLLM. We recommend using the latest version of vLLM (`vllm>=0.6.1`) which brings performance improvements to AWQ models; otherwise, the performance might not be well-optimized."
msgstr "vLLM已支持AWQ,您可以直接使用我们提供的AWQ量化模型或使用`AutoAWQ`量化的模型。我们建议使用最新版的vLLM (`vllm>=0.6.1`),新版为AWQ量化模型提升了效率提;不然推理效率可能并为被良好优化(即效率可能较非量化模型低)。"
#: ../../Qwen/source/quantization/awq.md:61 940ce8fdb5da442b99af2bc1739911c6
msgid "Actually, the usage is the same with the basic usage of vLLM. We provide a simple example of how to launch OpenAI-API compatible API with vLLM and `Qwen2.5-7B-Instruct-AWQ`:"
msgstr "实际上,使用AWQ模型与vLLM的基本用法相同。我们提供了一个简单的示例,展示了如何通过vLLM启动与OpenAI API兼容的接口,并使用 `Qwen2.5-7B-Instruct-AWQ` 模型:"
#: ../../Qwen/source/quantization/awq.md:64 2d249915352049a6a8d5a06e1f4682ee
msgid "Run the following in a shell to start an OpenAI-compatible API service:"
msgstr "在终端中运行以下命令以开启OpenAI兼容API:"
#: ../../Qwen/source/quantization/awq.md:70 be7bfbb81698429cbfcbcd24d062fc08
msgid "Then, you can call the API as"
msgstr "随后,您可以这样调用API:"
#: ../../Qwen/source/quantization/awq.md:86 0dff7d5c7b044548a82e0ba68a043d80
msgid "or you can use the API client with the `openai` Python package as shown below:"
msgstr "或者你可以按照下面所示的方式,使用 `openai` Python包中的API客户端:"
#: ../../Qwen/source/quantization/awq.md:115 65f4d60502ee486382e9bda9a5a826bb
msgid "Quantize Your Own Model with AutoAWQ"
msgstr "使用AutoAWQ量化你的模型"
#: ../../Qwen/source/quantization/awq.md:117 c7c42af91c1a419194d65200bcfa2f26
msgid "If you want to quantize your own model to AWQ quantized models, we advise you to use AutoAWQ."
msgstr "如果您希望将自定义模型量化为AWQ量化模型,我们建议您使用AutoAWQ。"
#: ../../Qwen/source/quantization/awq.md:123 232e94883d044030b2193392788b9314
msgid "Suppose you have finetuned a model based on `Qwen2.5-7B`, which is named `Qwen2.5-7B-finetuned`, with your own dataset, e.g., Alpaca. To build your own AWQ quantized model, you need to use the training data for calibration. Below, we provide a simple demonstration for you to run:"
msgstr "假设你已经基于 `Qwen2.5-7B` 模型进行了微调,并将其命名为 `Qwen2.5-7B-finetuned` ,且使用的是你自己的数据集,比如Alpaca。若要构建你自己的AWQ量化模型,你需要使用训练数据进行校准。以下,我们将为你提供一个简单的演示示例以便运行:"
#: ../../Qwen/source/quantization/awq.md:141 5162195f32ee4ecba229aa137da1aba4
msgid "Then you need to prepare your data for calibration. What you need to do is just put samples into a list, each of which is a text. As we directly use our finetuning data for calibration, we first format it with ChatML template. For example,"
msgstr "接下来,您需要准备数据以进行校准。您需要做的就是将样本放入一个列表中,其中每个样本都是一段文本。由于我们直接使用微调数据来进行校准,所以我们首先使用ChatML模板对其进行格式化。例如:"
#: ../../Qwen/source/quantization/awq.md:153 0d4736e90e0242a8be15533de3aab6ff
msgid "where each `msg` is a typical chat message as shown below:"
msgstr "其中每个 `msg` 是一个典型的聊天消息,如下所示:"
#: ../../Qwen/source/quantization/awq.md:163 79d86630600945ac85dbe13d07987016
msgid "Then just run the calibration process by one line of code:"
msgstr "然后只需通过一行代码运行校准过程:"
#: ../../Qwen/source/quantization/awq.md:169 1ae219a50508465b98e3b3398e631681
msgid "Finally, save the quantized model:"
msgstr "最后,保存量化模型:"
#: ../../Qwen/source/quantization/awq.md:176 58316c1a4172418aba9f37925963e17f
msgid "Then you can obtain your own AWQ quantized model for deployment. Enjoy!"
msgstr "然后你就可以得到一个可以用于部署的AWQ量化模型。玩得开心!"
# Copyright (C) 2024, Qwen Team, Alibaba Group.
# This file is distributed under the same license as the Qwen package.
#
msgid ""
msgstr ""
"Project-Id-Version: Qwen \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2025-04-28 19:42+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
"Language-Team: zh_CN <LL@li.org>\n"
"Plural-Forms: nplurals=1; plural=0;\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.17.0\n"
#: ../../Qwen/source/quantization/gptq.md:1 c90397f810fb44a0abba8dd02f998f1c
msgid "GPTQ"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:4 b79afc46b0f9474fb0c83751625aefc5
msgid "To be updated for Qwen3."
msgstr "仍需为Qwen3更新。"
#: ../../Qwen/source/quantization/gptq.md:7 898494af2a944193880f27e2f90db4f4
msgid "[GPTQ](https://arxiv.org/abs/2210.17323) is a quantization method for GPT-like LLMs, which uses one-shot weight quantization based on approximate second-order information. In this document, we show you how to use the quantized model with Hugging Face `transformers` and also how to quantize your own model with [AutoGPTQ](https://github.com/AutoGPTQ/AutoGPTQ)."
msgstr "[GPTQ](https://arxiv.org/abs/2210.17323)是一种针对类GPT大型语言模型的量化方法,它基于近似二阶信息进行一次性权重量化。在本文档中,我们将向您展示如何使用 `transformers` 库加载并应用量化后的模型,同时也会指导您如何通过[AutoGPTQ](https://github.com/AutoGPTQ/AutoGPTQ)来对您自己的模型进行量化处理。"
#: ../../Qwen/source/quantization/gptq.md:10 11b82020735d4828a4182cefbf98aeb1
msgid "Usage of GPTQ Models with Hugging Face transformers"
msgstr "在Hugging Face transformers中使用GPTQ模型"
#: ../../Qwen/source/quantization/gptq.md:14 2e9481d850954772949dd33897e0b06b
msgid "To use the official Qwen2.5 GPTQ models with `transformers`, please ensure that `optimum>=1.20.0` and compatible versions of `transformers` and `auto_gptq` are installed."
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:16 fe6662a312184d40b07d957f4c0888cc
msgid "You can do that by"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:22 9f0ad8e2a26145cf8bd9d60305566771
msgid "Now, `transformers` has officially supported AutoGPTQ, which means that you can directly use the quantized model with `transformers`. For each size of Qwen2.5, we provide both Int4 and Int8 GPTQ quantized models. The following is a very simple code snippet showing how to run `Qwen2.5-7B-Instruct-GPTQ-Int4`:"
msgstr "现在,`transformers` 正式支持了AutoGPTQ,这意味着您能够直接在`transformers`中使用量化后的模型。以下是一个非常简单的代码片段示例,展示如何运行 `Qwen2.5-7B-Instruct-GPTQ-Int4` (请注意,对于每种大小的Qwen2.5模型,我们都提供了Int4和Int8两种量化版本):"
#: ../../Qwen/source/quantization/gptq.md:60 855686b8990f403bba151d8498947f23
msgid "Usage of GPTQ Models with vLLM"
msgstr "在vLLM中使用GPTQ模型"
#: ../../Qwen/source/quantization/gptq.md:62 ad572c30a0904598b3cbeba7c38a607a
msgid "vLLM has supported GPTQ, which means that you can directly use our provided GPTQ models or those trained with `AutoGPTQ` with vLLM. If possible, it will automatically use the GPTQ Marlin kernel, which is more efficient."
msgstr "vLLM已支持GPTQ,您可以直接使用我们提供的GPTQ量化模型或使用`AutoGPTQ`量化的模型。我们建议使用最新版的vLLM。如有可能,其会自动使用效率更好的GPTQ Marlin实现。"
#: ../../Qwen/source/quantization/gptq.md:65 09050876d2c04aee9b619d28d4f5589c
msgid "Actually, the usage is the same with the basic usage of vLLM. We provide a simple example of how to launch OpenAI-API compatible API with vLLM and `Qwen2.5-7B-Instruct-GPTQ-Int4`:"
msgstr "实际上,使用GPTQ模型与vLLM的基本用法相同。我们提供了一个简单的示例,展示了如何通过vLLM启动与OpenAI API兼容的接口,并使用 `Qwen2.5-7B-Instruct-GPTQ-Int4` 模型:"
#: ../../Qwen/source/quantization/gptq.md:68 a31dd879cc444b5da8d16fb1705585a6
msgid "Run the following in a shell to start an OpenAI-compatible API service:"
msgstr "在终端中运行以下命令以开启OpenAI兼容API:"
#: ../../Qwen/source/quantization/gptq.md:74 9dfb41e03089473792928b05b1225de4
msgid "Then, you can call the API as"
msgstr "随后,您可以这样调用API:"
#: ../../Qwen/source/quantization/gptq.md:90 6b440bebe0d84118bb63ed9a7c169ab5
msgid "or you can use the API client with the `openai` Python package as shown below:"
msgstr "或者你可以按照下面所示的方式,使用 `openai` Python包中的API客户端:"
#: ../../Qwen/source/quantization/gptq.md:119 7ffaa1ca8b4740b98dc3f804348da523
msgid "Quantize Your Own Model with AutoGPTQ"
msgstr "使用AutoGPTQ量化你的模型"
#: ../../Qwen/source/quantization/gptq.md:121 40bd0b11507c4f06be5a5918d0dc3bdb
msgid "If you want to quantize your own model to GPTQ quantized models, we advise you to use AutoGPTQ. It is suggested installing the latest version of the package by installing from source code:"
msgstr "如果你想将自定义模型量化为GPTQ量化模型,我们建议你使用AutoGPTQ工具。推荐通过安装源代码的方式获取并安装最新版本的该软件包。"
#: ../../Qwen/source/quantization/gptq.md:130 d6ebb03d51bf4e0686ae17ce3f0a34db
msgid "Suppose you have finetuned a model based on `Qwen2.5-7B`, which is named `Qwen2.5-7B-finetuned`, with your own dataset, e.g., Alpaca. To build your own GPTQ quantized model, you need to use the training data for calibration. Below, we provide a simple demonstration for you to run:"
msgstr "假设你已经基于 `Qwen2.5-7B` 模型进行了微调,并将该微调后的模型命名为 `Qwen2.5-7B-finetuned` ,且使用的是自己的数据集,比如Alpaca。要构建你自己的GPTQ量化模型,你需要使用训练数据进行校准。以下是一个简单的演示示例,供你参考运行:"
#: ../../Qwen/source/quantization/gptq.md:161 9c1b27cc38764332891a8a13175663fc
msgid "However, if you would like to load the model on multiple GPUs, you need to use `max_memory` instead of `device_map`. Here is an example:"
msgstr "但是,如果你想使用多GPU来读取模型,你需要使用 `max_memory` 而不是 `device_map`。下面是一段示例代码:"
#: ../../Qwen/source/quantization/gptq.md:172 c2a9a50734854c19acf3e623597aee80
msgid "Then you need to prepare your data for calibration. What you need to do is just put samples into a list, each of which is a text. As we directly use our finetuning data for calibration, we first format it with ChatML template. For example,"
msgstr "接下来,你需要准备数据进行校准。你需要做的是将样本放入一个列表中,其中每个样本都是一段文本。由于我们直接使用微调数据进行校准,所以我们首先使用ChatML模板对它进行格式化处理。例如:"
#: ../../Qwen/source/quantization/gptq.md:188 7621f73d34d04dd791d2eda03edb0d06
msgid "where each `msg` is a typical chat message as shown below:"
msgstr "其中每个 `msg` 是一个典型的聊天消息,如下所示:"
#: ../../Qwen/source/quantization/gptq.md:198 293efa14ece74a0aa9cbf32ef21e6bcd
msgid "Then just run the calibration process by one line of code:"
msgstr "然后只需通过一行代码运行校准过程:"
#: ../../Qwen/source/quantization/gptq.md:209 919d7a77cc4a4ef084ee8e2240ff1797
msgid "Finally, save the quantized model:"
msgstr "最后,保存量化模型:"
#: ../../Qwen/source/quantization/gptq.md:216 b353bdf12d6148fdb0a77662f795ae7e
msgid "It is unfortunate that the `save_quantized` method does not support sharding. For sharding, you need to load the model and use `save_pretrained` from transformers to save and shard the model. Except for this, everything is so simple. Enjoy!"
msgstr "很遗憾, `save_quantized` 方法不支持模型分片。若要实现模型分片,您需要先加载模型,然后使用来自 `transformers` 库的 `save_pretrained` 方法来保存并分片模型。除此之外,一切操作都非常简单。祝您使用愉快!"
#: ../../Qwen/source/quantization/gptq.md:222 caea6f76804e40daa394ae2e2d52a6ce
msgid "Known Issues"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:224 07df69bd48d4445887b5c1fa09f2f0fb
msgid "Qwen2.5-72B-Instruct-GPTQ-Int4 cannot stop generation properly"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:226
#: ../../Qwen/source/quantization/gptq.md:235 a4f1c7b0cb5d49f2929ba5d1246e885d
#: d2dbf88d06974152943e6ec405419390
msgid "Model"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:226 cb9c0be91ecc46c3b6ecfa97a0a37dd7
msgid "Qwen2.5-72B-Instruct-GPTQ-Int4"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:227
#: ../../Qwen/source/quantization/gptq.md:236 c1fe04754a0642fa82ed425d6abaa487
#: f3ff85cbbc47459fb36b5ad0e38b4a1b
msgid "Framework"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:227 8a5a4fe9d7634cb1ac65025565c3593a
msgid "vLLM, AutoGPTQ (including Hugging Face transformers)"
msgstr "vLLM、AutoGPTQ(包括 Hugging Face transformers)"
#: ../../Qwen/source/quantization/gptq.md:228
#: ../../Qwen/source/quantization/gptq.md:237 320d56294cc4490f8b30ac523388bc44
#: c04326d003f949a7b2b63c6c6cb20ac3
msgid "Description"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:228 22f80d0679dc426dbbfb21b90b993a27
msgid "Generation cannot stop properly. Continual generation after where it should stop, then repeated texts, either single character, a phrase, or paragraphs, are generated."
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:229
#: ../../Qwen/source/quantization/gptq.md:238 255a7a8ac98b4d2da51f79f207be5901
#: 673d23bf488840a2a32a18cd657f334f
msgid "Workaround"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:229 c2171874ed804ffb826ac686128d7bff
msgid "The following workaround could be considered"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:230 a59d6759991640609371bf7afd81e0b8
msgid "Using the original model in 16-bit floating point"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:231 97134ed43ee3414199928d755c24544e
msgid "Using the AWQ variants or llama.cpp-based models for reduced chances of abnormal generation"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:233 7c30819dea6c4cfb8eee98d0dd217bf9
msgid "Qwen2.5-32B-Instruct-GPTQ-Int4 broken with vLLM on multiple GPUs"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:235 a4a641abd99a47049c1fd172e9cfa2be
msgid "Qwen2.5-32B-Instruct-GPTQ-Int4"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:236 70216327dda349cabf03412f5fbe3114
msgid "vLLM"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:237 8edf21882ff24358b736c73477cfba9d
msgid "Deployment on multiple GPUs and only garbled text like `!!!!!!!!!!!!!!!!!!` could be generated."
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:238 10d9d8b3d8e74afea5ccd79bc698fb7c
msgid "Each of the following workaround could be considered"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:239 33d1632f26f9423c847d06af7a5d107d
msgid "Using the AWQ or GPTQ-Int8 variants"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:240 b27f1f32637349d09b8c74a2041a4d9b
msgid "Using a single GPU"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:241 fc27883584a04682b9e28b2ccf51dc0e
msgid "Using Hugging Face `transformers` if latency and throughput are not major concerns"
msgstr ""
#: ../../Qwen/source/quantization/gptq.md:244 5664e5bd63c845d49e8cfa75e789dfa3
msgid "Troubleshooting"
msgstr "问题排查"
#: ../../Qwen/source/quantization/gptq.md 06f2358881134920ab43f4256ad6300e
msgid "With `transformers` and `auto_gptq`, the logs suggest `CUDA extension not installed.` and the inference is slow."
msgstr "在使用 `transformers` 和 `auto_gptq` 时,日志提示 `CUDA extension not installed.` 并且推理速度缓慢。"
#: ../../Qwen/source/quantization/gptq.md:248 2d57d681b2d74c27b60523fa86676b6f
msgid "`auto_gptq` fails to find a fused CUDA kernel compatible with your environment and falls back to a plain implementation. Follow its [installation guide](https://github.com/AutoGPTQ/AutoGPTQ/blob/main/docs/INSTALLATION.md) to install a pre-built wheel or try installing `auto_gptq` from source."
msgstr "`auto_gptq` 未能找到与您的环境兼容的融合CUDA算子,因此退回到基础实现。请遵循其 [安装指南](https://github.com/AutoGPTQ/AutoGPTQ/blob/main/docs/INSTALLATION.md) 来安装预构建的 wheel 或尝试从源代码安装 `auto_gptq` 。"
#: ../../Qwen/source/quantization/gptq.md 95b57d1a962c4dc7aa02a69a403e2376
msgid "Self-quantized Qwen2.5-72B-Instruct-GPTQ with `vllm`, `ValueError: ... must be divisible by ...` is raised. The intermediate size of the self-quantized model is different from the official Qwen2.5-72B-Instruct-GPTQ models."
msgstr "`vllm` 使用自行量化的 Qwen2.5-72B-Instruct-GPTQ 时,会引发 `ValueError: ... must be divisible by ...` 错误。自量化的模型的 intermediate size 与官方的 Qwen2.5-72B-Instruct-GPTQ 模型不同。"
#: ../../Qwen/source/quantization/gptq.md:255 ecd9b51a549045949ff18fdb6226ddc8
#, python-brace-format
msgid "After quantization the size of the quantized weights are divided by the group size, which is typically 128. The intermediate size for the FFN blocks in Qwen2.5-72B is 29568. Unfortunately, {math}`29568 \\div 128 = 231`. Since the number of attention heads and the dimensions of the weights must be divisible by the tensor parallel size, it means you can only run the quantized model with `tensor_parallel_size=1`, i.e., one GPU card."
msgstr "量化后,量化权重的大小将被 group size(通常为128)整除。Qwen2-72B 中FFN块的中间大小为29568。不幸的是, {math}`29568 \\div 128 = 231` 。由于注意力头的数量和权重的维度必须能够被张量并行大小整除,这意味着你只能使用 `tensor_parallel_size=1` ,即一张 GPU 卡,来运行量化的模型。"
#: ../../Qwen/source/quantization/gptq.md:260 8b1c5e3934654679a2d85e3287cf9309
#, python-brace-format
msgid "A workaround is to make the intermediate size divisible by {math}`128 \\times 8 = 1024`. To achieve that, the weights should be padded with zeros. While it is mathematically equivalent before and after zero-padding the weights, the results may be slightly different in reality."
msgstr "一个解决方案是使中间大小能够被 {math}`128 \\times 8 = 1024` 整除。为了达到这一目的,应该使用零值对权重进行填充。虽然在数学上,在对权重进行零填充前后是等价的,但在现实中结果可能会略有不同。"
#: ../../Qwen/source/quantization/gptq.md:264 ae904f7ab91340c4a6831aef4de643ba
msgid "Try the following:"
msgstr "尝试以下方法:"
#: ../../Qwen/source/quantization/gptq.md:297 4cf8c516a2324e618d25333c84be9e6b
msgid "This will save the padded checkpoint to the specified directory. Then, copy other files from the original checkpoint to the new directory and modify the `intermediate_size` in `config.json` to `29696`. Finally, you can quantize the saved model checkpoint."
msgstr "这将会把填充后的检查点保存到指定的目录。然后,你需要从原始检查点复制其他文件到新目录,并将 `config.json` 中的 `intermediate_size` 修改为 `29696` 。最后,你可以量化保存的模型检查点。"
furo
myst-parser==4.0.0
sphinx<8,>4.5.0
sphinx-copybutton
sphinx-design>=0.6.0