Commit 1768a324 authored by dengjb

update codes

parent 18493eef
"""
coding : utf-8
@Date : 2024/7/10
@Author : Shaobo
@Describe:
"""
from models.codegeex import CodegeexChatModel
model: CodegeexChatModel
def stream_chat_with_codegeex(request):
yield from model.stream_chat(request)
def chat_with_codegeex(request):
return model.chat(request)
def init_model(args):
global model
model = CodegeexChatModel(args)
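A minimal usage sketch of the wrappers above, assuming the real `CodegeexChatModel` is unavailable: `StubChatModel` is a hypothetical stand-in so the global-init pattern can be exercised end to end.

```python
# Hypothetical stand-in for CodegeexChatModel, so the wrapper's
# global-init pattern can be exercised without the real model weights.
class StubChatModel:
    def __init__(self, args):
        self.args = args

    def chat(self, request):
        # Return the full response in one shot.
        return f"reply: {request}"

    def stream_chat(self, request):
        # Yield the response in chunks, as a streaming endpoint would.
        yield "re"
        yield "ply"


model = None


def init_model(args):
    # Called once at startup; the chat functions assume it already ran.
    global model
    model = StubChatModel(args)


def chat_with_codegeex(request):
    return model.chat(request)


def stream_chat_with_codegeex(request):
    yield from model.stream_chat(request)


init_model({"device": "cpu"})
print(chat_with_codegeex("hello"))                  # reply: hello
print("".join(stream_chat_with_codegeex("hello")))  # reply
```

The pattern mirrors how a web server would wire these functions: `init_model` runs once when the process starts, and request handlers only ever read the global.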
# The Most Powerful Versatile Code Model Under 10 Billion Parameters
CodeGeeX4-ALL-9B, the open-source version of the latest generation of the CodeGeeX4 series, builds on the strong language capabilities of GLM4 and significantly enhances code generation. A single CodeGeeX4-ALL-9B model supports comprehensive functionality, including code completion and generation, a code interpreter, web search, tool invocation, and repository-level long-code Q&A and generation, covering a wide range of programming and development scenarios. CodeGeeX4-ALL-9B achieves highly competitive performance on multiple authoritative code benchmarks, such as NaturalCodeBench and BigCodeBench. It is the most powerful model under 10 billion parameters, even surpassing general-purpose models several times its size, striking the best balance between inference performance and model quality.
## 1. BigCodeBench
BigCodeBench test results show that CodeGeeX4-ALL-9B performs best among models of comparable size:
![BigCodeBench Test Results](./pics/Bigcodebench.png)
## 2. NaturalCodeBench & HumanEval
NaturalCodeBench test results show that CodeGeeX4-ALL-9B achieves the best results in tasks such as code completion, code interpreter, code Q&A, code translation, and code repair:
![NaturalCodeBench Test Results](./pics/NCB&HUMANEVAL.png)
## 3. Code Needle In A Haystack
CodeGeeX4-ALL-9B's context handling capability has reached 128K, an 8-fold increase compared to the previous generation model!
For code large models under 10B parameters, accurately extracting information from massive amounts of code is a key challenge. CodeGeeX4-ALL-9B's upgraded support for 128K context enables it to process and utilize longer code files, and even information from project code, helping the model to understand complex and detail-rich code more deeply. Based on the longer context, CodeGeeX4-ALL-9B can handle more complex project-level tasks, accurately answering content from different code files and making modifications to the code even when the input length increases significantly.
In the "Needle In A Haystack" (NIAH) evaluation, the CodeGeeX4-ALL-9B model demonstrated its ability to embed and retrieve code within contexts up to 128K, achieving a 100% retrieval accuracy.
![NIAH_PYTHON Evaluation](./pics/NIAH_PYTHON.png)
![NIAH_ALL_FILES Evaluation](./pics/NIAH_ALL.png)
The figures above show results on a test set composed entirely of Python code, into which an assignment statement such as `zhipu_codemodel = "codegeex"` (the needle) is inserted; the model is then asked for the value of `zhipu_codemodel`. CodeGeeX4-ALL-9B completed the task with 100% accuracy.
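The evaluation setup described above can be sketched as follows; the filler code is synthetic, and the retrieval step is a simple scan standing in for the model's answer.

```python
import random

# The needle assignment hidden inside the haystack, as in the NIAH setup.
NEEDLE = 'zhipu_codemodel = "codegeex"'


def build_haystack(num_lines=10_000, seed=0):
    # Filler assignments form the haystack; the needle is inserted at a
    # random depth so retrieval cannot rely on position.
    rng = random.Random(seed)
    lines = [f"var_{i} = {i} * 2" for i in range(num_lines)]
    depth = rng.randrange(num_lines)
    lines.insert(depth, NEEDLE)
    return "\n".join(lines), depth


def answer_needle(context):
    # Stand-in for the model under test: scan the long context and report
    # the value bound to `zhipu_codemodel`.
    for line in context.splitlines():
        if line.startswith("zhipu_codemodel"):
            return line.split("=", 1)[1].strip().strip('"')
    return None


context, depth = build_haystack()
print(answer_needle(context))  # codegeex
```

In the real benchmark the context length and needle depth are both varied, and the "scan" is replaced by prompting the model with the full context and the question.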
## 4. Function Call Capabilities
CodeGeeX4-ALL-9B is currently the only code LLM that implements function-calling capabilities.
The Berkeley Function Calling Leaderboard is the first test set to comprehensively evaluate the function-calling capabilities of large models. Its AST dataset evaluates the model's ability to call Java, JavaScript, and Python functions; its Executable dataset evaluates function calling against real-world APIs.
![Berkeley Function Calling Leaderboard](./pics/FunctionCall.png)
CodeGeeX4-ALL-9B was comprehensively tested on the Berkeley Function Calling Leaderboard, covering various forms of function calls, different function-call scenarios, and function-call executability, achieving a call success rate of over 90% on both the AST and Executable test sets.
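The AST-style matching idea can be illustrated with Python's standard `ast` module: parse a generated call expression and compare the function name and keyword arguments against the expected ones. This is a simplified sketch, not the leaderboard's actual harness, and `get_weather` is a hypothetical function.

```python
import ast


def ast_match(generated_call, expected_name, expected_kwargs):
    """Check that a generated Python call invokes the expected function
    with the expected keyword arguments (a simplified AST-style check)."""
    node = ast.parse(generated_call, mode="eval").body
    # The generated text must be a single call to a plain function name.
    if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Name):
        return False
    if node.func.id != expected_name:
        return False
    # Compare keyword arguments by value, ignoring their order.
    got = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
    return got == expected_kwargs


print(ast_match('get_weather(city="Beijing", unit="celsius")',
                "get_weather",
                {"city": "Beijing", "unit": "celsius"}))  # True
```

Matching on the parsed tree rather than the raw string makes the check robust to cosmetic differences such as whitespace or argument order.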
## 5. Cross-File Completion
Cross-File Evaluation is a multilingual benchmark built on diverse real-world repositories in Python, Java, TypeScript, and C#. It uses a static-analysis-based method to strictly require cross-file context for accurate code completion.
| Model | Python EM | Python ES | Java EM | Java ES | TypeScript EM | TypeScript ES | C# EM | C# ES |
|------------------|-----------|-----------|---------|---------|---------------|---------------|-------|-------|
| DeepSeekCoder-7B | 29.9 | 62.9 | 39.8 | 74.8 | 39.0 | 77.0 | 52.2 | 78.1 |
| StarCoder2-7B | 25.3 | 58.0 | 31.4 | 67.4 | 33.3 | 73.2 | 43.5 | 69.8 |
| CodeLlama-7B | 23.5 | 53.5 | 33.9 | 68.4 | 11.5 | 71.5 | 50.6 | 75.4 |
| CodeGeeX-9B | 32.3 | 70.3 | 48.6 | 84.4 | 35.3 | 78.0 | 48.0 | 84.8 |
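For a single completion, the EM (exact match) and ES (edit similarity) columns in the table can be sketched with their usual definitions: EM as string equality and ES as a normalized Levenshtein similarity scaled to [0, 100]. This is an illustration of the metrics, not the benchmark's reference implementation.

```python
def exact_match(pred, ref):
    # EM: 1 if the prediction equals the reference after trimming whitespace.
    return int(pred.strip() == ref.strip())


def edit_similarity(pred, ref):
    # ES: 100 * (1 - Levenshtein distance / max length), via the classic
    # dynamic-programming recurrence over two rolling rows.
    m, n = len(pred), len(ref)
    if max(m, n) == 0:
        return 100.0
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cur[j] = min(prev[j] + 1,          # deletion
                         cur[j - 1] + 1,       # insertion
                         prev[j - 1] + (pred[i - 1] != ref[j - 1]))  # substitution
        prev = cur
    return 100.0 * (1 - prev[n] / max(m, n))


print(exact_match("return x + 1", "return x + 1"))                # 1
print(round(edit_similarity("return x + 1", "return x+1"), 1))    # 83.3
```

ES rewards near-misses that EM scores as zero, which is why the two columns can diverge sharply for the same model.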
# Unique model identifier
modelCode=793
# Model name
modelName=codegeex4_pytorch
# Model description
modelDescription=CodeGeeX4 is a multilingual code generation model continually trained on GLM-4-9B, significantly enhancing its code generation capabilities
# Application scenarios
appScenario=inference,training,code generation,manufacturing,energy,education
# Framework type
frameType=pytorch
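The model metadata above uses simple `key=value` lines with `#` comments. A minimal sketch of reading such a file in Python (the `sample` string is a hypothetical excerpt of the metadata above):

```python
def parse_properties(text):
    """Parse simple `key=value` metadata lines, skipping comments and blanks."""
    meta = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so values may themselves contain '='.
        key, _, value = line.partition("=")
        meta[key.strip()] = value.strip()
    return meta


sample = """# Model name
modelName=codegeex4_pytorch
frameType=pytorch
modelCode=793"""

meta = parse_properties(sample)
print(meta["modelName"])  # codegeex4_pytorch
```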
[project]
# Whether to enable telemetry (default: true). No personal data is collected.
enable_telemetry = false
# List of environment variables to be provided by each user to use the app.
user_env = []
# Duration (in seconds) during which the session is saved when the connection is lost
session_timeout = 3600
# Enable third-party caching (e.g. LangChain cache)
cache = false
# Authorized origins
allow_origins = ["*"]
# Follow symlink for asset mount (see https://github.com/Chainlit/chainlit/issues/317)
# follow_symlink = false
[features]
# Process and display HTML in messages. This can be a security risk (see https://stackoverflow.com/questions/19603097/why-is-it-dangerous-to-render-user-generated-html-or-javascript)
unsafe_allow_html = false
# Process and display mathematical expressions. This can clash with "$" characters in messages.
latex = false
# Automatically tag threads with the current chat profile (if a chat profile is used)
auto_tag_thread = true
# Authorize users to spontaneously upload files with messages
[features.spontaneous_file_upload]
enabled = false
accept = ["*/*"]
max_files = 20
max_size_mb = 500
[features.audio]
# Threshold for audio recording
min_decibels = -45
# Delay (in ms) before the user is expected to start speaking
initial_silence_timeout = 3000
# Delay (in ms) for the user to continue speaking. If the user stops speaking for this duration, the recording will stop.
silence_timeout = 1500
# Above this duration (in ms), the recording will forcefully stop.
max_duration = 15000
# Duration of the audio chunks in MS
chunk_duration = 1000
# Sample rate of the audio
sample_rate = 44100
[UI]
# Name of the assistant.
name = "CodeGeeX4 RepoDemo"
# Description of the assistant. This is used for HTML tags.
description = "CodeGeeX4项目级能力展示"
# Large size content are by default collapsed for a cleaner ui
default_collapse_content = true
# Hide the chain of thought details from the user in the UI.
hide_cot = false
# Link to your github repo. This will add a github button in the UI's header.
github = "https://github.com/CodeGeeX"
# Specify a CSS file that can be used to customize the user interface.
# The CSS file can be served from the public directory or via an external link.
# custom_css = "/public/test.css"
# Specify a Javascript file that can be used to customize the user interface.
# The Javascript file can be served from the public directory.
# custom_js = "/public/test.js"
# Specify a custom font url.
# custom_font = "https://fonts.googleapis.com/css2?family=Inter:wght@400;500;700&display=swap"
# Specify a custom meta image url.
custom_meta_image_url = "/public/logo_dark.png"
# Specify a custom build directory for the frontend.
# This can be used to customize the frontend code.
# Be careful: If this is a relative path, it should not start with a slash.
# custom_build = "./public/build"
[UI.theme]
default = "dark"
layout = "wide"
#font_family = "Inter, sans-serif"
# Override default MUI light theme. (Check theme.ts)
[UI.theme.light]
#background = "#FAFAFA"
#paper = "#FFFFFF"
[UI.theme.light.primary]
#main = "#F80061"
#dark = "#980039"
#light = "#FFE7EB"
[UI.theme.light.text]
#primary = "#212121"
#secondary = "#616161"
# Override default MUI dark theme. (Check theme.ts)
[UI.theme.dark]
#background = "#FAFAFA"
#paper = "#FFFFFF"
[UI.theme.dark.primary]
#main = "#F80061"
#dark = "#980039"
#light = "#FFE7EB"
[UI.theme.dark.text]
#primary = "#EEEEEE"
#secondary = "#BDBDBD"
[meta]
generated_by = "1.1.305"
openai_api_key = ""
openai_api_base = "https://open.bigmodel.cn/api/paas/v4/"
model_name = "codegeex-4"
bing_api_key = ""