Update files

Commit 7d06d0f9 authored Jun 26, 2025 by yangzhong
20 changed files
--- a/CITATION.cff
+++ b/CITATION.cff
+cff-version: 1.2.0
+message: "If you find our resources useful, please cite our paper as below."
+authors:
+- family-names: "Cui"
+  given-names: "Yiming"
+  orcid: "https://orcid.org/0000-0002-2452-375X"
+- family-names: "Yang"
+  given-names: "Ziqing"
+- family-names: "Yao"
+  given-names: "Xin"  
+title: "Chinese LLaMA and Alpaca 2"
+version: 1.0
+date-released: 2023-07-28
+url: "https://github.com/ymcui/Chinese-LLaMA-Alpaca-2"
+preferred-citation: 
+  type: article
+  authors:
+  - family-names: "Cui"
+    given-names: "Yiming"
+    orcid: "https://orcid.org/0000-0002-2452-375X"
+  - family-names: "Yang"
+    given-names: "Ziqing"
+  - family-names: "Yao"
+    given-names: "Xin"  
+  title: "Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca"
+  journal: "arXiv pre-print"
+  year: 2023
+  url: "https://arxiv.org/abs/2304.08177"
\ No newline at end of file
--- a/LICENSE
+++ b/LICENSE
+                                 Apache License
+                           Version 2.0, January 2004
+                        http://www.apache.org/licenses/
+   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+   1. Definitions.
+      "License" shall mean the terms and conditions for use, reproduction,
+      and distribution as defined by Sections 1 through 9 of this document.
+      "Licensor" shall mean the copyright owner or entity authorized by
+      the copyright owner that is granting the License.
+      "Legal Entity" shall mean the union of the acting entity and all
+      other entities that control, are controlled by, or are under common
+      control with that entity. For the purposes of this definition,
+      "control" means (i) the power, direct or indirect, to cause the
+      direction or management of such entity, whether by contract or
+      otherwise, or (ii) ownership of fifty percent (50%) or more of the
+      outstanding shares, or (iii) beneficial ownership of such entity.
+      "You" (or "Your") shall mean an individual or Legal Entity
+      exercising permissions granted by this License.
+      "Source" form shall mean the preferred form for making modifications,
+      including but not limited to software source code, documentation
+      source, and configuration files.
+      "Object" form shall mean any form resulting from mechanical
+      transformation or translation of a Source form, including but
+      not limited to compiled object code, generated documentation,
+      and conversions to other media types.
+      "Work" shall mean the work of authorship, whether in Source or
+      Object form, made available under the License, as indicated by a
+      copyright notice that is included in or attached to the work
+      (an example is provided in the Appendix below).
+      "Derivative Works" shall mean any work, whether in Source or Object
+      form, that is based on (or derived from) the Work and for which the
+      editorial revisions, annotations, elaborations, or other modifications
+      represent, as a whole, an original work of authorship. For the purposes
+      of this License, Derivative Works shall not include works that remain
+      separable from, or merely link (or bind by name) to the interfaces of,
+      the Work and Derivative Works thereof.
+      "Contribution" shall mean any work of authorship, including
+      the original version of the Work and any modifications or additions
+      to that Work or Derivative Works thereof, that is intentionally
+      submitted to Licensor for inclusion in the Work by the copyright owner
+      or by an individual or Legal Entity authorized to submit on behalf of
+      the copyright owner. For the purposes of this definition, "submitted"
+      means any form of electronic, verbal, or written communication sent
+      to the Licensor or its representatives, including but not limited to
+      communication on electronic mailing lists, source code control systems,
+      and issue tracking systems that are managed by, or on behalf of, the
+      Licensor for the purpose of discussing and improving the Work, but
+      excluding communication that is conspicuously marked or otherwise
+      designated in writing by the copyright owner as "Not a Contribution."
+      "Contributor" shall mean Licensor and any individual or Legal Entity
+      on behalf of whom a Contribution has been received by Licensor and
+      subsequently incorporated within the Work.
+   2. Grant of Copyright License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      copyright license to reproduce, prepare Derivative Works of,
+      publicly display, publicly perform, sublicense, and distribute the
+      Work and such Derivative Works in Source or Object form.
+   3. Grant of Patent License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      (except as stated in this section) patent license to make, have made,
+      use, offer to sell, sell, import, and otherwise transfer the Work,
+      where such license applies only to those patent claims licensable
+      by such Contributor that are necessarily infringed by their
+      Contribution(s) alone or by combination of their Contribution(s)
+      with the Work to which such Contribution(s) was submitted. If You
+      institute patent litigation against any entity (including a
+      cross-claim or counterclaim in a lawsuit) alleging that the Work
+      or a Contribution incorporated within the Work constitutes direct
+      or contributory patent infringement, then any patent licenses
+      granted to You under this License for that Work shall terminate
+      as of the date such litigation is filed.
+   4. Redistribution. You may reproduce and distribute copies of the
+      Work or Derivative Works thereof in any medium, with or without
+      modifications, and in Source or Object form, provided that You
+      meet the following conditions:
+      (a) You must give any other recipients of the Work or
+          Derivative Works a copy of this License; and
+      (b) You must cause any modified files to carry prominent notices
+          stating that You changed the files; and
+      (c) You must retain, in the Source form of any Derivative Works
+          that You distribute, all copyright, patent, trademark, and
+          attribution notices from the Source form of the Work,
+          excluding those notices that do not pertain to any part of
+          the Derivative Works; and
+      (d) If the Work includes a "NOTICE" text file as part of its
+          distribution, then any Derivative Works that You distribute must
+          include a readable copy of the attribution notices contained
+          within such NOTICE file, excluding those notices that do not
+          pertain to any part of the Derivative Works, in at least one
+          of the following places: within a NOTICE text file distributed
+          as part of the Derivative Works; within the Source form or
+          documentation, if provided along with the Derivative Works; or,
+          within a display generated by the Derivative Works, if and
+          wherever such third-party notices normally appear. The contents
+          of the NOTICE file are for informational purposes only and
+          do not modify the License. You may add Your own attribution
+          notices within Derivative Works that You distribute, alongside
+          or as an addendum to the NOTICE text from the Work, provided
+          that such additional attribution notices cannot be construed
+          as modifying the License.
+      You may add Your own copyright statement to Your modifications and
+      may provide additional or different license terms and conditions
+      for use, reproduction, or distribution of Your modifications, or
+      for any such Derivative Works as a whole, provided Your use,
+      reproduction, and distribution of the Work otherwise complies with
+      the conditions stated in this License.
+   5. Submission of Contributions. Unless You explicitly state otherwise,
+      any Contribution intentionally submitted for inclusion in the Work
+      by You to the Licensor shall be under the terms and conditions of
+      this License, without any additional terms or conditions.
+      Notwithstanding the above, nothing herein shall supersede or modify
+      the terms of any separate license agreement you may have executed
+      with Licensor regarding such Contributions.
+   6. Trademarks. This License does not grant permission to use the trade
+      names, trademarks, service marks, or product names of the Licensor,
+      except as required for reasonable and customary use in describing the
+      origin of the Work and reproducing the content of the NOTICE file.
+   7. Disclaimer of Warranty. Unless required by applicable law or
+      agreed to in writing, Licensor provides the Work (and each
+      Contributor provides its Contributions) on an "AS IS" BASIS,
+      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+      implied, including, without limitation, any warranties or conditions
+      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+      PARTICULAR PURPOSE. You are solely responsible for determining the
+      appropriateness of using or redistributing the Work and assume any
+      risks associated with Your exercise of permissions under this License.
+   8. Limitation of Liability. In no event and under no legal theory,
+      whether in tort (including negligence), contract, or otherwise,
+      unless required by applicable law (such as deliberate and grossly
+      negligent acts) or agreed to in writing, shall any Contributor be
+      liable to You for damages, including any direct, indirect, special,
+      incidental, or consequential damages of any character arising as a
+      result of this License or out of the use or inability to use the
+      Work (including but not limited to damages for loss of goodwill,
+      work stoppage, computer failure or malfunction, or any and all
+      other commercial damages or losses), even if such Contributor
+      has been advised of the possibility of such damages.
+   9. Accepting Warranty or Additional Liability. While redistributing
+      the Work or Derivative Works thereof, You may choose to offer,
+      and charge a fee for, acceptance of support, warranty, indemnity,
+      or other liability obligations and/or rights consistent with this
+      License. However, in accepting such obligations, You may act only
+      on Your own behalf and on Your sole responsibility, not on behalf
+      of any other Contributor, and only if You agree to indemnify,
+      defend, and hold each Contributor harmless for any liability
+      incurred by, or claims asserted against, such Contributor by reason
+      of your accepting any such warranty or additional liability.
+   END OF TERMS AND CONDITIONS
+   APPENDIX: How to apply the Apache License to your work.
+      To apply the Apache License to your work, attach the following
+      boilerplate notice, with the fields enclosed by brackets "[]"
+      replaced with your own identifying information. (Don't include
+      the brackets!)  The text should be enclosed in the appropriate
+      comment syntax for the file format. We also recommend that a
+      file or class name and description of purpose be included on the
+      same "printed page" as the copyright notice for easier
+      identification within third-party archives.
+   Copyright 2023 Yiming Cui, Ziqing Yang, Xin Yao
+   Licensed under the Apache License, Version 2.0 (the "License");
+   you may not use this file except in compliance with the License.
+   You may obtain a copy of the License at
+       http://www.apache.org/licenses/LICENSE-2.0
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
\ No newline at end of file
--- a/README_EN.md
+++ b/README_EN.md
--- a/download_model.py
+++ b/download_model.py
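+import os
+# An empty CURL_CA_BUNDLE is a common workaround that turns off TLS certificate
+# verification in some versions of `requests`; use it only on trusted networks.
+os.environ['CURL_CA_BUNDLE'] = ''
+# Route Hugging Face Hub traffic through a mirror. Set before importing huggingface_hub.
+os.environ['HF_ENDPOINT'] = 'https://hf-mirror.com'
+from huggingface_hub import snapshot_download
+# Download the whole model repository into ./pre_model.
+snapshot_download(repo_id="hfl/chinese-llama-2-7b", local_dir='./pre_model')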
--- a/examples/README.md
+++ b/examples/README.md
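+## Output Examples
+This directory provides reference output samples for the Chinese-Alpaca-2 models. They are meant to give users a quick impression of the models' output and to help check whether a downloaded model behaves as expected. The samples come from the question bank of the online model arena (10 categories in total), with 3 questions shown per category.
+- [Chinese-Alpaca-2-7B output samples](./alpaca-2-7b.md)
+- [Chinese-Alpaca-2-13B output samples](./alpaca-2-13b.md)
+**📊 Online model arena**: [http://llm-arena.ymcui.com](http://llm-arena.ymcui.com/)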
--- a/examples/alpaca-2-13b.md
+++ b/examples/alpaca-2-13b.md
--- a/examples/alpaca-2-7b.md
+++ b/examples/alpaca-2-7b.md
--- a/notebooks/gradio_web_demo.ipynb
+++ b/notebooks/gradio_web_demo.ipynb
--- a/prompts/README.md
+++ b/prompts/README.md
+## System Prompts
+### alpaca-2.txt (default)
+This file is the default system prompt used during training (the SFT phase). It is very simple, so responses may be slightly shorter than those of the 1st-gen Pro series models.
+### alpaca-2-long.txt
+This file is a sample system prompt that encourages longer responses. Users can adapt it to their needs, but we suggest keeping the original content of `alpaca-2.txt` and adding your customized instructions on top of it.
--- a/prompts/alpaca-2-long.txt
+++ b/prompts/alpaca-2-long.txt
+You are a helpful assistant. 你是一个乐于助人的助手。请你提供专业、有逻辑、内容真实、有价值的详细回复。
\ No newline at end of file
--- a/prompts/alpaca-2.txt
+++ b/prompts/alpaca-2.txt
+You are a helpful assistant. 你是一个乐于助人的助手。
\ No newline at end of file
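The two prompt files above contain only the system-prompt text. The following is a minimal sketch of how such a text might be combined with a user instruction. It assumes the Llama-2 chat layout (`[INST] <<SYS>> ... <</SYS>> ... [/INST]`), which this commit does not show, and `build_prompt` is a hypothetical helper rather than a function from the repository.

```python
# Hypothetical helper. The template is an assumption (Llama-2 chat layout),
# not something defined in this commit.
TEMPLATE = "[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{instruction} [/INST]"

# Same text as prompts/alpaca-2.txt.
DEFAULT_SYSTEM_PROMPT = "You are a helpful assistant. 你是一个乐于助人的助手。"


def build_prompt(instruction: str, system_prompt: str = DEFAULT_SYSTEM_PROMPT) -> str:
    """Wrap a system prompt and a user instruction into one prompt string."""
    return TEMPLATE.format(
        system_prompt=system_prompt.strip(),
        instruction=instruction.strip(),
    )


if __name__ == "__main__":
    print(build_prompt("Introduce Beijing in three sentences."))
```

To try the longer variant, pass the contents of `alpaca-2-long.txt` as `system_prompt`; following the README's advice, that text keeps the default prompt as its prefix.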
--- a/requirements.txt
+++ b/requirements.txt
+peft==0.3.0
+#torch==2.0.1
+transformers==4.35.0
+sentencepiece==0.1.99
+bitsandbytes==0.41.1
--- a/scripts/README.md
+++ b/scripts/README.md
+# Code and Scripts
+### training/
+Pre-training and instruction fine-tuning code. Wiki:
+- Pre-training: https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/pt_scripts_en (Chinese: https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/pt_scripts_zh)
+- Instruction fine-tuning: https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/sft_scripts_en (Chinese: https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/sft_scripts_zh)
+### inference/
+Inference using 🤗transformers. Wiki: https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/inference_with_transformers_en (Chinese: https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/inference_with_transformers_zh)
+### openai_server_demo/
+An OpenAI-API-style server implemented with fastapi. Wiki: https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/api_calls_en (Chinese: https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/api_calls_zh)
+### ceval/
+Evaluation script for C-Eval. Wiki: https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/ceval_en (Chinese: https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/ceval_zh)
+### cmmlu/
+Evaluation script for CMMLU. Wiki: https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/cmmlu_en (Chinese: https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/cmmlu_zh)
+### longbench/
+Evaluation script for LongBench. Wiki: https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/longbench_en (Chinese: https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/longbench_zh)
+### llama-cpp/
+Launch script and server script for llama.cpp. Wiki: https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/llamacpp_en (Chinese: https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/llamacpp_zh)
+### attn_and_long_ctx_patches.py
+Patches for memory-efficient attention and NTK context-size scaling.
+### merge_llama2_with_chinese_lora_low_mem.py
+Script for merging LLaMA-2/Alpaca-2 LoRA (low-resource version). Wiki: https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/manual_conversion_en (Chinese: https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/wiki/manual_conversion_zh)
+### tokenizer/
+Chinese-LLaMA-2 & Chinese-Alpaca-2 tokenizer
\ No newline at end of file
--- a/scripts/attn_and_long_ctx_patches.py
+++ b/scripts/attn_and_long_ctx_patches.py
+import torch
+from torch import nn
+from typing import Optional, Tuple, Union
+import transformers
+from transformers.models.llama.modeling_llama import apply_rotary_pos_emb, rotate_half
+import math
+try:
+    from xformers import ops as xops
+except ImportError:
+    xops = None
+    print(
+        "Xformers is not installed correctly. If you want to use memory_efficient_attention use the following command to install Xformers\npip install xformers."
+    )
+STORE_KV_BEFORE_ROPE = False
+USE_MEM_EFF_ATTENTION = False
+ALPHA = 1.0
+AUTO_COEFF = 1.0
+SCALING_FACTOR = None
+def apply_rotary_pos_emb_single(q, cos, sin, position_ids):
+    # The first two dimensions of cos and sin are always 1, so we can `squeeze` them.
+    cos = cos.squeeze(1).squeeze(0)  # [seq_len, dim]
+    sin = sin.squeeze(1).squeeze(0)  # [seq_len, dim]
+    cos = cos[position_ids].unsqueeze(1)  # [bs, 1, seq_len, dim]
+    sin = sin[position_ids].unsqueeze(1)  # [bs, 1, seq_len, dim]
+    q_embed = (q * cos) + (rotate_half(q) * sin)
+    return q_embed
+def xformers_forward(
+    self,
+    hidden_states: torch.Tensor,
+    attention_mask: Optional[torch.Tensor] = None,
+    position_ids: Optional[torch.LongTensor] = None,
+    past_key_value: Optional[Tuple[torch.Tensor]] = None,
+    output_attentions: bool = False,
+    use_cache: bool = False,
+    padding_mask=None,
+) -> Tuple[torch.Tensor, Optional[torch.Tensor], Optional[Tuple[torch.Tensor]]]:
+    bsz, q_len, _ = hidden_states.size()
+    query_states = self.q_proj(hidden_states).view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)
+    key_states = self.k_proj(hidden_states).view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)
+    value_states = self.v_proj(hidden_states).view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)
+    kv_seq_len = key_states.shape[-2]
+    past_kv_len = 0
+    if past_key_value is not None:
+        past_kv_len = past_key_value[0].shape[-2]
+        kv_seq_len += past_kv_len
+    if STORE_KV_BEFORE_ROPE is False:
+        cos, sin = self.rotary_emb(value_states, seq_len=kv_seq_len)
+        query_states, key_states = apply_rotary_pos_emb(query_states, key_states, cos, sin, position_ids)
+        # [bsz, nh, t, hd]
+        if past_key_value is not None:
+            # reuse k, v, self_attention
+            key_states = torch.cat([past_key_value[0], key_states], dim=2)
+            value_states = torch.cat([past_key_value[1], value_states], dim=2)
+        past_key_value = (key_states, value_states) if use_cache else None
+    else:
+        if past_key_value is not None:
+            # reuse k, v, self_attention
+            key_states = torch.cat([past_key_value[0], key_states], dim=2)
+            value_states = torch.cat([past_key_value[1], value_states], dim=2)
+        past_key_value = (key_states, value_states) if use_cache else None
+        cos, sin = self.rotary_emb(value_states, seq_len=kv_seq_len)
+        query_states = apply_rotary_pos_emb_single(query_states, cos, sin, position_ids)
+        position_ids = torch.arange(kv_seq_len, dtype=torch.long, device=cos.device)
+        position_ids = position_ids.unsqueeze(0).view(-1, kv_seq_len)
+        key_states = apply_rotary_pos_emb_single(key_states, cos, sin, position_ids)
+    pad_query = False
+    if xops is not None and USE_MEM_EFF_ATTENTION:
+        attn_weights = None
+        query_states = query_states.transpose(1, 2)
+        key_states = key_states.transpose(1, 2)
+        value_states = value_states.transpose(1, 2)
+        if query_states.size(1)==1 and key_states.size(1)>1:
+            attn_bias = None
+        elif query_states.size(1)<key_states.size(1) and key_states.size(1)>1 and past_kv_len > 0:
+            attn_bias = xops.LowerTriangularMask()
+            query_states = torch.cat(
+                (
+                    torch.full(
+                        (bsz, past_kv_len, self.num_heads, self.head_dim),
+                        0.0,
+                        dtype=query_states.dtype,
+                        device=query_states.device,
+                    ),
+                    query_states,
+                ),
+                dim=1,
+            )
+            pad_query = True
+        else:
+            attn_bias = xops.LowerTriangularMask()
+        attn_output = xops.memory_efficient_attention(
+            query_states, key_states, value_states, attn_bias=attn_bias, p=0)
+    else:
+        attn_weights = torch.matmul(query_states, key_states.transpose(2, 3)) / math.sqrt(self.head_dim)
+        if attn_weights.size() != (bsz, self.num_heads, q_len, kv_seq_len):
+            raise ValueError(
+                f"Attention weights should be of size {(bsz, self.num_heads, q_len, kv_seq_len)}, but is"
+                f" {attn_weights.size()}"
+            )
+        if attention_mask is not None:
+            if attention_mask.size() != (bsz, 1, q_len, kv_seq_len):
+                raise ValueError(
+                    f"Attention mask should be of size {(bsz, 1, q_len, kv_seq_len)}, but is {attention_mask.size()}"
+                )
+            attn_weights = attn_weights + attention_mask
+            attn_weights = torch.max(
+                attn_weights, torch.tensor(torch.finfo(attn_weights.dtype).min, device=attn_weights.device)
+            )
+        # upcast attention to fp32
+        attn_weights = nn.functional.softmax(attn_weights, dim=-1, dtype=torch.float32).to(query_states.dtype)
+        attn_output = torch.matmul(attn_weights, value_states)
+        if attn_output.size() != (bsz, self.num_heads, q_len, self.head_dim):
+            raise ValueError(
+                f"`attn_output` should be of size {(bsz, self.num_heads, q_len, self.head_dim)}, but is"
+                f" {attn_output.size()}"
+            )
+        attn_output = attn_output.transpose(1, 2)
+    if pad_query:
+        attn_output = attn_output[:,past_kv_len:]
+    attn_output = attn_output.reshape(bsz, q_len, self.hidden_size)
+    attn_output = self.o_proj(attn_output)
+    if not output_attentions:
+        attn_weights = None
+    return attn_output, attn_weights, past_key_value
+old_init = transformers.models.llama.modeling_llama.LlamaRotaryEmbedding.__init__
+def _set_cos_sin_cache(self, seq_len, device, dtype):
+    self.max_seq_len_cached = seq_len
+    t = torch.arange(self.max_seq_len_cached, device=device, dtype=torch.float32)
+    t = t / self.scaling_factor
+    freqs = torch.einsum("i,j->ij", t, self.ntk_inv_freq.to(device))
+    # Different from paper, but it uses a different permutation in order to obtain the same calculation
+    emb = torch.cat((freqs, freqs), dim=-1)
+    self.register_buffer("cos_cached", emb.cos().to(dtype), persistent=False)
+    self.register_buffer("sin_cached", emb.sin().to(dtype), persistent=False)
+def adaptive_ntk_init(self, dim, max_position_embeddings=2048, base=10000, device=None, scaling_factor=None):
+    self.alpha = ALPHA
+    if SCALING_FACTOR is None:
+        self.scaling_factor = scaling_factor or 1.0
+    else:
+        self.scaling_factor = SCALING_FACTOR
+    if isinstance(ALPHA,(float,int)):
+        base = base * ALPHA ** (dim / (dim-2))
+        self.base = base
+    elif ALPHA=='auto':
+        self.base = base
+    else:
+        raise ValueError(ALPHA)
+    old_init(self, dim, max_position_embeddings, base, device)
+    self.ntk_inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float().to(device) / dim))
+    self._set_cos_sin_cache = _set_cos_sin_cache
+    self._set_cos_sin_cache(
+        self, seq_len=max_position_embeddings, device=self.ntk_inv_freq.device, dtype=torch.get_default_dtype()
+    )
+def adaptive_ntk_forward(self, x, seq_len=None):
+    if seq_len > self.max_seq_len_cached:
+        if isinstance(self.alpha,(float,int)):
+            self._set_cos_sin_cache(self, seq_len=seq_len, device=x.device, dtype=x.dtype)
+        elif self.alpha=='auto':
+            t = torch.arange(seq_len, device=x.device, dtype=torch.float32)
+            t = t / self.scaling_factor
+            dim = self.dim
+            alpha = (seq_len / (self.max_position_embeddings/2) - 1) * AUTO_COEFF
+            base = self.base * alpha ** (dim / (dim-2))
+            ntk_inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float().to(x.device) / dim ))
+            freqs = torch.einsum("i,j->ij", t, ntk_inv_freq)
+            emb = torch.cat((freqs, freqs), dim=-1).to(x.device)
+            cos_cached = emb.cos()
+            sin_cached = emb.sin()
+            return (
+                cos_cached[:seq_len].to(dtype=x.dtype),
+                sin_cached[:seq_len].to(dtype=x.dtype)
+            )
+    return (
+        self.cos_cached[:seq_len].to(dtype=x.dtype),
+        self.sin_cached[:seq_len].to(dtype=x.dtype)
+    )
+def apply_attention_patch(
+        use_memory_efficient_attention=False,
+        store_kv_before_rope=False
+        ):
+    global USE_MEM_EFF_ATTENTION, STORE_KV_BEFORE_ROPE
+    if use_memory_efficient_attention is True and xops is not None:
+        USE_MEM_EFF_ATTENTION = use_memory_efficient_attention
+    print("USE_XFORMERS_ATTENTION: ", USE_MEM_EFF_ATTENTION)
+    STORE_KV_BEFORE_ROPE = store_kv_before_rope
+    print("STORE_KV_BEFORE_ROPE:", STORE_KV_BEFORE_ROPE)
+    transformers.models.llama.modeling_llama.LlamaAttention.forward = xformers_forward
+def apply_ntk_scaling_patch(alpha: Union[float,str], scaling_factor: Optional[float] = None):
+    global ALPHA
+    global SCALING_FACTOR
+    ALPHA = alpha
+    SCALING_FACTOR = scaling_factor
+    try:
+        ALPHA = float(ALPHA)
+    except ValueError:
+        if ALPHA!="auto":
+            raise ValueError(f"Alpha can only be a float or 'auto', but given {ALPHA}")
+    print(f"Apply NTK scaling with ALPHA={ALPHA}")
+    if scaling_factor is None:
+        print("The value of scaling factor will be read from model config file, or set to 1.")
+    else:
+        # Implicit string concatenation avoids embedding the source indentation in the message.
+        print(f"Warning: scaling factor is set to {SCALING_FACTOR}. "
+              "If you set the value by hand, do not forget to update "
+              "max_position_embeddings in the model config file.")
+    transformers.models.llama.modeling_llama.LlamaRotaryEmbedding.__init__ = adaptive_ntk_init
+    if hasattr(transformers.models.llama.modeling_llama,'LlamaLinearScalingRotaryEmbedding'):
+        transformers.models.llama.modeling_llama.LlamaLinearScalingRotaryEmbedding.__init__ = adaptive_ntk_init
+    transformers.models.llama.modeling_llama.LlamaRotaryEmbedding.forward = adaptive_ntk_forward
\ No newline at end of file
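The NTK arithmetic in `adaptive_ntk_init` and `adaptive_ntk_forward` can be checked in isolation. The sketch below restates the three formulas from the patch in plain Python, without torch. The function names are hypothetical, and the head dimension (128) and `max_position_embeddings` (4096) in the demo are assumed values for illustration only.

```python
def ntk_scaled_base(base: float, alpha: float, dim: int) -> float:
    """RoPE base after NTK scaling: base * alpha ** (dim / (dim - 2))."""
    return base * alpha ** (dim / (dim - 2))


def auto_alpha(seq_len: int, max_position_embeddings: int, auto_coeff: float = 1.0) -> float:
    """Alpha computed by the 'auto' branch for a given sequence length."""
    return (seq_len / (max_position_embeddings / 2) - 1) * auto_coeff


def inv_freq(base: float, dim: int) -> list:
    """Inverse rotary frequencies 1 / base ** (i / dim) for i = 0, 2, ..., dim - 2."""
    return [1.0 / (base ** (i / dim)) for i in range(0, dim, 2)]


if __name__ == "__main__":
    dim, max_pos = 128, 4096  # assumed values, for illustration only
    for seq_len in (4096, 8192, 16384):
        alpha = auto_alpha(seq_len, max_pos)
        base = ntk_scaled_base(10000.0, alpha, dim)
        slowest = inv_freq(base, dim)[-1]
        print(f"seq_len={seq_len} alpha={alpha:.2f} base={base:.1f} slowest_freq={slowest:.3e}")
```

A larger alpha raises the base, which lowers the slowest rotary frequencies so that positions beyond the original context still map to distinct angles. In the patch, the 'auto' branch only recomputes these values when `seq_len` exceeds `max_seq_len_cached`.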
--- a/scripts/ceval/eval.py
+++ b/scripts/ceval/eval.py
+# This code is modified from C-Eval Project: https://github.com/SJTU-LIT/ceval
+import os
+import argparse
+import pandas as pd
+import torch
+import json
+from llama_evaluator import Llama_Evaluator
+import time
+choices = ["A", "B", "C", "D"]
+def main(args, evaluator,take):
+    assert os.path.exists("subject_mapping.json"), "subject_mapping.json not found!"
+    with open("subject_mapping.json") as f:
+        subject_mapping = json.load(f)
+    filenames = os.listdir("data/val")
+    subject_list = [val_file.replace("_val.csv","") for val_file in filenames]
+    accuracy, summary = {}, {}
+    run_date=time.strftime('%Y-%m-%d_%H-%M-%S',time.localtime(time.time()))
+    output_dir = args.output_dir
+    save_result_dir=os.path.join(output_dir,f"take{take}")
+    os.makedirs(save_result_dir, exist_ok=True)
+    all_answers = {}
+    for index,subject_name in enumerate(subject_list):
+        print(f"{index/len(subject_list)} Inference starts at {run_date} on {args.model_path} with subject of {subject_name}!")
+        val_file_path=os.path.join('data/val',f'{subject_name}_val.csv')
+        dev_file_path=os.path.join('data/dev',f'{subject_name}_dev.csv')
+        test_file_path=os.path.join('data/test',f'{subject_name}_test.csv')
+        val_df=pd.read_csv(val_file_path) if args.do_test is False else pd.read_csv(test_file_path)
+        dev_df=pd.read_csv(dev_file_path) if args.few_shot else None
+        correct_ratio, answers = evaluator.eval_subject(subject_name, val_df, dev_df,
+            save_result_dir=save_result_dir if args.do_save_csv else None,
+            few_shot=args.few_shot,
+            cot=args.cot,
+            with_prompt=args.with_prompt,
+            constrained_decoding=args.constrained_decoding,
+            do_test=args.do_test)
+        print(f"Subject: {subject_name}")
+        print(f"Acc: {correct_ratio}")
+        accuracy[subject_name] = correct_ratio
+        summary[subject_name] = {"score":correct_ratio,
+                                 "num":len(val_df),
+                                 "correct":correct_ratio*len(val_df)/100}
+        all_answers[subject_name] = answers
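+    # Use a context manager so the file is closed, and write UTF-8 explicitly for non-ASCII answers.
+    with open(os.path.join(save_result_dir, 'submission.json'), 'w', encoding='utf-8') as f:
+        json.dump(all_answers, f, ensure_ascii=False, indent=4)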
+    print("Accuracy:")
+    for k, v in accuracy.items():
+        print(k, ": ", v)
+    total_num = 0
+    total_correct = 0
+    summary['grouped'] = {
+        "STEM": {"correct": 0.0, "num": 0}, 
+        "Social Science": {"correct": 0.0, "num": 0}, 
+        "Humanities": {"correct": 0.0, "num": 0}, 
+        "Other": {"correct": 0.0, "num": 0}
+        }
+    for subj, info in subject_mapping.items():
+        group = info[2]
+        summary['grouped'][group]["num"]   += summary[subj]['num']
+        summary['grouped'][group]["correct"] += summary[subj]['correct']
+    for group, info in summary['grouped'].items():
+        info['score'] = info["correct"] / info["num"]
+        total_num += info["num"]
+        total_correct += info["correct"]
+    summary['All'] = {"score": total_correct / total_num, "num": total_num, "correct": total_correct}
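+    # Use a context manager so the file is closed, and write UTF-8 explicitly for non-ASCII keys.
+    with open(os.path.join(save_result_dir, 'summary.json'), 'w', encoding='utf-8') as f:
+        json.dump(summary, f, ensure_ascii=False, indent=2)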
+if __name__ == "__main__":
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--model_path", type=str)
+    parser.add_argument("--cot",choices=["False","True"], default="False")
+    parser.add_argument("--few_shot", choices=["False","True"], default="True")
+    parser.add_argument("--ntrain", "-k", type=int, default=5)
+    parser.add_argument("--with_prompt", choices=["False","True"], default="False")
+    parser.add_argument("--constrained_decoding", choices=["False","True"], default="True")
+    parser.add_argument("--temperature",type=float,default=0.2)
+    parser.add_argument("--n_times", default=1,type=int)
+    parser.add_argument("--do_save_csv", choices=["False","True"], default="False")
+    parser.add_argument("--output_dir", type=str)
+    parser.add_argument("--do_test", choices=["False","True"], default="False")
+    parser.add_argument("--verbose", action="store_true", help="Print detailed information of each example.")
+    args = parser.parse_args()
+    args.cot = args.cot == "True"
+    args.few_shot = args.few_shot == "True"
+    args.with_prompt = args.with_prompt == "True"
+    args.constrained_decoding = args.constrained_decoding == "True"
+    args.do_test = args.do_test == "True"
+    args.do_save_csv = args.do_save_csv == "True"
+    if args.constrained_decoding is True:
+        args.n_times=max(args.n_times,1)
+    print(args)
+    device = torch.device(0)
+    print(device)
+    evaluator=Llama_Evaluator(
+        choices=choices,
+        k=args.ntrain,
+        model_path=args.model_path,
+        device=device,
+        temperature = args.temperature,
+        verbose = args.verbose
+    )
+    for i in range(args.n_times):
+        main(args,evaluator=evaluator,take=i)
--- a/scripts/ceval/evaluator.py
+++ b/scripts/ceval/evaluator.py
+# This code is modified from C-Eval Project: https://github.com/SJTU-LIT/ceval
+import string
+class Evaluator:
+    def __init__(self, choices, model_name, k=-1):
+        self.choices = choices
+        self.model_name = model_name
+        self.k = k
+        self.puncs = list(string.punctuation)
+    def format_example(self, line, include_answer=True):
+        example = line['question']
+        for choice in self.choices:
+            example += f'\n{choice}. {line[f"{choice}"]}'
+        example += '\n答案：'
+        if include_answer:
+            example += f'{line["answer"]}\n\n'
+        return example
+    def generate_few_shot_prompt(self, subject, dev_df):
+        prompt = f"以下是中国关于{subject}考试的单项选择题，请选出其中的正确答案。\n\n"
+        k = self.k
+        if self.k == -1:
+            k = dev_df.shape[0]
+        for i in range(k):
+            prompt += self.format_example(dev_df.iloc[i, :])
+        return prompt
+    def eval_subject(self, subject_name, test_df, dev_df=None, few_shot=False, save_result_dir=None):
+        pass
+    def normalize_answer(self,s):
+        def white_space_fix(text):
+            return ' '.join(text.split())
+        def remove_punc(text):
+            exclude=set(self.puncs)
+            return ''.join(ch for ch in text if ch not in exclude)
+        def lower(text):
+            return text.lower()
+        return white_space_fix(remove_punc(lower(s)))
+    def exact_match(self,pred, target):
+        return self.normalize_answer(pred)==self.normalize_answer(target)
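Review note on `normalize_answer` / `exact_match`: the following is a minimal standalone sketch of the same normalization steps (lowercase, strip punctuation, collapse whitespace), written as free functions for illustration rather than as the class methods in this commit. It shows one consequence worth knowing: `string.punctuation` covers ASCII punctuation only, so a full-width Chinese period survives normalization.

```python
import string


def normalize_answer(s: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse runs of whitespace."""
    exclude = set(string.punctuation)  # ASCII punctuation only
    lowered = s.lower()
    no_punc = ''.join(ch for ch in lowered if ch not in exclude)
    return ' '.join(no_punc.split())


def exact_match(pred: str, target: str) -> bool:
    """Compare two answers after normalization."""
    return normalize_answer(pred) == normalize_answer(target)


# An ASCII period and surrounding spaces are removed before comparison.
ascii_case = exact_match(" A. ", "a")
# A full-width period (。) is not in string.punctuation, so it is kept.
fullwidth_case = exact_match("A。", "A")
```

If model outputs are expected to end with full-width punctuation, the exclusion set would need to be extended by the caller; that extension is not part of this commit.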
--- a/scripts/ceval/llama_evaluator.py
+++ b/scripts/ceval/llama_evaluator.py
+# This code is modified from C-Eval Project: https://github.com/SJTU-LIT/ceval
+import os
+import re
+from tqdm import tqdm
+import random
+import numpy as np
+import torch
+from transformers import AutoModelForCausalLM, LlamaTokenizer
+from transformers import GenerationConfig
+from evaluator import Evaluator
+DEFAULT_SYSTEM_PROMPT = """You are a helpful assistant. 你是一个乐于助人的助手。"""
+class Llama_Evaluator(Evaluator):
+    def __init__(self, choices, k, model_path, device, temperature=0.2, verbose=False):
+        super(Llama_Evaluator, self).__init__(choices, model_path, k)
+        load_type = torch.float16
+        self.model_path = model_path
+        self.device = device
+        self.verbose = verbose
+        self.tokenizer = LlamaTokenizer.from_pretrained(model_path, legacy=True)
+        self.model = AutoModelForCausalLM.from_pretrained(
+            model_path,
+            load_in_8bit=False,
+            torch_dtype=load_type,
+            low_cpu_mem_usage=True,
+            device_map='auto',
+            trust_remote_code=True)
+        self.generation_config = GenerationConfig(
+            temperature=temperature,
+            top_k=40,
+            top_p=0.9,
+            do_sample=True,
+            num_beams=1,
+            repetition_penalty=1.1,
+            max_new_tokens=20
+        )
+        self.sA_id = self.tokenizer.encode("A", add_special_tokens=False)[0]
+        self.sB_id = self.tokenizer.encode("B", add_special_tokens=False)[0]
+        self.sC_id = self.tokenizer.encode("C", add_special_tokens=False)[0]
+        self.sD_id = self.tokenizer.encode("D", add_special_tokens=False)[0]
+        self.A_id = self.tokenizer.encode("：A")[-1]
+        self.B_id = self.tokenizer.encode("：B")[-1]
+        self.C_id = self.tokenizer.encode("：C")[-1]
+        self.D_id = self.tokenizer.encode("：D")[-1]
+    def eval_subject(self, subject_name,
+            test_df,
+            dev_df=None,
+            few_shot=False,
+            cot=False,
+            save_result_dir=None,
+            with_prompt=False,
+            constrained_decoding=False,
+            do_test=False):
+        all_answers = {}
+        if constrained_decoding is True:
+            self.generation_config.output_scores = True
+            self.generation_config.return_dict_in_generate = True
+            self.generation_config.max_new_tokens = 1
+            self.generation_config.top_p = 1.0
+            self.generation_config.top_k = 0
+        correct_num = 0
+        if save_result_dir:
+            result = []
+            score = []
+        if few_shot:
+            if with_prompt:
+                history = self.generate_alpaca2_few_shot_prompt(subject_name, dev_df, cot=cot)
+            else:
+                history = self.generate_llama2_few_shot_prompt(subject_name, dev_df, cot=cot)
+        else:
+            history = ''
+        answers = ['NA'] * len(test_df) if do_test is True else list(test_df['answer'])
+        for row_index, row in tqdm(test_df.iterrows(), total=len(test_df)):
+            question = self.format_example(row, include_answer=False, cot=cot,with_prompt=with_prompt)
+            instruction = question
+            if with_prompt:
+                prompt_template = (
+                                        "[INST] <<SYS>>\n"
+                                        "{system_prompt}\n"
+                                        "<</SYS>>\n\n"
+                                        "{instruction} [/INST]"
+                                    )
+                instruction = prompt_template.format_map({'instruction': instruction,'system_prompt':DEFAULT_SYSTEM_PROMPT})
+            instruction = history + instruction
+            inputs = self.tokenizer(instruction, return_tensors="pt")
+            generation_output = self.model.generate(
+                    input_ids = inputs["input_ids"].to(self.device),
+                    attention_mask = inputs['attention_mask'].to(self.device),
+                    eos_token_id=self.tokenizer.eos_token_id,
+                    pad_token_id=self.tokenizer.pad_token_id,
+                    generation_config = self.generation_config
+                )
+            batch_size, length = inputs.input_ids.shape
+            if constrained_decoding is True:
+                logits = generation_output.scores[0][0]
+                logits = logits.float().cpu().detach()
+                choices1_logits = logits[[self.sA_id,self.sB_id,self.sC_id,self.sD_id]]
+                choices2_logits = logits[[self.A_id,self.B_id,self.C_id,self.D_id]]
+                choicesAll_logits = (choices1_logits + choices2_logits).numpy()
+                assert not (np.any(np.isinf(choicesAll_logits)) or np.any(np.isnan(choicesAll_logits)))
+                ans = {0: "A", 1: "B", 2: "C", 3: "D"}[np.argmax(choicesAll_logits)]
+                response = self.tokenizer.decode([logits.argmax(-1).item()])
+            else:
+                response = self.tokenizer.decode(generation_output[0, length:], skip_special_tokens=True)
+                ans, direct_extract = self.extract_answer(row, response)
+            if ans == answers[row_index]:
+                correct_num += 1
+                correct = 1
+            else:
+                correct = 0
+            if self.verbose is True:
+                print(f"\n======={str(row_index)}=======")
+                print(f"question: {question}\n")
+                print(f"response: {response}\n")
+                print(f"extracted answer: {ans}")
+                print(f"ground truth: {answers[row_index]} \n")
+            if save_result_dir:
+                result.append(response)
+                score.append(correct)
+            all_answers[str(row_index)] = ans
+        correct_ratio = 100*correct_num/len(answers)
+        if save_result_dir:
+            test_df['model_output'] = result
+            test_df['correctness'] = score
+            test_df.to_csv(os.path.join(save_result_dir, f'{subject_name}_test.csv'))
+        return correct_ratio, all_answers
+    def format_example(self, line, include_answer=True, cot=False, with_prompt=False):
+        example = line['question']
+        for choice in self.choices:
+            example += f'\n{choice}. {line[f"{choice}"]}'
+        if include_answer:
+            if cot:
+                example += "\n答案：让我们一步一步思考，\n" + \
+                    line["explanation"] + f"\n所以答案是{line['answer']}。\n\n"
+            else:
+                example += '\n答案：' + line["answer"] + '\n\n'
+        else:
+            if with_prompt is False:
+                if cot:
+                    example += "\n答案：让我们一步一步思考，\n1."
+                else:
+                    example += '\n答案：'
+            else:
+                if cot:
+                    example += "\n答案是什么？让我们一步一步思考，\n1."
+                else:
+                    example += '\n答案：'
+        return example
+    def generate_llama2_few_shot_prompt(self, subject, dev_df, cot=False):
+        prompt = f"以下是中国关于{subject}考试的单项选择题，请选出其中的正确答案。\n\n"
+        k = self.k
+        if self.k == -1:
+            k = dev_df.shape[0]
+        for i in range(k):
+            prompt += self.format_example(
+                dev_df.iloc[i, :],
+                include_answer=True,
+                cot=cot
+            )
+        return prompt
+    def generate_alpaca2_few_shot_prompt(self, subject, dev_df, cot=False):
+        prompt = f"以下是中国关于{subject}考试的单项选择题，请选出其中的正确答案。\n\n"
+        prompt_template = (
+            "[INST] <<SYS>>\n"
+            "{system_prompt}\n"
+            "<</SYS>>\n\n"
+            "{instruction} [/INST]好的，我会结合{subject}相关知识回答"
+        )
+        prompt = prompt_template.format_map({'instruction':prompt,'system_prompt':DEFAULT_SYSTEM_PROMPT,'subject':subject})
+        k = self.k
+        if self.k == -1:
+            k = dev_df.shape[0]
+        for i in range(k):
+            line = dev_df.iloc[i, :]
+            q=line['question']
+            for choice in self.choices:
+                q += f'\n{choice}. {line[f"{choice}"]}'
+            a = line['answer']
+            prompt += "[INST] "+q+"\n答案：[/INST]"+a+"\n"
+        return prompt
+    def extract_answer(self, line, gen_ans):
+        m = re.findall(r'所以答案是(.+?)。', gen_ans, re.M)
+        if len(m) > 0 and m[-1] in self.choices:
+            return m[-1], True
+        answer_patterns = [
+            r'([ABCD])是正确的',
+            r'选项([ABCD])正确',
+            r'答案为([ABCD])',
+            r'答案是([ABCD])',
+            r'答案([ABCD])',
+            r'选择([ABCD])',
+            r'答案：([ABCD])',
+            r'选择答案([ABCD])'
+        ]
+        # RE extraction
+        for answer_pattern in answer_patterns:
+            m = re.search(answer_pattern, gen_ans, re.M)
+            if m:
+                answer = m.group(1)
+                return answer, False
+        # fall back to the first choice letter that appears anywhere in the response
+        m = re.findall(r'[ABCD]', gen_ans, re.M)
+        if len(m) >= 1:
+            answer = m[0]
+            return answer, False
+        # fall back to matching the text of one of the options in the response
+        choices_dict = {}
+        pattern = ""
+        for c in self.choices:
+            choices_dict[str(line[f'{c}'])] = c
+            pattern += re.escape(str(line[f'{c}']))+"|"
+        pattern = pattern[:-1]
+        m = re.findall(pattern, gen_ans, re.M)
+        if self.verbose:
+            print("w/ escape:",repr(pattern),gen_ans,(len(m)>=1))
+        if len(m) >= 1:
+            answer = choices_dict[m[0]]
+            return answer, False
+        # nothing could be extracted: guess uniformly at random
+        return random.choice('ABCD'), False
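Review note on `extract_answer`: the fallback order above (chain-of-thought conclusion, then answer phrases, then the first bare letter, then option text) can be sketched as a standalone function. This is an illustrative simplification, not the method in this commit: `extract_choice` and `ANSWER_PATTERNS` are hypothetical names, only three of the eight phrase patterns are included, `options` is assumed to be a dict from letter to option text, and the sketch returns `None` where the original guesses at random.

```python
import re

# A subset of the phrase patterns used in the commit, for illustration.
ANSWER_PATTERNS = [r'答案是([ABCD])', r'答案：([ABCD])', r'选择([ABCD])']


def extract_choice(response, options):
    """Return the chosen letter, or None if nothing can be extracted."""
    # Step 1: explicit chain-of-thought conclusion, last occurrence wins.
    m = re.findall(r'所以答案是(.+?)。', response)
    if m and m[-1] in options:
        return m[-1]
    # Step 2: common answer phrases, tried in order.
    for pat in ANSWER_PATTERNS:
        hit = re.search(pat, response)
        if hit:
            return hit.group(1)
    # Step 3: the first bare choice letter anywhere in the response.
    letters = re.findall(r'[ABCD]', response)
    if letters:
        return letters[0]
    # Step 4: the literal text of one of the options.
    text_to_letter = {str(text): letter for letter, text in options.items()}
    pattern = '|'.join(re.escape(t) for t in text_to_letter)
    hit = re.search(pattern, response) if pattern else None
    if hit:
        return text_to_letter[hit.group(0)]
    return None


opts = {"A": "北京", "B": "上海", "C": "广州", "D": "深圳"}
cot_answer = extract_choice("让我们一步一步思考，所以答案是B。", opts)
phrase_answer = extract_choice("我认为答案是C", opts)
text_answer = extract_choice("应该是上海", opts)
no_answer = extract_choice("不知道", opts)
```

Returning `None` instead of a random letter makes extraction failures countable by the caller; whether that is preferable to the random guess used in the commit depends on how the scores are reported.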
--- a/scripts/ceval/subject_mapping.json
+++ b/scripts/ceval/subject_mapping.json
+{
+	"computer_network": [
+		"Computer Network",
+		"\u8ba1\u7b97\u673a\u7f51\u7edc",
+		"STEM"
+	],
+	"operating_system": [
+		"Operating System",
+		"\u64cd\u4f5c\u7cfb\u7edf",
+		"STEM"
+	],
+	"computer_architecture": [
+		"Computer Architecture",
+		"\u8ba1\u7b97\u673a\u7ec4\u6210",
+		"STEM"
+	],
+	"college_programming": [
+		"College Programming",
+		"\u5927\u5b66\u7f16\u7a0b",
+		"STEM"
+	],
+	"college_physics": [
+		"College Physics",
+		"\u5927\u5b66\u7269\u7406",
+		"STEM"
+	],
+	"college_chemistry": [
+		"College Chemistry",
+		"\u5927\u5b66\u5316\u5b66",
+		"STEM"
+	],
+	"advanced_mathematics": [
+		"Advanced Mathematics",
+		"\u9ad8\u7b49\u6570\u5b66",
+		"STEM"
+	],
+	"probability_and_statistics": [
+		"Probability and Statistics",
+		"\u6982\u7387\u7edf\u8ba1",
+		"STEM"
+	],
+	"discrete_mathematics": [
+		"Discrete Mathematics",
+		"\u79bb\u6563\u6570\u5b66",
+		"STEM"
+	],
+	"electrical_engineer": [
+		"Electrical Engineer",
+		"\u6ce8\u518c\u7535\u6c14\u5de5\u7a0b\u5e08",
+		"STEM"
+	],
+	"metrology_engineer": [
+		"Metrology Engineer",
+		"\u6ce8\u518c\u8ba1\u91cf\u5e08",
+		"STEM"
+	],
+	"high_school_mathematics": [
+		"High School Mathematics",
+		"\u9ad8\u4e2d\u6570\u5b66",
+		"STEM"
+	],
+	"high_school_physics": [
+		"High School Physics",
+		"\u9ad8\u4e2d\u7269\u7406",
+		"STEM"
+	],
+	"high_school_chemistry": [
+		"High School Chemistry",
+		"\u9ad8\u4e2d\u5316\u5b66",
+		"STEM"
+	],
+	"high_school_biology": [
+		"High School Biology",
+		"\u9ad8\u4e2d\u751f\u7269",
+		"STEM"
+	],
+	"middle_school_mathematics": [
+		"Middle School Mathematics",
+		"\u521d\u4e2d\u6570\u5b66",
+		"STEM"
+	],
+	"middle_school_biology": [
+		"Middle School Biology",
+		"\u521d\u4e2d\u751f\u7269",
+		"STEM"
+	],
+	"middle_school_physics": [
+		"Middle School Physics",
+		"\u521d\u4e2d\u7269\u7406",
+		"STEM"
+	],
+	"middle_school_chemistry": [
+		"Middle School Chemistry",
+		"\u521d\u4e2d\u5316\u5b66",
+		"STEM"
+	],
+	"veterinary_medicine": [
+		"Veterinary Medicine",
+		"\u517d\u533b\u5b66",
+		"STEM"
+	],
+	"college_economics": [
+		"College Economics",
+		"\u5927\u5b66\u7ecf\u6d4e\u5b66",
+		"Social Science"
+	],
+	"business_administration": [
+		"Business Administration",
+		"\u5de5\u5546\u7ba1\u7406",
+		"Social Science"
+	],
+	"marxism": [
+		"Marxism",
+		"\u9a6c\u514b\u601d\u4e3b\u4e49\u57fa\u672c\u539f\u7406",
+		"Social Science"
+	],
+	"mao_zedong_thought": [
+		"Mao Zedong Thought",
+		"\u6bdb\u6cfd\u4e1c\u601d\u60f3\u548c\u4e2d\u56fd\u7279\u8272\u793e\u4f1a\u4e3b\u4e49\u7406\u8bba\u4f53\u7cfb\u6982\u8bba",
+		"Social Science"
+	],
+	"education_science": [
+		"Education Science",
+		"\u6559\u80b2\u5b66",
+		"Social Science"
+	],
+	"teacher_qualification": [
+		"Teacher Qualification",
+		"\u6559\u5e08\u8d44\u683c",
+		"Social Science"
+	],
+	"high_school_politics": [
+		"High School Politics",
+		"\u9ad8\u4e2d\u653f\u6cbb",
+		"Social Science"
+	],
+	"high_school_geography": [
+		"High School Geography",
+		"\u9ad8\u4e2d\u5730\u7406",
+		"Social Science"
+	],
+	"middle_school_politics": [
+		"Middle School Politics",
+		"\u521d\u4e2d\u653f\u6cbb",
+		"Social Science"
+	],
+	"middle_school_geography": [
+		"Middle School Geography",
+		"\u521d\u4e2d\u5730\u7406",
+		"Social Science"
+	],
+	"modern_chinese_history": [
+		"Modern Chinese History",
+		"\u8fd1\u4ee3\u53f2\u7eb2\u8981",
+		"Humanities"
+	],
+	"ideological_and_moral_cultivation": [
+		"Ideological and Moral Cultivation",
+		"\u601d\u60f3\u9053\u5fb7\u4fee\u517b\u4e0e\u6cd5\u5f8b\u57fa\u7840",
+		"Humanities"
+	],
+	"logic": [
+		"Logic",
+		"\u903b\u8f91\u5b66",
+		"Humanities"
+	],
+	"law": [
+		"Law",
+		"\u6cd5\u5b66",
+		"Humanities"
+	],
+	"chinese_language_and_literature": [
+		"Chinese Language and Literature",
+		"\u4e2d\u56fd\u8bed\u8a00\u6587\u5b66",
+		"Humanities"
+	],
+	"art_studies": [
+		"Art Studies",
+		"\u827a\u672f\u5b66",
+		"Humanities"
+	],
+	"professional_tour_guide": [
+		"Professional Tour Guide",
+		"\u5bfc\u6e38\u8d44\u683c",
+		"Humanities"
+	],
+	"legal_professional": [
+		"Legal Professional",
+		"\u6cd5\u5f8b\u804c\u4e1a\u8d44\u683c",
+		"Humanities"
+	],
+	"high_school_chinese": [
+		"High School Chinese",
+		"\u9ad8\u4e2d\u8bed\u6587",
+		"Humanities"
+	],
+	"high_school_history": [
+		"High School History",
+		"\u9ad8\u4e2d\u5386\u53f2",
+		"Humanities"
+	],
+	"middle_school_history": [
+		"Middle School History",
+		"\u521d\u4e2d\u5386\u53f2",
+		"Humanities"
+	],
+	"civil_servant": [
+		"Civil Servant",
+		"\u516c\u52a1\u5458",
+		"Other"
+	],
+	"sports_science": [
+		"Sports Science",
+		"\u4f53\u80b2\u5b66",
+		"Other"
+	],
+	"plant_protection": [
+		"Plant Protection",
+		"\u690d\u7269\u4fdd\u62a4",
+		"Other"
+	],
+	"basic_medicine": [
+		"Basic Medicine",
+		"\u57fa\u7840\u533b\u5b66",
+		"Other"
+	],
+	"clinical_medicine": [
+		"Clinical Medicine",
+		"\u4e34\u5e8a\u533b\u5b66",
+		"Other"
+	],
+	"urban_and_rural_planner": [
+		"Urban and Rural Planner",
+		"\u6ce8\u518c\u57ce\u4e61\u89c4\u5212\u5e08",
+		"Other"
+	],
+	"accountant": [
+		"Accountant",
+		"\u6ce8\u518c\u4f1a\u8ba1\u5e08",
+		"Other"
+	],
+	"fire_engineer": [
+		"Fire Engineer",
+		"\u6ce8\u518c\u6d88\u9632\u5de5\u7a0b\u5e08",
+		"Other"
+	],
+	"environmental_impact_assessment_engineer": [
+		"Environmental Impact Assessment Engineer",
+		"\u73af\u5883\u5f71\u54cd\u8bc4\u4ef7\u5de5\u7a0b\u5e08",
+		"Other"
+	],
+	"tax_accountant": [
+		"Tax Accountant",
+		"\u7a0e\u52a1\u5e08",
+		"Other"
+	],
+	"physician": [
+		"Physician",
+		"\u533b\u5e08\u8d44\u683c",
+		"Other"
+	]
+}
\ No newline at end of file
--- a/scripts/cmmlu/categories.py
+++ b/scripts/cmmlu/categories.py
+# This code is modified from CMMLU Project: https://github.com/haonan-li/CMMLU
+name_en2zh = {
+    "agronomy": "农学",
+    "anatomy": "解剖学",
+    "ancient_chinese": "古汉语",
+    "arts": "艺术学",
+    "astronomy": "天文学",
+    "business_ethics": "商业伦理",
+    "chinese_civil_service_exam": "中国公务员考试",
+    "chinese_driving_rule": "中国驾驶规则",
+    "chinese_food_culture": "中国饮食文化",
+    "chinese_foreign_policy": "中国外交政策",
+    "chinese_history":"中国历史",
+    "chinese_literature": "中国文学",
+    "chinese_teacher_qualification": "中国教师资格",
+    "clinical_knowledge": "临床知识",
+    "college_actuarial_science":"大学精算学",
+    "college_education":"大学教育学",
+    "college_engineering_hydrology": "大学工程水文学",
+    "college_law": "大学法律",
+    "college_mathematics": "大学数学",
+    "college_medical_statistics":"大学医学统计",
+    "college_medicine": "大学医学",
+    "computer_science": "计算机科学",
+    "computer_security": "计算机安全",
+    "conceptual_physics": "概念物理学",
+    "construction_project_management": "建设工程管理",
+    "economics": "经济学",
+    "education": "教育学",
+    "electrical_engineering": "电气工程",
+    "elementary_chinese":"小学语文",
+    "elementary_commonsense":"小学常识",
+    "elementary_information_and_technology": "小学信息技术",
+    "elementary_mathematics": "初等数学",
+    "ethnology": "民族学",
+    "food_science": "食品科学",
+    "genetics": "遗传学",
+    "global_facts": "全球事实",
+    "high_school_biology": "高中生物",
+    "high_school_chemistry": "高中化学",
+    "high_school_geography": "高中地理",
+    "high_school_mathematics": "高中数学",
+    "high_school_physics": "高中物理学",
+    "high_school_politics": "高中政治",
+    "human_sexuality": "人类性行为",
+    "international_law": "国际法学",
+    "journalism": "新闻学",
+    "jurisprudence": "法理学",
+    "legal_and_moral_basis": "法律与道德基础",
+    "logical": "逻辑学",
+    "machine_learning": "机器学习",
+    "management": "管理学",
+    "marketing": "市场营销",
+    "marxist_theory": "马克思主义理论",
+    "modern_chinese": "现代汉语",
+    "nutrition": "营养学",
+    "philosophy": "哲学",
+    "professional_accounting": "专业会计",
+    "professional_law": "专业法学",
+    "professional_medicine": "专业医学",
+    "professional_psychology": "专业心理学",
+    "public_relations": "公共关系",
+    "security_study":"安全研究",
+    "sociology": "社会学",
+    "sports_science": "体育学",
+    "traditional_chinese_medicine": "中医中药",
+    "virology": "病毒学",
+    "world_history":"世界历史",
+    "world_religions": "世界宗教",
+}
+subcategories = {
+    "agronomy": ['other'],
+    "anatomy": ['biology'],
+    "ancient_chinese": ['linguistics','china specific'],
+    "arts": ['arts'],
+    "astronomy": ['physics'],
+    "business_ethics": ['business'],
+    "chinese_civil_service_exam": ['politics','china specific'],
+    "chinese_driving_rule": ['other','china specific'],
+    "chinese_food_culture": ['culture','china specific'],
+    "chinese_foreign_policy": ['politics','china specific'],
+    "chinese_history":['history','china specific'],
+    "chinese_literature": ['literature','china specific'],
+    "chinese_teacher_qualification": ['education','china specific'],
+    "college_actuarial_science":['math'],
+    "college_education":['education'],
+    "college_engineering_hydrology": ['engineering'],
+    "college_law": ['law'],
+    "college_mathematics": ['math'],
+    "college_medical_statistics":['statistics'],
+    "clinical_knowledge": ['other'],
+    "college_medicine": ['other'],
+    "computer_science": ['computer science'],
+    "computer_security": ['other'],
+    "conceptual_physics": ['physics'],
+    "construction_project_management": ['other','china specific'],
+    "economics": ['economics'],
+    "education": ['education'],
+    "elementary_chinese":['linguistics','china specific'],
+    "elementary_commonsense":['other','china specific'],
+    "elementary_information_and_technology": ['other'],
+    "electrical_engineering": ['engineering'],
+    "elementary_mathematics": ['math'],
+    "ethnology": ['culture','china specific'],
+    "food_science": ['other'],
+    "genetics": ['biology'],
+    "global_facts": ['global'],
+    "high_school_biology": ['biology'],
+    "high_school_chemistry": ['chemistry'],
+    "high_school_geography": ['geography'],
+    "high_school_mathematics": ['math'],
+    "high_school_physics": ['physics'],
+    "high_school_politics": ['politics','china specific'],
+    "human_sexuality": ['other'],
+    "international_law": ['law'],
+    "journalism": ['sociology'],
+    "jurisprudence": ['law'],
+    "legal_and_moral_basis": ['other'],
+    "logical": ['philosophy'],
+    "machine_learning": ['computer science'],
+    "management": ['business'],
+    "marketing": ['business'],
+    "marxist_theory": ['philosophy'],
+    "modern_chinese": ['linguistics','china specific'],
+    "nutrition": ['other'],
+    "philosophy": ['philosophy'],
+    "professional_accounting": ['business'],
+    "professional_law": ['law'],
+    "professional_medicine": ['other'],
+    "professional_psychology": ['psychology'],
+    "public_relations": ['politics'],
+    "security_study": ['politics'],
+    "sociology": ['culture'],
+    "sports_science": ['other'],
+    "traditional_chinese_medicine": ['other','china specific'],
+    "virology": ['biology'],
+    "world_history":['history'],
+    "world_religions": ['global'],
+}
+categories = {
+    "STEM": ["physics", "chemistry", "biology", "computer science", "math", "engineering", "statistics"],
+    "Humanities": ["history", "philosophy", "law", "arts", "literature", "global"],
+    "Social Science": ['linguistics',"business", "politics", "culture", "economics", "geography", "psychology", "education", "sociology"],
+    "Other":["other"],
+    "China specific": ["china specific"],
+}
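Review note on `subcategories` / `categories`: a subject tagged `'china specific'` belongs to two top-level groups at once. The sketch below uses a three-subject excerpt of the tables above and a hypothetical helper, `groups_for_subjects`, to list every group a subject falls into. It is only an illustration of the data layout; `eval.py` in this commit keeps a single group per subject instead.

```python
from collections import defaultdict

# Excerpt of the tables in categories.py, reduced for illustration.
subcategories = {
    "anatomy": ["biology"],
    "chinese_history": ["history", "china specific"],
    "agronomy": ["other"],
}
categories = {
    "STEM": ["biology"],
    "Humanities": ["history"],
    "Other": ["other"],
    "China specific": ["china specific"],
}


def groups_for_subjects(subcategories, categories):
    """Map each subject to every top-level group it belongs to."""
    groups = defaultdict(list)
    for group, members in categories.items():
        for subject, subcats in subcategories.items():
            if any(c in members for c in subcats):
                groups[subject].append(group)
    return dict(groups)


mapping = groups_for_subjects(subcategories, categories)
```

Groups are appended in the order the keys appear in `categories`, so a China-specific subject lists its primary group first and `"China specific"` last.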
--- a/scripts/cmmlu/eval.py
+++ b/scripts/cmmlu/eval.py
+# This code is modified from C-Eval Project: https://github.com/SJTU-LIT/ceval
+import os
+import argparse
+import pandas as pd
+import torch
+import json
+from llama2_evaluator import Llama_Evaluator
+from glob import glob
+import time
+from collections import defaultdict
+from categories import name_en2zh, subcategories, categories
+choices = ["A", "B", "C", "D"]
+category2subject = defaultdict(list)
+for k,v in categories.items():
+    for subject, subcat in subcategories.items():
+        for c in subcat:
+            if c in v:
+                category2subject[k].append(subject)
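+# Flatten to {subject: [subject, Chinese name, group]}. A subject tagged 'china specific'
+# appears under two groups; the later assignment wins, so it is counted under "China specific" only.
+category2subject_list = defaultdict(list)
+for key,value in category2subject.items():
+    for val in value:
+        category2subject_list[val]=[val,name_en2zh[val],key]
+category2subject=category2subject_list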
+def main(args, evaluator,take):
+    subject_mapping = category2subject #json.load(f)
+    filenames = [os.path.basename(s) for s in glob(args.input_dir+"/test/*.csv")]
+    subject_list = [os.path.splitext(val_file)[0] for val_file in filenames]
+    accuracy, summary = {}, {}
+    run_date=time.strftime('%Y-%m-%d_%H-%M-%S',time.localtime(time.time()))
+    output_dir = args.output_dir
+    save_result_dir=os.path.join(output_dir,f"take{take}")
+    if not os.path.exists(save_result_dir):
+        os.makedirs(save_result_dir,exist_ok=True)
+    all_answers = {}
+    for index,subject_name in enumerate(subject_list):
+        print(f"{index+1}/{len(subject_list)} Inference starts at {run_date} on {args.model_path} with subject of {subject_name}!")
+        val_file_path=os.path.join(args.input_dir+'/test',f'{subject_name}.csv')
+        dev_file_path=os.path.join(args.input_dir+'/dev',f'{subject_name}.csv')
+        val_df=pd.read_csv(val_file_path)
+        dev_df=pd.read_csv(dev_file_path) if args.few_shot else None
+        correct_ratio, answers = evaluator.eval_subject(subject_name, val_df, dev_df,
+            save_result_dir=save_result_dir if args.do_save_csv else None,
+            few_shot=args.few_shot,
+            cot=args.cot,
+            with_prompt=args.with_prompt,
+            constrained_decoding=args.constrained_decoding,
+            do_test=False)
+        print(f"Subject: {subject_name}")
+        print(f"Acc: {correct_ratio}")
+        accuracy[subject_name] = correct_ratio
+        summary[subject_name] = {"score":correct_ratio,
+                                 "num":len(val_df),
+                                 "correct":correct_ratio*len(val_df)/100}
+        all_answers[subject_name] = answers
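+    # Write UTF-8 explicitly (ensure_ascii=False emits raw Chinese text) and close the file.
+    with open(os.path.join(save_result_dir,'submission.json'),'w',encoding='utf-8') as f:
+        json.dump(all_answers,f,ensure_ascii=False,indent=4)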
+    print("\n\nModel:",args.model_path)
+    print("Accuracy:")
+    for k, v in accuracy.items():
+        print(k, ": ", v)
+    total_num = 0
+    total_correct = 0
+    summary['grouped'] = {
+        "China specific": {"correct": 0.0, "num": 0},
+        "STEM": {"correct": 0.0, "num": 0}, 
+        "Social Science": {"correct": 0.0, "num": 0}, 
+        "Humanities": {"correct": 0.0, "num": 0}, 
+        "Other": {"correct": 0.0, "num": 0}
+        }
+    for subj, info in subject_mapping.items():
+        group = info[2]
+        summary['grouped'][group]["num"]   += summary[subj]['num']
+        summary['grouped'][group]["correct"] += summary[subj]['correct']
+    for group, info in summary['grouped'].items():
+        # Note: group scores are fractions in [0, 1]; per-subject scores above are percentages.
+        info['score'] = info["correct"] / info["num"]
+        total_num += info["num"]
+        total_correct += info["correct"]
+    summary['All'] = {"score": total_correct / total_num, "num": total_num, "correct": total_correct}
+    with open(os.path.join(save_result_dir,'summary.json'),'w',encoding='utf-8') as f:
+        json.dump(summary,f,ensure_ascii=False,indent=2)
+if __name__ == "__main__":
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--ntrain", "-k", type=int, default=5)
+    parser.add_argument("--model_path", type=str)
+    parser.add_argument("--cot",choices=["False","True"], default="False")
+    parser.add_argument("--few_shot", choices=["False","True"], default="True")
+    parser.add_argument("--with_prompt", choices=["False","True"], default="False")
+    parser.add_argument("--constrained_decoding", choices=["False","True"], default="False")
+    parser.add_argument("--temperature",type=float,default=0.2)
+    parser.add_argument("--n_times", default=1,type=int)
+    parser.add_argument("--do_save_csv", choices=["False","True"], default="False")
+    parser.add_argument("--output_dir", type=str)
+    parser.add_argument("--input_dir", type=str)
+    parser.add_argument("--verbose", action="store_true", help="Print detailed information of each example.")
+    args = parser.parse_args()
+    args.cot = args.cot == "True"
+    args.few_shot = args.few_shot == "True"
+    args.with_prompt = args.with_prompt == "True"
+    args.do_save_csv = args.do_save_csv == "True"
+    args.constrained_decoding = args.constrained_decoding == "True"
+    if args.constrained_decoding is True:
+        args.n_times=max(args.n_times,1)
+    print(args)
+    device = torch.device(0)
+    print(device)
+    evaluator=Llama_Evaluator(
+        choices=choices,
+        k=args.ntrain,
+        model_path=args.model_path,
+        device=device,
+        temperature = args.temperature,
+        verbose = args.verbose
+    )
+    for i in range(args.n_times):
+        main(args,evaluator=evaluator,take=i)