Commit 06c5529f authored by Rayyyyy's avatar Rayyyyy

Update some changes according to meta-llama/llama3

parent 0a0618ac
......@@ -51,7 +51,8 @@ do
wget --continue ${PRESIGNED_URL/'*'/"${MODEL_PATH}/tokenizer.model"} -O ${TARGET_FOLDER}"/${MODEL_FOLDER_PATH}/tokenizer.model"
wget --continue ${PRESIGNED_URL/'*'/"${MODEL_PATH}/checklist.chk"} -O ${TARGET_FOLDER}"/${MODEL_FOLDER_PATH}/checklist.chk"
echo "Checking checksums"
if [ "$CPU_ARCH" = "arm64" ]; then
CPU_ARCH=$(uname -m)
if [[ "$CPU_ARCH" == "arm64" ]]; then
(cd ${TARGET_FOLDER}"/${MODEL_FOLDER_PATH}" && md5 checklist.chk)
else
(cd ${TARGET_FOLDER}"/${MODEL_FOLDER_PATH}" && md5sum -c checklist.chk)
......
......@@ -3,7 +3,7 @@ This document contains additional context on the settings and parameters for how
### Auto-eval benchmark notes
#### MMLU
- We report macro averages for the MMLU benchmarks. The micro-average numbers for MMLU are 65.4 and 67.4 for the 8B pre-trained and instruct-aligned models, and 78.9 and 82.0 for the 70B pre-trained and instruct-aligned models
- For the instruct-aligned MMLU we ask the model to generate the best choice character
- The pre-trained models are evaluated in the standard way by calculating the likelihood of each choice character. For the instruct-aligned models, we use a dialogue prompt (*user/assistant*) for the shots and ask the model to generate the best choice character as the answer.
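The macro/micro distinction above is just how per-subject scores are pooled: macro averages each subject's accuracy with equal weight, while micro pools all questions. A minimal sketch (the counts below are illustrative, not the reported MMLU numbers):

```python
def macro_micro(per_subject):
    """per_subject: list of (num_correct, num_total) pairs, one per subject.

    Macro: mean of per-subject accuracies (each subject weighted equally).
    Micro: total correct / total questions (each question weighted equally).
    """
    macro = sum(c / t for c, t in per_subject) / len(per_subject)
    micro = sum(c for c, _ in per_subject) / sum(t for _, t in per_subject)
    return macro, micro
```

The two diverge when subjects have different question counts, which is why both figures are quoted.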
#### AGI English
- We use the default few-shot and prompt settings as specified [here](https://github.com/ruixiangcui/AGIEval). The score is averaged over the English subtasks.
#### CommonSenseQA
......
......@@ -46,7 +46,7 @@ class TokenizerTests(TestCase):
[
128006, # <|start_header_id|>
882, # "user"
128007, # <|end_of_header|>
128007, # <|end_header_id|>
271, # "\n\n"
2028, 374, 264, 1296, 11914, 13, # This is a test sentence.
128009, # <|eot_id|>
......@@ -70,19 +70,19 @@ class TokenizerTests(TestCase):
128000, # <|begin_of_text|>
128006, # <|start_header_id|>
9125, # "system"
128007, # <|end_of_header|>
128007, # <|end_header_id|>
271, # "\n\n"
2028, 374, 264, 1296, 11914, 13, # "This is a test sentence."
128009, # <|eot_id|>
128006, # <|start_header_id|>
882, # "user"
128007, # <|end_of_header|>
128007, # <|end_header_id|>
271, # "\n\n"
2028, 374, 264, 2077, 13, # "This is a response.",
128009, # <|eot_id|>
128006, # <|start_header_id|>
78191, # "assistant"
128007, # <|end_of_header|>
128007, # <|end_header_id|>
271, # "\n\n"
]
)
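The corrected comments in the test above reflect the chat framing these token IDs encode: each message is `<|start_header_id|>` role `<|end_header_id|>` `"\n\n"` body `<|eot_id|>`. A sketch of that framing, using only IDs that appear in the expectations (the `encode_message` helper is illustrative, not part of the library under test):

```python
# Special token IDs as listed in the test expectations.
START_HEADER_ID = 128006  # <|start_header_id|>
END_HEADER_ID = 128007    # <|end_header_id|>
EOT_ID = 128009           # <|eot_id|>
DOUBLE_NEWLINE = 271      # "\n\n"

def encode_message(role_tokens, body_tokens):
    """Frame one pre-tokenized chat message with the Llama 3 header tokens."""
    return [START_HEADER_ID, *role_tokens, END_HEADER_ID,
            DOUBLE_NEWLINE, *body_tokens, EOT_ID]
```

Applied to the "user" role (882) and the test sentence, this reproduces the first expected sequence in the hunk.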