gaoqiong
lm-evaluation-harness
Commit cd71c7f0
Authored Jul 12, 2024 by Yen-Ting Lin
Committed by GitHub, Jul 12, 2024
clean for pr
parent a2af2101
Changes
46
Showing 6 changed files with 0 additions and 141 deletions
lm_eval/tasks/umtceval/default/umtceval_hr_law.yaml  +0 -7
lm_eval/tasks/umtceval/default/umtceval_process_en.yaml  +0 -7
lm_eval/tasks/umtceval/default/umtceval_umtc.yaml  +0 -7
lm_eval/tasks/umtceval/default/utils.py  +0 -17
lm_eval/tasks/umtceval/subject.tsv  +0 -4
run_all.sh  +0 -99
lm_eval/tasks/umtceval/default/umtceval_hr_law.yaml (deleted, 100644 → 0)
"dataset_name": "hr_law"
"description": "以下為欣興電子人資的單選題,請提供正確答案的選項。\n\n"
"group": "umtceval_law"
"group_alias": "law"
"include": "_default_template_yaml"
"task": "umtceval_hr_law"
"task_alias": "hr law"
lm_eval/tasks/umtceval/default/umtceval_process_en.yaml (deleted, 100644 → 0)
"dataset_name": "process_en"
"description": "以下為欣興電子製程的單選題,請提供正確答案的選項。\n\n"
"group": "umtceval_law"
"group_alias": "law"
"include": "_default_template_yaml"
"task": "umtceval_process_en"
"task_alias": "process en"
lm_eval/tasks/umtceval/default/umtceval_umtc.yaml (deleted, 100644 → 0)
"dataset_name": "umtc"
"description": "以下為欣興電子的單選題,請提供正確答案的選項。\n\n"
"group": "umtceval_law"
"group_alias": "law"
"include": "_default_template_yaml"
"task": "umtceval_umtc"
"task_alias": "umtc"
lm_eval/tasks/umtceval/default/utils.py (deleted, 100644 → 0)
import datasets


def process_docs(dataset: datasets.Dataset) -> datasets.Dataset:
    def _helper(doc):
        # modifies the contents of a single
        # document in our dataset.
        answer_list = ["A", "B", "C", "D"]
        choices = [doc["A"], doc["B"], doc["C"], doc["D"]]
        out_doc = {
            "questions": doc["question"],
            "choices": choices,
            "goal": answer_list.index(doc["answer"]),
        }
        return out_doc

    return dataset.map(_helper)  # returns back a datasets.Dataset object
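To make the effect of the deleted `process_docs` helper concrete, here is a minimal sketch of the same per-document transform on a plain dictionary, so it runs without the `datasets` package. The function name `to_multiple_choice` and the sample record are hypothetical and not part of the commit; only the field names (`question`, `A` to `D`, `answer`, `questions`, `choices`, `goal`) come from the removed file.

```python
# Plain-dict version of the per-document transform in the removed utils.py.
# The sample record below is made up for illustration.
ANSWER_LETTERS = ["A", "B", "C", "D"]


def to_multiple_choice(doc: dict) -> dict:
    """Gather the four options into a list and turn the answer letter into an index."""
    return {
        "questions": doc["question"],
        "choices": [doc[letter] for letter in ANSWER_LETTERS],
        "goal": ANSWER_LETTERS.index(doc["answer"]),
    }


sample = {
    "question": "2 + 2 = ?",
    "A": "3",
    "B": "4",
    "C": "5",
    "D": "6",
    "answer": "B",
}
converted = to_multiple_choice(sample)
print(converted["goal"])  # 1, because "B" is the second letter
```

When the same function is applied through `Dataset.map`, the returned keys are added to each record, and the original columns are kept unless `remove_columns` is passed.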
lm_eval/tasks/umtceval/subject.tsv (deleted, 100644 → 0)
subject name category
umtc 欣興電子 default
hr_law 欣興電子人資 default
process_en 欣興電子製程 default
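The three removed YAML files follow one pattern per row of `subject.tsv`, which suggests they may have been generated from it, although the commit does not say so. The sketch below shows one way such configs could be derived. The function `build_config`, the tab delimiter, and the default `group` and `group_alias` arguments are assumptions for illustration; the values themselves are copied from the deleted files.

```python
import csv
import io
import json

# Inline copy of subject.tsv. Tab separators are an assumption, since the
# page only shows the columns separated by whitespace.
SUBJECT_TSV = (
    "subject\tname\tcategory\n"
    "umtc\t欣興電子\tdefault\n"
    "hr_law\t欣興電子人資\tdefault\n"
    "process_en\t欣興電子製程\tdefault\n"
)


def build_config(row: dict, group: str = "umtceval_law", alias: str = "law") -> dict:
    """Build one task config from a TSV row (hypothetical helper)."""
    subject = row["subject"]
    return {
        "dataset_name": subject,
        "description": f"以下為{row['name']}的單選題,請提供正確答案的選項。\n\n",
        "group": group,
        "group_alias": alias,
        "include": "_default_template_yaml",
        "task": f"umtceval_{subject}",
        "task_alias": subject.replace("_", " "),
    }


rows = list(csv.DictReader(io.StringIO(SUBJECT_TSV), delimiter="\t"))
configs = {f"umtceval_{row['subject']}.yaml": build_config(row) for row in rows}

# Print each config in the quoted "key": "value" style of the deleted files.
for filename, config in configs.items():
    print(filename)
    for key, value in config.items():
        print(f"{json.dumps(key)}: {json.dumps(value, ensure_ascii=False)}")
```

Emitting each pair with `json.dumps` keeps the output valid YAML, because a double-quoted JSON string is also a valid YAML double-quoted scalar.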
run_all.sh (deleted, 100644 → 0)
#!/bin/bash

# Define the models to run
declare -a models=(
"yentinglin/Llama-3-Taiwan-70B-Instruct"
"yentinglin/Llama-3-Taiwan-70B-Instruct-DPO"
"yentinglin/Llama-3-Taiwan-8B-Instruct-rc1"
"yentinglin/Taiwan-LLM-34B-Instruct"
"yentinglin/Taiwan-LLM-MoE-pilot"
"yentinglin/Taiwan-LLM-8x7B-DPO"
"yentinglin/Taiwan-LLM-7B-v2.0-base"
"yentinglin/Taiwan-LLM-7B-v2.0-chat"
"yentinglin/Taiwan-LLM-7B-v2.0.1-chat"
"yentinglin/Taiwan-LLM-7B-v2.1-chat"
"yentinglin/Taiwan-LLM-13B-v2.0-base"
"yentinglin/Taiwan-LLM-13B-v2.0-chat"
"yentinglin/Taiwan-LLaMa-v1.0"
"yentinglin/Taiwan-LLaMa-v1.0-base"
"yentinglin/Taiwan-LLaMa-v0.9"
"yentinglin/Taiwan-LLaMa-v0.0"
"meta-llama/Meta-Llama-3-70B-Instruct"
"meta-llama/Meta-Llama-3-70B"
"meta-llama/Meta-Llama-3-8B-Instruct"
"meta-llama/Meta-Llama-3-8B"
"Qwen/Qwen1.5-110B-Chat"
"Qwen/Qwen1.5-110B"
"Qwen/Qwen1.5-32B"
"Qwen/Qwen1.5-32B-Chat"
"Qwen/Qwen1.5-72B-Chat"
"Qwen/Qwen1.5-72B"
"Qwen/Qwen1.5-MoE-A2.7B"
"Qwen/Qwen1.5-MoE-A2.7B-Chat"
"Qwen/Qwen1.5-4B"
"Qwen/Qwen1.5-4B-Chat"
"Qwen/Qwen1.5-0.5B"
"Qwen/Qwen1.5-0.5B-Chat"
"Qwen/Qwen1.5-1.8B"
"Qwen/Qwen1.5-7B"
"Qwen/Qwen1.5-14B"
"Qwen/Qwen1.5-14B-Chat"
"deepseek-ai/DeepSeek-V2-Chat"
"01-ai/Yi-1.5-34B"
"01-ai/Yi-1.5-34B-Chat"
"01-ai/Yi-1.5-34B-32K"
"01-ai/Yi-1.5-34B-Chat-16K"
"01-ai/Yi-1.5-9B-32K"
"01-ai/Yi-1.5-9B-Chat-16K"
"01-ai/Yi-1.5-9B"
"01-ai/Yi-1.5-9B-Chat"
"01-ai/Yi-1.5-6B"
"01-ai/Yi-1.5-6B-Chat"
"CohereForAI/c4ai-command-r-plus"
"CohereForAI/c4ai-command-r-v01"
"CohereForAI/aya-23-35B"
"CohereForAI/aya-23-8B"
"mistralai/Mixtral-8x22B-Instruct-v0.1"
"mistralai/Mixtral-8x22B-v0.1"
"mistralai/Mistral-7B-Instruct-v0.3"
"mistralai/Mistral-7B-v0.3"
"mistralai/Mistral-7B-Instruct-v0.2"
"mistralai/Mixtral-8x7B-Instruct-v0.1"
"mistralai/Mixtral-8x7B-v0.1"
"mistralai/Mistral-7B-v0.1"
"MediaTek-Research/Breeze-7B-32k-Instruct-v1_0"
"MediaTek-Research/Breeze-7B-Instruct-v0_1"
"MediaTek-Research/Breeze-7B-Base-v0_1"
"MediaTek-Research/Breeze-7B-Instruct-v1_0"
"MediaTek-Research/Breeze-7B-Base-v1_0"
"INX-TEXT/Bailong-instruct-7B"
"taide/Llama3-TAIDE-LX-8B-Chat-Alpha1"
"taide/TAIDE-LX-7B-Chat"
"taide/TAIDE-LX-7B"
"microsoft/Phi-3-mini-4k-instruct"
"microsoft/Phi-3-mini-128k-instruct"
"microsoft/Phi-3-small-8k-instruct"
"microsoft/Phi-3-small-128k-instruct"
"microsoft/Phi-3-medium-4k-instruct"
"microsoft/Phi-3-medium-128k-instruct"
"google/gemma-1.1-2b-it"
"google/gemma-1.1-7b-it"
"google/gemma-7b"
"google/gemma-2b"
"apple/OpenELM-3B-Instruct"
)

# SLURM script to be used
SLURM_SCRIPT="harness_eval.slurm"

# Parameters for the script
PARAMS="tmlu,twllm_eval,tw_legal,ccp,pega,tmmluplus,mmlu,pega_mmlu,umtceval"

# Loop through each model and submit a job
for model in "${models[@]}"
do
    echo "Submitting job for $model"
    sbatch $SLURM_SCRIPT $model $PARAMS
done

echo "All jobs submitted"
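For readers who want to preview what the removed loop would have submitted, the following sketch builds the same `sbatch` argument lists in Python without running anything. The function `build_commands` is hypothetical, and only two entries from the model list are included to keep it short; the script name and task list are taken from `run_all.sh`.

```python
import shlex

SLURM_SCRIPT = "harness_eval.slurm"
PARAMS = "tmlu,twllm_eval,tw_legal,ccp,pega,tmmluplus,mmlu,pega_mmlu,umtceval"

# Two entries from the model list in run_all.sh, to keep the sketch short.
MODELS = [
    "yentinglin/Llama-3-Taiwan-70B-Instruct",
    "meta-llama/Meta-Llama-3-8B",
]


def build_commands(models, script=SLURM_SCRIPT, params=PARAMS):
    """Return one sbatch argument list per model, without executing anything."""
    return [["sbatch", script, model, params] for model in models]


commands = build_commands(MODELS)
for argv in commands:
    # shlex.join renders the list as a shell-safe command line for inspection.
    print(shlex.join(argv))
```

Each list could be passed to `subprocess.run(argv, check=True)` on a machine where `sbatch` is available; keeping the arguments as a list avoids the word-splitting that unquoted shell variables are subject to.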