Commit 655718d0 authored by Janna, committed by GitHub

Longbench v2 (#3338)



* initial commit

* change to acc

* fix long-dialogue tasks

* fix versioning

* more fixes

* fix naming

* fix naming

* more renaming

* maybe a dataset fix

* fix dataset and use new dataset schema

* add README

* fix prompt and dataset naming

* lint

* remove utils.py

* lint

* more linting

* fix typo

* fix naming

* add longbenchv2

---------
Co-authored-by: Baber <baber@hey.com>
parent 8efef8f1
include: _longbench_common_yaml
tag:
- longbench2
- longbench2_multi
task: longbench2_legal_multi
dataset_name: legal_multi

include: _longbench_common_yaml
tag:
- longbench2
- longbench2_single
task: longbench2_legal_single
dataset_name: legal_single

include: _longbench_common_yaml
tag:
- longbench2
- longbench2_single
task: longbench2_lit_single
dataset_name: literary

include: _longbench_common_yaml
tag:
- longbench2
task: longbench2_code
dataset_name: code_repo_qa

include: _longbench_common_yaml
tag:
- longbench2
- longbench2_incontext
task: longbench2_many_shot
dataset_name: manyshot_learning

include: _longbench_common_yaml
tag:
- longbench2
- longbench2_multi
task: longbench2_news_multi
dataset_name: multinews

include: _longbench_common_yaml
tag:
- longbench2
- longbench2_structured
task: longbench2_table
dataset_name: table_qa

include: _longbench_common_yaml
tag:
- longbench2
- longbench2_incontext
task: longbench2_translate
dataset_name: new_language_translation

include: _longbench_common_yaml
tag:
- longbench2
- longbench2_incontext
task: longbench2_user_guide
dataset_name: user_guide_qa
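
Each of the nine config fragments above pairs a `task` name with one or more `tag` entries. To see which tasks each tag would cover, the mapping can be inverted as in the sketch below. The task names and tags are copied from the fragments. The names `TASK_TAGS` and `tasks_by_tag` are hypothetical helpers introduced for illustration only; they are not part of this commit, and the sketch does not show how the evaluation harness itself resolves tags.

```python
from collections import defaultdict

# Task name -> tags, copied from the nine config fragments in this commit.
# TASK_TAGS is a hypothetical name used only for this illustration.
TASK_TAGS = {
    "longbench2_legal_multi": ["longbench2", "longbench2_multi"],
    "longbench2_legal_single": ["longbench2", "longbench2_single"],
    "longbench2_lit_single": ["longbench2", "longbench2_single"],
    "longbench2_code": ["longbench2"],
    "longbench2_many_shot": ["longbench2", "longbench2_incontext"],
    "longbench2_news_multi": ["longbench2", "longbench2_multi"],
    "longbench2_table": ["longbench2", "longbench2_structured"],
    "longbench2_translate": ["longbench2", "longbench2_incontext"],
    "longbench2_user_guide": ["longbench2", "longbench2_incontext"],
}


def tasks_by_tag(task_tags):
    """Invert a task -> tags mapping into a tag -> tasks mapping."""
    index = defaultdict(list)
    for task, tags in task_tags.items():
        for tag in tags:
            index[tag].append(task)
    return dict(index)


if __name__ == "__main__":
    # Print one line per tag, listing the tasks that carry it.
    for tag, tasks in sorted(tasks_by_tag(TASK_TAGS).items()):
        print(f"{tag}: {', '.join(tasks)}")
```

In this listing, `longbench2_code` carries only the top-level `longbench2` tag, so it appears under that tag and under no sub-tag.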