Multimodal prototyping (#2243)

* add WIP hf vlm class
* add doc_to_image
* add mmmu tasks
* fix merge conflicts
* add lintang's changes to hf_vlms.py
* fix doc_to_image
* added yaml_path for config-loading
* revert
* add line to process str type v
* update
* modeling cleanup
* add aggregation for mmmu
* rewrite MMMU processing code based on only MMMU authors' repo (doc_to_image still WIP)
* implemented doc_to_image
* update doc_to_image to accept list of features
* update functions
* readd image processed
* update args process
* bugfix for repeated images fed to model
* push WIP loglikelihood code
* commit most recent code (generative ; qwen2-vl testing)
* preliminary image_token_id handling
* small mmmu update: some qs have >4 mcqa options
* push updated modeling code
* use processor.apply_chat_template
* add mathvista draft
* nit
* nit
* ensure no footguns in text<>multimodal LM<>task incompatibility
* add notification to readme regarding launch of prototype!
* fix compatibility check
* reorganize mmmu configs
* chat_template=None
* add interleave chat_template
* add condition
* add max_images; interleave=true
* nit
* testmini_mcq
* nit
* pass image string; convert img
* add vllm
* add init
* vlm add multi attr
* fixup
* pass max images to vllm model init
* nit
* encoding to device
* fix HFMultimodalLM.chat_template ?
* add mmmu readme
* remove erroneous prints
* use HFMultimodalLM.chat_template ; restore tasks/__init__.py
* add docstring for replace_placeholders in utils
* fix `replace_placeholders`; set image_string=None
* fix typo
* cleanup + fix merge conflicts
* update MMMU readme
* del mathvista
* add some sample scores
* Update README.md
* add log msg for image_string value

---------

Co-authored-by: haileyschoelkopf <hailey@eleuther.ai>
Co-authored-by: Baber Abbasi <baber@eleuther.ai>
Co-authored-by: Baber <baber@hey.com>
Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>

Commit fb963f0f, authored Sep 13, 2024 by Lintang Sutawika, committed by GitHub Sep 13, 2024
20 changed files
--- a/lm_eval/tasks/mmmu/mmmu_art.yaml
+++ b/lm_eval/tasks/mmmu/mmmu_art.yaml
+task: mmmu_val_art
+include: _template_yaml
+task_alias: Art
+dataset_name: Art
--- a/lm_eval/tasks/mmmu/mmmu_art_theory.yaml
+++ b/lm_eval/tasks/mmmu/mmmu_art_theory.yaml
+task: mmmu_val_art_theory
+include: _template_yaml
+task_alias: Art Theory
+dataset_name: Art_Theory
--- a/lm_eval/tasks/mmmu/mmmu_basic_medical_science.yaml
+++ b/lm_eval/tasks/mmmu/mmmu_basic_medical_science.yaml
+task: mmmu_val_basic_medical_science
+include: _template_yaml
+task_alias: Basic Medical Science
+dataset_name: Basic_Medical_Science
--- a/lm_eval/tasks/mmmu/mmmu_biology.yaml
+++ b/lm_eval/tasks/mmmu/mmmu_biology.yaml
+task: mmmu_val_biology
+include: _template_yaml
+task_alias: Biology
+dataset_name: Biology
--- a/lm_eval/tasks/mmmu/mmmu_chemistry.yaml
+++ b/lm_eval/tasks/mmmu/mmmu_chemistry.yaml
+task: mmmu_val_chemistry
+include: _template_yaml
+task_alias: Chemistry
+dataset_name: Chemistry
--- a/lm_eval/tasks/mmmu/mmmu_clinical_medicine.yaml
+++ b/lm_eval/tasks/mmmu/mmmu_clinical_medicine.yaml
+task: mmmu_val_clinical_medicine
+include: _template_yaml
+task_alias: Clinical Medicine
+dataset_name: Clinical_Medicine
--- a/lm_eval/tasks/mmmu/mmmu_computer_science.yaml
+++ b/lm_eval/tasks/mmmu/mmmu_computer_science.yaml
+task: mmmu_val_computer_science
+include: _template_yaml
+task_alias: Computer Science
+dataset_name: Computer_Science
--- a/lm_eval/tasks/mmmu/mmmu_design.yaml
+++ b/lm_eval/tasks/mmmu/mmmu_design.yaml
+task: mmmu_val_design
+include: _template_yaml
+task_alias: Design
+dataset_name: Design
--- a/lm_eval/tasks/mmmu/mmmu_diagnostics_and_laboratory_medicine.yaml
+++ b/lm_eval/tasks/mmmu/mmmu_diagnostics_and_laboratory_medicine.yaml
+task: mmmu_val_diagnostics_and_laboratory_medicine
+include: _template_yaml
+task_alias: Diagnostics and Laboratory Medicine
+dataset_name: Diagnostics_and_Laboratory_Medicine
--- a/lm_eval/tasks/mmmu/mmmu_economics.yaml
+++ b/lm_eval/tasks/mmmu/mmmu_economics.yaml
+task: mmmu_val_economics
+include: _template_yaml
+task_alias: Economics
+dataset_name: Economics
--- a/lm_eval/tasks/mmmu/mmmu_electronics.yaml
+++ b/lm_eval/tasks/mmmu/mmmu_electronics.yaml
+task: mmmu_val_electronics
+include: _template_yaml
+task_alias: Electronics
+dataset_name: Electronics
--- a/lm_eval/tasks/mmmu/mmmu_energy_and_power.yaml
+++ b/lm_eval/tasks/mmmu/mmmu_energy_and_power.yaml
+task: mmmu_val_energy_and_power
+include: _template_yaml
+task_alias: Energy and Power
+dataset_name: Energy_and_Power
--- a/lm_eval/tasks/mmmu/mmmu_finance.yaml
+++ b/lm_eval/tasks/mmmu/mmmu_finance.yaml
+task: mmmu_val_finance
+include: _template_yaml
+task_alias: Finance
+dataset_name: Finance
--- a/lm_eval/tasks/mmmu/mmmu_geography.yaml
+++ b/lm_eval/tasks/mmmu/mmmu_geography.yaml
+task: mmmu_val_geography
+include: _template_yaml
+task_alias: Geography
+dataset_name: Geography
--- a/lm_eval/tasks/mmmu/mmmu_history.yaml
+++ b/lm_eval/tasks/mmmu/mmmu_history.yaml
+task: mmmu_val_history
+include: _template_yaml
+task_alias: History
+dataset_name: History
--- a/lm_eval/tasks/mmmu/mmmu_literature.yaml
+++ b/lm_eval/tasks/mmmu/mmmu_literature.yaml
+task: mmmu_val_literature
+include: _template_yaml
+task_alias: Literature
+dataset_name: Literature
--- a/lm_eval/tasks/mmmu/mmmu_manage.yaml
+++ b/lm_eval/tasks/mmmu/mmmu_manage.yaml
+task: mmmu_val_manage
+include: _template_yaml
+task_alias: Manage
+dataset_name: Manage
--- a/lm_eval/tasks/mmmu/mmmu_marketing.yaml
+++ b/lm_eval/tasks/mmmu/mmmu_marketing.yaml
+task: mmmu_val_marketing
+include: _template_yaml
+task_alias: Marketing
+dataset_name: Marketing
--- a/lm_eval/tasks/mmmu/mmmu_materials.yaml
+++ b/lm_eval/tasks/mmmu/mmmu_materials.yaml
+task: mmmu_val_materials
+include: _template_yaml
+task_alias: Materials
+dataset_name: Materials
--- a/lm_eval/tasks/mmmu/mmmu_math.yaml
+++ b/lm_eval/tasks/mmmu/mmmu_math.yaml
+task: mmmu_val_math
+include: _template_yaml
+task_alias: Math
+dataset_name: Math
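
The per-subject configs in this diff all follow one pattern: `task` is `mmmu_val_` plus the lowercased `dataset_name`, `task_alias` is `dataset_name` with underscores replaced by spaces, and every file includes `_template_yaml`. As a rough sketch only (the commit does not say how these files were produced), the pattern could be reproduced with a small script. `SUBJECTS`, `render_config`, and `write_configs` are hypothetical names introduced here for illustration; they are not part of lm_eval.

```python
from pathlib import Path

# dataset_name values copied from the 20 configs shown in this diff.
SUBJECTS = [
    "Art", "Art_Theory", "Basic_Medical_Science", "Biology", "Chemistry",
    "Clinical_Medicine", "Computer_Science", "Design",
    "Diagnostics_and_Laboratory_Medicine", "Economics", "Electronics",
    "Energy_and_Power", "Finance", "Geography", "History", "Literature",
    "Manage", "Marketing", "Materials", "Math",
]


def render_config(dataset_name):
    """Return (filename, YAML text) for one subject (hypothetical helper)."""
    slug = dataset_name.lower()              # e.g. "art_theory"
    alias = dataset_name.replace("_", " ")   # e.g. "Art Theory"
    text = (
        f"task: mmmu_val_{slug}\n"
        "include: _template_yaml\n"
        f"task_alias: {alias}\n"
        f"dataset_name: {dataset_name}\n"
    )
    return f"mmmu_{slug}.yaml", text


def write_configs(out_dir, subjects=SUBJECTS):
    """Write one YAML file per subject into out_dir and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for name in subjects:
        filename, text = render_config(name)
        path = out / filename
        path.write_text(text, encoding="utf-8")
        paths.append(path)
    return paths
```

Calling `render_config("Art_Theory")` yields the filename `mmmu_art_theory.yaml` and the four lines added in the corresponding hunk above. Keeping the shared settings in the included template means each subject file only needs to state what differs.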