    Multimodal prototyping (#2243) · fb963f0f
    Lintang Sutawika authored
    
    
    * add WIP hf vlm class
    
    * add doc_to_image
    
    * add mmmu tasks
    
    * fix merge conflicts
    
    * add lintang's changes to hf_vlms.py
    
    * fix doc_to_image
    
    * added yaml_path for config-loading
    
    * revert
    
    * add line to process str type v
    
    * update
    
    * modeling cleanup
    
    * add aggregation for mmmu
    
    * rewrite MMMU processing code based on only MMMU authors' repo (doc_to_image still WIP)
    
    * implemented doc_to_image
    
    * update doc_to_image to accept list of features
    
    * update functions
    
    * re-add image processed
    
    * update args process
    
    * bugfix for repeated images fed to model
    
    * push WIP loglikelihood code
    
    * commit most recent code (generative; qwen2-vl testing)
    
    * preliminary image_token_id handling
    
    * small mmmu update: some qs have >4 mcqa options
    
    * push updated modeling code
    
    * use processor.apply_chat_template
    
    * add mathvista draft
    
    * nit
    
    * nit
    
    * ensure no footguns in text<>multimodal LM<>task incompatibility
    
    * add notification to readme regarding launch of prototype!
    
    * fix compatibility check
    
    * reorganize mmmu configs
    
    * chat_template=None
    
    * add interleave chat_template
    
    * add condition
    
    * add max_images; interleave=true
    
    * nit
    
    * testmini_mcq
    
    * nit
    
    * pass image string; convert img
    
    * add vllm
    
    * add init
    
    * vlm add multi attr
    
    * fixup
    
    * pass max images to vllm model init
    
    * nit
    
    * encoding to device
    
    * fix HFMultimodalLM.chat_template ?
    
    * add mmmu readme
    
    * remove erroneous prints
    
    * use HFMultimodalLM.chat_template ; restore tasks/__init__.py
    
    * add docstring for replace_placeholders in utils
    
    * fix `replace_placeholders`; set image_string=None
    
    * fix typo
    
    * cleanup + fix merge conflicts
    
    * update MMMU readme
    
    * del mathvista
    
    * add some sample scores
    
    * Update README.md
    
    * add log msg for image_string value
    
    ---------
    Co-authored-by: haileyschoelkopf <hailey@eleuther.ai>
    Co-authored-by: Baber Abbasi <baber@eleuther.ai>
    Co-authored-by: Baber <baber@hey.com>
    Co-authored-by: Hailey Schoelkopf <65563625+haileyschoelkopf@users.noreply.github.com>
README.md 38.9 KB