title="TextMonkey: An OCR-Free Large Multimodal Model for Understanding Document"
description="""
<font size=4>
Welcome to TextMonkey
Hello! I'm TextMonkey, a Large Language and Vision Assistant developed by HUST VLRLab and KingSoft.
Click any example below the demo to load it.
## Example prompts for different tasks
In the prompts below, replace "Question" with your own question.
1. **Read All Text:** Read all the text in the image.
2. **Text Spotting:** OCR with grounding:
3. **Position of Text:** <ref>"Question"</ref>
4. **VQA:** "Question" Answer:
5. **VQA with Grounding:** "Question" Provide the location coordinates of the answer when answering the question.
6. **Output JSON:** Convert the chart in this image to json format. Answer: (For a document or a table, use "Convert the document in this image to json format. Answer:" or "Convert the table in this image to json format. Answer:")