Merge pull request #1 from Yuliang-Liu/main

update dev

Commit 80836a45 (unverified), authored Nov 14, 2023 by Melos, committed by GitHub Nov 14, 2023

Showing with 18 additions and 8 deletions

README.md +18 -8

images/demo_gpt4v_compare.png +0 -0

--- a/README.md
+++ b/README.md
@@ -15,7 +15,7 @@ Zhang Li*, Biao Yang*, Qiang Liu, Zhiyin Ma, Shuo Zhang, Jingxu Yang, Yabo Sun,
 </div>

 <p align="center">
-<a href="updating">Paper will be released soon</a>&nbsp&nbsp | &nbsp&nbsp<a href="http://221.232.49.195:7680/">Demo</a>&nbsp&nbsp 
+<a href="https://arxiv.org/abs/2311.06607">Paper</a>&nbsp&nbsp | &nbsp&nbsp<a href="http://121.60.58.184:7680/">Demo</a>&nbsp&nbsp | &nbsp&nbsp<a href="updating">Model&Code update soon</a>&nbsp&nbsp 
 <!--     | &nbsp&nbsp<a href="Monkey Model">Monkey Models</a>&nbsp ｜ &nbsp <a href="updating">Tutorial</a> -->
 </p>
 -----
@@ -28,26 +28,36 @@ Zhang Li*, Biao Yang*, Qiang Liu, Zhiyin Ma, Shuo Zhang, Jingxu Yang, Yabo Sun,
 - **Support resolution up to 1344 x 896.** Surpassing the standard 448 x 448 resolution typically employed for LMMs, this significant increase in resolution augments the ability to discern and understand unnoticeable or tightly clustered objects and dense text. 
 - **Enhanced general performance.** We carried out testing across 16 diverse datasets, leading to impressive performance by our Monkey model in tasks such as Image Captioning, General Visual Question Answering, Text-centric Visual Question Answering, and Document-oriented Visual Question Answering.

-## performance
+## Demo

+To use the [demo](http://121.60.58.184:7680/), simply upload an image from your desktop or phone, or capture one directly. Before 11/11/2023, we observed that in many cases Monkey achieved more accurate results than GPT4V:
+<br>
+<p align="center">
+    <img src="images/demo_gpt4v_compare.png" width="900"/>
+<p>
 <br>

+To get responses in Chinese, use the '生成中文描述' ("Generate Chinese description") button.
+
+<br>
 <p align="center">
-    <img src="images/radar.png" width="800"/>
+    <img src="images/generation.png" width="900"/>
 <p>
 <br>


-## Demo

-Have a try using the providing [Demo](http://221.232.49.195:7680/). All you need are to simpley upload or capture image from desktop or your phone, then click the generate. You may also generate multiple times to get more information. You can also generate Chinese answer by using “生成中文描述”: 
+## performance

 <br>
+
 <p align="center">
-    <img src="images/generation.png" width="900"/>
+    <img src="images/radar.png" width="800"/>
 <p>
 <br>

+
+
 ## Cases

 Our model can accurately describe the details in the image.

--- a/images/demo_gpt4v_compare.png
+++ b/images/demo_gpt4v_compare.png