# LLaVA-NeXT: Stronger LLMs Supercharge Multimodal Capabilities in the Wild

## Paper

`LLaVA-NeXT: Stronger LLMs Supercharge Multimodal Capabilities in the Wild`

* https://llava-vl.github.io/blog/2024-05-10-llava-next-stronger-llms/

## Model Architecture

See [README.md](../README.md).

## Algorithm

See [README.md](../README.md).

## Dataset



## Training



## Inference

```bash
python image.py
```

Note: edit the model path inside `image.py` before running.
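`image.py` itself is not reproduced here. Purely as an illustration, a single-image inference script for `llama3-llava-next-8b` could look like the sketch below, which assumes the Hugging Face `transformers` LLaVA-NeXT classes and a local checkpoint path; neither is necessarily what the repository's `image.py` actually does.

```python
"""Hypothetical sketch of a LLaVA-NeXT inference script (NOT the
repository's actual image.py). The model path is a placeholder."""
import os

# Assumed checkpoint location; override via the MODEL_PATH env var.
MODEL_PATH = os.environ.get("MODEL_PATH", "ckpts/llama3-llava-next-8b")


def answer(image, question, model_path=MODEL_PATH):
    """Run one image-question round trip through the model."""
    # Heavy imports live inside the function so the module can be
    # inspected without torch/transformers installed.
    import torch
    from transformers import (LlavaNextForConditionalGeneration,
                              LlavaNextProcessor)

    processor = LlavaNextProcessor.from_pretrained(model_path)
    model = LlavaNextForConditionalGeneration.from_pretrained(
        model_path, torch_dtype=torch.float16, device_map="auto")

    # Build the chat-formatted prompt the checkpoint expects.
    conversation = [{"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": question},
    ]}]
    prompt = processor.apply_chat_template(
        conversation, add_generation_prompt=True)

    inputs = processor(images=image, text=prompt,
                       return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=128)
    return processor.decode(output[0], skip_special_tokens=True)
```

The model path can also be passed per call, which avoids editing the file at all.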

### Evaluation

```bash
export HF_ENDPOINT=https://hf-mirror.com

huggingface-cli login --token $HUGGINGFACE_TOKEN --add-to-git-credential
```

Note: logging in is required so the evaluation datasets can be downloaded automatically.


Evaluate `llama3-llava-next-8b` with `lmms_eval`, using 8 data-parallel processes:

```bash
accelerate launch --num_processes=8 \
  -m lmms_eval \
  --model llava \
  --model_args pretrained=/path/to/llama3-llava-next-8b,conv_template=llava_llama_3 \
  --tasks ai2d,chartqa,docvqa_val,mme,mmbench_en_dev \
  --batch_size 1 \
  --log_samples \
  --log_samples_suffix llava_next \
  --output_path ./logs/
```

For the larger `llava-next-72b`, launch a single process and let `device_map=auto` shard the model across GPUs:

```bash
accelerate launch --num_processes=1 \
  -m lmms_eval \
  --model llava \
  --model_args pretrained=/path/to/llava-next-72b,conv_template=qwen_1_5,model_name=llava_qwen,device_map=auto \
  --tasks ai2d,chartqa,docvqa_val,mme,mmbench_en_dev \
  --batch_size 1 \
  --log_samples \
  --log_samples_suffix llava_next \
  --output_path ./logs/
```

## Results

![alt text](readme_imgs/multimodal-8b.png)

### Accuracy



## Application Scenarios

See [README.md](../README.md).

## Pretrained Weights

|model|url|
|:---:|:---:|
|llama3-llava-next-8b|[hf](https://huggingface.co/lmms-lab/llama3-llava-next-8b) \| [SCNet](http://113.200.138.88:18080/aimodels/lmms-lab/llama3-llava-next-8b.git) |
|llava-next-qwen-32b|[hf](https://huggingface.co/lmms-lab/llava-next-qwen-32b) \| [SCNet](http://113.200.138.88:18080/aimodels/lmms-lab/llava-next-qwen-32b.git) |

After downloading, save the models under `ckpts` (you need to create this directory yourself).
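As one possible way to populate `ckpts`, the `huggingface_hub` library's `snapshot_download` can fetch a checkpoint from the table above; this is a sketch, and the actual download is wrapped in a function since it needs network access.

```python
"""Sketch: download a checkpoint from the table above into ckpts/."""
from pathlib import Path


def local_dir(repo_id, root="ckpts"):
    """Map a Hub repo id to its target directory under ckpts/."""
    return str(Path(root) / repo_id.split("/")[-1])


def fetch(repo_id="lmms-lab/llama3-llava-next-8b", root="ckpts"):
    """Download the full checkpoint; requires network access."""
    # Import here so the sketch can be read without huggingface_hub.
    from huggingface_hub import snapshot_download

    target = Path(local_dir(repo_id, root))
    target.mkdir(parents=True, exist_ok=True)
    return snapshot_download(repo_id=repo_id, local_dir=target)
```

Calling `fetch()` places the 8B weights in `ckpts/llama3-llava-next-8b`, matching the path expected by the inference and evaluation commands above.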

## Source Repository and Issue Feedback

See [README.md](../README.md).

## References

* https://github.com/LLaVA-VL/LLaVA-NeXT/blob/main/docs/LLaVA-NeXT.md
* https://llava-vl.github.io/blog/2024-05-10-llava-next-stronger-llms/