Fix inference bugs caused by the transformers version

Commit 3f958171 authored Dec 25, 2025 by chenych
6 changed files
--- a/README.md
+++ b/README.md
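@@ -7,7 +7,7 @@ InternVL 2.5 retains the same model architecture as its predecessors, InternVL 1.5 and 2.0
 As in previous versions, a pixel unshuffle operation is applied, reducing the number of visual tokens to one quarter of the original. In addition, a dynamic resolution strategy similar to that of InternVL 1.5 is adopted, splitting images into 448×448-pixel tiles. The key difference starting from InternVL 2.0 is the added support for multi-image and video data.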
 <div align=center>
-    <img src="./Pic/arch.png"/>
+    <img src="./doc/arch.png"/>
 </div>
 ## Environment Dependencies
@@ -15,11 +15,11 @@ InternVL 2.5 retains the same model architecture as its predecessors, InternVL 1.5 and 2.0
 | Software | Version |
 | :------: | :------: |
-| DTK | 24.04.3 |
+| DTK | 25.04.2 |
 | python | 3.10 |
-| torch | 2.3.0 |
+| torch | 2.5.1+das.opt1.dtk25042 |
-| transformers | >=4.37.2 |
+| transformers | 4.37.2 |
-| flash-attn | 2.6.1 |
+| flash-attn | 2.6.1+das.opt1.dtk2504 |
 Recommended image:
@@ -29,16 +29,15 @@ InternVL 2.5 retains the same model architecture as its predecessors, InternVL 1.5 and 2.0
 docker run -it --shm-size 200g --network=host --name {docker_name} --privileged --device=/dev/kfd --device=/dev/dri --device=/dev/mkfd --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -u root -v /path/your_code_path/:/path/your_code_path/ -v /opt/hyhal/:/opt/hyhal/:ro {docker_image_name} bash
 For example:
-docker run -it --shm-size=1024G --network=host --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name internvl -v /path/your_code_data/:/path/your_code_data/ -v /opt/hyhal:/opt/hyhal  image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.3.0-py3.10-dtk24.04.3-ubuntu20.04 bash
+docker run -it --shm-size 200g --network=host --name internvl --privileged --device=/dev/kfd --device=/dev/dri --device=/dev/mkfd --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -u root -v /path/your_code_path/:/path/your_code_path/ -v /opt/hyhal/:/opt/hyhal/:ro image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.5.1-ubuntu22.04-dtk25.04.2-py3.10 bash
-docker run -it --shm-size 200g --network=host --name qwen3 --privileged --device=/dev/kfd --device=/dev/dri --device=/dev/mkfd --group-add video --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -u root -v /path/your_code_path/:/path/your_code_path/ -v /opt/hyhal/:/opt/hyhal/:ro image.sourcefind.cn:5000/dcu/admin/base/vllm:0.9.2-ubuntu22.04-dtk25.04.2-py3.10 bash
 ```
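 More images can be downloaded from [光源](https://sourcefind.cn/#/service-list).
 The special deep learning libraries this project needs for DCU cards can be downloaded and installed from the [光合](https://developer.sourcefind.cn/tool/) developer community; install the remaining packages from requirements.txt: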
 ```bash
 pip install -r requirements.txt
-pip install accelerate
 ```
+> If packages such as vllm report version incompatibility warnings, these can be ignored.
 ## Dataset
 None yet
@@ -58,7 +57,7 @@ python internvl_inference.py
 ## Results
 - Multimodal inference
 <div align=left>
-    <img src="./Pic/result.png"/>
+    <img src="./doc/result.png"/>
 </div>
 ### Accuracy

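The token arithmetic described in the README context above (448×448 tiles, pixel unshuffle cutting visual tokens to a quarter) can be sketched as follows. This is a rough illustration, not code from this repository: the patch size of 14 and the 2×2 unshuffle factor are assumptions that this commit does not state, and the function names are hypothetical.

```python
TILE = 448      # tile side in pixels, as stated in the README
PATCH = 14      # ASSUMPTION: ViT patch size, not stated in this commit
UNSHUFFLE = 2   # ASSUMPTION: 2x2 pixel unshuffle, giving the README's 1/4 reduction


def tokens_per_tile(tile=TILE, patch=PATCH, factor=UNSHUFFLE):
    """Visual tokens for one tile after pixel unshuffle (hypothetical helper)."""
    side = tile // patch                 # patches along one side of the tile
    raw_tokens = side * side             # tokens before pixel unshuffle
    return raw_tokens // (factor * factor)


def tokens_for_grid(cols, rows):
    """Visual tokens for an image split into a cols x rows grid of tiles."""
    return cols * rows * tokens_per_tile()


if __name__ == "__main__":
    print(tokens_per_tile(factor=1))  # tokens per tile without unshuffle
    print(tokens_per_tile())          # tokens per tile with unshuffle
    print(tokens_for_grid(2, 3))      # tokens for a 2 x 3 tile grid
```

Under these assumptions a tile drops from 1024 to 256 tokens, which matches the one-quarter reduction the README describes.
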
--- a/Pic/arch.png
+++ b/Pic/arch.png
--- a/Pic/result.png
+++ b/Pic/result.png
--- a/Pic/theory.png
+++ b/Pic/theory.png
--- a/model.properties
+++ b/model.properties
@@ -4,7 +4,11 @@ modelCode=1473
 modelName=InternVL2.5_pytorch
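 # Model description
 modelDescription=An advanced multimodal large language model (MLLM) series that builds on InternVL 2.0, keeping its core model architecture while introducing significant enhancements in training and testing strategies and in data quality.
-# Application scenarios
+# Process type
-appScenario=inference,dialogue Q&A,scientific research,education,government,finance
+processType=inference
+# Algorithm category
+appCategory=dialogue Q&A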
 # Framework type
 frameType=Pytorch
+# Accelerator card type
+accelerateType=BW1000
--- a/requirements.txt
+++ b/requirements.txt
-transformers>=4.37.2
+transformers==4.37.2
 decord
 timm
+accelerate
\ No newline at end of file
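
Because this commit changes `transformers>=4.37.2` to the exact pin `transformers==4.37.2`, it may help to confirm the installed version before running inference. The following is a minimal sketch using only the standard library; `PINNED` and `check_pins` are hypothetical names and are not part of this repository.

```python
from importlib import metadata

# Pin copied from requirements.txt in this commit.
PINNED = {"transformers": "4.37.2"}


def check_pins(pins, get_version=metadata.version):
    """Return {package: (expected, found)} for every package that is off-pin.

    `found` is None when the package is not installed.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    problems = check_pins(PINNED)
    for name, (expected, found) in problems.items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
    if not problems:
        print("all pinned packages match")
```

The `get_version` parameter exists only so the check can be exercised without the package installed; in normal use the default lookup is enough.
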