Add icon and SCNet

Commit c91cfeba authored Jul 09, 2024 by Rayyyyy
Showing with 43 additions and 34 deletions

README.md +40 -32

docker/Dockerfile +1 -1

eval/nyuv2_depth/eval.sh +1 -1

hostfile +1 -0

icon.png +0 -0

--- a/README.md
+++ b/README.md
 # Painter
 ## Paper
-[Images Speak in Images: A Generalist Painter for In-Context Visual Learning](https://arxiv.org/abs/2212.02499)
+`Images Speak in Images: A Generalist Painter for In-Context Visual Learning`
+- https://arxiv.org/abs/2212.02499

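 ## Model Architecture
 The whole model is built on a ViT-family backbone. The backbone consists of an encoder and a decoder: the encoder is a stack of ViT backbone blocks, and the decoder is made up of convolutional layers.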
@@ -9,18 +10,18 @@
 </div>

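 ## Algorithm
-Painter , a generalist vision model, takes "vision-centric" as its core modeling idea and uses images as both input and output, thereby obtaining in-context visual information to complete different vision tasks. The continuous output space of vision tasks is discretized, and language or specially designed discrete tokens are used as task prompts, turning vision problems into NLP problems.
+Painter, a generalist vision model, takes "vision-centric" as its core modeling idea and uses images as both input and output, thereby obtaining in-context visual information to complete different vision tasks. The continuous output space of vision tasks is discretized, and language or specially designed discrete tokens are used as task prompts, turning vision problems into NLP problems.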
 <div align=center>
    <img src="./doc/progress.png"/>
 </div>

 ## Environment Setup
-Tips: timm==0.3.2 has the [cannot import name 'container_abcs' from 'torch._six'](https://github.com/huggingface/pytorch-image-models/issues/420#issuecomment-776459842) issue. In `timm/models/layers/helpers.py`, change `from torch._six import container_abcs` to
+Tips: timm==0.3.2 has the [cannot import name 'container_abcs' from 'torch._six'](https://github.com/huggingface/pytorch-image-models/issues/420#issuecomment-776459842) issue. In `timm/models/layers/helpers.py`, change `from torch._six import container_abcs` to

 ```bash
 import torch
-TORCH_MAJOR = int(torch.__version__.split('.')[0])
-TORCH_MINOR = int(torch.__version__.split('.')[1])
+TORCH_MAJOR=int(torch.__version__.split('.')[0])
+TORCH_MINOR=int(torch.__version__.split('.')[1])

 if TORCH_MAJOR == 1 and TORCH_MINOR < 8:
    from torch._six import container_abcs
@@ -32,11 +33,10 @@ else:
 Adjust the -v paths, docker_name, and imageID to match your environment

 ```bash
-docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:1.13.1-centos7.6-dtk-23.04.1-py38-latest
+docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-centos7.6-dtk24.04-py310
 docker run -it -v /path/your_code_data/:/path/your_code_data/ --shm-size=32G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name docker_name imageID bash

 cd /your_code_path/painter_pytorch
-pip install --upgrade setuptools wheel
 pip install -r requirements.txt
 # Install detectron2
 git clone https://github.com/facebookresearch/detectron2
@@ -48,12 +48,11 @@ python -m pip install -e detectron2

 ```bash
 cd ./docker
-cp ../requirements.txt requirements.txt
+
 docker build --no-cache -t painter:latest .
 docker run -it -v /path/your_code_data/:/path/your_code_data/ --shm-size=32G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name docker_name imageID bash

 cd /your_code_path/painter_pytorch
-pip install --upgrade setuptools wheel
 pip install -r requirements.txt
 # Install detectron2
 git clone https://github.com/facebookresearch/detectron2
@@ -62,13 +61,13 @@ python -m pip install -e detectron2

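 ### Anaconda (Method 3)

-1. The special deep learning libraries that this project needs for DCU cards can be downloaded and installed from the Guanghe developer community (光合开发者社区):  https://developer.hpccube.com/tool/
+1. The special deep learning libraries that this project needs for DCU cards can be downloaded and installed from the Guanghe developer community (光合开发者社区): https://developer.hpccube.com/tool/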

 ```
-DTK software stack: dtk23.04.1
-python: python3.8
-torch: 1.13.1
-torchvision: 0.14.1
+DTK software stack: dtk24.04
+python: python3.10
+torch: 2.1.0
+torchvision: 0.16.0
 ```

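 Tips: the versions of the DCU-related tools above (dtk software stack, python, torch, etc.) must match each other exactly
@@ -76,7 +75,6 @@ Tips: the versions of the DCU-related tools above (dtk software stack, python, torch, etc.) must
 2. Install the other, non-special libraries directly from requirements.txt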

 ```bash
-pip install --upgrade setuptools wheel
 pip install -r requirements.txt
 # Install detectron2
 git clone https://github.com/facebookresearch/detectron2
@@ -84,9 +82,9 @@ python -m pip install -e detectron2
 ```

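 ## Datasets
-This project requires many datasets. To verify part of the functionality, you can use [a toy training dataset](https://huggingface.co/BAAI/Painter/blob/main/toy_datasets.tar), which contains 10 samples from each category. Place the dataset under `$Painter_ROOT/toy_datasets` and set `DATA_PATH=toy_datasets` in `$Painter_ROOT/train_painter_vit_large.sh`. For the other parameters, see the Training section.
+This project requires many datasets. To verify part of the functionality, you can use the [toy_datasets](http://113.200.138.88:18080/aimodels/baai/Painter/-/blob/main/toy_datasets.tar) dataset, which contains 10 samples from each category. Place the dataset under `$Painter_ROOT/toy_datasets` and set `DATA_PATH=toy_datasets` in `$Painter_ROOT/train_painter_vit_large.sh`. For the other parameters, see the Training section.

-For the full datasets, see the [data instructions](docs/DATA.md). The directory structure of the full datasets is as follows:
+For the full datasets, see the [data instructions](./docs/DATA.md). The directory structure of the full datasets is as follows: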

 ```
 ├── nyu_depth_v2/
@@ -174,58 +172,67 @@ python -m pip install -e detectron2
 ```

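 ## Training
-Download the pretrained [MAE ViT-Large model ](https://dl.fbaipublicfiles.com/mae/pretrain/mae_pretrain_vit_large.pth) and update the path of the finetune parameter in `$Painter_ROOT/train.sh` or `$Painter_ROOT/single_process.sh`.
+Download the pretrained [MAE ViT-Large model](https://dl.fbaipublicfiles.com/mae/pretrain/mae_pretrain_vit_large.pth) and update the path of the finetune parameter in `$Painter_ROOT/train.sh` or `$Painter_ROOT/single_process.sh`.

 ### Single Node, Multiple Cards
-The default parameters assume a single node with 4 cards (total_bsz = 1x4x32 = 128). To use a different number of cards, modify the corresponding parameters in train.sh.
+The default parameters assume a single node with 4 cards (total_bsz=1x4x32=128). To use a different number of cards, modify the corresponding parameters in `train.sh`.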
+
 ```bash
 bash train.sh
 ```

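 ### Multiple Nodes, Multiple Cards
-Tips: the authors trained on 8 nodes with 8 cards each (total_bsz = 8x8x32 = 2048);
-When using multiple nodes, list the nodes in the hostfile, one node per line, for example: c1xxxxxx slots=4
+Tips: the authors trained on 8 nodes with 8 cards each (total_bsz=8x8x32=2048);
+
+When using multiple nodes, list the nodes in the hostfile, one node per line, for example: c1xxxxxx slots=8, where 8 means the node has 8 cards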
+
 ```bash
 bash run_train_multi.sh
 ```

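 ## Inference
-1. Download the inference model from [🤗 Hugging Face Models](https://huggingface.co/BAAI/Painter/blob/main/painter_vit_large.pth), or prepare your own model to test
+1. Download the inference model [painter_vit_large.pth](http://113.200.138.88:18080/aimodels/baai/Painter), or prepare your own model to test;

-2. Some tests cannot be verified with toy_datasets. If you run inference on toy_datasets, confirm that the default image exists; if it does not, modify the corresponding parameter, e.g.
-`PROMPT="study_room_0005b/rgb_00094"` in `eval/nyuv2_depth/eval.sh`: the rgb_00094 image does not exist in toy_datasets, so change it to an image name that does, e.g. `PROMPT="study_room_0005b/rgb_00092"`
+2. Some tests cannot be verified with toy_datasets. If you run inference on toy_datasets, confirm that the default image exists; if it does not, modify the corresponding parameter `PROMPT`;

-Inference for each dataset works as follows:
+3. Inference for each dataset works as follows: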

 ### NYU Depth V2
-Set the `JOB_NAME`, `PROMPT`, `CKPT_FILE`, and `DATA_DIR` parameters in `$Painter_ROOT/eval/nyuv2_depth/eval.sh`, then run:
+Set the `JOB_NAME`, `PROMPT`, `CKPT_FILE`, and `DATA_DIR` parameters in [eval/nyuv2_depth/eval.sh](./eval/nyuv2_depth/eval.sh), then run:
+
 ```bash
 bash eval/nyuv2_depth/eval.sh
 ```

 ### ADE20k Semantic Segmentation
-1. **Cannot be verified with toy_datasets**;
-2. Set the `JOB_NAME` and `PROMPT` parameters in `$Painter_ROOT/eval/ade20k_semantic/eval.sh`, then run the following command:
+1. **This dataset cannot be verified with toy_datasets**;
+
+2. Set the `JOB_NAME` and `PROMPT` parameters in [eval/ade20k_semantic/eval.sh](./eval/ade20k_semantic/eval.sh), then run the following command:
+
 ```bash
 bash eval/ade20k_semantic/eval.sh
 ```

 ### COCO Panoptic Segmentation
-1. **Cannot be verified with toy_datasets**;
-2. Set the `JOB_NAME` and `PROMPT` parameters in `$Painter_ROOT/eval/coco_panoptic/eval.sh`, then run the following command:
+1. **This dataset cannot be verified with toy_datasets**;
+
+2. Set the `JOB_NAME` and `PROMPT` parameters in [eval/coco_panoptic/eval.sh](./eval/coco_panoptic/eval.sh), then run the following command:
 ```bash
 bash eval/coco_panoptic/eval.sh
 ```

 ### COCO Human Pose Estimation
-1. **Cannot be verified with toy_datasets**;
+1. **This dataset cannot be verified with toy_datasets**;
+
 2. Generate the images needed for validation:
+
 ```bash
 python -m torch.distributed.launch --nproc_per_node=4 --master_port=29500 --use_env eval/mmpose_custom/painter_inference_pose.py --ckpt_path models/painter_vit_large/painter_vit_large.pth
 python -m torch.distributed.launch --nproc_per_node=4 --master_port=29500 --use_env eval/mmpose_custom/painter_inference_pose.py --ckpt_path models/painter_vit_large/painter_vit_large.pth --flip_test
 ```

 3. Modify the `job_name`, `data_root`, `bbox_file`, and `ckpt_file` parameters in `$Painter_ROOT/eval/mmpose_custom/configs/coco_256x192_test_offline.py`, then run:
+
 ```bash
 cd $Painter_ROOT/eval/mmpose_custom
 ./tools/dist_test.sh configs/coco_256x192_test_offline.py none 1 --eval mAP
@@ -234,8 +241,9 @@ cd $Painter_ROOT/eval/mmpose_custom
 ### Low-Light Image Enhancement
 ```bash
 python eval/lol/painter_inference_lol.py --ckpt_path models/path/of/painter_vit_large.pth --data_dir path/of/datasets
-
+```
 Example:
+```
 python eval/lol/painter_inference_lol.py --ckpt_path models/painter_vit_large.pth --data_dir datasets
 ```


--- a/docker/Dockerfile
+++ b/docker/Dockerfile
-FROM image.sourcefind.cn:5000/dcu/admin/base/pytorch:1.13.1-centos7.6-dtk-23.04.1-py38-latest
\ No newline at end of file
+FROM image.sourcefind.cn:5000/dcu/admin/base/pytorch:2.1.0-centos7.6-dtk24.04-py310
\ No newline at end of file
--- a/eval/nyuv2_depth/eval.sh
+++ b/eval/nyuv2_depth/eval.sh
@@ -4,7 +4,7 @@ set -x

 JOB_NAME="painter_vit_large"
 CKPT_FILE="painter_vit_large.pth"
-PROMPT="study_room_0005b/rgb_00094"
+PROMPT="study_room_0005b/rgb_00092"

 MODEL="painter_vit_large_patch16_input896x448_win_dec64_8glb_sl1"
 CKPT_PATH="models/${JOB_NAME}/${CKPT_FILE}"

--- a/hostfile
+++ b/hostfile
+c1xxxxxx slots=8
\ No newline at end of file
--- a/icon.png
+++ b/icon.png
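
The README ties the total batch size to the hostfile layout (nodes x cards per node x 32, e.g. 8x8x32 = 2048). The following is a minimal sketch of that arithmetic, not part of the repository: `parse_hostfile` and `total_batch_size` are hypothetical helper names, the per-card batch size of 32 is read off the README's formula, and skipping `#` comment lines is an assumption about the hostfile format.

```python
from typing import Dict

# Per-card batch size implied by the README's total_bsz = nodes x cards x 32.
PER_CARD_BATCH = 32


def parse_hostfile(text: str) -> Dict[str, int]:
    """Parse lines of the form '<hostname> slots=<n>' into {hostname: n}.

    Blank lines and lines starting with '#' are skipped (assumed convention).
    """
    nodes: Dict[str, int] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        host, options = parts[0], parts[1:]
        slots = None
        for opt in options:
            key, _, value = opt.partition("=")
            if key == "slots":
                slots = int(value)
        if slots is None:
            raise ValueError(f"no slots=<n> entry in hostfile line: {line!r}")
        nodes[host] = slots
    return nodes


def total_batch_size(nodes: Dict[str, int], per_card: int = PER_CARD_BATCH) -> int:
    """Total batch size = (sum of cards over all nodes) x per-card batch size."""
    return sum(nodes.values()) * per_card


if __name__ == "__main__":
    # The single-line hostfile added in this commit.
    nodes = parse_hostfile("c1xxxxxx slots=8\n")
    print(nodes, total_batch_size(nodes))
```

With eight such lines (one per node), the same calculation gives the 2048 quoted in the README for the authors' setup.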