Commit f9f61fde authored by chenych's avatar chenych

update readme and eval

parent 106580f9
__pycache__ __pycache__
datasets/
toy_datasets/
models models
models_inference models_inference
work_dirs/ work_dirs/
wandb wandb
datasets
.idea .idea
.nfs* .nfs*
*.pth *.pth
log.txt log.txt
......
...@@ -4,48 +4,63 @@ ...@@ -4,48 +4,63 @@
## 模型结构 ## 模型结构
<div align=center> <div align=center>
<img src="./doc/method.jpg"/> <img src="./doc/method.jpg"/>
</div> </div>
## 算法原理 ## 算法原理
将视觉任务的连续输出空间离散化,并使用语言或专门设计的离散标记作为任务提示,将视觉问题转化为 NLP 问题.
<div align=center> <div align=center>
<img src="./doc/progress.png"/> <img src="./doc/progress.png"/>
</div> </div>
## 环境配置 ## 环境配置
Tips: timm==0.3.2 版本存在 [cannot import name 'container_abcs' from 'torch._six'](https://github.com/huggingface/pytorch-image-models/issues/420#issuecomment-776459842) 问题, 需要将 timm/models/layers/helpers.py 中的 `from torch._six import container_abcs` 修改为:
```python
import torch
TORCH_MAJOR = int(torch.__version__.split('.')[0])
TORCH_MINOR = int(torch.__version__.split('.')[1])
if TORCH_MAJOR == 1 and TORCH_MINOR < 8:
    from torch._six import container_abcs
else:
    import collections.abc as container_abcs
```
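可以用下面的命令定位需要修改的 helpers.py 文件 (输出路径取决于 timm 的安装位置, 仅供参考):
```bash
# 打印当前环境中 timm 的 helpers.py 路径
python -c "import os, timm; print(os.path.join(os.path.dirname(timm.__file__), 'models', 'layers', 'helpers.py'))"
```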
### Docker(方法一) ### Docker(方法一)
-v 路径、docker_name和imageID根据实际情况修改 -v 路径、docker_name和imageID根据实际情况修改
```image.sourcefind.cn:5000/dcu/admin/base/pytorch:1.13.1-centos7.6-dtk-23.04.1-py38-latest ```bash
docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:1.13.1-centos7.6-dtk-23.04.1-py38-latest docker pull image.sourcefind.cn:5000/dcu/admin/base/pytorch:1.13.1-centos7.6-dtk-23.04.1-py38-latest
docker run -it -v /path/your_code_data/:/path/ your_code_data/ --shm-size=32G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name docker_name imageID bash docker run -it -v /path/your_code_data/:/path/your_code_data/ --shm-size=32G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name docker_name imageID bash
cd /your_code_path/maskeddenoising_pytorch cd /your_code_path/painter_pytorch
pip install --upgrade setuptools wheel pip install --upgrade setuptools wheel
pip install -r requirement.txt pip install -r requirements.txt
# 安装detectron2
git clone https://github.com/facebookresearch/detectron2
python -m pip install -e detectron2
``` ```
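进入容器后, 可用以下命令快速确认 PyTorch 环境与 DCU 设备是否可用 (仅供参考):
```bash
# 预期打印 torch 版本号以及 True
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```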
### Dockerfile(方法二) ### Dockerfile(方法二)
-v 路径、docker_name和imageID根据实际情况修改 -v 路径、docker_name和imageID根据实际情况修改
``` ```bash
cd ./docker cd ./docker
cp ../requirement.txt requirement.txt cp ../requirements.txt requirements.txt
docker build --no-cache -t maskeddenoising:latest . docker build --no-cache -t painter:latest .
docker run -it -v /path/your_code_data/:/path/your_code_data/ --shm-size=32G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name docker_name imageID bash docker run -it -v /path/your_code_data/:/path/your_code_data/ --shm-size=32G --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name docker_name imageID bash
cd /your_code_path/maskeddenoising_pytorch cd /your_code_path/painter_pytorch
pip install --upgrade setuptools wheel pip install --upgrade setuptools wheel
pip install -r requirement.txt pip install -r requirements.txt
# 安装detectron2
git clone https://github.com/facebookresearch/detectron2
python -m pip install -e detectron2
``` ```
### Anaconda(方法三) ### Anaconda(方法三)
...@@ -61,22 +76,21 @@ torchvision:0.14.1 ...@@ -61,22 +76,21 @@ torchvision:0.14.1
Tips:以上dtk软件栈、python、torch等DCU相关工具版本需要严格一一对应 Tips:以上dtk软件栈、python、torch等DCU相关工具版本需要严格一一对应
2、其他非特殊库直接按照requirement.txt安装 2、其他非特殊库直接按照requirements.txt安装
```bash ```bash
pip install --upgrade setuptools wheel pip install --upgrade setuptools wheel
pip install -r requirement.txt pip install -r requirements.txt
# 安装detectron2
git clone https://github.com/facebookresearch/detectron2
python -m pip install -e detectron2
``` ```
## 数据集 ## 数据集
本项目所需数据集较多, 所以提供了一个少量数据集 toy_datasets 进行项目功能验证, 只需要将 Painter_pytorch/train_painter_vit_large.sh 脚本中的 DATA_PATH 参数设置成 toy_datasets 即可, 其他参数请参考训练章节的介绍.
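例如, 在 train_painter_vit_large.sh 中做如下修改即可使用 toy_datasets (变量名以脚本实际内容为准):
```bash
# 将数据集根目录指向项目自带的少量数据集
DATA_PATH="toy_datasets"
```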
如需完整数据集, 请参考以下步骤:
### 数据集所需环境配置 ### 数据集所需环境配置
#### ADE20K Semantic Segmentation
```bash
git clone https://github.com/facebookresearch/detectron2
python -m pip install -e detectron2
```
#### COCO Panoptic Segmentation #### COCO Panoptic Segmentation
...@@ -84,7 +98,6 @@ python -m pip install -e detectron2 ...@@ -84,7 +98,6 @@ python -m pip install -e detectron2
pip install openmim #(0.3.9) pip install openmim #(0.3.9)
mim install mmcv-full # 注意版本是不是1.7.1 mim install mmcv-full # 注意版本是不是1.7.1
pip install mmdet==2.26.0 # 对应 mmcv-1.7.1 pip install mmdet==2.26.0 # 对应 mmcv-1.7.1
pip install yapf==0.40.1
``` ```
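安装完成后, 可用以下命令确认 mmcv-full 与 mmdet 版本是否匹配 (预期分别为 1.7.1 与 2.26.0):
```bash
python -c "import mmcv, mmdet; print(mmcv.__version__, mmdet.__version__)"
```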
#### COCO Pose Estimation #### COCO Pose Estimation
...@@ -287,9 +300,7 @@ python data/sidd/gen_json_sidd.py --split train ...@@ -287,9 +300,7 @@ python data/sidd/gen_json_sidd.py --split train
python data/sidd/gen_json_sidd.py --split val python data/sidd/gen_json_sidd.py --split val
``` ```
### Low-Light Image Enhancement ### Low-Light Image Enhancement
首先, 下载 LOL 数据集 [google drive](https://drive.google.com/file/d/157bjO1_cFuSd0HWDUuAmcHRJDVyWpOxB/view), 将下载的数据集存放到 `$Painter_ROOT/datasets/light_enhance/` 路径下. 完成后的 LOL 文件结构如下所示: 首先, 下载 LOL 数据集 [google drive](https://drive.google.com/file/d/157bjO1_cFuSd0HWDUuAmcHRJDVyWpOxB/view), 将下载的数据集存放到 `$Painter_ROOT/datasets/light_enhance/` 路径下. 完成后的 LOL 文件结构如下所示:
``` ```
...@@ -302,7 +313,7 @@ light_enhance/ ...@@ -302,7 +313,7 @@ light_enhance/
high/ high/
``` ```
Next, prepare json files for training and evaluation. The generated json files will be saved at `$Painter_ROOT/datasets/light_enhance/`. 准备训练和验证所需json文件, 生成的json文件将保存在 `$Painter_ROOT/datasets/light_enhance/` 路径下.
``` ```
python data/lol/gen_json_lol.py --split train python data/lol/gen_json_lol.py --split train
python data/lol/gen_json_lol.py --split val python data/lol/gen_json_lol.py --split val
...@@ -315,15 +326,15 @@ python data/lol/gen_json_lol.py --split val ...@@ -315,15 +326,15 @@ python data/lol/gen_json_lol.py --split val
│ ├── sync/ │ ├── sync/
│ ├── official_splits/ │ ├── official_splits/
│ ├── nyu_depth_v2_labeled.mat │ ├── nyu_depth_v2_labeled.mat
│ ├── nyuv2_sync_image_depth.json # generated │ ├── nyuv2_sync_image_depth.json # 生成
│ ├── nyuv2_test_image_depth.json # generated │ ├── nyuv2_test_image_depth.json # 生成
├── ade20k/ ├── ade20k/
│ ├── images/ │ ├── images/
│ ├── annotations/ │ ├── annotations/
│ ├── annotations_detectron2/ # generated │ ├── annotations_detectron2/ # 生成
│ ├── annotations_with_color/ # generated │ ├── annotations_with_color/ # 生成
│ ├── ade20k_training_image_semantic.json # generated │ ├── ade20k_training_image_semantic.json # 生成
│ ├── ade20k_validation_image_semantic.json # generated │ ├── ade20k_validation_image_semantic.json # 生成
├── ADEChallengeData2016/ # sym-link to $Painter_ROOT/datasets/ade20k ├── ADEChallengeData2016/ # sym-link to $Painter_ROOT/datasets/ade20k
├── coco/ ├── coco/
│ ├── train2017/ │ ├── train2017/
...@@ -336,14 +347,14 @@ python data/lol/gen_json_lol.py --split val ...@@ -336,14 +347,14 @@ python data/lol/gen_json_lol.py --split val
│ ├── panoptic_val2017.json │ ├── panoptic_val2017.json
│ ├── panoptic_train2017/ │ ├── panoptic_train2017/
│ ├── panoptic_val2017/ │ ├── panoptic_val2017/
│ ├── panoptic_semseg_val2017/ # generated │ ├── panoptic_semseg_val2017/ # 生成
│ ├── panoptic_val2017/ # sym-link to $Painter_ROOT/datasets/coco/annotations/panoptic_val2017 │ ├── panoptic_val2017/ # sym-link to $Painter_ROOT/datasets/coco/annotations/panoptic_val2017
│ ├── pano_sem_seg/ # generated │ ├── pano_sem_seg/ # 生成
│ ├── panoptic_segm_train2017_with_color │ ├── panoptic_segm_train2017_with_color
│ ├── panoptic_segm_val2017_with_color │ ├── panoptic_segm_val2017_with_color
│ ├── coco_train2017_image_panoptic_sem_seg.json │ ├── coco_train2017_image_panoptic_sem_seg.json
│ ├── coco_val2017_image_panoptic_sem_seg.json │ ├── coco_val2017_image_panoptic_sem_seg.json
│ ├── pano_ca_inst/ # generated │ ├── pano_ca_inst/ # 生成
│ ├── train_aug0/ │ ├── train_aug0/
│ ├── train_aug1/ │ ├── train_aug1/
│ ├── ... │ ├── ...
...@@ -356,7 +367,7 @@ python data/lol/gen_json_lol.py --split val ...@@ -356,7 +367,7 @@ python data/lol/gen_json_lol.py --split val
├── coco_pose/ ├── coco_pose/
│ ├── person_detection_results/ │ ├── person_detection_results/
│ ├── COCO_val2017_detections_AP_H_56_person.json │ ├── COCO_val2017_detections_AP_H_56_person.json
│ ├── data_pair/ # generated │ ├── data_pair/ # 生成
│ ├── train_256x192_aug0/ │ ├── train_256x192_aug0/
│ ├── train_256x192_aug1/ │ ├── train_256x192_aug1/
│ ├── ... │ ├── ...
...@@ -364,8 +375,8 @@ python data/lol/gen_json_lol.py --split val ...@@ -364,8 +375,8 @@ python data/lol/gen_json_lol.py --split val
│ ├── val_256x192/ │ ├── val_256x192/
│ ├── test_256x192/ │ ├── test_256x192/
│ ├── test_256x192_flip/ │ ├── test_256x192_flip/
│ ├── coco_pose_256x192_train.json # generated │ ├── coco_pose_256x192_train.json # 生成
│ ├── coco_pose_256x192_val.json # generated │ ├── coco_pose_256x192_val.json # 生成
├── derain/ ├── derain/
│ ├── train/ │ ├── train/
│ ├── input/ │ ├── input/
...@@ -382,8 +393,8 @@ python data/lol/gen_json_lol.py --split val ...@@ -382,8 +393,8 @@ python data/lol/gen_json_lol.py --split val
│ ├── SIDD_Medium_Srgb/ │ ├── SIDD_Medium_Srgb/
│ ├── train/ │ ├── train/
│ ├── val/ │ ├── val/
│ ├── denoise_ssid_train.json # generated │ ├── denoise_ssid_train.json # 生成
│ ├── denoise_ssid_val.json # generated │ ├── denoise_ssid_val.json # 生成
├── light_enhance/ ├── light_enhance/
│ ├── our485/ │ ├── our485/
│ ├── low/ │ ├── low/
...@@ -391,102 +402,71 @@ python data/lol/gen_json_lol.py --split val ...@@ -391,102 +402,71 @@ python data/lol/gen_json_lol.py --split val
│ ├── eval15/ │ ├── eval15/
│ ├── low/ │ ├── low/
│ ├── high/ │ ├── high/
│ ├── enhance_lol_train.json # generated │ ├── enhance_lol_train.json # 生成
│ ├── enhance_lol_val.json # generated │ ├── enhance_lol_val.json # 生成
``` ```
## 训练 ## 训练
下载预训练模型 [MAE ViT-Large model ](https://dl.fbaipublicfiles.com/mae/pretrain/mae_pretrain_vit_large.pth), 修改`$Painter_ROOT/train_painter_vit_large.sh`中finetune参数地址. 下载预训练模型 [MAE ViT-Large model ](https://dl.fbaipublicfiles.com/mae/pretrain/mae_pretrain_vit_large.pth), 修改 `$Painter_ROOT/train_painter_vit_large.sh` 中finetune参数地址.
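例如 (下载目录与参数写法仅为示意, 请以脚本实际内容为准):
```bash
# 下载 MAE ViT-Large 预训练权重到 models/ 目录
wget https://dl.fbaipublicfiles.com/mae/pretrain/mae_pretrain_vit_large.pth -P models/
# 在 train_painter_vit_large.sh 中将 finetune 参数指向该文件, 例如:
#   --finetune models/mae_pretrain_vit_large.pth
```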
### 单机多卡
#### 普通训练
``` ### 单机多卡
bash train_painter_vit_large.sh 本项目默认参数是单机4卡 (total_bsz = 1x4x32 = 128), 如需使用其他的卡数, 请修改 train_painter_vit_large.sh 中对应参数.
```bash
bash train.sh
``` ```
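如需调整卡数或 batch size, 可参考以下示意 (变量名为假设, 请以 train.sh / train_painter_vit_large.sh 实际内容为准):
```bash
# NUM_GPUS=4       # 每节点卡数
# BATCH_SIZE=32    # 每卡 batch size, total_bsz = 节点数 x NUM_GPUS x BATCH_SIZE
```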
#### 分布式训练 ### 多机多卡
``` Tips: 作者使用 8 个节点, 每个节点 8 张卡 (total_bsz = 8x8x32 = 2048) 进行训练
bash train_multi.sh ```bash
bash train_painter_vit_large.sh
``` ```
## 推理 ## 推理
下载推理模型[🤗 Hugging Face Models](https://huggingface.co/BAAI/Painter/blob/main/painter_vit_large.pth). The results on various tasks are summarized below: 下载推理模型[🤗 Hugging Face Models](https://huggingface.co/BAAI/Painter/blob/main/painter_vit_large.pth), 或者准备好自己的待测试模型, 各个任务的推理方法如下:
## NYU Depth V2 ### NYU Depth V2
To evaluate Painter on NYU Depth V2, you may first update the `$JOB_NAME` in `$Painter_ROOT/eval/nyuv2_depth/eval.sh`, then run: 首先设置 `$Painter_ROOT/eval/nyuv2_depth/eval.sh` 文件里的 `JOB_NAME` 和 `DATA_DIR` 参数, 然后执行下面的命令:
```bash ```bash
bash eval/nyuv2_depth/eval.sh bash eval/nyuv2_depth/eval.sh
``` ```
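例如, eval/nyuv2_depth/eval.sh 中需确认的变量如下 (取值仅为示意, 以实际目录为准):
```bash
JOB_NAME="painter_vit_large"   # 权重所在目录, 对应 models/${JOB_NAME}/
DATA_DIR="datasets"            # 数据集根目录, 使用少量数据集时可设为 toy_datasets
```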
## ADE20k Semantic Segmentation ### ADE20k Semantic Segmentation
To evaluate Painter on ADE20k semantic segmentation, you may first update the `$JOB_NAME` in `$Painter_ROOT/eval/ade20k_semantic/eval.sh`, then run: 首先设置 `$Painter_ROOT/eval/ade20k_semantic/eval.sh` 文件里的 `JOB_NAME` 参数, 然后执行下面的命令:
```bash ```bash
bash eval/ade20k_semantic/eval.sh bash eval/ade20k_semantic/eval.sh
``` ```
## COCO Panoptic Segmentation ### COCO Panoptic Segmentation
To evaluate Painter on COCO panoptic segmentation, you may first update the `$JOB_NAME` in `$Painter_ROOT/eval/coco_panoptic/eval.sh`, then run: 首先设置 `$Painter_ROOT/eval/coco_panoptic/eval.sh` 文件里的 `JOB_NAME` 参数, 然后执行下面的命令:
```bash ```bash
bash eval/coco_panoptic/eval.sh bash eval/coco_panoptic/eval.sh
``` ```
### COCO Human Pose Estimation
## COCO Human Pose Estimation 为了评估Painter对COCO姿态的估计, 首先生成验证所需的图像:
为了评估Painter对COCO姿态的估计, 首先生成绘制的图像:
```bash ```bash
python -m torch.distributed.launch --nproc_per_node=8 --master_port=29500 --use_env eval/mmpose_custom/painter_inference_pose.py --ckpt_path models/painter_vit_large/painter_vit_large.pth python -m torch.distributed.launch --nproc_per_node=8 --master_port=29500 --use_env eval/mmpose_custom/painter_inference_pose.py --ckpt_path models/painter_vit_large/painter_vit_large.pth
python -m torch.distributed.launch --nproc_per_node=8 --master_port=29500 --use_env eval/mmpose_custom/painter_inference_pose.py --ckpt_path models/painter_vit_large/painter_vit_large.pth --flip_test python -m torch.distributed.launch --nproc_per_node=8 --master_port=29500 --use_env eval/mmpose_custom/painter_inference_pose.py --ckpt_path models/painter_vit_large/painter_vit_large.pth --flip_test
``` ```
Then, you may update the `job_name` and `ckpt_file` in `$Painter_ROOT/eval/mmpose_custom/configs/coco_256x192_test_offline.py`, and run: 接下来, 修改 `$Painter_ROOT/eval/mmpose_custom/configs/coco_256x192_test_offline.py` 文件中的 `job_name`、`data_root`、`bbox_file`、`ckpt_file` 参数, 执行:
```bash ```bash
cd $Painter_ROOT/eval/mmpose_custom cd $Painter_ROOT/eval/mmpose_custom
./tools/dist_test.sh configs/coco_256x192_test_offline.py none 1 --eval mAP ./tools/dist_test.sh configs/coco_256x192_test_offline.py none 1 --eval mAP
``` ```
## Low-level Vision Tasks
### Deraining
To evaluate Painter on deraining, first generate the derained images.
```bash
python eval/derain/painter_inference_derain.py --ckpt_path models/painter_vit_large/painter_vit_large.pth
```
Then, update the path to derained images and ground truth in `$Painter_ROOT/eval/derain/evaluate_PSNR_SSIM.m` and run the following script in MATLAB.
```bash
$Painter_ROOT/eval/derain/evaluate_PSNR_SSIM.m
```
### Denoising
To evaluate Painter on SIDD denoising, first generate the denoised images.
```bash
python eval/sidd/painter_inference_sidd.py --ckpt_path models/painter_vit_large/painter_vit_large.pth
```
Then, update the path to denoising output and ground truth in `$Painter_ROOT/eval/sidd/eval_sidd.m` and run the following script in MATLAB.
```bash
$Painter_ROOT/eval/sidd/eval_sidd.m
```
### Low-Light Image Enhancement ### Low-Light Image Enhancement
To evaluate Painter on LoL image enhancement: To evaluate Painter on LoL image enhancement:
```bash ```bash
python eval/lol/painter_inference_lol.py --ckpt_path models/painter_vit_large/painter_vit_large.pth python eval/lol/painter_inference_lol.py --ckpt_path models/path/of/painter_vit_large.pth --data_dir path/of/datasets
```
#### 单卡推理 Example:
python eval/lol/painter_inference_lol.py --ckpt_path models/painter_vit_large.pth --data_dir toy_datasets
```
bash test.sh
``` ```
## result ## result
...@@ -505,21 +485,21 @@ bash test.sh ...@@ -505,21 +485,21 @@ bash test.sh
基于项目提供的测试数据, 得到单卡测试结果如下: 基于项目提供的测试数据, 得到单卡测试结果如下:
| | PSNR | SSIM | LPIPS | | | xxx | xxx | xxx |
| :------: | :------: | :------: | :------: | | :------: | :------: | :------: | :------: |
| ours | 29.04 | 0.7615 | 0.1294 | | ours | xxxx | xxxx | xxxx |
| paper | 30.13 | 0.7981 | 0.1031 | | paper | xxxx | xxxx | xxxx |
## 应用场景 ## 应用场景
### 算法类别 ### 算法类别
图像降噪 语义分割、深度估计、实例分割、关键点检测、图像去噪、图像去雨、图像增强
### 热点应用行业 ### 热点应用行业
交通,公安,制造
## 源码仓库及问题反馈 ## 源码仓库及问题反馈
http://developer.hpccube.com/codes/modelzoo/maskeddenoising_pytorch.git http://developer.hpccube.com/codes/modelzoo/painter_pytorch.git
## 参考资料 ## 参考资料
https://github.com/haoyuc/MaskedDenoising.git https://github.com/baaivision/Painter/tree/main/Painter
FROM image.sourcefind.cn:5000/dcu/admin/base/pytorch:1.13.1-centos7.6-dtk-23.04-py38-latest FROM image.sourcefind.cn:5000/dcu/admin/base/pytorch:1.13.1-centos7.6-dtk-23.04.1-py38-latest
RUN source /opt/dtk/env.sh RUN source /opt/dtk/env.sh
COPY requirements.txt requirements.txt
RUN pip3 install -r requirements.txt RUN pip3 install -r requirements.txt
...@@ -12,11 +12,13 @@ MODEL="painter_vit_large_patch16_input896x448_win_dec64_8glb_sl1" ...@@ -12,11 +12,13 @@ MODEL="painter_vit_large_patch16_input896x448_win_dec64_8glb_sl1"
CKPT_PATH="models/${JOB_NAME}/${CKPT_FILE}" CKPT_PATH="models/${JOB_NAME}/${CKPT_FILE}"
DST_DIR="models_inference/${JOB_NAME}/ade20k_semseg_inference_${CKPT_FILE}_${PROMPT}_size${SIZE}" DST_DIR="models_inference/${JOB_NAME}/ade20k_semseg_inference_${CKPT_FILE}_${PROMPT}_size${SIZE}"
DATA_DIR="datasets"
# inference # inference
python -m torch.distributed.launch --nproc_per_node=${NUM_GPUS} --master_port=29504 --use_env \ python -m torch.distributed.launch --nproc_per_node=${NUM_GPUS} --master_port=29504 --use_env \
eval/ade20k_semantic/painter_inference_segm.py \ eval/ade20k_semantic/painter_inference_segm.py \
--model ${MODEL} --prompt ${PROMPT} \ --model ${MODEL} --prompt ${PROMPT} \
--data_dir ${DATA_DIR} \
--ckpt_path ${CKPT_PATH} --input_size ${SIZE} --ckpt_path ${CKPT_PATH} --input_size ${SIZE}
# postprocessing and eval # postprocessing and eval
......
...@@ -43,7 +43,7 @@ def get_args_parser(): ...@@ -43,7 +43,7 @@ def get_args_parser():
parser.add_argument('--prompt', type=str, help='prompt image in train set', parser.add_argument('--prompt', type=str, help='prompt image in train set',
default='ADE_train_00014165') default='ADE_train_00014165')
parser.add_argument('--input_size', type=int, default=448) parser.add_argument('--input_size', type=int, default=448)
parser.add_argument('--data_dir', type=str, default="datasets")
# distributed training parameters # distributed training parameters
parser.add_argument('--world_size', default=1, type=int, parser.add_argument('--world_size', default=1, type=int,
help='number of distributed processes') help='number of distributed processes')
...@@ -94,7 +94,7 @@ def run_one_image(img, tgt, size, model, out_path, device): ...@@ -94,7 +94,7 @@ def run_one_image(img, tgt, size, model, out_path, device):
if __name__ == '__main__': if __name__ == '__main__':
dataset_dir = "datasets/"
args = get_args_parser() args = get_args_parser()
args = ddp_utils.init_distributed_mode(args) args = ddp_utils.init_distributed_mode(args)
device = torch.device("cuda") device = torch.device("cuda")
...@@ -120,15 +120,15 @@ if __name__ == '__main__': ...@@ -120,15 +120,15 @@ if __name__ == '__main__':
device = torch.device("cuda") device = torch.device("cuda")
model_painter.to(device) model_painter.to(device)
img_src_dir = dataset_dir + "ade20k/images/validation" img_src_dir = "{}/ade20k/images/validation".format(args.data_dir)
# img_path_list = glob.glob(os.path.join(img_src_dir, "*.jpg")) # img_path_list = glob.glob(os.path.join(img_src_dir, "*.jpg"))
dataset_val = DatasetTest(img_src_dir, input_size, ext_list=('*.jpg',)) dataset_val = DatasetTest(img_src_dir, input_size, ext_list=('*.jpg',))
sampler_val = DistributedSampler(dataset_val, shuffle=False) sampler_val = DistributedSampler(dataset_val, shuffle=False)
data_loader_val = DataLoader(dataset_val, batch_size=1, sampler=sampler_val, data_loader_val = DataLoader(dataset_val, batch_size=1, sampler=sampler_val,
drop_last=False, collate_fn=ddp_utils.collate_fn, num_workers=2) drop_last=False, collate_fn=ddp_utils.collate_fn, num_workers=2)
img2_path = dataset_dir + "ade20k/images/training/{}.jpg".format(prompt) img2_path = "{}/ade20k/images/training/{}.jpg".format(args.data_dir, prompt)
tgt2_path = dataset_dir + "ade20k/annotations_with_color/training/{}.png".format(prompt) tgt2_path = "{}/ade20k/annotations_with_color/training/{}.png".format(args.data_dir, prompt)
# load the shared prompt image pair # load the shared prompt image pair
img2 = Image.open(img2_path).convert("RGB") img2 = Image.open(img2_path).convert("RGB")
......
...@@ -36,6 +36,7 @@ def get_args_parser(): ...@@ -36,6 +36,7 @@ def get_args_parser():
parser.add_argument('--dist_thr', type=float, help='dir to ckpt', parser.add_argument('--dist_thr', type=float, help='dir to ckpt',
default=19.) default=19.)
parser.add_argument('--num_windows', type=int, default=4) parser.add_argument('--num_windows', type=int, default=4)
parser.add_argument('--data_dir', type=str, default="datasets")
return parser.parse_args() return parser.parse_args()
...@@ -357,7 +358,7 @@ class COCOEvaluatorCustom(COCOEvaluator): ...@@ -357,7 +358,7 @@ class COCOEvaluatorCustom(COCOEvaluator):
if __name__ == '__main__': if __name__ == '__main__':
args = get_args_parser() args = get_args_parser()
dataset_name = 'coco_2017_val' dataset_name = 'coco_2017_val'
coco_annotation = "datasets/coco/annotations/instances_val2017.json" coco_annotation = "{}/coco/annotations/instances_val2017.json".format(args.data_dir)
pred_dir = args.pred_dir pred_dir = args.pred_dir
output_folder = os.path.join(pred_dir, 'eval_{}'.format(dataset_name)) output_folder = os.path.join(pred_dir, 'eval_{}'.format(dataset_name))
......
...@@ -210,6 +210,7 @@ def get_args_parser(): ...@@ -210,6 +210,7 @@ def get_args_parser():
default="models_inference/new3_all_lr5e-4/") default="models_inference/new3_all_lr5e-4/")
parser.add_argument('--ckpt_file', type=str, default="") parser.add_argument('--ckpt_file', type=str, default="")
parser.add_argument('--input_size', type=int, default=448) parser.add_argument('--input_size', type=int, default=448)
parser.add_argument('--data_dir', type=str, default="datasets")
return parser.parse_args() return parser.parse_args()
...@@ -222,7 +223,7 @@ if __name__ == "__main__": ...@@ -222,7 +223,7 @@ if __name__ == "__main__":
ckpt_file, args.prompt, args.input_size)) ckpt_file, args.prompt, args.input_size))
pred_dir_semseg = os.path.join(work_dir, "pano_semseg_inference_{}_{}_size{}".format( pred_dir_semseg = os.path.join(work_dir, "pano_semseg_inference_{}_{}_size{}".format(
ckpt_file, args.prompt, args.input_size)) ckpt_file, args.prompt, args.input_size))
gt_file = "datasets/coco/annotations/instances_val2017.json" gt_file = "{}/coco/annotations/instances_val2017.json".format(args.data_dir)
print(pred_dir_inst) print(pred_dir_inst)
print(pred_dir_semseg) print(pred_dir_semseg)
......
...@@ -152,6 +152,7 @@ class COCOPanopticEvaluatorCustom(COCOPanopticEvaluator): ...@@ -152,6 +152,7 @@ class COCOPanopticEvaluatorCustom(COCOPanopticEvaluator):
overlap_threshold = None, overlap_threshold = None,
stuff_area_thresh = None, stuff_area_thresh = None,
instances_score_thresh = None, instances_score_thresh = None,
data_dir='datasets'
): ):
""" """
Args: Args:
...@@ -166,7 +167,7 @@ class COCOPanopticEvaluatorCustom(COCOPanopticEvaluator): ...@@ -166,7 +167,7 @@ class COCOPanopticEvaluatorCustom(COCOPanopticEvaluator):
self.instance_seg_result_path = instance_seg_result_path self.instance_seg_result_path = instance_seg_result_path
self.cocoDt = None self.cocoDt = None
if self.instance_seg_result_path is not None: if self.instance_seg_result_path is not None:
gt_file = "datasets/coco/annotations/instances_val2017.json" gt_file = "{}/coco/annotations/instances_val2017.json".format(data_dir)
cocoGt = COCO(annotation_file=gt_file) cocoGt = COCO(annotation_file=gt_file)
inst_result_file = os.path.join(instance_seg_result_path, "coco_instances_results.json") inst_result_file = os.path.join(instance_seg_result_path, "coco_instances_results.json")
print("loading pre-computed instance seg from \n{}".format(inst_result_file)) print("loading pre-computed instance seg from \n{}".format(inst_result_file))
...@@ -177,6 +178,7 @@ class COCOPanopticEvaluatorCustom(COCOPanopticEvaluator): ...@@ -177,6 +178,7 @@ class COCOPanopticEvaluatorCustom(COCOPanopticEvaluator):
self.stuff_area_thresh = stuff_area_thresh self.stuff_area_thresh = stuff_area_thresh
self.instances_score_thresh = instances_score_thresh self.instances_score_thresh = instances_score_thresh
def process(self, inputs, outputs): def process(self, inputs, outputs):
from panopticapi.utils import id2rgb from panopticapi.utils import id2rgb
...@@ -294,6 +296,7 @@ def get_args_parser_pano_seg(): ...@@ -294,6 +296,7 @@ def get_args_parser_pano_seg():
parser.add_argument('--work_dir', type=str, help='color type', parser.add_argument('--work_dir', type=str, help='color type',
default="") default="")
parser.add_argument('--input_size', type=int, default=448) parser.add_argument('--input_size', type=int, default=448)
parser.add_argument('--data_dir', type=str, default="datasets")
return parser.parse_args() return parser.parse_args()
...@@ -313,7 +316,7 @@ if __name__ == "__main__": ...@@ -313,7 +316,7 @@ if __name__ == "__main__":
"instance_segm_post_merge_{}_{}".format(ckpt_file, args.prompt), "instance_segm_post_merge_{}_{}".format(ckpt_file, args.prompt),
"dist{}_{}nms_iou{}".format(args.dist_thr, args.nms_type, args.nms_iou), "dist{}_{}nms_iou{}".format(args.dist_thr, args.nms_type, args.nms_iou),
) )
gt_file = "datasets/coco/annotations/instances_val2017.json" gt_file = "{}/coco/annotations/instances_val2017.json".format(args.data_dir)
print(pred_dir_inst) print(pred_dir_inst)
print(pred_dir_semseg) print(pred_dir_semseg)
...@@ -361,6 +364,7 @@ if __name__ == "__main__": ...@@ -361,6 +364,7 @@ if __name__ == "__main__":
overlap_threshold=args.overlap_threshold, overlap_threshold=args.overlap_threshold,
stuff_area_thresh=args.stuff_area_thresh, stuff_area_thresh=args.stuff_area_thresh,
instances_score_thresh=args.instances_score_thresh, instances_score_thresh=args.instances_score_thresh,
data_dir=args.data_dir
) )
inputs = [] inputs = []
......
...@@ -14,25 +14,30 @@ CKPT_PATH="models/${JOB_NAME}/${CKPT_FILE}" ...@@ -14,25 +14,30 @@ CKPT_PATH="models/${JOB_NAME}/${CKPT_FILE}"
MODEL="painter_vit_large_patch16_input896x448_win_dec64_8glb_sl1" MODEL="painter_vit_large_patch16_input896x448_win_dec64_8glb_sl1"
WORK_DIR="models_inference/${JOB_NAME}" WORK_DIR="models_inference/${JOB_NAME}"
DATA_DIR="datasets"
# inference # inference
python -m torch.distributed.launch --nproc_per_node=$NUM_GPUS --master_port=29504 --use_env \ python -m torch.distributed.launch --nproc_per_node=$NUM_GPUS --master_port=29504 --use_env \
eval/coco_panoptic/painter_inference_pano_semseg.py \ eval/coco_panoptic/painter_inference_pano_semseg.py \
--ckpt_path ${CKPT_PATH} --model ${MODEL} --prompt ${PROMPT} \ --ckpt_path ${CKPT_PATH} --model ${MODEL} --prompt ${PROMPT} \
--input_size ${SIZE} --data_dir ${DATA_DIR} \
--input_size ${SIZE}
python -m torch.distributed.launch --nproc_per_node=$NUM_GPUS --master_port=29504 --use_env \ python -m torch.distributed.launch --nproc_per_node=$NUM_GPUS --master_port=29504 --use_env \
eval/coco_panoptic/painter_inference_pano_inst.py \ eval/coco_panoptic/painter_inference_pano_inst.py \
--ckpt_path ${CKPT_PATH} --model ${MODEL} --prompt ${PROMPT} \ --ckpt_path ${CKPT_PATH} --model ${MODEL} --prompt ${PROMPT} \
--input_size ${SIZE} --data_dir ${DATA_DIR} \
--input_size ${SIZE}
# postprocessing and eval # postprocessing and eval
python \ python \
eval/coco_panoptic/COCOInstSegEvaluatorCustom.py \ eval/coco_panoptic/COCOInstSegEvaluatorCustom.py \
--work_dir ${WORK_DIR} --ckpt_file ${CKPT_FILE} \ --work_dir ${WORK_DIR} --ckpt_file ${CKPT_FILE} \
--data_dir ${DATA_DIR} \
--dist_thr ${DIST_THR} --prompt ${PROMPT} --input_size ${SIZE} --dist_thr ${DIST_THR} --prompt ${PROMPT} --input_size ${SIZE}
python \ python \
eval/coco_panoptic/COCOPanoEvaluatorCustom.py \ eval/coco_panoptic/COCOPanoEvaluatorCustom.py \
--work_dir ${WORK_DIR} --ckpt_file ${CKPT_FILE} \ --work_dir ${WORK_DIR} --ckpt_file ${CKPT_FILE} \
--data_dir ${DATA_DIR} \
--dist_thr ${DIST_THR} --prompt ${PROMPT} --input_size ${SIZE} --dist_thr ${DIST_THR} --prompt ${PROMPT} --input_size ${SIZE}
...@@ -43,7 +43,7 @@ def get_args_parser(): ...@@ -43,7 +43,7 @@ def get_args_parser():
parser.add_argument('--prompt', type=str, help='prompt image in train set', parser.add_argument('--prompt', type=str, help='prompt image in train set',
default='000000466730') default='000000466730')
parser.add_argument('--input_size', type=int, default=448) parser.add_argument('--input_size', type=int, default=448)
parser.add_argument('--data_dir', type=str, default="datasets")
# distributed training parameters # distributed training parameters
parser.add_argument('--world_size', default=1, type=int, parser.add_argument('--world_size', default=1, type=int,
help='number of distributed processes') help='number of distributed processes')
...@@ -94,9 +94,9 @@ def run_one_image(img, tgt, size, model, out_path, device): ...@@ -94,9 +94,9 @@ def run_one_image(img, tgt, size, model, out_path, device):
if __name__ == '__main__': if __name__ == '__main__':
dataset_dir = "datasets/"
args = get_args_parser() args = get_args_parser()
args = ddp_utils.init_distributed_mode(args) args = ddp_utils.init_distributed_mode(args)
device = torch.device("cuda") device = torch.device("cuda")
ckpt_path = args.ckpt_path ckpt_path = args.ckpt_path
...@@ -117,15 +117,15 @@ if __name__ == '__main__': ...@@ -117,15 +117,15 @@ if __name__ == '__main__':
model_painter = prepare_model(ckpt_path, model, args=args) model_painter = prepare_model(ckpt_path, model, args=args)
print('Model loaded.') print('Model loaded.')
img_src_dir = dataset_dir + "coco/val2017" img_src_dir = "{}/coco/val2017".format(args.data_dir)
# img_path_list = glob.glob(os.path.join(img_src_dir, "*.jpg")) # img_path_list = glob.glob(os.path.join(img_src_dir, "*.jpg"))
dataset_val = DatasetTest(img_src_dir, input_size, ext_list=('*.jpg',)) dataset_val = DatasetTest(img_src_dir, input_size, ext_list=('*.jpg',))
sampler_val = DistributedSampler(dataset_val, shuffle=False) sampler_val = DistributedSampler(dataset_val, shuffle=False)
data_loader_val = DataLoader(dataset_val, batch_size=1, sampler=sampler_val, data_loader_val = DataLoader(dataset_val, batch_size=1, sampler=sampler_val,
drop_last=False, collate_fn=ddp_utils.collate_fn, num_workers=2) drop_last=False, collate_fn=ddp_utils.collate_fn, num_workers=2)
img2_path = dataset_dir + "coco/pano_ca_inst/train_org/{}_image_train_org.png".format(prompt) img2_path = "{}/coco/pano_ca_inst/train_org/{}_image_train_org.png".format(args.data_dir, prompt)
tgt2_path = dataset_dir + "coco/pano_ca_inst/train_org/{}_label_train_org.png".format(prompt) tgt2_path = "{}/coco/pano_ca_inst/train_org/{}_label_train_org.png".format(args.data_dir, prompt)
# load the shared prompt image pair # load the shared prompt image pair
img2 = Image.open(img2_path).convert("RGB") img2 = Image.open(img2_path).convert("RGB")
......
...@@ -43,7 +43,7 @@ def get_args_parser(): ...@@ -43,7 +43,7 @@ def get_args_parser():
parser.add_argument('--prompt', type=str, help='prompt image in train set', parser.add_argument('--prompt', type=str, help='prompt image in train set',
default='000000466730') default='000000466730')
parser.add_argument('--input_size', type=int, default=448) parser.add_argument('--input_size', type=int, default=448)
parser.add_argument('--data_dir', type=str, default="datasets")
# distributed training parameters # distributed training parameters
parser.add_argument('--world_size', default=1, type=int, parser.add_argument('--world_size', default=1, type=int,
help='number of distributed processes') help='number of distributed processes')
...@@ -94,9 +94,9 @@ def run_one_image(img, tgt, size, model, out_path, device): ...@@ -94,9 +94,9 @@ def run_one_image(img, tgt, size, model, out_path, device):
if __name__ == '__main__': if __name__ == '__main__':
dataset_dir = "datasets/"
args = get_args_parser() args = get_args_parser()
args = ddp_utils.init_distributed_mode(args) args = ddp_utils.init_distributed_mode(args)
device = torch.device("cuda") device = torch.device("cuda")
ckpt_path = args.ckpt_path ckpt_path = args.ckpt_path
...@@ -120,14 +120,14 @@ if __name__ == '__main__': ...@@ -120,14 +120,14 @@ if __name__ == '__main__':
device = torch.device("cuda") device = torch.device("cuda")
models_painter.to(device) models_painter.to(device)
img_src_dir = dataset_dir + "coco/val2017" img_src_dir = "{}/coco/val2017".format(args.data_dir)
dataset_val = DatasetTest(img_src_dir, input_size, ext_list=('*.jpg',)) dataset_val = DatasetTest(img_src_dir, input_size, ext_list=('*.jpg',))
sampler_val = DistributedSampler(dataset_val, shuffle=False) sampler_val = DistributedSampler(dataset_val, shuffle=False)
data_loader_val = DataLoader(dataset_val, batch_size=1, sampler=sampler_val, data_loader_val = DataLoader(dataset_val, batch_size=1, sampler=sampler_val,
drop_last=False, collate_fn=ddp_utils.collate_fn, num_workers=2) drop_last=False, collate_fn=ddp_utils.collate_fn, num_workers=2)
img2_path = dataset_dir + "coco/train2017/{}.jpg".format(prompt) img2_path = "{}/coco/train2017/{}.jpg".format(args.data_dir, prompt)
tgt2_path = dataset_dir + "coco/pano_sem_seg/panoptic_segm_train2017_with_color/{}.png".format(prompt) tgt2_path = "{}/coco/pano_sem_seg/panoptic_segm_train2017_with_color/{}.png".format(args.data_dir, prompt)
# load the shared prompt image pair # load the shared prompt image pair
img2 = Image.open(img2_path).convert("RGB") img2 = Image.open(img2_path).convert("RGB")
......
...@@ -85,6 +85,7 @@ def get_args_parser(): ...@@ -85,6 +85,7 @@ def get_args_parser():
parser.add_argument('--prompt', type=str, help='prompt image in train set', parser.add_argument('--prompt', type=str, help='prompt image in train set',
default='100') default='100')
parser.add_argument('--input_size', type=int, default=448) parser.add_argument('--input_size', type=int, default=448)
parser.add_argument('--data_dir', type=str, default='datasets')
parser.add_argument('--save', action='store_true', help='save predictions', parser.add_argument('--save', action='store_true', help='save predictions',
default=False) default=False)
return parser.parse_args() return parser.parse_args()
...@@ -112,11 +113,11 @@ if __name__ == '__main__': ...@@ -112,11 +113,11 @@ if __name__ == '__main__':
device = torch.device("cuda") device = torch.device("cuda")
model_painter.to(device) model_painter.to(device)
img_src_dir = "datasets/light_enhance/eval15/low" img_src_dir = "{}/light_enhance/eval15/low".format(args.data_dir)
img_path_list = glob.glob(os.path.join(img_src_dir, "*.png")) img_path_list = glob.glob(os.path.join(img_src_dir, "*.png"))
img2_path = "datasets/light_enhance/our485/low/{}.png".format(prompt) img2_path = "{}/light_enhance/our485/low/{}.png".format(args.data_dir, prompt)
tgt2_path = "datasets/light_enhance/our485/high/{}.png".format(prompt) tgt2_path = "{}/light_enhance/our485/high/{}.png".format(args.data_dir, prompt)
print('prompt: {}'.format(tgt2_path)) print('prompt: {}'.format(tgt2_path))
# load the shared prompt image pair # load the shared prompt image pair
......
...@@ -9,12 +9,13 @@ PROMPT="study_room_0005b/rgb_00094" ...@@ -9,12 +9,13 @@ PROMPT="study_room_0005b/rgb_00094"
MODEL="painter_vit_large_patch16_input896x448_win_dec64_8glb_sl1" MODEL="painter_vit_large_patch16_input896x448_win_dec64_8glb_sl1"
CKPT_PATH="models/${JOB_NAME}/${CKPT_FILE}" CKPT_PATH="models/${JOB_NAME}/${CKPT_FILE}"
DST_DIR="models_inference/${JOB_NAME}/nyuv2_depth_inference_${CKPT_FILE}_${PROMPT}" DST_DIR="models_inference/${JOB_NAME}/nyuv2_depth_inference_${CKPT_FILE}_${PROMPT}"
DATA_DIR="datasets"
# inference # inference
python eval/nyuv2_depth/painter_inference_depth.py \ python eval/nyuv2_depth/painter_inference_depth.py \
--ckpt_path ${CKPT_PATH} --model ${MODEL} --prompt ${PROMPT} --ckpt_path ${CKPT_PATH} --model ${MODEL} --prompt ${PROMPT}
python eval/nyuv2_depth/eval_with_pngs.py \ python eval/nyuv2_depth/eval_with_pngs.py \
--pred_path ${DST_DIR} \ --pred_path ${DST_DIR} \
--gt_path datasets/nyu_depth_v2/official_splits/test/ \ --gt_path ${DATA_DIR}/nyu_depth_v2/official_splits/test/ \
--data_dir ${DATA_DIR} \
--dataset nyu --min_depth_eval 1e-3 --max_depth_eval 10 --eigen_crop --dataset nyu --min_depth_eval 1e-3 --max_depth_eval 10 --eigen_crop
...@@ -61,7 +61,7 @@ def run_one_image(img, tgt, size, model, out_path, device): ...@@ -61,7 +61,7 @@ def run_one_image(img, tgt, size, model, out_path, device):
bool_masked_pos[model.patch_embed.num_patches//2:] = 1 bool_masked_pos[model.patch_embed.num_patches//2:] = 1
bool_masked_pos = bool_masked_pos.unsqueeze(dim=0) bool_masked_pos = bool_masked_pos.unsqueeze(dim=0)
valid = torch.ones_like(tgt) valid = torch.ones_like(tgt)
loss, y, mask = model(x.float().to(device), tgt.float().to(device), bool_masked_pos.to(device), valid.float().to(device)) loss, y, mask = model(x.float().to(device), tgt.float().to(device), bool_masked_pos.to(device), valid.float().to(device))
y = model.unpatchify(y) y = model.unpatchify(y)
y = torch.einsum('nchw->nhwc', y).detach().cpu() y = torch.einsum('nchw->nhwc', y).detach().cpu()
...@@ -72,7 +72,7 @@ def run_one_image(img, tgt, size, model, out_path, device): ...@@ -72,7 +72,7 @@ def run_one_image(img, tgt, size, model, out_path, device):
output = output.mean(-1).int() output = output.mean(-1).int()
output = Image.fromarray(output.numpy()) output = Image.fromarray(output.numpy())
output.save(out_path) output.save(out_path)
def get_args_parser(): def get_args_parser():
parser = argparse.ArgumentParser('NYU Depth V2', add_help=False) parser = argparse.ArgumentParser('NYU Depth V2', add_help=False)
...@@ -83,6 +83,7 @@ def get_args_parser(): ...@@ -83,6 +83,7 @@ def get_args_parser():
parser.add_argument('--prompt', type=str, help='prompt image in train set', parser.add_argument('--prompt', type=str, help='prompt image in train set',
default='study_room_0005b/rgb_00094') default='study_room_0005b/rgb_00094')
parser.add_argument('--input_size', type=int, default=448) parser.add_argument('--input_size', type=int, default=448)
parser.add_argument('--data_dir', type=str, default="datasets")
return parser.parse_args() return parser.parse_args()
...@@ -105,11 +106,10 @@ if __name__ == '__main__': ...@@ -105,11 +106,10 @@ if __name__ == '__main__':
print(dst_dir) print(dst_dir)
if not os.path.exists(dst_dir): if not os.path.exists(dst_dir):
os.makedirs(dst_dir) os.makedirs(dst_dir)
img_src_dir = "{}/nyu_depth_v2/official_splits/test/".format(args.data_dir)
img_src_dir = "datasets/nyu_depth_v2/official_splits/test/"
img_path_list = glob.glob(img_src_dir + "/*/rgb*g") img_path_list = glob.glob(img_src_dir + "/*/rgb*g")
img2_path = "datasets/nyu_depth_v2/sync/{}.jpg".format(args.prompt) img2_path = "{}/nyu_depth_v2/sync/{}.jpg".format(args.data_dir, args.prompt)
tgt_path = "datasets/nyu_depth_v2/sync/{}.png".format(args.prompt.replace('rgb', 'sync_depth')) tgt_path = "{}/nyu_depth_v2/sync/{}.png".format(args.data_dir, args.prompt.replace('rgb', 'sync_depth'))
tgt2_path = tgt_path tgt2_path = tgt_path
res, hres = args.input_size, args.input_size res, hres = args.input_size, args.input_size
......
# 模型唯一标识 # 模型唯一标识
modelCode=xxx modelCode=xxx
# 模型名称 # 模型名称
modelName=hdetr_pytorch modelName=painter_pytorch
# 模型描述 # 模型描述
modelDescription=HDETR引入了一种混合匹配方案,这个新的匹配机制允许将多个查询分配给每个正样本,从而提高了训练效果,适用于多种视觉任务如目标检测、3D物体检测、姿势估计和对象跟踪等 modelDescription=将视觉任务的连续输出空间离散化,并使用语言或专门设计的离散标记作为任务提示,将视觉问题转化为 NLP 问题.
# 应用场景 # 应用场景
appScenario=推理,训练,目标检测,教育,交通,公安 appScenario=推理,训练,图像超分,教育,交通,公安
# 框架类型 # 框架类型
frameType=PyTorch frameType=PyTorch
...@@ -3,4 +3,10 @@ git+https://github.com/cocodataset/panopticapi.git ...@@ -3,4 +3,10 @@ git+https://github.com/cocodataset/panopticapi.git
h5py # for depth h5py # for depth
xtcocotools # for pose xtcocotools # for pose
natsort # for denoising natsort # for denoising
wandb wandb
\ No newline at end of file scikit-image
git+https://github.com/svenkreiss/poseval.git
tensorboard
fvcore==0.1.5
yapf==0.40.1
fairscale==0.4.13
\ No newline at end of file