Commit 41b18fd8 authored by zhe chen

Use pre-commit to reformat code


parent ff20ea39
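The edits below (import sorting, double-to-single quote normalization, added trailing newlines) match common pre-commit hooks. The repository's actual hook configuration is not part of this diff, so the following `.pre-commit-config.yaml` is an illustrative assumption, not the repo's file:

```yaml
# Hypothetical .pre-commit-config.yaml -- the actual hooks are not shown
# in this commit; these are the standard hooks that produce such edits.
repos:
  - repo: https://github.com/PyCQA/isort
    rev: 5.12.0
    hooks:
      - id: isort                       # sorts and groups imports
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0
    hooks:
      - id: double-quote-string-fixer   # rewrites "..." to '...'
      - id: end-of-file-fixer           # adds missing trailing newlines
```

After `pip install pre-commit`, running `pre-commit run --all-files` applies the hooks to the whole tree, which is the kind of one-shot reformat a commit like this typically comes from.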
@@ -4,17 +4,15 @@
 # Licensed under The MIT License [see LICENSE for details]
 # --------------------------------------------------------
-from __future__ import absolute_import
-from __future__ import print_function
-from __future__ import division
-import math
+from __future__ import absolute_import, division, print_function
 import time
 import torch
 import torch.nn as nn
+import math
-from torch.autograd import gradcheck
 from functions.dcnv3_func import DCNv3Function, dcnv3_core_pytorch
+from torch.autograd import gradcheck
 H_in, W_in = 8, 8
 N, M, D = 2, 4, 16
...
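`test.py` verifies the CUDA operator with `torch.autograd.gradcheck`, which compares analytic gradients against finite differences in double precision. The mechanism can be sketched without torch (a toy scalar function, not the DCNv3 test itself):

```python
# Minimal, torch-free illustration of what torch.autograd.gradcheck does:
# compare an analytic gradient against central finite differences.
def numeric_grad(f, x, eps=1e-6):
    """Central-difference gradient of scalar f at point x (a list of floats)."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        grad.append((f(xp) - f(xm)) / (2 * eps))
    return grad

def gradcheck_like(f, analytic_grad, x, tol=1e-4):
    """True if the analytic and numeric gradients agree within tol."""
    return all(abs(a - n) <= tol
               for a, n in zip(analytic_grad(x), numeric_grad(f, x)))

# f(x) = sum(x_i^2) has gradient 2x, so the check passes:
f = lambda x: sum(v * v for v in x)
grad_f = lambda x: [2 * v for v in x]
print(gradcheck_like(f, grad_f, [1.0, -2.0, 0.5]))  # True
```

The real test runs this comparison on `DCNv3Function` against the reference `dcnv3_core_pytorch`, which is why "all checking is True" is the expected output.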
@@ -35,7 +35,7 @@ def build_optimizer(config, model):
     optimizer = None
     use_zero = config.TRAIN.OPTIMIZER.USE_ZERO
     if use_zero:
-        print(f"\nUse Zero!")
+        print(f'\nUse Zero!')
     if opt_lower == 'sgd':
         # an ugly implementation
         # this problem is fixed after torch 1.12
@@ -119,7 +119,7 @@ def set_weight_decay_and_lr(
         if f'levels.{i}' in name:
             param.requires_grad = False
         # 1. check wd
-        if len(param.shape) == 1 or name.endswith(".bias") or (
+        if len(param.shape) == 1 or name.endswith('.bias') or (
                 name in skip_list) or check_keywords_in_name(
                 name, skip_keywords):
            wd = 0.
...
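The condition reformatted above, from `set_weight_decay_and_lr`, disables weight decay for 1-D parameters, biases, explicitly skipped names, and keyword matches. A torch-free sketch of the same decision logic (the parameter names below are illustrative, not taken from the repo):

```python
def should_skip_weight_decay(name, shape, skip_list=(), skip_keywords=()):
    """Mirror the check in the diff: no weight decay for 1-D tensors
    (norm scales), biases, skipped names, or names containing keywords."""
    if len(shape) == 1 or name.endswith('.bias') or name in skip_list:
        return True
    return any(kw in name for kw in skip_keywords)

# Hypothetical (name, shape) pairs standing in for model.named_parameters():
params = [
    ('levels.0.conv.weight', (64, 3, 3, 3)),   # 4-D weight -> decayed
    ('levels.0.conv.bias',   (64,)),           # bias       -> not decayed
    ('levels.0.norm.weight', (64,)),           # 1-D        -> not decayed
    ('head.weight',          (1000, 64)),      # 2-D weight -> decayed
]
no_decay = [n for n, s in params if should_skip_weight_decay(n, s)]
print(no_decay)  # ['levels.0.conv.bias', 'levels.0.norm.weight']
```

In the real optimizer builder, the two groups become separate param groups, with `weight_decay=0` on the no-decay group, so regularization never shrinks norm scales or biases.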
@@ -4,13 +4,15 @@
 # Licensed under The MIT License [see LICENSE for details]
 # --------------------------------------------------------
-import os
 import math
-import torch
+import os
-from collections import OrderedDict
 import numpy as np
+import torch
 import torch.distributed as dist
+from collections import OrderedDict
 from timm.utils import get_state_dict
 try:
     # noinspection PyUnresolvedReferences
     from apex import amp
...
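The `try`/`except` around the apex import is the usual optional-dependency pattern: the module loads whether or not NVIDIA apex is installed, and downstream code branches on availability. A minimal sketch (the helper below is hypothetical, not from the repo):

```python
try:
    # noinspection PyUnresolvedReferences
    from apex import amp  # optional NVIDIA mixed-precision package
    HAS_APEX = True
except ImportError:
    amp = None
    HAS_APEX = False

def collect_params(optimizer):
    """Hypothetical helper: gather parameters from the right place
    depending on whether apex amp is available."""
    if HAS_APEX:
        # apex keeps fp32 master copies of the weights
        return list(amp.master_params(optimizer))
    # plain torch optimizers expose params via param_groups
    return [p for group in optimizer.param_groups for p in group['params']]
```

The same shape of fallback lets checkpoint-saving code include or skip `amp.state_dict()` without a hard dependency on apex.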
# InternImage for Object Detection

This folder contains the implementation of InternImage for object detection.
Our detection code is developed on top of [MMDetection v2.28.1](https://github.com/open-mmlab/mmdetection/tree/v2.28.1).

## Usage

### Install
@@ -28,6 +27,7 @@ conda activate internimage
- Install `PyTorch>=1.10.0` and `torchvision>=0.9.0` with `CUDA>=10.2`:

  For example, to install torch==1.11 with CUDA==11.3:

  ```bash
  pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 -f https://download.pytorch.org/whl/torch_stable.html
  ```
@@ -47,12 +47,14 @@ pip install opencv-python termcolor yacs pyyaml scipy
```
- Compile CUDA operators

  ```bash
  cd ./ops_dcnv3
  sh ./make.sh
  # unit test (should see all checking is True)
  python test.py
  ```
- You can also install the operator using the .whl files:
  [DCNv3-1.0-whl](https://github.com/OpenGVLab/InternImage/releases/tag/whl_files)
@@ -61,7 +63,6 @@ python test.py

Prepare COCO according to the guidelines in [MMDetection v2.28.1](https://github.com/open-mmlab/mmdetection/blob/master/docs/en/1_exist_data_model.md).

### Evaluation

To evaluate our `InternImage` on COCO val, run:
@@ -107,6 +108,7 @@ GPUS=32 sh slurm_train.sh <partition> <job-name> configs/coco/cascade_internimag

### Export

To export a detection model from PyTorch to TensorRT, run:

```shell
MODEL="model_name"
CKPT_PATH="/path/to/model/ckpt.pth"
@@ -122,6 +124,7 @@ python deploy.py \
```

For example, to export `mask_rcnn_internimage_t_fpn_1x_coco` from PyTorch to TensorRT, run:

```shell
MODEL="mask_rcnn_internimage_t_fpn_1x_coco"
CKPT_PATH="/path/to/model/ckpt/mask_rcnn_internimage_t_fpn_1x_coco.pth"
...
@@ -46,4 +46,4 @@ data = dict(
         ann_file=data_root + 'annotations/instances_val2017.json',
         img_prefix=data_root + 'val2017/',
         pipeline=test_pipeline))
-evaluation = dict(interval=1, metric='bbox', classwise=True)
\ No newline at end of file
+evaluation = dict(interval=1, metric='bbox', classwise=True)
@@ -125,4 +125,4 @@ model = dict(
             score_thr=0.05,
             nms=dict(type='nms', iou_threshold=0.5),
             max_per_img=100,
-            mask_thr_binary=0.5)))
\ No newline at end of file
+            mask_thr_binary=0.5)))
# COCO

## Introduction

Introduced by Lin et al. in [Microsoft COCO: Common Objects in Context](https://arxiv.org/pdf/1405.0312v3.pdf)
@@ -11,19 +10,18 @@ Splits: The first version of MS COCO dataset was released in 2014. It contains 1
Based on community feedback, in 2017 the training/validation split was changed from 83K/41K to 118K/5K. The new split uses the same images and annotations. The 2017 test set is a subset of 41K images of the 2015 test set. Additionally, the 2017 release contains a new unannotated dataset of 123K images.

## Model Zoo

### Mask R-CNN + InternImage
| backbone | schd | box mAP | mask mAP | train speed | train time | #param | FLOPs | Config | Download |
| :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
| InternImage-T | 1x | 47.2 | 42.5 | 0.36s / iter | 9h | 49M | 270G | [config](./mask_rcnn_internimage_t_fpn_1x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_t_fpn_1x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_t_fpn_1x_coco.log.json) |
| InternImage-T | 3x | 49.1 | 43.7 | 0.34s / iter | 26h | 49M | 270G | [config](./mask_rcnn_internimage_t_fpn_3x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_t_fpn_3x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_t_fpn_3x_coco.log.json) |
| InternImage-S | 1x | 47.8 | 43.3 | 0.40s / iter | 10h | 69M | 340G | [config](./mask_rcnn_internimage_s_fpn_1x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_s_fpn_1x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_s_fpn_1x_coco.log.json) |
| InternImage-S | 3x | 49.7 | 44.5 | 0.40s / iter | 30h | 69M | 340G | [config](./mask_rcnn_internimage_s_fpn_3x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_s_fpn_3x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_s_fpn_3x_coco.log.json) |
| InternImage-B | 1x | 48.8 | 44.0 | 0.45s / iter | 11.5h | 115M | 501G | [config](./mask_rcnn_internimage_b_fpn_1x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_b_fpn_1x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_b_fpn_1x_coco.log.json) |
| InternImage-B | 3x | 50.3 | 44.8 | 0.45s / iter | 34h | 115M | 501G | [config](./mask_rcnn_internimage_b_fpn_3x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_b_fpn_3x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/mask_rcnn_internimage_b_fpn_3x_coco.log.json) |
- Training speed is measured on A100 GPUs with the current code and may be faster than the speed recorded in the logs.
- Some logs come from recently retrained models, so results may differ slightly from those in the paper.
@@ -31,22 +29,21 @@ Based on community feedback, in 2017 the training/validation split was changed f

### Cascade Mask R-CNN + InternImage
| backbone | schd | box mAP | mask mAP | train speed | train time | #param | FLOPs | Config | Download |
| :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
| InternImage-L | 1x | 54.9 | 47.7 | 0.73s / iter | 18h | 277M | 1399G | [config](./cascade_internimage_l_fpn_1x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/cascade_internimage_l_fpn_1x_coco.pth) |
| InternImage-L | 3x | 56.1 | 48.5 | 0.79s / iter | 15h (4n) | 277M | 1399G | [config](./cascade_internimage_l_fpn_3x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/cascade_internimage_l_fpn_3x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/cascade_internimage_l_fpn_3x_coco.log.json) |
| InternImage-XL | 1x | 55.3 | 48.1 | 0.82s / iter | 21h | 387M | 1782G | [config](./cascade_internimage_xl_fpn_1x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/cascade_internimage_xl_fpn_1x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/cascade_internimage_xl_fpn_1x_coco.log.json) |
| InternImage-XL | 3x | 56.2 | 48.8 | 0.91s / iter | 17h (4n) | 387M | 1782G | [config](./cascade_internimage_xl_fpn_3x_coco.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/cascade_internimage_xl_fpn_3x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/cascade_internimage_xl_fpn_3x_coco.log.json) |
- Training speed is measured on A100 GPUs with the current code and may be faster than the speed recorded in the logs.
- Some logs come from recently retrained models, so results may differ slightly from those in the paper.
- Set `with_cp=True` to save memory if you encounter out-of-memory issues.
### DINO + InternImage
| backbone | lr type | pretrain | schd | box mAP | train time | #param | Config | Download |
| :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
| InternImage-T | layer-wise lr | ImageNet-1K | 1x | 53.9 | 9.5h | 49M | [config](./dino_4scale_internimage_t_1x_coco_layer_wise_lr.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_internimage_t_1x_coco.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_internimage_t_1x_coco.json) |
| InternImage-L | layer-wise lr | ImageNet-22K | 1x | 57.5 | 18h | 241M | [config](./dino_4scale_internimage_l_1x_coco_layer_wise_lr.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_internimage_l_1x_coco_layer_wise_lr.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_internimage_l_1x_coco_layer_wise_lr.log.json) |
| InternImage-L | 0.1x backbone lr | ImageNet-22K | 1x | 57.6 | 18h | 241M | [config](./dino_4scale_internimage_l_1x_coco_0.1x_backbone_lr.py) | [ckpt](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_internimage_l_1x_coco_0.1x_backbone_lr.pth) \| [log](https://huggingface.co/OpenGVLab/InternImage/resolve/main/dino_4scale_internimage_l_1x_coco_0.1x_backbone_lr.log.json) |
@@ -106,4 +106,4 @@ checkpoint_config = dict(
     interval=1,
     max_keep_ckpts=3,
     save_last=True,
-)
\ No newline at end of file
+)
@@ -106,4 +106,4 @@ checkpoint_config = dict(
     interval=1,
     max_keep_ckpts=3,
     save_last=True,
-)
\ No newline at end of file
+)
@@ -175,4 +175,4 @@ checkpoint_config = dict(
     interval=1,
     max_keep_ckpts=3,
     save_last=True,
-)
\ No newline at end of file
+)
@@ -174,4 +174,4 @@ checkpoint_config = dict(
     interval=1,
     max_keep_ckpts=3,
     save_last=True,
-)
\ No newline at end of file
+)
@@ -174,4 +174,4 @@ checkpoint_config = dict(
     interval=1,
     max_keep_ckpts=3,
     save_last=True,
-)
\ No newline at end of file
+)
@@ -46,4 +46,4 @@ checkpoint_config = dict(
     interval=1,
     max_keep_ckpts=3,
     save_last=True,
-)
\ No newline at end of file
+)
@@ -89,4 +89,4 @@ checkpoint_config = dict(
     interval=1,
     max_keep_ckpts=3,
     save_last=True,
-)
\ No newline at end of file
+)
@@ -46,4 +46,4 @@ checkpoint_config = dict(
     interval=1,
     max_keep_ckpts=3,
     save_last=True,
-)
\ No newline at end of file
+)
@@ -89,4 +89,4 @@ checkpoint_config = dict(
     interval=1,
     max_keep_ckpts=3,
     save_last=True,
-)
\ No newline at end of file
+)
@@ -46,4 +46,4 @@ checkpoint_config = dict(
     interval=1,
     max_keep_ckpts=3,
     save_last=True,
-)
\ No newline at end of file
+)
@@ -47,4 +47,4 @@ checkpoint_config = dict(
     interval=1,
     max_keep_ckpts=3,
     save_last=True,
-)
\ No newline at end of file
+)
@@ -89,4 +89,4 @@ checkpoint_config = dict(
     interval=1,
     max_keep_ckpts=3,
     save_last=True,
-)
\ No newline at end of file
+)
# CrowdHuman

## Introduction

Introduced by Shao et al. in [CrowdHuman: A Benchmark for Detecting Human in a Crowd](https://arxiv.org/pdf/1805.00123.pdf)
@@ -8,6 +7,7 @@ Introduced by Shao et al. in [CrowdHuman: A Benchmark for Detecting Human in a C
CrowdHuman is a benchmark dataset for evaluating detectors in crowd scenarios. The dataset is large, richly annotated, and highly diverse. CrowdHuman contains 15,000, 4,370, and 5,000 images for training, validation, and testing, respectively. There are a total of 470K human instances in the train and validation subsets, with an average of 23 persons per image and various kinds of occlusion. Each human instance is annotated with a head bounding box, a human visible-region bounding box, and a human full-body bounding box. The authors hope the dataset will serve as a solid baseline and help promote future research in human detection tasks.

## Prepare the data

Download the original dataset from [CrowdHuman](https://www.crowdhuman.org/download.html), then convert the annotations with detection/tools/create_crowd_anno.py.

- The data tree of CrowdHuman should look like:
@@ -25,16 +25,15 @@ Download the original dataset from [CrowdHuman](https://www.crowdhuman.org/downl
├── 1074488,79d54000c6f9d9e5.jpg
└── ...
```

## Model Zoo
### Cascade Mask R-CNN + InternImage

| backbone | schd | box mAP | mask mAP | train speed | train time | #param | FLOPs | Config | Download |
| :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: |
| InternImage-XL | 3x | TBD | TBD | TBD | TBD | TBD | TBD | [config](./cascade_internimage_xl_fpn_3x_crowd_human.py) | TBD |
- Training speed is measured on A100 GPUs with the current code and may be faster than the speed recorded in the logs.
- Some logs come from recently retrained models, so results may differ slightly from those in the paper.