- 🏆 **Achieved `90.1% Top1` accuracy on ImageNet, the most accurate among open-source models**
- 🏆 **Achieved `65.5 mAP` on the COCO benchmark dataset for object detection, the only model that exceeded `65.0 mAP`**
## Related Projects
### Foundation Models
- [Uni-Perceiver](https://github.com/fundamentalvision/Uni-Perceiver): A pre-training unified architecture for generic perception for zero-shot and few-shot tasks
- [Uni-Perceiver v2](https://arxiv.org/abs/2211.09808): A generalist model for large-scale vision and vision-language tasks
- [M3I-Pretraining](https://github.com/OpenGVLab/M3I-Pretraining): One-stage pre-training paradigm via maximizing multi-modal mutual information
### Autonomous Driving
- [BEVFormer](https://github.com/fundamentalvision/BEVFormer): A cutting-edge baseline for camera-based 3D detection
- [BEVFormer v2](https://arxiv.org/abs/2211.10439): Adapting modern image backbones to Bird's-Eye-View recognition via perspective supervision
## Application in Challenges
- [2022 Waymo 3D Camera-Only Detection Challenge](https://waymo.com/open/challenges/2022/3d-camera-only-detection/): BEVFormer++, built on InternImage, **ranks 1st**
- [nuScenes 3D detection task](https://www.nuscenes.org/object-detection?externalData=all&mapData=all&modalities=Camera): BEVFormer v2 achieves SOTA performance of 64.8 NDS on the nuScenes camera-only track
- [CVPR 2023 Workshop End-to-End Autonomous Driving](https://opendrivelab.com/e2ead/cvpr23): InternImage supports the baselines of the [3D Occupancy Prediction Challenge](https://opendrivelab.com/AD23Challenge.html#Track3) and the [OpenLane Topology Challenge](https://opendrivelab.com/AD23Challenge.html#Track1)
## News
- `Mar 14, 2023`: 🚀 "INTERN-2.5" is released!
- `Feb 28, 2023`: 🚀 InternImage is accepted to CVPR 2023!
...
...
@@ -47,6 +63,7 @@ ADE20K, outperforming previous models by a large margin.
- [ ] Models/APIs for other downstream tasks
- [ ] Support [CVPR 2023 Workshop on End-to-End Autonomous Driving](https://opendrivelab.com/e2ead/cvpr23), see [here](https://github.com/OpenGVLab/InternImage/tree/master/autonomous_driving)
- [ ] Support Segment Anything
- [x] Support extracting intermediate features, see [here](classification/extract_feature.py)
- [x] Low-cost training with [DeepSpeed](https://github.com/microsoft/DeepSpeed), see [here](https://github.com/OpenGVLab/InternImage/tree/master/classification)
- [x] Compiling-free .whl package of DCNv3 operator, see [here](https://github.com/OpenGVLab/InternImage/releases/tag/whl_files)
- [x] InternImage-H(1B)/G(3B)
...
...
@@ -266,11 +283,6 @@ For more details on building custom ops, please refering to [this document](http
# Online HD Map Construction Challenge For Autonomous Driving
</div>
We train a fast version of `vectormapnet_intern`.
For detailed information about the challenge, please refer to https://github.com/Tsinghua-MARS-Lab/Online-HD-Map-Construction-CVPR2023/tree/master
The evaluation metrics of this challenge follow [HDMapNet](https://arxiv.org/abs/2107.06307). We provide [VectorMapNet](https://arxiv.org/abs/2206.08920) as the baseline. Please cite:
```
@article{li2021hdmapnet,
title={HDMapNet: An Online HD Map Construction and Evaluation Framework},
author={Qi Li and Yue Wang and Yilun Wang and Hang Zhao},
journal={arXiv preprint arXiv:2107.06307},
year={2021}
}
```
Our dataset is built on top of the [Argoverse 2](https://www.argoverse.org/av2.html) dataset. Please also cite:
```
@INPROCEEDINGS {Argoverse2,
author = {Benjamin Wilson and William Qi and Tanmay Agarwal and John Lambert and Jagjeet Singh and Siddhesh Khandelwal and Bowen Pan and Ratnesh Kumar and Andrew Hartnett and Jhony Kaesemodel Pontes and Deva Ramanan and Peter Carr and James Hays},
title = {Argoverse 2: Next Generation Datasets for Self-driving Perception and Forecasting},
booktitle = {Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks (NeurIPS Datasets and Benchmarks 2021)},
year = {2021}
}
```
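The evaluation metrics mentioned above follow HDMapNet, which compares predicted and ground-truth map elements using Chamfer distance. The sketch below is only an illustration of that idea, not the challenge's official evaluation code: the function names `chamfer_distance` and `is_match`, the averaging convention, and the threshold value are all assumptions made for this example.

```python
import math


def directed_chamfer(src, dst):
    """Mean distance from each point in src to its nearest point in dst."""
    return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)


def chamfer_distance(pred, gt):
    """Symmetric Chamfer distance between two 2D point sets.

    Assumption: the two directed terms are averaged; other conventions
    (for example summing them) also appear in practice.
    """
    return 0.5 * (directed_chamfer(pred, gt) + directed_chamfer(gt, pred))


def is_match(pred, gt, threshold):
    """Hypothetical matching rule: a prediction counts as a match when its
    Chamfer distance to the ground truth is below the threshold."""
    return chamfer_distance(pred, gt) < threshold


if __name__ == "__main__":
    # Two parallel polylines sampled at three points each, one unit apart.
    pred_line = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    gt_line = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
    print(chamfer_distance(pred_line, gt_line))  # 1.0
    print(is_match(pred_line, gt_line, threshold=1.5))  # True
```

In an AP-style metric, a rule like `is_match` would be applied at one or more distance thresholds to decide true positives; the thresholds used by the challenge are defined in its own repository.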
## License
Before participating in our challenge, you should register on the website and agree to the terms of use of the [Argoverse 2](https://www.argoverse.org/av2.html) dataset.
All code in this project is released under the [GNU General Public License v3.0](./LICENSE).