description: Explore the DOTA8 dataset - a small, versatile oriented object detection dataset ideal for testing and debugging object detection models using Ultralytics YOLO11.
[Ultralytics](https://www.ultralytics.com/) DOTA8 is a small, but versatile oriented [object detection](https://www.ultralytics.com/glossary/object-detection) dataset composed of the first 8 images of the split DOTAv1 set, 4 for training and 4 for validation. This dataset is ideal for testing and debugging object detection models, or for experimenting with new detection approaches. With 8 images, it is small enough to be easily manageable, yet diverse enough to test training pipelines for errors and act as a sanity check before training larger datasets.
This dataset is intended for use with Ultralytics [HUB](https://hub.ultralytics.com/) and [YOLO11](https://github.com/ultralytics/ultralytics).
## Dataset YAML
A YAML (Yet Another Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. In the case of the DOTA8 dataset, the `dota8.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/dota8.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/dota8.yaml).
!!! example "ultralytics/cfg/datasets/dota8.yaml"
```yaml
--8<-- "ultralytics/cfg/datasets/dota8.yaml"
```
## Usage
To train a YOLO11n-obb model on the DOTA8 dataset for 100 [epochs](https://www.ultralytics.com/glossary/epoch) with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-obb.pt") # load a pretrained model (recommended for training)
## Sample Images and Annotations

- **Mosaiced Image**: This image demonstrates a training batch composed of mosaiced dataset images. Mosaicing is a technique used during training that combines multiple images into a single image to increase the variety of objects and scenes within each training batch. This helps improve the model's ability to generalize to different object sizes, aspect ratios, and contexts.
The example showcases the variety and complexity of the images in the DOTA8 dataset and the benefits of using mosaicing during the training process.
## Citations and Acknowledgments
If you use the DOTA dataset in your research or development work, please cite the following paper:
!!! quote ""
=== "BibTeX"
```bibtex
@article{9560031,
author={Ding, Jian and Xue, Nan and Xia, Gui-Song and Bai, Xiang and Yang, Wen and Yang, Michael and Belongie, Serge and Luo, Jiebo and Datcu, Mihai and Pelillo, Marcello and Zhang, Liangpei},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
title={Object Detection in Aerial Images: A Large-Scale Benchmark and Challenges},
year={2021},
volume={},
number={},
pages={1-1},
doi={10.1109/TPAMI.2021.3117983}
}
```
A special note of gratitude to the team behind the DOTA datasets for their commendable effort in curating this dataset. For an exhaustive understanding of the dataset and its nuances, please visit the [official DOTA website](https://captain-whu.github.io/DOTA/index.html).
## FAQ
### What is the DOTA8 dataset and how can it be used?
The DOTA8 dataset is a small, versatile oriented object detection dataset made up of the first 8 images from the DOTAv1 split set, with 4 images designated for training and 4 for validation. It's ideal for testing and debugging object detection models like Ultralytics YOLO11. Due to its manageable size and diversity, it helps in identifying pipeline errors and running sanity checks before training on larger datasets. Learn more about object detection with [Ultralytics YOLO11](https://github.com/ultralytics/ultralytics).
### How do I train a YOLO11 model using the DOTA8 dataset?
To train a YOLO11n-obb model on the DOTA8 dataset for 100 epochs with an image size of 640, you can use the following code snippets. For comprehensive argument options, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-obb.pt") # load a pretrained model (recommended for training)
### What are the key features of the DOTA dataset and where can I access the YAML file?
The DOTA dataset is known for its large-scale benchmark and the challenges it presents for object detection in aerial images. The DOTA8 subset is a smaller, manageable dataset ideal for initial tests. You can access the `dota8.yaml` file, which contains paths, classes, and configuration details, at this [GitHub link](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/dota8.yaml).
### How does mosaicing enhance model training with the DOTA8 dataset?
Mosaicing combines multiple images into one during training, increasing the variety of objects and contexts within each batch. This improves a model's ability to generalize to different object sizes, aspect ratios, and scenes. This technique can be visually demonstrated through a training batch composed of mosaiced DOTA8 dataset images, helping in robust model development. Explore more about mosaicing and training techniques on our [Training](../../modes/train.md) page.
### Why should I use Ultralytics YOLO11 for object detection tasks?
Ultralytics YOLO11 provides state-of-the-art real-time object detection capabilities, including features like oriented bounding boxes (OBB), [instance segmentation](https://www.ultralytics.com/glossary/instance-segmentation), and a highly versatile training pipeline. It's suitable for various applications and offers pretrained models for efficient fine-tuning. Explore further about the advantages and usage in the [Ultralytics YOLO11 documentation](https://github.com/ultralytics/ultralytics).
description: Discover OBB dataset formats for Ultralytics YOLO models. Learn about their structure, application, and format conversions to enhance your object detection training.
Training a precise [object detection](https://www.ultralytics.com/glossary/object-detection) model with oriented bounding boxes (OBB) requires a thorough dataset. This guide explains the various OBB dataset formats compatible with Ultralytics YOLO models, offering insights into their structure, application, and methods for format conversions.
## Supported OBB Dataset Formats
### YOLO OBB Format
The YOLO OBB format designates bounding boxes by their four corner points with coordinates normalized between 0 and 1. It follows this format:
```bash
class_index x1 y1 x2 y2 x3 y3 x4 y4
```
Internally, YOLO processes losses and outputs in the `xywhr` format, which represents the [bounding box](https://www.ultralytics.com/glossary/bounding-box)'s center point (xy), width, height, and rotation.
<p align="center"><img width="800" src="https://github.com/ultralytics/docs/releases/download/0/obb-format-examples.avif" alt="OBB format examples"></p>
An example of a `*.txt` label file for the above image, which contains an object of class `0` in OBB format, could look like:
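```bash
0 0.780811 0.743961 0.782371 0.746860 0.777691 0.752174 0.776131 0.749758
```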
## Supported Datasets

Currently, the following datasets with Oriented Bounding Boxes are supported:

- [DOTA-v1](dota-v2.md): The first version of the DOTA dataset, providing a comprehensive set of aerial images with oriented bounding boxes for object detection.
- [DOTA-v1.5](dota-v2.md): An intermediate version of the DOTA dataset, offering additional annotations and improvements over DOTA-v1 for enhanced object detection tasks.
- [DOTA-v2](dota-v2.md): DOTA (A Large-scale Dataset for Object Detection in Aerial Images) version 2 emphasizes detection from aerial perspectives and contains oriented bounding boxes with 1.7 million instances and 11,268 images.
- [DOTA8](dota8.md): A small, 8-image subset of the full DOTA dataset suitable for testing workflows and Continuous Integration (CI) checks of OBB training in the `ultralytics` repository.
### Incorporating your own OBB dataset
For those looking to introduce their own datasets with oriented bounding boxes, ensure compatibility with the "YOLO OBB format" mentioned above. Convert your annotations to this required format and detail the paths, classes, and class names in a corresponding YAML configuration file.
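As a rough sketch, a minimal YAML configuration for a hypothetical custom OBB dataset (all paths and class names below are placeholders) might look like:

```yaml
# custom-obb.yaml (hypothetical example)
path: ../datasets/custom-obb # dataset root dir
train: images/train # training images (relative to 'path')
val: images/val # validation images (relative to 'path')

# Classes
names:
  0: plane
  1: ship
```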
## Convert Label Formats
### DOTA Dataset Format to YOLO OBB Format
Transitioning labels from the DOTA dataset format to the YOLO OBB format can be achieved with this script:
!!! example
=== "Python"
```python
from ultralytics.data.converter import convert_dota_to_yolo_obb

# Convert DOTA-format annotations under the dataset root to YOLO OBB labels
convert_dota_to_yolo_obb("path/to/DOTA")
```
This conversion mechanism is instrumental for datasets in the DOTA format, ensuring alignment with the Ultralytics YOLO OBB format.
It's imperative to validate the compatibility of the dataset with your model and adhere to the necessary format conventions. Properly structured datasets are pivotal for training efficient object detection models with oriented bounding boxes.
## FAQ
### What are Oriented Bounding Boxes (OBB) and how are they used in Ultralytics YOLO models?
Oriented Bounding Boxes (OBB) are a type of bounding box annotation where the box can be rotated to align more closely with the object being detected, rather than just being axis-aligned. This is particularly useful in aerial or satellite imagery where objects might not be aligned with the image axes. In Ultralytics YOLO models, OBBs are represented by their four corner points in the YOLO OBB format. This allows for more accurate object detection since the bounding boxes can rotate to fit the objects better.
### How do I convert my existing DOTA dataset labels to YOLO OBB format for use with Ultralytics YOLO11?
You can convert DOTA dataset labels to YOLO OBB format using the `convert_dota_to_yolo_obb` function from Ultralytics. This conversion ensures compatibility with the Ultralytics YOLO models, enabling you to leverage the OBB capabilities for enhanced object detection. Here's a quick example:
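```python
from ultralytics.data.converter import convert_dota_to_yolo_obb

# Convert DOTA-format annotations under the dataset root to YOLO OBB labels
convert_dota_to_yolo_obb("path/to/DOTA")
```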
This script will reformat your DOTA annotations into a YOLO-compatible format.
### How do I train a YOLO11 model with oriented bounding boxes (OBB) on my dataset?
Training a YOLO11 model with OBBs involves ensuring your dataset is in the YOLO OBB format and then using the Ultralytics API to train the model. Here's an example in both Python and CLI:
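!!! example "Train Example"

=== "Python"

```python
from ultralytics import YOLO

# Load a pretrained YOLO11n-OBB model (recommended for training)
model = YOLO("yolo11n-obb.pt")

# Train on your OBB dataset; "your_dataset.yaml" is a placeholder for your own config
results = model.train(data="your_dataset.yaml", epochs=100, imgsz=640)
```

=== "CLI"

```bash
# Equivalent CLI command ("your_dataset.yaml" is a placeholder)
yolo obb train data=your_dataset.yaml model=yolo11n-obb.pt epochs=100 imgsz=640
```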
This ensures your model leverages the detailed OBB annotations for improved detection [accuracy](https://www.ultralytics.com/glossary/accuracy).
### What datasets are currently supported for OBB training in Ultralytics YOLO models?
Currently, Ultralytics supports the following datasets for OBB training:
- [DOTA-v1](dota-v2.md): The first version of the DOTA dataset, providing a comprehensive set of aerial images with oriented bounding boxes for object detection.
- [DOTA-v1.5](dota-v2.md): An intermediate version of the DOTA dataset, offering additional annotations and improvements over DOTA-v1 for enhanced object detection tasks.
- [DOTA-v2](dota-v2.md): This dataset includes 1.7 million instances with oriented bounding boxes and 11,268 images, primarily focusing on aerial object detection.
- [DOTA8](dota8.md): A smaller, 8-image subset of the DOTA dataset used for testing and continuous integration (CI) checks.
These datasets are tailored for scenarios where OBBs offer a significant advantage, such as aerial and satellite image analysis.
### Can I use my own dataset with oriented bounding boxes for YOLO11 training, and if so, how?
Yes, you can use your own dataset with oriented bounding boxes for YOLO11 training. Ensure your dataset annotations are converted to the YOLO OBB format, which involves defining bounding boxes by their four corner points. You can then create a YAML configuration file specifying the dataset paths, classes, and other necessary details. For more information on creating and configuring your datasets, refer to the [Supported Datasets](#supported-datasets) section.
description: Explore the COCO-Pose dataset for advanced pose estimation. Learn about datasets, pretrained models, metrics, and applications for training with YOLO.
The [COCO-Pose](https://cocodataset.org/#keypoints-2017) dataset is a specialized version of the COCO (Common Objects in Context) dataset, designed for pose estimation tasks. It leverages the COCO Keypoints 2017 images and labels to enable the training of models like YOLO for pose estimation tasks.
## Key Features

- COCO-Pose builds upon the COCO Keypoints 2017 dataset, which contains 200K images labeled with keypoints for pose estimation tasks.
- The dataset supports 17 keypoints for human figures, facilitating detailed pose estimation.
- Like COCO, it provides standardized evaluation metrics, including Object Keypoint Similarity (OKS) for pose estimation tasks, making it suitable for comparing model performance.
## Dataset Structure
The COCO-Pose dataset is split into three subsets:
1. **Train2017**: This subset contains a portion of the 118K images from the COCO dataset, annotated for training pose estimation models.
2. **Val2017**: This subset has a selection of images used for validation purposes during model training.
3. **Test2017**: This subset consists of images used for testing and benchmarking the trained models. Ground truth annotations for this subset are not publicly available, and the results are submitted to the [COCO evaluation server](https://codalab.lisn.upsaclay.fr/competitions/7384) for performance evaluation.
## Applications
The COCO-Pose dataset is specifically used for training and evaluating [deep learning](https://www.ultralytics.com/glossary/deep-learning-dl) models in keypoint detection and pose estimation tasks, such as OpenPose. The dataset's large number of annotated images and standardized evaluation metrics make it an essential resource for [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) researchers and practitioners focused on pose estimation.
## Dataset YAML
A YAML (Yet Another Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. In the case of the COCO-Pose dataset, the `coco-pose.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco-pose.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco-pose.yaml).
!!! example "ultralytics/cfg/datasets/coco-pose.yaml"
```yaml
--8<-- "ultralytics/cfg/datasets/coco-pose.yaml"
```
## Usage
To train a YOLO11n-pose model on the COCO-Pose dataset for 100 [epochs](https://www.ultralytics.com/glossary/epoch) with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-pose.pt") # load a pretrained model (recommended for training)
## Sample Images and Annotations

The COCO-Pose dataset contains a diverse set of images with human figures annotated with keypoints. Here are some examples of images from the dataset, along with their corresponding annotations:
- **Mosaiced Image**: This image demonstrates a training batch composed of mosaiced dataset images. Mosaicing is a technique used during training that combines multiple images into a single image to increase the variety of objects and scenes within each training batch. This helps improve the model's ability to generalize to different object sizes, aspect ratios, and contexts.
The example showcases the variety and complexity of the images in the COCO-Pose dataset and the benefits of using mosaicing during the training process.
## Citations and Acknowledgments
If you use the COCO-Pose dataset in your research or development work, please cite the following paper:
!!! quote ""
=== "BibTeX"
```bibtex
@misc{lin2015microsoft,
title={Microsoft COCO: Common Objects in Context},
author={Tsung-Yi Lin and Michael Maire and Serge Belongie and Lubomir Bourdev and Ross Girshick and James Hays and Pietro Perona and Deva Ramanan and C. Lawrence Zitnick and Piotr Dollár},
year={2015},
eprint={1405.0312},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
We would like to acknowledge the COCO Consortium for creating and maintaining this valuable resource for the computer vision community. For more information about the COCO-Pose dataset and its creators, visit the [COCO dataset website](https://cocodataset.org/#home).
## FAQ
### What is the COCO-Pose dataset and how is it used with Ultralytics YOLO for pose estimation?
The [COCO-Pose](https://cocodataset.org/#keypoints-2017) dataset is a specialized version of the COCO (Common Objects in Context) dataset designed for pose estimation tasks. It builds upon the COCO Keypoints 2017 images and annotations, allowing for the training of models like Ultralytics YOLO for detailed pose estimation. For instance, you can use the COCO-Pose dataset to train a YOLO11n-pose model by loading a pretrained model and training it with a YAML configuration. For training examples, refer to the [Training](../../modes/train.md) documentation.
### How can I train a YOLO11 model on the COCO-Pose dataset?
Training a YOLO11 model on the COCO-Pose dataset can be accomplished using either Python or CLI commands. For example, to train a YOLO11n-pose model for 100 epochs with an image size of 640, you can follow the steps below:
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-pose.pt") # load a pretrained model (recommended for training)
For more details on the training process and available arguments, check the [training page](../../modes/train.md).
### What are the different metrics provided by the COCO-Pose dataset for evaluating model performance?
The COCO-Pose dataset provides several standardized evaluation metrics for pose estimation tasks, similar to the original COCO dataset. Key metrics include the Object Keypoint Similarity (OKS), which evaluates the [accuracy](https://www.ultralytics.com/glossary/accuracy) of predicted keypoints against ground truth annotations. These metrics allow for thorough performance comparisons between different models. For instance, the COCO-Pose pretrained models such as YOLO11n-pose, YOLO11s-pose, and others have specific performance metrics listed in the documentation, like mAP<sup>pose</sup>50-95 and mAP<sup>pose</sup>50.
### How is the dataset structured and split for the COCO-Pose dataset?
The COCO-Pose dataset is split into three subsets:
1. **Train2017**: Contains a portion of the 118K COCO images, annotated for training pose estimation models.
2. **Val2017**: Selected images for validation purposes during model training.
3. **Test2017**: Images used for testing and benchmarking trained models. Ground truth annotations for this subset are not publicly available; results are submitted to the [COCO evaluation server](https://codalab.lisn.upsaclay.fr/competitions/7384) for performance evaluation.
These subsets help organize the training, validation, and testing phases effectively. For configuration details, explore the `coco-pose.yaml` file available on [GitHub](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco-pose.yaml).
### What are the key features and applications of the COCO-Pose dataset?
The COCO-Pose dataset extends the COCO Keypoints 2017 annotations to include 17 keypoints for human figures, enabling detailed pose estimation. Standardized evaluation metrics (e.g., OKS) facilitate comparisons across different models. Applications of the COCO-Pose dataset span various domains, such as sports analytics, healthcare, and human-computer interaction, wherever detailed pose estimation of human figures is required. For practical use, leveraging pretrained models like those provided in the documentation (e.g., YOLO11n-pose) can significantly streamline the process ([Key Features](#key-features)).
If you use the COCO-Pose dataset in your research or development work, please cite the paper with the following [BibTeX entry](#citations-and-acknowledgments).
description: Explore the compact, versatile COCO8-Pose dataset for testing and debugging object detection models. Ideal for quick experiments with YOLO11.
keywords: COCO8-Pose, Ultralytics, pose detection dataset, object detection, YOLO11, machine learning, computer vision, training data
---
# COCO8-Pose Dataset
## Introduction
[Ultralytics](https://www.ultralytics.com/) COCO8-Pose is a small, but versatile pose detection dataset composed of the first 8 images of the COCO train 2017 set, 4 for training and 4 for validation. This dataset is ideal for testing and debugging [object detection](https://www.ultralytics.com/glossary/object-detection) models, or for experimenting with new detection approaches. With 8 images, it is small enough to be easily manageable, yet diverse enough to test training pipelines for errors and act as a sanity check before training larger datasets.
This dataset is intended for use with Ultralytics [HUB](https://hub.ultralytics.com/) and [YOLO11](https://github.com/ultralytics/ultralytics).
## Dataset YAML
A YAML (Yet Another Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. In the case of the COCO8-Pose dataset, the `coco8-pose.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco8-pose.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco8-pose.yaml).
!!! example "ultralytics/cfg/datasets/coco8-pose.yaml"
```yaml
--8<-- "ultralytics/cfg/datasets/coco8-pose.yaml"
```
## Usage
To train a YOLO11n-pose model on the COCO8-Pose dataset for 100 [epochs](https://www.ultralytics.com/glossary/epoch) with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-pose.pt") # load a pretrained model (recommended for training)
## Sample Images and Annotations

- **Mosaiced Image**: This image demonstrates a training batch composed of mosaiced dataset images. Mosaicing is a technique used during training that combines multiple images into a single image to increase the variety of objects and scenes within each training batch. This helps improve the model's ability to generalize to different object sizes, aspect ratios, and contexts.
The example showcases the variety and complexity of the images in the COCO8-Pose dataset and the benefits of using mosaicing during the training process.
## Citations and Acknowledgments
If you use the COCO dataset in your research or development work, please cite the following paper:
!!! quote ""
=== "BibTeX"
```bibtex
@misc{lin2015microsoft,
title={Microsoft COCO: Common Objects in Context},
author={Tsung-Yi Lin and Michael Maire and Serge Belongie and Lubomir Bourdev and Ross Girshick and James Hays and Pietro Perona and Deva Ramanan and C. Lawrence Zitnick and Piotr Dollár},
year={2015},
eprint={1405.0312},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
We would like to acknowledge the COCO Consortium for creating and maintaining this valuable resource for the [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) community. For more information about the COCO dataset and its creators, visit the [COCO dataset website](https://cocodataset.org/#home).
## FAQ
### What is the COCO8-Pose dataset, and how is it used with Ultralytics YOLO11?
The COCO8-Pose dataset is a small, versatile pose detection dataset that includes the first 8 images from the COCO train 2017 set, with 4 images for training and 4 for validation. It's designed for testing and debugging object detection models and experimenting with new detection approaches. This dataset is ideal for quick experiments with [Ultralytics YOLO11](https://docs.ultralytics.com/models/yolo11/). For more details on dataset configuration, check out the dataset YAML file [here](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco8-pose.yaml).
### How do I train a YOLO11 model using the COCO8-Pose dataset in Ultralytics?
To train a YOLO11n-pose model on the COCO8-Pose dataset for 100 epochs with an image size of 640, follow these examples:
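!!! example "Train Example"

=== "Python"

```python
from ultralytics import YOLO

# Load a pretrained model (recommended for training)
model = YOLO("yolo11n-pose.pt")

# Train on COCO8-Pose for 100 epochs at image size 640
results = model.train(data="coco8-pose.yaml", epochs=100, imgsz=640)
```

=== "CLI"

```bash
# Equivalent CLI command
yolo pose train data=coco8-pose.yaml model=yolo11n-pose.pt epochs=100 imgsz=640
```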
For a comprehensive list of training arguments, refer to the model [Training](../../modes/train.md) page.
### What are the benefits of using the COCO8-Pose dataset?
The COCO8-Pose dataset offers several benefits:
- **Compact Size**: With only 8 images, it is easy to manage and perfect for quick experiments.
- **Diverse Data**: Despite its small size, it includes a variety of scenes, useful for thorough pipeline testing.
- **Error Debugging**: Ideal for identifying training errors and performing sanity checks before scaling up to larger datasets.
For more about its features and usage, see the [Dataset Introduction](#introduction) section.
### How does mosaicing benefit the YOLO11 training process using the COCO8-Pose dataset?
Mosaicing, demonstrated in the sample images of the COCO8-Pose dataset, combines multiple images into one, increasing the variety of objects and scenes within each training batch. This technique helps improve the model's ability to generalize across various object sizes, aspect ratios, and contexts, ultimately enhancing model performance. See the [Sample Images and Annotations](#sample-images-and-annotations) section for example images.
### Where can I find the COCO8-Pose dataset YAML file and how do I use it?
The COCO8-Pose dataset YAML file can be found [here](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco8-pose.yaml). This file defines the dataset configuration, including paths, classes, and other relevant information. Use this file with the YOLO11 training scripts as mentioned in the [Train Example](#how-do-i-train-a-yolo11-model-using-the-coco8-pose-dataset-in-ultralytics) section.
For more FAQs and detailed documentation, visit the [Ultralytics Documentation](https://docs.ultralytics.com/).
description: Explore the hand keypoints estimation dataset for advanced pose estimation. Learn about datasets, pretrained models, metrics, and applications for training with YOLO.
The hand-keypoints dataset contains 26,768 images of hands annotated with keypoints, making it suitable for training models like Ultralytics YOLO for pose estimation tasks. The annotations were generated using the Google MediaPipe library, ensuring high [accuracy](https://www.ultralytics.com/glossary/accuracy) and consistency, and the dataset is compatible with [Ultralytics YOLO11](https://github.com/ultralytics/ultralytics) formats.
The dataset includes keypoints for hand detection. The keypoints are annotated as follows:
1. Wrist
2. Thumb (4 points)
3. Index finger (4 points)
4. Middle finger (4 points)
5. Ring finger (4 points)
6. Little finger (4 points)
Each hand has a total of 21 keypoints.
## Key Features
- **Large Dataset**: 26,768 images with hand keypoint annotations.
- **YOLO11 Compatibility**: Ready for use with YOLO11 models.
- **21 Keypoints**: Detailed hand pose representation.
## Dataset Structure
The hand keypoint dataset is split into two subsets:
1. **Train**: This subset contains 18,776 images from the hand keypoints dataset, annotated for training pose estimation models.
2. **Val**: This subset contains 7,992 images that can be used for validation purposes during model training.
## Applications
Hand keypoints can be used for gesture recognition, AR/VR controls, robotic manipulation, and hand movement analysis in healthcare. They can also be applied in animation for motion capture and in biometric authentication systems for security.
## Dataset YAML
A YAML (Yet Another Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. In the case of the Hand Keypoints dataset, the `hand-keypoints.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/hand-keypoints.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/hand-keypoints.yaml).
!!! example "ultralytics/cfg/datasets/hand-keypoints.yaml"
## Usage

To train a YOLO11n-pose model on the Hand Keypoints dataset for 100 [epochs](https://www.ultralytics.com/glossary/epoch) with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-pose.pt") # load a pretrained model (recommended for training)
## Sample Images and Annotations

The Hand Keypoints dataset contains a diverse set of images with human hands annotated with keypoints. Here are some examples of images from the dataset, along with their corresponding annotations:
- **Mosaiced Image**: This image demonstrates a training batch composed of mosaiced dataset images. Mosaicing is a technique used during training that combines multiple images into a single image to increase the variety of objects and scenes within each training batch. This helps improve the model's ability to generalize to different object sizes, aspect ratios, and contexts.
The example showcases the variety and complexity of the images in the Hand Keypoints dataset and the benefits of using mosaicing during the training process.
## Citations and Acknowledgments
If you use the hand-keypoints dataset in your research or development work, please acknowledge the following sources:
!!! quote ""
=== "Credits"
We would like to thank the following sources for providing the images used in this dataset:
The images were collected and used under the respective licenses provided by each platform and are distributed under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-nc-sa/4.0/).
We would also like to acknowledge the creator of this dataset, [Rion Dsilva](https://www.linkedin.com/in/rion-dsilva-043464229/), for his great contribution to Vision AI research.
## FAQ
### How do I train a YOLO11 model on the Hand Keypoints dataset?
To train a YOLO11 model on the Hand Keypoints dataset, you can use either Python or the command line interface (CLI). Here's an example for training a YOLO11n-pose model for 100 epochs with an image size of 640:
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-pose.pt") # load a pretrained model (recommended for training)
For more information, refer to the [Applications](#applications) section.
### How is the Hand Keypoints dataset structured?
The Hand Keypoints dataset is divided into two subsets:
1. **Train**: Contains 18,776 images for training pose estimation models.
2. **Val**: Contains 7,992 images for validation purposes during model training.
This structure ensures a comprehensive training and validation process. For more details, see the [Dataset Structure](#dataset-structure) section.
### How do I use the dataset YAML file for training?
The dataset configuration is defined in a YAML file, which includes paths, classes, and other relevant information. The `hand-keypoints.yaml` file can be found at [hand-keypoints.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/hand-keypoints.yaml).
To use this YAML file for training, specify it in your training script or CLI command as shown in the training example above. For more details, refer to the [Dataset YAML](#dataset-yaml) section.
description: Learn about Ultralytics YOLO format for pose estimation datasets, supported formats, COCO-Pose, COCO8-Pose, Tiger-Pose, and how to add your own dataset.
In this format, `<class-index>` is the index of the class for the object, `<x> <y> <width> <height>` are the coordinates of the [bounding box](https://www.ultralytics.com/glossary/bounding-box), and `<px1> <py1> <px2> <py2> ... <pxn> <pyn>` are the pixel coordinates of the keypoints. The coordinates are separated by spaces.
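As an illustrative sketch (made-up, normalized values; a real row lists every keypoint defined by `kpt_shape`), a label row for one object with three keypoints in x,y,visible form could look like:

```bash
0 0.573 0.462 0.281 0.655 0.520 0.310 2 0.611 0.305 2 0.558 0.298 1
```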
### Dataset YAML format
The Ultralytics framework uses a YAML file format to define the dataset and model configuration for training detection models. Here is an example of the YAML format used for defining a detection dataset:
```yaml
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/coco8-pose # dataset root dir
train: images/train # train images (relative to 'path') 4 images
val: images/val # val images (relative to 'path') 4 images
test: # test images (optional)

# Keypoints
kpt_shape: [17, 3] # number of keypoints, number of dims (2 for x,y or 3 for x,y,visible)
flip_idx: [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]

# Classes
names:
  0: person
```
The `train` and `val` fields specify the paths to the directories containing the training and validation images, respectively.
`names` is a dictionary of class names. The order of the names should match the order of the object class indices in the YOLO dataset files.
(Optional) If the keypoints are symmetric, such as the left-right sides of a human body or face, you need to specify `flip_idx`. For example, assume five facial-landmark keypoints [left eye, right eye, nose, left mouth corner, right mouth corner] with original indices [0, 1, 2, 3, 4]; then `flip_idx` is [1, 0, 2, 4, 3]: swap the left-right pairs (0-1 and 3-4) and leave symmetric points like the nose unchanged.
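As a small sketch of that five-point facial-landmark example (assuming the x,y,visible keypoint dimension), the corresponding YAML entries would be:

```yaml
kpt_shape: [5, 3] # 5 keypoints, each with x, y, visible
flip_idx: [1, 0, 2, 4, 3] # swap left-right pairs (0-1 and 3-4); the nose (index 2) stays put
```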
## Usage
!!! example
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-pose.pt") # load a pretrained model (recommended for training)
## Supported Datasets

This section outlines the datasets that are compatible with the Ultralytics YOLO format and can be used for training pose estimation models:
### COCO-Pose
- **Description**: COCO-Pose is a large-scale [object detection](https://www.ultralytics.com/glossary/object-detection), segmentation, and pose estimation dataset. It is a subset of the popular COCO dataset and focuses on human pose estimation. COCO-Pose includes multiple keypoints for each human instance.
- **Label Format**: Same as Ultralytics YOLO format as described above, with keypoints for human poses.
- **Number of Classes**: 1 (Human).
- **Keypoints**: 17 keypoints including nose, eyes, ears, shoulders, elbows, wrists, hips, knees, and ankles.
- **Usage**: Suitable for training human pose estimation models.
- **Additional Notes**: The dataset is rich and diverse, containing over 200K labeled images.
- [Read more about COCO-Pose](coco.md)
### COCO8-Pose
- **Description**: [Ultralytics](https://www.ultralytics.com/) COCO8-Pose is a small, but versatile pose detection dataset composed of the first 8 images of the COCO train 2017 set, 4 for training and 4 for validation.
- **Label Format**: Same as Ultralytics YOLO format as described above, with keypoints for human poses.
- **Number of Classes**: 1 (Human).
- **Keypoints**: 17 keypoints including nose, eyes, ears, shoulders, elbows, wrists, hips, knees, and ankles.
- **Usage**: Suitable for testing and debugging object detection models, or for experimenting with new detection approaches.
- **Additional Notes**: COCO8-Pose is ideal for sanity checks and CI checks.
- [Read more about COCO8-Pose](coco8-pose.md)
### Tiger-Pose
- **Description**: This [Ultralytics](https://www.ultralytics.com/) animal pose dataset comprises 263 images sourced from a [YouTube Video](https://www.youtube.com/watch?v=MIBAT6BGE6U&pp=ygUbVGlnZXIgd2Fsa2luZyByZWZlcmVuY2UubXA0), with 210 images allocated for training and 53 for validation.
- **Label Format**: Same as Ultralytics YOLO format as described above, with 12 keypoints for animal pose and no visible dimension.
- **Number of Classes**: 1 (Tiger).
- **Keypoints**: 12 keypoints.
- **Usage**: Great for animal pose or any other pose that is not human-based.
- [Read more about Tiger-Pose](tiger-pose.md)
### Hand Keypoints
- **Description**: The hand keypoints pose dataset comprises nearly 26K images, with 18,776 images allocated for training and 7,992 for validation.
- **Label Format**: Same as Ultralytics YOLO format as described above, but with 21 keypoints for the human hand and the visible dimension.
- **Number of Classes**: 1 (Hand).
- **Keypoints**: 21 keypoints.
- **Usage**: Great for human hand pose estimation.
- [Read more about Hand Keypoints](hand-keypoints.md)
### Adding your own dataset
If you have your own dataset and would like to use it for training pose estimation models with Ultralytics YOLO format, ensure that it follows the format specified above under "Ultralytics YOLO format". Convert your annotations to the required format and specify the paths, number of classes, and class names in the YAML configuration file.
### Conversion Tool
Ultralytics provides a convenient conversion tool to convert labels from the popular COCO dataset format to YOLO format:
!!! example
=== "Python"
```python
from ultralytics.data.converter import convert_coco

# Convert COCO-format labels to YOLO format; use_keypoints=True includes pose keypoints
convert_coco(labels_dir="path/to/coco/annotations/", use_keypoints=True)
```
This conversion tool can be used to convert the COCO dataset or any dataset in the COCO format to the Ultralytics YOLO format. The `use_keypoints` parameter specifies whether to include keypoints (for pose estimation) in the converted labels.
## FAQ
### What is the Ultralytics YOLO format for pose estimation?
The Ultralytics YOLO format for pose estimation datasets involves labeling each image with a corresponding text file. Each row of the text file stores information about an object instance:
- Object class index
- Object center coordinates (normalized x and y)
- Object width and height (normalized)
- Object keypoint coordinates (normalized pxn and pyn)
With a keypoint dimension of 2, each keypoint stores normalized x and y coordinates; with a dimension of 3, each keypoint also carries a visibility flag. For more details, see [Ultralytics YOLO format](#ultralytics-yolo-format).
### How do I use the COCO-Pose dataset with Ultralytics YOLO?
To use the COCO-Pose dataset with Ultralytics YOLO:
1. Download the dataset and prepare your label files in the YOLO format.
2. Create a YAML configuration file specifying paths to training and validation images, keypoint shape, and class names.
3. Use the configuration file for training:
```python
from ultralytics import YOLO
model = YOLO("yolo11n-pose.pt") # load pretrained model
For complete steps, check the [Adding your own dataset](#adding-your-own-dataset) section.
### What is the purpose of the dataset YAML file in Ultralytics YOLO?
The dataset YAML file in Ultralytics YOLO defines the dataset and model configuration for training. It specifies paths to training, validation, and test images, keypoint shapes, class names, and other configuration options. This structured format helps streamline dataset management and model training. Here is an example YAML format:
```yaml
path: ../datasets/coco8-pose
train: images/train
val: images/val
names:
  0: person
```
Read more about creating YAML configuration files in [Dataset YAML format](#dataset-yaml-format).
### How can I convert COCO dataset labels to Ultralytics YOLO format for pose estimation?
Ultralytics provides a conversion tool to convert COCO dataset labels to the YOLO format, including keypoint information:
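```python
from ultralytics.data.converter import convert_coco

# Convert COCO-format labels to YOLO format; use_keypoints=True includes pose keypoints
convert_coco(labels_dir="path/to/coco/annotations/", use_keypoints=True)
```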
[Ultralytics](https://www.ultralytics.com/) introduces the Tiger-Pose dataset, a versatile collection designed for pose estimation tasks. This dataset comprises 263 images sourced from a [YouTube Video](https://www.youtube.com/watch?v=MIBAT6BGE6U&pp=ygUbVGlnZXIgd2Fsa2luZyByZWZlcmVuY2UubXA0), with 210 images allocated for training and 53 for validation. It serves as an excellent resource for testing and troubleshooting pose estimation algorithms.
Despite its manageable size of 210 training images, the Tiger-Pose dataset offers diversity, making it suitable for assessing training pipelines, identifying potential errors, and serving as a valuable preliminary step before working with larger datasets for pose estimation.
This dataset is intended for use with [Ultralytics HUB](https://hub.ultralytics.com/) and [YOLO11](https://github.com/ultralytics/ultralytics).
<strong>Watch:</strong> Train YOLO11 Pose Model on Tiger-Pose Dataset Using Ultralytics HUB
## Dataset YAML
A YAML (Yet Another Markup Language) file serves as the means to specify the configuration details of a dataset. It encompasses crucial data such as file paths, class definitions, and other pertinent information. Specifically, for the `tiger-pose.yaml` file, you can check [Ultralytics Tiger-Pose Dataset Configuration File](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/tiger-pose.yaml).
!!! example "ultralytics/cfg/datasets/tiger-pose.yaml"
```yaml
--8<-- "ultralytics/cfg/datasets/tiger-pose.yaml"
```
## Usage
To train a YOLO11n-pose model on the Tiger-Pose dataset for 100 [epochs](https://www.ultralytics.com/glossary/epoch) with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-pose.pt") # load a pretrained model (recommended for training)
## Sample Images and Annotations

- **Mosaiced Image**: This image demonstrates a training batch composed of mosaiced dataset images. Mosaicing is a technique used during training that combines multiple images into a single image to increase the variety of objects and scenes within each training batch. This helps improve the model's ability to generalize to different object sizes, aspect ratios, and contexts.
The example showcases the variety and complexity of the images in the Tiger-Pose dataset and the benefits of using mosaicing during the training process.
## Inference Example
!!! example "Inference Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("path/to/best.pt") # load a tiger-pose trained model
The dataset has been released under the [AGPL-3.0 License](https://github.com/ultralytics/ultralytics/blob/main/LICENSE).
## FAQ
### What is the Ultralytics Tiger-Pose dataset used for?
The Ultralytics Tiger-Pose dataset is designed for pose estimation tasks, consisting of 263 images sourced from a [YouTube video](https://www.youtube.com/watch?v=MIBAT6BGE6U&pp=ygUbVGlnZXIgd2Fsa2luZyByZWZlcmVuY2UubXA0). The dataset is divided into 210 training images and 53 validation images. It is particularly useful for testing, training, and refining pose estimation algorithms using [Ultralytics HUB](https://hub.ultralytics.com/) and [YOLO11](https://github.com/ultralytics/ultralytics).
### How do I train a YOLO11 model on the Tiger-Pose dataset?
To train a YOLO11n-pose model on the Tiger-Pose dataset for 100 epochs with an image size of 640, use the following code snippets. For more details, visit the [Training](../../modes/train.md) page:
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-pose.pt") # load a pretrained model (recommended for training)
### What configurations does the `tiger-pose.yaml` file include?
The `tiger-pose.yaml` file is used to specify the configuration details of the Tiger-Pose dataset. It includes crucial data such as file paths and class definitions. To see the exact configuration, you can check out the [Ultralytics Tiger-Pose Dataset Configuration File](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/tiger-pose.yaml).
### How can I run inference using a YOLO11 model trained on the Tiger-Pose dataset?
To perform inference using a YOLO11 model trained on the Tiger-Pose dataset, you can use the following code snippets. For a detailed guide, visit the [Prediction](../../modes/predict.md) page:
!!! example "Inference Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("path/to/best.pt") # load a tiger-pose trained model
### What are the benefits of using the Tiger-Pose dataset for pose estimation?
The Tiger-Pose dataset, despite its manageable size of 210 images for training, provides a diverse collection of images that are ideal for testing pose estimation pipelines. The dataset helps identify potential errors and acts as a preliminary step before working with larger datasets. Additionally, the dataset supports the training and refinement of pose estimation algorithms using advanced tools like [Ultralytics HUB](https://hub.ultralytics.com/) and [YOLO11](https://github.com/ultralytics/ultralytics), enhancing model performance and [accuracy](https://www.ultralytics.com/glossary/accuracy).
description: Explore the Roboflow Carparts Segmentation Dataset for automotive AI applications. Enhance your segmentation models with rich, annotated data.
The [Roboflow](https://roboflow.com/?ref=ultralytics) [Carparts Segmentation Dataset](https://universe.roboflow.com/gianmarco-russo-vt9xr/car-seg-un1pm?ref=ultralytics) is a curated collection of images and videos designed for [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) applications, specifically focusing on segmentation tasks related to car parts. This dataset provides a diverse set of visuals captured from multiple perspectives, offering valuable annotated examples for training and testing segmentation models.
Whether you're working on automotive research, developing AI solutions for vehicle maintenance, or exploring computer vision applications, the Carparts Segmentation Dataset serves as a valuable resource for enhancing accuracy and efficiency in your projects.
<strong>Watch:</strong> Carparts <a href="https://www.ultralytics.com/glossary/instance-segmentation">Instance Segmentation</a> Using Ultralytics HUB
## Dataset Structure
The data distribution within the Carparts Segmentation Dataset is organized as outlined below:
- **Training set**: Includes 3,156 images, each accompanied by its corresponding annotations.
- **Testing set**: Comprises 276 images, with each one paired with its respective annotations.
- **Validation set**: Consists of 401 images, each having corresponding annotations.
## Applications
Carparts Segmentation finds applications in automotive quality control, auto repair, e-commerce cataloging, traffic monitoring, autonomous vehicles, insurance processing, recycling, and smart city initiatives. It streamlines processes by accurately identifying and categorizing different vehicle components, contributing to efficiency and automation in various industries.
## Dataset YAML
A YAML (Yet Another Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. In the case of the Carparts Segmentation dataset, the `carparts-seg.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/carparts-seg.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/carparts-seg.yaml).
!!! example "ultralytics/cfg/datasets/carparts-seg.yaml"
## Usage

To train an Ultralytics YOLO11n-seg model on the Carparts Segmentation dataset for 100 [epochs](https://www.ultralytics.com/glossary/epoch) with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-seg.pt") # load a pretrained model (recommended for training)
## Sample Images and Annotations

The Carparts Segmentation dataset includes a diverse array of images and videos taken from various perspectives. Below, you'll find examples of data from the dataset along with their corresponding annotations:
- This image illustrates object segmentation within a sample, featuring annotated bounding boxes with masks surrounding identified objects. The dataset consists of a varied set of images captured in various locations, environments, and densities, serving as a comprehensive resource for crafting models specific to this task.
- This instance highlights the diversity and complexity inherent in the dataset, emphasizing the crucial role of high-quality data in computer vision tasks, particularly in the realm of car parts segmentation.
## Citations and Acknowledgments
If you integrate the Carparts Segmentation dataset into your research or development projects, please make reference to the following paper:
We extend our thanks to the Roboflow team for their dedication in developing and managing the Carparts Segmentation dataset, a valuable resource for vehicle maintenance and research projects. For additional details about the Carparts Segmentation dataset and its creators, please visit the [CarParts Segmentation Dataset Page](https://universe.roboflow.com/gianmarco-russo-vt9xr/car-seg-un1pm?ref=ultralytics).
## FAQ
### What is the Roboflow Carparts Segmentation Dataset?
The [Roboflow Carparts Segmentation Dataset](https://universe.roboflow.com/gianmarco-russo-vt9xr/car-seg-un1pm?ref=ultralytics) is a curated collection of images and videos specifically designed for car part segmentation tasks in computer vision. This dataset includes a diverse range of visuals captured from multiple perspectives, making it an invaluable resource for training and testing segmentation models for automotive applications.
### How can I use the Carparts Segmentation Dataset with Ultralytics YOLO11?
To train a YOLO11 model on the Carparts Segmentation dataset, you can follow these steps:
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-seg.pt") # load a pretrained model (recommended for training)
For more details, refer to the [Training](../../modes/train.md) documentation.
### What are some applications of Carparts Segmentation?
Carparts Segmentation can be widely applied in various fields such as:
- **Automotive quality control**
- **Auto repair and maintenance**
- **E-commerce cataloging**
- **Traffic monitoring**
- **Autonomous vehicles**
- **Insurance claim processing**
- **Recycling initiatives**
- **Smart city projects**
This segmentation helps in accurately identifying and categorizing different vehicle components, enhancing the efficiency and automation in these industries.
### Where can I find the dataset configuration file for Carparts Segmentation?
The dataset configuration file for the Carparts Segmentation dataset, `carparts-seg.yaml`, can be found at the following location: [carparts-seg.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/carparts-seg.yaml).
### Why should I use the Carparts Segmentation Dataset?
The Carparts Segmentation Dataset provides rich, annotated data essential for developing high-[accuracy](https://www.ultralytics.com/glossary/accuracy) segmentation models in automotive computer vision. This dataset's diversity and detailed annotations improve model training, making it ideal for applications like vehicle maintenance automation, enhancing vehicle safety systems, and supporting autonomous driving technologies. Starting from a robust dataset accelerates AI development and ensures better model performance.
For more details, visit the [CarParts Segmentation Dataset Page](https://universe.roboflow.com/gianmarco-russo-vt9xr/car-seg-un1pm?ref=ultralytics).
The [COCO-Seg](https://cocodataset.org/#home) dataset, an extension of the COCO (Common Objects in Context) dataset, is specially designed to aid research in object [instance segmentation](https://www.ultralytics.com/glossary/instance-segmentation). It uses the same images as COCO but introduces more detailed segmentation annotations. This dataset is a crucial resource for researchers and developers working on instance segmentation tasks, especially for training YOLO models.
## COCO-Seg Pretrained Models
{% include "macros/yolo-seg-perf.md" %}
## Key Features
- COCO-Seg retains the original 330K images from COCO.
- The dataset consists of the same 80 object categories found in the original COCO dataset.
- Annotations now include more detailed instance segmentation masks for each object in the images.
- COCO-Seg provides standardized evaluation metrics like [mean Average Precision](https://www.ultralytics.com/glossary/mean-average-precision-map) (mAP) for object detection and mean Average [Recall](https://www.ultralytics.com/glossary/recall) (mAR) for instance segmentation tasks, enabling effective comparison of model performance.
## Dataset Structure
The COCO-Seg dataset is partitioned into three subsets:
1. **Train2017**: This subset contains 118K images for training instance segmentation models.
2. **Val2017**: This subset includes 5K images used for validation purposes during model training.
3. **Test2017**: This subset encompasses 20K images used for testing and benchmarking the trained models. Ground truth annotations for this subset are not publicly available, and the results are submitted to the [COCO evaluation server](https://codalab.lisn.upsaclay.fr/competitions/7383) for performance evaluation.
## Applications
COCO-Seg is widely used for training and evaluating [deep learning](https://www.ultralytics.com/glossary/deep-learning-dl) models in instance segmentation, such as the YOLO models. The large number of annotated images, the diversity of object categories, and the standardized evaluation metrics make it an indispensable resource for computer vision researchers and practitioners.
## Dataset YAML
A YAML (Yet Another Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. In the case of the COCO-Seg dataset, the `coco.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco.yaml).
!!! example "ultralytics/cfg/datasets/coco.yaml"
```yaml
--8<-- "ultralytics/cfg/datasets/coco.yaml"
```
## Usage
To train a YOLO11n-seg model on the COCO-Seg dataset for 100 [epochs](https://www.ultralytics.com/glossary/epoch) with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-seg.pt") # load a pretrained model (recommended for training)
## Sample Images and Annotations

COCO-Seg, like its predecessor COCO, contains a diverse set of images with various object categories and complex scenes. However, COCO-Seg introduces more detailed instance segmentation masks for each object in the images. Here are some examples of images from the dataset, along with their corresponding instance segmentation masks:
- **Mosaiced Image**: This image demonstrates a training batch composed of mosaiced dataset images. Mosaicing is a technique used during training that combines multiple images into a single image to increase the variety of objects and scenes within each training batch. This aids the model's ability to generalize to different object sizes, aspect ratios, and contexts.
The example showcases the variety and complexity of the images in the COCO-Seg dataset and the benefits of using mosaicing during the training process.
## Citations and Acknowledgments
If you use the COCO-Seg dataset in your research or development work, please cite the original COCO paper and acknowledge the extension to COCO-Seg:
!!! quote ""
=== "BibTeX"
```bibtex
@misc{lin2015microsoft,
title={Microsoft COCO: Common Objects in Context},
author={Tsung-Yi Lin and Michael Maire and Serge Belongie and Lubomir Bourdev and Ross Girshick and James Hays and Pietro Perona and Deva Ramanan and C. Lawrence Zitnick and Piotr Dollár},
year={2015},
eprint={1405.0312},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
We extend our thanks to the COCO Consortium for creating and maintaining this invaluable resource for the [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) community. For more information about the COCO dataset and its creators, visit the [COCO dataset website](https://cocodataset.org/#home).
## FAQ
### What is the COCO-Seg dataset and how does it differ from the original COCO dataset?
The [COCO-Seg](https://cocodataset.org/#home) dataset is an extension of the original COCO (Common Objects in Context) dataset, specifically designed for instance segmentation tasks. While it uses the same images as the COCO dataset, COCO-Seg includes more detailed segmentation annotations, making it a powerful resource for researchers and developers focusing on object instance segmentation.
### How can I train a YOLO11 model using the COCO-Seg dataset?
To train a YOLO11n-seg model on the COCO-Seg dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a detailed list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO

# Load a model
model = YOLO("yolo11n-seg.pt")  # load a pretrained model (recommended for training)

# Train the model on COCO-Seg for 100 epochs at image size 640
results = model.train(data="coco.yaml", epochs=100, imgsz=640)
```
### What are the key features of the COCO-Seg dataset?
The COCO-Seg dataset includes several key features:
- Retains the original 330K images from the COCO dataset.
- Annotates the same 80 object categories found in the original COCO.
- Provides more detailed instance segmentation masks for each object.
- Uses standardized evaluation metrics such as mean Average [Precision](https://www.ultralytics.com/glossary/precision) (mAP) for [object detection](https://www.ultralytics.com/glossary/object-detection) and mean Average Recall (mAR) for instance segmentation tasks.
### What pretrained models are available for COCO-Seg, and what are their performance metrics?
The COCO-Seg dataset supports multiple pretrained YOLO11 segmentation models with varying performance metrics. Here's a summary of the available models and their key metrics:
{% include "macros/yolo-seg-perf.md" %}
### How is the COCO-Seg dataset structured and what subsets does it contain?
The COCO-Seg dataset is partitioned into three subsets for specific training and evaluation needs:
1. **Train2017**: Contains 118K images used primarily for training instance segmentation models.
2. **Val2017**: Comprises 5K images utilized for validation during the training process.
3. **Test2017**: Encompasses 20K images reserved for testing and benchmarking trained models. Note that ground truth annotations for this subset are not publicly available, and performance results are submitted to the [COCO evaluation server](https://codalab.lisn.upsaclay.fr/competitions/7383) for assessment.
description:Discover the versatile and manageable COCO8-Seg dataset by Ultralytics, ideal for testing and debugging segmentation models or new detection approaches.
[Ultralytics](https://www.ultralytics.com/) COCO8-Seg is a small, but versatile [instance segmentation](https://www.ultralytics.com/glossary/instance-segmentation) dataset composed of the first 8 images of the COCO train 2017 set, 4 for training and 4 for validation. This dataset is ideal for testing and debugging segmentation models, or for experimenting with new detection approaches. With 8 images, it is small enough to be easily manageable, yet diverse enough to test training pipelines for errors and act as a sanity check before training larger datasets.
This dataset is intended for use with Ultralytics [HUB](https://hub.ultralytics.com/) and [YOLO11](https://github.com/ultralytics/ultralytics).
## Dataset YAML
A YAML (Yet Another Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. In the case of the COCO8-Seg dataset, the `coco8-seg.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco8-seg.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco8-seg.yaml).
!!! example "ultralytics/cfg/datasets/coco8-seg.yaml"
```yaml
--8<-- "ultralytics/cfg/datasets/coco8-seg.yaml"
```
## Usage
To train a YOLO11n-seg model on the COCO8-Seg dataset for 100 [epochs](https://www.ultralytics.com/glossary/epoch) with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO

# Load a model
model = YOLO("yolo11n-seg.pt")  # load a pretrained model (recommended for training)

# Train the model on COCO8-Seg for 100 epochs at image size 640
results = model.train(data="coco8-seg.yaml", epochs=100, imgsz=640)
```
- **Mosaiced Image**: This image demonstrates a training batch composed of mosaiced dataset images. Mosaicing is a technique used during training that combines multiple images into a single image to increase the variety of objects and scenes within each training batch. This helps improve the model's ability to generalize to different object sizes, aspect ratios, and contexts.
The example showcases the variety and complexity of the images in the COCO8-Seg dataset and the benefits of using mosaicing during the training process.
## Citations and Acknowledgments
If you use the COCO dataset in your research or development work, please cite the following paper:
!!! quote ""
=== "BibTeX"
```bibtex
@misc{lin2015microsoft,
title={Microsoft COCO: Common Objects in Context},
author={Tsung-Yi Lin and Michael Maire and Serge Belongie and Lubomir Bourdev and Ross Girshick and James Hays and Pietro Perona and Deva Ramanan and C. Lawrence Zitnick and Piotr Dollár},
year={2015},
eprint={1405.0312},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
We would like to acknowledge the COCO Consortium for creating and maintaining this valuable resource for the [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) community. For more information about the COCO dataset and its creators, visit the [COCO dataset website](https://cocodataset.org/#home).
## FAQ
### What is the COCO8-Seg dataset, and how is it used in Ultralytics YOLO11?
The **COCO8-Seg dataset** is a compact instance segmentation dataset by Ultralytics, consisting of the first 8 images from the COCO train 2017 set—4 images for training and 4 for validation. This dataset is tailored for testing and debugging segmentation models or experimenting with new detection methods. It is particularly useful with Ultralytics [YOLO11](https://github.com/ultralytics/ultralytics) and [HUB](https://hub.ultralytics.com/) for rapid iteration and pipeline error-checking before scaling to larger datasets. For detailed usage, refer to the model [Training](../../modes/train.md) page.
### How can I train a YOLO11n-seg model using the COCO8-Seg dataset?
To train a **YOLO11n-seg** model on the COCO8-Seg dataset for 100 epochs with an image size of 640, you can use Python or CLI commands. Here's a quick example:
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO

# Load a model
model = YOLO("yolo11n-seg.pt")  # load a pretrained model (recommended for training)

# Train the model on COCO8-Seg for 100 epochs at image size 640
results = model.train(data="coco8-seg.yaml", epochs=100, imgsz=640)
```
For a thorough explanation of available arguments and configuration options, you can check the [Training](../../modes/train.md) documentation.
### Why is the COCO8-Seg dataset important for model development and debugging?
The **COCO8-Seg dataset** is ideal for its manageability and diversity within a small size. It consists of only 8 images, providing a quick way to test and debug segmentation models or new detection approaches without the overhead of larger datasets. This makes it an efficient tool for sanity checks and pipeline error identification before committing to extensive training on large datasets. Learn more about dataset formats [here](https://docs.ultralytics.com/datasets/segment/).
### Where can I find the YAML configuration file for the COCO8-Seg dataset?
The YAML configuration file for the **COCO8-Seg dataset** is available in the Ultralytics repository. You can access the file directly [here](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco8-seg.yaml). The YAML file includes essential information about dataset paths, classes, and configuration settings required for model training and validation.
### What are some benefits of using mosaicing during training with the COCO8-Seg dataset?
Using **mosaicing** during training helps increase the diversity and variety of objects and scenes in each training batch. This technique combines multiple images into a single composite image, enhancing the model's ability to generalize to different object sizes, aspect ratios, and contexts within the scene. Mosaicing is beneficial for improving a model's robustness and [accuracy](https://www.ultralytics.com/glossary/accuracy), especially when working with small datasets like COCO8-Seg. For an example of mosaiced images, see the [Sample Images and Annotations](#sample-images-and-annotations) section.
description:Explore the extensive Roboflow Crack Segmentation Dataset, perfect for transportation and public safety studies or self-driving car model development.
The [Roboflow](https://roboflow.com/?ref=ultralytics) [Crack Segmentation Dataset](https://universe.roboflow.com/university-bswxt/crack-bphdr?ref=ultralytics) stands out as an extensive resource designed specifically for individuals involved in transportation and public safety studies. It is equally beneficial for those working on the development of self-driving car models or simply exploring [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) applications for recreational purposes.
Comprising a total of 4029 static images captured from diverse road and wall scenarios, this dataset emerges as a valuable asset for tasks related to crack segmentation. Whether you are delving into the intricacies of transportation research or seeking to enhance the [accuracy](https://www.ultralytics.com/glossary/accuracy) of your self-driving car models, this dataset provides a rich and varied collection of images to support your endeavors.
## Dataset Structure
The division of data within the Crack Segmentation Dataset is outlined as follows:
- **Training set**: Consists of 3717 images with corresponding annotations.
- **Testing set**: Comprises 112 images along with their respective annotations.
- **Validation set**: Includes 200 images with their corresponding annotations.
## Applications
Crack segmentation finds practical applications in infrastructure maintenance, aiding in the identification and assessment of structural damage. It also plays a crucial role in enhancing road safety by enabling automated systems to detect and address pavement cracks for timely repairs.
## Dataset YAML
A YAML (Yet Another Markup Language) file is employed to outline the configuration of the dataset, encompassing details about paths, classes, and other pertinent information. Specifically, for the Crack Segmentation dataset, the `crack-seg.yaml` file is managed and accessible at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/crack-seg.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/crack-seg.yaml).
!!! example "ultralytics/cfg/datasets/crack-seg.yaml"
```yaml
--8<-- "ultralytics/cfg/datasets/crack-seg.yaml"
```
## Usage
To train the Ultralytics YOLO11n model on the Crack Segmentation dataset for 100 [epochs](https://www.ultralytics.com/glossary/epoch) with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO

# Load a model
model = YOLO("yolo11n-seg.pt")  # load a pretrained model (recommended for training)

# Train the model on the Crack Segmentation dataset for 100 epochs at image size 640
results = model.train(data="crack-seg.yaml", epochs=100, imgsz=640)
```
The Crack Segmentation dataset comprises a varied collection of images captured from multiple perspectives. Below are instances of data from the dataset, accompanied by their respective annotations:
- This image presents an example of object segmentation, featuring annotated bounding boxes with masks outlining identified objects. The dataset includes a diverse array of images taken in different locations, environments, and densities, making it a comprehensive resource for developing models designed for this particular task.
- The example underscores the diversity and complexity found in the Crack Segmentation dataset, emphasizing the crucial role of high-quality data in computer vision tasks.
## Citations and Acknowledgments
If you incorporate the Crack Segmentation dataset into your research or development endeavors, please credit its creators.
We would like to acknowledge the Roboflow team for creating and maintaining the Crack Segmentation dataset as a valuable resource for road safety and research projects. For more information about the Crack Segmentation dataset and its creators, visit the [Crack Segmentation Dataset Page](https://universe.roboflow.com/university-bswxt/crack-bphdr?ref=ultralytics).
## FAQ
### What is the Roboflow Crack Segmentation Dataset?
The [Roboflow Crack Segmentation Dataset](https://universe.roboflow.com/university-bswxt/crack-bphdr?ref=ultralytics) is a comprehensive collection of 4029 static images designed specifically for transportation and public safety studies. It is ideal for tasks such as self-driving car model development and infrastructure maintenance. The dataset includes training, testing, and validation sets, aiding in accurate crack detection and segmentation.
### How do I train a model using the Crack Segmentation Dataset with Ultralytics YOLO11?
To train an Ultralytics YOLO11 model on the Crack Segmentation dataset, use the following code snippets. Detailed instructions and further parameters can be found on the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO

# Load a model
model = YOLO("yolo11n-seg.pt")  # load a pretrained model (recommended for training)

# Train the model on the Crack Segmentation dataset for 100 epochs at image size 640
results = model.train(data="crack-seg.yaml", epochs=100, imgsz=640)
```
### Why should I use the Crack Segmentation Dataset for my self-driving car project?
The Crack Segmentation Dataset is exceptionally suited for self-driving car projects due to its diverse collection of 4029 road and wall images, which provide a varied range of scenarios. This diversity enhances the accuracy and robustness of models trained for crack detection, crucial for maintaining road safety and ensuring timely infrastructure repairs.
### What unique features does Ultralytics YOLO offer for crack segmentation?
Ultralytics YOLO offers advanced real-time [object detection](https://www.ultralytics.com/glossary/object-detection), segmentation, and classification capabilities that make it ideal for crack segmentation tasks. Its ability to handle large datasets and complex scenarios ensures high accuracy and efficiency. For example, the model [Training](../../modes/train.md), [Predict](../../modes/predict.md), and [Export](../../modes/export.md) modes cover comprehensive functionalities from training to deployment.
### How do I cite the Roboflow Crack Segmentation Dataset in my research paper?
If you incorporate the Crack Segmentation Dataset into your research, please cite it using the reference provided on the [dataset page](https://universe.roboflow.com/university-bswxt/crack-bphdr?ref=ultralytics).
description:Explore the supported dataset formats for Ultralytics YOLO and learn how to prepare and use datasets for training object segmentation models.
keywords:Ultralytics, YOLO, instance segmentation, dataset formats, auto-annotation, COCO, segmentation models, training data
---
# Instance Segmentation Datasets Overview
## Supported Dataset Formats
### Ultralytics YOLO format
The dataset label format used for training YOLO segmentation models is as follows:
1. One text file per image: Each image in the dataset has a corresponding text file with the same name as the image file and the ".txt" extension.
2. One row per object: Each row in the text file corresponds to one object instance in the image.
3. Object information per row: Each row contains the following information about the object instance:
- Object class index: An integer representing the class of the object (e.g., 0 for person, 1 for car, etc.).
- Object bounding coordinates: The bounding coordinates around the mask area, normalized to be between 0 and 1.
The format for a single row in the segmentation dataset file is as follows:
```
<class-index> <x1> <y1> <x2> <y2> ... <xn> <yn>
```
In this format, `<class-index>` is the index of the class for the object, and `<x1> <y1> <x2> <y2> ... <xn> <yn>` are the bounding coordinates of the object's segmentation mask. The coordinates are separated by spaces.
Here is an example of the YOLO dataset format for a single image with two objects made up of a 3-point segment and a 5-point segment.
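The coordinate values below are illustrative:
```
0 0.681 0.485 0.670 0.487 0.676 0.487
1 0.504 0.000 0.501 0.004 0.498 0.004 0.493 0.010 0.492 0.010
```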
- The length of each row does **not** have to be equal.
- Each segmentation label must have a **minimum of 3 xy points**: `<class-index> <x1> <y1> <x2> <y2> <x3> <y3>`
### Dataset YAML format
The Ultralytics framework uses a YAML file format to define the dataset and model configuration for training Detection Models. Here is an example of the YAML format used for defining a detection dataset:
```yaml
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/coco8-seg # dataset root dir
train: images/train # train images (relative to 'path') 4 images
val: images/val # val images (relative to 'path') 4 images
test: # test images (optional)

# Classes (80 COCO classes)
names:
  0: person
  1: bicycle
  2: car
  # ...
  77: teddy bear
  78: hair drier
  79: toothbrush
```
The `train` and `val` fields specify the paths to the directories containing the training and validation images, respectively.
`names` is a dictionary of class names. The order of the names should match the order of the object class indices in the YOLO dataset files.
## Usage
!!! example
=== "Python"
```python
from ultralytics import YOLO

# Load a model
model = YOLO("yolo11n-seg.pt")  # load a pretrained model (recommended for training)

# Train the model for 100 epochs at image size 640
results = model.train(data="coco8-seg.yaml", epochs=100, imgsz=640)
```
- [COCO](coco.md): A comprehensive dataset for [object detection](https://www.ultralytics.com/glossary/object-detection), segmentation, and captioning, featuring over 200K labeled images across a wide range of categories.
- [COCO8-seg](coco8-seg.md): A compact, 8-image subset of COCO designed for quick testing of segmentation model training, ideal for CI checks and workflow validation in the `ultralytics` repository.
- [COCO128-seg](coco.md): A smaller dataset for [instance segmentation](https://www.ultralytics.com/glossary/instance-segmentation) tasks, containing a subset of 128 COCO images with segmentation annotations.
- [Carparts-seg](carparts-seg.md): A specialized dataset focused on the segmentation of car parts, ideal for automotive applications. It includes a variety of vehicles with detailed annotations of individual car components.
- [Crack-seg](crack-seg.md): A dataset tailored for the segmentation of cracks in various surfaces. Essential for infrastructure maintenance and quality control, it provides detailed imagery for training models to identify structural weaknesses.
- [Package-seg](package-seg.md): A dataset dedicated to the segmentation of different types of packaging materials and shapes. It's particularly useful for logistics and warehouse automation, aiding in the development of systems for package handling and sorting.
### Adding your own dataset
If you have your own dataset and would like to use it for training segmentation models with Ultralytics YOLO format, ensure that it follows the format specified above under "Ultralytics YOLO format". Convert your annotations to the required format and specify the paths, number of classes, and class names in the YAML configuration file.
## Port or Convert Label Formats
### COCO Dataset Format to YOLO Format
You can easily convert labels from the popular COCO dataset format to the YOLO format using the following code snippet:
!!! example
=== "Python"
```python
from ultralytics.data.converter import convert_coco

# Convert COCO segmentation annotations to YOLO format (path is a placeholder)
convert_coco(labels_dir="path/to/coco/annotations/", use_segments=True)
```
This conversion tool can be used to convert the COCO dataset or any dataset in the COCO format to the Ultralytics YOLO format.
Remember to double-check if the dataset you want to use is compatible with your model and follows the necessary format conventions. Properly formatted datasets are crucial for training successful object detection models.
## Auto-Annotation
Auto-annotation is an essential feature that allows you to generate a segmentation dataset using a pre-trained detection model. It enables you to quickly and accurately annotate a large number of images without the need for manual labeling, saving time and effort.
### Generate Segmentation Dataset Using a Detection Model
To auto-annotate your dataset using the Ultralytics framework, you can use the `auto_annotate` function as shown below:
!!! example
=== "Python"
```python
from ultralytics.data.annotator import auto_annotate

# Detect objects with a YOLO model, then generate segmentation masks with SAM (paths are placeholders)
auto_annotate(data="path/to/images", det_model="yolo11x.pt", sam_model="sam_b.pt")
```
| Argument | Type | Description | Default |
| --- | --- | --- | --- |
| `sam_model` | `str, optional` | Pre-trained SAM segmentation model. Defaults to `'sam_b.pt'`. | `'sam_b.pt'` |
| `device` | `str, optional` | Device to run the models on. Defaults to an empty string (CPU or GPU, if available). | `''` |
| `output_dir` | `str or None, optional` | Directory to save the annotated results. Defaults to a `'labels'` folder in the same directory as `'data'`. | `None` |
The `auto_annotate` function takes the path to your images, along with optional arguments for specifying the pre-trained detection and [SAM segmentation models](../../models/sam.md), the device to run the models on, and the output directory for saving the annotated results.
By leveraging the power of pre-trained models, auto-annotation can significantly reduce the time and effort required for creating high-quality segmentation datasets. This feature is particularly useful for researchers and developers working with large image collections, as it allows them to focus on model development and evaluation rather than manual annotation.
## FAQ
### What dataset formats does Ultralytics YOLO support for instance segmentation?
Ultralytics YOLO supports several dataset formats for instance segmentation, with the primary format being its own Ultralytics YOLO format. Each image in your dataset needs a corresponding text file with object information segmented into multiple rows (one row per object), listing the class index and normalized bounding coordinates. For more detailed instructions on the YOLO dataset format, visit the [Instance Segmentation Datasets Overview](#instance-segmentation-datasets-overview).
### How can I convert COCO dataset annotations to the YOLO format?
Converting COCO format annotations to YOLO format is straightforward using Ultralytics tools. You can use the `convert_coco` function from the `ultralytics.data.converter` module:
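A minimal sketch (the annotation directory path is a placeholder):
```python
from ultralytics.data.converter import convert_coco

# Convert COCO segmentation annotations to the YOLO format
convert_coco(labels_dir="path/to/coco/annotations/", use_segments=True)
```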
This script converts your COCO dataset annotations to the required YOLO format, making it suitable for training your YOLO models. For more details, refer to [Port or Convert Label Formats](#coco-dataset-format-to-yolo-format).
### How do I prepare a YAML file for training Ultralytics YOLO models?
To prepare a YAML file for training YOLO models with Ultralytics, you need to define the dataset paths and class names. Here's an example YAML configuration:
```yaml
path: ../datasets/coco8-seg # dataset root dir
train: images/train # train images (relative to 'path')
val: images/val # val images (relative to 'path')

names:
  0: person
  1: bicycle
  2: car
  # ...
```
Ensure you update the paths and class names according to your dataset. For more information, check the [Dataset YAML Format](#dataset-yaml-format) section.
### What is the auto-annotation feature in Ultralytics YOLO?
Auto-annotation in Ultralytics YOLO allows you to generate segmentation annotations for your dataset using a pre-trained detection model. This significantly reduces the need for manual labeling. You can use the `auto_annotate` function as follows:
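A minimal sketch (paths are placeholders):
```python
from ultralytics.data.annotator import auto_annotate

# Detect objects with a YOLO model, then generate segmentation masks with SAM
auto_annotate(data="path/to/images", det_model="yolo11x.pt", sam_model="sam_b.pt")
```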
This function automates the annotation process, making it faster and more efficient. For more details, explore the [Auto-Annotation](#auto-annotation) section.
description:Explore the Roboflow Package Segmentation Dataset. Optimize logistics and enhance vision models with curated images for package identification and sorting.
keywords:Roboflow, Package Segmentation Dataset, computer vision, package identification, logistics, warehouse automation, segmentation models, training data
---
# Roboflow Universe Package Segmentation Dataset
The [Roboflow](https://roboflow.com/?ref=ultralytics) [Package Segmentation Dataset](https://universe.roboflow.com/factorypackage/factory_package?ref=ultralytics) is a curated collection of images specifically tailored for tasks related to package segmentation in the field of [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv). This dataset is designed to assist researchers, developers, and enthusiasts working on projects related to package identification, sorting, and handling.
Containing a diverse set of images showcasing various packages in different contexts and environments, the dataset serves as a valuable resource for training and evaluating segmentation models. Whether you are engaged in logistics, warehouse automation, or any application requiring precise package analysis, the Package Segmentation Dataset provides a targeted and comprehensive set of images to enhance the performance of your computer vision algorithms.
## Dataset Structure
The distribution of data in the Package Segmentation Dataset is structured as follows:
- **Training set**: Encompasses 1920 images accompanied by their corresponding annotations.
- **Testing set**: Consists of 89 images, each paired with its respective annotations.
- **Validation set**: Comprises 188 images, each with corresponding annotations.
## Applications
Package segmentation, facilitated by the Package Segmentation Dataset, is crucial for optimizing logistics, enhancing last-mile delivery, improving manufacturing quality control, and contributing to smart city solutions. From e-commerce to security applications, this dataset is a key resource, fostering innovation in computer vision for diverse and efficient package analysis applications.
## Dataset YAML
A YAML (Yet Another Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. In the case of the Package Segmentation dataset, the `package-seg.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/package-seg.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/package-seg.yaml).
!!! example "ultralytics/cfg/datasets/package-seg.yaml"
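```yaml
--8<-- "ultralytics/cfg/datasets/package-seg.yaml"
```
## Usage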
To train the Ultralytics YOLO11n model on the Package Segmentation dataset for 100 [epochs](https://www.ultralytics.com/glossary/epoch) with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO

# Load a model
model = YOLO("yolo11n-seg.pt")  # load a pretrained model (recommended for training)

# Train the model on the Package Segmentation dataset for 100 epochs at image size 640
results = model.train(data="package-seg.yaml", epochs=100, imgsz=640)
```
The Package Segmentation dataset comprises a varied collection of images captured from multiple perspectives. Below are instances of data from the dataset, accompanied by their respective annotations:
- This image displays an instance of image [object detection](https://www.ultralytics.com/glossary/object-detection), featuring annotated bounding boxes with masks outlining recognized objects. The dataset incorporates a diverse collection of images taken in different locations, environments, and densities. It serves as a comprehensive resource for developing models specific to this task.
- The example emphasizes the diversity and complexity present in the Package Segmentation dataset, underscoring the significance of high-quality data for computer vision tasks.
## Citations and Acknowledgments
If you integrate the Package Segmentation dataset into your research or development initiatives, please credit its creators.
We express our gratitude to the Roboflow team for their efforts in creating and maintaining the Package Segmentation dataset, a valuable asset for logistics and research projects. For additional details about the Package Segmentation dataset and its creators, please visit the [Package Segmentation Dataset Page](https://universe.roboflow.com/factorypackage/factory_package?ref=ultralytics).
## FAQ
### What is the Roboflow Package Segmentation Dataset and how can it help in computer vision projects?
The [Roboflow Package Segmentation Dataset](https://universe.roboflow.com/factorypackage/factory_package?ref=ultralytics) is a curated collection of images tailored for tasks involving package segmentation. It includes diverse images of packages in various contexts, making it invaluable for training and evaluating segmentation models. This dataset is particularly useful for applications in logistics, warehouse automation, and any project requiring precise package analysis. It helps optimize logistics and enhance vision models for accurate package identification and sorting.
### How do I train an Ultralytics YOLO11 model on the Package Segmentation Dataset?
You can train an Ultralytics YOLO11n model using both Python and CLI methods. Use the snippets below:
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO

# Load a model
model = YOLO("yolo11n-seg.pt")  # load a pretrained model

# Train the model on the Package Segmentation dataset for 100 epochs at image size 640
results = model.train(data="package-seg.yaml", epochs=100, imgsz=640)
```
Refer to the model [Training](../../modes/train.md) page for more details.
### What are the components of the Package Segmentation Dataset, and how is it structured?
The dataset is structured into three main components:
- **Training set**: Contains 1920 images with annotations.
- **Testing set**: Comprises 89 images with corresponding annotations.
- **Validation set**: Includes 188 images with annotations.
This structure ensures a balanced dataset for thorough model training, validation, and testing, enhancing the performance of segmentation algorithms.
### Why should I use Ultralytics YOLO11 with the Package Segmentation Dataset?
Ultralytics YOLO11 provides state-of-the-art [accuracy](https://www.ultralytics.com/glossary/accuracy) and speed for real-time object detection and segmentation tasks. Using it with the Package Segmentation Dataset allows you to leverage YOLO11's capabilities for precise package segmentation. This combination is especially beneficial for industries like logistics and warehouse automation, where accurate package identification is critical. For more information, check out our [page on YOLO11 segmentation](https://docs.ultralytics.com/models/yolo11/).
### How can I access and use the package-seg.yaml file for the Package Segmentation Dataset?
The `package-seg.yaml` file is hosted on Ultralytics' GitHub repository and contains essential information about the dataset's paths, classes, and configuration. You can download it from [here](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/package-seg.yaml). This file is crucial for configuring your models to utilize the dataset efficiently.
For more insights and practical examples, explore our [Usage](https://docs.ultralytics.com/usage/python/) section.
description:Learn how to use Multi-Object Tracking with YOLO. Explore dataset formats and see upcoming features for training trackers. Start with Python or CLI examples.
keywords:YOLO, Multi-Object Tracking, Tracking Datasets, Python Tracking Example, CLI Tracking Example, Object Detection, Ultralytics, AI, Machine Learning
---
# Multi-object Tracking Datasets Overview
## Dataset Format (Coming Soon)
Multi-object tracking doesn't need standalone training; it directly supports pre-trained detection, segmentation, or pose models. Support for training trackers alone is coming soon. In the meantime, you can run tracking with a pre-trained model:
```bash
yolo track model=yolo11n.pt source="https://youtu.be/LNwODJXcvt4" conf=0.3 iou=0.5 show
```
This command loads the YOLO11 model and uses it for tracking objects in the given video source with specific confidence (`conf`) and [Intersection over Union](https://www.ultralytics.com/glossary/intersection-over-union-iou) (`iou`) thresholds. For more details, refer to the [track mode documentation](../../modes/track.md).
### What are the upcoming features for training trackers in Ultralytics?
Ultralytics is continuously enhancing its AI models. An upcoming feature will enable the training of standalone trackers. Until then, Multi-Object Detector leverages pre-trained detection, segmentation, or Pose models for tracking without requiring standalone training. Stay updated by following our [blog](https://www.ultralytics.com/blog) or checking the [upcoming features](../../reference/trackers/track.md).
### Why should I use Ultralytics YOLO for multi-object tracking?
Ultralytics YOLO is a state-of-the-art [object detection](https://www.ultralytics.com/glossary/object-detection) model known for its real-time performance and high [accuracy](https://www.ultralytics.com/glossary/accuracy). Using YOLO for multi-object tracking provides several advantages:
- **Real-time tracking:** Achieve efficient and high-speed tracking ideal for dynamic environments.
- **Flexibility with pre-trained models:** No need to train from scratch; simply use pre-trained detection, segmentation, or pose models.
- **Ease of use:** Simple API integration with both Python and CLI makes setting up tracking pipelines straightforward.
- **Extensive documentation and community support:** Ultralytics provides comprehensive documentation and an active community forum to troubleshoot issues and enhance your tracking models.
For more details on setting up and using YOLO for tracking, visit our [track usage guide](../../modes/track.md).
### Can I use custom datasets for multi-object tracking with Ultralytics YOLO?
Yes, you can use custom datasets for multi-object tracking with Ultralytics YOLO. While support for standalone tracker training is an upcoming feature, you can already use pre-trained models on your custom datasets. Prepare your datasets in the appropriate format compatible with YOLO and follow the documentation to integrate them.
### How do I interpret the results from the Ultralytics YOLO tracking model?
After running a tracking job with Ultralytics YOLO, the results include various data points such as tracked object IDs, their bounding boxes, and the confidence scores. Here's a brief overview of how to interpret these results:
- **Tracked IDs:** Each object is assigned a unique ID, which helps in tracking it across frames.
- **Bounding boxes:** These indicate the location of tracked objects within the frame.
- **Confidence scores:** These reflect the model's confidence in detecting the tracked object.
For detailed guidance on interpreting and visualizing these results, refer to the [results handling guide](../../reference/engine/results.md).
description:Learn to create line graphs, bar plots, and pie charts using Python with guided instructions and code snippets. Maximize your data visualization skills!
keywords:Ultralytics, YOLO11, data visualization, line graphs, bar plots, pie charts, Python, analytics, tutorial, guide
---
# Analytics using Ultralytics YOLO11
## Introduction
This guide provides a comprehensive overview of three fundamental types of [data visualizations](https://www.ultralytics.com/glossary/data-visualization): line graphs, bar plots, and pie charts. Each section includes step-by-step instructions and code snippets on how to create these visualizations using Python.
- Line graphs are ideal for tracking changes over short and long periods and for comparing changes for multiple groups over the same period.
- Bar plots, on the other hand, are suitable for comparing quantities across different categories and showing relationships between a category and its numerical value.
- Lastly, pie charts are effective for illustrating proportions among categories and showing parts of a whole.
!!! analytics "Analytics Examples"
=== "Line Graph"
```python
import cv2

from ultralytics import solutions

cap = cv2.VideoCapture("Path/to/video/file.mp4")
assert cap.isOpened(), "Error reading video file"

w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))

out = cv2.VideoWriter(
    "ultralytics_analytics.avi",
    cv2.VideoWriter_fourcc(*"MJPG"),
    fps,
    (1920, 1080),  # This is fixed
)

analytics = solutions.Analytics(
    analytics_type="line",
    show=True,
)

frame_count = 0
while cap.isOpened():
    success, im0 = cap.read()
    if success:
        frame_count += 1
        im0 = analytics.process_data(im0, frame_count)  # update analytics graph every frame
        out.write(im0)  # write the video file
    else:
        break

cap.release()
out.release()
cv2.destroyAllWindows()
```
=== "Pie Chart"
```python
import cv2

from ultralytics import solutions

cap = cv2.VideoCapture("Path/to/video/file.mp4")
assert cap.isOpened(), "Error reading video file"

w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))

out = cv2.VideoWriter(
    "ultralytics_analytics.avi",
    cv2.VideoWriter_fourcc(*"MJPG"),
    fps,
    (1920, 1080),  # This is fixed
)

analytics = solutions.Analytics(
    analytics_type="pie",
    show=True,
)

frame_count = 0
while cap.isOpened():
    success, im0 = cap.read()
    if success:
        frame_count += 1
        im0 = analytics.process_data(im0, frame_count)  # update analytics graph every frame
        out.write(im0)  # write the video file
    else:
        break

cap.release()
out.release()
cv2.destroyAllWindows()
```
=== "Bar Plot"
```python
import cv2

from ultralytics import solutions

cap = cv2.VideoCapture("Path/to/video/file.mp4")
assert cap.isOpened(), "Error reading video file"

w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))

out = cv2.VideoWriter(
    "ultralytics_analytics.avi",
    cv2.VideoWriter_fourcc(*"MJPG"),
    fps,
    (1920, 1080),  # This is fixed
)

analytics = solutions.Analytics(
    analytics_type="bar",
    show=True,
)

frame_count = 0
while cap.isOpened():
    success, im0 = cap.read()
    if success:
        frame_count += 1
        im0 = analytics.process_data(im0, frame_count)  # update analytics graph every frame
        out.write(im0)  # write the video file
    else:
        break

cap.release()
out.release()
cv2.destroyAllWindows()
```
=== "Area chart"
```python
import cv2

from ultralytics import solutions

cap = cv2.VideoCapture("Path/to/video/file.mp4")
assert cap.isOpened(), "Error reading video file"

w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))

out = cv2.VideoWriter(
    "ultralytics_analytics.avi",
    cv2.VideoWriter_fourcc(*"MJPG"),
    fps,
    (1920, 1080),  # This is fixed
)

analytics = solutions.Analytics(
    analytics_type="area",
    show=True,
)

frame_count = 0
while cap.isOpened():
    success, im0 = cap.read()
    if success:
        frame_count += 1
        im0 = analytics.process_data(im0, frame_count)  # update analytics graph every frame
        out.write(im0)  # write the video file
    else:
        break

cap.release()
out.release()
cv2.destroyAllWindows()
```
### Arguments `Analytics`
| Name | Type | Default | Description |
| --- | --- | --- | --- |
| `analytics_type` | `str` | `line` | Type of graph, i.e. "line", "bar", "area", "pie" |
| `model` | `str` | `None` | Path to Ultralytics YOLO model file |
| `line_width` | `int` | `2` | Line thickness for bounding boxes. |
| `show` | `bool` | `False` | Flag to control whether to display the video stream. |
### Arguments `model.track`
{% include "macros/track-args.md" %}
## Conclusion
Understanding when and how to use different types of visualizations is crucial for effective data analysis. Line graphs, bar plots, and pie charts are fundamental tools that can help you convey your data's story more clearly and effectively.
## FAQ
### How do I create a line graph using Ultralytics YOLO11 Analytics?
To create a line graph using Ultralytics YOLO11 Analytics, follow these steps:
1. Load a YOLO11 model and open your video file.
2. Initialize the `Analytics` class with the type set to "line."
3. Iterate through video frames, updating the line graph with relevant data, such as object counts per frame.
4. Save the output video displaying the line graph.
```python
# ... (video capture, writer, and Analytics setup as in the line graph example above)
while cap.isOpened():
    success, im0 = cap.read()
    if success:
        frame_count += 1
        im0 = analytics.process_data(im0, frame_count)  # update analytics graph every frame
        out.write(im0)  # write the video file
    else:
        break

cap.release()
out.release()
cv2.destroyAllWindows()
```
For further details on configuring the `Analytics` class, visit the [Analytics using Ultralytics YOLO11 📊](#analytics-using-ultralytics-yolo11) section.
### What are the benefits of using Ultralytics YOLO11 for creating bar plots?
Using Ultralytics YOLO11 for creating bar plots offers several benefits:
1. **Real-time Data Visualization**: Seamlessly integrate [object detection](https://www.ultralytics.com/glossary/object-detection) results into bar plots for dynamic updates.
2. **Ease of Use**: Simple API and functions make it straightforward to implement and visualize data.
3. **Customization**: Customize titles, labels, colors, and more to fit your specific requirements.
4. **Efficiency**: Efficiently handle large amounts of data and update plots in real-time during video processing.
```python
# ... (video capture, writer, and Analytics setup as in the bar plot example above)
while cap.isOpened():
    success, im0 = cap.read()
    if success:
        frame_count += 1
        im0 = analytics.process_data(im0, frame_count)  # update analytics graph every frame
        out.write(im0)  # write the video file
    else:
        break

cap.release()
out.release()
cv2.destroyAllWindows()
```
For more information, refer to the [Bar Plot](#visual-samples) section in the guide.
### Can Ultralytics YOLO11 be used to track objects and dynamically update visualizations?
Yes, Ultralytics YOLO11 can be used to track objects and dynamically update visualizations. It supports tracking multiple objects in real-time and can update various visualizations like line graphs, bar plots, and pie charts based on the tracked objects' data.
```python
# ... (video capture, writer, and Analytics setup as in the examples above)
while cap.isOpened():
    success, im0 = cap.read()
    if success:
        frame_count += 1
        im0 = analytics.process_data(im0, frame_count)  # update analytics graph every frame
        out.write(im0)  # write the video file
    else:
        break

cap.release()
out.release()
cv2.destroyAllWindows()
```
To learn about the complete functionality, see the [Tracking](../modes/track.md) section.
### What makes Ultralytics YOLO11 different from other object detection solutions like [OpenCV](https://www.ultralytics.com/glossary/opencv) and [TensorFlow](https://www.ultralytics.com/glossary/tensorflow)?
Ultralytics YOLO11 stands out from other object detection solutions like OpenCV and TensorFlow for multiple reasons:
1. **State-of-the-art [Accuracy](https://www.ultralytics.com/glossary/accuracy)**: YOLO11 provides superior accuracy in object detection, segmentation, and classification tasks.
2. **Ease of Use**: User-friendly API allows for quick implementation and integration without extensive coding.
3. **Real-time Performance**: Optimized for high-speed inference, suitable for real-time applications.
4. **Diverse Applications**: Supports various tasks including multi-object tracking, custom model training, and exporting to different formats like ONNX, TensorRT, and CoreML.
5. **Comprehensive Documentation**: Extensive [documentation](https://docs.ultralytics.com/) and [blog resources](https://www.ultralytics.com/blog) to guide users through every step.
For more detailed comparisons and use cases, explore our [Ultralytics Blog](https://www.ultralytics.com/blog/ai-use-cases-transforming-your-future).
description:Learn how to run YOLO11 on AzureML. Quickstart instructions for terminal and notebooks to harness Azure's cloud computing for efficient model training.
[Azure](https://azure.microsoft.com/) is Microsoft's [cloud computing](https://www.ultralytics.com/glossary/cloud-computing) platform, designed to help organizations move their workloads to the cloud from on-premises data centers. With the full spectrum of cloud services including those for computing, databases, analytics, [machine learning](https://www.ultralytics.com/glossary/machine-learning-ml), and networking, users can pick and choose from these services to develop and scale new applications, or run existing applications, in the public cloud.
## What is Azure Machine Learning (AzureML)?
Azure Machine Learning, commonly referred to as AzureML, is a fully managed cloud service that enables data scientists and developers to efficiently embed predictive analytics into their applications, helping organizations use massive data sets and bring all the benefits of the cloud to machine learning. AzureML offers a variety of services and capabilities aimed at making machine learning accessible, easy to use, and scalable. It provides capabilities like automated machine learning, drag-and-drop model training, as well as a robust Python SDK so that developers can make the most out of their machine learning models.
## How Does AzureML Benefit YOLO Users?
For users of YOLO (You Only Look Once), AzureML provides a robust, scalable, and efficient platform to both train and deploy machine learning models. Whether you are looking to run quick prototypes or scale up to handle more extensive data, AzureML's flexible and user-friendly environment offers various tools and services to fit your needs. You can leverage AzureML to:
- Easily manage large datasets and computational resources for training.
- Utilize built-in tools for data preprocessing, feature selection, and model training.
- Collaborate more efficiently with capabilities for MLOps (Machine Learning Operations), including but not limited to monitoring, auditing, and versioning of models and data.
In the subsequent sections, you will find a quickstart guide detailing how to run YOLO11 object detection models using AzureML, either from a compute terminal or a notebook.
## Prerequisites
Before you can get started, make sure you have access to an AzureML workspace. If you don't have one, you can create a new [AzureML workspace](https://learn.microsoft.com/azure/machine-learning/concept-workspace?view=azureml-api-2) by following Azure's official documentation. This workspace acts as a centralized place to manage all AzureML resources.
## Create a compute instance
From your AzureML workspace, select Compute > Compute instances > New, and select an instance with the resources you need.
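Once the instance is running, open a terminal and use the Ultralytics CLI, for example to train a model (a minimal sketch; adjust the arguments to your needs):
```bash
yolo train data=coco8.yaml model=yolo11n.pt epochs=3 imgsz=640
```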
Or with the [Ultralytics Python interface](../quickstart.md#use-ultralytics-with-python), for example to train the model:
```python
from ultralytics import YOLO

# Load a model
model = YOLO("yolo11n.pt")  # load an official YOLO11n model

# Use the model
model.train(data="coco8.yaml", epochs=3)  # train the model
metrics = model.val()  # evaluate model performance on the validation set
results = model("https://ultralytics.com/images/bus.jpg")  # predict on an image
path = model.export(format="onnx")  # export the model to ONNX format
```
You can use either the Ultralytics CLI or Python interface for running YOLO11 tasks, as described in the terminal section above.
By following these steps, you should be able to get YOLO11 running quickly on AzureML for quick trials. For more advanced uses, you may refer to the full AzureML documentation linked at the beginning of this guide.
## Explore More with AzureML
This guide serves as an introduction to get you up and running with YOLO11 on AzureML. However, it only scratches the surface of what AzureML can offer. To delve deeper and unlock the full potential of AzureML for your machine learning projects, consider exploring the following resources:
-[Create a Data Asset](https://learn.microsoft.com/azure/machine-learning/how-to-create-data-assets): Learn how to set up and manage your data assets effectively within the AzureML environment.
-[Initiate an AzureML Job](https://learn.microsoft.com/azure/machine-learning/how-to-train-model): Get a comprehensive understanding of how to kickstart your machine learning training jobs on AzureML.
-[Register a Model](https://learn.microsoft.com/azure/machine-learning/how-to-manage-models): Familiarize yourself with model management practices including registration, versioning, and deployment.
-[Train YOLO11 with AzureML Python SDK](https://medium.com/@ouphi/how-to-train-the-yolov8-model-with-azure-machine-learning-python-sdk-8268696be8ba): Explore a step-by-step guide on using the AzureML Python SDK to train your YOLO11 models.
-[Train YOLO11 with AzureML CLI](https://medium.com/@ouphi/how-to-train-the-yolov8-model-with-azureml-and-the-az-cli-73d3c870ba8e): Discover how to utilize the command-line interface for streamlined training and management of YOLO11 models on AzureML.
## FAQ
### How do I run YOLO11 on AzureML for model training?
Running YOLO11 on AzureML for model training involves several steps:
1. **Create a Compute Instance**: From your AzureML workspace, navigate to Compute > Compute instances > New, and select the required instance.
2. **Setup Environment**: Start your compute instance, open a terminal, and create a conda environment:
```bash
conda create --name yolo11env -y
conda activate yolo11env
conda install pip -y
pip install ultralytics "onnx>=1.12.0"
```
3. **Run YOLO11 Tasks**: Use the Ultralytics CLI to train your model:
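```bash
# Minimal example; adjust the dataset, model, and epochs to your needs
yolo train data=coco8.yaml model=yolo11n.pt epochs=3 imgsz=640
```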
For more details, you can refer to the [instructions to use the Ultralytics CLI](../quickstart.md#use-ultralytics-with-cli).
### What are the benefits of using AzureML for YOLO11 training?
AzureML provides a robust and efficient ecosystem for training YOLO11 models:
- **Scalability**: Easily scale your compute resources as your data and model complexity grows.
- **MLOps Integration**: Utilize features like versioning, monitoring, and auditing to streamline ML operations.
- **Collaboration**: Share and manage resources within teams, enhancing collaborative workflows.
These advantages make AzureML an ideal platform for projects ranging from quick prototypes to large-scale deployments. For more tips, check out [AzureML Jobs](https://learn.microsoft.com/azure/machine-learning/how-to-train-model).
### How do I troubleshoot common issues when running YOLO11 on AzureML?
Troubleshooting common issues with YOLO11 on AzureML can involve the following steps:
- **Dependency Issues**: Ensure all required packages are installed. Refer to the `requirements.txt` file for dependencies.
- **Environment Setup**: Verify that your conda environment is correctly activated before running commands.
- **Resource Allocation**: Make sure your compute instances have sufficient resources to handle the training workload.
For additional guidance, review our [YOLO Common Issues](https://docs.ultralytics.com/guides/yolo-common-issues/) documentation.
### Can I use both the Ultralytics CLI and Python interface on AzureML?
Yes, AzureML allows you to use both the Ultralytics CLI and the Python interface seamlessly:
- **CLI**: Ideal for quick tasks and running standard scripts directly from the terminal.
- **Python Interface**: Useful for more complex tasks requiring custom coding and integration within notebooks.
```python
from ultralytics import YOLO
model = YOLO("yolo11n.pt")
model.train(data="coco8.yaml", epochs=3)
```
Refer to the quickstart guides for more detailed instructions [here](../quickstart.md#use-ultralytics-with-cli) and [here](../quickstart.md#use-ultralytics-with-python).
### What is the advantage of using Ultralytics YOLO11 over other [object detection](https://www.ultralytics.com/glossary/object-detection) models?
Ultralytics YOLO11 offers several unique advantages over competing object detection models:
- **Speed**: Faster inference and training times compared to models like Faster R-CNN and SSD.
- **[Accuracy](https://www.ultralytics.com/glossary/accuracy)**: High accuracy in detection tasks with features like anchor-free design and enhanced augmentation strategies.
- **Ease of Use**: Intuitive API and CLI for quick setup, making it accessible both to beginners and experts.
To explore more about YOLO11's features, visit the [Ultralytics YOLO](https://www.ultralytics.com/yolo) page for detailed insights.
This guide provides a comprehensive introduction to setting up a Conda environment for your Ultralytics projects. Conda is an open-source package and environment management system that offers an excellent alternative to pip for installing packages and dependencies. Its isolated environments make it particularly well-suited for data science and [machine learning](https://www.ultralytics.com/glossary/machine-learning-ml) endeavors. For more details, visit the Ultralytics Conda package on [Anaconda](https://anaconda.org/conda-forge/ultralytics) and check out the Ultralytics feedstock repository for package updates on [GitHub](https://github.com/conda-forge/ultralytics-feedstock/).
- You should have Anaconda or Miniconda installed on your system. If not, download and install it from [Anaconda](https://www.anaconda.com/) or [Miniconda](https://docs.conda.io/projects/miniconda/en/latest/).
---
## Setting up a Conda Environment
First, let's create a new Conda environment. Open your terminal and run the following command:
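```bash
# Environment name and Python version are examples; adjust as needed
conda create --name ultralytics-env python=3.11 -y
conda activate ultralytics-env
```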
You can install the Ultralytics package from the conda-forge channel. Execute the following command:
```bash
conda install -c conda-forge ultralytics
```
### Note on CUDA Environment
If you're working in a CUDA-enabled environment, it's a good practice to install `ultralytics`, `pytorch`, and `pytorch-cuda` together to resolve any conflicts:
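```bash
# Example command; match the CUDA version to your system
conda install -c pytorch -c nvidia -c conda-forge pytorch torchvision pytorch-cuda=11.8 ultralytics
```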
With Ultralytics installed, you can now start using its robust features for [object detection](https://www.ultralytics.com/glossary/object-detection), [instance segmentation](https://www.ultralytics.com/glossary/instance-segmentation), and more. For example, to predict an image, you can run:
```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")  # load a pretrained model
results = model("https://ultralytics.com/images/bus.jpg")  # predict on an image
results[0].show()  # display results for the first image
```
---
## Ultralytics Conda Docker Image
If you prefer using Docker, Ultralytics offers Docker images with a Conda environment included. You can pull these images from [DockerHub](https://hub.docker.com/r/ultralytics/ultralytics).
Pull the latest Ultralytics image:
```bash
# Set image name as a variable
t=ultralytics/ultralytics:latest-conda
# Pull the latest Ultralytics image from Docker Hub
sudo docker pull $t
```
Run the image:
```bash
# Run the Ultralytics image in a container with GPU support
sudo docker run -it --ipc=host --gpus all $t # all GPUs
sudo docker run -it --ipc=host --gpus '"device=2,3"' $t # specify GPUs
```
## Speeding Up Installation with Libmamba
If you're looking to [speed up the package installation](https://www.anaconda.com/blog/a-faster-conda-for-a-growing-community) process in Conda, you can opt to use `libmamba`, a fast, cross-platform, and dependency-aware package manager that serves as an alternative solver to Conda's default.
### How to Enable Libmamba
To enable `libmamba` as the solver for Conda, you can perform the following steps:
1. First, install the `conda-libmamba-solver` package. This can be skipped if your Conda version is 4.11 or above, as `libmamba` is included by default.
```bash
conda install conda-libmamba-solver
```
2. Next, configure Conda to use `libmamba` as the solver:
```bash
conda config --set solver libmamba
```
And that's it! Your Conda installation will now use `libmamba` as the solver, which should result in a faster package installation process.
---
Congratulations! You have successfully set up a Conda environment, installed the Ultralytics package, and are now ready to explore its rich functionalities. Feel free to dive deeper into the [Ultralytics documentation](../index.md) for more advanced tutorials and examples.
## FAQ
### What is the process for setting up a Conda environment for Ultralytics projects?
Setting up a Conda environment for Ultralytics projects is straightforward and ensures smooth package management. First, create a new Conda environment using the following command:
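```bash
# Environment name and Python version are examples; adjust as needed
conda create --name ultralytics-env python=3.11 -y
conda activate ultralytics-env
```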
Finally, install Ultralytics from the conda-forge channel:
```bash
conda install -c conda-forge ultralytics
```
### Why should I use Conda over pip for managing dependencies in Ultralytics projects?
Conda is a robust package and environment management system that offers several advantages over pip. It manages dependencies efficiently and ensures that all necessary libraries are compatible. Conda's isolated environments prevent conflicts between packages, which is crucial in data science and machine learning projects. Additionally, Conda supports binary package distribution, speeding up the installation process.
### Can I use Ultralytics YOLO in a CUDA-enabled environment for faster performance?
Yes, you can enhance performance by utilizing a CUDA-enabled environment. Ensure that you install `ultralytics`, `pytorch`, and `pytorch-cuda` together to avoid conflicts:
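```bash
# Example command; match the CUDA version to your system
conda install -c pytorch -c nvidia -c conda-forge pytorch torchvision pytorch-cuda=11.8 ultralytics
```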
This setup enables GPU acceleration, crucial for intensive tasks like [deep learning](https://www.ultralytics.com/glossary/deep-learning-dl) model training and inference. For more information, visit the [Ultralytics installation guide](../quickstart.md).
### What are the benefits of using Ultralytics Docker images with a Conda environment?
Using Ultralytics Docker images ensures a consistent and reproducible environment, eliminating "it works on my machine" issues. These images include a pre-configured Conda environment, simplifying the setup process. You can pull and run the latest Ultralytics Docker image with the following commands:
```bash
# Pull the latest Ultralytics image with a Conda environment, then run it with GPU support
sudo docker pull ultralytics/ultralytics:latest-conda
sudo docker run -it --ipc=host --gpus all ultralytics/ultralytics:latest-conda
```
This approach is ideal for deploying applications in production or running complex workflows without manual configuration. Learn more about [Ultralytics Conda Docker Image](../quickstart.md).
### How can I speed up Conda package installation in my Ultralytics environment?
You can speed up the package installation process by using `libmamba`, a fast dependency solver for Conda. First, install the `conda-libmamba-solver` package:
```bash
conda install conda-libmamba-solver
```
Then configure Conda to use `libmamba` as the solver:
```bash
conda config --set solver libmamba
```
This setup provides faster and more efficient package management. For more tips on optimizing your environment, read about [libmamba installation](../quickstart.md).
description:Learn how to boost your Raspberry Pi's ML performance using Coral Edge TPU with Ultralytics YOLO11. Follow our detailed setup and installation guide.
# Coral Edge TPU on a Raspberry Pi with Ultralytics YOLO11 🚀
<p align="center">
  <img width="800" src="https://github.com/ultralytics/docs/releases/download/0/edge-tpu-usb-accelerator-and-pi.avif" alt="Raspberry Pi single board computer with USB Edge TPU accelerator">
</p>
## What is a Coral Edge TPU?
The Coral Edge TPU is a compact device that adds an Edge TPU coprocessor to your system. It enables low-power, high-performance ML inference for [TensorFlow](https://www.ultralytics.com/glossary/tensorflow) Lite models. Read more at the [Coral Edge TPU home page](https://coral.ai/products/accelerator).
<p align="center">
  <strong>Watch:</strong> How to Run Inference on Raspberry Pi using Google Coral Edge TPU
</p>
## Boost Raspberry Pi Model Performance with Coral Edge TPU
Many people want to run their models on an embedded or mobile device such as a Raspberry Pi, since they are very power efficient and can be used in many different applications. However, the inference performance on these devices is usually poor even when using formats like [ONNX](../integrations/onnx.md) or [OpenVINO](../integrations/openvino.md). The Coral Edge TPU is a great solution to this problem, since it can be used with a Raspberry Pi and accelerate inference performance greatly.
## Edge TPU on Raspberry Pi with TensorFlow Lite (New) ⭐
The [existing guide](https://coral.ai/docs/accelerator/get-started/) by Coral on how to use the Edge TPU with a Raspberry Pi is outdated, and the current Coral Edge TPU runtime builds do not work with the current TensorFlow Lite runtime versions anymore. In addition to that, Google seems to have completely abandoned the Coral project, and there have not been any updates between 2021 and 2024. This guide will show you how to get the Edge TPU working with the latest versions of the TensorFlow Lite runtime and an updated Coral Edge TPU runtime on a Raspberry Pi single board computer (SBC).
## Prerequisites
- [Raspberry Pi 4B](https://www.raspberrypi.com/products/raspberry-pi-4-model-b/) (2GB or more recommended) or [Raspberry Pi 5](https://www.raspberrypi.com/products/raspberry-pi-5/) (recommended)
- [Raspberry Pi OS](https://www.raspberrypi.com/software/) Bullseye/Bookworm (64-bit) with desktop (recommended)
- [Coral USB Accelerator](https://coral.ai/products/accelerator/)
- A non-ARM based platform for exporting an Ultralytics [PyTorch](https://www.ultralytics.com/glossary/pytorch) model
## Installation Walkthrough
This guide assumes that you already have a working Raspberry Pi OS installation with `ultralytics` and all its dependencies installed. To install `ultralytics`, visit the [quickstart guide](../quickstart.md) to get set up before continuing here.
### Installing the Edge TPU runtime
First, we need to install the Edge TPU runtime. There are many different versions available, so you need to choose the right version for your operating system.
The available runtime builds differ by Raspberry Pi OS release and by whether you want the standard or the high-frequency mode, so be sure to pick the package that matches your system.
[Download the latest version from here](https://github.com/feranick/libedgetpu/releases).
After downloading the file, you can install it with the following command:
```bash
sudo dpkg -i path/to/package.deb
```
After installing the runtime, plug your Coral Edge TPU into a USB 3.0 port on your Raspberry Pi. This is because, according to the official guide, a new `udev` rule needs to take effect after installation.
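As a quick sanity check, you can verify that the accelerator is detected over USB:
```bash
# The accelerator usually enumerates as "Global Unichip Corp." before the
# first inference and as "Google Inc." afterwards
lsusb
```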
???+ warning "Important"
If you already have the Coral Edge TPU runtime installed, uninstall it using the following command.
```bash
# If you installed the standard version
sudo apt remove libedgetpu1-std
# If you installed the high frequency version
sudo apt remove libedgetpu1-max
```
## Export your model to an Edge TPU compatible model
To use the Edge TPU, you need to convert your model into a compatible format. It is recommended that you run the export on Google Colab, an x86_64 Linux machine, the official [Ultralytics Docker container](docker-quickstart.md), or [Ultralytics HUB](../hub/quickstart.md), since the Edge TPU compiler is not available on ARM. See the [Export Mode](../modes/export.md) page for the available arguments.
!!! example "Exporting the model"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("path/to/model.pt") # Load an official model or custom model
# Export the model
model.export(format="edgetpu")
```
=== "CLI"
```bash
yolo export model=path/to/model.pt format=edgetpu # Export an official model or custom model
```
The exported model will be saved in the `<model_name>_saved_model/` folder with the name `<model_name>_full_integer_quant_edgetpu.tflite`. It is important that your model ends with the suffix `_edgetpu.tflite`; otherwise, Ultralytics doesn't know that you're using an Edge TPU model.
## Running the model
Before you can actually run the model, you will need to install the correct libraries.
If `tensorflow` is installed, uninstall it with the following command:
```bash
pip uninstall tensorflow tensorflow-aarch64
```
Then install/update `tflite-runtime`:
```bash
pip install -U tflite-runtime
```
Now you can run inference using the following code:
!!! example "Running the model"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("path/to/<model_name>_full_integer_quant_edgetpu.tflite") # Load an official model or custom model
# Run Prediction
model.predict("path/to/source.png")
```
=== "CLI"
```bash
yolo predict model=path/to/<model_name>_full_integer_quant_edgetpu.tflite source=path/to/source.png # Run prediction with the exported model
```
Find comprehensive information on the [Predict](../modes/predict.md) page for full prediction mode details.
!!! note "Inference with multiple Edge TPUs"
If you have multiple Edge TPUs, you can use the following code to select a specific TPU.
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("path/to/<model_name>_full_integer_quant_edgetpu.tflite") # Load an official model or custom model
# Run Prediction
model.predict("path/to/source.png") # Inference defaults to the first TPU
model.predict("path/to/source.png", device="tpu:0") # Select the first TPU
model.predict("path/to/source.png", device="tpu:1") # Select the second TPU
```
## FAQ
### What is a Coral Edge TPU and how does it enhance Raspberry Pi's performance with Ultralytics YOLO11?
The Coral Edge TPU is a compact device designed to add an Edge TPU coprocessor to your system. This coprocessor enables low-power, high-performance [machine learning](https://www.ultralytics.com/glossary/machine-learning-ml) inference, particularly optimized for TensorFlow Lite models. When using a Raspberry Pi, the Edge TPU accelerates ML model inference, significantly boosting performance, especially for Ultralytics YOLO11 models. You can read more about the Coral Edge TPU on their [home page](https://coral.ai/products/accelerator).
### How do I install the Coral Edge TPU runtime on a Raspberry Pi?
To install the Coral Edge TPU runtime on your Raspberry Pi, download the appropriate `.deb` package for your Raspberry Pi OS version from [this link](https://github.com/feranick/libedgetpu/releases). Once downloaded, use the following command to install it:
```bash
sudo dpkg -i path/to/package.deb
```
Make sure to uninstall any previous Coral Edge TPU runtime versions by following the steps outlined in the [Installation Walkthrough](#installation-walkthrough) section.
### Can I export my Ultralytics YOLO11 model to be compatible with Coral Edge TPU?
Yes, you can export your Ultralytics YOLO11 model to be compatible with the Coral Edge TPU. It is recommended to perform the export on Google Colab, an x86_64 Linux machine, or using the [Ultralytics Docker container](docker-quickstart.md). You can also use Ultralytics HUB for exporting. Here is how you can export your model using Python and CLI:
!!! note "Exporting the model"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("path/to/model.pt") # Load an official model or custom model
# Export the model
model.export(format="edgetpu")
```
=== "CLI"
```bash
yolo export model=path/to/model.pt format=edgetpu # Export an official model or custom model
```
For more information, refer to the [Export Mode](../modes/export.md) documentation.
### What should I do if TensorFlow is already installed on my Raspberry Pi but I want to use tflite-runtime instead?
If you have TensorFlow installed on your Raspberry Pi and need to switch to `tflite-runtime`, you'll need to uninstall TensorFlow first using:
```bash
pip uninstall tensorflow tensorflow-aarch64
```
Then, install or update `tflite-runtime` with the following command:
```bash
pip install -U tflite-runtime
```
For a specific wheel, such as the `tflite-runtime` build for TensorFlow 2.15.0, you can download it from [this link](https://github.com/feranick/TFlite-builds/releases) and install it using `pip`. Detailed instructions are available in the [Running the Model](#running-the-model) section.
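A minimal sketch of installing such a downloaded wheel (the filename below is illustrative; use the one you actually downloaded):
```bash
pip install path/to/tflite_runtime-2.15.0-cp311-cp311-linux_aarch64.whl
```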
### How do I run inference with an exported YOLO11 model on a Raspberry Pi using the Coral Edge TPU?
After exporting your YOLO11 model to an Edge TPU-compatible format, you can run inference using the following code snippets:
!!! note "Running the model"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("path/to/edgetpu_model.tflite") # Load an official model or custom model
# Run Prediction
model.predict("path/to/source.png")
```
=== "CLI"
```bash
yolo predict model=path/to/edgetpu_model.tflite source=path/to/source.png # Run prediction with the exported model
```
Comprehensive details on full prediction mode features can be found on the [Predict Page](../modes/predict.md).
description:Data collection and annotation are vital steps in any computer vision project. Explore the tools, techniques, and best practices for collecting and annotating data.
keywords:What is Data Annotation, Data Annotation Tools, Annotating Data, Avoiding Bias in Data Collection, Ethical Data Collection, Annotation Strategies
---
# Data Collection and Annotation Strategies for Computer Vision
## Introduction
The key to success in any [computer vision project](./steps-of-a-cv-project.md) starts with effective data collection and annotation strategies. The quality of the data directly impacts model performance, so it's important to understand the best practices related to data collection and data annotation.
Every consideration regarding the data should closely align with [your project's goals](./defining-project-goals.md). Changes in your annotation strategies could shift the project's focus or effectiveness and vice versa. With this in mind, let's take a closer look at the best ways to approach data collection and annotation.
## Setting Up Classes and Collecting Data
Collecting images and video for a computer vision project involves defining the number of classes, sourcing data, and considering ethical implications. Before you start gathering your data, you need to be clear about the following:
### Choosing the Right Classes for Your Project
One of the first questions when starting a computer vision project is how many classes to include. You need to determine the class membership, which involves the different categories or labels that you want your model to recognize and differentiate. The number of classes should be determined by the specific goals of your project.
For example, if you want to monitor traffic, your classes might include "car," "truck," "bus," "motorcycle," and "bicycle." On the other hand, for tracking items in a store, your classes could be "fruits," "vegetables," "beverages," and "snacks." Defining classes based on your project goals helps keep your dataset relevant and focused.
When you define your classes, another important distinction to make is whether to choose coarse or fine class counts. 'Count' refers to the number of distinct classes you are interested in. This decision influences the granularity of your data and the complexity of your model. Here are the considerations for each approach:
- **Coarse Class-Count**: These are broader, more inclusive categories, such as "vehicle" and "non-vehicle." They simplify annotation and require fewer computational resources but provide less detailed information, potentially limiting the model's effectiveness in complex scenarios.
- **Fine Class-Count**: More categories with finer distinctions, such as "sedan," "SUV," "pickup truck," and "motorcycle." They capture more detailed information, improving model accuracy and performance. However, they are more time-consuming and labor-intensive to annotate and require more computational resources.
Something to note is that starting with more specific classes can be very helpful, especially in complex projects where details are important. More specific classes let you collect more detailed data and gain deeper insights and clearer distinctions between categories. Not only does this improve the accuracy of the model, but it also makes it easier to adjust the model later if needed, saving both time and resources.
### Sources of Data
You can use public datasets or gather your own custom data. Public datasets like those on [Kaggle](https://www.kaggle.com/datasets) and [Google Dataset Search Engine](https://datasetsearch.research.google.com/) offer well-annotated, standardized data, making them great starting points for training and validating models.
Custom data collection, on the other hand, allows you to customize your dataset to your specific needs. You might capture images and videos with cameras or drones, scrape the web for images, or use existing internal data from your organization. Custom data gives you more control over its quality and relevance. Combining both public and custom data sources helps create a diverse and comprehensive dataset.
### Avoiding [Bias](https://www.ultralytics.com/glossary/bias-in-ai) in Data Collection
Bias occurs when certain groups or scenarios are underrepresented or overrepresented in your dataset. It leads to a model that performs well on some data but poorly on others. It's crucial to avoid bias so that your computer vision model can perform well in a variety of scenarios.
Here is how you can avoid bias while collecting data:
- **Diverse Sources**: Collect data from many sources to capture different perspectives and scenarios.
- **Balanced Representation**: Include balanced representation from all relevant groups. For example, consider different ages, genders, and ethnicities.
- **Continuous Monitoring**: Regularly review and update your dataset to identify and address any emerging biases.
- **Bias Mitigation Techniques**: Use methods like oversampling underrepresented classes, [data augmentation](https://www.ultralytics.com/glossary/data-augmentation), and fairness-aware algorithms.
Following these practices helps create a more robust and fair model that can generalize well in real-world applications.
## What is Data Annotation?
Data annotation is the process of labeling data to make it usable for training [machine learning](https://www.ultralytics.com/glossary/machine-learning-ml) models. In computer vision, this means labeling images or videos with the information that a model needs to learn from. Without properly annotated data, models cannot accurately learn the relationships between inputs and outputs.
### Types of Data Annotation
Depending on the specific requirements of a [computer vision task](../tasks/index.md), there are different types of data annotation. Here are some examples:
- **Bounding Boxes**: Rectangular boxes drawn around objects in an image, used primarily for object detection tasks. These boxes are defined by their top-left and bottom-right coordinates.
- **Polygons**: Detailed outlines for objects, allowing for more precise annotation than bounding boxes. Polygons are used in tasks like [instance segmentation](https://www.ultralytics.com/glossary/instance-segmentation), where the shape of the object is important.
- **Masks**: Binary masks where each pixel is either part of an object or the background. Masks are used in semantic segmentation tasks to provide pixel-level detail.
- **Keypoints**: Specific points marked within an image to identify locations of interest. Keypoints are used in tasks like pose estimation and facial landmark detection.
<p align="center">
  <img width="100%" src="https://github.com/ultralytics/docs/releases/download/0/types-of-data-annotation.avif" alt="Types of Data Annotation">
</p>
### Common Annotation Formats
After selecting a type of annotation, it's important to choose the appropriate format for storing and sharing annotations.
Commonly used formats include [COCO](../datasets/detect/coco.md), which supports various annotation types like [object detection](https://www.ultralytics.com/glossary/object-detection), keypoint detection, stuff segmentation, [panoptic segmentation](https://www.ultralytics.com/glossary/panoptic-segmentation), and image captioning, stored in JSON. [Pascal VOC](../datasets/detect/voc.md) uses XML files and is popular for object detection tasks. YOLO, on the other hand, creates a .txt file for each image, containing annotations like object class, coordinates, height, and width, making it suitable for object detection.
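For example, a single object in a YOLO-format `.txt` label file occupies one line: the class index followed by the box's center coordinates, width, and height, all normalized to the image size (the values below are illustrative):
```txt
0 0.481 0.530 0.210 0.360
```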
### Techniques of Annotation
Now, assuming you've chosen a type of annotation and format, it's time to establish clear and objective labeling rules. These rules are like a roadmap for consistency and [accuracy](https://www.ultralytics.com/glossary/accuracy) throughout the annotation process. Key aspects of these rules include:
- **Clarity and Detail**: Make sure your instructions are clear. Use examples and illustrations to show annotators what's expected.
- **Consistency**: Keep your annotations uniform. Set standard criteria for annotating different types of data, so all annotations follow the same rules.
- **Reducing Bias**: Stay neutral. Train yourself to be objective and minimize personal biases to ensure fair annotations.
- **Efficiency**: Work smarter, not harder. Use tools and workflows that automate repetitive tasks, making the annotation process faster and more efficient.
Regularly reviewing and updating your labeling rules will help keep your annotations accurate, consistent, and aligned with your project goals.
### Popular Annotation Tools
Let's say you are ready to annotate now. There are several open-source tools available to help streamline the data annotation process. Here are some useful open annotation tools:
- **[Label Studio](https://github.com/HumanSignal/label-studio)**: A flexible tool that supports a wide range of annotation tasks and includes features for managing projects and quality control.
- **[CVAT](https://github.com/cvat-ai/cvat)**: A powerful tool that supports various annotation formats and customizable workflows, making it suitable for complex projects.
- **[Labelme](https://github.com/wkentaro/labelme)**: A simple and easy-to-use tool that allows for quick annotation of images with polygons, making it ideal for straightforward tasks.
These open-source tools are budget-friendly and provide a range of features to meet different annotation needs.
### Some More Things to Consider Before Annotating Data
Before you dive into annotating your data, there are a few more things to keep in mind. You should be aware of accuracy, [precision](https://www.ultralytics.com/glossary/precision), outliers, and quality control to avoid labeling your data in a counterproductive manner.
#### Understanding Accuracy and Precision
It's important to understand the difference between accuracy and precision and how it relates to annotation. Accuracy refers to how close the annotated data is to the true values. It helps us measure how closely the labels reflect real-world scenarios. Precision indicates the consistency of annotations. It checks if you are giving the same label to the same object or feature throughout the dataset. High accuracy and precision lead to better-trained models by reducing noise and improving the model's ability to generalize from the [training data](https://www.ultralytics.com/glossary/training-data).
<p align="center">
  <img width="100%" src="https://github.com/ultralytics/docs/releases/download/0/example-of-precision.avif" alt="Example of Precision">
</p>
#### Identifying Outliers
Outliers are data points that deviate quite a bit from other observations in the dataset. With respect to annotations, an outlier could be an incorrectly labeled image or an annotation that doesn't fit with the rest of the dataset. Outliers are concerning because they can distort the model's learning process, leading to inaccurate predictions and poor generalization.
You can use various methods to detect and correct outliers:
- **Statistical Techniques**: To detect outliers in numerical features like pixel values, [bounding box](https://www.ultralytics.com/glossary/bounding-box) coordinates, or object sizes, you can use methods such as box plots, histograms, or z-scores (see the sketch after this list).
- **Visual Techniques**: To spot anomalies in categorical features like object classes, colors, or shapes, use visual methods like plotting images, labels, or heat maps.
- **Algorithmic Methods**: Use tools like clustering (e.g., K-means clustering, DBSCAN) and [anomaly detection](https://www.ultralytics.com/glossary/anomaly-detection) algorithms to identify outliers based on data distribution patterns.
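As a minimal sketch of the statistical approach, assuming annotations have already been parsed into normalized box areas, a z-score filter might look like this (all values are hypothetical):
```python
import numpy as np

# Hypothetical normalized box areas (width * height), one per annotation
areas = np.array([0.020, 0.030, 0.025, 0.028, 0.650, 0.022])

# Flag annotations whose area deviates strongly from the rest
z_scores = (areas - areas.mean()) / areas.std()
outlier_indices = np.where(np.abs(z_scores) > 2.0)[0]
print(f"Potential outlier annotations at indices: {outlier_indices}")  # flags the 0.650 box
```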
#### Quality Control of Annotated Data
Just like other technical projects, quality control is a must for annotated data. It is a good practice to regularly check annotations to make sure they are accurate and consistent. This can be done in a few different ways:
- Reviewing samples of annotated data
- Using automated tools to spot common errors
- Having another person double-check the annotations
If you are working with multiple people, consistency between different annotators is important. Good inter-annotator agreement means that the guidelines are clear and everyone is following them the same way. It keeps everyone on the same page and the annotations consistent.
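One common way to quantify inter-annotator agreement is Cohen's kappa. A minimal sketch using scikit-learn, with hypothetical labels from two annotators on the same eight images:
```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical class labels assigned by two annotators to the same 8 images
annotator_a = ["car", "truck", "car", "bus", "car", "truck", "bus", "car"]
annotator_b = ["car", "truck", "car", "car", "car", "truck", "bus", "car"]

# 1.0 means perfect agreement; 0 means agreement no better than chance
print(f"Cohen's kappa: {cohen_kappa_score(annotator_a, annotator_b):.2f}")
```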
While reviewing, if you find errors, correct them and update the guidelines to avoid future mistakes. Provide feedback to annotators and offer regular training to help reduce errors. Having a strong process for handling errors keeps your dataset accurate and reliable.
## Share Your Thoughts with the Community
Bouncing your ideas and queries off other [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) enthusiasts can help accelerate your projects. Here are some great ways to learn, troubleshoot, and network:
### Where to Find Help and Support
- **GitHub Issues:** Visit the YOLO11 GitHub repository and use the [Issues tab](https://github.com/ultralytics/ultralytics/issues) to raise questions, report bugs, and suggest features. The community and maintainers are there to help with any issues you face.
- **Ultralytics Discord Server:** Join the [Ultralytics Discord server](https://discord.com/invite/ultralytics) to connect with other users and developers, get support, share knowledge, and brainstorm ideas.
### Official Documentation
- **Ultralytics YOLO11 Documentation:** Refer to the [official YOLO11 documentation](./index.md) for thorough guides and valuable insights on numerous computer vision tasks and projects.
## Conclusion
By following the best practices for collecting and annotating data, avoiding bias, and using the right tools and techniques, you can significantly improve your model's performance. Engaging with the community and using available resources will keep you informed and help you troubleshoot issues effectively. Remember, quality data is the foundation of a successful project, and the right strategies will help you build robust and reliable models.
## FAQ
### What is the best way to avoid bias in data collection for computer vision projects?
Avoiding bias in data collection ensures that your computer vision model performs well across various scenarios. To minimize bias, consider collecting data from diverse sources to capture different perspectives and scenarios. Ensure balanced representation among all relevant groups, such as different ages, genders, and ethnicities. Regularly review and update your dataset to identify and address any emerging biases. Techniques such as oversampling underrepresented classes, data augmentation, and fairness-aware algorithms can also help mitigate bias. By employing these strategies, you maintain a robust and fair dataset that enhances your model's generalization capability.
### How can I ensure high consistency and accuracy in data annotation?
Ensuring high consistency and accuracy in data annotation involves establishing clear and objective labeling guidelines. Your instructions should be detailed, with examples and illustrations to clarify expectations. Consistency is achieved by setting standard criteria for annotating various data types, ensuring all annotations follow the same rules. To reduce personal biases, train annotators to stay neutral and objective. Regular reviews and updates of labeling rules help maintain accuracy and alignment with project goals. Using automated tools to check for consistency and getting feedback from other annotators also contribute to maintaining high-quality annotations.
### How many images do I need for training Ultralytics YOLO models?
For effective [transfer learning](https://www.ultralytics.com/glossary/transfer-learning) and object detection with Ultralytics YOLO models, start with a minimum of a few hundred annotated objects per class. If training for just one class, begin with at least 100 annotated images and train for approximately 100 [epochs](https://www.ultralytics.com/glossary/epoch). More complex tasks might require thousands of images per class to achieve high reliability and performance. Quality annotations are crucial, so ensure your data collection and annotation processes are rigorous and aligned with your project's specific goals. Explore detailed training strategies in the [YOLO11 training guide](../modes/train.md).
### What are some popular tools for data annotation?
Several popular open-source tools can streamline the data annotation process:
- **[Label Studio](https://github.com/HumanSignal/label-studio)**: A flexible tool supporting various annotation tasks, project management, and quality control features.
- **[CVAT](https://www.cvat.ai/)**: Offers multiple annotation formats and customizable workflows, making it suitable for complex projects.
- **[Labelme](https://github.com/wkentaro/labelme)**: Ideal for quick and straightforward image annotation with polygons.
These tools can help enhance the efficiency and accuracy of your annotation workflows. For extensive feature lists and guides, refer to our [data annotation tools documentation](../datasets/index.md).
### What types of data annotation are commonly used in computer vision?
Different types of data annotation cater to various computer vision tasks:
- **Bounding Boxes**: Used primarily for object detection, these are rectangular boxes around objects in an image.
- **Polygons**: Provide more precise object outlines suitable for instance segmentation tasks.
- **Masks**: Offer pixel-level detail, used in [semantic segmentation](https://www.ultralytics.com/glossary/semantic-segmentation) to differentiate objects from the background.
- **Keypoints**: Identify specific points of interest within an image, useful for tasks like pose estimation and facial landmark detection.
Selecting the appropriate annotation type depends on your project's requirements. Learn more about how to implement these annotations and their formats in our [data annotation guide](#what-is-data-annotation).
description:Learn how to deploy Ultralytics YOLO11 on NVIDIA Jetson devices using TensorRT and DeepStream SDK. Explore performance benchmarks and maximize AI capabilities.
keywords:Ultralytics, YOLO11, NVIDIA Jetson, JetPack, AI deployment, embedded systems, deep learning, TensorRT, DeepStream SDK, computer vision
---
# Ultralytics YOLO11 on NVIDIA Jetson using DeepStream SDK and TensorRT
<p align="center">
  <strong>Watch:</strong> How to Run Multiple Streams with DeepStream SDK on Jetson Nano using Ultralytics YOLO11
</p>
This comprehensive guide provides a detailed walkthrough for deploying Ultralytics YOLO11 on [NVIDIA Jetson](https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/) devices using DeepStream SDK and TensorRT. Here we use TensorRT to maximize the inference performance on the Jetson platform.
<imgwidth="1024"src="https://github.com/ultralytics/docs/releases/download/0/deepstream-nvidia-jetson.avif"alt="DeepStream on NVIDIA Jetson">
!!! note
This guide has been tested with a [Seeed Studio reComputer J4012](https://www.seeedstudio.com/reComputer-J4012-p-5586.html), which is based on the NVIDIA Jetson Orin NX 16GB running JetPack release [JP5.1.3](https://developer.nvidia.com/embedded/jetpack-sdk-513), and a [Seeed Studio reComputer J1020 v2](https://www.seeedstudio.com/reComputer-J1020-v2-p-5498.html), which is based on the NVIDIA Jetson Nano 4GB running JetPack release [JP4.6.4](https://developer.nvidia.com/jetpack-sdk-464). It is expected to work across the entire NVIDIA Jetson hardware lineup, including the latest and legacy devices.
## What is NVIDIA DeepStream?
[NVIDIA's DeepStream SDK](https://developer.nvidia.com/deepstream-sdk) is a complete streaming analytics toolkit based on GStreamer for AI-based multi-sensor processing, video, audio, and image understanding. It's ideal for vision AI developers, software partners, startups, and OEMs building IVA (Intelligent Video Analytics) apps and services. You can now create stream-processing pipelines that incorporate [neural networks](https://www.ultralytics.com/glossary/neural-network-nn) and other complex processing tasks like tracking, video encoding/decoding, and video rendering. These pipelines enable real-time analytics on video, image, and sensor data. DeepStream's multi-platform support gives you a faster, easier way to develop vision AI applications and services on-premise, at the edge, and in the cloud.
## Prerequisites
Before you start to follow this guide:
- Visit our documentation, [Quick Start Guide: NVIDIA Jetson with Ultralytics YOLO11](nvidia-jetson.md) to set up your NVIDIA Jetson device with Ultralytics YOLO11
- Install [DeepStream SDK](https://developer.nvidia.com/deepstream-getting-started) according to the JetPack version
- For JetPack 4.6.4, install [DeepStream 6.0.1](https://docs.nvidia.com/metropolis/deepstream/6.0.1/dev-guide/text/DS_Quickstart.html)
- For JetPack 5.1.3, install [DeepStream 6.3](https://docs.nvidia.com/metropolis/deepstream/6.3/dev-guide/text/DS_Quickstart.html)
!!! tip
In this guide we have used the Debian package method of installing DeepStream SDK to the Jetson device. You can also visit the [DeepStream SDK on Jetson (Archived)](https://developer.nvidia.com/embedded/deepstream-on-jetson-downloads-archived) to access legacy versions of DeepStream.
## DeepStream Configuration for YOLO11
Here we are using [marcoslucianops/DeepStream-Yolo](https://github.com/marcoslucianops/DeepStream-Yolo) GitHub repository which includes NVIDIA DeepStream SDK support for YOLO models. We appreciate the efforts of marcoslucianops for his contributions!
3. Download an Ultralytics YOLO11 detection model (`.pt`) of your choice from the [YOLO11 releases](https://github.com/ultralytics/assets/releases). Here we use [yolov8s.pt](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8s.pt).
Generating the TensorRT engine file before inference starts can take a long time, so please be patient.
<div align="center"><img width="1000" src="https://github.com/ultralytics/docs/releases/download/0/yolov8-with-deepstream.avif" alt="YOLO11 with DeepStream"></div>
!!! tip
If you want to convert the model to FP16 [precision](https://www.ultralytics.com/glossary/precision), simply set `model-engine-file=model_b1_gpu0_fp16.engine` and `network-mode=2` inside `config_infer_primary_yoloV8.txt`.
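For reference, the updated entries in `config_infer_primary_yoloV8.txt` would then read:
```bash
...
model-engine-file=model_b1_gpu0_fp16.engine
...
network-mode=2
...
```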
## INT8 Calibration
If you want to use INT8 precision for inference, follow the steps below:
1. Set `OPENCV` environment variable
```bash
export OPENCV=1
```
2. Compile the library
```bash
make -C nvdsinfer_custom_impl_Yolo clean && make -C nvdsinfer_custom_impl_Yolo
```
3. For the COCO dataset, download the [val2017](http://images.cocodataset.org/zips/val2017.zip) set, extract it, and move it to the `DeepStream-Yolo` folder
4. Make a new directory for calibration images
```bash
mkdir calibration
```
5. Run the following to select 1000 random images from the COCO dataset for calibration
```bash
for jpg in $(ls -1 val2017/*.jpg | sort -R | head -1000); do \
cp ${jpg} calibration/; \
done
```
!!! note
NVIDIA recommends at least 500 images to get good [accuracy](https://www.ultralytics.com/glossary/accuracy). In this example, 1000 images are chosen for better accuracy (more images generally mean more accuracy). You can adjust the count via **head -1000**; for example, use **head -2000** for 2000 images. This process can take a long time.
6. Create the `calibration.txt` file with all selected images
```bash
realpath calibration/*jpg > calibration.txt
```
7. Set environment variables
```bash
export INT8_CALIB_IMG_PATH=calibration.txt
export INT8_CALIB_BATCH_SIZE=1
```
!!! note
Higher INT8_CALIB_BATCH_SIZE values will result in more accuracy and faster calibration speed. Set it according to your GPU memory.
8. Update the `config_infer_primary_yoloV8.txt` file
From
```bash
...
model-engine-file=model_b1_gpu0_fp32.engine
#int8-calib-file=calib.table
...
network-mode=0
...
```
To
```bash
...
model-engine-file=model_b1_gpu0_int8.engine
int8-calib-file=calib.table
...
network-mode=1
...
```
### Run Inference
```bash
deepstream-app -c deepstream_app_config.txt
```
## MultiStream Setup
To set up multiple streams under a single DeepStream application, you can make the following changes to the `deepstream_app_config.txt` file:
1. Change the rows and columns to build a grid display according to the number of streams you want to have. For example, for 4 streams, we can add 2 rows and 2 columns.
```bash
[tiled-display]
rows=2
columns=2
```
2. Set `num-sources=4` and add the `uri` of all 4 streams, as shown in the sketch below.
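A sketch of what the `[source0]` section might look like (the file paths are placeholders for your own streams):
```bash
[source0]
enable=1
type=3
uri=file:///path/to/stream1.mp4
uri=file:///path/to/stream2.mp4
uri=file:///path/to/stream3.mp4
uri=file:///path/to/stream4.mp4
num-sources=4
```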
## Benchmark Results

The following table summarizes how YOLOv8s models perform at different TensorRT precision levels with an input size of 640x640 on an NVIDIA Jetson Orin NX 16GB.

| Model Name | Precision | Inference Time (ms/im) | FPS |
| ---------- | --------- | ---------------------- | --- |
| YOLOv8s    | FP32      | 15.63                  | 64  |
| YOLOv8s    | FP16      | 7.94                   | 126 |
| YOLOv8s    | INT8      | 5.53                   | 181 |
This guide was initially created by our friends at Seeed Studio, Lakshantha and Elaine.
## FAQ
### How do I set up Ultralytics YOLO11 on an NVIDIA Jetson device?
To set up Ultralytics YOLO11 on an [NVIDIA Jetson](https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/) device, you first need to install the [DeepStream SDK](https://developer.nvidia.com/deepstream-getting-started) compatible with your JetPack version. Follow the step-by-step guide in our [Quick Start Guide](nvidia-jetson.md) to configure your NVIDIA Jetson for YOLO11 deployment.
### What is the benefit of using TensorRT with YOLO11 on NVIDIA Jetson?
Using TensorRT with YOLO11 optimizes the model for inference, significantly reducing latency and improving throughput on NVIDIA Jetson devices. TensorRT provides high-performance, low-latency [deep learning](https://www.ultralytics.com/glossary/deep-learning-dl) inference through layer fusion, precision calibration, and kernel auto-tuning. This leads to faster and more efficient execution, particularly useful for real-time applications like video analytics and autonomous machines.
### Can I run Ultralytics YOLO11 with DeepStream SDK across different NVIDIA Jetson hardware?
Yes, the guide for deploying Ultralytics YOLO11 with the DeepStream SDK and TensorRT is compatible across the entire NVIDIA Jetson lineup. This includes devices like the Jetson Orin NX 16GB with [JetPack 5.1.3](https://developer.nvidia.com/embedded/jetpack-sdk-513) and the Jetson Nano 4GB with [JetPack 4.6.4](https://developer.nvidia.com/jetpack-sdk-464). Refer to the section [DeepStream Configuration for YOLO11](#deepstream-configuration-for-yolo11) for detailed steps.
### How can I convert a YOLO11 model to ONNX for DeepStream?
To convert a YOLO11 model to ONNX format for deployment with DeepStream, use the `utils/export_yoloV8.py` script from the [DeepStream-Yolo](https://github.com/marcoslucianops/DeepStream-Yolo) repository.
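An illustrative invocation is shown below; check the DeepStream-Yolo README for the exact flags supported by your version of the script:
```bash
# Run from the DeepStream-Yolo directory with the .pt weights alongside
python3 utils/export_yoloV8.py -w yolov8s.pt
```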
For more details on model conversion, check out our [model export section](../modes/export.md).
### What are the performance benchmarks for YOLO on NVIDIA Jetson Orin NX?
The performance of YOLO11 models on NVIDIA Jetson Orin NX 16GB varies based on TensorRT precision levels. For example, YOLOv8s models achieve:
- **FP32 Precision**: 15.63 ms/im, 64 FPS
- **FP16 Precision**: 7.94 ms/im, 126 FPS
- **INT8 Precision**: 5.53 ms/im, 181 FPS
These benchmarks underscore the efficiency and capability of using TensorRT-optimized YOLO11 models on NVIDIA Jetson hardware. For further details, see our [Benchmark Results](#benchmark-results) section.