---
comments: true
description: Data collection and annotation are vital steps in any computer vision project. Explore the tools, techniques, and best practices for collecting and annotating data.
keywords: What is Data Annotation, Data Annotation Tools, Annotating Data, Avoiding Bias in Data Collection, Ethical Data Collection, Annotation Strategies
---
# Data Collection and Annotation Strategies for Computer Vision
## Introduction
The key to success in any [computer vision project](./steps-of-a-cv-project.md) starts with effective data collection and annotation strategies. The quality of the data directly impacts model performance, so it's important to understand the best practices related to data collection and data annotation.
Every consideration regarding the data should closely align with [your project's goals](./defining-project-goals.md). Changes in your annotation strategies could shift the project's focus or effectiveness and vice versa. With this in mind, let's take a closer look at the best ways to approach data collection and annotation.
## Setting Up Classes and Collecting Data
Collecting images and video for a computer vision project involves defining the number of classes, sourcing data, and considering ethical implications. Before you start gathering your data, you need to be clear about:
### Choosing the Right Classes for Your Project
One of the first questions when starting a computer vision project is how many classes to include. You need to determine the class membership, which involves the different categories or labels that you want your model to recognize and differentiate. The number of classes should be determined by the specific goals of your project.
For example, if you want to monitor traffic, your classes might include "car," "truck," "bus," "motorcycle," and "bicycle." On the other hand, for tracking items in a store, your classes could be "fruits," "vegetables," "beverages," and "snacks." Defining classes based on your project goals helps keep your dataset relevant and focused.
When you define your classes, another important distinction to make is whether to choose coarse or fine class counts. 'Count' refers to the number of distinct classes you are interested in. This decision influences the granularity of your data and the complexity of your model. Here are the considerations for each approach:
- **Coarse Class-Count**: These are broader, more inclusive categories, such as "vehicle" and "non-vehicle." They simplify annotation and require fewer computational resources but provide less detailed information, potentially limiting the model's effectiveness in complex scenarios.
- **Fine Class-Count**: More categories with finer distinctions, such as "sedan," "SUV," "pickup truck," and "motorcycle." They capture more detailed information, improving model accuracy and performance. However, they are more time-consuming and labor-intensive to annotate and require more computational resources.
Something to note is that starting with more specific classes can be very helpful, especially in complex projects where details are important. More specific classes let you collect more detailed data and gain deeper insights and clearer distinctions between categories. Not only does this improve the accuracy of the model, but it also makes it easier to adjust the model later if needed, saving both time and resources.
### Sources of Data
You can use public datasets or gather your own custom data. Public datasets like those on [Kaggle](https://www.kaggle.com/datasets) and [Google Dataset Search Engine](https://datasetsearch.research.google.com/) offer well-annotated, standardized data, making them great starting points for training and validating models.
Custom data collection, on the other hand, allows you to customize your dataset to your specific needs. You might capture images and videos with cameras or drones, scrape the web for images, or use existing internal data from your organization. Custom data gives you more control over its quality and relevance. Combining both public and custom data sources helps create a diverse and comprehensive dataset.
### Avoiding [Bias in](https://www.ultralytics.com/glossary/bias-in-ai) Data Collection
Bias occurs when certain groups or scenarios are underrepresented or overrepresented in your dataset. It leads to a model that performs well on some data but poorly on others. It's crucial to avoid bias so that your computer vision model can perform well in a variety of scenarios.
Here is how you can avoid bias while collecting data:
- **Diverse Sources**: Collect data from many sources to capture different perspectives and scenarios.
- **Balanced Representation**: Include balanced representation from all relevant groups. For example, consider different ages, genders, and ethnicities.
- **Continuous Monitoring**: Regularly review and update your dataset to identify and address any emerging biases.
- **Bias Mitigation Techniques**: Use methods like oversampling underrepresented classes, [data augmentation](https://www.ultralytics.com/glossary/data-augmentation), and fairness-aware algorithms (see the oversampling sketch below).
Following these practices helps create a more robust and fair model that can generalize well in real-world applications.
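For instance, here is a minimal oversampling sketch; the image paths and labels are hypothetical placeholders, and a real project would load them from an annotation file:

```python
import random
from collections import Counter

# Hypothetical (image_path, class_label) pairs
samples = [
    ("img_001.jpg", "car"),
    ("img_002.jpg", "car"),
    ("img_003.jpg", "car"),
    ("img_004.jpg", "bicycle"),
]

counts = Counter(label for _, label in samples)
target = max(counts.values())  # grow every class to the size of the largest one

balanced = list(samples)
for label, count in counts.items():
    if count < target:
        minority = [s for s in samples if s[1] == label]
        # Duplicate random minority samples until the class reaches the target size
        balanced.extend(random.choices(minority, k=target - count))

print(Counter(label for _, label in balanced))  # class counts are now equal
```

Oversampling by duplication is the simplest option; pairing it with data augmentation reduces the risk of the model memorizing repeated images.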
## What is Data Annotation?
Data annotation is the process of labeling data to make it usable for training [machine learning](https://www.ultralytics.com/glossary/machine-learning-ml) models. In computer vision, this means labeling images or videos with the information that a model needs to learn from. Without properly annotated data, models cannot accurately learn the relationships between inputs and outputs.
### Types of Data Annotation
Depending on the specific requirements of a [computer vision task](../tasks/index.md), there are different types of data annotation. Here are some examples:
- **Bounding Boxes**: Rectangular boxes drawn around objects in an image, used primarily for object detection tasks. These boxes are defined by their top-left and bottom-right coordinates.
- **Polygons**: Detailed outlines for objects, allowing for more precise annotation than bounding boxes. Polygons are used in tasks like [instance segmentation](https://www.ultralytics.com/glossary/instance-segmentation), where the shape of the object is important.
- **Masks**: Binary masks where each pixel is either part of an object or the background. Masks are used in semantic segmentation tasks to provide pixel-level detail.
- **Keypoints**: Specific points marked within an image to identify locations of interest. Keypoints are used in tasks like pose estimation and facial landmark detection.
<p align="center">
<img width="100%" src="https://github.com/ultralytics/docs/releases/download/0/types-of-data-annotation.avif" alt="Types of Data Annotation">
</p>
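To make these types concrete, here is an illustrative sketch of how each annotation might be represented; the field names and values are made up for clarity, since real structures vary by tool and format:

```python
# Hypothetical annotation structures for a single image
bounding_box = {"class": "car", "box": [48, 240, 195, 371]}  # [x_min, y_min, x_max, y_max]
polygon = {"class": "person", "points": [(102, 35), (140, 28), (155, 90), (98, 84)]}
keypoints = {"class": "person", "points": {"left_eye": (120, 40), "right_eye": (135, 41)}}

# A binary mask labels every pixel: 1 = object, 0 = background (tiny 3x4 example)
mask = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 1, 0],
]
```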
### Common Annotation Formats
After selecting a type of annotation, it's important to choose the appropriate format for storing and sharing annotations.
Commonly used formats include [COCO](../datasets/detect/coco.md), which supports various annotation types like [object detection](https://www.ultralytics.com/glossary/object-detection), keypoint detection, stuff segmentation, [panoptic segmentation](https://www.ultralytics.com/glossary/panoptic-segmentation), and image captioning, all stored in JSON. [Pascal VOC](../datasets/detect/voc.md) uses XML files and is popular for object detection tasks. YOLO, on the other hand, creates a `.txt` file for each image, containing annotations like object class, coordinates, height, and width, making it well suited for object detection.
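For example, a YOLO label file stores one object per line as `class x_center y_center width height`, with all coordinates normalized by the image dimensions. Here is a minimal sketch of converting a pixel-space box to that row format (the sample values are hypothetical):

```python
def to_yolo(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space box (top-left/bottom-right) to a normalized YOLO row."""
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"

# One line per object, written to one .txt file per image (e.g., image_001.txt)
print(to_yolo(0, 48, 240, 195, 371, img_w=640, img_h=480))
```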
### Techniques of Annotation
Now, assuming you've chosen a type of annotation and format, it's time to establish clear and objective labeling rules. These rules are like a roadmap for consistency and [accuracy](https://www.ultralytics.com/glossary/accuracy) throughout the annotation process. Key aspects of these rules include:
- **Clarity and Detail**: Make sure your instructions are clear. Use examples and illustrations to show annotators exactly what's expected.
- **Consistency**: Keep your annotations uniform. Set standard criteria for annotating different types of data, so all annotations follow the same rules.
- **Reducing Bias**: Stay neutral. Train yourself to be objective and minimize personal biases to ensure fair annotations.
- **Efficiency**: Work smarter, not harder. Use tools and workflows that automate repetitive tasks, making the annotation process faster and more efficient.
Regularly reviewing and updating your labeling rules will help keep your annotations accurate, consistent, and aligned with your project goals.
### Popular Annotation Tools
Let's say you are ready to annotate now. There are several open-source tools available to help streamline the data annotation process. Here are some useful open annotation tools:
- **[Label Studio](https://github.com/HumanSignal/label-studio)**: A flexible tool that supports a wide range of annotation tasks and includes features for managing projects and quality control.
- **[CVAT](https://github.com/cvat-ai/cvat)**: A powerful tool that supports various annotation formats and customizable workflows, making it suitable for complex projects.
- **[Labelme](https://github.com/wkentaro/labelme)**: A simple and easy-to-use tool that allows for quick annotation of images with polygons, making it ideal for straightforward tasks.
<p align="center">
<img width="100%" src="https://github.com/ultralytics/docs/releases/download/0/labelme-instance-segmentation-annotation.avif" alt="LabelMe Overview">
</p>
These open-source tools are budget-friendly and provide a range of features to meet different annotation needs.
### Some More Things to Consider Before Annotating Data
Before you dive into annotating your data, there are a few more things to keep in mind. You should be aware of accuracy, [precision](https://www.ultralytics.com/glossary/precision), outliers, and quality control to avoid labeling your data in a counterproductive manner.
#### Understanding Accuracy and Precision
It's important to understand the difference between accuracy and precision and how it relates to annotation. Accuracy refers to how close the annotated data is to the true values. It helps us measure how closely the labels reflect real-world scenarios. Precision indicates the consistency of annotations. It checks if you are giving the same label to the same object or feature throughout the dataset. High accuracy and precision lead to better-trained models by reducing noise and improving the model's ability to generalize from the [training data](https://www.ultralytics.com/glossary/training-data).
<p align="center">
<img width="100%" src="https://github.com/ultralytics/docs/releases/download/0/example-of-precision.avif" alt="Example of Precision">
</p>
#### Identifying Outliers
Outliers are data points that deviate quite a bit from other observations in the dataset. With respect to annotations, an outlier could be an incorrectly labeled image or an annotation that doesn't fit with the rest of the dataset. Outliers are concerning because they can distort the model's learning process, leading to inaccurate predictions and poor generalization.
You can use various methods to detect and correct outliers:
- **Statistical Techniques**: To detect outliers in numerical features like pixel values, [bounding box](https://www.ultralytics.com/glossary/bounding-box) coordinates, or object sizes, you can use methods such as box plots, histograms, or z-scores (see the sketch after this list).
- **Visual Techniques**: To spot anomalies in categorical features like object classes, colors, or shapes, use visual methods like plotting images, labels, or heat maps.
- **Algorithmic Methods**: Use tools like clustering (e.g., K-means clustering, DBSCAN) and [anomaly detection](https://www.ultralytics.com/glossary/anomaly-detection) algorithms to identify outliers based on data distribution patterns.
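As an example of the statistical approach, here is a minimal z-score sketch; the bounding-box areas are hypothetical:

```python
import statistics

# Hypothetical bounding-box areas (in pixels) from an annotated dataset
areas = [5200, 4800, 5100, 4950, 5300, 98000, 5050, 4700]

mean = statistics.mean(areas)
stdev = statistics.stdev(areas)

# Flag annotations whose z-score exceeds a chosen threshold (here 2)
outliers = [a for a in areas if abs((a - mean) / stdev) > 2]
print(outliers)  # [98000] -- a likely mislabeled or mis-drawn box to review
```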
#### Quality Control of Annotated Data
Just like other technical projects, quality control is a must for annotated data. It is a good practice to regularly check annotations to make sure they are accurate and consistent. This can be done in a few different ways:
- Reviewing samples of annotated data
- Using automated tools to spot common errors
- Having another person double-check the annotations
If you are working with multiple people, consistency between different annotators is important. Good inter-annotator agreement means that the guidelines are clear and everyone is following them the same way. It keeps everyone on the same page and the annotations consistent.
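One common way to quantify inter-annotator agreement is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. Here is a minimal sketch for two annotators labeling the same images; the labels are hypothetical:

```python
from collections import Counter

# Hypothetical class labels assigned by two annotators to the same 10 images
annotator_a = ["car", "car", "truck", "bus", "car", "truck", "car", "bus", "truck", "car"]
annotator_b = ["car", "truck", "truck", "bus", "car", "truck", "car", "car", "truck", "car"]

n = len(annotator_a)
observed = sum(a == b for a, b in zip(annotator_a, annotator_b)) / n

# Chance agreement: probability both annotators independently pick the same class
counts_a, counts_b = Counter(annotator_a), Counter(annotator_b)
expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n**2

kappa = (observed - expected) / (1 - expected)
print(f"Cohen's kappa: {kappa:.2f}")  # 1.0 = perfect agreement, 0.0 = chance level
```

Values above roughly 0.8 are usually read as strong agreement; lower values suggest the guidelines need clarification.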
While reviewing, if you find errors, correct them and update the guidelines to avoid future mistakes. Provide feedback to annotators and offer regular training to help reduce errors. Having a strong process for handling errors keeps your dataset accurate and reliable.
## Share Your Thoughts with the Community
Bouncing your ideas and queries off other [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) enthusiasts can help accelerate your projects. Here are some great ways to learn, troubleshoot, and network:
### Where to Find Help and Support
- **GitHub Issues:** Visit the YOLO11 GitHub repository and use the [Issues tab](https://github.com/ultralytics/ultralytics/issues) to raise questions, report bugs, and suggest features. The community and maintainers are there to help with any issues you face.
- **Ultralytics Discord Server:** Join the [Ultralytics Discord server](https://discord.com/invite/ultralytics) to connect with other users and developers, get support, share knowledge, and brainstorm ideas.
### Official Documentation
- **Ultralytics YOLO11 Documentation:** Refer to the [official YOLO11 documentation](./index.md) for thorough guides and valuable insights on numerous computer vision tasks and projects.
## Conclusion
By following the best practices for collecting and annotating data, avoiding bias, and using the right tools and techniques, you can significantly improve your model's performance. Engaging with the community and using available resources will keep you informed and help you troubleshoot issues effectively. Remember, quality data is the foundation of a successful project, and the right strategies will help you build robust and reliable models.
## FAQ
### What is the best way to avoid bias in data collection for computer vision projects?
Avoiding bias in data collection ensures that your computer vision model performs well across various scenarios. To minimize bias, consider collecting data from diverse sources to capture different perspectives and scenarios. Ensure balanced representation among all relevant groups, such as different ages, genders, and ethnicities. Regularly review and update your dataset to identify and address any emerging biases. Techniques such as oversampling underrepresented classes, data augmentation, and fairness-aware algorithms can also help mitigate bias. By employing these strategies, you maintain a robust and fair dataset that enhances your model's generalization capability.
### How can I ensure high consistency and accuracy in data annotation?
Ensuring high consistency and accuracy in data annotation involves establishing clear and objective labeling guidelines. Your instructions should be detailed, with examples and illustrations to clarify expectations. Consistency is achieved by setting standard criteria for annotating various data types, ensuring all annotations follow the same rules. To reduce personal biases, train annotators to stay neutral and objective. Regular reviews and updates of labeling rules help maintain accuracy and alignment with project goals. Using automated tools to check for consistency and getting feedback from other annotators also contribute to maintaining high-quality annotations.
### How many images do I need for training Ultralytics YOLO models?
For effective [transfer learning](https://www.ultralytics.com/glossary/transfer-learning) and object detection with Ultralytics YOLO models, start with a minimum of a few hundred annotated objects per class. If training for just one class, begin with at least 100 annotated images and train for approximately 100 [epochs](https://www.ultralytics.com/glossary/epoch). More complex tasks might require thousands of images per class to achieve high reliability and performance. Quality annotations are crucial, so ensure your data collection and annotation processes are rigorous and aligned with your project's specific goals. Explore detailed training strategies in the [YOLO11 training guide](../modes/train.md).
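As a minimal sketch of such a training run, assuming `path/to/data.yaml` is a placeholder for your own dataset configuration:

```python
from ultralytics import YOLO

# Start from pre-trained weights so transfer learning can leverage prior knowledge
model = YOLO("yolo11n.pt")

# Train for ~100 epochs on a custom dataset described by a YAML file
model.train(data="path/to/data.yaml", epochs=100, imgsz=640)
```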
### What are some popular tools for data annotation?
Several popular open-source tools can streamline the data annotation process:
- **[Label Studio](https://github.com/HumanSignal/label-studio)**: A flexible tool supporting various annotation tasks, project management, and quality control features.
- **[CVAT](https://www.cvat.ai/)**: Offers multiple annotation formats and customizable workflows, making it suitable for complex projects.
- **[Labelme](https://github.com/wkentaro/labelme)**: Ideal for quick and straightforward image annotation with polygons.
These tools can help enhance the efficiency and accuracy of your annotation workflows. For extensive feature lists and guides, refer to our [data annotation tools documentation](../datasets/index.md).
### What types of data annotation are commonly used in computer vision?
Different types of data annotation cater to various computer vision tasks:
- **Bounding Boxes**: Used primarily for object detection, these are rectangular boxes around objects in an image.
- **Polygons**: Provide more precise object outlines suitable for instance segmentation tasks.
- **Masks**: Offer pixel-level detail, used in [semantic segmentation](https://www.ultralytics.com/glossary/semantic-segmentation) to differentiate objects from the background.
- **Keypoints**: Identify specific points of interest within an image, useful for tasks like pose estimation and facial landmark detection.
Selecting the appropriate annotation type depends on your project's requirements. Learn more about how to implement these annotations and their formats in our [data annotation guide](#what-is-data-annotation).
---
comments: true
description: Learn how to deploy Ultralytics YOLO11 on NVIDIA Jetson devices using TensorRT and DeepStream SDK. Explore performance benchmarks and maximize AI capabilities.
keywords: Ultralytics, YOLO11, NVIDIA Jetson, JetPack, AI deployment, embedded systems, deep learning, TensorRT, DeepStream SDK, computer vision
---
# Ultralytics YOLO11 on NVIDIA Jetson using DeepStream SDK and TensorRT
<p align="center">
<br>
<iframe loading="lazy" width="720" height="405" src="https://www.youtube.com/embed/wWmXKIteRLA"
title="YouTube video player" frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
allowfullscreen>
</iframe>
<br>
<strong>Watch:</strong> How to Run Multiple Streams with DeepStream SDK on Jetson Nano using Ultralytics YOLO11
</p>
This comprehensive guide provides a detailed walkthrough for deploying Ultralytics YOLO11 on [NVIDIA Jetson](https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/) devices using DeepStream SDK and TensorRT. Here we use TensorRT to maximize the inference performance on the Jetson platform.
<img width="1024" src="https://github.com/ultralytics/docs/releases/download/0/deepstream-nvidia-jetson.avif" alt="DeepStream on NVIDIA Jetson">
!!! note
This guide has been tested with the [Seeed Studio reComputer J4012](https://www.seeedstudio.com/reComputer-J4012-p-5586.html), which is based on the NVIDIA Jetson Orin NX 16GB running JetPack release [JP5.1.3](https://developer.nvidia.com/embedded/jetpack-sdk-513), and the [Seeed Studio reComputer J1020 v2](https://www.seeedstudio.com/reComputer-J1020-v2-p-5498.html), which is based on the NVIDIA Jetson Nano 4GB running JetPack release [JP4.6.4](https://developer.nvidia.com/jetpack-sdk-464). It is expected to work across the entire NVIDIA Jetson hardware lineup, including the latest and legacy devices.
## What is NVIDIA DeepStream?
[NVIDIA's DeepStream SDK](https://developer.nvidia.com/deepstream-sdk) is a complete streaming analytics toolkit based on GStreamer for AI-based multi-sensor processing, video, audio, and image understanding. It's ideal for vision AI developers, software partners, startups, and OEMs building IVA (Intelligent Video Analytics) apps and services. You can now create stream-processing pipelines that incorporate [neural networks](https://www.ultralytics.com/glossary/neural-network-nn) and other complex processing tasks like tracking, video encoding/decoding, and video rendering. These pipelines enable real-time analytics on video, image, and sensor data. DeepStream's multi-platform support gives you a faster, easier way to develop vision AI applications and services on-premise, at the edge, and in the cloud.
## Prerequisites
Before you start to follow this guide:
- Visit our documentation, [Quick Start Guide: NVIDIA Jetson with Ultralytics YOLO11](nvidia-jetson.md) to set up your NVIDIA Jetson device with Ultralytics YOLO11
- Install [DeepStream SDK](https://developer.nvidia.com/deepstream-getting-started) according to the JetPack version
- For JetPack 4.6.4, install [DeepStream 6.0.1](https://docs.nvidia.com/metropolis/deepstream/6.0.1/dev-guide/text/DS_Quickstart.html)
- For JetPack 5.1.3, install [DeepStream 6.3](https://docs.nvidia.com/metropolis/deepstream/6.3/dev-guide/text/DS_Quickstart.html)
!!! tip
In this guide we have used the Debian package method of installing DeepStream SDK to the Jetson device. You can also visit the [DeepStream SDK on Jetson (Archived)](https://developer.nvidia.com/embedded/deepstream-on-jetson-downloads-archived) to access legacy versions of DeepStream.
## DeepStream Configuration for YOLO11
Here we are using the [marcoslucianops/DeepStream-Yolo](https://github.com/marcoslucianops/DeepStream-Yolo) GitHub repository, which includes NVIDIA DeepStream SDK support for YOLO models. We appreciate marcoslucianops' contributions!
1. Install dependencies
```bash
pip install cmake
pip install onnxsim
```
2. Clone the following repository
```bash
git clone https://github.com/marcoslucianops/DeepStream-Yolo
cd DeepStream-Yolo
```
3. Download an Ultralytics YOLO detection model (`.pt`) of your choice from the [YOLO11 releases](https://github.com/ultralytics/assets/releases). Here we use [yolov8s.pt](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8s.pt).
```bash
wget https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8s.pt
```
!!! note
You can also use a [custom trained YOLO11 model](https://docs.ultralytics.com/modes/train/).
4. Convert model to ONNX
```bash
python3 utils/export_yoloV8.py -w yolov8s.pt
```
!!! note "Pass the below arguments to the above command"
For DeepStream 6.0.1, use opset 12 or lower. The default opset is 16.
```bash
--opset 12
```
To change the inference size (default: 640)
```bash
-s SIZE
--size SIZE
-s HEIGHT WIDTH
--size HEIGHT WIDTH
```
Example for 1280:
```bash
-s 1280
or
-s 1280 1280
```
To simplify the ONNX model (DeepStream >= 6.0)
```bash
--simplify
```
To use dynamic batch-size (DeepStream >= 6.1)
```bash
--dynamic
```
To use static batch-size (example for batch-size = 4)
```bash
--batch 4
```
5. Set the CUDA version according to the JetPack version installed
For JetPack 4.6.4:
```bash
export CUDA_VER=10.2
```
For JetPack 5.1.3:
```bash
export CUDA_VER=11.4
```
6. Compile the library
```bash
make -C nvdsinfer_custom_impl_Yolo clean && make -C nvdsinfer_custom_impl_Yolo
```
7. Edit the `config_infer_primary_yoloV8.txt` file according to your model (for YOLOv8s with 80 classes)
```bash
[property]
...
onnx-file=yolov8s.onnx
...
num-detected-classes=80
...
```
8. Edit the `deepstream_app_config` file
```bash
...
[primary-gie]
...
config-file=config_infer_primary_yoloV8.txt
```
9. You can also change the video source in the `deepstream_app_config` file. Here, a default video file is loaded:
```bash
...
[source0]
...
uri=file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4
```
### Run Inference
```bash
deepstream-app -c deepstream_app_config.txt
```
!!! note
Generating the TensorRT engine file before inference starts can take a long time, so please be patient.
<div align="center"><img width="1000" src="https://github.com/ultralytics/docs/releases/download/0/yolov8-with-deepstream.avif" alt="YOLO11 with deepstream"></div>
!!! tip
If you want to convert the model to FP16 [precision](https://www.ultralytics.com/glossary/precision), simply set `model-engine-file=model_b1_gpu0_fp16.engine` and `network-mode=2` inside `config_infer_primary_yoloV8.txt`
## INT8 Calibration
If you want to use INT8 precision for inference, you need to follow the steps below:
1. Set `OPENCV` environment variable
```bash
export OPENCV=1
```
2. Compile the library
```bash
make -C nvdsinfer_custom_impl_Yolo clean && make -C nvdsinfer_custom_impl_Yolo
```
3. For COCO dataset, download the [val2017](http://images.cocodataset.org/zips/val2017.zip), extract, and move to `DeepStream-Yolo` folder
4. Make a new directory for calibration images
```bash
mkdir calibration
```
5. Run the following to select 1000 random images from COCO dataset to run calibration
```bash
for jpg in $(ls -1 val2017/*.jpg | sort -R | head -1000); do \
cp ${jpg} calibration/; \
done
```
!!! note
NVIDIA recommends at least 500 images to get good [accuracy](https://www.ultralytics.com/glossary/accuracy). In this example, 1000 images are chosen for better accuracy (more images generally improve calibration accuracy). You can adjust the count via **head -1000**; for example, use **head -2000** for 2000 images. This process can take a long time.
6. Create the `calibration.txt` file with all selected images
```bash
realpath calibration/*jpg > calibration.txt
```
7. Set environment variables
```bash
export INT8_CALIB_IMG_PATH=calibration.txt
export INT8_CALIB_BATCH_SIZE=1
```
!!! note
Higher INT8_CALIB_BATCH_SIZE values result in better accuracy and faster calibration. Set it according to your GPU memory.
8. Update the `config_infer_primary_yoloV8.txt` file
From
```bash
...
model-engine-file=model_b1_gpu0_fp32.engine
#int8-calib-file=calib.table
...
network-mode=0
...
```
To
```bash
...
model-engine-file=model_b1_gpu0_int8.engine
int8-calib-file=calib.table
...
network-mode=1
...
```
### Run Inference
```bash
deepstream-app -c deepstream_app_config.txt
```
## MultiStream Setup
To set up multiple streams under a single DeepStream application, make the following changes to the `deepstream_app_config.txt` file:
1. Change the rows and columns to build a grid display according to the number of streams you want to have. For example, for 4 streams, use 2 rows and 2 columns.
```bash
[tiled-display]
rows=2
columns=2
```
2. Set `num-sources=4` and add the `uri` of each of the 4 streams
```bash
[source0]
enable=1
type=3
uri=<path_to_video>
uri=<path_to_video>
uri=<path_to_video>
uri=<path_to_video>
num-sources=4
```
### Run Inference
```bash
deepstream-app -c deepstream_app_config.txt
```
<div align="center"><img width="1000" src="https://github.com/ultralytics/docs/releases/download/0/multistream-setup.avif" alt="Multistream setup"></div>
## Benchmark Results
The following table summarizes how YOLOv8s models perform at different TensorRT precision levels with an input size of 640x640 on NVIDIA Jetson Orin NX 16GB.
| Model Name | Precision | Inference Time (ms/im) | FPS |
| ---------- | --------- | ---------------------- | --- |
| YOLOv8s | FP32 | 15.63 | 64 |
| | FP16 | 7.94 | 126 |
| | INT8 | 5.53 | 181 |
### Acknowledgements
This guide was initially created by our friends at Seeed Studio, Lakshantha and Elaine.
## FAQ
### How do I set up Ultralytics YOLO11 on an NVIDIA Jetson device?
To set up Ultralytics YOLO11 on an [NVIDIA Jetson](https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/) device, you first need to install the [DeepStream SDK](https://developer.nvidia.com/deepstream-getting-started) compatible with your JetPack version. Follow the step-by-step guide in our [Quick Start Guide](nvidia-jetson.md) to configure your NVIDIA Jetson for YOLO11 deployment.
### What is the benefit of using TensorRT with YOLO11 on NVIDIA Jetson?
Using TensorRT with YOLO11 optimizes the model for inference, significantly reducing latency and improving throughput on NVIDIA Jetson devices. TensorRT provides high-performance, low-latency [deep learning](https://www.ultralytics.com/glossary/deep-learning-dl) inference through layer fusion, precision calibration, and kernel auto-tuning. This leads to faster and more efficient execution, particularly useful for real-time applications like video analytics and autonomous machines.
### Can I run Ultralytics YOLO11 with DeepStream SDK across different NVIDIA Jetson hardware?
Yes, the guide for deploying Ultralytics YOLO11 with the DeepStream SDK and TensorRT is compatible across the entire NVIDIA Jetson lineup. This includes devices like the Jetson Orin NX 16GB with [JetPack 5.1.3](https://developer.nvidia.com/embedded/jetpack-sdk-513) and the Jetson Nano 4GB with [JetPack 4.6.4](https://developer.nvidia.com/jetpack-sdk-464). Refer to the section [DeepStream Configuration for YOLO11](#deepstream-configuration-for-yolo11) for detailed steps.
### How can I convert a YOLO11 model to ONNX for DeepStream?
To convert a YOLO11 model to ONNX format for deployment with DeepStream, use the `utils/export_yoloV8.py` script from the [DeepStream-Yolo](https://github.com/marcoslucianops/DeepStream-Yolo) repository.
Here's an example command:
```bash
python3 utils/export_yoloV8.py -w yolov8s.pt --opset 12 --simplify
```
For more details on model conversion, check out our [model export section](../modes/export.md).
### What are the performance benchmarks for YOLO on NVIDIA Jetson Orin NX?
The performance of YOLO11 models on NVIDIA Jetson Orin NX 16GB varies based on TensorRT precision levels. For example, YOLOv8s models achieve:
- **FP32 Precision**: 15.63 ms/im, 64 FPS
- **FP16 Precision**: 7.94 ms/im, 126 FPS
- **INT8 Precision**: 5.53 ms/im, 181 FPS
These benchmarks underscore the efficiency and capability of using TensorRT-optimized YOLO11 models on NVIDIA Jetson hardware. For further details, see our [Benchmark Results](#benchmark-results) section.
---
comments: true
description: Learn how to define clear goals and objectives for your computer vision project with our practical guide. Includes tips on problem statements, measurable objectives, and key decisions.
keywords: computer vision, project planning, problem statement, measurable objectives, dataset preparation, model selection, YOLO11, Ultralytics
---
# A Practical Guide for Defining Your [Computer Vision](https://www.ultralytics.com/glossary/computer-vision-cv) Project
## Introduction
The first step in any computer vision project is defining what you want to achieve. It's crucial to have a clear roadmap from the start, which includes everything from data collection to deploying your model.
If you need a quick refresher on the basics of a computer vision project, take a moment to read our guide on [the key steps in a computer vision project](./steps-of-a-cv-project.md). It'll give you a solid overview of the whole process. Once you're caught up, come back here to dive into how exactly you can define and refine the goals for your project.
Now, let's get to the heart of defining a clear problem statement for your project and exploring the key decisions you'll need to make along the way.
## Defining A Clear Problem Statement
Setting clear goals and objectives for your project is the first big step toward finding the most effective solutions. Let's understand how you can clearly define your project's problem statement:
- **Identify the Core Issue:** Pinpoint the specific challenge your computer vision project aims to solve.
- **Determine the Scope:** Define the boundaries of your problem.
- **Consider End Users and Stakeholders:** Identify who will be affected by the solution.
- **Analyze Project Requirements and Constraints:** Assess available resources (time, budget, personnel) and identify any technical or regulatory constraints.
### Example of a Business Problem Statement
Let's walk through an example.
Consider a computer vision project where you want to [estimate the speed of vehicles](./speed-estimation.md) on a highway. The core issue is that current speed monitoring methods are inefficient and error-prone due to outdated radar systems and manual processes. The project aims to develop a real-time computer vision system that can replace legacy [speed estimation](https://www.ultralytics.com/blog/ultralytics-yolov8-for-speed-estimation-in-computer-vision-projects) systems.
<p align="center">
<img width="100%" src="https://github.com/ultralytics/docs/releases/download/0/speed-estimation-using-yolov8.avif" alt="Speed Estimation Using YOLO11">
</p>
Primary users include traffic management authorities and law enforcement, while secondary stakeholders are highway planners and the public benefiting from safer roads. Key requirements involve evaluating budget, time, and personnel, as well as addressing technical needs like high-resolution cameras and real-time data processing. Additionally, regulatory constraints on privacy and [data security](https://www.ultralytics.com/glossary/data-security) must be considered.
### Setting Measurable Objectives
Setting measurable objectives is key to the success of a computer vision project. These goals should be clear, achievable, and time-bound.
For example, if you are developing a system to estimate vehicle speeds on a highway, you could consider the following measurable objectives:
- To achieve at least 95% [accuracy](https://www.ultralytics.com/glossary/accuracy) in speed detection within six months, using a dataset of 10,000 vehicle images.
- The system should be able to process real-time video feeds at 30 frames per second with minimal delay.
By setting specific and quantifiable goals, you can effectively track progress, identify areas for improvement, and ensure the project stays on course.
## The Connection Between The Problem Statement and The Computer Vision Tasks
Your problem statement helps you conceptualize which computer vision task can solve your issue.
For example, if your problem is monitoring vehicle speeds on a highway, the relevant computer vision task is object tracking. [Object tracking](../modes/track.md) is suitable because it allows the system to continuously follow each vehicle in the video feed, which is crucial for accurately calculating their speeds.
<p align="center">
<img width="100%" src="https://github.com/ultralytics/docs/releases/download/0/example-of-object-tracking.avif" alt="Example of Object Tracking">
</p>
Other tasks, like [object detection](../tasks/detect.md), are not suitable as they don't provide continuous location or movement information. Once you've identified the appropriate computer vision task, it guides several critical aspects of your project, like model selection, dataset preparation, and model training approaches.
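As a rough sketch of how tracking feeds a speed estimate, the snippet below follows each vehicle's bounding-box centroid across frames; the video path and the pixels-to-meters factor are hypothetical calibration values, and a real system would need proper perspective calibration:

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")  # pre-trained detection model

prev_positions = {}  # track_id -> previous centroid (x, y)
METERS_PER_PIXEL, FPS = 0.05, 30  # hypothetical calibration values

# persist=True keeps track IDs across frames; stream=True yields one result per frame
for result in model.track("highway.mp4", persist=True, stream=True):
    if result.boxes.id is None:
        continue
    for box, track_id in zip(result.boxes.xywh, result.boxes.id.int().tolist()):
        x, y = float(box[0]), float(box[1])  # box centroid
        if track_id in prev_positions:
            px, py = prev_positions[track_id]
            pixels_moved = ((x - px) ** 2 + (y - py) ** 2) ** 0.5
            speed_mps = pixels_moved * METERS_PER_PIXEL * FPS  # displacement/frame * fps
            print(f"vehicle {track_id}: ~{speed_mps * 3.6:.1f} km/h")
        prev_positions[track_id] = (x, y)
```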
## Which Comes First: Model Selection, Dataset Preparation, or Model Training Approach?
The order of model selection, dataset preparation, and training approach depends on the specifics of your project. Here are a few tips to help you decide:
- **Clear Understanding of the Problem**: If your problem and objectives are well-defined, start with model selection. Then, prepare your dataset and decide on the training approach based on the model's requirements.
- **Example**: Start by selecting a model for a traffic monitoring system that estimates vehicle speeds. Choose an object tracking model, gather and annotate highway videos, and then train the model with techniques for real-time video processing.
- **Unique or Limited Data**: If your project is constrained by unique or limited data, begin with dataset preparation. For instance, if you have a rare dataset of medical images, annotate and prepare the data first. Then, select a model that performs well on such data, followed by choosing a suitable training approach.
- **Example**: Prepare the data first for a facial recognition system with a small dataset. Annotate it, then select a model that works well with limited data, such as a pre-trained model for [transfer learning](https://www.ultralytics.com/glossary/transfer-learning). Finally, decide on a training approach, including [data augmentation](https://www.ultralytics.com/glossary/data-augmentation), to expand the dataset.
- **Need for Experimentation**: In projects where experimentation is crucial, start with the training approach. This is common in research projects where you might initially test different training techniques. Refine your model selection after identifying a promising method and prepare the dataset based on your findings.
- **Example**: In a project exploring new methods for detecting manufacturing defects, start with experimenting on a small data subset. Once you find a promising technique, select a model tailored to those findings and prepare a comprehensive dataset.
## Common Discussion Points in the Community
Next, let's look at a few common discussion points in the community regarding computer vision tasks and project planning.
### What Are the Different Computer Vision Tasks?
The most popular computer vision tasks include [image classification](https://www.ultralytics.com/glossary/image-classification), [object detection](https://www.ultralytics.com/glossary/object-detection), and [image segmentation](https://www.ultralytics.com/glossary/image-segmentation).
<p align="center">
<img width="100%" src="https://github.com/ultralytics/docs/releases/download/0/image-classification-vs-object-detection-vs-image-segmentation.avif" alt="Overview of Computer Vision Tasks">
</p>
For a detailed explanation of various tasks, please take a look at the Ultralytics Docs page on [YOLO11 Tasks](../tasks/index.md).
### Can a Pre-trained Model Remember Classes It Knew Before Custom Training?
No, pre-trained models don't "remember" classes in the traditional sense. They learn patterns from massive datasets, and during custom training (fine-tuning), these patterns are adjusted for your specific task. The model's capacity is limited, and focusing on new information can overwrite some of what was previously learned.
<p align="center">
<img width="100%" src="https://github.com/ultralytics/docs/releases/download/0/overview-of-transfer-learning.avif" alt="Overview of Transfer Learning">
</p>
If you want to use the classes the model was pre-trained on, a practical approach is to use two models: one retains the original performance, and the other is fine-tuned for your specific task. This way, you can combine the outputs of both models. There are other options like freezing layers, using the pre-trained model as a feature extractor, and task-specific branching, but these are more complex solutions and require more expertise.
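A minimal sketch of the two-model approach; the custom weights path is a hypothetical placeholder:

```python
from ultralytics import YOLO

# One model retains the original pre-trained classes, the other is fine-tuned on custom data
pretrained = YOLO("yolo11n.pt")
custom = YOLO("path/to/custom_model.pt")  # hypothetical fine-tuned weights

detections = []
for model in (pretrained, custom):
    result = model("scene.jpg")[0]  # first (and only) result for a single image
    for box in result.boxes:
        detections.append((result.names[int(box.cls)], float(box.conf), box.xyxy[0].tolist()))

# Combined detections from both models; overlapping duplicates can be filtered with IoU/NMS
for name, conf, xyxy in detections:
    print(f"{name}: {conf:.2f} at {xyxy}")
```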
### How Do Deployment Options Affect My Computer Vision Project?
[Model deployment options](./model-deployment-options.md) critically impact the performance of your computer vision project. For instance, the deployment environment must handle the computational load of your model. Here are some practical examples:
- **Edge Devices**: Deploying on edge devices like smartphones or IoT devices requires lightweight models due to their limited computational resources. Example technologies include [TensorFlow Lite](../integrations/tflite.md) and [ONNX Runtime](../integrations/onnx.md), which are optimized for such environments.
- **Cloud Servers**: Cloud deployments can handle more complex models with larger computational demands. Cloud platforms like [AWS](../integrations/amazon-sagemaker.md), Google Cloud, and Azure offer robust hardware options that can scale based on the project's needs.
- **On-Premise Servers**: For scenarios requiring high [data privacy](https://www.ultralytics.com/glossary/data-privacy) and security, deploying on-premise might be necessary. This involves significant upfront hardware investment but allows full control over the data and infrastructure.
- **Hybrid Solutions**: Some projects might benefit from a hybrid approach, where some processing is done on the edge, while more complex analyses are offloaded to the cloud. This can balance performance needs with cost and latency considerations.
Each deployment option offers different benefits and challenges, and the choice depends on specific project requirements like performance, cost, and security.
## Connecting with the Community
Connecting with other computer vision enthusiasts can be incredibly helpful for your projects by providing support, solutions, and new ideas. Here are some great ways to learn, troubleshoot, and network:
### Community Support Channels
- **GitHub Issues:** Head over to the YOLO11 GitHub repository. You can use the [Issues tab](https://github.com/ultralytics/ultralytics/issues) to raise questions, report bugs, and suggest features. The community and maintainers can assist with specific problems you encounter.
- **Ultralytics Discord Server:** Become part of the [Ultralytics Discord server](https://discord.com/invite/ultralytics). Connect with fellow users and developers, seek support, exchange knowledge, and discuss ideas.
### Comprehensive Guides and Documentation
- **Ultralytics YOLO11 Documentation:** Explore the [official YOLO11 documentation](./index.md) for in-depth guides and valuable tips on various computer vision tasks and projects.
## Conclusion
Defining a clear problem and setting measurable goals is key to a successful computer vision project. We've highlighted the importance of being clear and focused from the start. Having specific goals helps avoid oversights. Also, staying connected with others in the community through platforms like GitHub or Discord is important for learning and staying current. In short, good planning and community engagement are a huge part of successful computer vision projects.
## FAQ
### How do I define a clear problem statement for my Ultralytics computer vision project?
To define a clear problem statement for your Ultralytics computer vision project, follow these steps:
1. **Identify the Core Issue:** Pinpoint the specific challenge your project aims to solve.
2. **Determine the Scope:** Clearly outline the boundaries of your problem.
3. **Consider End Users and Stakeholders:** Identify who will be affected by your solution.
4. **Analyze Project Requirements and Constraints:** Assess available resources and any technical or regulatory limitations.
Providing a well-defined problem statement ensures that the project remains focused and aligned with your objectives. For a detailed guide, refer to our [practical guide](#defining-a-clear-problem-statement).
### Why should I use Ultralytics YOLO11 for speed estimation in my computer vision project?
Ultralytics YOLO11 is ideal for speed estimation because of its real-time object tracking capabilities, high accuracy, and robust performance in detecting and monitoring vehicle speeds. It overcomes inefficiencies and inaccuracies of traditional radar systems by leveraging cutting-edge computer vision technology. Check out our blog on [speed estimation using YOLO11](https://www.ultralytics.com/blog/ultralytics-yolov8-for-speed-estimation-in-computer-vision-projects) for more insights and practical examples.
### How do I set effective measurable objectives for my computer vision project with Ultralytics YOLO11?
Set effective and measurable objectives using the SMART criteria:
- **Specific:** Define clear and detailed goals.
- **Measurable:** Ensure objectives are quantifiable.
- **Achievable:** Set realistic targets within your capabilities.
- **Relevant:** Align objectives with your overall project goals.
- **Time-bound:** Set deadlines for each objective.
For example, "Achieve 95% accuracy in speed detection within six months using a 10,000 vehicle image dataset." This approach helps track progress and identifies areas for improvement. Read more about [setting measurable objectives](#setting-measurable-objectives).
### How do deployment options affect the performance of my Ultralytics YOLO models?
Deployment options critically impact the performance of your Ultralytics YOLO models. Here are key options:
- **Edge Devices:** Use lightweight models like [TensorFlow](https://www.ultralytics.com/glossary/tensorflow) Lite or ONNX Runtime for deployment on devices with limited resources.
- **Cloud Servers:** Utilize robust cloud platforms like AWS, Google Cloud, or Azure for handling complex models.
- **On-Premise Servers:** High data privacy and security needs may require on-premise deployments.
- **Hybrid Solutions:** Combine edge and cloud approaches for balanced performance and cost-efficiency.
For more information, refer to our [detailed guide on model deployment options](./model-deployment-options.md).
### What are the most common challenges in defining the problem for a computer vision project with Ultralytics?
Common challenges include:
- Vague or overly broad problem statements.
- Unrealistic objectives.
- Lack of stakeholder alignment.
- Insufficient understanding of technical constraints.
- Underestimating data requirements.
Address these challenges through thorough initial research, clear communication with stakeholders, and iterative refinement of the problem statement and objectives. Learn more about these challenges in our [Computer Vision Project guide](steps-of-a-cv-project.md).
---
comments: true
description: Learn how to calculate distances between objects using Ultralytics YOLO11 for accurate spatial positioning and scene understanding.
keywords: Ultralytics, YOLO11, distance calculation, computer vision, object tracking, spatial positioning
---
# Distance Calculation using Ultralytics YOLO11
## What is Distance Calculation?
Distance calculation is the process of measuring the gap between two objects within a specified space. In the case of [Ultralytics YOLO11](https://github.com/ultralytics/ultralytics), the [bounding box](https://www.ultralytics.com/glossary/bounding-box) centroid is used to calculate the distance between the bounding boxes selected by the user.
<p align="center">
<br>
<iframe loading="lazy" width="720" height="405" src="https://www.youtube.com/embed/LE8am1QoVn4"
title="YouTube video player" frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
allowfullscreen>
</iframe>
<br>
<strong>Watch:</strong> Distance Calculation using Ultralytics YOLO11
</p>
## Visuals
| Distance Calculation using Ultralytics YOLO11 |
| :---------------------------------------------------------------------------------------------------------------------------: |
| ![Ultralytics YOLO11 Distance Calculation](https://github.com/ultralytics/docs/releases/download/0/distance-calculation.avif) |
## Advantages of Distance Calculation
- **Localization [Precision](https://www.ultralytics.com/glossary/precision):** Enables accurate spatial positioning in [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) tasks.
- **Size Estimation:** Allows estimation of object size for better contextual understanding.
???+ tip "Distance Calculation"
- Click on any two bounding boxes with Left Mouse click for distance calculation
!!! example "Distance Calculation using YOLO11 Example"
=== "Video Stream"
```python
import cv2
from ultralytics import solutions
cap = cv2.VideoCapture("Path/to/video/file.mp4")
assert cap.isOpened(), "Error reading video file"
w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))
# Video writer
video_writer = cv2.VideoWriter("distance_calculation.avi", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
# Init distance-calculation obj
distance = solutions.DistanceCalculation(model="yolo11n.pt", show=True)
# Process video
while cap.isOpened():
success, im0 = cap.read()
if not success:
print("Video frame is empty or video processing has been successfully completed.")
break
im0 = distance.calculate(im0)
video_writer.write(im0)
cap.release()
video_writer.release()
cv2.destroyAllWindows()
```
???+ note
- Mouse Right Click will delete all drawn points
- Mouse Left Click can be used to draw points
???+ warning "Distance is Estimate"
Distance will be an estimate and may not be fully accurate, as it is calculated using 2-dimensional data, which lacks information about the object's depth.
### Arguments `DistanceCalculation()`
| `Name` | `Type` | `Default` | Description |
| ------------ | ------ | --------- | ---------------------------------------------------- |
| `model` | `str` | `None` | Path to Ultralytics YOLO Model File |
| `line_width` | `int` | `2` | Line thickness for bounding boxes. |
| `show` | `bool` | `False` | Flag to control whether to display the video stream. |
### Arguments `model.track`
{% include "macros/track-args.md" %}
## FAQ
### How do I calculate distances between objects using Ultralytics YOLO11?
To calculate distances between objects using [Ultralytics YOLO11](https://github.com/ultralytics/ultralytics), you need to identify the bounding box centroids of the detected objects. This process involves initializing the `DistanceCalculation` class from Ultralytics' `solutions` module and using the model's tracking outputs to calculate the distances. You can refer to the implementation in the [distance calculation example](#distance-calculation-using-ultralytics-yolo11).
### What are the advantages of using distance calculation with Ultralytics YOLO11?
Using distance calculation with Ultralytics YOLO11 offers several advantages:
- **Localization Precision:** Provides accurate spatial positioning for objects.
- **Size Estimation:** Helps estimate physical sizes, contributing to better contextual understanding.
- **Scene Understanding:** Enhances 3D scene comprehension, aiding improved decision-making in applications like autonomous driving and surveillance.
### Can I perform distance calculation in real-time video streams with Ultralytics YOLO11?
Yes, you can perform distance calculation in real-time video streams with Ultralytics YOLO11. The process involves capturing video frames using [OpenCV](https://www.ultralytics.com/glossary/opencv), running YOLO11 [object detection](https://www.ultralytics.com/glossary/object-detection), and using the `DistanceCalculation` class to calculate distances between objects in successive frames. For a detailed implementation, see the [video stream example](#distance-calculation-using-ultralytics-yolo11).
### How do I delete points drawn during distance calculation using Ultralytics YOLO11?
To delete points drawn during distance calculation with Ultralytics YOLO11, you can use a right mouse click. This action will clear all the points you have drawn. For more details, refer to the note section under the [distance calculation example](#distance-calculation-using-ultralytics-yolo11).
### What are the key arguments for initializing the DistanceCalculation class in Ultralytics YOLO11?
The key arguments for initializing the `DistanceCalculation` class in Ultralytics YOLO11 include:
- `model`: Model file path.
- `show`: Flag to indicate if the video stream should be displayed.
- `line_width`: Thickness of bounding box and the lines drawn on the image.
For an exhaustive list and default values, see the [arguments of DistanceCalculation](#arguments-distancecalculation).
---
comments: true
description: Learn to effortlessly set up Ultralytics in Docker, from installation to running with CPU/GPU support. Follow our comprehensive guide for seamless container experience.
keywords: Ultralytics, Docker, Quickstart Guide, CPU support, GPU support, NVIDIA Docker, container setup, Docker environment, Docker Hub, Ultralytics projects
---
# Docker Quickstart Guide for Ultralytics
<p align="center">
<img width="800" src="https://github.com/ultralytics/docs/releases/download/0/ultralytics-docker-package-visual.avif" alt="Ultralytics Docker Package Visual">
</p>
This guide serves as a comprehensive introduction to setting up a Docker environment for your Ultralytics projects. [Docker](https://www.docker.com/) is a platform for developing, shipping, and running applications in containers. It is particularly beneficial for ensuring that the software will always run the same, regardless of where it's deployed. For more details, visit the Ultralytics Docker repository on [Docker Hub](https://hub.docker.com/r/ultralytics/ultralytics).
[![Docker Image Version](https://img.shields.io/docker/v/ultralytics/ultralytics?sort=semver&logo=docker)](https://hub.docker.com/r/ultralytics/ultralytics)
[![Docker Pulls](https://img.shields.io/docker/pulls/ultralytics/ultralytics)](https://hub.docker.com/r/ultralytics/ultralytics)
## What You Will Learn
- Setting up Docker with NVIDIA support
- Installing Ultralytics Docker images
- Running Ultralytics in a Docker container with CPU or GPU support
- Using a Display Server with Docker to Show Ultralytics Detection Results
- Mounting local directories into the container
---
## Prerequisites
- Make sure Docker is installed on your system. If not, you can download and install it from [Docker's website](https://www.docker.com/products/docker-desktop/).
- Ensure that your system has an NVIDIA GPU and NVIDIA drivers are installed.
---
## Setting up Docker with NVIDIA Support
First, verify that the NVIDIA drivers are properly installed by running:
```bash
nvidia-smi
```
### Installing NVIDIA Docker Runtime
Now, let's install the NVIDIA Docker runtime to enable GPU support in Docker containers:
```bash
# Add NVIDIA package repositories
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
distribution=$(lsb_release -cs)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
# Install NVIDIA Docker runtime
sudo apt-get update
sudo apt-get install -y nvidia-docker2
# Restart Docker service to apply changes
sudo systemctl restart docker
```
### Verify NVIDIA Runtime with Docker
Run `docker info | grep -i runtime` to ensure that `nvidia` appears in the list of runtimes:
```bash
docker info | grep -i runtime
```
---
## Installing Ultralytics Docker Images
Ultralytics offers several Docker images optimized for various platforms and use-cases:
- **Dockerfile:** GPU image, ideal for training.
- **Dockerfile-arm64:** For ARM64 architecture, suitable for devices like [Raspberry Pi](raspberry-pi.md).
- **Dockerfile-cpu:** CPU-only version for inference and non-GPU environments.
- **Dockerfile-jetson:** Optimized for NVIDIA Jetson devices.
- **Dockerfile-python:** Minimal Python environment for lightweight applications.
- **Dockerfile-conda:** Includes [Miniconda3](https://docs.conda.io/projects/miniconda/en/latest/) and Ultralytics package installed via Conda.
To pull the latest image:
```bash
# Set image name as a variable
t=ultralytics/ultralytics:latest
# Pull the latest Ultralytics image from Docker Hub
sudo docker pull $t
```
---
## Running Ultralytics in Docker Container
Here's how to execute the Ultralytics Docker container:
### Using only the CPU
```bash
# Run without GPUs (CPU only)
sudo docker run -it --ipc=host $t
```
### Using GPUs
```bash
# Run with all GPUs
sudo docker run -it --ipc=host --gpus all $t
# Run specifying which GPUs to use
sudo docker run -it --ipc=host --gpus '"device=2,3"' $t
```
The `-it` flag assigns a pseudo-TTY and keeps stdin open, allowing you to interact with the container. The `--ipc=host` flag enables sharing of the host's IPC namespace, which is essential for sharing memory between processes. The `--gpus` flag allows the container to access the host's GPUs.
### Note on File Accessibility
To work with files on your local machine within the container, you can use Docker volumes:
```bash
# Mount a local directory into the container
sudo docker run -it --ipc=host --gpus all -v /path/on/host:/path/in/container $t
```
Replace `/path/on/host` with the directory path on your local machine and `/path/in/container` with the desired path inside the Docker container.
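For example, you could mount a host dataset directory and train against it inside the container. This is an illustrative sketch only: `~/datasets` and `data.yaml` are assumed paths, not defaults.
```bash
# Illustrative: mount a host dataset directory and train against it
# (~/datasets and data.yaml are assumed paths for this example)
sudo docker run -it --ipc=host --gpus all -v ~/datasets:/datasets $t \
    yolo train model=yolo11n.pt data=/datasets/data.yaml epochs=50
```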
## Run graphical user interface (GUI) applications in a Docker Container
!!! danger "Highly Experimental - User Assumes All Risk"
The following instructions are experimental. Sharing an X11 socket with a Docker container poses potential security risks, so it's recommended to test this solution only in a controlled environment. For more information, refer to these resources on how to use `xhost`<sup>[(1)](http://users.stat.umn.edu/~geyer/secure.html)[(2)](https://linux.die.net/man/1/xhost)</sup>.
Docker is primarily used to containerize background applications and CLI programs, but it can also run graphical programs. In the Linux world, two main display servers handle graphical output: [X11](https://www.x.org/wiki/) (also known as the X Window System) and [Wayland](https://wayland.freedesktop.org/). Before starting, it's essential to determine which display server you are currently using. Run this command to find out:
```bash
env | grep -E -i 'x11|xorg|wayland'
```
Setup and configuration of an X11 or Wayland display server is outside the scope of this guide. If the above command returns nothing, you'll need to get one of them working on your system before continuing.
### Running a Docker Container with a GUI
!!! example
??? info "Use GPUs"
If you're using [GPUs](#using-gpus), you can add the `--gpus all` flag to the command.
=== "X11"
If you're using X11, you can run the following command to allow the Docker container to access the X11 socket:
```bash
xhost +local:docker && docker run -e DISPLAY=$DISPLAY \
-v /tmp/.X11-unix:/tmp/.X11-unix \
-v ~/.Xauthority:/root/.Xauthority \
-it --ipc=host $t
```
This command sets the `DISPLAY` environment variable to the host's display, mounts the X11 socket, and maps the `.Xauthority` file to the container. The `xhost +local:docker` command allows the Docker container to access the X11 server.
=== "Wayland"
For Wayland, use the following command:
```bash
xhost +local:docker && docker run -e DISPLAY=$DISPLAY \
-v $XDG_RUNTIME_DIR/$WAYLAND_DISPLAY:/tmp/$WAYLAND_DISPLAY \
--net=host -it --ipc=host $t
```
This command sets the `DISPLAY` environment variable to the host's display, mounts the Wayland socket, and allows the Docker container to access the Wayland server.
### Using Docker with a GUI
Now you can display graphical applications inside your Docker container. For example, you can run the following [CLI command](../usage/cli.md) to visualize the [predictions](../modes/predict.md) from a [YOLO11 model](../models/yolo11.md):
```bash
yolo predict model=yolo11n.pt show=True
```
??? info "Testing"
A simple way to validate that the Docker group has access to the X11 server is to run a container with a GUI program like [`xclock`](https://www.x.org/archive/X11R6.8.1/doc/xclock.1.html) or [`xeyes`](https://www.x.org/releases/X11R7.5/doc/man/man1/xeyes.1.html). Alternatively, you can also install these programs in the Ultralytics Docker container to test the access to the X11 server of your GNU-Linux display server. If you run into any problems, consider setting the environment variable `-e QT_DEBUG_PLUGINS=1`. Setting this environment variable enables the output of debugging information, aiding in the troubleshooting process.
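As a minimal sketch, assuming the Ultralytics image is Debian/Ubuntu-based so that `apt-get` and the `x11-apps` package (which provides `xclock` and `xeyes`) are available, you could test X11 access like this:
```bash
# Allow local Docker containers to reach the X11 server, then run xeyes
# (x11-apps is an assumed package name providing xclock and xeyes)
xhost +local:docker && docker run -e DISPLAY=$DISPLAY \
    -v /tmp/.X11-unix:/tmp/.X11-unix \
    -it $t bash -c "apt-get update && apt-get install -y x11-apps && xeyes"
```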
### When finished with Docker GUI
!!! warning "Revoke access"
In both cases, don't forget to revoke access from the Docker group when you're done.
```bash
xhost -local:docker
```
??? question "Want to view image results directly in the Terminal?"
Refer to the following guide on [viewing the image results using a terminal](./view-results-in-terminal.md).
---
Congratulations! You're now set up to use Ultralytics with Docker and ready to take advantage of its powerful capabilities. For alternate installation methods, feel free to explore the [Ultralytics quickstart documentation](../quickstart.md).
## FAQ
### How do I set up Ultralytics with Docker?
To set up Ultralytics with Docker, first ensure that Docker is installed on your system. If you have an NVIDIA GPU, install the NVIDIA Docker runtime to enable GPU support. Then, pull the latest Ultralytics Docker image from Docker Hub using the following command:
```bash
sudo docker pull ultralytics/ultralytics:latest
```
For detailed steps, refer to our [Docker Quickstart Guide](../quickstart.md).
### What are the benefits of using Ultralytics Docker images for [machine learning](https://www.ultralytics.com/glossary/machine-learning-ml) projects?
Using Ultralytics Docker images ensures a consistent environment across different machines, replicating the same software and dependencies. This is particularly useful for collaborating across teams, running models on various hardware, and maintaining reproducibility. For GPU-based training, Ultralytics provides optimized Docker images such as `Dockerfile` for general GPU usage and `Dockerfile-jetson` for NVIDIA Jetson devices. Explore [Ultralytics Docker Hub](https://hub.docker.com/r/ultralytics/ultralytics) for more details.
### How can I run Ultralytics YOLO in a Docker container with GPU support?
First, ensure that the NVIDIA Docker runtime is installed and configured. Then, use the following command to run Ultralytics YOLO with GPU support:
```bash
sudo docker run -it --ipc=host --gpus all ultralytics/ultralytics:latest
```
This command sets up a Docker container with GPU access. For additional details, see the [Docker Quickstart Guide](../quickstart.md).
### How do I visualize YOLO prediction results in a Docker container with a display server?
To visualize YOLO prediction results with a GUI in a Docker container, you need to allow Docker to access your display server. For systems running X11, the command is:
```bash
xhost +local:docker && docker run -e DISPLAY=$DISPLAY \
-v /tmp/.X11-unix:/tmp/.X11-unix \
-v ~/.Xauthority:/root/.Xauthority \
-it --ipc=host ultralytics/ultralytics:latest
```
For systems running Wayland, use:
```bash
xhost +local:docker && docker run -e DISPLAY=$DISPLAY \
-v $XDG_RUNTIME_DIR/$WAYLAND_DISPLAY:/tmp/$WAYLAND_DISPLAY \
--net=host -it --ipc=host ultralytics/ultralytics:latest
```
More information can be found in the [Run graphical user interface (GUI) applications in a Docker Container](#run-graphical-user-interface-gui-applications-in-a-docker-container) section.
### Can I mount local directories into the Ultralytics Docker container?
Yes, you can mount local directories into the Ultralytics Docker container using the `-v` flag:
```bash
sudo docker run -it --ipc=host --gpus all -v /path/on/host:/path/in/container ultralytics/ultralytics:latest
```
Replace `/path/on/host` with the directory on your local machine and `/path/in/container` with the desired path inside the container. This setup allows you to work with your local files within the container. For more information, refer to the relevant section on [mounting local directories](../usage/python.md).
---
comments: true
description: Transform complex data into insightful heatmaps using Ultralytics YOLO11. Discover patterns, trends, and anomalies with vibrant visualizations.
keywords: Ultralytics, YOLO11, heatmaps, data visualization, data analysis, complex data, patterns, trends, anomalies
---
# Advanced [Data Visualization](https://www.ultralytics.com/glossary/data-visualization): Heatmaps using Ultralytics YOLO11 🚀
## Introduction to Heatmaps
A heatmap generated with [Ultralytics YOLO11](https://github.com/ultralytics/ultralytics/) transforms complex data into a vibrant, color-coded matrix. This visual tool employs a spectrum of colors to represent varying data values, where warmer hues indicate higher intensities and cooler tones signify lower values. Heatmaps excel in visualizing intricate data patterns, correlations, and anomalies, offering an accessible and engaging approach to data interpretation across diverse domains.
<p align="center">
<br>
<iframe loading="lazy" width="720" height="405" src="https://www.youtube.com/embed/4ezde5-nZZw"
title="YouTube video player" frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
allowfullscreen>
</iframe>
<br>
<strong>Watch:</strong> Heatmaps using Ultralytics YOLO11
</p>
## Why Choose Heatmaps for Data Analysis?
- **Intuitive Data Distribution Visualization:** Heatmaps simplify the comprehension of data concentration and distribution, converting complex datasets into easy-to-understand visual formats.
- **Efficient Pattern Detection:** By visualizing data in heatmap format, it becomes easier to spot trends, clusters, and outliers, facilitating quicker analysis and insights.
- **Enhanced Spatial Analysis and Decision-Making:** Heatmaps are instrumental in illustrating spatial relationships, aiding in decision-making processes in sectors such as business intelligence, environmental studies, and urban planning.
## Real World Applications
| Transportation | Retail |
| :--------------------------------------------------------------------------------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------------------: |
| ![Ultralytics YOLO11 Transportation Heatmap](https://github.com/ultralytics/docs/releases/download/0/ultralytics-yolov8-transportation-heatmap.avif) | ![Ultralytics YOLO11 Retail Heatmap](https://github.com/ultralytics/docs/releases/download/0/ultralytics-yolov8-retail-heatmap.avif) |
| Ultralytics YOLO11 Transportation Heatmap | Ultralytics YOLO11 Retail Heatmap |
!!! example "Heatmaps using Ultralytics YOLO11 Example"
=== "CLI"
```bash
# Run a heatmap example
yolo solutions heatmap show=True
# Pass a source video
yolo solutions heatmap source="path/to/video/file.mp4"
# Pass a custom colormap
yolo solutions heatmap colormap=cv2.COLORMAP_INFERNO
# Heatmaps + object counting
yolo solutions heatmap region=[(20, 400), (1080, 400), (1080, 360), (20, 360)]
```
=== "Python"
```python
import cv2
from ultralytics import solutions
cap = cv2.VideoCapture("Path/to/video/file.mp4")
assert cap.isOpened(), "Error reading video file"
w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))
# Video writer
video_writer = cv2.VideoWriter("heatmap_output.avi", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
# In case you want to apply object counting + heatmaps, you can pass region points.
# region_points = [(20, 400), (1080, 400)] # Define line points
# region_points = [(20, 400), (1080, 400), (1080, 360), (20, 360)] # Define region points
# region_points = [(20, 400), (1080, 400), (1080, 360), (20, 360), (20, 400)] # Define polygon points
# Init heatmap
heatmap = solutions.Heatmap(
show=True, # Display the output
model="yolo11n.pt", # Path to the YOLO11 model file
colormap=cv2.COLORMAP_PARULA, # Colormap of heatmap
# region=region_points, # If you want to do object counting with heatmaps, you can pass region_points
# classes=[0, 2], # If you want to generate heatmap for specific classes i.e person and car.
# show_in=True, # Display in counts
# show_out=True, # Display out counts
# line_width=2, # Adjust the line width for bounding boxes and text display
)
# Process video
while cap.isOpened():
success, im0 = cap.read()
if not success:
print("Video frame is empty or video processing has been successfully completed.")
break
im0 = heatmap.generate_heatmap(im0)
video_writer.write(im0)
cap.release()
video_writer.release()
cv2.destroyAllWindows()
```
### Arguments `Heatmap()`
| Name | Type | Default | Description |
| ------------ | ------ | ------------------ | ----------------------------------------------------------------- |
| `model` | `str` | `None` | Path to Ultralytics YOLO Model File |
| `colormap` | `int` | `cv2.COLORMAP_JET` | Colormap to use for the heatmap. |
| `show` | `bool` | `False` | Whether to display the image with the heatmap overlay. |
| `show_in` | `bool` | `True` | Whether to display the count of objects entering the region. |
| `show_out` | `bool` | `True` | Whether to display the count of objects exiting the region. |
| `region` | `list` | `None` | Points defining the counting region (either a line or a polygon). |
| `line_width` | `int` | `2` | Thickness of the lines used in drawing. |
### Arguments `model.track`
{% include "macros/track-args.md" %}
### Heatmap COLORMAPs
| Colormap Name | Description |
| ------------------------------- | -------------------------------------- |
| `cv::COLORMAP_AUTUMN` | Autumn color map |
| `cv::COLORMAP_BONE` | Bone color map |
| `cv::COLORMAP_JET` | Jet color map |
| `cv::COLORMAP_WINTER` | Winter color map |
| `cv::COLORMAP_RAINBOW` | Rainbow color map |
| `cv::COLORMAP_OCEAN` | Ocean color map |
| `cv::COLORMAP_SUMMER` | Summer color map |
| `cv::COLORMAP_SPRING` | Spring color map |
| `cv::COLORMAP_COOL` | Cool color map |
| `cv::COLORMAP_HSV` | HSV (Hue, Saturation, Value) color map |
| `cv::COLORMAP_PINK` | Pink color map |
| `cv::COLORMAP_HOT` | Hot color map |
| `cv::COLORMAP_PARULA` | Parula color map |
| `cv::COLORMAP_MAGMA` | Magma color map |
| `cv::COLORMAP_INFERNO` | Inferno color map |
| `cv::COLORMAP_PLASMA` | Plasma color map |
| `cv::COLORMAP_VIRIDIS` | Viridis color map |
| `cv::COLORMAP_CIVIDIS` | Cividis color map |
| `cv::COLORMAP_TWILIGHT` | Twilight color map |
| `cv::COLORMAP_TWILIGHT_SHIFTED` | Shifted Twilight color map |
| `cv::COLORMAP_TURBO` | Turbo color map |
| `cv::COLORMAP_DEEPGREEN` | Deep Green color map |
These colormaps are commonly used for visualizing data with different color representations.
## FAQ
### How does Ultralytics YOLO11 generate heatmaps and what are their benefits?
Ultralytics YOLO11 generates heatmaps by transforming complex data into a color-coded matrix where different hues represent data intensities. Heatmaps make it easier to visualize patterns, correlations, and anomalies in the data. Warmer hues indicate higher values, while cooler tones represent lower values. The primary benefits include intuitive visualization of data distribution, efficient pattern detection, and enhanced spatial analysis for decision-making. For more details and configuration options, refer to the [Heatmap Configuration](#arguments-heatmap) section.
### Can I use Ultralytics YOLO11 to perform object tracking and generate a heatmap simultaneously?
Yes, Ultralytics YOLO11 supports object tracking and heatmap generation concurrently. This can be achieved through its `Heatmap` solution integrated with object tracking models. To do so, you need to initialize the heatmap object and use YOLO11's tracking capabilities. Here's a simple example:
```python
import cv2
from ultralytics import solutions
cap = cv2.VideoCapture("path/to/video/file.mp4")
heatmap = solutions.Heatmap(colormap=cv2.COLORMAP_PARULA, show=True, model="yolo11n.pt")
while cap.isOpened():
success, im0 = cap.read()
if not success:
break
im0 = heatmap.generate_heatmap(im0)
cv2.imshow("Heatmap", im0)
if cv2.waitKey(1) & 0xFF == ord("q"):
break
cap.release()
cv2.destroyAllWindows()
```
For further guidance, check the [Tracking Mode](../modes/track.md) page.
### What makes Ultralytics YOLO11 heatmaps different from other data visualization tools like those from [OpenCV](https://www.ultralytics.com/glossary/opencv) or Matplotlib?
Ultralytics YOLO11 heatmaps are specifically designed for integration with its [object detection](https://www.ultralytics.com/glossary/object-detection) and tracking models, providing an end-to-end solution for real-time data analysis. Unlike generic visualization tools like OpenCV or Matplotlib, YOLO11 heatmaps are optimized for performance and automated processing, supporting features like persistent tracking, decay factor adjustment, and real-time video overlay. For more information on YOLO11's unique features, visit the [Ultralytics YOLO11 Introduction](https://www.ultralytics.com/blog/introducing-ultralytics-yolov8).
### How can I visualize only specific object classes in heatmaps using Ultralytics YOLO11?
You can visualize specific object classes by passing the desired class indices via the `classes` argument. For instance, to visualize only persons and cars (class indices 0 and 2 in the COCO dataset), set `classes=[0, 2]` as shown below.
```python
import cv2
from ultralytics import solutions
cap = cv2.VideoCapture("path/to/video/file.mp4")
heatmap = solutions.Heatmap(show=True, model="yolo11n.pt", classes=[0, 2])
while cap.isOpened():
success, im0 = cap.read()
if not success:
break
im0 = heatmap.generate_heatmap(im0)
cv2.imshow("Heatmap", im0)
if cv2.waitKey(1) & 0xFF == ord("q"):
break
cap.release()
cv2.destroyAllWindows()
```
### Why should businesses choose Ultralytics YOLO11 for heatmap generation in data analysis?
Ultralytics YOLO11 offers seamless integration of advanced object detection and real-time heatmap generation, making it an ideal choice for businesses looking to visualize data more effectively. The key advantages include intuitive data distribution visualization, efficient pattern detection, and enhanced spatial analysis for better decision-making. Additionally, YOLO11's cutting-edge features such as persistent tracking, customizable colormaps, and support for various export formats make it superior to other tools like [TensorFlow](https://www.ultralytics.com/glossary/tensorflow) and OpenCV for comprehensive data analysis. Learn more about business applications at [Ultralytics Plans](https://www.ultralytics.com/plans).
---
comments: true
description: Master hyperparameter tuning for Ultralytics YOLO to optimize model performance with our comprehensive guide. Elevate your machine learning models today!
keywords: Ultralytics YOLO, hyperparameter tuning, machine learning, model optimization, genetic algorithms, learning rate, batch size, epochs
---
# Ultralytics YOLO [Hyperparameter Tuning](https://www.ultralytics.com/glossary/hyperparameter-tuning) Guide
## Introduction
Hyperparameter tuning is not just a one-time set-up but an iterative process aimed at optimizing the [machine learning](https://www.ultralytics.com/glossary/machine-learning-ml) model's performance metrics, such as accuracy, precision, and recall. In the context of Ultralytics YOLO, these hyperparameters could range from learning rate to architectural details, such as the number of layers or types of activation functions used.
### What are Hyperparameters?
Hyperparameters are high-level, structural settings for the algorithm. They are set prior to the training phase and remain constant during it. Here are some commonly tuned hyperparameters in Ultralytics YOLO:
- **Learning Rate** `lr0`: Determines the step size at each iteration while moving towards a minimum in the [loss function](https://www.ultralytics.com/glossary/loss-function).
- **[Batch Size](https://www.ultralytics.com/glossary/batch-size)** `batch`: Number of images processed simultaneously in a forward pass.
- **Number of [Epochs](https://www.ultralytics.com/glossary/epoch)** `epochs`: An epoch is one complete forward and backward pass of all the training examples.
- **Architecture Specifics**: Such as channel counts, number of layers, types of activation functions, etc.
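To make these concrete, here is a minimal sketch showing how such hyperparameters can be overridden directly in a training call (the values are arbitrary examples, not tuned recommendations):
```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")

# Override the learning rate, batch size, and epoch count for a single run
# (values here are illustrative, not recommendations)
model.train(data="coco8.yaml", epochs=10, lr0=0.005, batch=32)
```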
<p align="center">
<img width="640" src="https://github.com/ultralytics/docs/releases/download/0/hyperparameter-tuning-visual.avif" alt="Hyperparameter Tuning Visual">
</p>
For a full list of augmentation hyperparameters used in YOLO11, please refer to the [configurations page](../usage/cfg.md#augmentation-settings).
### Genetic Evolution and Mutation
Ultralytics YOLO uses genetic algorithms to optimize hyperparameters. Genetic algorithms are inspired by the mechanism of natural selection and genetics.
- **Mutation**: In the context of Ultralytics YOLO, mutation helps in locally searching the hyperparameter space by applying small, random changes to existing hyperparameters, producing new candidates for evaluation.
- **Crossover**: Although crossover is a popular genetic algorithm technique, it is not currently used in Ultralytics YOLO for hyperparameter tuning. The focus is mainly on mutation for generating new hyperparameter sets.
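To build intuition, here is a toy sketch of mutation (not the actual `_mutate` implementation): each new candidate is produced by nudging the current values with small random perturbations and clipping them to valid bounds. The bounds and the 20% mutation scale below are illustrative assumptions.
```python
import random


def mutate(hyp: dict, bounds: dict, scale: float = 0.2) -> dict:
    """Toy mutation: perturb each hyperparameter by up to +/- scale, then clip."""
    mutated = {}
    for name, value in hyp.items():
        low, high = bounds[name]
        candidate = value * (1.0 + random.uniform(-scale, scale))
        mutated[name] = min(max(candidate, low), high)  # keep within valid range
    return mutated


# Illustrative hyperparameters and bounds
hyp = {"lr0": 0.01, "momentum": 0.937}
bounds = {"lr0": (1e-5, 0.1), "momentum": (0.6, 0.98)}
print(mutate(hyp, bounds))
```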
## Preparing for Hyperparameter Tuning
Before you begin the tuning process, it's important to:
1. **Identify the Metrics**: Determine the metrics you will use to evaluate the model's performance. This could be AP50, F1-score, or others.
2. **Set the Tuning Budget**: Define how much computational resources you're willing to allocate. Hyperparameter tuning can be computationally intensive.
## Steps Involved
### Initialize Hyperparameters
Start with a reasonable set of initial hyperparameters. This could either be the default hyperparameters set by Ultralytics YOLO or something based on your domain knowledge or previous experiments.
### Mutate Hyperparameters
Use the `_mutate` method to produce a new set of hyperparameters based on the existing set.
### Train Model
Training is performed using the mutated set of hyperparameters. The training performance is then assessed.
### Evaluate Model
Use metrics like AP50, F1-score, or custom metrics to evaluate the model's performance.
### Log Results
It's crucial to log both the performance metrics and the corresponding hyperparameters for future reference.
### Repeat
The process is repeated until either the set number of iterations is reached or the performance metric is satisfactory.
## Usage Example
Here's how to use the `model.tune()` method to run the `Tuner` class for hyperparameter tuning of YOLO11n on COCO8 for 30 epochs, using an AdamW optimizer and skipping plotting, checkpointing, and validation (except on the final epoch) for faster tuning.
!!! example
=== "Python"
```python
from ultralytics import YOLO
# Initialize the YOLO model
model = YOLO("yolo11n.pt")
# Tune hyperparameters on COCO8 for 30 epochs
model.tune(data="coco8.yaml", epochs=30, iterations=300, optimizer="AdamW", plots=False, save=False, val=False)
```
## Results
After you've successfully completed the hyperparameter tuning process, you will obtain several files and directories that encapsulate the results of the tuning. The following describes each:
### File Structure
Here's what the directory structure of the results will look like. Training directories like `train1/` contain individual tuning iterations, i.e. one model trained with one set of hyperparameters. The `tune/` directory contains tuning results from all the individual model trainings:
```plaintext
runs/
└── detect/
├── train1/
├── train2/
├── ...
└── tune/
├── best_hyperparameters.yaml
├── best_fitness.png
├── tune_results.csv
├── tune_scatter_plots.png
└── weights/
├── last.pt
└── best.pt
```
### File Descriptions
#### best_hyperparameters.yaml
This YAML file contains the best-performing hyperparameters found during the tuning process. You can use this file to initialize future trainings with these optimized settings.
- **Format**: YAML
- **Usage**: Hyperparameter results
- **Example**:
```yaml
# 558/900 iterations complete ✅ (45536.81s)
# Results saved to /usr/src/ultralytics/runs/detect/tune
# Best fitness=0.64297 observed at iteration 498
# Best fitness metrics are {'metrics/precision(B)': 0.87247, 'metrics/recall(B)': 0.71387, 'metrics/mAP50(B)': 0.79106, 'metrics/mAP50-95(B)': 0.62651, 'val/box_loss': 2.79884, 'val/cls_loss': 2.72386, 'val/dfl_loss': 0.68503, 'fitness': 0.64297}
# Best fitness model is /usr/src/ultralytics/runs/detect/train498
# Best fitness hyperparameters are printed below.
lr0: 0.00269
lrf: 0.00288
momentum: 0.73375
weight_decay: 0.00015
warmup_epochs: 1.22935
warmup_momentum: 0.1525
box: 18.27875
cls: 1.32899
dfl: 0.56016
hsv_h: 0.01148
hsv_s: 0.53554
hsv_v: 0.13636
degrees: 0.0
translate: 0.12431
scale: 0.07643
shear: 0.0
perspective: 0.0
flipud: 0.0
fliplr: 0.08631
mosaic: 0.42551
mixup: 0.0
copy_paste: 0.0
```
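One way to reuse these results is to pass the file as training overrides via the `cfg` argument. This is a hedged sketch; the path is taken from the directory structure shown above.
```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")

# Initialize a new training run from the tuned hyperparameters
# (path assumed from the tuning results structure shown above)
model.train(data="coco8.yaml", epochs=100, cfg="runs/detect/tune/best_hyperparameters.yaml")
```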
#### best_fitness.png
This is a plot displaying fitness (typically a performance metric like AP50) against the number of iterations. It helps you visualize how well the genetic algorithm performed over time.
- **Format**: PNG
- **Usage**: Performance visualization
<p align="center">
<img width="640" src="https://github.com/ultralytics/docs/releases/download/0/best-fitness.avif" alt="Hyperparameter Tuning Fitness vs Iteration">
</p>
#### tune_results.csv
A CSV file containing detailed results of each iteration during the tuning. Each row in the file represents one iteration, and it includes metrics like fitness score, [precision](https://www.ultralytics.com/glossary/precision), [recall](https://www.ultralytics.com/glossary/recall), as well as the hyperparameters used.
- **Format**: CSV
- **Usage**: Per-iteration results tracking.
- **Example**:
```csv
fitness,lr0,lrf,momentum,weight_decay,warmup_epochs,warmup_momentum,box,cls,dfl,hsv_h,hsv_s,hsv_v,degrees,translate,scale,shear,perspective,flipud,fliplr,mosaic,mixup,copy_paste
0.05021,0.01,0.01,0.937,0.0005,3.0,0.8,7.5,0.5,1.5,0.015,0.7,0.4,0.0,0.1,0.5,0.0,0.0,0.0,0.5,1.0,0.0,0.0
0.07217,0.01003,0.00967,0.93897,0.00049,2.79757,0.81075,7.5,0.50746,1.44826,0.01503,0.72948,0.40658,0.0,0.0987,0.4922,0.0,0.0,0.0,0.49729,1.0,0.0,0.0
0.06584,0.01003,0.00855,0.91009,0.00073,3.42176,0.95,8.64301,0.54594,1.72261,0.01503,0.59179,0.40658,0.0,0.0987,0.46955,0.0,0.0,0.0,0.49729,0.80187,0.0,0.0
```
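Since each row is a single iteration, the file is easy to inspect programmatically, for example with pandas (assuming it is installed):
```python
import pandas as pd

# Load per-iteration tuning results and list the top iterations by fitness
df = pd.read_csv("runs/detect/tune/tune_results.csv")
print(df.sort_values("fitness", ascending=False).head())
```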
#### tune_scatter_plots.png
This file contains scatter plots generated from `tune_results.csv`, helping you visualize relationships between different hyperparameters and performance metrics. Note that hyperparameters initialized to 0 will not be tuned, such as `degrees` and `shear` below.
- **Format**: PNG
- **Usage**: Exploratory data analysis
<p align="center">
<img width="1000" src="https://github.com/ultralytics/docs/releases/download/0/tune-scatter-plots.avif" alt="Hyperparameter Tuning Scatter Plots">
</p>
#### weights/
This directory contains the saved [PyTorch](https://www.ultralytics.com/glossary/pytorch) models for the last and the best iterations during the hyperparameter tuning process.
- **`last.pt`**: The weights from the last epoch of training.
- **`best.pt`**: The weights from the iteration that achieved the best fitness score.
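For example, to load the best checkpoint and re-validate it (the path follows the directory structure shown above):
```python
from ultralytics import YOLO

# Load the best-fitness weights produced during tuning
model = YOLO("runs/detect/tune/weights/best.pt")

# Re-validate to confirm the reported metrics
metrics = model.val(data="coco8.yaml")
```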
Using these results, you can make more informed decisions for your future model trainings and analyses. Feel free to consult these artifacts to understand how well your model performed and how you might improve it further.
## Conclusion
The hyperparameter tuning process in Ultralytics YOLO is simplified yet powerful, thanks to its genetic algorithm-based approach focused on mutation. Following the steps outlined in this guide will assist you in systematically tuning your model to achieve better performance.
### Further Reading
1. [Hyperparameter Optimization in Wikipedia](https://en.wikipedia.org/wiki/Hyperparameter_optimization)
2. [YOLOv5 Hyperparameter Evolution Guide](../yolov5/tutorials/hyperparameter_evolution.md)
3. [Efficient Hyperparameter Tuning with Ray Tune and YOLO11](../integrations/ray-tune.md)
For deeper insights, you can explore the `Tuner` class source code and accompanying documentation. Should you have any questions, feature requests, or need further assistance, feel free to reach out to us on [GitHub](https://github.com/ultralytics/ultralytics/issues/new/choose) or [Discord](https://discord.com/invite/ultralytics).
## FAQ
### How do I optimize the [learning rate](https://www.ultralytics.com/glossary/learning-rate) for Ultralytics YOLO during hyperparameter tuning?
To optimize the learning rate for Ultralytics YOLO, start by setting an initial learning rate using the `lr0` parameter. Common values range from `0.001` to `0.01`. During the hyperparameter tuning process, this value will be mutated to find the optimal setting. You can utilize the `model.tune()` method to automate this process. For example:
!!! example
=== "Python"
```python
from ultralytics import YOLO
# Initialize the YOLO model
model = YOLO("yolo11n.pt")
# Tune hyperparameters on COCO8 for 30 epochs
model.tune(data="coco8.yaml", epochs=30, iterations=300, optimizer="AdamW", plots=False, save=False, val=False)
```
For more details, check the [Ultralytics YOLO configuration page](../usage/cfg.md#augmentation-settings).
### What are the benefits of using genetic algorithms for hyperparameter tuning in YOLO11?
Genetic algorithms in Ultralytics YOLO11 provide a robust method for exploring the hyperparameter space, leading to highly optimized model performance. Key benefits include:
- **Efficient Search**: Genetic algorithms like mutation can quickly explore a large set of hyperparameters.
- **Avoiding Local Minima**: By introducing randomness, they help in avoiding local minima, ensuring better global optimization.
- **Performance Metrics**: They adapt based on performance metrics such as AP50 and F1-score.
To see how genetic algorithms can optimize hyperparameters, check out the [hyperparameter evolution guide](../yolov5/tutorials/hyperparameter_evolution.md).
### How long does the hyperparameter tuning process take for Ultralytics YOLO?
The time required for hyperparameter tuning with Ultralytics YOLO largely depends on several factors such as the size of the dataset, the complexity of the model architecture, the number of iterations, and the computational resources available. For instance, tuning YOLO11n on a dataset like COCO8 for 30 epochs might take several hours to days, depending on the hardware.
To effectively manage tuning time, define a clear tuning budget beforehand (see [Preparing for Hyperparameter Tuning](#preparing-for-hyperparameter-tuning)). This helps in balancing resource allocation and optimization goals.
### What metrics should I use to evaluate model performance during hyperparameter tuning in YOLO?
When evaluating model performance during hyperparameter tuning in YOLO, you can use several key metrics:
- **AP50**: The average precision at IoU threshold of 0.50.
- **F1-Score**: The harmonic mean of precision and recall.
- **Precision and Recall**: Individual metrics indicating the model's [accuracy](https://www.ultralytics.com/glossary/accuracy) in identifying true positives versus false positives and false negatives.
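For reference, the F1-score combines the first two: F1 = 2 × (precision × recall) / (precision + recall). For example, a precision of 0.8 and a recall of 0.6 give F1 ≈ 0.69.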
These metrics help you understand different aspects of your model's performance. Refer to the [Ultralytics YOLO performance metrics](../guides/yolo-performance-metrics.md) guide for a comprehensive overview.
### Can I use Ultralytics HUB for hyperparameter tuning of YOLO models?
Yes, you can use Ultralytics HUB for hyperparameter tuning of YOLO models. The HUB offers a no-code platform to easily upload datasets, train models, and perform hyperparameter tuning efficiently. It provides real-time tracking and visualization of tuning progress and results.
Explore more about using Ultralytics HUB for hyperparameter tuning in the [Ultralytics HUB Cloud Training](../hub/cloud-training.md) documentation.
---
comments: true
description: Master YOLO with Ultralytics tutorials covering training, deployment and optimization. Find solutions, improve metrics, and deploy with ease!
keywords: Ultralytics, YOLO, tutorials, guides, object detection, deep learning, PyTorch, training, deployment, optimization, computer vision
---
# Comprehensive Tutorials to Ultralytics YOLO
Welcome to the Ultralytics YOLO 🚀 Guides! Our comprehensive tutorials cover various aspects of the YOLO [object detection](https://www.ultralytics.com/glossary/object-detection) model, ranging from training and prediction to deployment. Built on [PyTorch](https://www.ultralytics.com/glossary/pytorch), YOLO stands out for its exceptional speed and [accuracy](https://www.ultralytics.com/glossary/accuracy) in real-time object detection tasks.
Whether you're a beginner or an expert in [deep learning](https://www.ultralytics.com/glossary/deep-learning-dl), our tutorials offer valuable insights into the implementation and optimization of YOLO for your [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) projects. Let's dive in!
<p align="center">
<br>
<iframe loading="lazy" width="720" height="405" src="https://www.youtube.com/embed/96NkhsV-W1U"
title="YouTube video player" frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
allowfullscreen>
</iframe>
<br>
<strong>Watch:</strong> Ultralytics YOLO11 Guides Overview
</p>
## Guides
Here's a compilation of in-depth guides to help you master different aspects of Ultralytics YOLO.
- [YOLO Common Issues](yolo-common-issues.md) ⭐ RECOMMENDED: Practical solutions and troubleshooting tips for the most frequently encountered issues when working with Ultralytics YOLO models.
- [YOLO Performance Metrics](yolo-performance-metrics.md) ⭐ ESSENTIAL: Understand the key metrics like mAP, IoU, and [F1 score](https://www.ultralytics.com/glossary/f1-score) used to evaluate the performance of your YOLO models. Includes practical examples and tips on how to improve detection accuracy and speed.
- [Model Deployment Options](model-deployment-options.md): Overview of YOLO [model deployment](https://www.ultralytics.com/glossary/model-deployment) formats like ONNX, OpenVINO, and TensorRT, with pros and cons for each to inform your deployment strategy.
- [K-Fold Cross Validation](kfold-cross-validation.md) 🚀 NEW: Learn how to improve model generalization using K-Fold cross-validation technique.
- [Hyperparameter Tuning](hyperparameter-tuning.md) 🚀 NEW: Discover how to optimize your YOLO models by fine-tuning hyperparameters using the Tuner class and genetic evolution algorithms.
- [SAHI Tiled Inference](sahi-tiled-inference.md) 🚀 NEW: Comprehensive guide on leveraging SAHI's sliced inference capabilities with YOLO11 for object detection in high-resolution images.
- [AzureML Quickstart](azureml-quickstart.md) 🚀 NEW: Get up and running with Ultralytics YOLO models on Microsoft's Azure [Machine Learning](https://www.ultralytics.com/glossary/machine-learning-ml) platform. Learn how to train, deploy, and scale your object detection projects in the cloud.
- [Conda Quickstart](conda-quickstart.md) 🚀 NEW: Step-by-step guide to setting up a [Conda](https://anaconda.org/conda-forge/ultralytics) environment for Ultralytics. Learn how to install and start using the Ultralytics package efficiently with Conda.
- [Docker Quickstart](docker-quickstart.md) 🚀 NEW: Complete guide to setting up and using Ultralytics YOLO models with [Docker](https://hub.docker.com/r/ultralytics/ultralytics). Learn how to install Docker, manage GPU support, and run YOLO models in isolated containers for consistent development and deployment.
- [Raspberry Pi](raspberry-pi.md) 🚀 NEW: Quickstart tutorial to run YOLO models on the latest Raspberry Pi hardware.
- [NVIDIA Jetson](nvidia-jetson.md) 🚀 NEW: Quickstart guide for deploying YOLO models on NVIDIA Jetson devices.
- [DeepStream on NVIDIA Jetson](deepstream-nvidia-jetson.md) 🚀 NEW: Quickstart guide for deploying YOLO models on NVIDIA Jetson devices using DeepStream and TensorRT.
- [Triton Inference Server Integration](triton-inference-server.md) 🚀 NEW: Dive into the integration of Ultralytics YOLO11 with NVIDIA's Triton Inference Server for scalable and efficient deep learning inference deployments.
- [YOLO Thread-Safe Inference](yolo-thread-safe-inference.md) 🚀 NEW: Guidelines for performing inference with YOLO models in a thread-safe manner. Learn the importance of thread safety and best practices to prevent race conditions and ensure consistent predictions.
- [Isolating Segmentation Objects](isolating-segmentation-objects.md) 🚀 NEW: Step-by-step recipe and explanation on how to extract and/or isolate objects from images using Ultralytics Segmentation.
- [Edge TPU on Raspberry Pi](coral-edge-tpu-on-raspberry-pi.md): [Google Edge TPU](https://coral.ai/products/accelerator) accelerates YOLO inference on [Raspberry Pi](https://www.raspberrypi.com/).
- [View Inference Images in a Terminal](view-results-in-terminal.md): Use VSCode's integrated terminal to view inference results when using Remote Tunnel or SSH sessions.
- [OpenVINO Latency vs Throughput Modes](optimizing-openvino-latency-vs-throughput-modes.md): Learn latency and throughput optimization techniques for peak YOLO inference performance.
- [Steps of a Computer Vision Project](steps-of-a-cv-project.md) 🚀 NEW: Learn about the key steps involved in a computer vision project, including defining goals, selecting models, preparing data, and evaluating results.
- [Defining A Computer Vision Project's Goals](defining-project-goals.md) 🚀 NEW: Walk through how to effectively define clear and measurable goals for your computer vision project. Learn the importance of a well-defined problem statement and how it creates a roadmap for your project.
- [Data Collection and Annotation](data-collection-and-annotation.md) 🚀 NEW: Explore the tools, techniques, and best practices for collecting and annotating data to create high-quality inputs for your computer vision models.
- [Preprocessing Annotated Data](preprocessing_annotated_data.md) 🚀 NEW: Learn about preprocessing and augmenting image data in computer vision projects using YOLO11, including normalization, dataset augmentation, splitting, and exploratory data analysis (EDA).
- [Tips for Model Training](model-training-tips.md) 🚀 NEW: Explore tips on optimizing [batch sizes](https://www.ultralytics.com/glossary/batch-size), using [mixed precision](https://www.ultralytics.com/glossary/mixed-precision), applying pre-trained weights, and more to make training your computer vision model a breeze.
- [Insights on Model Evaluation and Fine-Tuning](model-evaluation-insights.md) 🚀 NEW: Gain insights into the strategies and best practices for evaluating and fine-tuning your computer vision models. Learn about the iterative process of refining models to achieve optimal results.
- [A Guide on Model Testing](model-testing.md) 🚀 NEW: A thorough guide on testing your computer vision models in realistic settings. Learn how to verify accuracy, reliability, and performance in line with project goals.
- [Best Practices for Model Deployment](model-deployment-practices.md) 🚀 NEW: Walk through tips and best practices for efficiently deploying models in computer vision projects, with a focus on optimization, troubleshooting, and security.
- [Maintaining Your Computer Vision Model](model-monitoring-and-maintenance.md) 🚀 NEW: Understand the key practices for monitoring, maintaining, and documenting computer vision models to guarantee accuracy, spot anomalies, and mitigate data drift.
- [ROS Quickstart](ros-quickstart.md) 🚀 NEW: Learn how to integrate YOLO with the Robot Operating System (ROS) for real-time object detection in robotics applications, including Point Cloud and Depth images.
## Contribute to Our Guides
We welcome contributions from the community! If you've mastered a particular aspect of Ultralytics YOLO that's not yet covered in our guides, we encourage you to share your expertise. Writing a guide is a great way to give back to the community and help us make our documentation more comprehensive and user-friendly.
To get started, please read our [Contributing Guide](../help/contributing.md) for guidelines on how to open up a Pull Request (PR) 🛠️. We look forward to your contributions!
Let's work together to make the Ultralytics YOLO ecosystem more robust and versatile 🙏!
## FAQ
### How do I train a custom object detection model using Ultralytics YOLO?
Training a custom object detection model with Ultralytics YOLO is straightforward. Start by preparing your dataset in the correct format and installing the Ultralytics package. Use the following code to initiate training:
!!! example
=== "Python"
```python
from ultralytics import YOLO
model = YOLO("yolo11n.pt") # Load a pre-trained YOLO model
model.train(data="path/to/dataset.yaml", epochs=50) # Train on custom dataset
```
=== "CLI"
```bash
yolo task=detect mode=train model=yolo11n.pt data=path/to/dataset.yaml epochs=50
```
For detailed dataset formatting and additional options, refer to our [Tips for Model Training](model-training-tips.md) guide.
### What performance metrics should I use to evaluate my YOLO model?
Evaluating your YOLO model performance is crucial to understanding its efficacy. Key metrics include [Mean Average Precision](https://www.ultralytics.com/glossary/mean-average-precision-map) (mAP), [Intersection over Union](https://www.ultralytics.com/glossary/intersection-over-union-iou) (IoU), and F1 score. These metrics help assess the accuracy and [precision](https://www.ultralytics.com/glossary/precision) of object detection tasks. You can learn more about these metrics and how to improve your model in our [YOLO Performance Metrics](yolo-performance-metrics.md) guide.
### Why should I use Ultralytics HUB for my computer vision projects?
Ultralytics HUB is a no-code platform that simplifies managing, training, and deploying YOLO models. It supports seamless integration, real-time tracking, and cloud training, making it ideal for both beginners and professionals. Discover more about its features and how it can streamline your workflow with our [Ultralytics HUB](https://docs.ultralytics.com/hub/) quickstart guide.
### What are the common issues faced during YOLO model training, and how can I resolve them?
Common issues during YOLO model training include data formatting errors, model architecture mismatches, and insufficient [training data](https://www.ultralytics.com/glossary/training-data). To address these, ensure your dataset is correctly formatted, check for compatible model versions, and augment your training data. For a comprehensive list of solutions, refer to our [YOLO Common Issues](yolo-common-issues.md) guide.
### How can I deploy my YOLO model for real-time object detection on edge devices?
Deploying YOLO models on edge devices like NVIDIA Jetson and Raspberry Pi requires converting the model to a compatible format such as TensorRT or TFLite. Follow our step-by-step guides for [NVIDIA Jetson](nvidia-jetson.md) and [Raspberry Pi](raspberry-pi.md) deployments to get started with real-time object detection on edge hardware. These guides will walk you through installation, configuration, and performance optimization.
---
comments: true
description: Master instance segmentation and tracking with Ultralytics YOLO11. Learn techniques for precise object identification and tracking.
keywords: instance segmentation, tracking, YOLO11, Ultralytics, object detection, machine learning, computer vision, python
---
# Instance Segmentation and Tracking using Ultralytics YOLO11 🚀
## What is [Instance Segmentation](https://www.ultralytics.com/glossary/instance-segmentation)?
[Ultralytics YOLO11](https://github.com/ultralytics/ultralytics/) instance segmentation involves identifying and outlining individual objects in an image, providing a detailed understanding of spatial distribution. Unlike [semantic segmentation](https://www.ultralytics.com/glossary/semantic-segmentation), it uniquely labels and precisely delineates each object, crucial for tasks like [object detection](https://www.ultralytics.com/glossary/object-detection) and medical imaging.
There are two types of instance segmentation tracking available in the Ultralytics package:
- **Instance Segmentation with Class Objects:** Each class object is assigned a unique color for clear visual separation.
- **Instance Segmentation with Object Tracks:** Every track is represented by a distinct color, facilitating easy identification and tracking.
<p align="center">
<br>
<iframe loading="lazy" width="720" height="405" src="https://www.youtube.com/embed/75G_S1Ngji8"
title="YouTube video player" frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
allowfullscreen>
</iframe>
<br>
<strong>Watch:</strong> Instance Segmentation with Object Tracking using Ultralytics YOLO11
</p>
## Samples
| Instance Segmentation | Instance Segmentation + Object Tracking |
| :----------------------------------------------------------------------------------------------------------------------------------: | :-----------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
| ![Ultralytics Instance Segmentation](https://github.com/ultralytics/docs/releases/download/0/ultralytics-instance-segmentation.avif) | ![Ultralytics Instance Segmentation with Object Tracking](https://github.com/ultralytics/docs/releases/download/0/ultralytics-instance-segmentation-object-tracking.avif) |
| Ultralytics Instance Segmentation 😍 | Ultralytics Instance Segmentation with Object Tracking 🔥 |
!!! example "Instance Segmentation and Tracking"
=== "Instance Segmentation"
```python
import cv2
from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator, colors
model = YOLO("yolo11n-seg.pt") # segmentation model
names = model.model.names
cap = cv2.VideoCapture("path/to/video/file.mp4")
w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))
out = cv2.VideoWriter("instance-segmentation.avi", cv2.VideoWriter_fourcc(*"MJPG"), fps, (w, h))
while True:
ret, im0 = cap.read()
if not ret:
print("Video frame is empty or video processing has been successfully completed.")
break
results = model.predict(im0)
annotator = Annotator(im0, line_width=2)
if results[0].masks is not None:
clss = results[0].boxes.cls.cpu().tolist()
masks = results[0].masks.xy
for mask, cls in zip(masks, clss):
color = colors(int(cls), True)
txt_color = annotator.get_txt_color(color)
annotator.seg_bbox(mask=mask, mask_color=color, label=names[int(cls)], txt_color=txt_color)
out.write(im0)
cv2.imshow("instance-segmentation", im0)
if cv2.waitKey(1) & 0xFF == ord("q"):
break
out.release()
cap.release()
cv2.destroyAllWindows()
```
=== "Instance Segmentation with Object Tracking"
```python
from collections import defaultdict
import cv2
from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator, colors
track_history = defaultdict(lambda: [])
model = YOLO("yolo11n-seg.pt") # segmentation model
cap = cv2.VideoCapture("path/to/video/file.mp4")
w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))
out = cv2.VideoWriter("instance-segmentation-object-tracking.avi", cv2.VideoWriter_fourcc(*"MJPG"), fps, (w, h))
while True:
ret, im0 = cap.read()
if not ret:
print("Video frame is empty or video processing has been successfully completed.")
break
annotator = Annotator(im0, line_width=2)
results = model.track(im0, persist=True)
if results[0].boxes.id is not None and results[0].masks is not None:
masks = results[0].masks.xy
track_ids = results[0].boxes.id.int().cpu().tolist()
for mask, track_id in zip(masks, track_ids):
color = colors(int(track_id), True)
txt_color = annotator.get_txt_color(color)
annotator.seg_bbox(mask=mask, mask_color=color, label=str(track_id), txt_color=txt_color)
out.write(im0)
cv2.imshow("instance-segmentation-object-tracking", im0)
if cv2.waitKey(1) & 0xFF == ord("q"):
break
out.release()
cap.release()
cv2.destroyAllWindows()
```
### `seg_bbox` Arguments
| Name | Type | Default | Description |
| ------------ | ------- | --------------- | -------------------------------------------- |
| `mask` | `array` | `None` | Segmentation mask coordinates |
| `mask_color` | `RGB` | `(255, 0, 255)` | Mask color for every segmented box |
| `label` | `str` | `None` | Label for segmented object |
| `txt_color` | `RGB` | `None` | Label color for segmented and tracked object |
## Note
For any inquiries, feel free to post your questions in the [Ultralytics Issue Section](https://github.com/ultralytics/ultralytics/issues/new/choose) or the discussion section mentioned below.
## FAQ
### How do I perform instance segmentation using Ultralytics YOLO11?
To perform instance segmentation using Ultralytics YOLO11, initialize the YOLO model with a segmentation version of YOLO11 and process video frames through it. Here's a simplified code example:
!!! example
=== "Python"
```python
import cv2
from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator, colors
model = YOLO("yolo11n-seg.pt") # segmentation model
cap = cv2.VideoCapture("path/to/video/file.mp4")
w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))
out = cv2.VideoWriter("instance-segmentation.avi", cv2.VideoWriter_fourcc(*"MJPG"), fps, (w, h))
while True:
ret, im0 = cap.read()
if not ret:
break
results = model.predict(im0)
annotator = Annotator(im0, line_width=2)
if results[0].masks is not None:
clss = results[0].boxes.cls.cpu().tolist()
masks = results[0].masks.xy
for mask, cls in zip(masks, clss):
annotator.seg_bbox(mask=mask, mask_color=colors(int(cls), True), label=model.model.names[int(cls)])
out.write(im0)
cv2.imshow("instance-segmentation", im0)
if cv2.waitKey(1) & 0xFF == ord("q"):
break
out.release()
cap.release()
cv2.destroyAllWindows()
```
Learn more about instance segmentation in the [Ultralytics YOLO11 guide](#what-is-instance-segmentation).
### What is the difference between instance segmentation and object tracking in Ultralytics YOLO11?
Instance segmentation identifies and outlines individual objects within an image, giving each object a unique label and mask. Object tracking extends this by assigning consistent labels to objects across video frames, facilitating continuous tracking of the same objects over time. Learn more about the distinctions in the [Ultralytics YOLO11 documentation](#samples).
### Why should I use Ultralytics YOLO11 for instance segmentation and tracking over other models like Mask R-CNN or Faster R-CNN?
Ultralytics YOLO11 offers real-time performance, superior [accuracy](https://www.ultralytics.com/glossary/accuracy), and ease of use compared to other models like Mask R-CNN or Faster R-CNN. YOLO11 provides a seamless integration with Ultralytics HUB, allowing users to manage models, datasets, and training pipelines efficiently. Discover more about the benefits of YOLO11 in the [Ultralytics blog](https://www.ultralytics.com/blog/introducing-ultralytics-yolov8).
### How can I implement object tracking using Ultralytics YOLO11?
To implement object tracking, use the `model.track` method and ensure that each object's ID is consistently assigned across frames. Below is a simple example:
!!! example
=== "Python"
```python
from collections import defaultdict
import cv2
from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator, colors
track_history = defaultdict(lambda: [])
model = YOLO("yolo11n-seg.pt") # segmentation model
cap = cv2.VideoCapture("path/to/video/file.mp4")
w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))
out = cv2.VideoWriter("instance-segmentation-object-tracking.avi", cv2.VideoWriter_fourcc(*"MJPG"), fps, (w, h))
while True:
ret, im0 = cap.read()
if not ret:
break
annotator = Annotator(im0, line_width=2)
results = model.track(im0, persist=True)
if results[0].boxes.id is not None and results[0].masks is not None:
masks = results[0].masks.xy
track_ids = results[0].boxes.id.int().cpu().tolist()
for mask, track_id in zip(masks, track_ids):
annotator.seg_bbox(mask=mask, mask_color=colors(track_id, True), label=str(track_id))
out.write(im0)
cv2.imshow("instance-segmentation-object-tracking", im0)
if cv2.waitKey(1) & 0xFF == ord("q"):
break
out.release()
cap.release()
cv2.destroyAllWindows()
```
Find more in the [Instance Segmentation and Tracking section](#samples).
### Are there any datasets provided by Ultralytics suitable for training YOLO11 models for instance segmentation and tracking?
Yes, Ultralytics offers several datasets suitable for training YOLO11 models, including segmentation and tracking datasets. Dataset examples, structures, and instructions for use can be found in the [Ultralytics Datasets documentation](https://docs.ultralytics.com/datasets/).
---
comments: true
description: Learn to extract isolated objects from inference results using Ultralytics Predict Mode. Step-by-step guide for segmentation object isolation.
keywords: Ultralytics, segmentation, object isolation, Predict Mode, YOLO11, machine learning, object detection, binary mask, image processing
---
# Isolating Segmentation Objects
After performing the [Segment Task](../tasks/segment.md), it's sometimes desirable to extract the isolated objects from the inference results. This guide provides a generic recipe on how to accomplish this using the Ultralytics [Predict Mode](../modes/predict.md).
<p align="center">
<img src="https://github.com/ultralytics/docs/releases/download/0/isolated-object-segmentation.avif" alt="Example Isolated Object Segmentation">
</p>
## Recipe Walk Through
1. See the [Ultralytics Quickstart Installation section](../quickstart.md) for a quick walkthrough on installing the required libraries.
***
2. Load a model and run `predict()` method on a source.
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-seg.pt")
# Run inference
results = model.predict()
```
!!! question "No Prediction Arguments?"
Without specifying a source, the example images from the library will be used:
```
'ultralytics/assets/bus.jpg'
'ultralytics/assets/zidane.jpg'
```
This is helpful for rapid testing with the `predict()` method.
For additional information about Segmentation Models, visit the [Segment Task](../tasks/segment.md#models) page. To learn more about the `predict()` method, see the [Predict Mode](../modes/predict.md) section of the documentation.
***
3. Now iterate over the results and the contours. For workflows that save images to file, the source image `base-name` and the detection `class-label` are retrieved for later use (optional).
```{ .py .annotate }
from pathlib import Path
import numpy as np
# (2) Iterate detection results (helpful for multiple images)
for r in results:
img = np.copy(r.orig_img)
img_name = Path(r.path).stem # source image base-name
# Iterate each object contour (multiple detections)
for ci, c in enumerate(r):
# (1) Get detection class name
label = c.names[c.boxes.cls.tolist().pop()]
```
1. To learn more about working with detection results, see [Boxes Section for Predict Mode](../modes/predict.md#boxes).
2. To learn more about `predict()` results, see [Working with Results for Predict Mode](../modes/predict.md#working-with-results).
??? info "For-Loop"
A single image will only iterate the outer loop once. An image containing a single detection will iterate each loop _only_ once.
***
4. Start by generating a binary mask from the source image, then draw a filled contour onto the mask. This allows the object to be isolated from the other parts of the image. An example from `bus.jpg` for one of the detected `person` class objects is shown on the right.
![Binary Mask Image](https://github.com/ultralytics/ultralytics/assets/62214284/59bce684-fdda-4b17-8104-0b4b51149aca){ width="240", align="right" }
```{ .py .annotate }
import cv2
# Create binary mask
b_mask = np.zeros(img.shape[:2], np.uint8)
# (1) Extract contour result
contour = c.masks.xy.pop()
# (2) Changing the type
contour = contour.astype(np.int32)
# (3) Reshaping
contour = contour.reshape(-1, 1, 2)
# Draw contour onto mask
_ = cv2.drawContours(b_mask, [contour], -1, (255, 255, 255), cv2.FILLED)
```
1. For more info on `c.masks.xy` see [Masks Section from Predict Mode](../modes/predict.md#masks).
2. Here the values are cast into `np.int32` for compatibility with `drawContours()` function from [OpenCV](https://www.ultralytics.com/glossary/opencv).
3. The OpenCV `drawContours()` function expects contours to have a shape of `[N, 1, 2]`; expand the section below for more details.
<details>
<summary> Expand to understand what is happening when defining the <code>contour</code> variable.</summary>
<p>
- `c.masks.xy` :: Provides the coordinates of the mask contour points in the format `(x, y)`. For more details, refer to the [Masks Section from Predict Mode](../modes/predict.md#masks).
- `.pop()` :: As `masks.xy` is a list containing a single element, this element is extracted using the `pop()` method.
- `.astype(np.int32)` :: Using `masks.xy` will return with a data type of `float32`, but this won't be compatible with the OpenCV `drawContours()` function, so this will change the data type to `int32` for compatibility.
- `.reshape(-1, 1, 2)` :: Reformats the data into the required shape of `[N, 1, 2]` where `N` is the number of contour points, with each point represented by a single entry `1`, and the entry is composed of `2` values. The `-1` denotes that the number of values along this dimension is flexible.
</details>
<p></p>
<details>
<summary> Expand for an explanation of the <code>drawContours()</code> configuration.</summary>
<p>
- Encapsulating the `contour` variable within square brackets, `[contour]`, was found to effectively generate the desired contour mask during testing.
- The value `-1` specified for the `contourIdx` parameter of `drawContours()` instructs the function to draw all contours present in the list.
- The `tuple` `(255, 255, 255)` represents the color white, which is the desired color for drawing the contour in this binary mask.
- The addition of `cv2.FILLED` will color all pixels enclosed by the contour boundary the same, in this case, all enclosed pixels will be white.
- See [OpenCV Documentation on `drawContours()`](https://docs.opencv.org/4.8.0/d6/d6e/group__imgproc__draw.html#ga746c0625f1781f1ffc9056259103edbc) for more information.
</details>
<p></p>
***
5. Next, there are two options for how to move forward with the image from this point, each with a subsequent sub-option for keeping the full-size image or cropping it.
### Object Isolation Options
!!! example
=== "Black Background Pixels"
```python
# Create 3-channel mask
mask3ch = cv2.cvtColor(b_mask, cv2.COLOR_GRAY2BGR)
# Isolate object with binary mask
isolated = cv2.bitwise_and(mask3ch, img)
```
??? question "How does this work?"
- First, the binary mask is converted from a single-channel image to a three-channel image. This conversion is necessary for the subsequent step where the mask and the original image are combined. Both images must have the same number of channels to be compatible with the blending operation.
- The original image and the three-channel binary mask are merged using the OpenCV function `bitwise_and()`. This operation retains <u>only</u> pixel values that are greater than zero `(> 0)` from both images. Since the mask pixels are greater than zero `(> 0)` <u>only</u> within the contour region, the pixels remaining from the original image are those that overlap with the contour.
### Isolate with Black Pixels: Sub-options
??? info "Full-size Image"
No additional steps are required to keep the full-size image.
<figure markdown>
![Example Full size Isolated Object Image Black Background](https://github.com/ultralytics/docs/releases/download/0/full-size-isolated-object-black-background.avif){ width=240 }
<figcaption>Example full-size output</figcaption>
</figure>
??? info "Cropped Object Image"
Additional steps are required to crop the image to include only the object region.
![Example Crop Isolated Object Image Black Background](https://github.com/ultralytics/docs/releases/download/0/example-crop-isolated-object-image-black-background.avif){ align="right" }
```{ .py .annotate }
# (1) Bounding box coordinates
x1, y1, x2, y2 = c.boxes.xyxy.cpu().numpy().squeeze().astype(np.int32)
# Crop image to object region
iso_crop = isolated[y1:y2, x1:x2]
```
1. For more information on [bounding box](https://www.ultralytics.com/glossary/bounding-box) results, see [Boxes Section from Predict Mode](../modes/predict.md/#boxes)
??? question "What does this code do?"
- The `c.boxes.xyxy.cpu().numpy()` call retrieves the bounding boxes as a NumPy array in the `xyxy` format, where `xmin`, `ymin`, `xmax`, and `ymax` represent the coordinates of the bounding box rectangle. See [Boxes Section from Predict Mode](../modes/predict.md/#boxes) for more details.
- The `squeeze()` operation removes any unnecessary dimensions from the NumPy array, ensuring it has the expected shape.
- Converting the coordinate values using `.astype(np.int32)` changes the box coordinates data type from `float32` to `int32`, making them compatible for image cropping using index slices.
- Finally, the bounding box region is cropped from the image using index slicing. The bounds are defined by the `[ymin:ymax, xmin:xmax]` coordinates of the detection bounding box.
=== "Transparent Background Pixels"
```python
# Isolate object with transparent background (when saved as PNG)
isolated = np.dstack([img, b_mask])
```
??? question "How does this work?"
- Using the NumPy `dstack()` function (array stacking along the depth axis) together with the generated binary mask creates an image with four channels. This allows all pixels outside of the object contour to be transparent when the image is saved as a `PNG` file.
### Isolate with Transparent Pixels: Sub-options
??? info "Full-size Image"
No additional steps are required to keep the full-size image.
<figure markdown>
![Example Full size Isolated Object Image No Background](https://github.com/ultralytics/docs/releases/download/0/example-full-size-isolated-object-image-no-background.avif){ width=240 }
<figcaption>Example full-size output + transparent background</figcaption>
</figure>
??? info "Cropped Object Image"
Additional steps are required to crop the image to include only the object region.
![Example Crop Isolated Object Image No Background](https://github.com/ultralytics/docs/releases/download/0/example-crop-isolated-object-image-no-background.avif){ align="right" }
```{ .py .annotate }
# (1) Bounding box coordinates
x1, y1, x2, y2 = c.boxes.xyxy.cpu().numpy().squeeze().astype(np.int32)
# Crop image to object region
iso_crop = isolated[y1:y2, x1:x2]
```
1. For more information on bounding box results, see [Boxes Section from Predict Mode](../modes/predict.md/#boxes)
??? question "What does this code do?"
- When using `c.boxes.xyxy.cpu().numpy()`, the bounding boxes are returned as a NumPy array in the `xyxy` box coordinate format, which corresponds to the points `xmin, ymin, xmax, ymax` of the bounding box (rectangle). See [Boxes Section from Predict Mode](../modes/predict.md/#boxes) for more information.
- Adding `squeeze()` ensures that any extraneous dimensions are removed from the NumPy array.
- Converting the coordinate values using `.astype(np.int32)` changes the box coordinates data type from `float32` to `int32` which will be compatible when cropping the image using index slices.
- Finally, the image region for the bounding box is cropped using index slicing, where the bounds are set using the `[ymin:ymax, xmin:xmax]` coordinates of the detection bounding box.
??? question "What if I want the cropped object **including** the background?"
This is a built-in feature of the Ultralytics library. See the `save_crop` argument for [Predict Mode Inference Arguments](../modes/predict.md/#inference-arguments) for details.
***
6. <u>What to do next is entirely left to you as the developer.</u> A basic example of one possible next step (saving the image to file for future use) is shown.
- **NOTE:** this step is optional and can be skipped if not required for your specific use case.
??? example "Example Final Step"
```python
# Save isolated object to file
_ = cv2.imwrite(f"{img_name}_{label}-{ci}.png", iso_crop)
```
- In this example, the `img_name` is the base-name of the source image file, `label` is the detected class-name, and `ci` is the index of the [object detection](https://www.ultralytics.com/glossary/object-detection) (in case of multiple instances with the same class name).
## Full Example Code
Here, all steps from the previous section are combined into a single block of code. For repeated use, it would be optimal to define a function to perform some or all of the commands contained in the `for` loops, but that is an exercise left to the reader.
```{ .py .annotate }
from pathlib import Path
import cv2
import numpy as np
from ultralytics import YOLO
m = YOLO("yolo11n-seg.pt") # (4)!
res = m.predict() # (3)!
# Iterate detection results (5)
for r in res:
img = np.copy(r.orig_img)
img_name = Path(r.path).stem
# Iterate each object contour (6)
for ci, c in enumerate(r):
label = c.names[c.boxes.cls.tolist().pop()]
b_mask = np.zeros(img.shape[:2], np.uint8)
# Create contour mask (1)
contour = c.masks.xy.pop().astype(np.int32).reshape(-1, 1, 2)
_ = cv2.drawContours(b_mask, [contour], -1, (255, 255, 255), cv2.FILLED)
# Choose one:
# OPTION-1: Isolate object with black background
mask3ch = cv2.cvtColor(b_mask, cv2.COLOR_GRAY2BGR)
isolated = cv2.bitwise_and(mask3ch, img)
# OPTION-2: Isolate object with transparent background (when saved as PNG)
isolated = np.dstack([img, b_mask])
# OPTIONAL: detection crop (from either OPT1 or OPT2)
x1, y1, x2, y2 = c.boxes.xyxy.cpu().numpy().squeeze().astype(np.int32)
iso_crop = isolated[y1:y2, x1:x2]
# TODO your actions go here (2)
```
1. The line populating `contour` is combined into a single line here, where it was split across multiple lines above.
2. {==What goes here is up to you!==}
3. See [Predict Mode](../modes/predict.md) for additional information.
4. See [Segment Task](../tasks/segment.md#models) for more information.
5. Learn more about [Working with Results](../modes/predict.md#working-with-results).
6. Learn more about [Segmentation Mask Results](../modes/predict.md#masks).
## FAQ
### How do I isolate objects using Ultralytics YOLO11 for segmentation tasks?
To isolate objects using Ultralytics YOLO11, follow these steps:
1. **Load the model and run inference:**
```python
from ultralytics import YOLO
model = YOLO("yolo11n-seg.pt")
results = model.predict(source="path/to/your/image.jpg")
```
2. **Generate a binary mask and draw contours:**
```python
import cv2
import numpy as np
img = np.copy(results[0].orig_img)
b_mask = np.zeros(img.shape[:2], np.uint8)
contour = results[0].masks.xy[0].astype(np.int32).reshape(-1, 1, 2)
cv2.drawContours(b_mask, [contour], -1, (255, 255, 255), cv2.FILLED)
```
3. **Isolate the object using the binary mask:**
```python
mask3ch = cv2.cvtColor(b_mask, cv2.COLOR_GRAY2BGR)
isolated = cv2.bitwise_and(mask3ch, img)
```
Refer to the guide on [Predict Mode](../modes/predict.md) and the [Segment Task](../tasks/segment.md) for more information.
### What options are available for saving the isolated objects after segmentation?
Ultralytics YOLO11 offers two main options for saving isolated objects:
1. **With a Black Background:**
```python
mask3ch = cv2.cvtColor(b_mask, cv2.COLOR_GRAY2BGR)
isolated = cv2.bitwise_and(mask3ch, img)
```
2. **With a Transparent Background:**
```python
isolated = np.dstack([img, b_mask])
```
For further details, visit the [Predict Mode](../modes/predict.md) section.
### How can I crop isolated objects to their bounding boxes using Ultralytics YOLO11?
To crop isolated objects to their bounding boxes:
1. **Retrieve bounding box coordinates:**
```python
x1, y1, x2, y2 = results[0].boxes.xyxy[0].cpu().numpy().astype(np.int32)
```
2. **Crop the isolated image:**
```python
iso_crop = isolated[y1:y2, x1:x2]
```
Learn more about bounding box results in the [Predict Mode](../modes/predict.md#boxes) documentation.
### Why should I use Ultralytics YOLO11 for object isolation in segmentation tasks?
Ultralytics YOLO11 provides:
- **High-speed** real-time object detection and segmentation.
- **Accurate bounding box and mask generation** for precise object isolation.
- **Comprehensive documentation** and easy-to-use API for efficient development.
Explore the benefits of using YOLO in the [Segment Task documentation](../tasks/segment.md).
### Can I save isolated objects including the background using Ultralytics YOLO11?
Yes, this is a built-in feature in Ultralytics YOLO11. Use the `save_crop` argument in the `predict()` method. For example:
```python
results = model.predict(source="path/to/your/image.jpg", save_crop=True)
```
Read more about the `save_crop` argument in the [Predict Mode Inference Arguments](../modes/predict.md#inference-arguments) section.
---
comments: true
description: Learn to implement K-Fold Cross Validation for object detection datasets using Ultralytics YOLO. Improve your model's reliability and robustness.
keywords: Ultralytics, YOLO, K-Fold Cross Validation, object detection, sklearn, pandas, PyYaml, machine learning, dataset split
---
# K-Fold Cross Validation with Ultralytics
## Introduction
This comprehensive guide illustrates the implementation of K-Fold Cross Validation for [object detection](https://www.ultralytics.com/glossary/object-detection) datasets within the Ultralytics ecosystem. We'll leverage the YOLO detection format and key Python libraries such as scikit-learn, pandas, and PyYAML to guide you through the necessary setup, the process of generating feature vectors, and the execution of a K-Fold dataset split.
<p align="center">
<img width="800" src="https://github.com/ultralytics/docs/releases/download/0/k-fold-cross-validation-overview.avif" alt="K-Fold Cross Validation Overview">
</p>
Whether your project involves the Fruit Detection dataset or a custom data source, this tutorial aims to help you comprehend and apply K-Fold Cross Validation to bolster the reliability and robustness of your [machine learning](https://www.ultralytics.com/glossary/machine-learning-ml) models. While we're applying `k=5` folds for this tutorial, keep in mind that the optimal number of folds can vary depending on your dataset and the specifics of your project.
Without further ado, let's dive in!
## Setup
- Your annotations should be in the [YOLO detection format](../datasets/detect/index.md).
- This guide assumes that annotation files are locally available.
- For our demonstration, we use the [Fruit Detection](https://www.kaggle.com/datasets/lakshaytyagi01/fruit-detection/code) dataset.
- This dataset contains a total of 8479 images.
- It includes 6 class labels, each with its total instance count listed below.
| Class Label | Instance Count |
| :---------- | :------------: |
| Apple | 7049 |
| Grapes | 7202 |
| Pineapple | 1613 |
| Orange | 15549 |
| Banana | 3536 |
| Watermelon | 1976 |
- Necessary Python packages include:
- `ultralytics`
- `sklearn`
- `pandas`
- `pyyaml`
- This tutorial operates with `k=5` folds. However, you should determine the best number of folds for your specific dataset.
1. Initiate a new Python virtual environment (`venv`) for your project and activate it. Use `pip` (or your preferred package manager) to install:
- The Ultralytics library: `pip install -U ultralytics`. Alternatively, you can clone the official [repo](https://github.com/ultralytics/ultralytics).
- Scikit-learn, pandas, and PyYAML: `pip install -U scikit-learn pandas pyyaml`.
2. Verify that your annotations are in the [YOLO detection format](../datasets/detect/index.md).
- For this tutorial, all annotation files are found in the `Fruit-Detection/labels` directory.
## Generating Feature Vectors for Object Detection Dataset
1. Start by creating a new `example.py` Python file for the steps below.
2. Proceed to retrieve all label files for your dataset.
```python
from pathlib import Path
dataset_path = Path("./Fruit-detection") # replace with 'path/to/dataset' for your custom data
labels = sorted(dataset_path.rglob("*labels/*.txt")) # all data in 'labels'
```
3. Now, read the contents of the dataset YAML file and extract the indices of the class labels.
```python
import yaml

yaml_file = "path/to/data.yaml"  # your data YAML with data directories and names dictionary
with open(yaml_file, "r", encoding="utf8") as y:
classes = yaml.safe_load(y)["names"]
cls_idx = sorted(classes.keys())
```
4. Initialize an empty `pandas` DataFrame.
```python
import pandas as pd
indx = [label.stem for label in labels] # uses base filename as ID (no extension)
labels_df = pd.DataFrame([], columns=cls_idx, index=indx)
```
5. Count the instances of each class-label present in the annotation files.
```python
from collections import Counter
for label in labels:
lbl_counter = Counter()
with open(label, "r") as lf:
lines = lf.readlines()
for line in lines:
# YOLO labels use the integer class index as the first value on each line
lbl_counter[int(line.split(" ")[0])] += 1
labels_df.loc[label.stem] = lbl_counter
labels_df = labels_df.fillna(0.0) # replace `nan` values with `0.0`
```
6. The following is a sample view of the populated DataFrame:
```pandas
0 1 2 3 4 5
'0000a16e4b057580_jpg.rf.00ab48988370f64f5ca8ea4...' 0.0 0.0 0.0 0.0 0.0 7.0
'0000a16e4b057580_jpg.rf.7e6dce029fb67f01eb19aa7...' 0.0 0.0 0.0 0.0 0.0 7.0
'0000a16e4b057580_jpg.rf.bc4d31cdcbe229dd022957a...' 0.0 0.0 0.0 0.0 0.0 7.0
'00020ebf74c4881c_jpg.rf.508192a0a97aa6c4a3b6882...' 0.0 0.0 0.0 1.0 0.0 0.0
'00020ebf74c4881c_jpg.rf.5af192a2254c8ecc4188a25...' 0.0 0.0 0.0 1.0 0.0 0.0
... ... ... ... ... ... ...
'ff4cd45896de38be_jpg.rf.c4b5e967ca10c7ced3b9e97...' 0.0 0.0 0.0 0.0 0.0 2.0
'ff4cd45896de38be_jpg.rf.ea4c1d37d2884b3e3cbce08...' 0.0 0.0 0.0 0.0 0.0 2.0
'ff5fd9c3c624b7dc_jpg.rf.bb519feaa36fc4bf630a033...' 1.0 0.0 0.0 0.0 0.0 0.0
'ff5fd9c3c624b7dc_jpg.rf.f0751c9c3aa4519ea3c9d6a...' 1.0 0.0 0.0 0.0 0.0 0.0
'fffe28b31f2a70d4_jpg.rf.7ea16bd637ba0711c53b540...' 0.0 6.0 0.0 0.0 0.0 0.0
```
The rows index the label files, each corresponding to an image in your dataset, and the columns correspond to your class-label indices. Each row represents a pseudo feature-vector containing the counts of each class-label present in the corresponding image. This data structure enables the application of K-Fold Cross Validation to an object detection dataset.
## K-Fold Dataset Split
1. Now we will use the `KFold` class from `sklearn.model_selection` to generate `k` splits of the dataset.
- Important:
- Setting `shuffle=True` ensures a randomized distribution of classes in your splits.
- By setting `random_state=M` where `M` is a chosen integer, you can obtain repeatable results.
```python
from sklearn.model_selection import KFold
ksplit = 5
kf = KFold(n_splits=ksplit, shuffle=True, random_state=20) # setting random_state for repeatable results
kfolds = list(kf.split(labels_df))
```
2. The dataset has now been split into `k` folds, each having a list of `train` and `val` indices. We will construct a DataFrame to display these results more clearly.
```python
folds = [f"split_{n}" for n in range(1, ksplit + 1)]
folds_df = pd.DataFrame(index=indx, columns=folds)
for idx, (train, val) in enumerate(kfolds, start=1):
folds_df[f"split_{idx}"].loc[labels_df.iloc[train].index] = "train"
folds_df[f"split_{idx}"].loc[labels_df.iloc[val].index] = "val"
```
3. Now we will calculate the distribution of class labels for each fold as a ratio of the classes present in `val` to those present in `train`.
```python
fold_lbl_distrb = pd.DataFrame(index=folds, columns=cls_idx)
for n, (train_indices, val_indices) in enumerate(kfolds, start=1):
train_totals = labels_df.iloc[train_indices].sum()
val_totals = labels_df.iloc[val_indices].sum()
# To avoid division by zero, we add a small value (1E-7) to the denominator
ratio = val_totals / (train_totals + 1e-7)
fold_lbl_distrb.loc[f"split_{n}"] = ratio
```
The ideal scenario is for all class ratios to be reasonably similar for each split and across classes. This, however, will be subject to the specifics of your dataset.
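One lightweight way to check this is to summarize the ratio spread per class. A minimal sketch using the `fold_lbl_distrb` DataFrame populated above (illustrative only):
```python
# Summarize the spread of val/train ratios per class; values that sit
# close together across splits suggest reasonably balanced folds
ratio_summary = fold_lbl_distrb.astype(float).describe().loc[["mean", "min", "max"]]
print(ratio_summary)
```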
4. Next, we create the directories and dataset YAML files for each split.
```python
import datetime
supported_extensions = [".jpg", ".jpeg", ".png"]
# Initialize an empty list to store image file paths
images = []
# Loop through supported extensions and gather image files
for ext in supported_extensions:
images.extend(sorted((dataset_path / "images").rglob(f"*{ext}")))
# Create the necessary directories and dataset YAML files
save_path = dataset_path / f"{datetime.date.today().isoformat()}_{ksplit}-Fold_Cross-val"
save_path.mkdir(parents=True, exist_ok=True)
ds_yamls = []
for split in folds_df.columns:
# Create directories
split_dir = save_path / split
split_dir.mkdir(parents=True, exist_ok=True)
(split_dir / "train" / "images").mkdir(parents=True, exist_ok=True)
(split_dir / "train" / "labels").mkdir(parents=True, exist_ok=True)
(split_dir / "val" / "images").mkdir(parents=True, exist_ok=True)
(split_dir / "val" / "labels").mkdir(parents=True, exist_ok=True)
# Create dataset YAML files
dataset_yaml = split_dir / f"{split}_dataset.yaml"
ds_yamls.append(dataset_yaml)
with open(dataset_yaml, "w") as ds_y:
yaml.safe_dump(
{
"path": split_dir.as_posix(),
"train": "train",
"val": "val",
"names": classes,
},
ds_y,
)
```
5. Lastly, copy images and labels into the respective directory ('train' or 'val') for each split.
- **NOTE:** The time required for this portion of the code will vary based on the size of your dataset and your system hardware.
```python
import shutil
for image, label in zip(images, labels):
for split, k_split in folds_df.loc[image.stem].items():
# Destination directory
img_to_path = save_path / split / k_split / "images"
lbl_to_path = save_path / split / k_split / "labels"
# Copy image and label files to the new directory (shutil.copy raises SameFileError if source and destination are identical)
shutil.copy(image, img_to_path / image.name)
shutil.copy(label, lbl_to_path / label.name)
```
## Save Records (Optional)
Optionally, you can save the records of the K-Fold split and label distribution DataFrames as CSV files for future reference.
```python
folds_df.to_csv(save_path / "kfold_datasplit.csv")
fold_lbl_distrb.to_csv(save_path / "kfold_label_distribution.csv")
```
## Train YOLO using K-Fold Data Splits
1. First, load the YOLO model.
```python
from ultralytics import YOLO
weights_path = "path/to/weights.pt"
model = YOLO(weights_path, task="detect")
```
2. Next, iterate over the dataset YAML files to run training. The results will be saved to a directory specified by the `project` and `name` arguments; by default, this directory is `runs/detect/train#`, where `#` is an incrementing integer index.
```python
results = {}
# Define your additional arguments here
batch = 16
project = "kfold_demo"
epochs = 100
for k in range(ksplit):
dataset_yaml = ds_yamls[k]
model = YOLO(weights_path, task="detect")
model.train(data=dataset_yaml, epochs=epochs, batch=batch, project=project) # include any train arguments
results[k] = model.metrics # save output metrics for further analysis
```
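Once all folds have finished, you can aggregate the saved metrics to compare splits. A minimal sketch, assuming each entry in `results` exposes the standard Ultralytics detection attribute `box.map` (mAP50-95):
```python
# Aggregate mAP50-95 across folds for a quick comparison of the splits
maps = [results[k].box.map for k in range(ksplit)]
print(f"mAP50-95 per fold: {[round(m, 4) for m in maps]}")
print(f"Mean mAP50-95 across {ksplit} folds: {sum(maps) / ksplit:.4f}")
```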
## Conclusion
In this guide, we have explored the process of using K-Fold cross-validation for training the YOLO object detection model. We learned how to split our dataset into K partitions, ensuring a balanced class distribution across the different folds.
We also explored the procedure for creating report DataFrames to visualize the data splits and label distributions across these splits, providing us with clear insight into the structure of our training and validation sets.
Optionally, we saved our records for future reference, which could be particularly useful in large-scale projects or when troubleshooting model performance.
Finally, we implemented the actual model training using each split in a loop, saving our training results for further analysis and comparison.
This technique of K-Fold cross-validation is a robust way of making the most out of your available data, and it helps to ensure that your model performance is reliable and consistent across different data subsets. This results in a more generalizable and reliable model that is less likely to overfit to specific data patterns.
Remember that although we used YOLO in this guide, these steps are mostly transferable to other machine learning models. Understanding these steps allows you to apply cross-validation effectively in your own machine learning projects. Happy coding!
## FAQ
### What is K-Fold Cross Validation and why is it useful in object detection?
K-Fold Cross Validation is a technique where the dataset is divided into 'k' subsets (folds) to evaluate model performance more reliably. Each fold serves as both training and [validation data](https://www.ultralytics.com/glossary/validation-data). In the context of object detection, using K-Fold Cross Validation helps to ensure your Ultralytics YOLO model's performance is robust and generalizable across different data splits, enhancing its reliability. For detailed instructions on setting up K-Fold Cross Validation with Ultralytics YOLO, refer to [K-Fold Cross Validation with Ultralytics](#introduction).
### How do I implement K-Fold Cross Validation using Ultralytics YOLO?
To implement K-Fold Cross Validation with Ultralytics YOLO, you need to follow these steps:
1. Verify annotations are in the [YOLO detection format](../datasets/detect/index.md).
2. Use Python libraries like `sklearn`, `pandas`, and `pyyaml`.
3. Create feature vectors from your dataset.
4. Split your dataset using `KFold` from `sklearn.model_selection`.
5. Train the YOLO model on each split.
For a comprehensive guide, see the [K-Fold Dataset Split](#k-fold-dataset-split) section in our documentation.
### Why should I use Ultralytics YOLO for object detection?
Ultralytics YOLO offers state-of-the-art, real-time object detection with high [accuracy](https://www.ultralytics.com/glossary/accuracy) and efficiency. It's versatile, supporting multiple [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) tasks such as detection, segmentation, and classification. Additionally, it integrates seamlessly with tools like Ultralytics HUB for no-code model training and deployment. For more details, explore the benefits and features on our [Ultralytics YOLO page](https://www.ultralytics.com/yolo).
### How can I ensure my annotations are in the correct format for Ultralytics YOLO?
Your annotations should follow the YOLO detection format. Each annotation file must list the object class, alongside its [bounding box](https://www.ultralytics.com/glossary/bounding-box) coordinates in the image. The YOLO format ensures streamlined and standardized data processing for training object detection models. For more information on proper annotation formatting, visit the [YOLO detection format guide](../datasets/detect/index.md).
### Can I use K-Fold Cross Validation with custom datasets other than Fruit Detection?
Yes, you can use K-Fold Cross Validation with any custom dataset as long as the annotations are in the YOLO detection format. Replace the dataset paths and class labels with those specific to your custom dataset. This flexibility ensures that any object detection project can benefit from robust model evaluation using K-Fold Cross Validation. For a practical example, review our [Generating Feature Vectors](#generating-feature-vectors-for-object-detection-dataset) section.
---
comments: true
description: Learn about YOLO11's diverse deployment options to maximize your model's performance. Explore PyTorch, TensorRT, OpenVINO, TF Lite, and more!
keywords: YOLO11, deployment options, export formats, PyTorch, TensorRT, OpenVINO, TF Lite, machine learning, model deployment
---
# Understanding YOLO11's Deployment Options
## Introduction
You've come a long way on your journey with YOLO11. You've diligently collected data, meticulously annotated it, and put in the hours to train and rigorously evaluate your custom YOLO11 model. Now, it's time to put your model to work for your specific application, use case, or project. But there's a critical decision that stands before you: how to export and deploy your model effectively.
This guide walks you through YOLO11's deployment options and the essential factors to consider to choose the right option for your project.
## How to Select the Right Deployment Option for Your YOLO11 Model
When it's time to deploy your YOLO11 model, selecting a suitable export format is very important. As outlined in the [Ultralytics YOLO11 Modes documentation](../modes/export.md#usage-examples), the `model.export()` function allows for converting your trained model into a variety of formats tailored to diverse environments and performance requirements.
The ideal format depends on your model's intended operational context, balancing speed, hardware constraints, and ease of integration. In the following section, we'll take a closer look at each export option, understanding when to choose each one.
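For orientation, the export call itself is a one-liner. A minimal sketch using the stock `yolo11n.pt` weights and TorchScript purely as an illustration; each format covered below is selected the same way via the `format` string:
```python
from ultralytics import YOLO

# Load a trained model and convert it for deployment
model = YOLO("yolo11n.pt")
model.export(format="torchscript")  # e.g. "onnx", "openvino", "engine", ...
```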
### YOLO11's Deployment Options
Let's walk through the different YOLO11 deployment options. For a detailed walkthrough of the export process, visit the [Ultralytics documentation page on exporting](../modes/export.md).
#### PyTorch
PyTorch is an open-source machine learning library widely used for applications in [deep learning](https://www.ultralytics.com/glossary/deep-learning-dl) and [artificial intelligence](https://www.ultralytics.com/glossary/artificial-intelligence-ai). It provides a high level of flexibility and speed, which has made it a favorite among researchers and developers.
- **Performance Benchmarks**: PyTorch is known for its ease of use and flexibility, which may result in a slight trade-off in raw performance when compared to other frameworks that are more specialized and optimized.
- **Compatibility and Integration**: Offers excellent compatibility with various data science and machine learning libraries in Python.
- **Community Support and Ecosystem**: One of the most vibrant communities, with extensive resources for learning and troubleshooting.
- **Case Studies**: Commonly used in research prototypes, many academic papers reference models deployed in PyTorch.
- **Maintenance and Updates**: Regular updates with active development and support for new features.
- **Security Considerations**: Regular patches for security issues, but security is largely dependent on the overall environment it's deployed in.
- **Hardware Acceleration**: Supports CUDA for GPU acceleration, essential for speeding up model training and inference.
#### TorchScript
TorchScript extends PyTorch's capabilities by allowing models to be exported and run in a C++ runtime environment. This makes it suitable for production environments where Python is unavailable.
- **Performance Benchmarks**: Can offer improved performance over native PyTorch, especially in production environments.
- **Compatibility and Integration**: Designed for seamless transition from PyTorch to C++ production environments, though some advanced features might not translate perfectly.
- **Community Support and Ecosystem**: Benefits from PyTorch's large community but has a narrower scope of specialized developers.
- **Case Studies**: Widely used in industry settings where Python's performance overhead is a bottleneck.
- **Maintenance and Updates**: Maintained alongside PyTorch with consistent updates.
- **Security Considerations**: Offers improved security by enabling the running of models in environments without full Python installations.
- **Hardware Acceleration**: Inherits PyTorch's CUDA support, ensuring efficient GPU utilization.
#### ONNX
The Open [Neural Network](https://www.ultralytics.com/glossary/neural-network-nn) Exchange (ONNX) is a format that allows for model interoperability across different frameworks, which can be critical when deploying to various platforms.
- **Performance Benchmarks**: ONNX models may experience variable performance depending on the specific runtime they are deployed on.
- **Compatibility and Integration**: High interoperability across multiple platforms and hardware due to its framework-agnostic nature.
- **Community Support and Ecosystem**: Supported by many organizations, leading to a broad ecosystem and a variety of tools for optimization.
- **Case Studies**: Frequently used to move models between different machine learning frameworks, demonstrating its flexibility.
- **Maintenance and Updates**: As an open standard, ONNX is regularly updated to support new operations and models.
- **Security Considerations**: As with any cross-platform tool, it's essential to ensure secure practices in the conversion and deployment pipeline.
- **Hardware Acceleration**: With ONNX Runtime, models can leverage various hardware optimizations.
#### OpenVINO
OpenVINO is an Intel toolkit designed to facilitate the deployment of deep learning models across Intel hardware, enhancing performance and speed.
- **Performance Benchmarks**: Specifically optimized for Intel CPUs, GPUs, and VPUs, offering significant performance boosts on compatible hardware.
- **Compatibility and Integration**: Works best within the Intel ecosystem but also supports a range of other platforms.
- **Community Support and Ecosystem**: Backed by Intel, with a solid user base especially in the [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) domain.
- **Case Studies**: Often utilized in IoT and [edge computing](https://www.ultralytics.com/glossary/edge-computing) scenarios where Intel hardware is prevalent.
- **Maintenance and Updates**: Intel regularly updates OpenVINO to support the latest deep learning models and Intel hardware.
- **Security Considerations**: Provides robust security features suitable for deployment in sensitive applications.
- **Hardware Acceleration**: Tailored for acceleration on Intel hardware, leveraging dedicated instruction sets and hardware features.
For more details on deployment using OpenVINO, refer to the Ultralytics Integration documentation: [Intel OpenVINO Export](../integrations/openvino.md).
#### TensorRT
TensorRT is a high-performance deep learning inference optimizer and runtime from NVIDIA, ideal for applications needing speed and efficiency.
- **Performance Benchmarks**: Delivers top-tier performance on NVIDIA GPUs with support for high-speed inference.
- **Compatibility and Integration**: Best suited for NVIDIA hardware, with limited support outside this environment.
- **Community Support and Ecosystem**: Strong support network through NVIDIA's developer forums and documentation.
- **Case Studies**: Widely adopted in industries requiring real-time inference on video and image data.
- **Maintenance and Updates**: NVIDIA maintains TensorRT with frequent updates to enhance performance and support new GPU architectures.
- **Security Considerations**: Like many NVIDIA products, it has a strong emphasis on security, but specifics depend on the deployment environment.
- **Hardware Acceleration**: Exclusively designed for NVIDIA GPUs, providing deep optimization and acceleration.
#### CoreML
CoreML is Apple's machine learning framework, optimized for on-device performance in the Apple ecosystem, including iOS, macOS, watchOS, and tvOS.
- **Performance Benchmarks**: Optimized for on-device performance on Apple hardware with minimal battery usage.
- **Compatibility and Integration**: Exclusively for Apple's ecosystem, providing a streamlined workflow for iOS and macOS applications.
- **Community Support and Ecosystem**: Strong support from Apple and a dedicated developer community, with extensive documentation and tools.
- **Case Studies**: Commonly used in applications that require on-device machine learning capabilities on Apple products.
- **Maintenance and Updates**: Regularly updated by Apple to support the latest machine learning advancements and Apple hardware.
- **Security Considerations**: Benefits from Apple's focus on user privacy and [data security](https://www.ultralytics.com/glossary/data-security).
- **Hardware Acceleration**: Takes full advantage of Apple's neural engine and GPU for accelerated machine learning tasks.
#### TF SavedModel
TF SavedModel is TensorFlow's format for saving and serving machine learning models, particularly suited for scalable server environments.
- **Performance Benchmarks**: Offers scalable performance in server environments, especially when used with TensorFlow Serving.
- **Compatibility and Integration**: Wide compatibility across TensorFlow's ecosystem, including cloud and enterprise server deployments.
- **Community Support and Ecosystem**: Large community support due to TensorFlow's popularity, with a vast array of tools for deployment and optimization.
- **Case Studies**: Extensively used in production environments for serving deep learning models at scale.
- **Maintenance and Updates**: Supported by Google and the TensorFlow community, ensuring regular updates and new features.
- **Security Considerations**: Deployment using TensorFlow Serving includes robust security features for enterprise-grade applications.
- **Hardware Acceleration**: Supports various hardware accelerations through TensorFlow's backends.
#### TF GraphDef
TF GraphDef is a TensorFlow format that represents the model as a graph, which is beneficial for environments where a static computation graph is required.
- **Performance Benchmarks**: Provides stable performance for static computation graphs, with a focus on consistency and reliability.
- **Compatibility and Integration**: Easily integrates within TensorFlow's infrastructure but less flexible compared to SavedModel.
- **Community Support and Ecosystem**: Good support from TensorFlow's ecosystem, with many resources available for optimizing static graphs.
- **Case Studies**: Useful in scenarios where a static graph is necessary, such as in certain embedded systems.
- **Maintenance and Updates**: Regular updates alongside TensorFlow's core updates.
- **Security Considerations**: Ensures safe deployment with TensorFlow's established security practices.
- **Hardware Acceleration**: Can utilize TensorFlow's hardware acceleration options, though not as flexible as SavedModel.
#### TF Lite
TF Lite is TensorFlow's solution for mobile and embedded device machine learning, providing a lightweight library for on-device inference.
- **Performance Benchmarks**: Designed for speed and efficiency on mobile and embedded devices.
- **Compatibility and Integration**: Can be used on a wide range of devices due to its lightweight nature.
- **Community Support and Ecosystem**: Backed by Google, it has a robust community and a growing number of resources for developers.
- **Case Studies**: Popular in mobile applications that require on-device inference with minimal footprint.
- **Maintenance and Updates**: Regularly updated to include the latest features and optimizations for mobile devices.
- **Security Considerations**: Provides a secure environment for running models on end-user devices.
- **Hardware Acceleration**: Supports a variety of hardware acceleration options, including GPU and DSP.
#### TF Edge TPU
TF Edge TPU is designed for high-speed, efficient computing on Google's Edge TPU hardware, perfect for IoT devices requiring real-time processing.
- **Performance Benchmarks**: Specifically optimized for high-speed, efficient computing on Google's Edge TPU hardware.
- **Compatibility and Integration**: Works exclusively with TensorFlow Lite models on Edge TPU devices.
- **Community Support and Ecosystem**: Growing support with resources provided by Google and third-party developers.
- **Case Studies**: Used in IoT devices and applications that require real-time processing with low latency.
- **Maintenance and Updates**: Continually improved upon to leverage the capabilities of new Edge TPU hardware releases.
- **Security Considerations**: Integrates with Google's robust security for IoT and edge devices.
- **Hardware Acceleration**: Custom-designed to take full advantage of Google Coral devices.
#### TF.js
TensorFlow.js (TF.js) is a library that brings machine learning capabilities directly to the browser, offering a new realm of possibilities for web developers and users alike. It allows for the integration of machine learning models in web applications without the need for back-end infrastructure.
- **Performance Benchmarks**: Enables machine learning directly in the browser with reasonable performance, depending on the client device.
- **Compatibility and Integration**: High compatibility with web technologies, allowing for easy integration into web applications.
- **Community Support and Ecosystem**: Support from a community of web and Node.js developers, with a variety of tools for deploying ML models in browsers.
- **Case Studies**: Ideal for interactive web applications that benefit from client-side machine learning without the need for server-side processing.
- **Maintenance and Updates**: Maintained by the TensorFlow team with contributions from the open-source community.
- **Security Considerations**: Runs within the browser's secure context, utilizing the security model of the web platform.
- **Hardware Acceleration**: Performance can be enhanced with web-based APIs that access hardware acceleration like WebGL.
#### PaddlePaddle
PaddlePaddle is an open-source deep learning framework developed by Baidu. It is designed to be both efficient for researchers and easy to use for developers. It's particularly popular in China and offers specialized support for Chinese language processing.
- **Performance Benchmarks**: Offers competitive performance with a focus on ease of use and scalability.
- **Compatibility and Integration**: Well-integrated within Baidu's ecosystem and supports a wide range of applications.
- **Community Support and Ecosystem**: While the community is smaller globally, it's rapidly growing, especially in China.
- **Case Studies**: Commonly used in Chinese markets and by developers looking for alternatives to other major frameworks.
- **Maintenance and Updates**: Regularly updated with a focus on serving Chinese language AI applications and services.
- **Security Considerations**: Emphasizes [data privacy](https://www.ultralytics.com/glossary/data-privacy) and security, catering to Chinese data governance standards.
- **Hardware Acceleration**: Supports various hardware accelerations, including Baidu's own Kunlun chips.
#### NCNN
NCNN is a high-performance neural network inference framework optimized for the mobile platform. It stands out for its lightweight nature and efficiency, making it particularly well-suited for mobile and embedded devices where resources are limited.
- **Performance Benchmarks**: Highly optimized for mobile platforms, offering efficient inference on ARM-based devices.
- **Compatibility and Integration**: Suitable for applications on mobile phones and embedded systems with ARM architecture.
- **Community Support and Ecosystem**: Supported by a niche but active community focused on mobile and embedded ML applications.
- **Case Studies**: Favored for mobile applications where efficiency and speed are critical on Android and other ARM-based systems.
- **Maintenance and Updates**: Continuously improved to maintain high performance on a range of ARM devices.
- **Security Considerations**: Focuses on running locally on the device, leveraging the inherent security of on-device processing.
- **Hardware Acceleration**: Tailored for ARM CPUs and GPUs, with specific optimizations for these architectures.
#### MNN
MNN is a highly efficient and lightweight deep learning framework. It supports inference and training of deep learning models and delivers industry-leading on-device performance for both. MNN is also used on embedded devices, such as IoT hardware.
## Comparative Analysis of YOLO11 Deployment Options
The following table provides a snapshot of the various deployment options available for YOLO11 models, helping you to assess which may best fit your project needs based on several critical criteria. For an in-depth look at each deployment option's format, please see the [Ultralytics documentation page on export formats](../modes/export.md#export-formats).
| Deployment Option | Performance Benchmarks | Compatibility and Integration | Community Support and Ecosystem | Case Studies | Maintenance and Updates | Security Considerations | Hardware Acceleration |
| ----------------- | ----------------------------------------------- | ---------------------------------------------- | --------------------------------------------- | ------------------------------------------ | ---------------------------------------------- | ------------------------------------------------- | ---------------------------------- |
| PyTorch | Good flexibility; may trade off raw performance | Excellent with Python libraries | Extensive resources and community | Research and prototypes | Regular, active development | Dependent on deployment environment | CUDA support for GPU acceleration |
| TorchScript | Better for production than PyTorch | Smooth transition from PyTorch to C++ | Specialized but narrower than PyTorch | Industry where Python is a bottleneck | Consistent updates with PyTorch | Improved security without full Python | Inherits CUDA support from PyTorch |
| ONNX | Variable depending on runtime | High across different frameworks | Broad ecosystem, supported by many orgs | Flexibility across ML frameworks | Regular updates for new operations | Ensure secure conversion and deployment practices | Various hardware optimizations |
| OpenVINO | Optimized for Intel hardware | Best within Intel ecosystem | Solid in computer vision domain | IoT and edge with Intel hardware | Regular updates for Intel hardware | Robust features for sensitive applications | Tailored for Intel hardware |
| TensorRT | Top-tier on NVIDIA GPUs | Best for NVIDIA hardware | Strong network through NVIDIA | Real-time video and image inference | Frequent updates for new GPUs | Emphasis on security | Designed for NVIDIA GPUs |
| CoreML | Optimized for on-device Apple hardware | Exclusive to Apple ecosystem | Strong Apple and developer support | On-device ML on Apple products | Regular Apple updates | Focus on privacy and security | Apple neural engine and GPU |
| TF SavedModel | Scalable in server environments | Wide compatibility in TensorFlow ecosystem | Large support due to TensorFlow popularity | Serving models at scale | Regular updates by Google and community | Robust features for enterprise | Various hardware accelerations |
| TF GraphDef | Stable for static computation graphs | Integrates well with TensorFlow infrastructure | Resources for optimizing static graphs | Scenarios requiring static graphs | Updates alongside TensorFlow core | Established TensorFlow security practices | TensorFlow acceleration options |
| TF Lite | Speed and efficiency on mobile/embedded | Wide range of device support | Robust community, Google backed | Mobile applications with minimal footprint | Latest features for mobile | Secure environment on end-user devices | GPU and DSP among others |
| TF Edge TPU | Optimized for Google's Edge TPU hardware | Exclusive to Edge TPU devices | Growing with Google and third-party resources | IoT devices requiring real-time processing | Improvements for new Edge TPU hardware | Google's robust IoT security | Custom-designed for Google Coral |
| TF.js | Reasonable in-browser performance | High with web technologies | Web and Node.js developers support | Interactive web applications | TensorFlow team and community contributions | Web platform security model | Enhanced with WebGL and other APIs |
| PaddlePaddle | Competitive, easy to use and scalable | Baidu ecosystem, wide application support | Rapidly growing, especially in China | Chinese market and language processing | Focus on Chinese AI applications | Emphasizes data privacy and security | Including Baidu's Kunlun chips |
| MNN | High performance for mobile devices | Mobile and embedded ARM systems and x86-64 CPUs | Mobile/embedded ML community | Mobile systems efficiency | High-performance maintenance on mobile devices | On-device security advantages | ARM CPUs and GPUs optimizations |
| NCNN | Optimized for mobile ARM-based devices | Mobile and embedded ARM systems | Niche but active mobile/embedded ML community | Android and ARM systems efficiency | High performance maintenance on ARM | On-device security advantages | ARM CPUs and GPUs optimizations |
This comparative analysis gives you a high-level overview. For deployment, it's essential to consider the specific requirements and constraints of your project, and consult the detailed documentation and resources available for each option.
## Community and Support
When you're getting started with YOLO11, having a helpful community and support can make a significant impact. Here's how to connect with others who share your interests and get the assistance you need.
### Engage with the Broader Community
- **GitHub Discussions:** The YOLO11 repository on GitHub has a "Discussions" section where you can ask questions, report issues, and suggest improvements.
- **Ultralytics Discord Server:** Ultralytics has a [Discord server](https://discord.com/invite/ultralytics) where you can interact with other users and developers.
### Official Documentation and Resources
- **Ultralytics YOLO11 Docs:** The [official documentation](../index.md) provides a comprehensive overview of YOLO11, along with guides on installation, usage, and troubleshooting.
These resources will help you tackle challenges and stay updated on the latest trends and best practices in the YOLO11 community.
## Conclusion
In this guide, we've explored the different deployment options for YOLO11. We've also discussed the important factors to consider when making your choice. These options allow you to customize your model for various environments and performance requirements, making it suitable for real-world applications.
Don't forget that the YOLO11 and Ultralytics community is a valuable source of help. Connect with other developers and experts to learn unique tips and solutions you might not find in regular documentation. Keep seeking knowledge, exploring new ideas, and sharing your experiences.
Happy deploying!
## FAQ
### What are the deployment options available for YOLO11 on different hardware platforms?
Ultralytics YOLO11 supports various deployment formats, each designed for specific environments and hardware platforms. Key formats include:
- **PyTorch** for research and prototyping, with excellent Python integration.
- **TorchScript** for production environments where Python is unavailable.
- **ONNX** for cross-platform compatibility and hardware acceleration.
- **OpenVINO** for optimized performance on Intel hardware.
- **TensorRT** for high-speed inference on NVIDIA GPUs.
Each format has unique advantages. For a detailed walkthrough, see our [export process documentation](../modes/export.md#usage-examples).
### How do I improve the inference speed of my YOLO11 model on an Intel CPU?
To enhance inference speed on Intel CPUs, you can deploy your YOLO11 model using Intel's OpenVINO toolkit. OpenVINO offers significant performance boosts by optimizing models to leverage Intel hardware efficiently.
1. Convert your YOLO11 model to the OpenVINO format using the `model.export()` function.
2. Follow the detailed setup guide in the [Intel OpenVINO Export documentation](../integrations/openvino.md).
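A minimal sketch of step 1, assuming the stock `yolo11n.pt` weights (the directory name follows the default export output):
```python
from ultralytics import YOLO

# Convert a trained model to the OpenVINO format
model = YOLO("yolo11n.pt")
model.export(format="openvino")  # writes an OpenVINO model directory

# The exported model can then be loaded back for inference
ov_model = YOLO("yolo11n_openvino_model/")
results = ov_model.predict(source="path/to/your/image.jpg")
```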
For more insights, check out our [blog post](https://www.ultralytics.com/blog/achieve-faster-inference-speeds-ultralytics-yolov8-openvino).
### Can I deploy YOLO11 models on mobile devices?
Yes, YOLO11 models can be deployed on mobile devices using [TensorFlow](https://www.ultralytics.com/glossary/tensorflow) Lite (TF Lite) for both Android and iOS platforms. TF Lite is designed for mobile and embedded devices, providing efficient on-device inference.
!!! example
=== "Python"
```python
# Export command for TFLite format
model.export(format="tflite")
```
=== "CLI"
```bash
# CLI command for TFLite export
yolo export --format tflite
```
For more details on deploying models to mobile, refer to our [TF Lite integration guide](../integrations/tflite.md).
### What factors should I consider when choosing a deployment format for my YOLO11 model?
When choosing a deployment format for YOLO11, consider the following factors:
- **Performance**: Some formats like TensorRT provide exceptional speeds on NVIDIA GPUs, while OpenVINO is optimized for Intel hardware.
- **Compatibility**: ONNX offers broad compatibility across different platforms.
- **Ease of Integration**: Formats like CoreML or TF Lite are tailored for specific ecosystems like iOS and Android, respectively.
- **Community Support**: Formats like [PyTorch](https://www.ultralytics.com/glossary/pytorch) and TensorFlow have extensive community resources and support.
For a comparative analysis, refer to our [export formats documentation](../modes/export.md#export-formats).
### How can I deploy YOLO11 models in a web application?
To deploy YOLO11 models in a web application, you can use TensorFlow.js (TF.js), which allows for running [machine learning](https://www.ultralytics.com/glossary/machine-learning-ml) models directly in the browser. This approach eliminates the need for backend infrastructure and provides real-time performance.
1. Export the YOLO11 model to the TF.js format.
2. Integrate the exported model into your web application.
For step-by-step instructions, refer to our guide on [TensorFlow.js integration](../integrations/tfjs.md).
---
comments: true
description: Learn essential tips, insights, and best practices for deploying computer vision models with a focus on efficiency, optimization, troubleshooting, and maintaining security.
keywords: Model Deployment, Machine Learning Model Deployment, ML Model Deployment, AI Model Deployment, How to Deploy a Machine Learning Model, How to Deploy ML Models
---
# Best Practices for [Model Deployment](https://www.ultralytics.com/glossary/model-deployment)
## Introduction
Model deployment is the [step in a computer vision project](./steps-of-a-cv-project.md) that brings a model from the development phase into a real-world application. There are various [model deployment options](./model-deployment-options.md): cloud deployment offers scalability and ease of access, edge deployment reduces latency by bringing the model closer to the data source, and local deployment ensures privacy and control. Choosing the right strategy depends on your application's needs, balancing speed, security, and scalability.
<p align="center">
<br>
<iframe loading="lazy" width="720" height="405" src="https://www.youtube.com/embed/Tt_35YnQ9uk"
title="YouTube video player" frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
allowfullscreen>
</iframe>
<br>
<strong>Watch:</strong> How to Optimize and Deploy AI Models: Best Practices, Troubleshooting, and Security Considerations
</p>
It's also important to follow best practices when deploying a model because deployment can significantly impact the effectiveness and reliability of the model's performance. In this guide, we'll focus on how to make sure that your model deployment is smooth, efficient, and secure.
## Model Deployment Options
Oftentimes, once a model is [trained](./model-training-tips.md), [evaluated](./model-evaluation-insights.md), and [tested](./model-testing.md), it needs to be converted into specific formats to be deployed effectively in various environments, such as cloud, edge, or local devices.
With respect to YOLO11, you can [export your model](../modes/export.md) to different formats. For example, when you need to transfer your model between different frameworks, ONNX is an excellent tool, and [exporting YOLO11 to ONNX](../integrations/onnx.md) is easy. You can check out more options for integrating your model into different environments smoothly and effectively [here](../integrations/index.md).
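As a quick illustration, here is a minimal sketch of the ONNX conversion, assuming the stock `yolo11n.pt` weights:
```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")
model.export(format="onnx")  # produces a .onnx file for framework-agnostic deployment
```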
### Choosing a Deployment Environment
Choosing where to deploy your [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) model depends on multiple factors. Different environments have unique benefits and challenges, so it's essential to pick the one that best fits your needs.
#### Cloud Deployment
Cloud deployment is great for applications that need to scale up quickly and handle large amounts of data. Platforms like AWS, [Google Cloud](../yolov5/environments/google_cloud_quickstart_tutorial.md), and Azure make it easy to manage your models from training to deployment. They offer services like [AWS SageMaker](../integrations/amazon-sagemaker.md), Google AI Platform, and [Azure Machine Learning](./azureml-quickstart.md) to help you throughout the process.
However, using the cloud can be expensive, especially with high data usage, and you might face latency issues if your users are far from the data centers. To manage costs and performance, it's important to optimize resource use and ensure compliance with [data privacy](https://www.ultralytics.com/glossary/data-privacy) rules.
#### Edge Deployment
Edge deployment works well for applications needing real-time responses and low latency, particularly in places with limited or no internet access. Deploying models on edge devices like smartphones or IoT gadgets ensures fast processing and keeps data local, which enhances privacy. Deploying on edge also saves bandwidth because less data is sent to the cloud.
However, edge devices often have limited processing power, so you'll need to optimize your models. Tools like [TensorFlow Lite](../integrations/tflite.md) and [NVIDIA Jetson](./nvidia-jetson.md) can help. Despite the benefits, maintaining and updating many devices can be challenging.
#### Local Deployment
Local deployment is best when data privacy is critical or when there's unreliable or no internet access. Running models on local servers or desktops gives you full control and keeps your data secure. It can also reduce latency if the server is near the user.
However, scaling locally can be tough, and maintenance can be time-consuming. Using tools like [Docker](./docker-quickstart.md) for containerization and Kubernetes for management can help make local deployments more efficient. Regular updates and maintenance are necessary to keep everything running smoothly.
## Model Optimization Techniques
Optimizing your computer vision model helps it run efficiently, especially when deploying in environments with limited resources like edge devices. Here are some key techniques for optimizing your model.
### Model Pruning
Pruning reduces the size of the model by removing weights that contribute little to the final output. It makes the model smaller and faster without significantly affecting accuracy. Pruning involves identifying and eliminating unnecessary parameters, resulting in a lighter model that requires less computational power. It is particularly useful for deploying models on devices with limited resources.
<p align="center">
<img width="100%" src="https://github.com/ultralytics/docs/releases/download/0/model-pruning-overview.avif" alt="Model Pruning Overview">
</p>
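As an illustration, here is a minimal sketch of magnitude-based pruning using PyTorch's built-in `torch.nn.utils.prune` utilities. It assumes the underlying `torch.nn.Module` is reachable through the `.model` attribute of a loaded YOLO object; treat it as a starting point rather than a production recipe:

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

from ultralytics import YOLO

# Access the underlying PyTorch module of a loaded YOLO11 model (assumption: `.model` attribute)
model = YOLO("yolo11n.pt").model

# Zero out the 20% smallest-magnitude weights in every convolutional layer
for module in model.modules():
    if isinstance(module, nn.Conv2d):
        prune.l1_unstructured(module, name="weight", amount=0.2)
        prune.remove(module, "weight")  # bake the pruning mask into the weights
```

Note that unstructured pruning like this produces sparse weights but does not by itself shrink the model file or speed up dense hardware; structured pruning followed by fine-tuning is usually needed for real latency gains.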
### Model Quantization
Quantization converts the model's weights and activations from high [precision](https://www.ultralytics.com/glossary/precision) (like 32-bit floats) to lower precision (like 8-bit integers). By reducing the model size, it speeds up inference. Quantization-aware training (QAT) is a method where the model is trained with quantization in mind, preserving accuracy better than post-training quantization. By handling quantization during the training phase, the model learns to adjust to lower precision, maintaining performance while reducing computational demands.
<p align="center">
<img width="100%" src="https://miro.medium.com/v2/resize:fit:1032/format:webp/1*Jlq_cyLvRdmp_K5jCd3LkA.png" alt="Model Quantization Overview">
</p>
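With YOLO11, one practical route to quantization is through the export workflow. The snippet below is a hedged sketch: the `int8` and `half` export arguments exist in Ultralytics exports, but exact support depends on the target format and your installed dependencies:

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")

# Post-training INT8 quantization via TFLite export; a small dataset is used for calibration
model.export(format="tflite", int8=True, data="coco8.yaml")

# FP16 (half-precision) TensorRT engine as a lighter-touch alternative
model.export(format="engine", half=True)
```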
### Knowledge Distillation
Knowledge distillation involves training a smaller, simpler model (the student) to mimic the outputs of a larger, more complex model (the teacher). The student model learns to approximate the teacher's predictions, resulting in a compact model that retains much of the teacher's [accuracy](https://www.ultralytics.com/glossary/accuracy). This technique is beneficial for creating efficient models suitable for deployment on edge devices with constrained resources.
<p align="center">
<img width="100%" src="https://github.com/ultralytics/docs/releases/download/0/knowledge-distillation-overview.avif" alt="Knowledge Distillation Overview">
</p>
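Ultralytics does not ship a distillation API, so the sketch below is a generic PyTorch illustration of the core idea: a combined loss that matches temperature-softened teacher logits while still respecting the ground-truth labels. The names and hyperparameters (`T`, `alpha`) are illustrative assumptions:

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Blend soft-target matching (teacher) with hard-label cross-entropy (ground truth)."""
    # KL divergence between temperature-softened distributions, scaled by T^2
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```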
## Troubleshooting Deployment Issues
You may face challenges while deploying your computer vision models, but understanding common problems and solutions can make the process smoother. Here are some general troubleshooting tips and best practices to help you navigate deployment issues.
### Your Model is Less Accurate After Deployment
Experiencing a drop in your model's accuracy after deployment can be frustrating. This issue can stem from various factors. Here are some steps to help you identify and resolve the problem:
- **Check Data Consistency:** Make sure the data your model is processing post-deployment is consistent with the data it was trained on. Differences in data distribution, quality, or format can significantly impact performance.
- **Validate Preprocessing Steps:** Verify that all preprocessing steps applied during training are also applied consistently during deployment. This includes resizing images, normalizing pixel values, and other data transformations.
- **Evaluate the Model's Environment:** Ensure that the hardware and software configurations used during deployment match those used during training. Differences in libraries, versions, and hardware capabilities can introduce discrepancies.
- **Monitor Model Inference:** Log inputs and outputs at various stages of the inference pipeline to detect any anomalies. It can help identify issues like data corruption or improper handling of model outputs.
- **Review Model Export and Conversion:** Re-export the model and make sure that the conversion process maintains the integrity of the model weights and architecture.
- **Test with a Controlled Dataset:** Deploy the model in a test environment with a dataset you control and compare the results with the training phase. You can identify if the issue is with the deployment environment or the data.
When deploying YOLO11, several factors can affect model accuracy. Converting models to formats like [TensorRT](../integrations/tensorrt.md) involves optimizations such as weight quantization and layer fusion, which can cause minor precision losses. Using FP16 (half-precision) instead of FP32 (full-precision) can speed up inference but may introduce numerical precision errors. Hardware constraints, such as the lower CUDA core counts and reduced memory bandwidth of devices like the [Jetson Nano](./nvidia-jetson.md), can also impact performance.
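A practical way to isolate conversion-related accuracy loss is to validate the exported model on the same dataset as the original weights and compare the metrics. The sketch below assumes you have already exported a TensorRT engine alongside your `.pt` file:

```python
from ultralytics import YOLO

# Validate the native PyTorch weights and the exported engine on the same data
pt_metrics = YOLO("yolo11n.pt").val(data="coco8.yaml")
trt_metrics = YOLO("yolo11n.engine").val(data="coco8.yaml")

# A large gap here points at the conversion step rather than the data or environment
print(f"PyTorch  mAP50-95: {pt_metrics.box.map:.4f}")
print(f"TensorRT mAP50-95: {trt_metrics.box.map:.4f}")
```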
### Inferences Are Taking Longer Than You Expected
When deploying [machine learning](https://www.ultralytics.com/glossary/machine-learning-ml) models, it's important that they run efficiently. If inferences are taking longer than expected, it can affect the user experience and the effectiveness of your application. Here are some steps to help you identify and resolve the problem:
- **Implement Warm-Up Runs**: Initial runs often include setup overhead, which can skew latency measurements. Perform a few warm-up inferences before measuring latency. Excluding these initial runs provides a more accurate measurement of the model's performance.
- **Optimize the Inference Engine:** Double-check that the inference engine is fully optimized for your specific GPU architecture. Use the latest drivers and software versions tailored to your hardware to ensure maximum performance and compatibility.
- **Use Asynchronous Processing:** Asynchronous processing can help manage workloads more efficiently. Use asynchronous processing techniques to handle multiple inferences concurrently, which can help distribute the load and reduce wait times.
- **Profile the Inference Pipeline:** Identifying bottlenecks in the inference pipeline can help pinpoint the source of delays. Use profiling tools to analyze each step of the inference process, identifying and addressing any stages that cause significant delays, such as inefficient layers or data transfer issues.
- **Use Appropriate Precision:** Using higher precision than necessary can slow down inference times. Experiment with using lower precision, such as FP16 (half-precision), instead of FP32 (full-precision). While FP16 can reduce inference time, also keep in mind that it can impact model accuracy.
If you are facing this issue while deploying YOLO11, consider that YOLO11 offers [various model sizes](../models/yolo11.md), such as YOLO11n (nano) for devices with lower memory capacity and YOLO11x (extra-large) for more powerful GPUs. Choosing the right model variant for your hardware can help balance memory usage and processing time.
Also keep in mind that the size of the input images directly impacts memory usage and processing time. Lower resolutions reduce memory usage and speed up inference, while higher resolutions improve accuracy but require more memory and processing power.
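To put the warm-up advice into practice, measure latency only after a few throwaway inferences. This is a minimal sketch; the sample image URL and run counts are arbitrary:

```python
import time

from ultralytics import YOLO

model = YOLO("yolo11n.pt")
source = "https://ultralytics.com/images/bus.jpg"  # any representative test image

# Warm-up runs absorb one-off setup costs (model allocation, CUDA context, caching)
for _ in range(3):
    model.predict(source, imgsz=640, verbose=False)

# Timed runs after warm-up give a more representative latency figure
n = 20
start = time.perf_counter()
for _ in range(n):
    model.predict(source, imgsz=640, verbose=False)
print(f"Mean latency over {n} runs: {(time.perf_counter() - start) / n * 1000:.1f} ms")
```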
## Security Considerations in Model Deployment
Another important aspect of deployment is security. The security of your deployed models is critical to protect sensitive data and intellectual property. Here are some best practices you can follow related to secure model deployment.
### Secure Data Transmission
Securing the data sent between clients and servers is essential to prevent it from being intercepted or accessed by unauthorized parties. You can use encryption protocols like TLS (Transport Layer Security) to encrypt data while it's being transmitted, so even if someone intercepts the data, they won't be able to read it. You can also use end-to-end encryption, which protects the data all the way from the source to the destination so that no one in between can access it.
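As a concrete illustration, the following sketch serves predictions over HTTPS using FastAPI behind uvicorn's TLS options. The endpoint, file names, and certificate paths are hypothetical placeholders; in production you would obtain certificates from a real certificate authority:

```python
import tempfile

import uvicorn
from fastapi import FastAPI, File, UploadFile
from ultralytics import YOLO

app = FastAPI()
model = YOLO("yolo11n.pt")

@app.post("/predict")
async def predict(image: UploadFile = File(...)):
    # Write the upload to a temporary file so the model can read it
    with tempfile.NamedTemporaryFile(suffix=".jpg") as f:
        f.write(await image.read())
        f.flush()
        results = model.predict(f.name, verbose=False)
    return {"boxes": results[0].boxes.xyxy.tolist()}

if __name__ == "__main__":
    # TLS termination at the server: clients must connect over https://
    uvicorn.run(app, host="0.0.0.0", port=8443,
                ssl_certfile="server.crt", ssl_keyfile="server.key")
```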
### Access Controls
It's essential to control who can access your model and its data to prevent unauthorized use. Use strong authentication methods to verify the identity of users or systems trying to access the model, and consider adding extra security with multi-factor authentication (MFA). Set up role-based access control (RBAC) to assign permissions based on user roles so that people only have access to what they need. Keep detailed audit logs to track all access and changes to the model and its data, and regularly review these logs to spot any suspicious activity.
### Model Obfuscation
Model obfuscation can help protect your model from being reverse-engineered or misused. It involves encrypting model parameters, such as weights and biases in [neural networks](https://www.ultralytics.com/glossary/neural-network-nn), to make it difficult for unauthorized individuals to understand or alter the model. You can also obfuscate the model's architecture by renaming layers and parameters or adding dummy layers, making it harder for attackers to reverse-engineer it. Finally, serving the model in a secure environment, such as a secure enclave or a trusted execution environment (TEE), can provide an extra layer of protection during inference.
## Share Ideas With Your Peers
Being part of a community of computer vision enthusiasts can help you solve problems and learn faster. Here are some ways to connect, get help, and share ideas.
### Community Resources
- **GitHub Issues:** Explore the [YOLO11 GitHub repository](https://github.com/ultralytics/ultralytics/issues) and use the Issues tab to ask questions, report bugs, and suggest new features. The community and maintainers are very active and ready to help.
- **Ultralytics Discord Server:** Join the [Ultralytics Discord server](https://discord.com/invite/ultralytics) to chat with other users and developers, get support, and share your experiences.
### Official Documentation
- **Ultralytics YOLO11 Documentation:** Visit the [official YOLO11 documentation](./index.md) for detailed guides and helpful tips on various computer vision projects.
Using these resources will help you solve challenges and stay up-to-date with the latest trends and practices in the computer vision community.
## Conclusion and Next Steps
We walked through some best practices to follow when deploying computer vision models. By securing data, controlling access, and obfuscating model details, you can protect sensitive information while keeping your models running smoothly. We also discussed how to address common issues like reduced accuracy and slow inferences using strategies such as warm-up runs, optimizing engines, asynchronous processing, profiling pipelines, and choosing the right precision.
After deploying your model, the next step would be monitoring, maintaining, and documenting your application. Regular monitoring helps catch and fix issues quickly, maintenance keeps your models up-to-date and functional, and good documentation tracks all changes and updates. These steps will help you achieve the [goals of your computer vision project](./defining-project-goals.md).
## FAQ
### What are the best practices for deploying a machine learning model using Ultralytics YOLO11?
Deploying a machine learning model, particularly with Ultralytics YOLO11, involves several best practices to ensure efficiency and reliability. First, choose the deployment environment that suits your needs—cloud, edge, or local. Optimize your model through techniques like [pruning, quantization, and knowledge distillation](#model-optimization-techniques) for efficient deployment in resource-constrained environments. Lastly, ensure data consistency and preprocessing steps align with the training phase to maintain performance. You can also refer to [model deployment options](./model-deployment-options.md) for more detailed guidelines.
### How can I troubleshoot common deployment issues with Ultralytics YOLO11 models?
Troubleshooting deployment issues can be broken down into a few key steps. If your model's accuracy drops after deployment, check for data consistency, validate preprocessing steps, and ensure the hardware/software environment matches what you used during training. For slow inference times, perform warm-up runs, optimize your inference engine, use asynchronous processing, and profile your inference pipeline. Refer to [troubleshooting deployment issues](#troubleshooting-deployment-issues) for a detailed guide on these best practices.
### How does Ultralytics YOLO11 optimization enhance model performance on edge devices?
Optimizing Ultralytics YOLO11 models for edge devices involves using techniques like pruning to reduce the model size, quantization to convert weights to lower precision, and knowledge distillation to train smaller models that mimic larger ones. These techniques ensure the model runs efficiently on devices with limited computational power. Tools like [TensorFlow Lite](../integrations/tflite.md) and [NVIDIA Jetson](./nvidia-jetson.md) are particularly useful for these optimizations. Learn more about these techniques in our section on [model optimization](#model-optimization-techniques).
### What are the security considerations for deploying machine learning models with Ultralytics YOLO11?
Security is paramount when deploying machine learning models. Ensure secure data transmission using encryption protocols like TLS. Implement robust access controls, including strong authentication and role-based access control (RBAC). Model obfuscation techniques, such as encrypting model parameters and serving models in a secure environment like a trusted execution environment (TEE), offer additional protection. For detailed practices, refer to [security considerations](#security-considerations-in-model-deployment).
### How do I choose the right deployment environment for my Ultralytics YOLO11 model?
Selecting the optimal deployment environment for your Ultralytics YOLO11 model depends on your application's specific needs. Cloud deployment offers scalability and ease of access, making it ideal for applications with high data volumes. Edge deployment is best for low-latency applications requiring real-time responses, using tools like [TensorFlow Lite](../integrations/tflite.md). Local deployment suits scenarios needing stringent data privacy and control. For a comprehensive overview of each environment, check out our section on [choosing a deployment environment](#choosing-a-deployment-environment).
---
comments: true
description: Explore the most effective ways to assess and refine YOLO11 models for better performance. Learn about evaluation metrics, fine-tuning processes, and how to customize your model for specific needs.
keywords: Model Evaluation, Machine Learning Model Evaluation, Fine Tuning Machine Learning, Fine Tune Model, Evaluating Models, Model Fine Tuning, How to Fine Tune a Model
---
# Insights on Model Evaluation and Fine-Tuning
## Introduction
Once you've [trained](./model-training-tips.md) your computer vision model, evaluating and refining it to perform optimally is essential. Just training your model isn't enough. You need to make sure that your model is accurate, efficient, and fulfills the [objective](./defining-project-goals.md) of your computer vision project. By evaluating and fine-tuning your model, you can identify weaknesses, improve its accuracy, and boost overall performance.
<p align="center">
<br>
<iframe loading="lazy" width="720" height="405" src="https://www.youtube.com/embed/-aYO-6VaDrw"
title="YouTube video player" frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
allowfullscreen>
</iframe>
<br>
<strong>Watch:</strong> Insights into Model Evaluation and Fine-Tuning | Tips for Improving Mean Average Precision
</p>
In this guide, we'll share insights on model evaluation and fine-tuning that'll make this [step of a computer vision project](./steps-of-a-cv-project.md) more approachable. We'll discuss how to understand evaluation metrics and implement fine-tuning techniques, giving you the knowledge to elevate your model's capabilities.
## Evaluating Model Performance Using Metrics
Evaluating how well a model performs helps us understand how effectively it works. Various metrics are used to measure performance. These [performance metrics](./yolo-performance-metrics.md) provide clear, numerical insights that can guide improvements toward making sure the model meets its intended goals. Let's take a closer look at a few key metrics.
### Confidence Score
The confidence score represents the model's certainty that a detected object belongs to a particular class. It ranges from 0 to 1, with higher scores indicating greater confidence. The confidence score helps filter predictions; only detections with confidence scores above a specified threshold are considered valid.
_Quick Tip:_ When running inferences, if you aren't seeing any predictions and you've checked everything else, try lowering the confidence score. Sometimes, the threshold is too high, causing the model to ignore valid predictions. Lowering the score allows the model to consider more possibilities. This might not meet your project goals, but it's a good way to see what the model can do and decide how to fine-tune it.
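In YOLO11, the confidence threshold is exposed as the `conf` argument in prediction. A quick, low-stakes experiment might look like this (the sample image is arbitrary):

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")
source = "https://ultralytics.com/images/bus.jpg"

# The default threshold is 0.25; lowering it surfaces borderline detections
for conf in (0.25, 0.05):
    results = model.predict(source, conf=conf, verbose=False)
    print(f"conf={conf}: {len(results[0].boxes)} detections")
```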
### Intersection over Union
[Intersection over Union](https://www.ultralytics.com/glossary/intersection-over-union-iou) (IoU) is a metric in [object detection](https://www.ultralytics.com/glossary/object-detection) that measures how well the predicted [bounding box](https://www.ultralytics.com/glossary/bounding-box) overlaps with the ground truth bounding box. IoU values range from 0 to 1, where 1 indicates a perfect match. IoU is essential because it measures how closely the predicted boundaries match the actual object boundaries.
<p align="center">
<img width="100%" src="https://github.com/ultralytics/docs/releases/download/0/intersection-over-union-overview.avif" alt="Intersection over Union Overview">
</p>
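Since IoU is simply the ratio of overlap to combined area, it's easy to compute directly. Here's a small, self-contained sketch for axis-aligned boxes in `(x1, y1, x2, y2)` format:

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)  # zero if the boxes don't overlap
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# A prediction slightly offset from the ground truth still scores high
print(iou((10, 10, 50, 50), (12, 12, 50, 50)))  # ~0.90
```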
### Mean Average [Precision](https://www.ultralytics.com/glossary/precision)
[Mean Average Precision](https://www.ultralytics.com/glossary/mean-average-precision-map) (mAP) is a way to measure how well an object detection model performs. It looks at the precision of detecting each object class, averages these scores, and gives an overall number that shows how accurately the model can identify and classify objects.
Let's focus on two specific mAP metrics:
- *mAP@.5:* Measures the average precision at a single IoU (Intersection over Union) threshold of 0.5. This metric checks if the model can correctly find objects with a looser [accuracy](https://www.ultralytics.com/glossary/accuracy) requirement. It focuses on whether the object is roughly in the right place, not needing perfect placement. It helps see if the model is generally good at spotting objects.
- *mAP@.5:.95:* Averages the mAP values calculated at multiple IoU thresholds, from 0.5 to 0.95 in 0.05 increments. This metric is more detailed and strict. It gives a fuller picture of how accurately the model can find objects at different levels of strictness and is especially useful for applications that need precise object detection.
Other mAP metrics include mAP@0.75, which uses a stricter IoU threshold of 0.75, and mAP for small, medium, and large objects, which evaluates precision across objects of different sizes.
<p align="center">
<img width="100%" src="https://github.com/ultralytics/docs/releases/download/0/mean-average-precision-overview.avif" alt="Mean Average Precision Overview">
</p>
## Evaluating YOLO11 Model Performance
With respect to YOLO11, you can use the [validation mode](../modes/val.md) to evaluate the model. Also, be sure to take a look at our guide that goes in-depth into [YOLO11 performance metrics](./yolo-performance-metrics.md) and how they can be interpreted.
### Common Community Questions
When evaluating your YOLO11 model, you might run into a few hiccups. Based on common community questions, here are some tips to help you get the most out of your YOLO11 model:
#### Handling Variable Image Sizes
Evaluating your YOLO11 model with images of different sizes can help you understand its performance on diverse datasets. With the `rect=true` validation parameter, YOLO11 adjusts the network's stride for each batch based on the image sizes, allowing the model to handle rectangular images without forcing them to a single size.
The `imgsz` validation parameter sets the maximum dimension for image resizing, which is 640 by default. You can adjust this based on your dataset's maximum dimensions and the GPU memory available. Even with `imgsz` set, `rect=true` lets the model manage varying image sizes effectively by dynamically adjusting the stride.
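In code, enabling rectangular validation is a single argument. A minimal sketch, assuming a standard dataset YAML:

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")

# Rectangular batches preserve aspect ratios instead of squashing images to a square
metrics = model.val(data="coco8.yaml", imgsz=640, rect=True)
print(f"mAP50-95 with rect=True: {metrics.box.map:.4f}")
```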
#### Accessing YOLO11 Metrics
If you want to get a deeper understanding of your YOLO11 model's performance, you can easily access specific evaluation metrics with a few lines of Python code. The code snippet below will let you load your model, run an evaluation, and print out various metrics that show how well your model is doing.
!!! example "Usage"
=== "Python"
```python
from ultralytics import YOLO
# Load the model
model = YOLO("yolo11n.pt")
# Run the evaluation
results = model.val(data="coco8.yaml")
# Print specific metrics
print("Class indices with average precision:", results.ap_class_index)
print("Average precision for all classes:", results.box.all_ap)
print("Average precision:", results.box.ap)
print("Average precision at IoU=0.50:", results.box.ap50)
print("Class indices for average precision:", results.box.ap_class_index)
print("Class-specific results:", results.box.class_result)
print("F1 score:", results.box.f1)
print("F1 score curve:", results.box.f1_curve)
print("Overall fitness score:", results.box.fitness)
print("Mean average precision:", results.box.map)
print("Mean average precision at IoU=0.50:", results.box.map50)
print("Mean average precision at IoU=0.75:", results.box.map75)
print("Mean average precision for different IoU thresholds:", results.box.maps)
print("Mean results for different metrics:", results.box.mean_results)
print("Mean precision:", results.box.mp)
print("Mean recall:", results.box.mr)
print("Precision:", results.box.p)
print("Precision curve:", results.box.p_curve)
print("Precision values:", results.box.prec_values)
print("Specific precision metrics:", results.box.px)
print("Recall:", results.box.r)
print("Recall curve:", results.box.r_curve)
```
The results object also includes speed metrics like preprocess time, inference time, loss, and postprocess time. By analyzing these metrics, you can fine-tune and optimize your YOLO11 model for better performance, making it more effective for your specific use case.
## How Does Fine-Tuning Work?
Fine-tuning involves taking a pre-trained model and adjusting its parameters to improve performance on a specific task or dataset. The process, also known as model retraining, allows the model to better understand and predict outcomes for the specific data it will encounter in real-world applications. You can retrain your model based on your model evaluation to achieve optimal results.
## Tips for Fine-Tuning Your Model
Fine-tuning a model means paying close attention to several vital parameters and techniques to achieve optimal performance. Here are some essential tips to guide you through the process.
### Starting With a Higher [Learning Rate](https://www.ultralytics.com/glossary/learning-rate)
Usually, during the initial training [epochs](https://www.ultralytics.com/glossary/epoch), the learning rate starts low and gradually increases to stabilize the training process. However, since your model has already learned some features from the previous dataset, starting with a higher learning rate right away can be more beneficial.
When fine-tuning your YOLO11 model, you can set the `warmup_epochs` training parameter to `warmup_epochs=0` to skip the gradual warm-up and start training directly at your chosen learning rate. By following this process, the training will continue from the provided weights, adjusting to the nuances of your new data.
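As a sketch, a fine-tuning run that skips warm-up might look like the following; the checkpoint path and dataset YAML are hypothetical placeholders:

```python
from ultralytics import YOLO

# Continue from your previously trained weights rather than from scratch
model = YOLO("path/to/best.pt")  # hypothetical checkpoint path

# warmup_epochs=0 skips the warm-up ramp so training starts at the configured learning rate
model.train(data="path/to/your_dataset.yaml", epochs=50, warmup_epochs=0)
```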
### Image Tiling for Small Objects
Image tiling can improve detection accuracy for small objects. By dividing larger images into smaller segments, such as splitting 1280x1280 images into multiple 640x640 segments, you maintain the original resolution, and the model can learn from high-resolution fragments. When using YOLO11, make sure to adjust your labels for these new segments correctly.
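A simple tiling pass can be written with PIL; remember that bounding-box labels must be re-mapped into each tile's coordinate frame, which this sketch deliberately leaves out. The file name is hypothetical:

```python
from PIL import Image

def tile_image(path, tile=640):
    """Split an image into tile x tile crops (edge tiles may be smaller)."""
    img = Image.open(path)
    w, h = img.size
    tiles = []
    for top in range(0, h, tile):
        for left in range(0, w, tile):
            box = (left, top, min(left + tile, w), min(top + tile, h))
            tiles.append(img.crop(box))
    return tiles

tiles = tile_image("large_scene_1280.jpg")  # a 1280x1280 image yields four 640x640 tiles
print(len(tiles))  # 4
```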
## Engage with the Community
Sharing your ideas and questions with other [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) enthusiasts can inspire creative solutions to roadblocks in your projects. Here are some excellent ways to learn, troubleshoot, and connect.
### Finding Help and Support
- **GitHub Issues:** Explore the YOLO11 GitHub repository and use the [Issues tab](https://github.com/ultralytics/ultralytics/issues) to ask questions, report bugs, and suggest features. The community and maintainers are available to assist with any issues you encounter.
- **Ultralytics Discord Server:** Join the [Ultralytics Discord server](https://discord.com/invite/ultralytics) to connect with other users and developers, get support, share knowledge, and brainstorm ideas.
### Official Documentation
- **Ultralytics YOLO11 Documentation:** Check out the [official YOLO11 documentation](./index.md) for comprehensive guides and valuable insights on various computer vision tasks and projects.
## Final Thoughts
Evaluating and fine-tuning your computer vision model are important steps for successful [model deployment](https://www.ultralytics.com/glossary/model-deployment). These steps help make sure that your model is accurate, efficient, and suited to your overall application. The key to training the best model possible is continuous experimentation and learning. Don't hesitate to tweak parameters, try new techniques, and explore different datasets. Keep experimenting and pushing the boundaries of what's possible!
## FAQ
### What are the key metrics for evaluating YOLO11 model performance?
To evaluate YOLO11 model performance, important metrics include Confidence Score, Intersection over Union (IoU), and Mean Average Precision (mAP). Confidence Score measures the model's certainty for each detected object class. IoU evaluates how well the predicted bounding box overlaps with the ground truth. Mean Average Precision (mAP) aggregates precision scores across classes, with mAP@.5 and mAP@.5:.95 being two common types for varying IoU thresholds. Learn more about these metrics in our [YOLO11 performance metrics guide](./yolo-performance-metrics.md).
### How can I fine-tune a pre-trained YOLO11 model for my specific dataset?
Fine-tuning a pre-trained YOLO11 model involves adjusting its parameters to improve performance on a specific task or dataset. Start by evaluating your model using metrics, then set `warmup_epochs=0` so training starts immediately at your chosen learning rate. Use parameters like `rect=true` for handling varied image sizes effectively. For more detailed guidance, refer to our section on [fine-tuning YOLO11 models](#how-does-fine-tuning-work).
### How can I handle variable image sizes when evaluating my YOLO11 model?
To handle variable image sizes during evaluation, use the `rect=true` parameter in YOLO11, which adjusts the network's stride for each batch based on image sizes. The `imgsz` parameter sets the maximum dimension for image resizing, defaulting to 640. Adjust `imgsz` to suit your dataset and GPU memory. For more details, visit our [section on handling variable image sizes](#handling-variable-image-sizes).
### What practical steps can I take to improve mean average precision for my YOLO11 model?
Improving mean average precision (mAP) for a YOLO11 model involves several steps:
1. **Tuning Hyperparameters**: Experiment with different learning rates, [batch sizes](https://www.ultralytics.com/glossary/batch-size), and image augmentations.
2. **[Data Augmentation](https://www.ultralytics.com/glossary/data-augmentation)**: Use techniques like Mosaic and MixUp to create diverse training samples.
3. **Image Tiling**: Split larger images into smaller tiles to improve detection accuracy for small objects.
Refer to our detailed guide on [model fine-tuning](#tips-for-fine-tuning-your-model) for specific strategies.
### How do I access YOLO11 model evaluation metrics in Python?
You can access YOLO11 model evaluation metrics using Python with the following steps:
!!! example "Usage"
=== "Python"
```python
from ultralytics import YOLO
# Load the model
model = YOLO("yolo11n.pt")
# Run the evaluation
results = model.val(data="coco8.yaml")
# Print specific metrics
print("Class indices with average precision:", results.ap_class_index)
print("Average precision for all classes:", results.box.all_ap)
print("Mean average precision at IoU=0.50:", results.box.map50)
print("Mean recall:", results.box.mr)
```
Analyzing these metrics helps fine-tune and optimize your YOLO11 model. For a deeper dive, check out our guide on [YOLO11 metrics](../modes/val.md).
---
comments: true
description: Understand the key practices for monitoring, maintaining, and documenting computer vision models to guarantee accuracy, spot anomalies, and mitigate data drift.
keywords: Computer Vision Models, AI Model Monitoring, Data Drift Detection, Anomaly Detection in AI, Model Monitoring
---
# Maintaining Your Computer Vision Models After Deployment
## Introduction
If you are here, we can assume you've completed many [steps in your computer vision project](./steps-of-a-cv-project.md): from [gathering requirements](./defining-project-goals.md), [annotating data](./data-collection-and-annotation.md), and [training the model](./model-training-tips.md) to finally [deploying](./model-deployment-practices.md) it. Your application is now running in production, but your project doesn't end here. The most important part of a computer vision project is making sure your model continues to fulfill your [project's objectives](./defining-project-goals.md) over time, and that's where monitoring, maintaining, and documenting your computer vision model enters the picture.
In this guide, we'll take a closer look at how you can maintain your computer vision models after deployment. We'll explore how model monitoring can help you catch problems early on, how to keep your model accurate and up-to-date, and why documentation is important for troubleshooting.
## Model Monitoring is Key
Keeping a close eye on your deployed computer vision models is essential. Without proper monitoring, models can lose accuracy. A common issue is data distribution shift or data drift, where the data the model encounters changes from what it was trained on. When the model has to make predictions on data it doesn't recognize, it can lead to misinterpretations and poor performance. Outliers, or unusual data points, can also throw off the model's accuracy.
Regular model monitoring helps developers track the [model's performance](./model-evaluation-insights.md), spot anomalies, and quickly address problems like data drift. It also helps manage resources by indicating when updates are needed, avoiding expensive overhauls, and keeping the model relevant.
### Best Practices for Model Monitoring
Here are some best practices to keep in mind while monitoring your computer vision model in production:
- **Track Performance Regularly**: Continuously monitor the model's performance to detect changes over time.
- **Double Check the Data Quality**: Check for missing values or anomalies in the data.
- **Use Diverse Data Sources**: Monitor data from various sources to get a comprehensive view of the model's performance.
- **Combine Monitoring Techniques**: Use a mix of drift detection algorithms and rule-based approaches to identify a wide range of issues.
- **Monitor Inputs and Outputs**: Keep an eye on both the data the model processes and the results it produces to make sure everything is functioning correctly.
- **Set Up Alerts**: Implement alerts for unusual behavior, such as performance drops, so you can take corrective action quickly.
### Tools for AI Model Monitoring
You can use automated monitoring tools to make it easier to monitor models after deployment. Many tools offer real-time insights and alerting capabilities. Here are some examples of open-source model monitoring tools that can work together:
- **[Prometheus](https://prometheus.io/)**: Prometheus is an open-source monitoring tool that collects and stores metrics for detailed performance tracking. It integrates easily with Kubernetes and Docker, collecting data at set intervals and storing it in a time-series database. Prometheus can also scrape HTTP endpoints to gather real-time metrics. Collected data can be queried using the PromQL language.
- **[Grafana](https://grafana.com/)**: Grafana is an open-source [data visualization](https://www.ultralytics.com/glossary/data-visualization) and monitoring tool that allows you to query, visualize, alert on, and understand your metrics no matter where they are stored. It works well with Prometheus and offers advanced data visualization features. You can create custom dashboards to show important metrics for your computer vision models, like inference latency, error rates, and resource usage. Grafana turns collected data into easy-to-read dashboards with line graphs, heat maps, and histograms. It also supports alerts, which can be sent through channels like Slack to quickly notify teams of any issues.
- **[Evidently AI](https://www.evidentlyai.com/)**: Evidently AI is an open-source tool designed for monitoring and debugging [machine learning](https://www.ultralytics.com/glossary/machine-learning-ml) models in production. It generates interactive reports from pandas DataFrames, helping analyze machine learning models. Evidently AI can detect data drift, model performance degradation, and other issues that may arise with your deployed models.
The three tools introduced above, Evidently AI, Prometheus, and Grafana, can work together seamlessly as a fully open-source ML monitoring solution that is ready for production. Evidently AI is used to collect and calculate metrics, Prometheus stores these metrics, and Grafana displays them and sets up alerts. While there are many other tools available, this setup is an exciting open-source option that provides robust capabilities for monitoring and maintaining your models.
<p align="center">
<img width="100%" src="https://github.com/ultralytics/docs/releases/download/0/evidently-prometheus-grafana-monitoring-tools.avif" alt="Overview of Open Source Model Monitoring Tools">
</p>
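To make the Prometheus piece concrete, here is a hedged sketch that instruments an inference function with the `prometheus_client` library; the metric names and port are arbitrary choices:

```python
from prometheus_client import Counter, Histogram, start_http_server
from ultralytics import YOLO

INFERENCES = Counter("inferences_total", "Total number of inference requests")
LATENCY = Histogram("inference_latency_seconds", "Inference latency in seconds")

model = YOLO("yolo11n.pt")
start_http_server(8000)  # Prometheus scrapes http://localhost:8000/metrics

def predict(image):
    INFERENCES.inc()
    with LATENCY.time():  # records the duration of the block into the histogram
        return model.predict(image, verbose=False)
```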
### Anomaly Detection and Alert Systems
An anomaly is any data point or pattern that deviates quite a bit from what is expected. With respect to [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) models, anomalies can be images that are very different from the ones the model was trained on. These unexpected images can be signs of issues like changes in data distribution, outliers, or behaviors that might reduce model performance. Setting up alert systems to detect these anomalies is an important part of model monitoring.
By setting standard performance levels and limits for key metrics, you can catch problems early. When performance goes outside these limits, alerts are triggered, prompting quick fixes. Regularly updating and retraining models with new data keeps them relevant and accurate as the data changes.
#### Things to Keep in Mind When Configuring Thresholds and Alerts
When you are setting up your alert systems, keep these best practices in mind:
- **Standardized Alerts**: Use consistent tools and formats for all alerts, such as email or messaging apps like Slack. Standardization makes it easier for you to quickly understand and respond to alerts.
- **Include Expected Behavior**: Alert messages should clearly state what went wrong, what was expected, and the timeframe evaluated. It helps you gauge the urgency and context of the alert.
- **Configurable Alerts**: Make alerts easily configurable to adapt to changing conditions. Allow yourself to edit thresholds, snooze, disable, or acknowledge alerts.
### Data Drift Detection
Data drift detection is a technique that helps identify when the statistical properties of the input data change over time, which can degrade model performance. It helps you spot that there is an issue before you decide to retrain or adjust your models. Data drift deals with changes in the overall data landscape over time, while anomaly detection focuses on identifying rare or unexpected data points that may require immediate attention.
<p align="center">
<img width="100%" src="https://github.com/ultralytics/docs/releases/download/0/data-drift-detection-overview.avif" alt="Data Drift Detection Overview">
</p>
Here are several methods to detect data drift:
**Continuous Monitoring**: Regularly monitor the model's input data and outputs for signs of drift. Track key metrics and compare them against historical data to identify significant changes.
**Statistical Techniques**: Use methods like the Kolmogorov-Smirnov test or Population Stability Index (PSI) to detect changes in data distributions. These tests compare the distribution of new data with the [training data](https://www.ultralytics.com/glossary/training-data) to identify significant differences.
**Feature Drift**: Monitor individual features for drift. Sometimes, the overall data distribution may remain stable, but individual features may drift. Identifying which features are drifting helps in fine-tuning the retraining process.
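For instance, a per-feature Kolmogorov-Smirnov test can be run with SciPy. The feature here (mean image brightness) and the synthetic numbers are purely illustrative:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
train_brightness = rng.normal(loc=120, scale=20, size=5000)  # reference (training) feature
live_brightness = rng.normal(loc=135, scale=20, size=5000)   # values observed in production

stat, p_value = ks_2samp(train_brightness, live_brightness)
if p_value < 0.05:
    print(f"Possible drift (KS statistic={stat:.3f}, p={p_value:.3g})")
```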
## Model Maintenance
Model maintenance is crucial to keep computer vision models accurate and relevant over time. Model maintenance involves regularly updating and retraining models, addressing data drift, and ensuring the model stays relevant as data and environments change. You might be wondering how model maintenance differs from model monitoring. Monitoring is about watching the model's performance in real time to catch issues early. Maintenance, on the other hand, is about fixing these issues.
### Regular Updates and Re-training
Once a model is deployed, while monitoring, you may notice changes in data patterns or performance, indicating model drift. Regular updates and re-training become essential parts of model maintenance to ensure the model can handle new patterns and scenarios. There are a few techniques you can use based on how your data is changing.
<p align="center">
<img width="100%" src="https://github.com/ultralytics/docs/releases/download/0/computer-vision-model-drift-overview.avif" alt="Computer Vision Model Drift Overview">
</p>
For example, if the data is changing gradually over time, incremental learning is a good approach. Incremental learning involves updating the model with new data without completely retraining it from scratch, saving computational resources and time. However, if the data has changed drastically, a periodic full re-training might be a better option to ensure the model does not overfit on the new data while losing track of older patterns.
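In YOLO11 terms, both strategies reuse the same training API; what changes is the starting weights and the dataset. A hedged sketch, with all paths hypothetical:

```python
from ultralytics import YOLO

# Incremental update: continue from the production checkpoint on freshly collected data
model = YOLO("deployed_model.pt")  # hypothetical production checkpoint
model.train(data="new_batch.yaml", epochs=10)  # short run on the new data only

# Periodic full retrain: start from pretrained weights on the combined dataset
fresh = YOLO("yolo11n.pt")
fresh.train(data="combined_dataset.yaml", epochs=100)
```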
Regardless of the method, validation and testing are a must after updates. It is important to validate the model on a separate [test dataset](./model-testing.md) to check for performance improvements or degradation.
### Deciding When to Retrain Your Model
The frequency of retraining your computer vision model depends on data changes and model performance. Retrain your model whenever you observe a significant performance drop or detect data drift. Regular evaluations can help determine the right retraining schedule by testing the model against new data. Monitoring performance metrics and data patterns lets you decide if your model needs more frequent updates to maintain [accuracy](https://www.ultralytics.com/glossary/accuracy).
<p align="center">
<img width="100%" src="https://github.com/ultralytics/docs/releases/download/0/when-to-retrain-overview.avif" alt="When to Retrain Overview">
</p>
## Documentation
Documenting a computer vision project makes it easier to understand, reproduce, and collaborate on. Good documentation covers model architecture, hyperparameters, datasets, evaluation metrics, and more. It provides transparency, helping team members and stakeholders understand what has been done and why. Documentation also aids in troubleshooting, maintenance, and future enhancements by providing a clear reference of past decisions and methods.
### Key Elements to Document
These are some of the key elements that should be included in project documentation:
- **[Project Overview](./steps-of-a-cv-project.md)**: Provide a high-level summary of the project, including the problem statement, solution approach, expected outcomes, and project scope. Explain the role of computer vision in addressing the problem and outline the stages and deliverables.
- **Model Architecture**: Detail the structure and design of the model, including its components, layers, and connections. Explain the chosen hyperparameters and the rationale behind these choices.
- **[Data Preparation](./data-collection-and-annotation.md)**: Describe the data sources, types, formats, sizes, and preprocessing steps. Discuss data quality, reliability, and any transformations applied before training the model.
- **[Training Process](./model-training-tips.md)**: Document the training procedure, including the datasets used, training parameters, and [loss functions](https://www.ultralytics.com/glossary/loss-function). Explain how the model was trained and any challenges encountered during training.
- **[Evaluation Metrics](./model-evaluation-insights.md)**: Specify the metrics used to evaluate the model's performance, such as accuracy, [precision](https://www.ultralytics.com/glossary/precision), [recall](https://www.ultralytics.com/glossary/recall), and F1-score. Include performance results and an analysis of these metrics.
- **[Deployment Steps](./model-deployment-options.md)**: Outline the steps taken to deploy the model, including the tools and platforms used, deployment configurations, and any specific challenges or considerations.
- **Monitoring and Maintenance Procedure**: Provide a detailed plan for monitoring the model's performance post-deployment. Include methods for detecting and addressing data and model drift, and describe the process for regular updates and retraining.
### Tools for Documentation
There are many options when it comes to documenting AI projects, with open-source tools being particularly popular. Two of these are Jupyter Notebooks and MkDocs. Jupyter Notebooks allow you to create interactive documents with embedded code, visualizations, and text, making them ideal for sharing experiments and analyses. MkDocs is a static site generator that is easy to set up and deploy and is perfect for creating and hosting project documentation online.
## Connect with the Community
Joining a community of computer vision enthusiasts can help you solve problems and learn more quickly. Here are some ways to connect, get support, and share ideas.
### Community Resources
- **GitHub Issues:** Check out the [YOLO11 GitHub repository](https://github.com/ultralytics/ultralytics/issues) and use the Issues tab to ask questions, report bugs, and suggest new features. The community and maintainers are highly active and supportive.
- **Ultralytics Discord Server:** Join the [Ultralytics Discord server](https://discord.com/invite/ultralytics) to chat with other users and developers, get support, and share your experiences.
### Official Documentation
- **Ultralytics YOLO11 Documentation:** Visit the [official YOLO11 documentation](./index.md) for detailed guides and helpful tips on various computer vision projects.
Using these resources will help you solve challenges and stay up-to-date with the latest trends and practices in the computer vision community.
## Key Takeaways
We covered key tips for monitoring, maintaining, and documenting your computer vision models. Regular updates and re-training help the model adapt to new data patterns. Detecting and fixing data drift helps your model stay accurate. Continuous monitoring catches issues early, and good documentation makes collaboration and future updates easier. Following these steps will help your computer vision project stay successful and effective over time.
## FAQ
### How do I monitor the performance of my deployed computer vision model?
Monitoring the performance of your deployed computer vision model is crucial to ensure its accuracy and reliability over time. You can use tools like [Prometheus](https://prometheus.io/), [Grafana](https://grafana.com/), and [Evidently AI](https://www.evidentlyai.com/) to track key metrics, detect anomalies, and identify data drift. Regularly monitor inputs and outputs, set up alerts for unusual behavior, and use diverse data sources to get a comprehensive view of your model's performance. For more details, check out our section on [Model Monitoring](#model-monitoring-is-key).
### What are the best practices for maintaining computer vision models after deployment?
Maintaining computer vision models involves regular updates, retraining, and monitoring to ensure continued accuracy and relevance. Best practices include:
- **Continuous Monitoring**: Track performance metrics and data quality regularly.
- **Data Drift Detection**: Use statistical techniques to identify changes in data distributions.
- **Regular Updates and Retraining**: Implement incremental learning or periodic full retraining based on data changes.
- **Documentation**: Maintain detailed documentation of model architecture, training processes, and evaluation metrics.

For more insights, visit our [Model Maintenance](#model-maintenance) section.
### Why is data drift detection important for AI models?
Data drift detection is essential because it helps identify when the statistical properties of the input data change over time, which can degrade model performance. Techniques like continuous monitoring, statistical tests (e.g., Kolmogorov-Smirnov test), and feature drift analysis can help spot issues early. Addressing data drift ensures that your model remains accurate and relevant in changing environments. Learn more about data drift detection in our [Data Drift Detection](#data-drift-detection) section.
### What tools can I use for [anomaly detection](https://www.ultralytics.com/glossary/anomaly-detection) in computer vision models?
For anomaly detection in computer vision models, tools like [Prometheus](https://prometheus.io/), [Grafana](https://grafana.com/), and [Evidently AI](https://www.evidentlyai.com/) are highly effective. These tools can help you set up alert systems to detect unusual data points or patterns that deviate from expected behavior. Configurable alerts and standardized messages can help you respond quickly to potential issues. Explore more in our [Anomaly Detection and Alert Systems](#anomaly-detection-and-alert-systems) section.
### How can I document my computer vision project effectively?
Effective documentation of a computer vision project should include:
- **Project Overview**: High-level summary, problem statement, and solution approach.
- **Model Architecture**: Details of the model structure, components, and hyperparameters.
- **Data Preparation**: Information on data sources, preprocessing steps, and transformations.
- **Training Process**: Description of the training procedure, datasets used, and challenges encountered.
- **Evaluation Metrics**: Metrics used for performance evaluation and analysis.
- **Deployment Steps**: Steps taken for [model deployment](https://www.ultralytics.com/glossary/model-deployment) and any specific challenges.
- **Monitoring and Maintenance Procedure**: Plan for ongoing monitoring and maintenance.

For more comprehensive guidelines, refer to our [Documentation](#documentation) section.
---
comments: true
description: Explore effective methods for testing computer vision models to make sure they are reliable, perform well, and are ready to be deployed.
keywords: Overfitting and Underfitting in Machine Learning, Model Testing, Data Leakage Machine Learning, Testing a Model, Testing Machine Learning Models, How to Test AI Models
---
# A Guide on Model Testing
## Introduction
After [training](./model-training-tips.md) and [evaluating](./model-evaluation-insights.md) your model, it's time to test it. Model testing involves assessing how well it performs in real-world scenarios. Testing considers factors like accuracy, reliability, fairness, and how easy it is to understand the model's decisions. The goal is to make sure the model performs as intended, delivers the expected results, and fits into the [overall objective of your application](./defining-project-goals.md) or project.
Model testing is quite similar to model evaluation, but they are two distinct [steps in a computer vision project](./steps-of-a-cv-project.md). Model evaluation uses metrics and plots to assess the model's accuracy. Model testing, on the other hand, checks whether the model's learned behavior matches expectations. In this guide, we'll explore strategies for testing your [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) models.
## Model Testing Vs. Model Evaluation
First, let's understand the difference between model evaluation and testing with an example.
Suppose you have trained a computer vision model to recognize cats and dogs, and you want to deploy this model at a pet store to monitor the animals. During the model evaluation phase, you use a labeled dataset to calculate metrics like accuracy, [precision](https://www.ultralytics.com/glossary/precision), [recall](https://www.ultralytics.com/glossary/recall), and F1 score. For instance, the model might have an accuracy of 98% in distinguishing between cats and dogs in a given dataset.
After evaluation, you test the model using images from a pet store to see how well it identifies cats and dogs in more varied and realistic conditions. You check if it can correctly label cats and dogs when they are moving, in different lighting conditions, or partially obscured by objects like toys or furniture. Model testing checks that the model behaves as expected outside the controlled evaluation environment.
## Preparing for Model Testing
Computer vision models learn from datasets by detecting patterns, making predictions, and evaluating their performance. These [datasets](./preprocessing_annotated_data.md) are usually divided into training and testing sets to simulate real-world conditions. [Training data](https://www.ultralytics.com/glossary/training-data) teaches the model while testing data verifies its accuracy.
Here are two points to keep in mind before testing your model:
- **Realistic Representation:** The previously unseen testing data should be similar to the data that the model will have to handle when deployed. This helps get a realistic understanding of the model's capabilities.
- **Sufficient Size:** The size of the testing dataset needs to be large enough to provide reliable insights into how well the model performs.
## Testing Your Computer Vision Model
Here are the key steps to take to test your computer vision model and understand its performance.
- **Run Predictions:** Use the model to make predictions on the test dataset.
- **Compare Predictions:** Check how well the model's predictions match the actual labels (ground truth).
- **Calculate Performance Metrics:** [Compute metrics](./yolo-performance-metrics.md) like accuracy, precision, recall, and F1 score to understand the model's strengths and weaknesses. Testing focuses on how these metrics reflect real-world performance.
- **Visualize Results:** Create visual aids like confusion matrices and ROC curves. These help you spot specific areas where the model might not be performing well in practical applications.
Next, the testing results can be analyzed:
- **Misclassified Images:** Identify and review images that the model misclassified to understand where it is going wrong.
- **Error Analysis:** Perform a thorough error analysis to understand the types of errors (e.g., false positives vs. false negatives) and their potential causes.
- **Bias and Fairness:** Check for any biases in the model's predictions. Ensure that the model performs equally well across different subsets of the data, especially if it includes sensitive attributes like race, gender, or age.
## Testing Your YOLO11 Model
To test your YOLO11 model, you can use the validation mode. It's a straightforward way to understand the model's strengths and areas that need improvement. Also, you'll need to format your test dataset correctly for YOLO11. For more details on how to use the validation mode, check out the [Model Validation](../modes/val.md) docs page.
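A minimal sketch of this workflow, assuming your dataset YAML defines a `test` split and your weights live at a hypothetical path:

```python
from ultralytics import YOLO

model = YOLO("path/to/best.pt")  # hypothetical path to your trained weights

# Point validation at the held-out test split instead of the default val split
metrics = model.val(data="your_dataset.yaml", split="test")
print(f"mAP50-95 on the test split: {metrics.box.map:.4f}")

# The confusion matrix and PR curves are saved to the run directory for visual review
print(metrics.save_dir)
```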
## Using YOLO11 to Predict on Multiple Test Images
If you want to test your trained YOLO11 model on multiple images stored in a folder, you can easily do so in one go. Instead of using the validation mode, which is typically used to evaluate model performance on a validation set and provide detailed metrics, you might just want to see predictions on all images in your test set. For this, you can use the [prediction mode](../modes/predict.md).
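For example, pointing prediction mode at a folder runs the model over every image it contains; the paths below are hypothetical:

```python
from ultralytics import YOLO

model = YOLO("path/to/best.pt")  # hypothetical path to your trained weights

# Run prediction on every image in a folder; save=True writes annotated copies to disk
results = model.predict(source="path/to/test_images/", save=True)
print(f"Processed {len(results)} images")
```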
### Difference Between Validation and Prediction Modes
- **[Validation Mode](../modes/val.md):** Used to evaluate the model's performance by comparing predictions against known labels (ground truth). It provides detailed metrics such as accuracy, precision, recall, and F1 score.
- **[Prediction Mode](../modes/predict.md):** Used to run the model on new, unseen data to generate predictions. It does not provide detailed performance metrics but allows you to see how the model performs on real-world images.
## Running YOLO11 Predictions Without Custom Training
If you are interested in testing the basic YOLO11 model to understand whether it can be used for your application without custom training, you can use the prediction mode. While the model is pre-trained on datasets like COCO, running predictions on your own dataset can give you a quick sense of how well it might perform in your specific context.
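A quick sanity check with the pretrained weights takes only a few lines; the sample image is arbitrary:

```python
from ultralytics import YOLO

# COCO-pretrained weights, no custom training
model = YOLO("yolo11n.pt")
results = model.predict("https://ultralytics.com/images/bus.jpg")
results[0].show()  # display the annotated image
```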
## Overfitting and [Underfitting](https://www.ultralytics.com/glossary/underfitting) in [Machine Learning](https://www.ultralytics.com/glossary/machine-learning-ml)
When testing a machine learning model, especially in computer vision, it's important to watch out for overfitting and underfitting. These issues can significantly affect how well your model works with new data.
### Overfitting
Overfitting happens when your model learns the training data too well, including the noise and details that don't generalize to new data. In computer vision, this means your model might do great with training images but struggle with new ones.
#### Signs of Overfitting
- **High Training Accuracy, Low Validation Accuracy:** If your model performs very well on training data but poorly on validation or [test data](https://www.ultralytics.com/glossary/test-data), it's likely overfitting.
- **Visual Inspection:** Sometimes, you can see overfitting if your model is too sensitive to minor changes or irrelevant details in images.
### Underfitting
Underfitting occurs when your model can't capture the underlying patterns in the data. In computer vision, an underfitted model might not even recognize objects correctly in the training images.
#### Signs of Underfitting
- **Low Training Accuracy:** If your model can't achieve high accuracy on the training set, it might be underfitting.
- **Visual Misclassification:** Consistent failure to recognize obvious features or objects suggests underfitting.
### Balancing Overfitting and Underfitting
The key is to find a balance between overfitting and underfitting. Ideally, a model should perform well on both training and validation datasets. Regularly monitoring your model's performance through metrics and visual inspections, along with applying the right strategies, can help you achieve the best results.
<p align="center">
<img width="100%" src="https://github.com/ultralytics/docs/releases/download/0/overfitting-underfitting-appropriate-fitting.avif" alt="Overfitting and Underfitting Overview">
</p>
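One practical way to watch for overfitting is to compare training and validation loss curves from a training run. The sketch below assumes an Ultralytics-style `results.csv` in a hypothetical run directory; its column headers may carry padding spaces, hence the `strip()`:

```python
import pandas as pd

df = pd.read_csv("runs/detect/train/results.csv")  # hypothetical run directory
df.columns = df.columns.str.strip()  # column headers may carry padding spaces

train_loss = df["train/box_loss"]
val_loss = df["val/box_loss"]

# Validation loss rising while training loss keeps falling suggests overfitting
gap = (val_loss - train_loss).iloc[-5:].mean()
print(f"Mean val-train box-loss gap over the last 5 epochs: {gap:.3f}")
```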
## Data Leakage in Computer Vision and How to Avoid It
While testing your model, something important to keep in mind is data leakage. Data leakage happens when information from outside the training dataset accidentally gets used to train the model. The model may seem very accurate during training, but it won't perform well on new, unseen data when data leakage occurs.
### Why Data Leakage Happens
Data leakage can be tricky to spot and often comes from hidden biases in the training data. Here are some common ways it can happen in computer vision:
- **Camera Bias:** Different angles, lighting, shadows, and camera movements can introduce unwanted patterns.
- **Overlay Bias:** Logos, timestamps, or other overlays in images can mislead the model.
- **Font and Object Bias:** Specific fonts or objects that frequently appear in certain classes can skew the model's learning.
- **Spatial Bias:** Imbalances in foreground-background, [bounding box](https://www.ultralytics.com/glossary/bounding-box) distributions, and object locations can affect training.
- **Label and Domain Bias:** Incorrect labels or shifts in data types can lead to leakage.
### Detecting Data Leakage
To find data leakage, you can:
- **Check Performance:** If the model's results are surprisingly good, it might be leaking.
- **Look at Feature Importance:** If one feature is much more important than others, it could indicate leakage.
- **Visual Inspection:** Double-check that the model's decisions make sense intuitively.
- **Verify Data Separation:** Make sure data was divided correctly before any processing.
### Avoiding Data Leakage
To prevent data leakage, use a diverse dataset with images or videos from different cameras and environments. Carefully review your data and check that there are no hidden biases, such as all positive samples being taken at a specific time of day. Avoiding data leakage will help make your computer vision models more reliable and effective in real-world situations.
## What Comes After Model Testing
After testing your model, the next steps depend on the results. If your model performs well, you can deploy it into a real-world environment. If the results aren't satisfactory, you'll need to make improvements. This might involve analyzing errors, [gathering more data](./data-collection-and-annotation.md), improving data quality, [adjusting hyperparameters](./hyperparameter-tuning.md), and retraining the model.
## Join the AI Conversation
Becoming part of a community of computer vision enthusiasts can aid in solving problems and learning more efficiently. Here are some ways to connect, seek help, and share your thoughts.
### Community Resources
- **GitHub Issues:** Explore the [YOLO11 GitHub repository](https://github.com/ultralytics/ultralytics/issues) and use the Issues tab to ask questions, report bugs, and suggest new features. The community and maintainers are very active and ready to help.
- **Ultralytics Discord Server:** Join the [Ultralytics Discord server](https://discord.com/invite/ultralytics) to chat with other users and developers, get support, and share your experiences.
### Official Documentation
- **Ultralytics YOLO11 Documentation:** Check out the [official YOLO11 documentation](./index.md) for detailed guides and helpful tips on various computer vision projects.
These resources will help you navigate challenges and remain updated on the latest trends and practices within the computer vision community.
## In Summary
Building trustworthy computer vision models relies on rigorous model testing. By testing the model with previously unseen data, we can analyze it and spot weaknesses like [overfitting](https://www.ultralytics.com/glossary/overfitting) and data leakage. Addressing these issues before deployment helps the model perform well in real-world applications. It's important to remember that model testing is just as crucial as model evaluation in guaranteeing the model's long-term success and effectiveness.
## FAQ
### What are the key differences between model evaluation and model testing in computer vision?
Model evaluation and model testing are distinct steps in a computer vision project. Model evaluation involves using a labeled dataset to compute metrics such as [accuracy](https://www.ultralytics.com/glossary/accuracy), precision, recall, and [F1 score](https://www.ultralytics.com/glossary/f1-score), providing insights into the model's performance with a controlled dataset. Model testing, on the other hand, assesses the model's performance in real-world scenarios by applying it to new, unseen data, ensuring the model's learned behavior aligns with expectations outside the evaluation environment. For a detailed guide, refer to the [steps in a computer vision project](./steps-of-a-cv-project.md).
### How can I test my Ultralytics YOLO11 model on multiple images?
To test your Ultralytics YOLO11 model on multiple images, you can use the [prediction mode](../modes/predict.md). This mode allows you to run the model on new, unseen data to generate predictions without providing detailed metrics. This is ideal for real-world performance testing on larger image sets stored in a folder. For evaluating performance metrics, use the [validation mode](../modes/val.md) instead.
### What should I do if my computer vision model shows signs of overfitting or underfitting?
To address **overfitting**:
- Apply [regularization](https://www.ultralytics.com/glossary/regularization) techniques like dropout.
- Increase the size of the training dataset.
- Simplify the model architecture.
To address **underfitting**:
- Use a more complex model.
- Provide more relevant features.
- Increase training iterations or [epochs](https://www.ultralytics.com/glossary/epoch).
Review misclassified images, perform thorough error analysis, and regularly track performance metrics to maintain a balance. For more information on these concepts, explore our section on [Overfitting and Underfitting](#overfitting-and-underfitting-in-machine-learning).
### How can I detect and avoid data leakage in computer vision?
To detect data leakage:
- Verify that the testing performance is not unusually high.
- Check feature importance for unexpected insights.
- Intuitively review model decisions.
- Ensure correct data division before processing.
To avoid data leakage:
- Use diverse datasets with various environments.
- Carefully review data for hidden biases.
- Ensure no overlapping information between training and testing sets.
For detailed strategies on preventing data leakage, refer to our section on [Data Leakage in Computer Vision](#data-leakage-in-computer-vision-and-how-to-avoid-it).
### What steps should I take after testing my computer vision model?
Post-testing, if the model performance meets the project goals, proceed with deployment. If the results are unsatisfactory, consider:
- Error analysis.
- Gathering more diverse and high-quality data.
- [Hyperparameter tuning](https://www.ultralytics.com/glossary/hyperparameter-tuning).
- Retraining the model.
Gain insights from the [Model Testing Vs. Model Evaluation](#model-testing-vs-model-evaluation) section to refine and enhance model effectiveness in real-world applications.
### How do I run YOLO11 predictions without custom training?
You can run predictions using the pre-trained YOLO11 model on your dataset to see if it suits your application needs. Utilize the [prediction mode](../modes/predict.md) to get a quick sense of performance results without diving into custom training.
---
comments: true
description: Find best practices, optimization strategies, and troubleshooting advice for training computer vision models. Improve your model training efficiency and accuracy.
keywords: Model Training Machine Learning, AI Model Training, Number of Epochs, How to Train a Model in Machine Learning, Machine Learning Best Practices, What is Model Training
---
# Machine Learning Best Practices and Tips for Model Training
## Introduction
One of the most important steps when working on a [computer vision project](./steps-of-a-cv-project.md) is model training. Before reaching this step, you need to [define your goals](./defining-project-goals.md) and [collect and annotate your data](./data-collection-and-annotation.md). After [preprocessing the data](./preprocessing_annotated_data.md) to make sure it is clean and consistent, you can move on to training your model.
<p align="center">
<br>
<iframe loading="lazy" width="720" height="405" src="https://www.youtube.com/embed/GIrFEoR5PoU"
title="YouTube video player" frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
allowfullscreen>
</iframe>
<br>
<strong>Watch:</strong> Model Training Tips | How to Handle Large Datasets | Batch Size, GPU Utilization and <a href="https://www.ultralytics.com/glossary/mixed-precision">Mixed Precision</a>
</p>
So, what is [model training](../modes/train.md)? Model training is the process of teaching your model to recognize visual patterns and make predictions based on your data. It directly impacts the performance and accuracy of your application. In this guide, we'll cover best practices, optimization techniques, and troubleshooting tips to help you train your computer vision models effectively.
## How to Train a [Machine Learning](https://www.ultralytics.com/glossary/machine-learning-ml) Model
A computer vision model is trained by adjusting its internal parameters to minimize errors. Initially, the model is fed a large set of labeled images. It makes predictions about what is in these images, and the predictions are compared to the actual labels or contents to calculate errors. These errors show how far off the model's predictions are from the true values.
During training, the model iteratively makes predictions, calculates errors, and updates its parameters through a process called [backpropagation](https://www.ultralytics.com/glossary/backpropagation). In this process, the model adjusts its internal parameters (weights and biases) to reduce the errors. By repeating this cycle many times, the model gradually improves its accuracy. Over time, it learns to recognize complex patterns such as shapes, colors, and textures.
<p align="center">
<img width="100%" src="https://github.com/ultralytics/docs/releases/download/0/backpropagation-diagram.avif" alt="What is Backpropagation?">
</p>
This learning process makes it possible for the [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) model to perform various [tasks](../tasks/index.md), including [object detection](../tasks/detect.md), [instance segmentation](../tasks/segment.md), and [image classification](../tasks/classify.md). The ultimate goal is to create a model that can generalize its learning to new, unseen images so that it can accurately understand visual data in real-world applications.
Now that we know what is happening behind the scenes when we train a model, let's look at points to consider when training a model.
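As a concrete reference for the options discussed in the rest of this guide, here's a minimal sketch of launching a YOLO11 training run; the dataset YAML, epoch count, and image size are illustrative values:

```python
from ultralytics import YOLO

# Start from a pretrained YOLO11n model
model = YOLO("yolo11n.pt")

# Train on a dataset described by a YAML file; values here are illustrative
results = model.train(data="coco8.yaml", epochs=100, imgsz=640)
```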
## Training on Large Datasets
There are a few different aspects to think about when you are planning on using a large dataset to train a model. For example, you can adjust the batch size, control the GPU utilization, choose to use multiscale training, etc. Let's walk through each of these options in detail.
### Batch Size and GPU Utilization
When training models on large datasets, efficiently utilizing your GPU is key. Batch size is an important factor. It is the number of data samples that a machine learning model processes in a single training iteration.
Using the maximum batch size supported by your GPU, you can fully take advantage of its capabilities and reduce the time model training takes. However, you want to avoid running out of GPU memory. If you encounter memory errors, reduce the batch size incrementally until the model trains smoothly.
With respect to YOLO11, you can set the `batch` parameter in the [training configuration](../modes/train.md) to match your GPU capacity. Also, setting `batch=-1` in your training script will automatically determine the [batch size](https://www.ultralytics.com/glossary/batch-size) that can be efficiently processed based on your device's capabilities. By fine-tuning the batch size, you can make the most of your GPU resources and improve the overall training process.
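For illustration, a minimal sketch of automatic batch-size selection (the dataset YAML is an illustrative value):

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")

# batch=-1 auto-selects a batch size that fits the available GPU memory;
# a fixed value such as batch=32 can be set instead
model.train(data="coco8.yaml", epochs=100, batch=-1)
```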
### Subset Training
Subset training is a smart strategy that involves training your model on a smaller set of data that represents the larger dataset. It can save time and resources, especially during initial model development and testing. If you are running short on time or experimenting with different model configurations, subset training is a good option.
When it comes to YOLO11, you can easily implement subset training by using the `fraction` parameter. This parameter lets you specify what fraction of your dataset to use for training. For example, setting `fraction=0.1` will train your model on 10% of the data. You can use this technique for quick iterations and tuning before committing to training on the full dataset. Subset training helps you make rapid progress and identify potential issues early on.
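For example, a quick sketch of training on a 10% subset (epoch count and dataset YAML are placeholders):

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")

# fraction=0.1 trains on 10% of the dataset for quick experiments
model.train(data="coco8.yaml", epochs=50, fraction=0.1)
```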
### Multi-scale Training
Multiscale training is a technique that improves your model's ability to generalize by training it on images of varying sizes. Your model can learn to detect objects at different scales and distances and become more robust.
For example, when you train YOLO11, you can enable multiscale training by setting the `scale` parameter. This parameter adjusts the size of training images by a specified factor, simulating objects at different distances. For example, setting `scale=0.5` will reduce the image size by half, while `scale=2.0` will double it. Configuring this parameter allows your model to experience a variety of image scales and improve its detection capabilities across different object sizes and scenarios.
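For instance, a minimal sketch of enabling multiscale training via the `scale` parameter:

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")

# scale=0.5 adjusts training image sizes by the specified factor,
# exposing the model to objects at different apparent scales
model.train(data="coco8.yaml", epochs=100, scale=0.5)
```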
### Caching
Caching is an important technique to improve the efficiency of training machine learning models. By storing preprocessed images in memory, caching reduces the time the GPU spends waiting for data to be loaded from the disk. The model can continuously receive data without delays caused by disk I/O operations.
Caching can be controlled when training YOLO11 using the `cache` parameter:
- _`cache=True`_: Stores dataset images in RAM, providing the fastest access speed but at the cost of increased memory usage.
- _`cache='disk'`_: Stores the images on disk, slower than RAM but faster than loading fresh data each time.
- _`cache=False`_: Disables caching, relying entirely on disk I/O, which is the slowest option.
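For example, a minimal sketch of enabling RAM caching during training:

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")

# cache=True keeps preprocessed images in RAM; use cache="disk" for
# on-disk caching or cache=False to disable caching entirely
model.train(data="coco8.yaml", epochs=100, cache=True)
```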
### Mixed Precision Training
Mixed precision training uses both 16-bit (FP16) and 32-bit (FP32) floating-point types. The strengths of both FP16 and FP32 are leveraged by using FP16 for faster computation and FP32 to maintain precision where needed. Most of the [neural network](https://www.ultralytics.com/glossary/neural-network-nn)'s operations are done in FP16 to benefit from faster computation and lower memory usage. However, a master copy of the model's weights is kept in FP32 to ensure accuracy during the weight update steps. As a result, you can handle larger models or larger batch sizes within the same hardware constraints.
<p align="center">
<img width="100%" src="https://github.com/ultralytics/docs/releases/download/0/mixed-precision-training-overview.avif" alt="Mixed Precision Training Overview">
</p>
To implement mixed precision training, you'll need to modify your training scripts and ensure your hardware (like GPUs) supports it. Many modern [deep learning](https://www.ultralytics.com/glossary/deep-learning-dl) frameworks, such as [Tensorflow](https://www.ultralytics.com/glossary/tensorflow), offer built-in support for mixed precision.
Mixed precision training is straightforward when working with YOLO11. You can use the `amp` flag in your training configuration. Setting `amp=True` enables Automatic Mixed Precision (AMP) training. Mixed precision training is a simple yet effective way to optimize your model training process.
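For example, a minimal sketch of enabling AMP in a training run:

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")

# amp=True enables Automatic Mixed Precision (FP16/FP32) training
model.train(data="coco8.yaml", epochs=100, amp=True)
```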
### Pre-trained Weights
Using pretrained weights is a smart way to speed up your model's training process. Pretrained weights come from models already trained on large datasets, giving your model a head start. [Transfer learning](https://www.ultralytics.com/glossary/transfer-learning) adapts pretrained models to new, related tasks. Fine-tuning a pre-trained model involves starting with these weights and then continuing training on your specific dataset. This method of training results in faster training times and often better performance because the model starts with a solid understanding of basic features.
The `pretrained` parameter makes transfer learning easy with YOLO11. Setting `pretrained=True` will use default pre-trained weights, or you can specify a path to a custom pre-trained model. Using pre-trained weights and transfer learning effectively boosts your model's capabilities and reduces training costs.
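For illustration, a minimal sketch of transfer learning with the `pretrained` parameter (the custom checkpoint path in the comment is a placeholder):

```python
from ultralytics import YOLO

# Start from a pretrained YOLO11n model
model = YOLO("yolo11n.pt")

# pretrained=True uses default pretrained weights; a path to a custom
# checkpoint can be passed instead, e.g. pretrained="path/to/best.pt"
model.train(data="coco8.yaml", epochs=100, pretrained=True)
```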
### Other Techniques to Consider When Handling a Large Dataset
There are a couple of other techniques to consider when handling a large dataset:
- **[Learning Rate](https://www.ultralytics.com/glossary/learning-rate) Schedulers**: Implementing learning rate schedulers dynamically adjusts the learning rate during training. A well-tuned learning rate can prevent the model from overshooting minima and improve stability. When training YOLO11, the `lrf` parameter helps manage learning rate scheduling by setting the final learning rate as a fraction of the initial rate (see the sketch after this list).
- **Distributed Training**: For handling large datasets, distributed training can be a game-changer. You can reduce the training time by spreading the training workload across multiple GPUs or machines.
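As a rough illustration of both points, here's a minimal sketch combining a final-learning-rate fraction with multi-GPU training, assuming two GPUs are available (the dataset YAML and values are placeholders):

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")

# lrf sets the final learning rate as a fraction of the initial rate (lr0);
# device=[0, 1] spreads training across two GPUs when available
model.train(data="coco8.yaml", epochs=100, lrf=0.01, device=[0, 1])
```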
## The Number of Epochs To Train For
When training a model, an epoch refers to one complete pass through the entire training dataset. During an epoch, the model processes each example in the training set once and updates its parameters based on the learning algorithm. Multiple epochs are usually needed to allow the model to learn and refine its parameters over time.
A common question that comes up is how to determine the number of epochs to train the model for. A good starting point is 300 epochs. If the model overfits early, you can reduce the number of epochs. If [overfitting](https://www.ultralytics.com/glossary/overfitting) does not occur after 300 epochs, you can extend the training to 600, 1200, or more epochs.
However, the ideal number of epochs can vary based on your dataset's size and project goals. Larger datasets might require more epochs for the model to learn effectively, while smaller datasets might need fewer epochs to avoid overfitting. With respect to YOLO11, you can set the `epochs` parameter in your training script.
## Early Stopping
Early stopping is a valuable technique for optimizing model training. By monitoring validation performance, you can halt training once the model stops improving, saving computational resources and preventing overfitting.
The process involves setting a patience parameter that determines how many [epochs](https://www.ultralytics.com/glossary/epoch) to wait for an improvement in validation metrics before stopping training. If the model's performance does not improve within these epochs, training is stopped to avoid wasting time and resources.
<p align="center">
<img width="100%" src="https://github.com/ultralytics/docs/releases/download/0/early-stopping-overview.avif" alt="Early Stopping Overview">
</p>
For YOLO11, you can enable early stopping by setting the patience parameter in your training configuration. For example, `patience=5` means training will stop if there's no improvement in validation metrics for 5 consecutive epochs. Using this method ensures the training process remains efficient and achieves optimal performance without excessive computation.
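For example, a minimal sketch of enabling early stopping with a patience of 5 epochs:

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")

# patience=5 stops training early if validation metrics show no
# improvement for 5 consecutive epochs
model.train(data="coco8.yaml", epochs=300, patience=5)
```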
## Choosing Between Cloud and Local Training
There are two options for training your model: cloud training and local training.
Cloud training offers scalability and powerful hardware and is ideal for handling large datasets and complex models. Platforms like Google Cloud, AWS, and Azure provide on-demand access to high-performance GPUs and TPUs, speeding up training times and enabling experiments with larger models. However, cloud training can be expensive, especially for long periods, and data transfer can add to costs and latency.
Local training provides greater control and customization, letting you tailor your environment to specific needs and avoid ongoing cloud costs. It can be more economical for long-term projects, and since your data stays on-premises, it's more secure. However, local hardware may have resource limitations and require maintenance, which can lead to longer training times for large models.
## Selecting an Optimizer
An optimizer is an algorithm that adjusts the weights of your neural network to minimize the [loss function](https://www.ultralytics.com/glossary/loss-function), which measures how well the model is performing. In simpler terms, the optimizer helps the model learn by tweaking its parameters to reduce errors. Choosing the right optimizer directly affects how quickly and accurately the model learns.
You can also fine-tune optimizer parameters to improve model performance. Adjusting the learning rate sets the size of the steps when updating parameters. For stability, you might start with a moderate learning rate and gradually decrease it over time to improve long-term learning. Additionally, setting the momentum determines how much influence past updates have on current updates. A common value for momentum is around 0.9. It generally provides a good balance.
### Common Optimizers
Different optimizers have various strengths and weaknesses. Let's take a look at a few common optimizers.
- **SGD (Stochastic Gradient Descent)**:
- Updates model parameters using the gradient of the loss function with respect to the parameters.
- Simple and efficient but can be slow to converge and might get stuck in local minima.
- **Adam (Adaptive Moment Estimation)**:
- Combines the benefits of both SGD with momentum and RMSProp.
- Adjusts the learning rate for each parameter based on estimates of the first and second moments of the gradients.
- Well-suited for noisy data and sparse gradients.
- Efficient and generally requires less tuning, making it a recommended optimizer for YOLO11.
- **RMSProp (Root Mean Square Propagation)**:
- Adjusts the learning rate for each parameter by dividing the gradient by a running average of the magnitudes of recent gradients.
- Helps in handling the vanishing gradient problem and is effective for [recurrent neural networks](https://www.ultralytics.com/glossary/recurrent-neural-network-rnn).
For YOLO11, the `optimizer` parameter lets you choose from various optimizers, including SGD, Adam, AdamW, NAdam, RAdam, and RMSProp, or you can set it to `auto` for automatic selection based on model configuration.
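For illustration, a minimal sketch of setting the optimizer in a training run:

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")

# optimizer can be "SGD", "Adam", "AdamW", "NAdam", "RAdam", "RMSProp",
# or "auto" for automatic selection based on the model configuration
model.train(data="coco8.yaml", epochs=100, optimizer="auto")
```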
## Connecting with the Community
Being part of a community of computer vision enthusiasts can help you solve problems and learn faster. Here are some ways to connect, get help, and share ideas.
### Community Resources
- **GitHub Issues:** Visit the [YOLO11 GitHub repository](https://github.com/ultralytics/ultralytics/issues) and use the Issues tab to ask questions, report bugs, and suggest new features. The community and maintainers are very active and ready to help.
- **Ultralytics Discord Server:** Join the [Ultralytics Discord server](https://discord.com/invite/ultralytics) to chat with other users and developers, get support, and share your experiences.
### Official Documentation
- **Ultralytics YOLO11 Documentation:** Check out the [official YOLO11 documentation](./index.md) for detailed guides and helpful tips on various computer vision projects.
Using these resources will help you solve challenges and stay up-to-date with the latest trends and practices in the computer vision community.
## Key Takeaways
Training computer vision models involves following good practices, optimizing your strategies, and solving problems as they arise. Techniques like adjusting batch sizes, mixed [precision](https://www.ultralytics.com/glossary/precision) training, and starting with pre-trained weights can make your models work better and train faster. Methods like subset training and early stopping help you save time and resources. Staying connected with the community and keeping up with new trends will help you keep improving your model training skills.
## FAQ
### How can I improve GPU utilization when training a large dataset with Ultralytics YOLO?
To improve GPU utilization, set the `batch` parameter in your training configuration to the maximum size supported by your GPU. This ensures that you make full use of the GPU's capabilities, reducing training time. If you encounter memory errors, incrementally reduce the batch size until training runs smoothly. For YOLO11, setting `batch=-1` in your training script will automatically determine the optimal batch size for efficient processing. For further information, refer to the [training configuration](../modes/train.md).
### What is mixed precision training, and how do I enable it in YOLO11?
Mixed precision training utilizes both 16-bit (FP16) and 32-bit (FP32) floating-point types to balance computational speed and precision. This approach speeds up training and reduces memory usage without sacrificing model [accuracy](https://www.ultralytics.com/glossary/accuracy). To enable mixed precision training in YOLO11, set the `amp` parameter to `True` in your training configuration. This activates Automatic Mixed Precision (AMP) training. For more details on this optimization technique, see the [training configuration](../modes/train.md).
### How does multiscale training enhance YOLO11 model performance?
Multiscale training enhances model performance by training on images of varying sizes, allowing the model to better generalize across different scales and distances. In YOLO11, you can enable multiscale training by setting the `scale` parameter in the training configuration. For example, `scale=0.5` reduces the image size by half, while `scale=2.0` doubles it. This technique simulates objects at different distances, making the model more robust across various scenarios. For settings and more details, check out the [training configuration](../modes/train.md).
### How can I use pre-trained weights to speed up training in YOLO11?
Using pre-trained weights can significantly reduce training times and improve model performance by starting from a model that already understands basic features. In YOLO11, you can set the `pretrained` parameter to `True` or specify a path to custom pre-trained weights in your training configuration. This approach, known as transfer learning, leverages knowledge from large datasets to adapt to your specific task. Learn more about pre-trained weights and their advantages [here](../modes/train.md).
### What is the recommended number of epochs for training a model, and how do I set this in YOLO11?
The number of epochs refers to the complete passes through the training dataset during model training. A typical starting point is 300 epochs. If your model overfits early, you can reduce the number. Alternatively, if overfitting isn't observed, you might extend training to 600, 1200, or more epochs. To set this in YOLO11, use the `epochs` parameter in your training script. For additional advice on determining the ideal number of epochs, refer to this section on [number of epochs](#the-number-of-epochs-to-train-for).
---
comments: true
description: Learn to deploy Ultralytics YOLO11 on NVIDIA Jetson devices with our detailed guide. Explore performance benchmarks and maximize AI capabilities.
keywords: Ultralytics, YOLO11, NVIDIA Jetson, JetPack, AI deployment, performance benchmarks, embedded systems, deep learning, TensorRT, computer vision
---
# Quick Start Guide: NVIDIA Jetson with Ultralytics YOLO11
This comprehensive guide provides a detailed walkthrough for deploying Ultralytics YOLO11 on [NVIDIA Jetson](https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/) devices. Additionally, it showcases performance benchmarks to demonstrate the capabilities of YOLO11 on these small and powerful devices.
<p align="center">
<br>
<iframe loading="lazy" width="720" height="405" src="https://www.youtube.com/embed/mUybgOlSxxA"
title="YouTube video player" frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
allowfullscreen>
</iframe>
<br>
<strong>Watch:</strong> How to Setup NVIDIA Jetson with Ultralytics YOLO11
</p>
<img width="1024" src="https://github.com/ultralytics/docs/releases/download/0/nvidia-jetson-ecosystem.avif" alt="NVIDIA Jetson Ecosystem">
!!! note
This guide has been tested with the [Seeed Studio reComputer J4012](https://www.seeedstudio.com/reComputer-J4012-p-5586.html), which is based on the NVIDIA Jetson Orin NX 16GB running the latest stable JetPack release [JP6.0](https://developer.nvidia.com/embedded/jetpack-sdk-60) as well as JetPack release [JP5.1.3](https://developer.nvidia.com/embedded/jetpack-sdk-513), and the [Seeed Studio reComputer J1020 v2](https://www.seeedstudio.com/reComputer-J1020-v2-p-5498.html), which is based on the NVIDIA Jetson Nano 4GB running JetPack release [JP4.6.1](https://developer.nvidia.com/embedded/jetpack-sdk-461). It is expected to work across the entire NVIDIA Jetson hardware lineup, including the latest and legacy devices.
## What is NVIDIA Jetson?
NVIDIA Jetson is a series of embedded computing boards designed to bring accelerated AI (artificial intelligence) computing to edge devices. These compact and powerful devices are built around NVIDIA's GPU architecture and are capable of running complex AI algorithms and [deep learning](https://www.ultralytics.com/glossary/deep-learning-dl) models directly on the device, without needing to rely on [cloud computing](https://www.ultralytics.com/glossary/cloud-computing) resources. Jetson boards are often used in robotics, autonomous vehicles, industrial automation, and other applications where AI inference needs to be performed locally with low latency and high efficiency. Additionally, these boards are based on the ARM64 architecture and run on lower power compared to traditional GPU computing devices.
## NVIDIA Jetson Series Comparison
[Jetson Orin](https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/jetson-orin/) is the latest iteration of the NVIDIA Jetson family, based on the NVIDIA Ampere architecture, which brings drastically improved AI performance compared to previous generations. The table below compares a few of the Jetson devices in the ecosystem.
| | Jetson AGX Orin 64GB | Jetson Orin NX 16GB | Jetson Orin Nano 8GB | Jetson AGX Xavier | Jetson Xavier NX | Jetson Nano |
| ----------------- | ----------------------------------------------------------------- | ---------------------------------------------------------------- | ------------------------------------------------------------- | ----------------------------------------------------------- | ------------------------------------------------------------- | --------------------------------------------- |
| AI Performance | 275 TOPS | 100 TOPS | 40 TOPS | 32 TOPS | 21 TOPS | 472 GFLOPS |
| GPU | 2048-core NVIDIA Ampere architecture GPU with 64 Tensor Cores | 1024-core NVIDIA Ampere architecture GPU with 32 Tensor Cores | 1024-core NVIDIA Ampere architecture GPU with 32 Tensor Cores | 512-core NVIDIA Volta architecture GPU with 64 Tensor Cores | 384-core NVIDIA Volta™ architecture GPU with 48 Tensor Cores | 128-core NVIDIA Maxwell™ architecture GPU |
| GPU Max Frequency | 1.3 GHz | 918 MHz | 625 MHz | 1377 MHz | 1100 MHz | 921MHz |
| CPU | 12-core NVIDIA Arm® Cortex A78AE v8.2 64-bit CPU 3MB L2 + 6MB L3 | 8-core NVIDIA Arm® Cortex A78AE v8.2 64-bit CPU 2MB L2 + 4MB L3 | 6-core Arm® Cortex®-A78AE v8.2 64-bit CPU 1.5MB L2 + 4MB L3 | 8-core NVIDIA Carmel Arm®v8.2 64-bit CPU 8MB L2 + 4MB L3 | 6-core NVIDIA Carmel Arm®v8.2 64-bit CPU 6MB L2 + 4MB L3 | Quad-Core Arm® Cortex®-A57 MPCore processor |
| CPU Max Frequency | 2.2 GHz | 2.0 GHz | 1.5 GHz | 2.2 GHz | 1.9 GHz | 1.43GHz |
| Memory | 64GB 256-bit LPDDR5 204.8GB/s | 16GB 128-bit LPDDR5 102.4GB/s | 8GB 128-bit LPDDR5 68 GB/s | 32GB 256-bit LPDDR4x 136.5GB/s | 8GB 128-bit LPDDR4x 59.7GB/s | 4GB 64-bit LPDDR4 25.6GB/s |
For a more detailed comparison table, please visit the **Technical Specifications** section of [official NVIDIA Jetson page](https://developer.nvidia.com/embedded/jetson-modules).
## What is NVIDIA JetPack?
[NVIDIA JetPack SDK](https://developer.nvidia.com/embedded/jetpack), which powers the Jetson modules, is the most comprehensive solution and provides a full development environment for building end-to-end accelerated AI applications, shortening time to market. JetPack includes Jetson Linux with bootloader, Linux kernel, Ubuntu desktop environment, and a complete set of libraries for the acceleration of GPU computing, multimedia, graphics, and [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv). It also includes samples, documentation, and developer tools for both the host computer and developer kit, and supports higher-level SDKs such as DeepStream for streaming video analytics, Isaac for robotics, and Riva for conversational AI.
## Flash JetPack to NVIDIA Jetson
The first step after getting your hands on an NVIDIA Jetson device is to flash NVIDIA JetPack to the device. There are several different ways of flashing NVIDIA Jetson devices.
1. If you own an official NVIDIA Development Kit such as the Jetson Orin Nano Developer Kit, you can [download an image and prepare an SD card with JetPack for booting the device](https://developer.nvidia.com/embedded/learn/get-started-jetson-orin-nano-devkit).
2. If you own any other NVIDIA Development Kit, you can [flash JetPack to the device using SDK Manager](https://docs.nvidia.com/sdk-manager/install-with-sdkm-jetson/index.html).
3. If you own a Seeed Studio reComputer J4012 device, you can [flash JetPack to the included SSD](https://wiki.seeedstudio.com/reComputer_J4012_Flash_Jetpack/) and if you own a Seeed Studio reComputer J1020 v2 device, you can [flash JetPack to the eMMC/SSD](https://wiki.seeedstudio.com/reComputer_J2021_J202_Flash_Jetpack/).
4. If you own any other third party device powered by the NVIDIA Jetson module, it is recommended to follow [command-line flashing](https://docs.nvidia.com/jetson/archives/r35.5.0/DeveloperGuide/IN/QuickStart.html).
!!! note
For methods 3 and 4 above, after flashing the system and booting the device, please run `sudo apt update && sudo apt install nvidia-jetpack -y` in the device terminal to install all the remaining JetPack components needed.
## JetPack Support Based on Jetson Device
The below table highlights NVIDIA JetPack versions supported by different NVIDIA Jetson devices.
| | JetPack 4 | JetPack 5 | JetPack 6 |
| ----------------- | --------- | --------- | --------- |
| Jetson Nano | ✅ | ❌ | ❌ |
| Jetson TX2 | ✅ | ❌ | ❌ |
| Jetson Xavier NX | ✅ | ✅ | ❌ |
| Jetson AGX Xavier | ✅ | ✅ | ❌ |
| Jetson AGX Orin | ❌ | ✅ | ✅ |
| Jetson Orin NX | ❌ | ✅ | ✅ |
| Jetson Orin Nano | ❌ | ✅ | ✅ |
## Quick Start with Docker
The fastest way to get started with Ultralytics YOLO11 on NVIDIA Jetson is to run with the pre-built Docker images for Jetson. Refer to the table above and choose the JetPack version according to the Jetson device you own.
=== "JetPack 4"
```bash
t=ultralytics/ultralytics:latest-jetson-jetpack4
sudo docker pull $t && sudo docker run -it --ipc=host --runtime=nvidia $t
```
=== "JetPack 5"
```bash
t=ultralytics/ultralytics:latest-jetson-jetpack5
sudo docker pull $t && sudo docker run -it --ipc=host --runtime=nvidia $t
```
=== "JetPack 6"
```bash
t=ultralytics/ultralytics:latest-jetson-jetpack6
sudo docker pull $t && sudo docker run -it --ipc=host --runtime=nvidia $t
```
After this is done, skip to the [Use TensorRT on NVIDIA Jetson](#use-tensorrt-on-nvidia-jetson) section.
## Start with Native Installation
For a native installation without Docker, please refer to the steps below.
### Run on JetPack 6.x
#### Install Ultralytics Package
Here we will install the Ultralytics package on the Jetson with optional dependencies so that we can export the [PyTorch](https://www.ultralytics.com/glossary/pytorch) models to other formats. We will mainly focus on [NVIDIA TensorRT exports](../integrations/tensorrt.md) because TensorRT will make sure we can get the maximum performance out of the Jetson devices.
1. Update packages list, install pip and upgrade to latest
```bash
sudo apt update
sudo apt install python3-pip -y
pip install -U pip
```
2. Install `ultralytics` pip package with optional dependencies
```bash
pip install ultralytics[export]
```
3. Reboot the device
```bash
sudo reboot
```
#### Install PyTorch and Torchvision
The above Ultralytics installation installs Torch and Torchvision. However, these two packages installed via pip are not compatible with the Jetson platform, which is based on the ARM64 architecture. Therefore, we need to manually install a pre-built PyTorch pip wheel and compile/install Torchvision from source.
Install `torch 2.3.0` and `torchvision 0.18` according to JP6.0
```bash
sudo apt-get install libopenmpi-dev libopenblas-base libomp-dev -y
pip install https://github.com/ultralytics/assets/releases/download/v0.0.0/torch-2.3.0-cp310-cp310-linux_aarch64.whl
pip install https://github.com/ultralytics/assets/releases/download/v0.0.0/torchvision-0.18.0a0+6043bc2-cp310-cp310-linux_aarch64.whl
```
Visit the [PyTorch for Jetson page](https://forums.developer.nvidia.com/t/pytorch-for-jetson/72048) to access all different versions of PyTorch for different JetPack versions. For a more detailed list on the PyTorch, Torchvision compatibility, visit the [PyTorch and Torchvision compatibility page](https://github.com/pytorch/vision).
#### Install `onnxruntime-gpu`
The [onnxruntime-gpu](https://pypi.org/project/onnxruntime-gpu/) package hosted in PyPI does not have `aarch64` binaries for the Jetson. So we need to manually install this package. This package is needed for some of the exports.
All different `onnxruntime-gpu` packages corresponding to different JetPack and Python versions are listed [here](https://elinux.org/Jetson_Zoo#ONNX_Runtime). However, here we will download and install `onnxruntime-gpu 1.18.0` with `Python3.10` support.
```bash
wget https://nvidia.box.com/shared/static/48dtuob7meiw6ebgfsfqakc9vse62sg4.whl -O onnxruntime_gpu-1.18.0-cp310-cp310-linux_aarch64.whl
pip install onnxruntime_gpu-1.18.0-cp310-cp310-linux_aarch64.whl
```
!!! note
Installing `onnxruntime-gpu` will automatically change the installed numpy version to the latest. So we need to reinstall numpy to `1.23.5` to fix an issue by executing:
`pip install numpy==1.23.5`
### Run on JetPack 5.x
#### Install Ultralytics Package
Here we will install the Ultralytics package on the Jetson with optional dependencies so that we can export the PyTorch models to other formats. We will mainly focus on [NVIDIA TensorRT exports](../integrations/tensorrt.md) because TensorRT will make sure we can get the maximum performance out of the Jetson devices.
1. Update packages list, install pip and upgrade to latest
```bash
sudo apt update
sudo apt install python3-pip -y
pip install -U pip
```
2. Install `ultralytics` pip package with optional dependencies
```bash
pip install ultralytics[export]
```
3. Reboot the device
```bash
sudo reboot
```
#### Install PyTorch and Torchvision
The above Ultralytics installation installs Torch and Torchvision. However, these two packages installed via pip are not compatible with the Jetson platform, which is based on the ARM64 architecture. Therefore, we need to manually install a pre-built PyTorch pip wheel and compile/install Torchvision from source.
1. Uninstall currently installed PyTorch and Torchvision
```bash
pip uninstall torch torchvision
```
2. Install PyTorch 2.1.0 according to JP5.1.3
```bash
sudo apt-get install -y libopenblas-base libopenmpi-dev
wget https://developer.download.nvidia.com/compute/redist/jp/v512/pytorch/torch-2.1.0a0+41361538.nv23.06-cp38-cp38-linux_aarch64.whl -O torch-2.1.0a0+41361538.nv23.06-cp38-cp38-linux_aarch64.whl
pip install torch-2.1.0a0+41361538.nv23.06-cp38-cp38-linux_aarch64.whl
```
3. Install Torchvision v0.16.2 according to PyTorch v2.1.0
```bash
sudo apt install -y libjpeg-dev zlib1g-dev
git clone https://github.com/pytorch/vision torchvision
cd torchvision
git checkout v0.16.2
python3 setup.py install --user
```
Visit the [PyTorch for Jetson page](https://forums.developer.nvidia.com/t/pytorch-for-jetson/72048) to access all different versions of PyTorch for different JetPack versions. For a more detailed list on the PyTorch, Torchvision compatibility, visit the [PyTorch and Torchvision compatibility page](https://github.com/pytorch/vision).
#### Install `onnxruntime-gpu`
The [onnxruntime-gpu](https://pypi.org/project/onnxruntime-gpu/) package hosted in PyPI does not have `aarch64` binaries for the Jetson. So we need to manually install this package. This package is needed for some of the exports.
All different `onnxruntime-gpu` packages corresponding to different JetPack and Python versions are listed [here](https://elinux.org/Jetson_Zoo#ONNX_Runtime). However, here we will download and install `onnxruntime-gpu 1.17.0` with `Python3.8` support.
```bash
wget https://nvidia.box.com/shared/static/zostg6agm00fb6t5uisw51qi6kpcuwzd.whl -O onnxruntime_gpu-1.17.0-cp38-cp38-linux_aarch64.whl
pip install onnxruntime_gpu-1.17.0-cp38-cp38-linux_aarch64.whl
```
!!! note
Installing `onnxruntime-gpu` will automatically change the installed numpy version to the latest. So we need to reinstall numpy to `1.23.5` to fix an issue by executing:
`pip install numpy==1.23.5`
## Use TensorRT on NVIDIA Jetson
Out of all the model export formats supported by Ultralytics, TensorRT delivers the best inference performance when working with NVIDIA Jetson devices and our recommendation is to use TensorRT with Jetson. We also have a detailed document on TensorRT [here](../integrations/tensorrt.md).
### Convert Model to TensorRT and Run Inference
The YOLO11n model in PyTorch format is converted to TensorRT to run inference with the exported model.
!!! example
=== "Python"
```python
from ultralytics import YOLO
# Load a YOLO11n PyTorch model
model = YOLO("yolo11n.pt")
# Export the model to TensorRT
model.export(format="engine") # creates 'yolo11n.engine'
# Load the exported TensorRT model
trt_model = YOLO("yolo11n.engine")
# Run inference
results = trt_model("https://ultralytics.com/images/bus.jpg")
```
=== "CLI"
```bash
# Export a YOLO11n PyTorch model to TensorRT format
yolo export model=yolo11n.pt format=engine # creates 'yolo11n.engine'
# Run inference with the exported model
yolo predict model=yolo11n.engine source='https://ultralytics.com/images/bus.jpg'
```
### Use NVIDIA Deep Learning Accelerator (DLA)
[NVIDIA Deep Learning Accelerator (DLA)](https://developer.nvidia.com/deep-learning-accelerator) is a specialized hardware component built into NVIDIA Jetson devices that optimizes deep learning inference for energy efficiency and performance. By offloading tasks from the GPU (freeing it up for more intensive processes), DLA enables models to run with lower power consumption while maintaining high throughput, ideal for embedded systems and real-time AI applications.
The following Jetson devices are equipped with DLA hardware:
- Jetson Orin NX 16GB
- Jetson AGX Orin Series
- Jetson AGX Xavier Series
- Jetson Xavier NX Series
!!! example
=== "Python"
```python
from ultralytics import YOLO
# Load a YOLO11n PyTorch model
model = YOLO("yolo11n.pt")
# Export the model to TensorRT with DLA enabled (only works with FP16 or INT8)
model.export(format="engine", device="dla:0", half=True) # dla:0 or dla:1 corresponds to the DLA cores
# Load the exported TensorRT model
trt_model = YOLO("yolo11n.engine")
# Run inference
results = trt_model("https://ultralytics.com/images/bus.jpg")
```
=== "CLI"
```bash
# Export a YOLO11n PyTorch model to TensorRT format with DLA enabled (only works with FP16 or INT8)
yolo export model=yolo11n.pt format=engine device="dla:0" half=True # dla:0 or dla:1 corresponds to the DLA cores
# Run inference with the exported model on the DLA
yolo predict model=yolo11n.engine source='https://ultralytics.com/images/bus.jpg'
```
!!! note
Visit the [Export page](../modes/export.md#arguments) to access additional arguments when exporting models to different model formats.
## NVIDIA Jetson Orin YOLO11 Benchmarks
YOLO11 benchmarks were run by the Ultralytics team on 10 different model formats measuring speed and [accuracy](https://www.ultralytics.com/glossary/accuracy): PyTorch, TorchScript, ONNX, OpenVINO, TensorRT, TF SavedModel, TF GraphDef, TF Lite, PaddlePaddle, NCNN. Benchmarks were run on Seeed Studio reComputer J4012 powered by Jetson Orin NX 16GB device at FP32 [precision](https://www.ultralytics.com/glossary/precision) with default input image size of 640.
### Comparison Chart
Even though all model exports work with NVIDIA Jetson, we have only included **PyTorch, TorchScript, TensorRT** in the comparison chart below because they make use of the GPU on the Jetson and are guaranteed to produce the best results. All the other exports only utilize the CPU, and their performance is not as good as the above three. You can find benchmarks for all exports in the section after this chart.
<div style="text-align: center;">
<img src="https://github.com/ultralytics/docs/releases/download/0/nvidia-jetson-benchmarks.avif" alt="NVIDIA Jetson Ecosystem">
</div>
### Detailed Comparison Table
The table below shows the benchmark results for five different models (YOLO11n, YOLO11s, YOLO11m, YOLO11l, YOLO11x) across ten different formats (PyTorch, TorchScript, ONNX, OpenVINO, TensorRT, TF SavedModel, TF GraphDef, TF Lite, PaddlePaddle, NCNN), giving us the status, size, mAP50-95(B) metric, and inference time for each combination.
!!! performance
=== "YOLO11n"
| Format | Status | Size on disk (MB) | mAP50-95(B) | Inference time (ms/im) |
|-----------------|--------|-------------------|-------------|------------------------|
| PyTorch | ✅ | 5.4 | 0.6176 | 19.80 |
| TorchScript | ✅ | 10.5 | 0.6100 | 13.30 |
| ONNX | ✅ | 10.2 | 0.6082 | 67.92 |
| OpenVINO | ✅ | 10.4 | 0.6082 | 118.21 |
| TensorRT (FP32) | ✅ | 14.1 | 0.6100 | 7.94 |
| TensorRT (FP16) | ✅ | 8.3 | 0.6082 | 4.80 |
| TensorRT (INT8) | ✅ | 6.6 | 0.3256 | 4.17 |
| TF SavedModel | ✅ | 25.8 | 0.6082 | 185.88 |
| TF GraphDef | ✅ | 10.3 | 0.6082 | 256.66 |
| TF Lite | ✅ | 10.3 | 0.6082 | 284.64 |
| PaddlePaddle | ✅ | 20.4 | 0.6082 | 477.41 |
| NCNN | ✅ | 10.2 | 0.6106 | 32.18 |
=== "YOLO11s"
| Format | Status | Size on disk (MB) | mAP50-95(B) | Inference time (ms/im) |
|-----------------|--------|-------------------|-------------|------------------------|
| PyTorch | ✅ | 18.4 | 0.7526 | 20.20 |
| TorchScript | ✅ | 36.5 | 0.7416 | 23.42 |
| ONNX | ✅ | 36.3 | 0.7416 | 162.01 |
| OpenVINO | ✅ | 36.4 | 0.7416 | 159.61 |
| TensorRT (FP32) | ✅ | 40.3 | 0.7416 | 13.93 |
| TensorRT (FP16) | ✅ | 21.7 | 0.7416 | 7.47 |
| TensorRT (INT8) | ✅ | 13.6 | 0.3179 | 5.66 |
| TF SavedModel | ✅ | 91.1 | 0.7416 | 316.46 |
| TF GraphDef | ✅ | 36.4 | 0.7416 | 506.71 |
| TF Lite | ✅ | 36.4 | 0.7416 | 842.97 |
| PaddlePaddle | ✅ | 72.5 | 0.7416 | 1172.57 |
| NCNN | ✅ | 36.2 | 0.7419 | 66.00 |
=== "YOLO11m"
| Format | Status | Size on disk (MB) | mAP50-95(B) | Inference time (ms/im) |
|-----------------|--------|-------------------|-------------|------------------------|
| PyTorch | ✅ | 38.8 | 0.7595 | 36.70 |
| TorchScript | ✅ | 77.3 | 0.7643 | 50.95 |
| ONNX | ✅ | 76.9 | 0.7643 | 416.34 |
| OpenVINO | ✅ | 77.1 | 0.7643 | 370.99 |
| TensorRT (FP32) | ✅ | 81.5 | 0.7640 | 30.49 |
| TensorRT (FP16) | ✅ | 42.2 | 0.7658 | 14.93 |
| TensorRT (INT8) | ✅ | 24.3 | 0.4118 | 10.32 |
| TF SavedModel | ✅ | 192.7 | 0.7643 | 597.08 |
| TF GraphDef | ✅ | 77.0 | 0.7643 | 1016.12 |
| TF Lite | ✅ | 77.0 | 0.7643 | 2494.60 |
| PaddlePaddle | ✅ | 153.8 | 0.7643 | 3218.99 |
| NCNN | ✅ | 76.8 | 0.7691 | 192.77 |
=== "YOLO11l"
| Format | Status | Size on disk (MB) | mAP50-95(B) | Inference time (ms/im) |
|-----------------|--------|-------------------|-------------|------------------------|
| PyTorch | ✅ | 49.0 | 0.7475 | 47.6 |
| TorchScript | ✅ | 97.6 | 0.7250 | 66.36 |
| ONNX | ✅ | 97.0 | 0.7250 | 532.58 |
| OpenVINO | ✅ | 97.3 | 0.7250 | 477.55 |
| TensorRT (FP32) | ✅ | 101.6 | 0.7250 | 38.71 |
| TensorRT (FP16) | ✅ | 52.6 | 0.7265 | 19.35 |
| TensorRT (INT8) | ✅ | 31.6 | 0.3856 | 13.50 |
| TF SavedModel | ✅ | 243.3 | 0.7250 | 895.24 |
| TF GraphDef | ✅ | 97.2 | 0.7250 | 1301.19 |
| TF Lite | ✅ | 97.2 | 0.7250 | 3202.93 |
| PaddlePaddle | ✅ | 193.9 | 0.7250 | 4206.98 |
| NCNN | ✅ | 96.9 | 0.7252 | 225.75 |
=== "YOLO11x"
| Format | Status | Size on disk (MB) | mAP50-95(B) | Inference time (ms/im) |
|-----------------|--------|-------------------|-------------|------------------------|
| PyTorch | ✅ | 109.3 | 0.8288 | 85.60 |
| TorchScript | ✅ | 218.1 | 0.8308 | 121.67 |
| ONNX | ✅ | 217.5 | 0.8308 | 1073.14 |
| OpenVINO | ✅ | 217.8 | 0.8308 | 955.60 |
| TensorRT (FP32) | ✅ | 221.6 | 0.8307 | 75.84 |
| TensorRT (FP16) | ✅ | 113.1 | 0.8295 | 35.75 |
| TensorRT (INT8) | ✅ | 62.2 | 0.4783 | 22.23 |
| TF SavedModel | ✅ | 545.0 | 0.8308 | 1497.40 |
| TF GraphDef | ✅ | 217.8 | 0.8308 | 2552.42 |
| TF Lite | ✅ | 217.8 | 0.8308 | 7044.58 |
| PaddlePaddle | ✅ | 434.9 | 0.8308 | 8386.73 |
| NCNN | ✅ | 217.3 | 0.8304 | 486.36 |
[Explore more benchmarking efforts by Seeed Studio](https://www.seeedstudio.com/blog/2023/03/30/yolov8-performance-benchmarks-on-nvidia-jetson-devices) running on different versions of NVIDIA Jetson hardware.
## Reproduce Our Results
To reproduce the above Ultralytics benchmarks on all export [formats](../modes/export.md) run this code:
!!! example
=== "Python"
```python
from ultralytics import YOLO
# Load a YOLO11n PyTorch model
model = YOLO("yolo11n.pt")
# Benchmark YOLO11n speed and accuracy on the COCO8 dataset for all export formats
results = model.benchmarks(data="coco8.yaml", imgsz=640)
```
=== "CLI"
```bash
# Benchmark YOLO11n speed and accuracy on the COCO8 dataset for all export formats
yolo benchmark model=yolo11n.pt data=coco8.yaml imgsz=640
```
Note that benchmarking results might vary based on the exact hardware and software configuration of a system, as well as the current workload of the system at the time the benchmarks are run. For the most reliable results, use a dataset with a large number of images, i.e. `data='coco.yaml'` (5000 val images) rather than `data='coco8.yaml'` (4 val images).
## Best Practices when using NVIDIA Jetson
When using NVIDIA Jetson, there are a couple of best practices to follow in order to enable maximum performance on the NVIDIA Jetson running YOLO11.
1. Enable MAX Power Mode
Enabling MAX Power Mode on the Jetson will make sure all CPU and GPU cores are turned on.
```bash
sudo nvpmodel -m 0
```
2. Enable Jetson Clocks
Enabling Jetson Clocks will make sure all CPU and GPU cores are clocked at their maximum frequency.
```bash
sudo jetson_clocks
```
3. Install Jetson Stats Application
We can use the jetson stats application to monitor the temperatures of the system components and check other system details, such as CPU, GPU, and RAM utilization, power modes, maximum clocks, and JetPack information.
```bash
sudo apt update
sudo pip install jetson-stats
sudo reboot
jtop
```
<img width="1024" src="https://github.com/ultralytics/docs/releases/download/0/jetson-stats-application.avif" alt="Jetson Stats">
## Next Steps
Congratulations on successfully setting up YOLO11 on your NVIDIA Jetson! For further learning and support, explore more guides in the [Ultralytics YOLO11 Docs](../index.md)!
## FAQ
### How do I deploy Ultralytics YOLO11 on NVIDIA Jetson devices?
Deploying Ultralytics YOLO11 on NVIDIA Jetson devices is a straightforward process. First, flash your Jetson device with the NVIDIA JetPack SDK. Then, either use a pre-built Docker image for quick setup or manually install the required packages. Detailed steps for each approach can be found in sections [Quick Start with Docker](#quick-start-with-docker) and [Start with Native Installation](#start-with-native-installation).
### What performance benchmarks can I expect from YOLO11 models on NVIDIA Jetson devices?
YOLO11 models have been benchmarked on various NVIDIA Jetson devices showing significant performance improvements. For example, the TensorRT format delivers the best inference performance. The table in the [Detailed Comparison Table](#detailed-comparison-table) section provides a comprehensive view of performance metrics like mAP50-95 and inference time across different model formats.
### Why should I use TensorRT for deploying YOLO11 on NVIDIA Jetson?
TensorRT is highly recommended for deploying YOLO11 models on NVIDIA Jetson due to its optimal performance. It accelerates inference by leveraging the Jetson's GPU capabilities, ensuring maximum efficiency and speed. Learn more about how to convert to TensorRT and run inference in the [Use TensorRT on NVIDIA Jetson](#use-tensorrt-on-nvidia-jetson) section.
### How can I install PyTorch and Torchvision on NVIDIA Jetson?
To install PyTorch and Torchvision on NVIDIA Jetson, first uninstall any existing versions that may have been installed via pip. Then, manually install the compatible PyTorch and Torchvision versions for the Jetson's ARM64 architecture. Detailed instructions for this process are provided in the [Install PyTorch and Torchvision](#install-pytorch-and-torchvision) section.
### What are the best practices for maximizing performance on NVIDIA Jetson when using YOLO11?
To maximize performance on NVIDIA Jetson with YOLO11, follow these best practices:
1. Enable MAX Power Mode to utilize all CPU and GPU cores.
2. Enable Jetson Clocks to run all cores at their maximum frequency.
3. Install the Jetson Stats application for monitoring system metrics.
For commands and additional details, refer to the [Best Practices when using NVIDIA Jetson](#best-practices-when-using-nvidia-jetson) section.
---
comments: true
description: Learn how to use Ultralytics YOLO11 for real-time object blurring to enhance privacy and focus in your images and videos.
keywords: YOLO11, object blurring, real-time processing, privacy protection, image manipulation, video editing, Ultralytics
---
# Object Blurring using Ultralytics YOLO11 🚀
## What is Object Blurring?
Object blurring with [Ultralytics YOLO11](https://github.com/ultralytics/ultralytics/) involves applying a blurring effect to specific detected objects in an image or video. This can be achieved using the YOLO11 model capabilities to identify and manipulate objects within a given scene.
<p align="center">
<br>
<iframe loading="lazy" width="720" height="405" src="https://www.youtube.com/embed/ydGdibB5Mds"
title="YouTube video player" frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
allowfullscreen>
</iframe>
<br>
<strong>Watch:</strong> Object Blurring using Ultralytics YOLO11
</p>
## Advantages of Object Blurring
- **Privacy Protection**: Object blurring is an effective tool for safeguarding privacy by concealing sensitive or personally identifiable information in images or videos.
- **Selective Focus**: YOLO11 allows for selective blurring, enabling users to target specific objects, ensuring a balance between privacy and retaining relevant visual information.
- **Real-time Processing**: YOLO11's efficiency enables object blurring in real-time, making it suitable for applications requiring on-the-fly privacy enhancements in dynamic environments.
!!! example "Object Blurring using YOLO11 Example"
=== "Object Blurring"
```python
import cv2
from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator, colors
model = YOLO("yolo11n.pt")
names = model.names
cap = cv2.VideoCapture("path/to/video/file.mp4")
assert cap.isOpened(), "Error reading video file"
w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))
# Blur ratio
blur_ratio = 50
# Video writer
video_writer = cv2.VideoWriter("object_blurring_output.avi", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
while cap.isOpened():
success, im0 = cap.read()
if not success:
print("Video frame is empty or video processing has been successfully completed.")
break
results = model.predict(im0, show=False)
boxes = results[0].boxes.xyxy.cpu().tolist()
clss = results[0].boxes.cls.cpu().tolist()
annotator = Annotator(im0, line_width=2, example=names)
if boxes is not None:
for box, cls in zip(boxes, clss):
annotator.box_label(box, color=colors(int(cls), True), label=names[int(cls)])
obj = im0[int(box[1]) : int(box[3]), int(box[0]) : int(box[2])]
blur_obj = cv2.blur(obj, (blur_ratio, blur_ratio))
im0[int(box[1]) : int(box[3]), int(box[0]) : int(box[2])] = blur_obj
cv2.imshow("ultralytics", im0)
video_writer.write(im0)
if cv2.waitKey(1) & 0xFF == ord("q"):
break
cap.release()
video_writer.release()
cv2.destroyAllWindows()
```
### Arguments `model.predict`
{% include "macros/predict-args.md" %}
## FAQ
### What is object blurring with Ultralytics YOLO11?
Object blurring with [Ultralytics YOLO11](https://github.com/ultralytics/ultralytics/) involves automatically detecting and applying a blurring effect to specific objects in images or videos. This technique enhances privacy by concealing sensitive information while retaining relevant visual data. YOLO11's real-time processing capabilities make it suitable for applications requiring immediate privacy protection and selective focus adjustments.
### How can I implement real-time object blurring using YOLO11?
To implement real-time object blurring with YOLO11, follow the provided Python example. This involves using YOLO11 for [object detection](https://www.ultralytics.com/glossary/object-detection) and OpenCV for applying the blur effect. Here's a simplified version:
```python
import cv2

from ultralytics import YOLO

model = YOLO("yolo11n.pt")
cap = cv2.VideoCapture("path/to/video/file.mp4")

while cap.isOpened():
    success, im0 = cap.read()
    if not success:
        break

    # Blur every detected bounding box region
    results = model.predict(im0, show=False)
    for box in results[0].boxes.xyxy.cpu().tolist():
        obj = im0[int(box[1]) : int(box[3]), int(box[0]) : int(box[2])]
        im0[int(box[1]) : int(box[3]), int(box[0]) : int(box[2])] = cv2.blur(obj, (50, 50))

    cv2.imshow("YOLO11 Blurring", im0)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```
### What are the benefits of using Ultralytics YOLO11 for object blurring?
Ultralytics YOLO11 offers several advantages for object blurring:
- **Privacy Protection**: Effectively obscure sensitive or identifiable information.
- **Selective Focus**: Target specific objects for blurring, maintaining essential visual content.
- **Real-time Processing**: Execute object blurring efficiently in dynamic environments, suitable for instant privacy enhancements.
For more detailed applications, check the [advantages of object blurring section](#advantages-of-object-blurring).
### Can I use Ultralytics YOLO11 to blur faces in a video for privacy reasons?
Yes, Ultralytics YOLO11 can be configured to detect and blur faces in videos to protect privacy. By training or using a pre-trained model that specifically recognizes faces, the detection results can be processed with [OpenCV](https://www.ultralytics.com/glossary/opencv) to apply a blur effect. Refer to our guide on [object detection with YOLO11](https://docs.ultralytics.com/tasks/detect/) and modify the code to target face detection.
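As an illustrative sketch, the snippet below shows the general pattern. The `yolov8n-face.pt` weights named here are a placeholder for any community or custom face-detection checkpoint, not an official Ultralytics release:
```python
import cv2

from ultralytics import YOLO

# Placeholder face-detection weights; substitute any YOLO model trained to detect faces
model = YOLO("yolov8n-face.pt")

cap = cv2.VideoCapture("path/to/video/file.mp4")
while cap.isOpened():
    success, im0 = cap.read()
    if not success:
        break

    # Detect faces and blur each bounding box region in place
    for box in model.predict(im0, show=False)[0].boxes.xyxy.cpu().tolist():
        x1, y1, x2, y2 = map(int, box)
        im0[y1:y2, x1:x2] = cv2.blur(im0[y1:y2, x1:x2], (50, 50))

    cv2.imshow("Face Blurring", im0)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```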
### How does YOLO11 compare to other object detection models like Faster R-CNN for object blurring?
Ultralytics YOLO11 typically outperforms models like Faster R-CNN in terms of speed, making it more suitable for real-time applications. While both models offer accurate detection, YOLO11's architecture is optimized for rapid inference, which is critical for tasks like real-time object blurring. Learn more about the technical differences and performance metrics in our [YOLO11 documentation](https://docs.ultralytics.com/models/yolo11/).
---
comments: true
description: Learn to accurately identify and count objects in real-time using Ultralytics YOLO11 for applications like crowd analysis and surveillance.
keywords: object counting, YOLO11, Ultralytics, real-time object detection, AI, deep learning, object tracking, crowd analysis, surveillance, resource optimization
---
# Object Counting using Ultralytics YOLO11
## What is Object Counting?
Object counting with [Ultralytics YOLO11](https://github.com/ultralytics/ultralytics/) involves accurate identification and counting of specific objects in videos and camera streams. YOLO11 excels in real-time applications, providing efficient and precise object counting for various scenarios like crowd analysis and surveillance, thanks to its state-of-the-art algorithms and [deep learning](https://www.ultralytics.com/glossary/deep-learning-dl) capabilities.
<table>
<tr>
<td align="center">
<iframe loading="lazy" width="720" height="405" src="https://www.youtube.com/embed/Ag2e-5_NpS0"
title="YouTube video player" frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
allowfullscreen>
</iframe>
<br>
<strong>Watch:</strong> Object Counting using Ultralytics YOLOv8
</td>
<td align="center">
<iframe loading="lazy" width="720" height="405" src="https://www.youtube.com/embed/Fj9TStNBVoY"
title="YouTube video player" frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
allowfullscreen>
</iframe>
<br>
<strong>Watch:</strong> Class-wise Object Counting using Ultralytics YOLO11
</td>
</tr>
</table>
## Advantages of Object Counting
- **Resource Optimization:** Object counting facilitates efficient resource management by providing accurate counts and optimizing resource allocation in applications like inventory management.
- **Enhanced Security:** Object counting enhances security and surveillance by accurately tracking and counting entities, aiding in proactive threat detection.
- **Informed Decision-Making:** Object counting offers valuable insights for decision-making, optimizing processes in retail, traffic management, and various other domains.
## Real World Applications
| Logistics | Aquaculture |
| :-----------------------------------------------------------------------------------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------------------------------------------------: |
| ![Conveyor Belt Packets Counting Using Ultralytics YOLO11](https://github.com/ultralytics/docs/releases/download/0/conveyor-belt-packets-counting.avif) | ![Fish Counting in Sea using Ultralytics YOLO11](https://github.com/ultralytics/docs/releases/download/0/fish-counting-in-sea-using-ultralytics-yolov8.avif) |
| Conveyor Belt Packets Counting Using Ultralytics YOLO11 | Fish Counting in Sea using Ultralytics YOLO11 |
!!! example "Object Counting using YOLO11 Example"
=== "CLI"
```bash
# Run a counting example
yolo solutions count show=True
# Pass a source video
yolo solutions count source="path/to/video/file.mp4"
# Pass region coordinates
yolo solutions count region=[(20, 400), (1080, 400), (1080, 360), (20, 360)]
```
=== "Python"
```python
import cv2

from ultralytics import solutions

cap = cv2.VideoCapture("path/to/video/file.mp4")
assert cap.isOpened(), "Error reading video file"
w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))

# Define region points
# region_points = [(20, 400), (1080, 400)]  # For line counting
region_points = [(20, 400), (1080, 400), (1080, 360), (20, 360)]  # For rectangle region counting
# region_points = [(20, 400), (1080, 400), (1080, 360), (20, 360), (20, 400)]  # For polygon region counting

# Video writer
video_writer = cv2.VideoWriter("object_counting_output.avi", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

# Init Object Counter
counter = solutions.ObjectCounter(
    show=True,  # Display the output
    region=region_points,  # Pass region points
    model="yolo11n.pt",  # model="yolo11n-obb.pt" for object counting with the YOLO11 OBB model
    # classes=[0, 2],  # Count specific classes only, e.g., person and car with a COCO-pretrained model
    # show_in=True,  # Display in counts
    # show_out=True,  # Display out counts
    # line_width=2,  # Adjust the line width for bounding boxes and text display
)

# Process video
while cap.isOpened():
    success, im0 = cap.read()
    if not success:
        print("Video frame is empty or video processing has been successfully completed.")
        break
    im0 = counter.count(im0)
    video_writer.write(im0)

cap.release()
video_writer.release()
cv2.destroyAllWindows()
```
### Arguments `ObjectCounter`
Here's a table with the `ObjectCounter` arguments:
| Name | Type | Default | Description |
| ------------ | ------ | -------------------------- | ---------------------------------------------------------------------- |
| `model` | `str` | `None` | Path to the Ultralytics YOLO model file. |
| `region` | `list` | `[(20, 400), (1260, 400)]` | List of points defining the counting region. |
| `line_width` | `int` | `2` | Line thickness for bounding boxes. |
| `show` | `bool` | `False` | Flag to control whether to display the video stream. |
| `show_in` | `bool` | `True` | Flag to control whether to display the in counts on the video stream. |
| `show_out` | `bool` | `True` | Flag to control whether to display the out counts on the video stream. |
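For instance, a counter configured to display only the entering counts with thicker annotation lines might look like this minimal sketch using the arguments above:
```python
from ultralytics import solutions

# Show only "in" counts and draw thicker boxes and labels
counter = solutions.ObjectCounter(
    model="yolo11n.pt",
    region=[(20, 400), (1080, 400), (1080, 360), (20, 360)],
    show=True,
    show_in=True,
    show_out=False,
    line_width=3,
)
```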
### Arguments `model.track`
{% include "macros/track-args.md" %}
## FAQ
### How do I count objects in a video using Ultralytics YOLO11?
To count objects in a video using Ultralytics YOLO11, you can follow these steps:
1. Import the necessary libraries (`cv2`, `ultralytics`).
2. Define the counting region (e.g., a polygon, line, etc.).
3. Set up the video capture and initialize the object counter.
4. Process each frame to track objects and count them within the defined region.
Here's a simple example for counting in a region:
```python
import cv2

from ultralytics import solutions


def count_objects_in_region(video_path, output_video_path, model_path):
    """Count objects in a specific region within a video."""
    cap = cv2.VideoCapture(video_path)
    assert cap.isOpened(), "Error reading video file"
    w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))
    video_writer = cv2.VideoWriter(output_video_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

    region_points = [(20, 400), (1080, 400), (1080, 360), (20, 360)]
    counter = solutions.ObjectCounter(show=True, region=region_points, model=model_path)

    while cap.isOpened():
        success, im0 = cap.read()
        if not success:
            print("Video frame is empty or video processing has been successfully completed.")
            break
        im0 = counter.count(im0)
        video_writer.write(im0)

    cap.release()
    video_writer.release()
    cv2.destroyAllWindows()


count_objects_in_region("path/to/video.mp4", "output_video.avi", "yolo11n.pt")
```
Explore more configurations and options in the [Object Counting](#object-counting-using-ultralytics-yolo11) section.
### What are the advantages of using Ultralytics YOLO11 for object counting?
Using Ultralytics YOLO11 for object counting offers several advantages:
1. **Resource Optimization:** It facilitates efficient resource management by providing accurate counts, helping optimize resource allocation in industries like inventory management.
2. **Enhanced Security:** It enhances security and surveillance by accurately tracking and counting entities, aiding in proactive threat detection.
3. **Informed Decision-Making:** It offers valuable insights for decision-making, optimizing processes in domains like retail, traffic management, and more.
For real-world applications and code examples, visit the [Advantages of Object Counting](#advantages-of-object-counting) section.
### How can I count specific classes of objects using Ultralytics YOLO11?
To count specific classes of objects using Ultralytics YOLO11, you need to specify the classes you are interested in during the tracking phase. Below is a Python example:
```python
import cv2

from ultralytics import solutions


def count_specific_classes(video_path, output_video_path, model_path, classes_to_count):
    """Count specific classes of objects in a video."""
    cap = cv2.VideoCapture(video_path)
    assert cap.isOpened(), "Error reading video file"
    w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))
    video_writer = cv2.VideoWriter(output_video_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

    line_points = [(20, 400), (1080, 400)]
    counter = solutions.ObjectCounter(show=True, region=line_points, model=model_path, classes=classes_to_count)

    while cap.isOpened():
        success, im0 = cap.read()
        if not success:
            print("Video frame is empty or video processing has been successfully completed.")
            break
        im0 = counter.count(im0)
        video_writer.write(im0)

    cap.release()
    video_writer.release()
    cv2.destroyAllWindows()


count_specific_classes("path/to/video.mp4", "output_specific_classes.avi", "yolo11n.pt", [0, 2])
```
In this example, `classes_to_count=[0, 2]` restricts counting to objects of classes `0` and `2` (person and car in the COCO dataset).
### Why should I use YOLO11 over other [object detection](https://www.ultralytics.com/glossary/object-detection) models for real-time applications?
Ultralytics YOLO11 provides several advantages over other object detection models like Faster R-CNN, SSD, and previous YOLO versions:
1. **Speed and Efficiency:** YOLO11 offers real-time processing capabilities, making it ideal for applications requiring high-speed inference, such as surveillance and autonomous driving.
2. **[Accuracy](https://www.ultralytics.com/glossary/accuracy):** It provides state-of-the-art accuracy for object detection and tracking tasks, reducing the number of false positives and improving overall system reliability.
3. **Ease of Integration:** YOLO11 offers seamless integration with various platforms and devices, including mobile and edge devices, which is crucial for modern AI applications.
4. **Flexibility:** Supports various tasks like object detection, segmentation, and tracking with configurable models to meet specific use-case requirements.
Check out Ultralytics [YOLO11 Documentation](https://docs.ultralytics.com/models/yolo11/) for a deeper dive into its features and performance comparisons.
### Can I use YOLO11 for advanced applications like crowd analysis and traffic management?
Yes, Ultralytics YOLO11 is well suited to advanced applications like crowd analysis and traffic management, thanks to its real-time detection capabilities, scalability, and integration flexibility. Its advanced features allow for high-accuracy object tracking, counting, and classification in dynamic environments. Example use cases include:
- **Crowd Analysis:** Monitor and manage large gatherings, ensuring safety and optimizing crowd flow.
- **Traffic Management:** Track and count vehicles, analyze traffic patterns, and manage congestion in real-time.
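As a concrete sketch for the traffic management case, the snippet below counts vehicles crossing a virtual line (COCO class indices `2`, `3`, `5`, and `7` correspond to car, motorcycle, bus, and truck; the line coordinates are placeholders to adapt to your camera view):
```python
import cv2

from ultralytics import solutions

cap = cv2.VideoCapture("path/to/traffic/video.mp4")
assert cap.isOpened(), "Error reading video file"

# Count vehicles (COCO: 2=car, 3=motorcycle, 5=bus, 7=truck) crossing a virtual line
counter = solutions.ObjectCounter(
    show=True,
    region=[(20, 400), (1080, 400)],  # placeholder line; adjust to your scene
    model="yolo11n.pt",
    classes=[2, 3, 5, 7],
)

while cap.isOpened():
    success, im0 = cap.read()
    if not success:
        break
    im0 = counter.count(im0)

cap.release()
cv2.destroyAllWindows()
```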
For more information and implementation details, refer to the guide on [Real World Applications](#real-world-applications) of object counting with YOLO11.