Object cropping with [Ultralytics YOLO11](https://github.com/ultralytics/ultralytics/) involves isolating and extracting specific detected objects from an image or video. The YOLO11 model capabilities are utilized to accurately identify and delineate objects, enabling precise cropping for further analysis or manipulation.
<strong>Watch:</strong> Object Cropping using Ultralytics YOLO
## Advantages of Object Cropping?
- **Focused Analysis**: YOLO11 facilitates targeted object cropping, allowing for in-depth examination or processing of individual items within a scene.
- **Reduced Data Volume**: By extracting only relevant objects, object cropping helps in minimizing data size, making it efficient for storage, transmission, or subsequent computational tasks.
- **Enhanced Precision**: YOLO11's [object detection](https://www.ultralytics.com/glossary/object-detection) [accuracy](https://www.ultralytics.com/glossary/accuracy) ensures that the cropped objects maintain their spatial relationships, preserving the integrity of the visual information for detailed analysis.
*Suitcases cropping at an airport conveyor belt using Ultralytics YOLO11*
!!! example "Object Cropping using YOLO11 Example"
=== "Object Cropping"
```python
import os

import cv2

from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator, colors

model = YOLO("yolo11n.pt")
names = model.names

cap = cv2.VideoCapture("path/to/video/file.mp4")
assert cap.isOpened(), "Error reading video file"
w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))

crop_dir_name = "ultralytics_crop"
os.makedirs(crop_dir_name, exist_ok=True)  # directory for cropped objects

idx = 0
while cap.isOpened():
    success, im0 = cap.read()
    if not success:
        break
    results = model.predict(im0, show=False)
    annotator = Annotator(im0, line_width=2, example=names)
    for box, cls in zip(results[0].boxes.xyxy.cpu().tolist(), results[0].boxes.cls.cpu().tolist()):
        idx += 1
        annotator.box_label(box, color=colors(int(cls), True), label=names[int(cls)])
        crop_obj = im0[int(box[1]) : int(box[3]), int(box[0]) : int(box[2])]  # crop the detected object
        cv2.imwrite(os.path.join(crop_dir_name, f"{idx}.png"), crop_obj)

cap.release()
cv2.destroyAllWindows()
```
## FAQ

### What is object cropping in Ultralytics YOLO11 and how does it work?
Object cropping using [Ultralytics YOLO11](https://github.com/ultralytics/ultralytics) involves isolating and extracting specific objects from an image or video based on YOLO11's detection capabilities. This process allows for focused analysis, reduced data volume, and enhanced [precision](https://www.ultralytics.com/glossary/precision) by leveraging YOLO11 to identify objects with high accuracy and crop them accordingly. For an in-depth tutorial, refer to the [object cropping example](#object-cropping-using-ultralytics-yolo11).
### Why should I use Ultralytics YOLO11 for object cropping over other solutions?
Ultralytics YOLO11 stands out due to its precision, speed, and ease of use. It allows detailed and accurate object detection and cropping, essential for [focused analysis](#advantages-of-object-cropping) and applications needing high data integrity. Moreover, YOLO11 integrates seamlessly with tools like OpenVINO and TensorRT for deployments requiring real-time capabilities and optimization on diverse hardware. Explore the benefits in the [guide on model export](../modes/export.md).
### How can I reduce the data volume of my dataset using object cropping?
By using Ultralytics YOLO11 to crop only relevant objects from your images or videos, you can significantly reduce the data size, making it more efficient for storage and processing. This process involves training the model to detect specific objects and then using the results to crop and save these portions only. For more information on exploiting Ultralytics YOLO11's capabilities, visit our [quickstart guide](../quickstart.md).
### Can I use Ultralytics YOLO11 for real-time video analysis and object cropping?
Yes, Ultralytics YOLO11 can process real-time video feeds to detect and crop objects dynamically. The model's high-speed inference capabilities make it ideal for real-time applications such as surveillance, sports analysis, and automated inspection systems. Check out the [tracking and prediction modes](../modes/predict.md) to understand how to implement real-time processing.
### What are the hardware requirements for efficiently running YOLO11 for object cropping?
Ultralytics YOLO11 is optimized for both CPU and GPU environments, but to achieve optimal performance, especially for real-time or high-volume inference, a dedicated GPU (e.g., NVIDIA Tesla, RTX series) is recommended. For deployment on lightweight devices, consider using CoreML for iOS or TFLite for Android. More details on supported devices and formats can be found in our [model deployment options](../guides/model-deployment-options.md).
When deploying [deep learning](https://www.ultralytics.com/glossary/deep-learning-dl) models, particularly those for [object detection](https://www.ultralytics.com/glossary/object-detection) such as Ultralytics YOLO models, achieving optimal performance is crucial. This guide delves into leveraging Intel's OpenVINO toolkit to optimize inference, focusing on latency and throughput. Whether you're working on consumer-grade applications or large-scale deployments, understanding and applying these optimization strategies will ensure your models run efficiently on various devices.
## Optimizing for Latency
Latency optimization is vital for applications requiring immediate response from a single model given a single input, typical in consumer scenarios. The goal is to minimize the delay between input and inference result. However, achieving low latency involves careful consideration, especially when running concurrent inferences or managing multiple models.
### Key Strategies for Latency Optimization:
- **Single Inference per Device:** The simplest way to achieve low latency is to limit execution to one inference at a time per device. Additional concurrency often leads to increased latency.
- **Leveraging Sub-Devices:** Devices like multi-socket CPUs or multi-tile GPUs can execute multiple requests with minimal latency increase by utilizing their internal sub-devices.
- **OpenVINO Performance Hints:** Utilizing OpenVINO's `ov::hint::PerformanceMode::LATENCY` for the `ov::hint::performance_mode` property during model compilation simplifies performance tuning, offering a device-agnostic and future-proof approach (see the sketch below).
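As a hedged illustration of the performance-hint approach, the sketch below exports a YOLO11 model to OpenVINO with Ultralytics and compiles it with the `LATENCY` hint using OpenVINO's Python API; the model path and device name are assumptions for the example.

```python
import openvino as ov

from ultralytics import YOLO

# Export a YOLO11 model to OpenVINO format (creates 'yolo11n_openvino_model/')
YOLO("yolo11n.pt").export(format="openvino")

core = ov.Core()
ov_model = core.read_model("yolo11n_openvino_model/yolo11n.xml")

# Python equivalent of ov::hint::PerformanceMode::LATENCY during compilation
compiled_model = core.compile_model(ov_model, device_name="CPU", config={"PERFORMANCE_HINT": "LATENCY"})
```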
### Managing First-Inference Latency:
- **Model Caching:** To mitigate model load and compile times impacting latency, use model caching where possible. For scenarios where caching isn't viable, CPUs generally offer the fastest model load times.
- **Model Mapping vs. Reading:** To reduce load times, OpenVINO replaced model reading with mapping. However, if the model is on a removable or network drive, consider using `ov::enable_mmap(false)` to switch back to reading.
- **AUTO Device Selection:** This mode begins inference on the CPU, shifting to an accelerator once ready, seamlessly reducing first-inference latency. Caching and AUTO selection are sketched below.
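A minimal sketch of these two ideas, assuming the exported model path from the previous example; the cache directory name is illustrative.

```python
import openvino as ov

core = ov.Core()

# Model caching: subsequent compilations load from the cache directory instead of recompiling
core.set_property({"CACHE_DIR": "./ov_cache"})

# AUTO device selection: starts inference on the CPU and shifts to an accelerator once it is ready
compiled_model = core.compile_model(core.read_model("yolo11n_openvino_model/yolo11n.xml"), device_name="AUTO")
```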
## Optimizing for Throughput
Throughput optimization is crucial for scenarios serving numerous inference requests simultaneously, maximizing resource utilization without significantly sacrificing individual request performance.
### Approaches to Throughput Optimization:
1. **OpenVINO Performance Hints:** A high-level, future-proof method to enhance throughput across devices using performance hints.
2. **Explicit Batching and Streams:** A more granular approach involving explicit batching and the use of streams for advanced performance tuning.
### Designing Throughput-Oriented Applications:
To maximize throughput, applications should:
- Process inputs in parallel, making full use of the device's capabilities.
- Decompose data flow into concurrent inference requests, scheduled for parallel execution.
- Utilize the Async API with callbacks to maintain efficiency and avoid device starvation, as sketched below.
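The following is a hedged sketch of such a throughput-oriented pipeline: the model is compiled with the `THROUGHPUT` hint and fed through OpenVINO's `AsyncInferQueue` with a completion callback. The model path, number of parallel jobs, and dummy input shape are assumptions for illustration.

```python
import numpy as np
import openvino as ov

core = ov.Core()
ov_model = core.read_model("yolo11n_openvino_model/yolo11n.xml")
compiled_model = core.compile_model(ov_model, device_name="CPU", config={"PERFORMANCE_HINT": "THROUGHPUT"})

# Several parallel infer requests keep the device busy without manual scheduling
infer_queue = ov.AsyncInferQueue(compiled_model, jobs=4)


def on_done(request, frame_id):
    """Collect the output of each completed request."""
    output = request.get_output_tensor(0).data
    print(f"frame {frame_id}: output shape {output.shape}")


infer_queue.set_callback(on_done)

# Dummy pre-processed frames; a real pipeline would decode and letterbox images to 640x640
frames = np.random.rand(16, 1, 3, 640, 640).astype(np.float32)
for i, frame in enumerate(frames):
    infer_queue.start_async({0: frame}, userdata=i)
infer_queue.wait_all()
```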
### Multi-Device Execution:
OpenVINO's multi-device mode simplifies scaling throughput by automatically balancing inference requests across devices without requiring application-level device management.
## Conclusion
Optimizing Ultralytics YOLO models for latency and throughput with OpenVINO can significantly enhance your application's performance. By carefully applying the strategies outlined in this guide, developers can ensure their models run efficiently, meeting the demands of various deployment scenarios. Remember, the choice between optimizing for latency or throughput depends on your specific application needs and the characteristics of the deployment environment.
For more detailed technical information and the latest updates, refer to the [OpenVINO documentation](https://docs.openvino.ai/latest/index.html) and [Ultralytics YOLO repository](https://github.com/ultralytics/ultralytics). These resources provide in-depth guides, tutorials, and community support to help you get the most out of your deep learning models.
---
Ensuring your models achieve optimal performance is not just about tweaking configurations; it's about understanding your application's needs and making informed decisions. Whether you're optimizing for real-time responses or maximizing throughput for large-scale processing, the combination of Ultralytics YOLO models and OpenVINO offers a powerful toolkit for developers to deploy high-performance AI solutions.
## FAQ
### How do I optimize Ultralytics YOLO models for low latency using OpenVINO?
Optimizing Ultralytics YOLO models for low latency involves several key strategies:
1. **Single Inference per Device:** Limit inferences to one at a time per device to minimize delays.
2. **Leveraging Sub-Devices:** Utilize devices like multi-socket CPUs or multi-tile GPUs, which can handle multiple requests with minimal latency increase.
3. **OpenVINO Performance Hints:** Use OpenVINO's `ov::hint::PerformanceMode::LATENCY` during model compilation for simplified, device-agnostic tuning.
For more practical tips on optimizing latency, check out the [Latency Optimization section](#optimizing-for-latency) of our guide.
### Why should I use OpenVINO for optimizing Ultralytics YOLO throughput?
OpenVINO enhances Ultralytics YOLO model throughput by maximizing device resource utilization without sacrificing performance. Key benefits include:
- **Performance Hints:** Simple, high-level performance tuning across devices.
- **Explicit Batching and Streams:** Fine-tuning for advanced performance.
Learn more about throughput optimization in the [Throughput Optimization section](#optimizing-for-throughput) of our detailed guide.
### What is the best practice for reducing first-inference latency in OpenVINO?
To reduce first-inference latency, consider these practices:
1. **Model Caching:** Use model caching to decrease load and compile times.
2. **Model Mapping vs. Reading:** Use mapping (`ov::enable_mmap(true)`) by default but switch to reading (`ov::enable_mmap(false)`) if the model is on a removable or network drive.
3. **AUTO Device Selection:** Utilize AUTO mode to start with CPU inference and transition to an accelerator seamlessly.
For detailed strategies on managing first-inference latency, refer to the [Managing First-Inference Latency section](#managing-first-inference-latency).
### How do I balance optimizing for latency and throughput with Ultralytics YOLO and OpenVINO?
Balancing latency and throughput optimization requires understanding your application needs:
- **Latency Optimization:** Best for applications requiring an immediate response to a single input, typical in consumer scenarios.
- **Throughput Optimization:** Best for scenarios with many concurrent inferences, maximizing resource use (e.g., large-scale deployments).
Using OpenVINO's high-level performance hints and multi-device modes can help strike the right balance. Choose the appropriate [OpenVINO Performance hints](https://docs.ultralytics.com/integrations/openvino/#openvino-performance-hints) based on your specific requirements.
### Can I use Ultralytics YOLO models with other AI frameworks besides OpenVINO?
Yes, Ultralytics YOLO models are highly versatile and can be integrated with various AI frameworks. Options include:
- **TensorRT:** For NVIDIA GPU optimization, follow the [TensorRT integration guide](https://docs.ultralytics.com/integrations/tensorrt/).
- **CoreML:** For Apple devices, refer to our [CoreML export instructions](https://docs.ultralytics.com/integrations/coreml/).
- **[TensorFlow](https://www.ultralytics.com/glossary/tensorflow).js:** For web and Node.js apps, see the [TF.js conversion guide](https://docs.ultralytics.com/integrations/tfjs/).
Explore more integrations on the [Ultralytics Integrations page](https://docs.ultralytics.com/integrations/).
Parking management with [Ultralytics YOLO11](https://github.com/ultralytics/ultralytics/) ensures efficient and safe parking by organizing spaces and monitoring availability. YOLO11 can improve parking lot management through real-time vehicle detection and insights into parking occupancy.
*Parking management aerial view and top view using Ultralytics YOLO11*
## Parking Management System Code Workflow
### Selection of Points
!!! tip "Point Selection is now Easy"
Choosing parking points is a critical and complex task in parking management systems. Ultralytics streamlines this process by providing a tool that lets you define parking lot areas, which can be utilized later for additional processing.
- Capture a frame from the video or camera stream where you want to manage the parking lot.
- Use the provided code to launch a graphical interface, where you can select an image and start outlining parking regions by mouse click to create polygons.
!!! warning "Image Size"
A maximum image size of 1920 × 1080 is supported.
!!! example "Parking slots Annotator Ultralytics YOLO11"
=== "Parking Annotator"
```python
from ultralytics import solutions
solutions.ParkingPtsSelection()
```
- After defining the parking areas with polygons, click `save` to store a JSON file with the data in your working directory.
### Arguments `ParkingManagement`

| Name | Type | Default | Description |
| ----------- | ----- | ------- | ----------------------------------------------------------------- |
| `model` | `str` | `None` | Path to the YOLO11 model. |
| `json_file` | `str` | `None` | Path to the JSON file that contains all parking coordinate data. |
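A minimal usage sketch built around these arguments is shown below; it assumes the JSON file saved by the point-selection tool (here `bounding_boxes.json`) and a `process_data` method on `ParkingManagement`, so verify the exact call against the current solutions reference.

```python
import cv2

from ultralytics import solutions

cap = cv2.VideoCapture("path/to/parking_video.mp4")
assert cap.isOpened(), "Error reading video file"

# json_file points to the polygons saved by ParkingPtsSelection
parking_manager = solutions.ParkingManagement(model="yolo11n.pt", json_file="bounding_boxes.json")

while cap.isOpened():
    success, im0 = cap.read()
    if not success:
        break
    im0 = parking_manager.process_data(im0)  # assumed processing call; annotates occupied/available slots

cap.release()
cv2.destroyAllWindows()
```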
### Arguments `model.track`
{% include "macros/track-args.md" %}
## FAQ
### How does Ultralytics YOLO11 enhance parking management systems?
Ultralytics YOLO11 greatly enhances parking management systems by providing **real-time vehicle detection** and monitoring. This results in optimized usage of parking spaces, reduced congestion, and improved safety through continuous surveillance. The [Parking Management System](https://github.com/ultralytics/ultralytics) enables efficient traffic flow, minimizing idle times and emissions in parking lots, thereby contributing to environmental sustainability. For further details, refer to the [parking management code workflow](#python-code-for-parking-management).
### What are the benefits of using Ultralytics YOLO11 for smart parking?
Using Ultralytics YOLO11 for smart parking yields numerous benefits:
- **Efficiency**: Optimizes the use of parking spaces and decreases congestion.
- **Safety and Security**: Enhances surveillance and ensures the safety of vehicles and pedestrians.
- **Environmental Impact**: Helps in reducing emissions by minimizing vehicle idle times. More details on the advantages can be seen [here](#advantages-of-parking-management-system).
### How can I define parking spaces using Ultralytics YOLO11?
Defining parking spaces is straightforward with Ultralytics YOLO11:
1. Capture a frame from a video or camera stream.
2. Use the provided code to launch a GUI for selecting an image and drawing polygons to define parking spaces.
3. Save the labeled data in JSON format for further processing. For comprehensive instructions, check the [selection of points](#selection-of-points) section.
### Can I customize the YOLO11 model for specific parking management needs?
Yes, Ultralytics YOLO11 allows customization for specific parking management needs. You can adjust parameters such as the **occupied and available region colors**, margins for text display, and much more. Utilizing the `ParkingManagement` class's [optional arguments](#optional-arguments-parkingmanagement), you can tailor the model to suit your particular requirements, ensuring maximum efficiency and effectiveness.
### What are some real-world applications of Ultralytics YOLO11 in parking lot management?
Ultralytics YOLO11 is utilized in various real-world applications for parking lot management, including:
- **Parking Space Detection**: Accurately identifying available and occupied spaces.
- **Surveillance**: Enhancing security through real-time monitoring.
- **Traffic Flow Management**: Reducing idle times and congestion with efficient traffic handling. Images showcasing these applications can be found in [real-world applications](#real-world-applications).
---
description: Learn essential data preprocessing techniques for annotated computer vision data, including resizing, normalizing, augmenting, and splitting datasets for optimal model training.
keywords: data preprocessing, computer vision, image resizing, normalization, data augmentation, training dataset, validation dataset, test dataset, YOLO11
---
# Data Preprocessing Techniques for Annotated [Computer Vision](https://www.ultralytics.com/glossary/computer-vision-cv) Data
## Introduction
After you've defined your computer vision [project's goals](./defining-project-goals.md) and [collected and annotated data](./data-collection-and-annotation.md), the next step is to preprocess annotated data and prepare it for model training. Clean and consistent data are vital to creating a model that performs well.
Preprocessing is a step in the [computer vision project workflow](./steps-of-a-cv-project.md) that includes resizing images, normalizing pixel values, augmenting the dataset, and splitting the data into training, validation, and test sets. Let's explore the essential techniques and best practices for cleaning your data!
## Importance of Data Preprocessing
We are already collecting and annotating our data carefully with multiple considerations in mind. Then, what makes data preprocessing so important to a computer vision project? Well, data preprocessing is all about getting your data into a suitable format for training that reduces the computational load and helps improve model performance. Here are some common issues in raw data that preprocessing addresses:
- **Noise**: Irrelevant or random variations in data.
- **Inconsistency**: Variations in image sizes, formats, and quality.
- **Imbalance**: Unequal distribution of classes or categories in the dataset.
## Data Preprocessing Techniques
### Resizing Images
One of the first and foremost steps in data preprocessing is resizing. Some models are designed to handle variable input sizes, but many models require a consistent input size. Resizing images makes them uniform and reduces computational complexity.
You can resize your images using the following methods:
- **Bilinear Interpolation**: Smooths pixel values by taking a weighted average of the four nearest pixel values.
- **Nearest Neighbor**: Assigns the nearest pixel value without averaging, leading to a blocky image but faster computation.
To make resizing a simpler task, you can use the following tools:
- **[OpenCV](https://www.ultralytics.com/glossary/opencv)**: A popular computer vision library with extensive functions for image processing.
- **PIL (Pillow)**: A Python Imaging Library for opening, manipulating, and saving image files.
With respect to YOLO11, the `imgsz` parameter during [model training](../modes/train.md) allows for flexible input sizes. When set to a specific size, such as 640, the model will resize input images so their largest dimension is 640 pixels while maintaining the original aspect ratio.
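As a brief illustration, the sketch below passes `imgsz` to a training call; the dataset and epoch count are placeholders.

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")

# Images are resized so their longest side is 640 pixels, keeping the aspect ratio
model.train(data="coco8.yaml", imgsz=640, epochs=3)
```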
By evaluating your model's and dataset's specific needs, you can determine whether resizing is a necessary preprocessing step or if your model can efficiently handle images of varying sizes.
### Normalizing Pixel Values
Another preprocessing technique is normalization. Normalization scales the pixel values to a standard range, which helps in faster convergence during training and improves model performance. Here are some common normalization techniques:
- **Min-Max Scaling**: Scales pixel values to a range of 0 to 1.
- **Z-Score Normalization**: Scales pixel values based on their mean and standard deviation.
With respect to YOLO11, normalization is seamlessly handled as part of its preprocessing pipeline during model training. YOLO11 automatically performs several preprocessing steps, including conversion to RGB, scaling pixel values to the range [0, 1], and normalization using predefined mean and standard deviation values.
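For reference, a minimal NumPy sketch of the two techniques above, using a randomly generated stand-in image:

```python
import numpy as np

image = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)  # stand-in for a loaded image

# Min-max scaling: map 8-bit pixel values to the range [0, 1]
min_max = image.astype(np.float32) / 255.0

# Z-score normalization: center on the mean and scale by the standard deviation
z_score = (image.astype(np.float32) - image.mean()) / (image.std() + 1e-8)
```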
### Splitting the Dataset
Once you've cleaned the data, you are ready to split the dataset. Splitting the data into training, validation, and test sets is done to ensure that the model can be evaluated on unseen data to assess its generalization performance. A common split is 70% for training, 20% for validation, and 10% for testing. There are various tools and libraries that you can use to split your data like scikit-learn or TensorFlow.
Consider the following when splitting your dataset:
- **Maintaining Data Distribution**: Ensure that the data distribution of classes is maintained across training, validation, and test sets.
- **Avoiding Data Leakage**: Typically, data augmentation is done after the dataset is split. Data augmentation and any other preprocessing should only be applied to the training set to prevent information from the validation or test sets from influencing the model training.
- **Balancing Classes**: For imbalanced datasets, consider techniques such as oversampling the minority class or under-sampling the majority class within the training set.
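A hedged sketch of the 70/20/10 split mentioned above using scikit-learn; the file list is hypothetical, and `stratify` can be added to preserve class distribution when labels are available.

```python
from sklearn.model_selection import train_test_split

image_paths = [f"images/img_{i:04d}.jpg" for i in range(1000)]  # hypothetical file list

# 70% train, then split the remaining 30% into 20% validation and 10% test
train_files, rest = train_test_split(image_paths, test_size=0.3, random_state=42)
val_files, test_files = train_test_split(rest, test_size=1 / 3, random_state=42)

print(len(train_files), len(val_files), len(test_files))  # 700 200 100
```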
### What is Data Augmentation?
The most commonly discussed data preprocessing step is data augmentation. Data augmentation artificially increases the size of the dataset by creating modified versions of images. By augmenting your data, you can reduce overfitting and improve model generalization.
Here are some other benefits of data augmentation:
- **Creates a More Robust Dataset**: Data augmentation can make the model more robust to variations and distortions in the input data. This includes changes in lighting, orientation, and scale.
- **Cost-Effective**: Data augmentation is a cost-effective way to increase the amount of [training data](https://www.ultralytics.com/glossary/training-data) without collecting and labeling new data.
- **Better Use of Data**: Every available data point is used to its maximum potential by creating new variations.
#### Data Augmentation Methods
Common augmentation techniques include flipping, rotation, scaling, and color adjustments. Several libraries, such as Albumentations, Imgaug, and TensorFlow's ImageDataGenerator, can generate these augmentations.
<p align="center">
  <img width="100%" src="https://github.com/ultralytics/docs/releases/download/0/overview-of-data-augmentations.avif" alt="Overview of Data Augmentations">
</p>
With respect to YOLO11, you can [augment your custom dataset](../modes/train.md) by modifying the dataset configuration file, a .yaml file. In this file, you can add an augmentation section with parameters that specify how you want to augment your data.
The [Ultralytics YOLO11 repository](https://github.com/ultralytics/ultralytics/tree/main) supports a wide range of data augmentations. You can apply various transformations such as:
- Random Crops
- Flipping: Images can be flipped horizontally or vertically.
- Rotation: Images can be rotated by specific angles.
- Distortion
Also, you can adjust the intensity of these augmentation techniques through specific parameters to generate more data variety.
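As a hedged example of adjusting augmentation intensity, the sketch below sets a few augmentation hyperparameters directly in a training call; the specific values are illustrative and should be tuned for your dataset.

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")

# Illustrative augmentation hyperparameters
model.train(
    data="coco8.yaml",
    epochs=3,
    fliplr=0.5,  # probability of a horizontal flip
    degrees=10.0,  # random rotation range in degrees
    scale=0.5,  # random scale gain
    hsv_h=0.015,  # hue augmentation strength
)
```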
## A Case Study of Preprocessing
Consider a project aimed at developing a model to detect and classify different types of vehicles in traffic images using YOLO11. We've collected traffic images and annotated them with bounding boxes and labels.
Here's what each step of preprocessing would look like for this project:
- Resizing Images: Since YOLO11 handles flexible input sizes and performs resizing automatically, manual resizing is not required. The model will adjust the image size according to the specified `imgsz` parameter during training.
- Normalizing Pixel Values: YOLO11 automatically normalizes pixel values to a range of 0 to 1 during preprocessing, so manual normalization is not required.
- Splitting the Dataset: Divide the dataset into training (70%), validation (20%), and test (10%) sets using tools like scikit-learn.
- [Data Augmentation](https://www.ultralytics.com/glossary/data-augmentation): Modify the dataset configuration file (.yaml) to include data augmentation techniques such as random crops, horizontal flips, and brightness adjustments.
These steps make sure the dataset is prepared without any potential issues and is ready for Exploratory Data Analysis (EDA).
## Exploratory Data Analysis Techniques
After preprocessing and augmenting your dataset, the next step is to gain insights through Exploratory Data Analysis. EDA uses statistical techniques and visualization tools to understand the patterns and distributions in your data. You can identify issues like class imbalances or outliers and make informed decisions about further data preprocessing or model training adjustments.
### Statistical EDA Techniques
Statistical techniques often begin with calculating basic metrics such as mean, median, standard deviation, and range. These metrics provide a quick overview of your image dataset's properties, such as pixel intensity distributions. Understanding these basic statistics helps you grasp the overall quality and characteristics of your data, allowing you to spot any irregularities early on.
### Visual EDA Techniques
Visualizations are key in EDA for image datasets. For example, class imbalance analysis is a vital aspect of EDA: visualizing the distribution of image classes or categories using bar charts can quickly reveal whether certain classes are underrepresented in your dataset. Similarly, outliers can be identified using visualization tools like box plots, which highlight anomalies in pixel intensity or feature distributions. Outlier detection prevents unusual data points from skewing your results.
Common tools for visualizations include:
- Histograms and Box Plots: Useful for understanding the distribution of pixel values and identifying outliers.
- Scatter Plots: Helpful for exploring relationships between image features or annotations.
- Heatmaps: Effective for visualizing the distribution of pixel intensities or the spatial distribution of annotated features within images.
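As a small, hedged illustration of statistical and visual EDA, the sketch below computes per-image mean intensities and plots their histogram; the image paths are hypothetical.

```python
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image

image_paths = ["images/img_001.jpg", "images/img_002.jpg"]  # hypothetical dataset paths

# Basic statistics: mean grayscale intensity per image
pixel_means = [np.asarray(Image.open(p).convert("L"), dtype=np.float32).mean() for p in image_paths]
print(f"Mean: {np.mean(pixel_means):.1f}, Std: {np.std(pixel_means):.1f}")

# Visual EDA: histogram of per-image mean intensities
plt.hist(pixel_means, bins=20)
plt.xlabel("Mean pixel intensity per image")
plt.ylabel("Number of images")
plt.show()
```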
### Using Ultralytics Explorer for EDA
!!! warning "Community Note ⚠️"
As of **`ultralytics>=8.3.10`**, Ultralytics explorer support has been deprecated. But don't worry! You can now access similar and even enhanced functionality through [Ultralytics HUB](https://hub.ultralytics.com/), our intuitive no-code platform designed to streamline your workflow. With Ultralytics HUB, you can continue exploring, visualizing, and managing your data effortlessly, all without writing a single line of code. Make sure to check it out and take advantage of its powerful features!🚀
For a more advanced approach to EDA, you can use the Ultralytics Explorer tool. It offers robust capabilities for exploring computer vision datasets. By supporting semantic search, SQL queries, and vector similarity search, the tool makes it easy to analyze and understand your data. With Ultralytics Explorer, you can create [embeddings](https://www.ultralytics.com/glossary/embeddings) for your dataset to find similar images, run SQL queries for detailed analysis, and perform semantic searches, all through a user-friendly graphical interface.
<p align="center">
  <img width="100%" src="https://github.com/ultralytics/docs/releases/download/0/ultralytics-explorer-openai-integration.avif" alt="Overview of Ultralytics Explorer">
</p>
## Reach Out and Connect
Having discussions about your project with other computer vision enthusiasts can give you new ideas from different perspectives. Here are some great ways to learn, troubleshoot, and network:
### Channels to Connect with the Community
-**GitHub Issues:** Visit the YOLO11 GitHub repository and use the [Issues tab](https://github.com/ultralytics/ultralytics/issues) to raise questions, report bugs, and suggest features. The community and maintainers are there to help with any issues you face.
-**Ultralytics Discord Server:** Join the [Ultralytics Discord server](https://discord.com/invite/ultralytics) to connect with other users and developers, get support, share knowledge, and brainstorm ideas.
### Official Documentation
-**Ultralytics YOLO11 Documentation:** Refer to the [official YOLO11 documentation](./index.md) for thorough guides and valuable insights on numerous computer vision tasks and projects.
## Your Dataset Is Ready!
Properly resized, normalized, and augmented data improves model performance by reducing noise and improving generalization. By following the preprocessing techniques and best practices outlined in this guide, you can create a solid dataset. With your preprocessed dataset ready, you can confidently proceed to the next steps in your project.
## FAQ
### What is the importance of data preprocessing in computer vision projects?
Data preprocessing is essential in computer vision projects because it ensures that the data is clean, consistent, and in a format that is optimal for model training. By addressing issues such as noise, inconsistency, and imbalance in raw data, preprocessing steps like resizing, normalization, augmentation, and dataset splitting help reduce computational load and improve model performance. For more details, visit the [steps of a computer vision project](../guides/steps-of-a-cv-project.md).
### How can I use Ultralytics YOLO for data augmentation?
For data augmentation with Ultralytics YOLO11, you need to modify the dataset configuration file (.yaml). In this file, you can specify various augmentation techniques such as random crops, horizontal flips, and brightness adjustments. This can be effectively done using the training configurations [explained here](../modes/train.md). Data augmentation helps create a more robust dataset, reduce [overfitting](https://www.ultralytics.com/glossary/overfitting), and improve model generalization.
### What are the best data normalization techniques for computer vision data?
Normalization scales pixel values to a standard range for faster convergence and improved performance during training. Common techniques include:
- **Min-Max Scaling**: Scales pixel values to a range of 0 to 1.
- **Z-Score Normalization**: Scales pixel values based on their mean and standard deviation.
For YOLO11, normalization is handled automatically, including conversion to RGB and pixel value scaling. Learn more about it in the [model training section](../modes/train.md).
### How should I split my annotated dataset for training?
To split your dataset, a common practice is to divide it into 70% for training, 20% for validation, and 10% for testing. It is important to maintain the data distribution of classes across these splits and avoid data leakage by performing augmentation only on the training set. Use tools like scikit-learn or [TensorFlow](https://www.ultralytics.com/glossary/tensorflow) for efficient dataset splitting. See the detailed guide on [dataset preparation](../guides/data-collection-and-annotation.md).
### Can I handle varying image sizes in YOLO11 without manual resizing?
Yes, Ultralytics YOLO11 can handle varying image sizes through the `imgsz` parameter during model training. This parameter ensures that images are resized so their largest dimension matches the specified size (e.g., 640 pixels), while maintaining the aspect ratio. For more flexible input handling and automatic adjustments, check the [model training section](../modes/train.md).
description: Learn how to manage and optimize queues using Ultralytics YOLO11 to reduce wait times and increase efficiency in various real-world applications.
Queue management using [Ultralytics YOLO11](https://github.com/ultralytics/ultralytics/) involves organizing and controlling lines of people or vehicles to reduce wait times and enhance efficiency. It's about optimizing queues to improve customer satisfaction and system performance in various settings like retail, banks, airports, and healthcare facilities.
<strong>Watch:</strong> How to Implement Queue Management with Ultralytics YOLO11 | Airport and Metro Station
## Advantages of Queue Management?
- **Reduced Waiting Times:** Queue management systems efficiently organize queues, minimizing wait times for customers. This leads to improved satisfaction levels as customers spend less time waiting and more time engaging with products or services.
- **Increased Efficiency:** Implementing queue management allows businesses to allocate resources more effectively. By analyzing queue data and optimizing staff deployment, businesses can streamline operations, reduce costs, and improve overall productivity.
*Queue management at an airport ticket counter and queue monitoring in a crowd using Ultralytics YOLO11*
!!! example "Queue Management using YOLO11 Example"
Leveraging Ultralytics [HUB](https://docs.ultralytics.com/hub/) can streamline this process by providing a user-friendly platform for deploying and managing your queue management solution.
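A minimal sketch of a queue-management loop is shown below; the region coordinates are illustrative, and the `process_queue` call is an assumption based on the solutions API, so check the current `QueueManager` reference for the exact method.

```python
import cv2

from ultralytics import solutions

cap = cv2.VideoCapture("path/to/video/file.mp4")
assert cap.isOpened(), "Error reading video file"

# Polygon describing the queue area in the frame (illustrative coordinates)
queue_region = [(20, 400), (1080, 400), (1080, 360), (20, 360)]

queue = solutions.QueueManager(model="yolo11n.pt", region=queue_region, show=True)

while cap.isOpened():
    success, im0 = cap.read()
    if not success:
        break
    im0 = queue.process_queue(im0)  # assumed call; counts objects inside the region and annotates the frame

cap.release()
cv2.destroyAllWindows()
```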
## FAQ

### What are the key advantages of using Ultralytics YOLO11 for queue management?
Using Ultralytics YOLO11 for queue management offers several benefits:
- **Plummeting Waiting Times:** Efficiently organizes queues, reducing customer wait times and boosting satisfaction.
- **Enhancing Efficiency:** Analyzes queue data to optimize staff deployment and operations, thereby reducing costs.
- **Real-time Alerts:** Provides real-time notifications for long queues, enabling quick intervention.
- **Scalability:** Easily scalable across different environments like retail, airports, and healthcare.
For more details, explore our [Queue Management](https://docs.ultralytics.com/reference/solutions/queue_management/) solutions.
### Why should I choose Ultralytics YOLO11 over competitors like [TensorFlow](https://www.ultralytics.com/glossary/tensorflow) or Detectron2 for queue management?
Ultralytics YOLO11 has several advantages over TensorFlow and Detectron2 for queue management:
- **Real-time Performance:** YOLO11 is known for its real-time detection capabilities, offering faster processing speeds.
- **Ease of Use:** Ultralytics provides a user-friendly experience, from training to deployment, via [Ultralytics HUB](https://docs.ultralytics.com/hub/).
- **Pretrained Models:** Access to a range of pretrained models, minimizing the time needed for setup.
- **Community Support:** Extensive documentation and active community support make problem-solving easier.
Learn how to get started with [Ultralytics YOLO](https://docs.ultralytics.com/quickstart/).
### Can Ultralytics YOLO11 handle multiple types of queues, such as in airports and retail?
Yes, Ultralytics YOLO11 can manage various types of queues, including those in airports and retail environments. By configuring the QueueManager with specific regions and settings, YOLO11 can adapt to different queue layouts and densities.
For more information on diverse applications, check out our [Real World Applications](#real-world-applications) section.
### What are some real-world applications of Ultralytics YOLO11 in queue management?
Ultralytics YOLO11 is used in various real-world applications for queue management:
- **Retail:** Monitors checkout lines to reduce wait times and improve customer satisfaction.
- **Airports:** Manages queues at ticket counters and security checkpoints for a smoother passenger experience.
- **Healthcare:** Optimizes patient flow in clinics and hospitals.
- **Banks:** Enhances customer service by managing queues efficiently in banks.
Check our [blog on real-world queue management](https://www.ultralytics.com/blog/revolutionizing-queue-management-with-ultralytics-yolov8-and-openvino) to learn more.
description: Learn how to deploy Ultralytics YOLO11 on Raspberry Pi with our comprehensive guide. Get performance benchmarks, setup instructions, and best practices.
# Quick Start Guide: Raspberry Pi with Ultralytics YOLO11
This comprehensive guide provides a detailed walkthrough for deploying Ultralytics YOLO11 on [Raspberry Pi](https://www.raspberrypi.com/) devices. Additionally, it showcases performance benchmarks to demonstrate the capabilities of YOLO11 on these small and powerful devices.
<strong>Watch:</strong> Raspberry Pi 5 updates and improvements.
!!! note
This guide has been tested with Raspberry Pi 4 and Raspberry Pi 5 running the latest [Raspberry Pi OS Bookworm (Debian 12)](https://www.raspberrypi.com/software/operating-systems/). Using this guide for older Raspberry Pi devices such as the Raspberry Pi 3 is expected to work as long as the same Raspberry Pi OS Bookworm is installed.
## What is Raspberry Pi?
Raspberry Pi is a small, affordable, single-board computer. It has become popular for a wide range of projects and applications, from hobbyist home automation to industrial uses. Raspberry Pi boards are capable of running a variety of operating systems, and they offer GPIO (General Purpose Input/Output) pins that allow for easy integration with sensors, actuators, and other hardware components. They come in different models with varying specifications, but they all share the same basic design philosophy of being low-cost, compact, and versatile.
## Raspberry Pi Series Comparison
| | Raspberry Pi 3 | Raspberry Pi 4 | Raspberry Pi 5 |
| -------------- | -------------- | -------------- | ------------------ |
| Max Power Draw | 2.5A@5V | 3A@5V | 5A@5V (PD enabled) |
## What is Raspberry Pi OS?
[Raspberry Pi OS](https://www.raspberrypi.com/software/) (formerly known as Raspbian) is a Unix-like operating system based on the Debian GNU/Linux distribution for the Raspberry Pi family of compact single-board computers distributed by the Raspberry Pi Foundation. Raspberry Pi OS is highly optimized for the Raspberry Pi with ARM CPUs and uses a modified LXDE desktop environment with the Openbox stacking window manager. Raspberry Pi OS is under active development, with an emphasis on improving the stability and performance of as many Debian packages as possible on Raspberry Pi.
## Flash Raspberry Pi OS to Raspberry Pi
The first thing to do after getting your hands on a Raspberry Pi is to flash a micro-SD card with Raspberry Pi OS, insert it into the device, and boot into the OS. Follow along with the detailed [Getting Started Documentation by Raspberry Pi](https://www.raspberrypi.com/documentation/computers/getting-started.html) to prepare your device for first use.
## Set Up Ultralytics
There are two ways of setting up the Ultralytics package on Raspberry Pi to build your next [Computer Vision](https://www.ultralytics.com/glossary/computer-vision-cv) project. You can use either of them.
- [Start with Docker](#start-with-docker)
- [Start without Docker](#start-without-docker)
### Start with Docker
The fastest way to get started with Ultralytics YOLO11 on Raspberry Pi is to run it with a pre-built Docker image for Raspberry Pi.
Execute the command below to pull the Docker image and run it on the Raspberry Pi. It is based on the [arm64v8/debian](https://hub.docker.com/r/arm64v8/debian) Docker image, which contains Debian 12 (Bookworm) with a Python 3 environment.
```bash
t=ultralytics/ultralytics:latest-arm64 && sudo docker pull $t && sudo docker run -it --ipc=host $t
```
After this is done, skip to [Use NCNN on Raspberry Pi section](#use-ncnn-on-raspberry-pi).
### Start without Docker
#### Install Ultralytics Package
Here we will install the Ultralytics package on the Raspberry Pi with optional dependencies so that we can export the [PyTorch](https://www.ultralytics.com/glossary/pytorch) models to other formats.
1. Update the package list, install pip, and upgrade it to the latest version
```bash
sudo apt update
sudo apt install python3-pip -y
pip install -U pip
```
2. Install `ultralytics` pip package with optional dependencies
```bash
pip install ultralytics[export]
```
3. Reboot the device
```bash
sudo reboot
```
## Use NCNN on Raspberry Pi
Out of all the model export formats supported by Ultralytics, [NCNN](https://docs.ultralytics.com/integrations/ncnn/) delivers the best inference performance when working with Raspberry Pi devices because NCNN is highly optimized for mobile/embedded platforms (such as the ARM architecture). Therefore, our recommendation is to use NCNN with Raspberry Pi.
## Convert Model to NCNN and Run Inference
The YOLO11n model in PyTorch format is converted to NCNN so that inference can be run with the exported model, as sketched below.
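The sketch below shows the conversion and a test inference in Python; the output directory name and sample image URL are assumptions based on the default export behavior.

```python
from ultralytics import YOLO

# Load the YOLO11n PyTorch model
model = YOLO("yolo11n.pt")

# Export to NCNN format (creates a 'yolo11n_ncnn_model' directory)
model.export(format="ncnn")

# Load the exported NCNN model and run a test inference
ncnn_model = YOLO("yolo11n_ncnn_model")
results = ncnn_model("https://ultralytics.com/images/bus.jpg")
```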
For more details about supported export options, visit the [Ultralytics documentation page on deployment options](https://docs.ultralytics.com/guides/model-deployment-options/).
## Raspberry Pi 5 YOLO11 Benchmarks
YOLO11 benchmarks were run by the Ultralytics team on nine different model formats measuring speed and [accuracy](https://www.ultralytics.com/glossary/accuracy): PyTorch, TorchScript, ONNX, OpenVINO, TF SavedModel, TF GraphDef, TF Lite, PaddlePaddle, NCNN. Benchmarks were run on a Raspberry Pi 5 at FP32 [precision](https://www.ultralytics.com/glossary/precision) with default input image size of 640.
### Comparison Chart
We have only included benchmarks for the YOLO11n and YOLO11s models because other model sizes are too large to run on the Raspberry Pi and do not offer decent performance.
<div style="text-align: center;">
  <img width="800" src="https://github.com/ultralytics/docs/releases/download/0/rpi-yolo11-benchmarks.avif" alt="YOLO11 benchmarks on RPi 5">
</div>
### Detailed Comparison Table
The below table represents the benchmark results for two different models (YOLO11n, YOLO11s) across nine different formats (PyTorch, TorchScript, ONNX, OpenVINO, TF SavedModel, TF GraphDef, TF Lite, PaddlePaddle, NCNN), running on a Raspberry Pi 5, giving us the status, size, mAP50-95(B) metric, and inference time for each combination.
!!! tip "Performance"
=== "YOLO11n"
| Format | Status | Size on disk (MB) | mAP50-95(B) | Inference time (ms/im) |
Note that benchmarking results might vary based on the exact hardware and software configuration of a system, as well as the current workload of the system at the time the benchmarks are run. For the most reliable results use a dataset with a large number of images, i.e. `data='coco8.yaml'` (4 val images) or `data='coco.yaml'` (5000 val images).
## Use Raspberry Pi Camera
When using Raspberry Pi for Computer Vision projects, it can be essential to grab real-time video feeds to perform inference. The onboard MIPI CSI connector on the Raspberry Pi allows you to connect official Raspberry Pi camera modules. In this guide, we have used a [Raspberry Pi Camera Module 3](https://www.raspberrypi.com/products/camera-module-3/) to grab the video feeds and perform inference using YOLO11 models.
!!! tip
Learn more about the [different camera modules offered by Raspberry Pi](https://www.raspberrypi.com/documentation/accessories/camera.html) and also [how to get started with the Raspberry Pi camera modules](https://www.raspberrypi.com/documentation/computers/camera_software.html#introducing-the-raspberry-pi-cameras).
!!! note
Raspberry Pi 5 uses smaller CSI connectors than the Raspberry Pi 4 (15-pin vs 22-pin), so you will need a [15-pin to 22-pin adapter cable](https://www.raspberrypi.com/products/camera-cable/) to connect to a Raspberry Pi Camera.
### Test the Camera
Execute the following command after connecting the camera to the Raspberry Pi. You should see a live video feed from the camera for about 5 seconds.
```bash
rpicam-hello
```
!!! tip
Learn more about [`rpicam-hello` usage on official Raspberry Pi documentation](https://www.raspberrypi.com/documentation/computers/camera_software.html#rpicam-hello)
### Inference with Camera
There are two methods of using the Raspberry Pi Camera to run inference with YOLO11 models.
!!! usage
=== "Method 1"
We can use `picamera2`, which comes pre-installed with Raspberry Pi OS, to access the camera and run inference with YOLO11 models, as shown in the sketch below.
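A hedged sketch using the `picamera2` API; the preview resolution and model weights are illustrative.

```python
import cv2
from picamera2 import Picamera2

from ultralytics import YOLO

# Configure and start the camera (resolution is illustrative)
picam2 = Picamera2()
picam2.preview_configuration.main.size = (1280, 720)
picam2.preview_configuration.main.format = "RGB888"
picam2.preview_configuration.align()
picam2.configure("preview")
picam2.start()

model = YOLO("yolo11n.pt")

while True:
    frame = picam2.capture_array()  # grab a frame as a numpy array
    results = model(frame)
    annotated_frame = results[0].plot()
    cv2.imshow("Camera", annotated_frame)
    if cv2.waitKey(1) == ord("q"):
        break

cv2.destroyAllWindows()
```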
We need to initiate a TCP stream with `rpicam-vid` from the connected camera so that we can use this stream URL as an input when we are inferencing later. Execute the following command to start the TCP stream.
Learn more about [`rpicam-vid` usage on official Raspberry Pi documentation](https://www.raspberrypi.com/documentation/computers/camera_software.html#rpicam-vid)
Check our document on [Inference Sources](https://docs.ultralytics.com/modes/predict/#inference-sources) if you want to change the image/ video input type
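Once the TCP stream from `rpicam-vid` is running, a minimal inference sketch looks like this; the stream address and port are assumptions and must match how the stream was started.

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")

# Use the TCP stream started by rpicam-vid as the inference source (address/port assumed)
results = model("tcp://127.0.0.1:8888")
```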
## Best Practices when using Raspberry Pi
There are a couple of best practices to follow in order to enable maximum performance on Raspberry Pis running YOLO11.
1. Use an SSD
When using a Raspberry Pi for 24x7 continuous usage, it is recommended to use an SSD for the system because an SD card cannot withstand continuous writes and may fail. With the onboard PCIe connector on the Raspberry Pi 5, you can now connect SSDs using an adapter such as the [NVMe Base for Raspberry Pi 5](https://shop.pimoroni.com/products/nvme-base).
2. Flash without GUI
When flashing Raspberry Pi OS, you can choose to not install the Desktop environment (Raspberry Pi OS Lite) and this can save a bit of RAM on the device, leaving more space for computer vision processing.
## Next Steps
Congratulations on successfully setting up YOLO on your Raspberry Pi! For further learning and support, visit [Ultralytics YOLO11 Docs](../index.md) and [Kashmir World Foundation](https://www.kashmirworldfoundation.org/).
## Acknowledgements and Citations
This guide was initially created by Daan Eeltink for Kashmir World Foundation, an organization dedicated to the use of YOLO for the conservation of endangered species. We acknowledge their pioneering work and educational focus in the realm of object detection technologies.
For more information about Kashmir World Foundation's activities, you can visit their [website](https://www.kashmirworldfoundation.org/).
## FAQ
### How do I set up Ultralytics YOLO11 on a Raspberry Pi without using Docker?
To set up Ultralytics YOLO11 on a Raspberry Pi without Docker, follow these steps:
1. Update the package list and install `pip`:
```bash
sudo apt update
sudo apt install python3-pip -y
pip install -U pip
```
2. Install the Ultralytics package with optional dependencies:
```bash
pip install ultralytics[export]
```
3. Reboot the device to apply changes:
```bash
sudo reboot
```
For detailed instructions, refer to the [Start without Docker](#start-without-docker) section.
### Why should I use Ultralytics YOLO11's NCNN format on Raspberry Pi for AI tasks?
Ultralytics YOLO11's NCNN format is highly optimized for mobile and embedded platforms, making it ideal for running AI tasks on Raspberry Pi devices. NCNN maximizes inference performance by leveraging ARM architecture, providing faster and more efficient processing compared to other formats. For more details on supported export options, visit the [Ultralytics documentation page on deployment options](../modes/export.md).
### How can I convert a YOLO11 model to NCNN format for use on Raspberry Pi?
You can convert a PyTorch YOLO11 model to NCNN format using either Python or CLI commands.
For more details, see the [Use NCNN on Raspberry Pi](#use-ncnn-on-raspberry-pi) section.
### What are the hardware differences between Raspberry Pi 4 and Raspberry Pi 5 relevant to running YOLO11?
Key differences include:
- **CPU**: Raspberry Pi 4 uses Broadcom BCM2711, Cortex-A72 64-bit SoC, while Raspberry Pi 5 uses Broadcom BCM2712, Cortex-A76 64-bit SoC.
- **Max CPU Frequency**: Raspberry Pi 4 has a max frequency of 1.8GHz, whereas Raspberry Pi 5 reaches 2.4GHz.
- **Memory**: Raspberry Pi 4 offers up to 8GB of LPDDR4-3200 SDRAM, while Raspberry Pi 5 features LPDDR4X-4267 SDRAM, available in 4GB and 8GB variants.
These enhancements contribute to better performance benchmarks for YOLO11 models on Raspberry Pi 5 compared to Raspberry Pi 4. Refer to the [Raspberry Pi Series Comparison](#raspberry-pi-series-comparison) table for more details.
### How can I set up a Raspberry Pi Camera Module to work with Ultralytics YOLO11?
There are two methods to set up a Raspberry Pi Camera for YOLO11 inference; see the [Inference with Camera](#inference-with-camera) section for details on both.
# Object Counting in Different Regions using Ultralytics YOLO 🚀
## What is Object Counting in Regions?
[Object counting](../guides/object-counting.md) in regions with [Ultralytics YOLOv8](https://github.com/ultralytics/ultralytics/) involves precisely determining the number of objects within specified areas using advanced [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv). This approach is valuable for optimizing processes, enhancing security, and improving efficiency in various applications.
<strong>Watch:</strong> Object Counting in Different Regions using Ultralytics YOLO11 | Ultralytics Solutions 🚀
## Advantages of Object Counting in Regions?
- **[Precision](https://www.ultralytics.com/glossary/precision) and Accuracy:** Object counting in regions with advanced computer vision ensures precise and accurate counts, minimizing errors often associated with manual counting.
- **Efficiency Improvement:** Automated object counting enhances operational efficiency, providing real-time results and streamlining processes across different applications.
- **Versatility and Application:** The versatility of object counting in regions makes it applicable across various domains, from manufacturing and surveillance to traffic monitoring, contributing to its widespread utility and effectiveness.
*People counting and crowd counting in different regions using Ultralytics YOLOv8*
!!! example "Region Counting Example"
=== "Python"
```python
import cv2
from ultralytics import solutions
cap = cv2.VideoCapture("Path/to/video/file.mp4")
assert cap.isOpened(), "Error reading video file"
w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))
# Define region points
region_points = [(20, 400), (1080, 400), (1080, 360), (20, 360)]  # pass region as a list of points

# Video writer for the annotated output
video_writer = cv2.VideoWriter("region_counting.avi", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

# Initialize the region counter
region = solutions.RegionCounter(show=True, region=region_points, model="yolo11n.pt")

# Process the video frame by frame
while cap.isOpened():
    success, im0 = cap.read()
    if not success:
        print("Video frame is empty or video processing has been successfully completed.")
        break
    im0 = region.count(im0)
    video_writer.write(im0)
cap.release()
video_writer.release()
cv2.destroyAllWindows()
```
!!! tip "Ultralytics Example Code"
The Ultralytics region counting module is available in our [examples section](https://github.com/ultralytics/ultralytics/blob/main/examples/YOLOv8-Region-Counter/yolov8_region_counter.py). You can explore this example for code customization and modify it to suit your specific use case.
### Argument `RegionCounter`
Here's a table with the `RegionCounter` arguments:
| Name | Type | Default | Description |
| ------------ | ------ | -------------------------- | ----------------------------------------------------- |
| `model` | `str` | `None` | Path to the Ultralytics YOLO model file. |
| `region` | `list` | `[(20, 400), (1260, 400)]` | List of points defining the counting region. |
| `line_width` | `int` | `2` | Line thickness for bounding boxes. |
| `show` | `bool` | `False` | Flag to control whether to display the video stream. |
## FAQ
### What is object counting in specified regions using Ultralytics YOLOv8?
Object counting in specified regions with [Ultralytics YOLOv8](https://github.com/ultralytics/ultralytics) involves detecting and tallying the number of objects within defined areas using advanced computer vision. This precise method enhances efficiency and [accuracy](https://www.ultralytics.com/glossary/accuracy) across various applications like manufacturing, surveillance, and traffic monitoring.
### How do I run the object counting script with Ultralytics YOLOv8?
Follow these steps to run object counting in Ultralytics YOLOv8:
1. Clone the Ultralytics repository and navigate to the directory:
For more options, visit the [Run Region Counting](https://github.com/ultralytics/ultralytics/blob/main/examples/YOLOv8-Region-Counter/readme.md) section.
### Why should I use Ultralytics YOLOv8 for object counting in regions?
Using Ultralytics YOLOv8 for object counting in regions offers several advantages:
- **Precision and Accuracy:** Minimizes errors often seen in manual counting.
- **Efficiency Improvement:** Provides real-time results and streamlines processes.
- **Versatility and Application:** Applies to various domains, enhancing its utility.
Explore deeper benefits in the [Advantages](#advantages-of-object-counting-in-regions) section.
### Can the defined regions be adjusted during video playback?
Yes, with Ultralytics YOLOv8, regions can be interactively moved during video playback. Simply click and drag with the left mouse button to reposition the region. This feature enhances flexibility for dynamic environments. Learn more in the tip section for [movable regions](https://github.com/ultralytics/ultralytics/blob/33cdaa5782efb2bc2b5ede945771ba647882830d/examples/YOLOv8-Region-Counter/yolov8_region_counter.py#L39).
### What are some real-world applications of object counting in regions?
Object counting with Ultralytics YOLOv8 can be applied to numerous real-world scenarios:
- **Retail:** Counting people for foot traffic analysis.
- **Market Streets:** Crowd density management.
Explore more examples in the [Real World Applications](#real-world-applications) section.
description: Learn to integrate Ultralytics YOLO with your robot running ROS Noetic, utilizing RGB images, depth images, and point clouds for efficient object detection, segmentation, and enhanced robotic perception.
<palign="center"><ahref="https://vimeo.com/639236696">ROS Introduction (captioned)</a> from <ahref="https://vimeo.com/osrfoundation">Open Robotics</a> on <ahref="https://vimeo.com/">Vimeo</a>.</p>
## What is ROS?
The [Robot Operating System (ROS)](https://www.ros.org/) is an open-source framework widely used in robotics research and industry. ROS provides a collection of [libraries and tools](https://www.ros.org/blog/ecosystem/) to help developers create robot applications. ROS is designed to work with various [robotic platforms](https://robots.ros.org/), making it a flexible and powerful tool for roboticists.
### Key Features of ROS
1. **Modular Architecture**: ROS has a modular architecture, allowing developers to build complex systems by combining smaller, reusable components called [nodes](https://wiki.ros.org/ROS/Tutorials/UnderstandingNodes). Each node typically performs a specific function, and nodes communicate with each other using messages over [topics](https://wiki.ros.org/ROS/Tutorials/UnderstandingTopics) or [services](https://wiki.ros.org/ROS/Tutorials/UnderstandingServicesParams).
2. **Communication Middleware**: ROS offers a robust communication infrastructure that supports inter-process communication and distributed computing. This is achieved through a publish-subscribe model for data streams (topics) and a request-reply model for service calls.
3. **Hardware Abstraction**: ROS provides a layer of abstraction over the hardware, enabling developers to write device-agnostic code. This allows the same code to be used with different hardware setups, facilitating easier integration and experimentation.
4. **Tools and Utilities**: ROS comes with a rich set of tools and utilities for visualization, debugging, and simulation. For instance, RViz is used for visualizing sensor data and robot state information, while Gazebo provides a powerful simulation environment for testing algorithms and robot designs.
5. **Extensive Ecosystem**: The ROS ecosystem is vast and continually growing, with numerous packages available for different robotic applications, including navigation, manipulation, perception, and more. The community actively contributes to the development and maintenance of these packages.
???+ note "Evolution of ROS Versions"
Since its development in 2007, ROS has evolved through [multiple versions](https://wiki.ros.org/Distributions), each introducing new features and improvements to meet the growing needs of the robotics community. The development of ROS can be categorized into two main series: ROS 1 and ROS 2. This guide focuses on the Long Term Support (LTS) version of ROS 1, known as ROS Noetic Ninjemys; the code should also work with earlier versions.
### ROS 1 vs. ROS 2
While ROS 1 provided a solid foundation for robotic development, ROS 2 addresses its shortcomings by offering:
- **Real-time Performance**: Improved support for real-time systems and deterministic behavior.
- **Security**: Enhanced security features for safe and reliable operation in various environments.
- **Scalability**: Better support for multi-robot systems and large-scale deployments.
- **Cross-platform Support**: Expanded compatibility with various operating systems beyond Linux, including Windows and macOS.
- **Flexible Communication**: Use of DDS for more flexible and efficient inter-process communication.
### ROS Messages and Topics
In ROS, communication between nodes is facilitated through [messages](https://wiki.ros.org/Messages) and [topics](https://wiki.ros.org/Topics). A message is a data structure that defines the information exchanged between nodes, while a topic is a named channel over which messages are sent and received. Nodes can publish messages to a topic or subscribe to messages from a topic, enabling them to communicate with each other. This publish-subscribe model allows for asynchronous communication and decoupling between nodes. Each sensor or actuator in a robotic system typically publishes data to a topic, which can then be consumed by other nodes for processing or control. For the purpose of this guide, we will focus on Image, Depth and PointCloud messages and camera topics.
## Setting Up Ultralytics YOLO with ROS
This guide has been tested using [this ROS environment](https://github.com/ambitious-octopus/rosbot_ros/tree/noetic), which is a fork of the [ROSbot ROS repository](https://github.com/husarion/rosbot_ros). This environment includes the Ultralytics YOLO package, a Docker container for easy setup, comprehensive ROS packages, and Gazebo worlds for rapid testing. It is designed to work with the [Husarion ROSbot 2 PRO](https://husarion.com/manuals/rosbot/). The code examples provided will work in any ROS Noetic/Melodic environment, including both simulation and real-world settings.
Apart from the ROS environment, you will need to install the following dependencies:
- **[ROS Numpy package](https://github.com/eric-wieser/ros_numpy)**: This is required for fast conversion between ROS Image messages and numpy arrays.
```bash
pip install ros_numpy
```
- **Ultralytics package**:
```bash
pip install ultralytics
```
## Use Ultralytics with ROS `sensor_msgs/Image`
The `sensor_msgs/Image` [message type](https://docs.ros.org/en/api/sensor_msgs/html/msg/Image.html) is commonly used in ROS for representing image data. It contains fields for encoding, height, width, and pixel data, making it suitable for transmitting images captured by cameras or other sensors. Image messages are widely used in robotic applications for tasks such as visual perception, [object detection](https://www.ultralytics.com/glossary/object-detection), and navigation.
<p align="center">
<img width="100%" src="https://github.com/ultralytics/docs/releases/download/0/detection-segmentation-ros-gazebo.avif" alt="Detection and Segmentation in ROS Gazebo">
</p>
### Image Step-by-Step Usage
The following code snippet demonstrates how to use the Ultralytics YOLO package with ROS. In this example, we subscribe to a camera topic, process the incoming image using YOLO, and publish the detected objects to new topics for [detection](../tasks/detect.md) and [segmentation](../tasks/segment.md).
First, import the necessary libraries and instantiate two models: one for [segmentation](../tasks/segment.md) and one for [detection](../tasks/detect.md). Initialize a ROS node (with the name `ultralytics`) to enable communication with the ROS master. To ensure a stable connection, we include a brief pause, giving the node sufficient time to establish the connection before proceeding.
```python
import time

import rospy

from ultralytics import YOLO

detection_model = YOLO("yolov8m.pt")
segmentation_model = YOLO("yolov8m-seg.pt")
rospy.init_node("ultralytics")
time.sleep(1)
```
Initialize two ROS topics: one for [detection](../tasks/detect.md) and one for [segmentation](../tasks/segment.md). These topics will be used to publish the annotated images, making them accessible for further processing. The communication between nodes is facilitated using `sensor_msgs/Image` messages.
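For example, the two publishers can be created as follows (a minimal sketch: it assumes `rospy` is already imported as above and uses the `sensor_msgs/Image` type with the topic names referenced in the callback below):

```python
from sensor_msgs.msg import Image

det_image_pub = rospy.Publisher("/ultralytics/detection/image", Image, queue_size=5)
seg_image_pub = rospy.Publisher("/ultralytics/segmentation/image", Image, queue_size=5)
```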
Finally, create a subscriber that listens to messages on the `/camera/color/image_raw` topic and calls a callback function for each new message. This callback function receives messages of type `sensor_msgs/Image`, converts them into a numpy array using `ros_numpy`, processes the images with the previously instantiated YOLO models, annotates the images, and then publishes them back to the respective topics: `/ultralytics/detection/image` for detection and `/ultralytics/segmentation/image` for segmentation.
```python
import ros_numpy


def callback(data):
    """Callback function to process image and publish annotated images."""
    array = ros_numpy.numpify(data)  # sensor_msgs/Image -> numpy array
    det_annotated = detection_model(array)[0].plot()  # run detection and draw the results
    det_image_pub.publish(ros_numpy.msgify(Image, det_annotated, encoding="rgb8"))
    seg_annotated = segmentation_model(array)[0].plot()  # run segmentation and draw the results
    seg_image_pub.publish(ros_numpy.msgify(Image, seg_annotated, encoding="rgb8"))


rospy.Subscriber("/camera/color/image_raw", Image, callback)
rospy.spin()
```
Debugging ROS (Robot Operating System) nodes can be challenging due to the system's distributed nature. Several tools can assist with this process:
1. `rostopic echo <TOPIC-NAME>`: This command allows you to view messages published on a specific topic, helping you inspect the data flow.
2. `rostopic list`: Use this command to list all available topics in the ROS system, giving you an overview of the active data streams.
3. `rqt_graph`: This visualization tool displays the communication graph between nodes, providing insights into how nodes are interconnected and how they interact.
4. For more complex visualizations, such as 3D representations, you can use [RViz](https://wiki.ros.org/rviz). RViz (ROS Visualization) is a powerful 3D visualization tool for ROS. It allows you to visualize the state of your robot and its environment in real-time. With RViz, you can view sensor data (e.g. `sensor_msgs/Image`), robot model states, and various other types of information, making it easier to debug and understand the behavior of your robotic system.
### Publish Detected Classes with `std_msgs/String`
Standard ROS messages also include `std_msgs/String` messages. In many applications, it is not necessary to republish the entire annotated image; instead, only the classes present in the robot's view are needed. The following example demonstrates how to use `std_msgs/String` [messages](https://docs.ros.org/en/noetic/api/std_msgs/html/msg/String.html) to republish the detected classes on the `/ultralytics/detection/classes` topic. These messages are more lightweight and provide essential information, making them valuable for various applications.
#### Example Use Case
Consider a warehouse robot equipped with a camera and object [detection model](../tasks/detect.md). Instead of sending large annotated images over the network, the robot can publish a list of detected classes as `std_msgs/String` messages. For instance, when the robot detects objects like "box", "pallet", and "forklift", it publishes these classes to the `/ultralytics/detection/classes` topic. This information can then be used by a central monitoring system to track the inventory in real-time, optimize the robot's path planning to avoid obstacles, or trigger specific actions such as picking up a detected box. This approach reduces the bandwidth required for communication and focuses on transmitting critical data.
### String Step-by-Step Usage
This example demonstrates how to use the Ultralytics YOLO package with ROS. In this example, we subscribe to a camera topic, process the incoming image using YOLO, and publish the detected objects to new topic `/ultralytics/detection/classes` using `std_msgs/String` messages. The `ros_numpy` package is used to convert the ROS Image message to a numpy array for processing with YOLO.
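Below is a minimal sketch of such a node (the camera topic and model file follow the earlier examples; the list of class names is serialized into a single `std_msgs/String` message):

```python
import time

import ros_numpy
import rospy
from sensor_msgs.msg import Image
from std_msgs.msg import String

from ultralytics import YOLO

rospy.init_node("ultralytics")
time.sleep(1)

detection_model = YOLO("yolov8m.pt")
classes_pub = rospy.Publisher("/ultralytics/detection/classes", String, queue_size=5)


def callback(data):
    """Run detection on the incoming image and publish the detected class names."""
    array = ros_numpy.numpify(data)
    det_result = detection_model(array)
    class_ids = det_result[0].boxes.cls.cpu().numpy().astype(int)
    names = [detection_model.names[i] for i in class_ids]
    classes_pub.publish(String(data=str(names)))


rospy.Subscriber("/camera/color/image_raw", Image, callback)
rospy.spin()
```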
## Use Ultralytics with ROS Depth Images

In addition to RGB images, ROS supports [depth images](https://en.wikipedia.org/wiki/Depth_map), which provide information about the distance of objects from the camera. Depth images are crucial for robotic applications such as obstacle avoidance, 3D mapping, and localization.
A depth image is an image where each pixel represents the distance from the camera to an object. Unlike RGB images that capture color, depth images capture spatial information, enabling robots to perceive the 3D structure of their environment.
!!! tip "Obtaining Depth Images"
Depth images can be obtained using various sensors:
1. [Stereo Cameras](https://en.wikipedia.org/wiki/Stereo_camera): Use two cameras to calculate depth based on image disparity.
2. [Time-of-Flight (ToF) Cameras](https://en.wikipedia.org/wiki/Time-of-flight_camera): Measure the time light takes to return from an object.
3. [Structured Light Sensors](https://en.wikipedia.org/wiki/Structured-light_3D_scanner): Project a pattern and measure its deformation on surfaces.
### Using YOLO with Depth Images
In ROS, depth images are represented by the `sensor_msgs/Image` message type, which includes fields for encoding, height, width, and pixel data. The encoding field for depth images often uses a format like "16UC1", indicating a 16-bit unsigned integer per pixel, where each value represents the distance to the object. Depth images are commonly used in conjunction with RGB images to provide a more comprehensive view of the environment.
Using YOLO, it is possible to extract and combine information from both RGB and depth images. For instance, YOLO can detect objects within an RGB image, and this detection can be used to pinpoint corresponding regions in the depth image. This allows for the extraction of precise depth information for detected objects, enhancing the robot's ability to understand its environment in three dimensions.
!!! warning "RGB-D Cameras"
When working with depth images, it is essential to ensure that the RGB and depth images are correctly aligned. RGB-D cameras, such as the [Intel RealSense](https://www.intelrealsense.com/) series, provide synchronized RGB and depth images, making it easier to combine information from both sources. If using separate RGB and depth cameras, it is crucial to calibrate them to ensure accurate alignment.
#### Depth Step-by-Step Usage
In this example, we use YOLO to segment an image and apply the extracted mask to segment the object in the depth image. This allows us to determine the distance of each pixel of the object of interest from the camera's focal center. By obtaining this distance information, we can calculate the distance between the camera and the specific object in the scene. Begin by importing the necessary libraries, creating a ROS node, and instantiating a segmentation model and a ROS topic.
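A sketch of this setup is shown below (the `/ultralytics/detection/distance` topic name is taken from the description that follows, and `std_msgs/String` is assumed as its message type):

```python
import time

import rospy
from std_msgs.msg import String

from ultralytics import YOLO

rospy.init_node("ultralytics")
time.sleep(1)

segmentation_model = YOLO("yolov8m-seg.pt")
classes_pub = rospy.Publisher("/ultralytics/detection/distance", String, queue_size=5)
```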
Next, define a callback function that processes the incoming depth image message. The function waits for the depth image and RGB image messages, converts them into numpy arrays, and applies the segmentation model to the RGB image. It then extracts the segmentation mask for each detected object and calculates the average distance of the object from the camera using the depth image. Most sensors have a maximum distance, known as the clip distance, beyond which values are represented as inf (`np.inf`). Before processing, it is important to filter out these null values and assign them a value of `0`. Finally, it publishes the detected objects along with their average distances to the `/ultralytics/detection/distance` topic.
```python
import numpy as np
import ros_numpy
from sensor_msgs.msg import Image


def callback(data):
    """Callback function to process depth image and RGB image."""
    image = ros_numpy.numpify(rospy.wait_for_message("/camera/color/image_raw", Image))
    depth = ros_numpy.numpify(data)
    depth[~np.isfinite(depth)] = 0  # replace inf (clip-distance) values with 0
    result = segmentation_model(image)
    all_objects = []
    for index, cls in enumerate(result[0].boxes.cls):
        mask = result[0].masks.data.cpu().numpy()[index].astype(bool)  # assumes mask matches camera resolution
        obj = depth[mask]
        avg_distance = obj[obj > 0].mean() if np.any(obj > 0) else 0.0
        all_objects.append(f"{result[0].names[int(cls)]}: {avg_distance:.2f} m")
    classes_pub.publish(String(data=str(all_objects)))


rospy.Subscriber("/camera/depth/image_raw", Image, callback)  # depth topic name assumed
rospy.spin()
```
## Use Ultralytics with ROS `sensor_msgs/PointCloud2`
<p align="center">
<img width="100%" src="https://github.com/ultralytics/docs/releases/download/0/detection-segmentation-ros-gazebo-1.avif" alt="Detection and Segmentation in ROS Gazebo">
</p>
The `sensor_msgs/PointCloud2` [message type](https://docs.ros.org/en/api/sensor_msgs/html/msg/PointCloud2.html) is a data structure used in ROS to represent 3D point cloud data. This message type is integral to robotic applications, enabling tasks such as 3D mapping, object recognition, and localization.
A point cloud is a collection of data points defined within a three-dimensional coordinate system. These data points represent the external surface of an object or a scene, captured via 3D scanning technologies. Each point in the cloud has `X`, `Y`, and `Z` coordinates, which correspond to its position in space, and may also include additional information such as color and intensity.
!!! warning "Reference frame"
When working with `sensor_msgs/PointCloud2`, it's essential to consider the reference frame of the sensor from which the point cloud data was acquired. The point cloud is initially captured in the sensor's reference frame. You can determine this reference frame by listening to the `/tf_static` topic. However, depending on your specific application requirements, you might need to convert the point cloud into another reference frame. This transformation can be achieved using the `tf2_ros` package, which provides tools for managing coordinate frames and transforming data between them.
!!! tip "Obtaining Point clouds"
Point Clouds can be obtained using various sensors:
1. **LIDAR (Light Detection and Ranging)**: Uses laser pulses to measure distances to objects and create high-[precision](https://www.ultralytics.com/glossary/precision) 3D maps.
2. **Depth Cameras**: Capture depth information for each pixel, allowing for 3D reconstruction of the scene.
3. **Stereo Cameras**: Utilize two or more cameras to obtain depth information through triangulation.
4. **Structured Light Scanners**: Project a known pattern onto a surface and measure the deformation to calculate depth.
### Using YOLO with Point Clouds
To integrate YOLO with `sensor_msgs/PointCloud2` type messages, we can employ a method similar to the one used for depth maps. By leveraging the color information embedded in the point cloud, we can extract a 2D image, perform segmentation on this image using YOLO, and then apply the resulting mask to the three-dimensional points to isolate the 3D object of interest.
For handling point clouds, we recommend using Open3D (`pip install open3d`), a user-friendly Python library. Open3D provides robust tools for managing point cloud data structures, visualizing them, and executing complex operations seamlessly. This library can significantly simplify the process and enhance our ability to manipulate and analyze point clouds in conjunction with YOLO-based segmentation.
#### Point Clouds Step-by-Step Usage
Import the necessary libraries and instantiate the YOLO model for segmentation.
```python
import time

import rospy

from ultralytics import YOLO

rospy.init_node("ultralytics")
time.sleep(1)

segmentation_model = YOLO("yolov8m-seg.pt")
```
Create a function `pointcloud2_to_array`, which transforms a `sensor_msgs/PointCloud2` message into two numpy arrays. The `sensor_msgs/PointCloud2` messages contain `n` points based on the `width` and `height` of the acquired image. For instance, a `480 x 640` image will have `307,200` points. Each point includes three spatial coordinates (`xyz`) and the corresponding color in `RGB` format. These can be considered as two separate channels of information.
The function returns the `xyz` coordinates and `RGB` values in the format of the original camera resolution (`width x height`). Most sensors have a maximum distance, known as the clip distance, beyond which values are represented as inf (`np.inf`). Before processing, it is important to filter out these null values and assign them a value of `0`.
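A possible implementation is sketched below, assuming an organized point cloud and the `ros_numpy` point-cloud helpers (`pointcloud2_to_array`, `split_rgb_field`, `get_xyz_points`):

```python
import numpy as np
import ros_numpy
from sensor_msgs.msg import PointCloud2


def pointcloud2_to_array(pointcloud2: PointCloud2) -> tuple:
    """Convert a sensor_msgs/PointCloud2 into (xyz, rgb) arrays of shape (height, width, 3)."""
    pc_array = ros_numpy.point_cloud2.pointcloud2_to_array(pointcloud2)
    split = ros_numpy.point_cloud2.split_rgb_field(pc_array)
    rgb = np.stack([split["r"], split["g"], split["b"]], axis=2)
    xyz = ros_numpy.point_cloud2.get_xyz_points(pc_array, remove_nans=False)
    xyz = np.array(xyz).reshape((pointcloud2.height, pointcloud2.width, 3))
    invalid = ~np.isfinite(xyz).all(axis=2)  # points beyond the clip distance
    xyz[invalid] = [0, 0, 0]
    rgb[invalid] = [0, 0, 0]
    return xyz, rgb
```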
Next, subscribe to the `/camera/depth/points` topic to receive the point cloud message and convert the `sensor_msgs/PointCloud2` message into numpy arrays containing the XYZ coordinates and RGB values (using the `pointcloud2_to_array` function). Process the RGB image using the YOLO model to extract segmented objects. For each detected object, extract the segmentation mask and apply it to both the RGB image and the XYZ coordinates to isolate the object in 3D space.
Processing the mask is straightforward since it consists of binary values, with `1` indicating the presence of the object and `0` indicating the absence. To apply the mask, simply multiply the original channels by the mask. This operation effectively isolates the object of interest within the image. Finally, create an Open3D point cloud object and visualize the segmented object in 3D space with associated colors.
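A sketch of this pipeline, assuming the `pointcloud2_to_array` helper above, the segmentation model instantiated earlier, and Open3D installed (`pip install open3d`):

```python
import numpy as np
import open3d as o3d
import rospy
from sensor_msgs.msg import PointCloud2

ros_cloud = rospy.wait_for_message("/camera/depth/points", PointCloud2)
xyz, rgb = pointcloud2_to_array(ros_cloud)

result = segmentation_model(rgb)

for index in range(len(result[0].boxes.cls)):
    # Binary mask (1 = object, 0 = background), assumed to match the camera resolution
    mask = result[0].masks.data.cpu().numpy()[index, :, :].astype(int)
    mask_expanded = np.stack([mask, mask, mask], axis=2)

    obj_rgb = rgb * mask_expanded  # isolate the object's colors
    obj_xyz = xyz * mask_expanded  # isolate the object's 3D points

    # Build and display an Open3D point cloud for the segmented object
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(obj_xyz.reshape(-1, 3))
    pcd.colors = o3d.utility.Vector3dVector(obj_rgb.reshape(-1, 3) / 255.0)
    o3d.visualization.draw_geometries([pcd])
```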
<img width="100%" src="https://github.com/ultralytics/docs/releases/download/0/point-cloud-segmentation-ultralytics.avif" alt="Point Cloud Segmentation with Ultralytics">
</p>
## FAQ
### What is the Robot Operating System (ROS)?
The [Robot Operating System (ROS)](https://www.ros.org/) is an open-source framework commonly used in robotics to help developers create robust robot applications. It provides a collection of [libraries and tools](https://www.ros.org/blog/ecosystem/) for building and interfacing with robotic systems, enabling easier development of complex applications. ROS supports communication between nodes using messages over [topics](https://wiki.ros.org/ROS/Tutorials/UnderstandingTopics) or [services](https://wiki.ros.org/ROS/Tutorials/UnderstandingServicesParams).
### How do I integrate Ultralytics YOLO with ROS for real-time object detection?
Integrating Ultralytics YOLO with ROS involves setting up a ROS environment and using YOLO for processing sensor data. Begin by installing the required dependencies like `ros_numpy` and Ultralytics YOLO:
```bash
pip install ros_numpy ultralytics
```
Next, create a ROS node and subscribe to an [image topic](../tasks/detect.md) to process the incoming data. Here is a minimal example:
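A minimal sketch of such a node (topic names follow the earlier ROS examples and should be adapted to your setup):

```python
import ros_numpy
import rospy
from sensor_msgs.msg import Image

from ultralytics import YOLO

rospy.init_node("ultralytics")
detection_model = YOLO("yolov8m.pt")
det_image_pub = rospy.Publisher("/ultralytics/detection/image", Image, queue_size=5)


def callback(data):
    """Annotate each incoming frame and republish it."""
    array = ros_numpy.numpify(data)
    annotated = detection_model(array)[0].plot()
    det_image_pub.publish(ros_numpy.msgify(Image, annotated, encoding="rgb8"))


rospy.Subscriber("/camera/color/image_raw", Image, callback)
rospy.spin()
```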
### What are ROS topics and how are they used in Ultralytics YOLO?
ROS topics facilitate communication between nodes in a ROS network by using a publish-subscribe model. A topic is a named channel that nodes use to send and receive messages asynchronously. In the context of Ultralytics YOLO, you can make a node subscribe to an image topic, process the images using YOLO for tasks like detection or segmentation, and publish outcomes to new topics.
For example, subscribe to a camera topic and process the incoming image for detection:
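For instance, a compact subscriber that logs the detected class names for each frame might look like this (a sketch; adapt the camera topic to your robot):

```python
import ros_numpy
import rospy
from sensor_msgs.msg import Image

from ultralytics import YOLO

rospy.init_node("ultralytics")
model = YOLO("yolov8m.pt")


def callback(data):
    """Run detection on the incoming frame and log the detected class names."""
    result = model(ros_numpy.numpify(data))
    rospy.loginfo([model.names[int(c)] for c in result[0].boxes.cls])


rospy.Subscriber("/camera/color/image_raw", Image, callback)
rospy.spin()
```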
### Why use depth images with Ultralytics YOLO in ROS?
Depth images in ROS, represented by `sensor_msgs/Image`, provide the distance of objects from the camera, crucial for tasks like obstacle avoidance, 3D mapping, and localization. By [using depth information](https://en.wikipedia.org/wiki/Depth_map) along with RGB images, robots can better understand their 3D environment.
With YOLO, you can extract segmentation masks from RGB images and apply these masks to depth images to obtain precise 3D object information, improving the robot's ability to navigate and interact with its surroundings.
### How can I visualize 3D point clouds with YOLO in ROS?
To visualize 3D point clouds in ROS with YOLO:
1. Convert `sensor_msgs/PointCloud2` messages to numpy arrays.
2. Use YOLO to segment RGB images.
3. Apply the segmentation mask to the point cloud.
description: Learn how to implement YOLO11 with SAHI for sliced inference. Optimize memory usage and enhance detection accuracy for large-scale applications.
# Ultralytics Docs: Using YOLO11 with SAHI for Sliced Inference
Welcome to the Ultralytics documentation on how to use YOLO11 with [SAHI](https://github.com/obss/sahi) (Slicing Aided Hyper Inference). This comprehensive guide aims to furnish you with all the essential knowledge you'll need to implement SAHI alongside YOLO11. We'll deep-dive into what SAHI is, why sliced inference is critical for large-scale applications, and how to integrate these functionalities with YOLO11 for enhanced [object detection](https://www.ultralytics.com/glossary/object-detection) performance.
SAHI (Slicing Aided Hyper Inference) is an innovative library designed to optimize object detection algorithms for large-scale and high-resolution imagery. Its core functionality lies in partitioning images into manageable slices, running object detection on each slice, and then stitching the results back together. SAHI is compatible with a range of object detection models, including the YOLO series, thereby offering flexibility while ensuring optimized use of computational resources.
<strong>Watch:</strong> Inference with SAHI (Slicing Aided Hyper Inference) using Ultralytics YOLO11
</p>
### Key Features of SAHI
- **Seamless Integration**: SAHI integrates effortlessly with YOLO models, meaning you can start slicing and detecting without a lot of code modification.
- **Resource Efficiency**: By breaking down large images into smaller parts, SAHI optimizes the memory usage, allowing you to run high-quality detection on hardware with limited resources.
- **High [Accuracy](https://www.ultralytics.com/glossary/accuracy)**: SAHI maintains the detection accuracy by employing smart algorithms to merge overlapping detection boxes during the stitching process.
## What is Sliced Inference?
Sliced Inference refers to the practice of subdividing a large or high-resolution image into smaller segments (slices), conducting object detection on these slices, and then recompiling the slices to reconstruct the object locations on the original image. This technique is invaluable in scenarios where computational resources are limited or when working with extremely high-resolution images that could otherwise lead to memory issues.
### Benefits of Sliced Inference
- **Reduced Computational Burden**: Smaller image slices are faster to process, and they consume less memory, enabling smoother operation on lower-end hardware.
- **Preserved Detection Quality**: Since each slice is treated independently, there is no reduction in the quality of object detection, provided the slices are large enough to capture the objects of interest.
- **Enhanced Scalability**: The technique allows for object detection to be more easily scaled across different sizes and resolutions of images, making it ideal for a wide range of applications from satellite imagery to medical diagnostics.
<table border="0">
<tr>
<th>YOLO11 without SAHI</th>
<th>YOLO11 with SAHI</th>
</tr>
<tr>
<td><img src="https://github.com/ultralytics/docs/releases/download/0/yolov8-without-sahi.avif" alt="YOLO11 without SAHI" width="640"></td>
<td><img src="https://github.com/ultralytics/docs/releases/download/0/yolov8-with-sahi.avif" alt="YOLO11 with SAHI" width="640"></td>
</tr>
</table>
## Installation and Preparation
### Installation
To get started, install the latest versions of SAHI and Ultralytics:
```bash
pip install -U ultralytics sahi
```
### Import Modules and Download Resources
Here's how to import the necessary modules and download a YOLO11 model and some test images:
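A sketch of this setup is shown below; the download helper call, demo image URL, and `AutoDetectionModel` arguments are illustrative assumptions, and the `result` object produced by `get_sliced_prediction` is the one used in the conversion snippet that follows:

```python
from pathlib import Path

from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction
from sahi.utils.file import download_from_url

from ultralytics import YOLO

# Download the YOLO11 weights (Ultralytics fetches official weights by name)
YOLO("yolo11n.pt")

# Download a test image into a local folder (URL shown is illustrative)
Path("demo_data").mkdir(exist_ok=True)
download_from_url(
    "https://raw.githubusercontent.com/obss/sahi/main/demo/demo_data/small-vehicles1.jpeg",
    "demo_data/small-vehicles1.jpeg",
)

# Wrap the weights for SAHI and run sliced inference
detection_model = AutoDetectionModel.from_pretrained(
    model_type="yolov8",  # newer SAHI releases may use "ultralytics" here
    model_path="yolo11n.pt",
    confidence_threshold=0.4,
    device="cpu",  # or "cuda:0"
)
result = get_sliced_prediction(
    "demo_data/small-vehicles1.jpeg",
    detection_model,
    slice_height=256,
    slice_width=256,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2,
)
```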
```python
# Convert to COCO annotation, COCO prediction, imantics, and fiftyone formats
result.to_coco_annotations()[:3]
result.to_coco_predictions(image_id=1)[:3]
result.to_imantics_annotations()[:3]
result.to_fiftyone_detections()[:3]
```
## Batch Prediction
For batch prediction on a directory of images:
```python
from sahi.predict import predict

predict(
    model_type="yolov8",
    model_path="path/to/yolo11n.pt",
    model_device="cpu",  # or 'cuda:0'
    model_confidence_threshold=0.4,
    source="path/to/dir",
    slice_height=256,
    slice_width=256,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2,
)
```
That's it! Now you're equipped to use YOLO11 with SAHI for both standard and sliced inference.
## Citations and Acknowledgments
If you use SAHI in your research or development work, please cite the original SAHI paper and acknowledge the authors:
!!! quote ""
=== "BibTeX"
```bibtex
@article{akyon2022sahi,
title={Slicing Aided Hyper Inference and Fine-tuning for Small Object Detection},
author={Akyon, Fatih Cagatay and Altinuc, Sinan Onur and Temizel, Alptekin},
journal={2022 IEEE International Conference on Image Processing (ICIP)},
doi={10.1109/ICIP46576.2022.9897990},
pages={966-970},
year={2022}
}
```
We extend our thanks to the SAHI research group for creating and maintaining this invaluable resource for the [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) community. For more information about SAHI and its creators, visit the [SAHI GitHub repository](https://github.com/obss/sahi).
## FAQ
### How can I integrate YOLO11 with SAHI for sliced inference in object detection?
Integrating Ultralytics YOLO11 with SAHI (Slicing Aided Hyper Inference) for sliced inference optimizes your object detection tasks on high-resolution images by partitioning them into manageable slices. This approach improves memory usage and ensures high detection accuracy. To get started, you need to install the ultralytics and sahi libraries:
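```bash
pip install -U ultralytics sahi
```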
For more detailed instructions, refer to our [Sliced Inference guide](#sliced-inference-with-yolo11).
### Why should I use SAHI with YOLO11 for object detection on large images?
Using SAHI with Ultralytics YOLO11 for object detection on large images offers several benefits:
- **Reduced Computational Burden**: Smaller slices are faster to process and consume less memory, making it feasible to run high-quality detections on hardware with limited resources.
- **Maintained Detection Accuracy**: SAHI uses intelligent algorithms to merge overlapping boxes, preserving the detection quality.
- **Enhanced Scalability**: By scaling object detection tasks across different image sizes and resolutions, SAHI becomes ideal for various applications, such as satellite imagery analysis and medical diagnostics.
Learn more about the [benefits of sliced inference](#benefits-of-sliced-inference) in our documentation.
### Can I visualize prediction results when using YOLO11 with SAHI?
Yes, you can visualize prediction results when using YOLO11 with SAHI. Here's how you can export and visualize the results:
```python
from IPython.display import Image
result.export_visuals(export_dir="demo_data/")
Image("demo_data/prediction_visual.png")
```
This command will save the visualized predictions to the specified directory, and you can then load the image to view it in your notebook or application. For a detailed guide, check out the [Standard Inference section](#visualize-results).
### What features does SAHI offer for improving YOLO11 object detection?
SAHI (Slicing Aided Hyper Inference) offers several features that complement Ultralytics YOLO11 for object detection:
description: Enhance your security with real-time object detection using Ultralytics YOLO11. Reduce false positives and integrate seamlessly with existing systems.
The Security Alarm System Project utilizing Ultralytics YOLO11 integrates advanced [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) capabilities to enhance security measures. YOLO11, developed by Ultralytics, provides real-time [object detection](https://www.ultralytics.com/glossary/object-detection), allowing the system to identify and respond to potential security threats promptly. This project offers several advantages:
- **Real-time Detection:** YOLO11's efficiency enables the Security Alarm System to detect and respond to security incidents in real-time, minimizing response time.
- **[Accuracy](https://www.ultralytics.com/glossary/accuracy):** YOLO11 is known for its accuracy in object detection, reducing false positives and enhancing the reliability of the security alarm system.
- **Integration Capabilities:** The project can be seamlessly integrated with existing security infrastructure, providing an upgraded layer of intelligent surveillance.
<strong>Watch:</strong> Security Alarm System Project with Ultralytics YOLO11 <a href="https://www.ultralytics.com/glossary/object-detection">Object Detection</a>
</p>
### Code
#### Set up the parameters of the message
???+ note
App Password Generation is necessary
- Navigate to [App Password Generator](https://myaccount.google.com/apppasswords), designate an app name such as "security project," and obtain a 16-digit password. Copy this password and paste it into the designated password field as instructed.
```python
password = ""
from_email = ""  # must match the email used to generate the password
to_email = ""  # receiver email address (assumed; set this to where alerts should be sent)


# Only the main __call__ loop of the ObjectDetection class is shown here; its __init__,
# predict, plot_bboxes and display_fps methods, plus the SMTP `server` connection and
# `send_email` helper, are defined in the full example.
class ObjectDetection:
    def __call__(self):
        """Run object detection on video frames from a camera stream, plotting and showing the results."""
        cap = cv2.VideoCapture(self.capture_index)
        assert cap.isOpened()
        cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
        cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)
        frame_count = 0
        while True:
            self.start_time = time()
            ret, im0 = cap.read()
            assert ret
            results = self.predict(im0)
            im0, class_ids = self.plot_bboxes(results, im0)
            if len(class_ids) > 0:  # Only send email if not sent before
                if not self.email_sent:
                    send_email(to_email, from_email, len(class_ids))
                    self.email_sent = True
            else:
                self.email_sent = False
            self.display_fps(im0)
            cv2.imshow("YOLO11 Detection", im0)
            frame_count += 1
            if cv2.waitKey(5) & 0xFF == 27:
                break
        cap.release()
        cv2.destroyAllWindows()
        server.quit()
```
#### Call the Object Detection class and Run the Inference
```python
detector = ObjectDetection(capture_index=0)
detector()
```
That's it! When you execute the code, you'll receive a single notification on your email if any object is detected. The notification is sent immediately, not repeatedly. However, feel free to customize the code to suit your project requirements.
#### Email Received Sample
<img width="256" src="https://github.com/ultralytics/docs/releases/download/0/email-received-sample.avif" alt="Email Received Sample">
## FAQ
### How does Ultralytics YOLO11 improve the accuracy of a security alarm system?
Ultralytics YOLO11 enhances security alarm systems by delivering high-accuracy, real-time object detection. Its advanced algorithms significantly reduce false positives, ensuring that the system only responds to genuine threats. This increased reliability can be seamlessly integrated with existing security infrastructure, upgrading the overall surveillance quality.
### Can I integrate Ultralytics YOLO11 with my existing security infrastructure?
Yes, Ultralytics YOLO11 can be seamlessly integrated with your existing security infrastructure. The system supports various modes and provides flexibility for customization, allowing you to enhance your existing setup with advanced object detection capabilities. For detailed instructions on integrating YOLO11 in your projects, visit the [integration section](https://docs.ultralytics.com/integrations/).
### What are the storage requirements for running Ultralytics YOLO11?
Running Ultralytics YOLO11 on a standard setup typically requires around 5GB of free disk space. This includes space for storing the YOLO11 model and any additional dependencies. For cloud-based solutions, Ultralytics HUB offers efficient project management and dataset handling, which can optimize storage needs. Learn more about the [Pro Plan](../hub/pro.md) for enhanced features including extended storage.
### What makes Ultralytics YOLO11 different from other object detection models like Faster R-CNN or SSD?
Ultralytics YOLO11 provides an edge over models like Faster R-CNN or SSD with its real-time detection capabilities and higher accuracy. Its unique architecture allows it to process images much faster without compromising on [precision](https://www.ultralytics.com/glossary/precision), making it ideal for time-sensitive applications like security alarm systems. For a comprehensive comparison of object detection models, you can explore our [guide](https://docs.ultralytics.com/models/).
### How can I reduce the frequency of false positives in my security system using Ultralytics YOLO11?
To reduce false positives, ensure your Ultralytics YOLO11 model is adequately trained with a diverse and well-annotated dataset. Fine-tuning hyperparameters and regularly updating the model with new data can significantly improve detection accuracy. Detailed [hyperparameter tuning](https://www.ultralytics.com/glossary/hyperparameter-tuning) techniques can be found in our [hyperparameter tuning guide](../guides/hyperparameter-tuning.md).
[Speed estimation](https://www.ultralytics.com/blog/ultralytics-yolov8-for-speed-estimation-in-computer-vision-projects) is the process of calculating the rate of movement of an object within a given context, often employed in [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) applications. Using [Ultralytics YOLO11](https://github.com/ultralytics/ultralytics/), you can now calculate the speed of objects using [object tracking](../modes/track.md) alongside distance and time data, which is crucial for tasks like traffic monitoring and surveillance. The accuracy of speed estimation directly influences the efficiency and reliability of various applications, making it a key component in the advancement of intelligent systems and real-time decision-making processes.
<strong>Watch:</strong> Speed Estimation using Ultralytics YOLO11
</p>
!!! tip "Check Out Our Blog"
For deeper insights into speed estimation, check out our blog post: [Ultralytics YOLO11 for Speed Estimation in Computer Vision Projects](https://www.ultralytics.com/blog/ultralytics-yolov8-for-speed-estimation-in-computer-vision-projects)
## Advantages of Speed Estimation?
- **Efficient Traffic Control:** Accurate speed estimation aids in managing traffic flow, enhancing safety, and reducing congestion on roadways.
- **Precise Autonomous Navigation:** In autonomous systems like self-driving cars, reliable speed estimation ensures safe and accurate vehicle navigation.
- **Enhanced Surveillance Security:** Speed estimation in surveillance analytics helps identify unusual behaviors or potential threats, improving the effectiveness of security measures.
|  |  |
| Speed Estimation on Road using Ultralytics YOLO11 | Speed Estimation on Bridge using Ultralytics YOLO11 |
!!! example "Speed Estimation using YOLO11 Example"
### Arguments `SpeedEstimator`

| Name | Type | Default | Description |
| ------------ | ------ | -------------------------- | --------------------------------------------- |
| `model` | `str` | `None` | Path to Ultralytics YOLO Model File |
| `region` | `list` | `[(20, 400), (1260, 400)]` | List of points defining the counting region. |
| `line_width` | `int` | `2` | Line thickness for bounding boxes. |
| `show` | `bool` | `False` | Flag to control whether to display the video stream. |
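A minimal usage sketch with these arguments is shown below; note that the `solutions.SpeedEstimator` processing call varies slightly between releases, so treat the per-frame call as an assumption:

```python
import cv2

from ultralytics import solutions

cap = cv2.VideoCapture("path/to/video/file.mp4")
assert cap.isOpened(), "Error reading video file"

# Region roughly spanning the road; adjust the points to your scene
speed_estimator = solutions.SpeedEstimator(model="yolo11n.pt", region=[(20, 400), (1260, 400)], show=True)

while cap.isOpened():
    success, frame = cap.read()
    if not success:
        break
    results = speed_estimator(frame)  # in older releases: speed_estimator.estimate_speed(frame)

cap.release()
cv2.destroyAllWindows()
```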
### Arguments `model.track`
{% include "macros/track-args.md" %}
## FAQ
### How do I estimate object speed using Ultralytics YOLO11?
Estimating object speed with Ultralytics YOLO11 involves combining [object detection](https://www.ultralytics.com/glossary/object-detection) and tracking techniques. First, you need to detect objects in each frame using the YOLO11 model. Then, track these objects across frames to calculate their movement over time. Finally, use the distance traveled by the object between frames and the frame rate to estimate its speed.
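Conceptually, the pipeline looks like the sketch below (pixel displacements still need to be calibrated to real-world units, and the model file and video path are placeholders):

```python
import cv2
import numpy as np

from ultralytics import YOLO

model = YOLO("yolo11n.pt")
cap = cv2.VideoCapture("path/to/video/file.mp4")
fps = cap.get(cv2.CAP_PROP_FPS)
prev_centers = {}  # track_id -> last (x, y) center

while cap.isOpened():
    success, frame = cap.read()
    if not success:
        break
    results = model.track(frame, persist=True)  # detect and track objects across frames
    if results[0].boxes.id is None:
        continue
    for box, track_id in zip(results[0].boxes.xywh.cpu().numpy(), results[0].boxes.id.int().tolist()):
        center = box[:2]
        if track_id in prev_centers:
            pixels_per_frame = np.linalg.norm(center - prev_centers[track_id])
            speed = pixels_per_frame * fps  # pixels/second; scale by a pixel-to-meter factor for real speed
        prev_centers[track_id] = center
```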
For more details, refer to our [official blog post](https://www.ultralytics.com/blog/ultralytics-yolov8-for-speed-estimation-in-computer-vision-projects).
### What are the benefits of using Ultralytics YOLO11 for speed estimation in traffic management?
Using Ultralytics YOLO11 for speed estimation offers significant advantages in traffic management:
- **Enhanced Safety**: Accurately estimate vehicle speeds to detect over-speeding and improve road safety.
- **Real-Time Monitoring**: Benefit from YOLO11's real-time object detection capability to monitor traffic flow and congestion effectively.
- **Scalability**: Deploy the model on various hardware setups, from edge devices to servers, ensuring flexible and scalable solutions for large-scale implementations.
For more applications, see [advantages of speed estimation](#advantages-of-speed-estimation).
### Can YOLO11 be integrated with other AI frameworks like [TensorFlow](https://www.ultralytics.com/glossary/tensorflow) or [PyTorch](https://www.ultralytics.com/glossary/pytorch)?
Yes, YOLO11 can be integrated with other AI frameworks like TensorFlow and PyTorch. Ultralytics provides support for exporting YOLO11 models to various formats like ONNX, TensorRT, and CoreML, ensuring smooth interoperability with other ML frameworks.
To export a YOLO11 model to ONNX format:
```bash
yolo export model=yolo11n.pt format=onnx
```
Learn more about exporting models in our [guide on export](../modes/export.md).
### How accurate is the speed estimation using Ultralytics YOLO11?
The [accuracy](https://www.ultralytics.com/glossary/accuracy) of speed estimation using Ultralytics YOLO11 depends on several factors, including the quality of the object tracking, the resolution and frame rate of the video, and environmental variables. While the speed estimator provides reliable estimates, it may not be 100% accurate due to variances in frame processing speed and object occlusion.
**Note**: Always consider margin of error and validate the estimates with ground truth data when possible.
For further accuracy improvement tips, check the [Arguments `SpeedEstimator` section](#arguments-speedestimator).
### Why choose Ultralytics YOLO11 over other object detection models like TensorFlow Object Detection API?
Ultralytics YOLO11 offers several advantages over other object detection models, such as the TensorFlow Object Detection API:
-**Real-Time Performance**: YOLO11 is optimized for real-time detection, providing high speed and accuracy.
-**Ease of Use**: Designed with a user-friendly interface, YOLO11 simplifies model training and deployment.
-**Versatility**: Supports multiple tasks, including object detection, segmentation, and pose estimation.
-**Community and Support**: YOLO11 is backed by an active community and extensive documentation, ensuring developers have the resources they need.
For more information on the benefits of YOLO11, explore our detailed [model page](../models/yolov8.md).
description: Discover essential steps for launching a successful computer vision project, from defining goals to model deployment and maintenance. Boost your AI capabilities now!
keywords: Computer Vision, AI, Object Detection, Image Classification, Instance Segmentation, Data Annotation, Model Training, Model Evaluation, Model Deployment
---
# Understanding the Key Steps in a Computer Vision Project
## Introduction
Computer vision is a subfield of [artificial intelligence](https://www.ultralytics.com/glossary/artificial-intelligence-ai) (AI) that helps computers see and understand the world like humans do. It processes and analyzes images or videos to extract information, recognize patterns, and make decisions based on that data.
<strong>Watch:</strong> How to Do <a href="https://www.ultralytics.com/glossary/computer-vision-cv">Computer Vision</a> Projects | A Step-by-Step Guide
</p>
Computer vision techniques like [object detection](../tasks/detect.md), [image classification](../tasks/classify.md), and [instance segmentation](../tasks/segment.md) can be applied across various industries, from [autonomous driving](https://www.ultralytics.com/solutions/ai-in-self-driving) to [medical imaging](https://www.ultralytics.com/solutions/ai-in-healthcare) to gain valuable insights.
Working on your own computer vision projects is a great way to understand and learn more about computer vision. However, a computer vision project can consist of many steps, and it might seem confusing at first. By the end of this guide, you'll be familiar with the steps involved in a computer vision project. We'll walk through everything from the beginning to the end of a project, explaining why each part is important. Let's get started and make your computer vision project a success!
## An Overview of a Computer Vision Project
Before discussing the details of each step involved in a computer vision project, let's look at the overall process. If you started a computer vision project today, you'd take the following steps:
- Your first priority would be to understand your project's requirements.
- Then, you'd collect and accurately label the images that will help train your model.
- Next, you'd clean your data and apply augmentation techniques to prepare it for model training.
- After model training, you'd thoroughly test and evaluate your model to make sure it performs consistently under different conditions.
- Finally, you'd deploy your model into the real world and update it based on new insights and feedback.
Now that we know what to expect, let's dive right into the steps and get your project moving forward.
## Step 1: Defining Your Project's Goals
The first step in any computer vision project is clearly defining the problem you're trying to solve. Knowing the end goal helps you start to build a solution. This is especially true when it comes to computer vision because your project's objective will directly affect which computer vision task you need to focus on.
Here are some examples of project objectives and the computer vision tasks that can be used to reach these objectives:
- **Objective:** To develop a system that can monitor and manage the flow of different vehicle types on highways, improving traffic management and safety.
    - **Computer Vision Task:** Object detection is ideal for traffic monitoring because it efficiently locates and identifies multiple vehicles. It is less computationally demanding than image segmentation, which provides unnecessary detail for this task, ensuring faster, real-time analysis.
- **Objective:** To develop a tool that assists radiologists by providing precise, pixel-level outlines of tumors in medical imaging scans.
    - **Computer Vision Task:** Image segmentation is suitable for medical imaging because it provides accurate and detailed boundaries of tumors that are crucial for assessing size, shape, and treatment planning.
- **Objective:** To create a digital system that categorizes various documents (e.g., invoices, receipts, legal paperwork) to improve organizational efficiency and document retrieval.
    - **Computer Vision Task:** [Image classification](https://www.ultralytics.com/glossary/image-classification) is ideal here as it handles one document at a time, without needing to consider the document's position in the image. This approach simplifies and accelerates the sorting process.
### Step 1.5: Selecting the Right Model and Training Approach
After understanding the project objective and suitable computer vision tasks, an essential part of defining the project goal is [selecting the right model](../models/index.md) and training approach.
Depending on the objective, you might choose to select the model first or after seeing what data you are able to collect in Step 2. For example, suppose your project is highly dependent on the availability of specific types of data. In that case, it may be more practical to gather and analyze the data first before selecting a model. On the other hand, if you have a clear understanding of the model requirements, you can choose the model first and then collect data that fits those specifications.
Choosing between training from scratch or using [transfer learning](https://www.ultralytics.com/glossary/transfer-learning) affects how you prepare your data. Training from scratch requires a diverse dataset to build the model's understanding from the ground up. Transfer learning, on the other hand, allows you to use a pre-trained model and adapt it with a smaller, more specific dataset. Also, choosing a specific model to train will determine how you need to prepare your data, such as resizing images or adding annotations, according to the model's specific requirements.
<p align="center">
<img width="100%" src="https://github.com/ultralytics/docs/releases/download/0/training-from-scratch-vs-transfer-learning.avif" alt="Training From Scratch Vs. Using Transfer Learning">
</p>
Note: When choosing a model, consider its [deployment](./model-deployment-options.md) to ensure compatibility and performance. For example, lightweight models are ideal for [edge computing](https://www.ultralytics.com/glossary/edge-computing) due to their efficiency on resource-constrained devices. To learn more about the key points related to defining your project, read [our guide](./defining-project-goals.md) on defining your project's goals and selecting the right model.
Before getting into the hands-on work of a computer vision project, it's important to have a clear understanding of these details. Double-check that you've considered the following before moving on to Step 2:
- Clearly define the problem you're trying to solve.
- Determine the end goal of your project.
- Identify the specific computer vision task needed (e.g., object detection, image classification, image segmentation).
- Decide whether to train a model from scratch or use transfer learning.
- Select the appropriate model for your task and deployment needs.
## Step 2: Data Collection and Data Annotation
The quality of your computer vision models depends on the quality of your dataset. You can either collect images from the internet, take your own pictures, or use pre-existing datasets. Here are some great resources for downloading high-quality datasets: [Google Dataset Search Engine](https://datasetsearch.research.google.com/), [UC Irvine Machine Learning Repository](https://archive.ics.uci.edu/), and [Kaggle Datasets](https://www.kaggle.com/datasets).
Some libraries, like Ultralytics, provide [built-in support for various datasets](../datasets/index.md), making it easier to get started with high-quality data. These libraries often include utilities for using popular datasets seamlessly, which can save you a lot of time and effort in the initial stages of your project.
However, if you choose to collect images or take your own pictures, you'll need to annotate your data. Data annotation is the process of labeling your data to impart knowledge to your model. The type of data annotation you'll work with depends on your specific computer vision technique. Here are some examples:
- **Image Classification:** You'll label the entire image as a single class.
- **[Object Detection](https://www.ultralytics.com/glossary/object-detection):** You'll draw bounding boxes around each object in the image and label each box.
- **[Image Segmentation](https://www.ultralytics.com/glossary/image-segmentation):** You'll label each pixel in the image according to the object it belongs to, creating detailed object boundaries.
<p align="center">
<img width="100%" src="https://github.com/ultralytics/docs/releases/download/0/different-types-of-image-annotation.avif" alt="Different Types of Image Annotation">
</p>
[Data collection and annotation](./data-collection-and-annotation.md) can be a time-consuming manual effort. Annotation tools can help make this process easier. Here are some useful open annotation tools: [Label Studio](https://github.com/HumanSignal/label-studio), [CVAT](https://github.com/cvat-ai/cvat), and [Labelme](https://github.com/wkentaro/labelme).
## Step 3: [Data Augmentation](https://www.ultralytics.com/glossary/data-augmentation) and Splitting Your Dataset
After collecting and annotating your image data, it's important to first split your dataset into training, validation, and test sets before performing data augmentation. Splitting your dataset before augmentation is crucial to test and validate your model on original, unaltered data. It helps accurately assess how well the model generalizes to new, unseen data.
Here's how to split your data:
- **Training Set:** It is the largest portion of your data, typically 70-80% of the total, used to train your model.
- **Validation Set:** Usually around 10-15% of your data; this set is used to tune hyperparameters and validate the model during training, helping to prevent [overfitting](https://www.ultralytics.com/glossary/overfitting).
- **Test Set:** The remaining 10-15% of your data is set aside as the test set. It is used to evaluate the model's performance on unseen data after training is complete.
After splitting your data, you can perform data augmentation by applying transformations like rotating, scaling, and flipping images to artificially increase the size of your dataset. Data augmentation makes your model more robust to variations and improves its performance on unseen images.
<p align="center">
<img width="100%" src="https://github.com/ultralytics/docs/releases/download/0/examples-of-data-augmentations.avif" alt="Examples of Data Augmentations">
</p>
Libraries like [OpenCV](https://www.ultralytics.com/glossary/opencv), Albumentations, and [TensorFlow](https://www.ultralytics.com/glossary/tensorflow) offer flexible augmentation functions that you can use. Additionally, some libraries, such as Ultralytics, have [built-in augmentation settings](../modes/train.md) directly within its model training function, simplifying the process.
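For instance, a small Albumentations pipeline might look like this (the transform choices and probabilities are illustrative):

```python
import albumentations as A
import cv2

# Compose a few common augmentations; tune the parameters for your dataset
transform = A.Compose(
    [
        A.HorizontalFlip(p=0.5),
        A.Rotate(limit=15, p=0.5),
        A.RandomBrightnessContrast(p=0.3),
    ]
)

image = cv2.imread("path/to/image.jpg")
augmented = transform(image=image)["image"]  # augmented copy of the original image
```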
To understand your data better, you can use tools like [Matplotlib](https://matplotlib.org/) or [Seaborn](https://seaborn.pydata.org/) to visualize the images and analyze their distribution and characteristics. Visualizing your data helps identify patterns, anomalies, and the effectiveness of your augmentation techniques. You can also use [Ultralytics Explorer](../datasets/explorer/index.md), a tool for exploring computer vision datasets with semantic search, SQL queries, and vector similarity search.
<p align="center">
<img width="100%" src="https://github.com/ultralytics/docs/releases/download/0/explorer-dashboard-screenshot-1.avif" alt="The Ultralytics Explorer Tool">
</p>
By properly [understanding, splitting, and augmenting your data](./preprocessing_annotated_data.md), you can develop a well-trained, validated, and tested model that performs well in real-world applications.
## Step 4: Model Training
Once your dataset is ready for training, you can focus on setting up the necessary environment, managing your datasets, and training your model.
First, you'll need to make sure your environment is configured correctly. Typically, this includes the following:
- Installing essential libraries and frameworks like TensorFlow, [PyTorch](https://www.ultralytics.com/glossary/pytorch), or [Ultralytics](../quickstart.md).
- If you are using a GPU, installing libraries like CUDA and cuDNN will help enable GPU acceleration and speed up the training process.
Then, you can load your training and validation datasets into your environment. Normalize and preprocess the data through resizing, format conversion, or augmentation. With your model selected, configure the layers and specify hyperparameters. Compile the model by setting the [loss function](https://www.ultralytics.com/glossary/loss-function), optimizer, and performance metrics.
Libraries like Ultralytics simplify the training process. You can [start training](../modes/train.md) by feeding data into the model with minimal code. These libraries handle weight adjustments, [backpropagation](https://www.ultralytics.com/glossary/backpropagation), and validation automatically. They also offer tools to monitor progress and adjust hyperparameters easily. After training, save the model and its weights with a few commands.
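For example, with Ultralytics the core of the training step reduces to a few lines (the dataset YAML path and epoch count are placeholders):

```python
from ultralytics import YOLO

# Start from pretrained weights and fine-tune on your dataset
model = YOLO("yolo11n.pt")
results = model.train(data="path/to/dataset.yaml", epochs=100, imgsz=640)
# Best weights are saved automatically (e.g. under runs/detect/train/weights/)
```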
It's important to keep in mind that proper dataset management is vital for efficient training. Use version control for datasets to track changes and ensure reproducibility. Tools like [DVC (Data Version Control)](../integrations/dvc.md) can help manage large datasets.
## Step 5: Model Evaluation and Model [Finetuning](https://www.ultralytics.com/glossary/fine-tuning)
It's important to assess your model's performance using various metrics and refine it to improve [accuracy](https://www.ultralytics.com/glossary/accuracy). [Evaluating](../modes/val.md) helps identify areas where the model excels and where it may need improvement. Fine-tuning ensures the model is optimized for the best possible performance.
- **[Performance Metrics](./yolo-performance-metrics.md):** Use metrics like accuracy, [precision](https://www.ultralytics.com/glossary/precision), [recall](https://www.ultralytics.com/glossary/recall), and F1-score to evaluate your model's performance. These metrics provide insights into how well your model is making predictions.
- **[Hyperparameter Tuning](./hyperparameter-tuning.md):** Adjust hyperparameters to optimize model performance. Techniques like grid search or random search can help find the best hyperparameter values.
- **Fine-Tuning:** Make small adjustments to the model architecture or training process to enhance performance. This might involve tweaking [learning rates](https://www.ultralytics.com/glossary/learning-rate), [batch sizes](https://www.ultralytics.com/glossary/batch-size), or other model parameters.
## Step 6: Model Testing
In this step, you can make sure that your model performs well on completely unseen data, confirming its readiness for deployment. The difference between model testing and model evaluation is that testing focuses on verifying the final model's performance rather than iteratively improving it.
It's important to thoroughly test and debug any common issues that may arise. Test your model on a separate test dataset that was not used during training or validation. This dataset should represent real-world scenarios to ensure the model's performance is consistent and reliable.
Also, address common problems such as overfitting, [underfitting](https://www.ultralytics.com/glossary/underfitting), and data leakage. Use techniques like cross-validation and [anomaly detection](https://www.ultralytics.com/glossary/anomaly-detection) to identify and fix these issues.
## Step 7: Model Deployment

Once your model has been thoroughly tested, it's time to deploy it. Deployment involves making your model available for use in a production environment. Here are the steps to deploy a computer vision model:

- **Setting Up the Environment:** Configure the necessary infrastructure for your chosen deployment option, whether it's cloud-based (AWS, Google Cloud, Azure) or edge-based (local devices, IoT).
- **[Exporting the Model](../modes/export.md):** Export your model to the appropriate format (e.g., ONNX, TensorRT, CoreML for YOLO11) to ensure compatibility with your deployment platform.
- **Deploying the Model:** Deploy the model by setting up APIs or endpoints and integrating it with your application.
- **Ensuring Scalability:** Implement load balancers, auto-scaling groups, and monitoring tools to manage resources and handle increasing data and user requests.
## Step 8: Monitoring, Maintenance, and Documentation
Once your model is deployed, it's important to continuously monitor its performance, maintain it to handle any issues, and document the entire process for future reference and improvements.
Monitoring tools can help you track key performance indicators (KPIs) and detect anomalies or drops in accuracy. By monitoring the model, you can be aware of model drift, where the model's performance declines over time due to changes in the input data. Periodically retrain the model with updated data to maintain accuracy and relevance.
In addition to monitoring and maintenance, documentation is also key. Thoroughly document the entire process, including model architecture, training procedures, hyperparameters, data preprocessing steps, and any changes made during deployment and maintenance. Good documentation ensures reproducibility and makes future updates or troubleshooting easier. By effectively monitoring, maintaining, and documenting your model, you can ensure it remains accurate, reliable, and easy to manage over its lifecycle.
## Engaging with the Community
Connecting with a community of computer vision enthusiasts can help you tackle any issues you face while working on your computer vision project with confidence. Here are some ways to learn, troubleshoot, and network effectively.
### Community Resources
-**GitHub Issues:** Check out the [YOLO11 GitHub repository](https://github.com/ultralytics/ultralytics/issues) and use the Issues tab to ask questions, report bugs, and suggest new features. The active community and maintainers are there to help with specific issues.
-**Ultralytics Discord Server:** Join the [Ultralytics Discord server](https://discord.com/invite/ultralytics) to interact with other users and developers, get support, and share insights.
### Official Documentation
-**Ultralytics YOLO11 Documentation:** Explore the [official YOLO11 documentation](./index.md) for detailed guides with helpful tips on different computer vision tasks and projects.
Using these resources will help you overcome challenges and stay updated with the latest trends and best practices in the computer vision community.
## Kickstart Your Computer Vision Project Today!
Taking on a computer vision project can be exciting and rewarding. By following the steps in this guide, you can build a solid foundation for success. Each step is crucial for developing a solution that meets your objectives and works well in real-world scenarios. As you gain experience, you'll discover advanced techniques and tools to improve your projects. Stay curious, keep learning, and explore new methods and innovations!
## FAQ
### How do I choose the right computer vision task for my project?
Choosing the right computer vision task depends on your project's end goal. For instance, if you want to monitor traffic, **object detection** is suitable as it can locate and identify multiple vehicle types in real-time. For medical imaging, **image segmentation** is ideal for providing detailed boundaries of tumors, aiding in diagnosis and treatment planning. Learn more about specific tasks like [object detection](../tasks/detect.md), [image classification](../tasks/classify.md), and [instance segmentation](../tasks/segment.md).
### Why is data annotation crucial in computer vision projects?
Data annotation is vital for teaching your model to recognize patterns. The type of annotation varies with the task:
- **Image Classification**: Entire image labeled as a single class.
- **Object Detection**: Bounding boxes drawn around objects.
- **Image Segmentation**: Each pixel labeled according to the object it belongs to.
Tools like [Label Studio](https://github.com/HumanSignal/label-studio), [CVAT](https://github.com/cvat-ai/cvat), and [Labelme](https://github.com/wkentaro/labelme) can assist in this process. For more details, refer to our [data collection and annotation guide](./data-collection-and-annotation.md).
### What steps should I follow to augment and split my dataset effectively?
Splitting your dataset before augmentation helps validate model performance on original, unaltered data. Follow these steps:
- **Training Set**: 70-80% of your data.
- **Validation Set**: 10-15% for [hyperparameter tuning](https://www.ultralytics.com/glossary/hyperparameter-tuning).
- **Test Set**: Remaining 10-15% for final evaluation.
After splitting, apply data augmentation techniques like rotation, scaling, and flipping to increase dataset diversity. Libraries such as Albumentations and OpenCV can help. Ultralytics also offers [built-in augmentation settings](../modes/train.md) for convenience.
### How can I export my trained computer vision model for deployment?
Exporting your model ensures compatibility with different deployment platforms. Ultralytics provides multiple formats, including ONNX, TensorRT, and CoreML. To export your YOLO11 model, follow this guide:
- Use the `export` function with the desired format parameter, as in the sketch below.
- Ensure the exported model fits the specifications of your deployment environment (e.g., edge devices, cloud).
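For example, a minimal export sketch might look like this (the `yolo11n.pt` checkpoint and ONNX target are placeholders; substitute your own trained weights and format):

```python
from ultralytics import YOLO

# Load a trained model (placeholder checkpoint name)
model = YOLO("yolo11n.pt")

# Export to the desired format, e.g. ONNX
model.export(format="onnx")
```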
For more information, check out the [model export guide](../modes/export.md).
### What are the best practices for monitoring and maintaining a deployed computer vision model?
Continuous monitoring and maintenance are essential for a model's long-term success. Implement tools for tracking Key Performance Indicators (KPIs) and detecting anomalies. Regularly retrain the model with updated data to counteract model drift. Document the entire process, including model architecture, hyperparameters, and changes, to ensure reproducibility and ease of future updates. Learn more in our [monitoring and maintenance guide](#step-8-monitoring-maintenance-and-documentation).
description: Learn how to set up a real-time object detection application using Streamlit and Ultralytics YOLO11. Follow this step-by-step guide to implement webcam-based object detection.
# Live Inference with Streamlit Application using Ultralytics YOLO11
## Introduction
Streamlit makes it simple to build and deploy interactive web applications. Combining this with Ultralytics YOLO11 allows for real-time [object detection](https://www.ultralytics.com/glossary/object-detection) and analysis directly in your browser. YOLO11's high accuracy and speed ensure seamless performance for live video streams, making it ideal for applications in security, retail, and beyond.
<strong>Watch:</strong> How to Use Streamlit with Ultralytics for Real-Time <a href="https://www.ultralytics.com/glossary/computer-vision-cv">Computer Vision</a> in Your Browser
|  |  |
| Fish Detection using Ultralytics YOLO11 | Animals Detection using Ultralytics YOLO11 |
## Advantages of Live Inference
- **Seamless Real-Time Object Detection**: Streamlit combined with YOLO11 enables real-time object detection directly from your webcam feed. This allows for immediate analysis and insights, making it ideal for applications requiring instant feedback.
- **User-Friendly Deployment**: Streamlit's interactive interface makes it easy to deploy and use the application without extensive technical knowledge. Users can start live inference with a simple click, enhancing accessibility and usability.
- **Efficient Resource Utilization**: YOLO11's optimized algorithms ensure high-speed processing with minimal computational resources. This efficiency allows for smooth and reliable webcam inference even on standard hardware, making advanced computer vision accessible to a wider audience.
## Streamlit Application Code
!!! tip "Ultralytics Installation"
Before you start building the application, ensure you have the Ultralytics Python Package installed. You can install it using the command **pip install ultralytics**
!!! example "Streamlit Application"
=== "CLI"
```bash
yolo streamlit-predict
```
=== "Python"
```python
from ultralytics import solutions
solutions.inference()
### Make sure to run the file using command `streamlit run <file-name.py>`
```
This will launch the Streamlit application in your default web browser. You will see the main title, subtitle, and the sidebar with configuration options. Select your desired YOLO11 model, set the confidence and NMS thresholds, and click the "Start" button to begin the real-time object detection.
You can optionally supply a specific model in Python:
!!! example "Streamlit Application with a custom model"
=== "Python"
```python
from ultralytics import solutions
# Pass a model as an argument
solutions.inference(model="path/to/model.pt")
### Make sure to run the file using command `streamlit run <file-name.py>`
```
## Conclusion
By following this guide, you have successfully created a real-time object detection application using Streamlit and Ultralytics YOLO11. This application allows you to experience the power of YOLO11 in detecting objects through your webcam, with a user-friendly interface and the ability to stop the video stream at any time.
For further enhancements, you can explore adding more features such as recording the video stream, saving the annotated frames, or integrating with other computer vision libraries.
## Share Your Thoughts with the Community
Engage with the community to learn more, troubleshoot issues, and share your projects:
### Where to Find Help and Support
- **GitHub Issues:** Visit the [Ultralytics GitHub repository](https://github.com/ultralytics/ultralytics/issues) to raise questions, report bugs, and suggest features.
- **Ultralytics Discord Server:** Join the [Ultralytics Discord server](https://discord.com/invite/ultralytics) to connect with other users and developers, get support, share knowledge, and brainstorm ideas.
### Official Documentation
- **Ultralytics YOLO11 Documentation:** Refer to the [official YOLO11 documentation](https://docs.ultralytics.com/) for comprehensive guides and insights on various computer vision tasks and projects.
## FAQ
### How can I set up a real-time object detection application using Streamlit and Ultralytics YOLO11?
Setting up a real-time object detection application with Streamlit and Ultralytics YOLO11 is straightforward. First, ensure you have the Ultralytics Python package installed using:
```bash
pip install ultralytics
```
Then, you can create a basic Streamlit application to run live inference:
!!! example "Streamlit Application"
=== "Python"
```python
from ultralytics import solutions
solutions.inference()
### Make sure to run the file using command `streamlit run <file-name.py>`
```
=== "CLI"
```bash
yolo streamlit-predict
```
For more details on the practical setup, refer to the [Streamlit Application Code section](#streamlit-application-code) of the documentation.
### What are the main advantages of using Ultralytics YOLO11 with Streamlit for real-time object detection?
Using Ultralytics YOLO11 with Streamlit for real-time object detection offers several advantages:
- **Seamless Real-Time Object Detection**: Detection runs directly on your webcam feed in the browser, providing immediate analysis and feedback.
- **User-Friendly Deployment**: Streamlit's interactive interface lets users start live inference with a simple click, without extensive technical knowledge.
- **Efficient Resource Utilization**: YOLO11's optimized algorithms allow smooth, high-speed inference even on standard hardware.
Discover more about these advantages [here](#advantages-of-live-inference).
### How do I deploy a Streamlit object detection application in my web browser?
After coding your Streamlit application integrating Ultralytics YOLO11, you can deploy it by running:
```bash
streamlit run <file-name.py>
```
This command will launch the application in your default web browser, enabling you to select YOLO11 models, set the confidence and NMS thresholds, and start real-time object detection with a simple click. For a detailed guide, refer to the [Streamlit Application Code](#streamlit-application-code) section.
### What are some use cases for real-time object detection using Streamlit and Ultralytics YOLO11?
Real-time object detection using Streamlit and Ultralytics YOLO11 can be applied in various sectors:
- **Security**: Real-time monitoring for unauthorized access.
- **Retail**: Customer counting, shelf management, and more.
- **Wildlife and Agriculture**: Monitoring animals and crop conditions.
For more in-depth use cases and examples, explore [Ultralytics Solutions](https://docs.ultralytics.com/solutions/).
### How does Ultralytics YOLO11 compare to other object detection models like YOLOv5 and RCNNs?
Ultralytics YOLO11 provides several enhancements over prior models like YOLOv5 and RCNNs:
- **Higher Speed and Accuracy**: Improved performance for real-time applications.
- **Ease of Use**: Simplified interfaces and deployment.
- **Resource Efficiency**: Optimized for better speed with minimal computational requirements.
For a comprehensive comparison, check [Ultralytics YOLO11 Documentation](https://docs.ultralytics.com/models/yolov8/) and related blog posts discussing model performance.
description: Learn how to integrate Ultralytics YOLO11 with NVIDIA Triton Inference Server for scalable, high-performance AI model deployment.
keywords: Triton Inference Server, YOLO11, Ultralytics, NVIDIA, deep learning, AI model deployment, ONNX, scalable inference
---
# Triton Inference Server with Ultralytics YOLO11
The [Triton Inference Server](https://developer.nvidia.com/triton-inference-server) (formerly known as TensorRT Inference Server) is an open-source software solution developed by NVIDIA. It provides a cloud inference solution optimized for NVIDIA GPUs. Triton simplifies the deployment of AI models at scale in production. Integrating Ultralytics YOLO11 with Triton Inference Server allows you to deploy scalable, high-performance [deep learning](https://www.ultralytics.com/glossary/deep-learning-dl) inference workloads. This guide provides steps to set up and test the integration.
<strong>Watch:</strong> Getting Started with NVIDIA Triton Inference Server.
</p>
## What is Triton Inference Server?
Triton Inference Server is designed to deploy a variety of AI models in production. It supports a wide range of deep learning and [machine learning](https://www.ultralytics.com/glossary/machine-learning-ml) frameworks, including TensorFlow, [PyTorch](https://www.ultralytics.com/glossary/pytorch), ONNX Runtime, and many others. Its primary use cases are:
- Serving multiple models from a single server instance.
- Dynamic model loading and unloading without server restart.
- Ensemble inference, allowing multiple models to be used together to achieve results.
- Model versioning for A/B testing and rolling updates.
## Prerequisites
Ensure you have the following prerequisites before proceeding:
- Docker installed on your machine.
- Install `tritonclient`:
```bash
pip install tritonclient[all]
```
## Exporting YOLO11 to ONNX Format
Before deploying the model on Triton, it must be exported to the ONNX format. ONNX (Open Neural Network Exchange) is a format that allows models to be transferred between different deep learning frameworks. Use the `export` function from the `YOLO` class:
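A minimal export sketch, assuming an official `yolo11n.pt` checkpoint (substitute your own trained weights as needed):

```python
from ultralytics import YOLO

# Load an official or custom model (placeholder checkpoint name)
model = YOLO("yolo11n.pt")

# Export the model to ONNX format; dynamic axes are commonly enabled for Triton deployments
onnx_file = model.export(format="onnx", dynamic=True)
```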
By following the above steps, you can deploy and run Ultralytics YOLO11 models efficiently on Triton Inference Server, providing a scalable and high-performance solution for deep learning inference tasks. If you face any issues or have further queries, refer to the [official Triton documentation](https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html) or reach out to the Ultralytics community for support.
## FAQ
### How do I set up Ultralytics YOLO11 with NVIDIA Triton Inference Server?
Setting up [Ultralytics YOLO11](https://docs.ultralytics.com/models/yolov8/) with [NVIDIA Triton Inference Server](https://developer.nvidia.com/triton-inference-server) involves a few key steps:
1. **Export YOLO11 to ONNX format**:
```python
from ultralytics import YOLO

# Load a model
model = YOLO("yolo11n.pt")  # load an official model

# Export the model to ONNX format
onnx_file = model.export(format="onnx", dynamic=True)
```
This setup can help you efficiently deploy YOLO11 models at scale on Triton Inference Server for high-performance AI model inference.
### What benefits does using Ultralytics YOLO11 with NVIDIA Triton Inference Server offer?
Integrating [Ultralytics YOLO11](../models/yolov8.md) with [NVIDIA Triton Inference Server](https://developer.nvidia.com/triton-inference-server) provides several advantages:
- **Scalable AI Inference**: Triton allows serving multiple models from a single server instance, supporting dynamic model loading and unloading, making it highly scalable for diverse AI workloads.
- **High Performance**: Optimized for NVIDIA GPUs, Triton Inference Server ensures high-speed inference operations, perfect for real-time applications such as [object detection](https://www.ultralytics.com/glossary/object-detection).
- **Ensemble and Model Versioning**: Triton's ensemble mode enables combining multiple models to improve results, and its model versioning supports A/B testing and rolling updates.
For detailed instructions on setting up and running YOLO11 with Triton, you can refer to the [setup guide](#setting-up-triton-model-repository).
### Why should I export my YOLO11 model to ONNX format before using Triton Inference Server?
Using ONNX (Open Neural Network Exchange) format for your [Ultralytics YOLO11](../models/yolov8.md) model before deploying it on [NVIDIA Triton Inference Server](https://developer.nvidia.com/triton-inference-server) offers several key benefits:
- **Interoperability**: ONNX format supports transfer between different deep learning frameworks (such as PyTorch, TensorFlow), ensuring broader compatibility.
- **Optimization**: Many deployment environments, including Triton, optimize for ONNX, enabling faster inference and better performance.
- **Ease of Deployment**: ONNX is widely supported across frameworks and platforms, simplifying the deployment process in various operating systems and hardware configurations.
You can follow the steps in the [exporting guide](../modes/export.md) to complete the process.
### Can I run inference using the Ultralytics YOLO11 model on Triton Inference Server?
Yes, you can run inference using the [Ultralytics YOLO11](../models/yolov8.md) model on [NVIDIA Triton Inference Server](https://developer.nvidia.com/triton-inference-server). Once your model is set up in the Triton Model Repository and the server is running, you can load and run inference on your model as follows:
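A minimal sketch, assuming Triton is serving the exported model under the name `yolo` on `localhost:8000` (adjust the URL and model name to your setup):

```python
from ultralytics import YOLO

# Point the YOLO class at the HTTP endpoint exposed by Triton
model = YOLO("http://localhost:8000/yolo", task="detect")

# Run inference as you would with a local model
results = model("path/to/image.jpg")
```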
For an in-depth guide on setting up and running Triton Server with YOLO11, refer to the [running triton inference server](#running-triton-inference-server) section.
### How does Ultralytics YOLO11 compare to [TensorFlow](https://www.ultralytics.com/glossary/tensorflow) and PyTorch models for deployment?
[Ultralytics YOLO11](https://docs.ultralytics.com/models/yolov8/) offers several unique advantages compared to TensorFlow and PyTorch models for deployment:
- **Real-time Performance**: Optimized for real-time object detection tasks, YOLO11 provides state-of-the-art [accuracy](https://www.ultralytics.com/glossary/accuracy) and speed, making it ideal for applications requiring live video analytics.
- **Ease of Use**: YOLO11 integrates seamlessly with Triton Inference Server and supports diverse export formats (ONNX, TensorRT, CoreML), making it flexible for various deployment scenarios.
- **Advanced Features**: YOLO11 includes features like dynamic model loading, model versioning, and ensemble inference, which are crucial for scalable and reliable AI deployments.
For more details, compare the deployment options in the [model deployment guide](../modes/export.md).
<img width="800" src="https://github.com/ultralytics/docs/releases/download/0/sixel-example-terminal.avif" alt="Sixel example of image in Terminal">
</p>
Image from the [libsixel](https://saitoha.github.io/libsixel/) website.
## Motivation
When connecting to a remote machine, normally visualizing image results is not possible or requires moving data to a local device with a GUI. The VSCode integrated terminal allows for directly rendering images. This is a short demonstration on how to use this in conjunction with `ultralytics` with [prediction results](../modes/predict.md).
!!! warning
Only compatible with Linux and macOS. Check the [VSCode repository](https://github.com/microsoft/vscode), the [issue status](https://github.com/microsoft/vscode/issues/198622), or the [documentation](https://code.visualstudio.com/docs) for updates on Windows support for viewing images in the terminal with `sixel`.
The VSCode compatible protocols for viewing images using the integrated terminal are [`sixel`](https://en.wikipedia.org/wiki/Sixel) and [`iTerm`](https://iterm2.com/documentation-images.html). This guide will demonstrate use of the `sixel` protocol.
## Process
1. First, you must enable settings `terminal.integrated.enableImages` and `terminal.integrated.gpuAcceleration` in VSCode.
```yaml
"terminal.integrated.enableImages": true
"terminal.integrated.gpuAcceleration": "auto" # "auto" is default, can also use "on"
```
2. Install the `python-sixel` library in your virtual environment. This is a [fork](https://github.com/lubosz/python-sixel?tab=readme-ov-file) of the `PySixel` library, which is no longer maintained.
```bash
pip install sixel
```
3. Load a model and execute inference, then plot the results and store in a variable. See more about inference arguments and working with results on the [predict mode](../modes/predict.md) page.
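A minimal sketch, assuming a pretrained `yolo11n.pt` model and a local image path:
```{ .py .annotate }
from ultralytics import YOLO

model = YOLO("yolo11n.pt")

results = model.predict(source="path/to/image.jpg")

# Plot inference results and keep the annotated image as a numpy array
plot = results[0].plot()  # (1)!
```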
1. See [plot method parameters](../modes/predict.md#plot-method-parameters) to see possible arguments to use.
4. Now, use [OpenCV](https://www.ultralytics.com/glossary/opencv) to convert the `numpy.ndarray` to `bytes` data. Then use `io.BytesIO` to make a "file-like" object.
```{ .py .annotate }
import io
import cv2
# Results image as bytes
im_bytes = cv2.imencode(
".png", # (1)!
plot,
)[1].tobytes() # (2)!
# Image bytes as a file-like object
mem_file = io.BytesIO(im_bytes)
```
1. It's possible to use other image extensions as well.
2. Only the object at index `1` that is returned is needed.
5. Create a `SixelWriter` instance, and then use the `.draw()` method to draw the image in the terminal.
```python
from sixel import SixelWriter
# Create sixel writer object
w = SixelWriter()
# Draw the sixel image in the terminal
w.draw(mem_file)
```
## Example Inference Results
<p align="center">
<img width="800" src="https://github.com/ultralytics/docs/releases/download/0/view-image-in-terminal.avif" alt="View Image in Terminal">
</p>
!!! danger
Using this example with videos or animated GIF frames has **not** been tested. Attempt at your own risk.
For further details, visit the [predict mode](../modes/predict.md) page.
### Why does the sixel protocol only work on Linux and macOS?
The sixel protocol is currently only supported on Linux and macOS because these platforms have native terminal capabilities compatible with sixel graphics. Windows support for terminal graphics using sixel is still under development. For updates on Windows compatibility, check the [VSCode Issue status](https://github.com/microsoft/vscode/issues/198622) and [documentation](https://code.visualstudio.com/docs).
### What if I encounter issues with displaying images in the VSCode terminal?
If you encounter issues displaying images in the VSCode terminal using sixel:
1. Ensure the necessary settings in VSCode are enabled:
```yaml
"terminal.integrated.enableImages": true
"terminal.integrated.gpuAcceleration": "auto"
```
2. Verify the sixel library installation:
```bash
pip install sixel
```
3. Check your image data conversion and plotting code for errors. For example:
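A minimal end-to-end sketch you can compare against, assuming a pretrained `yolo11n.pt` model and a local image path:
```python
import io

import cv2
from sixel import SixelWriter

from ultralytics import YOLO

model = YOLO("yolo11n.pt")
plot = model.predict(source="path/to/image.jpg")[0].plot()  # annotated image as a numpy array

im_bytes = cv2.imencode(".png", plot)[1].tobytes()  # encode the annotated image to PNG bytes
mem_file = io.BytesIO(im_bytes)  # wrap the bytes in a file-like object

SixelWriter().draw(mem_file)  # render the image in the terminal
```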
If problems persist, consult the [VSCode repository](https://github.com/microsoft/vscode), and visit the [plot method parameters](../modes/predict.md#plot-method-parameters) section for additional guidance.
### Can YOLO display video inference results in the terminal using sixel?
Displaying video inference results or animated GIF frames using sixel in the terminal is currently untested and may not be supported. We recommend starting with static images and verifying compatibility. Attempt video results at your own risk, keeping in mind performance constraints. For more information on plotting inference results, visit the [predict mode](../modes/predict.md) page.
### How can I troubleshoot issues with the `python-sixel` library?
To troubleshoot issues with the `python-sixel` library:
1. Ensure the library is correctly installed in your virtual environment:
```bash
pip install sixel
```
2. Verify that you have the necessary Python and system dependencies.
3. Refer to the [python-sixel GitHub repository](https://github.com/lubosz/python-sixel) for additional documentation and community support.
4. Double-check your code for potential errors, specifically the usage of `SixelWriter` and image data conversion steps.
For further assistance on working with YOLO models and sixel integration, see the [export](../modes/export.md) and [predict mode](../modes/predict.md) documentation pages.
description: Discover VisionEye's object mapping and tracking powered by Ultralytics YOLO11. Simulate human eye precision, track objects, and calculate distances effortlessly.
# VisionEye View Object Mapping using Ultralytics YOLO11 🚀
## What is VisionEye Object Mapping?
[Ultralytics YOLO11](https://github.com/ultralytics/ultralytics/) VisionEye offers the capability for computers to identify and pinpoint objects, simulating the observational [precision](https://www.ultralytics.com/glossary/precision) of the human eye. This functionality enables computers to discern and focus on specific objects, much like the way the human eye observes details from a particular viewpoint.
## Samples
| VisionEye View | VisionEye View With Object Tracking | VisionEye View With Distance Calculation |
For any inquiries, feel free to post your questions in the [Ultralytics Issue Section](https://github.com/ultralytics/ultralytics/issues/new/choose) or the discussion section mentioned below.
## FAQ
### How do I start using VisionEye Object Mapping with Ultralytics YOLO11?
To start using VisionEye Object Mapping with Ultralytics YOLO11, first, you'll need to install the Ultralytics YOLO package via pip. Then, you can use the sample code provided in the documentation to set up [object detection](https://www.ultralytics.com/glossary/object-detection) with VisionEye. Here's a simple example to get you started:
```python
import cv2

from ultralytics import YOLO

model = YOLO("yolo11n.pt")
cap = cv2.VideoCapture("path/to/video/file.mp4")

while True:
    ret, frame = cap.read()
    if not ret:
        break

    results = model.predict(frame)
    for result in results:
        # Perform custom logic with result
        pass

    cv2.imshow("visioneye", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```
### What are the key features of VisionEye's object tracking capability using Ultralytics YOLO11?
VisionEye's object tracking with Ultralytics YOLO11 allows users to follow the movement of objects within a video frame. Key features include:
1. **Real-Time Object Tracking**: Keeps up with objects as they move.
2. **Distance Calculation**: Calculates distances between objects and specified points.
3. **Annotation and Visualization**: Provides visual markers for tracked objects.
Here's a brief code snippet demonstrating tracking with VisionEye:
```python
import cv2

from ultralytics import YOLO

model = YOLO("yolo11n.pt")
cap = cv2.VideoCapture("path/to/video/file.mp4")

while True:
    ret, frame = cap.read()
    if not ret:
        break

    results = model.track(frame, persist=True)
    for result in results:
        # Annotate and visualize tracking
        pass

    cv2.imshow("visioneye-tracking", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```
For a comprehensive guide, visit the [VisionEye Object Mapping with Object Tracking](#samples) example.
### How can I calculate distances with VisionEye's YOLO11 model?
Distance calculation with VisionEye and Ultralytics YOLO11 involves determining the distance of detected objects from a specified point in the frame. It enhances spatial analysis capabilities, useful in applications such as autonomous driving and surveillance.
For detailed instructions, refer to the [VisionEye with Distance Calculation](#samples) example.
### Why should I use Ultralytics YOLO11 for object mapping and tracking?
Ultralytics YOLO11 is renowned for its speed, [accuracy](https://www.ultralytics.com/glossary/accuracy), and ease of integration, making it a top choice for object mapping and tracking. Key advantages include:
1. **State-of-the-art Performance**: Delivers high accuracy in real-time object detection.
2. **Flexibility**: Supports various tasks such as detection, tracking, and distance calculation.
3. **Community and Support**: Extensive documentation and active GitHub community for troubleshooting and enhancements.
4. **Ease of Use**: Intuitive API simplifies complex tasks, allowing for rapid deployment and iteration.
For more information on applications and benefits, check out the [Ultralytics YOLO11 documentation](https://docs.ultralytics.com/models/yolov8/).
### How can I integrate VisionEye with other [machine learning](https://www.ultralytics.com/glossary/machine-learning-ml) tools like Comet or ClearML?
Ultralytics YOLO11 can integrate seamlessly with various machine learning tools like Comet and ClearML, enhancing experiment tracking, collaboration, and reproducibility. Follow the detailed guides on [how to use YOLOv5 with Comet](https://www.ultralytics.com/blog/how-to-use-yolov5-with-comet) and [integrate YOLO11 with ClearML](https://docs.ultralytics.com/integrations/clearml/) to get started.
For further exploration and integration examples, check our [Ultralytics Integrations Guide](https://docs.ultralytics.com/integrations/).
description: Optimize your fitness routine with real-time workouts monitoring using Ultralytics YOLO11. Track and improve your exercise form and performance.
Monitoring workouts through pose estimation with [Ultralytics YOLO11](https://github.com/ultralytics/ultralytics/) enhances exercise assessment by accurately tracking key body landmarks and joints in real-time. This technology provides instant feedback on exercise form, tracks workout routines, and measures performance metrics, optimizing training sessions for users and trainers alike.
| Name | Type | Default | Description |
| ------------ | ------- | ------- | ------------------------------------------------------------------------------------------------ |
| `kpts` | `list` | `None` | List of three keypoint indices, following the pose keypoint map, used for counting a specific workout |
| `line_width` | `int` | `2` | Thickness of the lines drawn. |
| `show` | `bool` | `False` | Flag to display the image. |
| `up_angle` | `float` | `145.0` | Angle threshold for the 'up' pose. |
| `down_angle` | `float` | `90.0` | Angle threshold for the 'down' pose. |
| `model` | `str` | `None` | Path to Ultralytics YOLO Pose Model File |
### Arguments `model.predict`
{% include "macros/predict-args.md" %}
### Arguments `model.track`
{% include "macros/track-args.md" %}
## FAQ
### How do I monitor my workouts using Ultralytics YOLO11?
To monitor your workouts using Ultralytics YOLO11, you can utilize the pose estimation capabilities to track and analyze key body landmarks and joints in real-time. This allows you to receive instant feedback on your exercise form, count repetitions, and measure performance metrics. You can start by using the provided example code for pushups, pullups, or ab workouts as shown:
```python
# Assumes `cap` (a cv2.VideoCapture) and `gym` (a solutions.AIGym instance) are already set up
while cap.isOpened():
    success, im0 = cap.read()
    if not success:
        print("Video frame is empty or video processing has been successfully completed.")
        break
    im0 = gym.monitor(im0)

cv2.destroyAllWindows()
```
For further customization and settings, you can refer to the [AIGym](#arguments-aigym) section in the documentation.
### What are the benefits of using Ultralytics YOLO11 for workout monitoring?
Using Ultralytics YOLO11 for workout monitoring provides several key benefits:
- **Optimized Performance:** By tailoring workouts based on monitoring data, you can achieve better results.
- **Goal Achievement:** Easily track and adjust fitness goals for measurable progress.
- **Personalization:** Get customized workout plans based on your individual data for optimal effectiveness.
- **Health Awareness:** Early detection of patterns that indicate potential health issues or over-training.
- **Informed Decisions:** Make data-driven decisions to adjust routines and set realistic goals.
You can watch a [YouTube video demonstration](https://www.youtube.com/watch?v=LGGxqLZtvuw) to see these benefits in action.
### How accurate is Ultralytics YOLO11 in detecting and tracking exercises?
Ultralytics YOLO11 is highly accurate in detecting and tracking exercises due to its state-of-the-art pose estimation capabilities. It can accurately track key body landmarks and joints, providing real-time feedback on exercise form and performance metrics. The model's pretrained weights and robust architecture ensure high [precision](https://www.ultralytics.com/glossary/precision) and reliability. For real-world examples, check out the [real-world applications](#real-world-applications) section in the documentation, which showcases pushups and pullups counting.
### Can I use Ultralytics YOLO11 for custom workout routines?
Yes, Ultralytics YOLO11 can be adapted for custom workout routines. The `AIGym` class supports different pose types such as "pushup", "pullup", and "abworkout." You can specify keypoints and angles to detect specific exercises. Here is an example setup:
```python
from ultralytics import solutions

gym = solutions.AIGym(
    line_width=2,
    show=True,
    kpts=[6, 8, 10],
)
```
For more details on setting arguments, refer to the [Arguments `AIGym`](#arguments-aigym) section. This flexibility allows you to monitor various exercises and customize routines based on your needs.
### How can I save the workout monitoring output using Ultralytics YOLO11?
To save the workout monitoring output, you can modify the code to include a video writer that saves the processed frames. Here's an example:
```python
# Assumes `cap`, `gym`, and a cv2.VideoWriter named `video_writer` are already set up
while cap.isOpened():
    success, im0 = cap.read()
    if not success:
        print("Video frame is empty or video processing has been successfully completed.")
        break
    im0 = gym.monitor(im0)
    video_writer.write(im0)

cv2.destroyAllWindows()
video_writer.release()
```
This setup writes the monitored video to an output file. For more details, refer to the [Workouts Monitoring with Save Output](#workouts-monitoring-using-ultralytics-yolo11) section.
description: Comprehensive guide to troubleshoot common YOLO11 issues, from installation errors to model training challenges. Enhance your Ultralytics projects with our expert tips.
keywords: YOLO, YOLO11, troubleshooting, installation errors, model training, GPU issues, Ultralytics, AI, computer vision, deep learning, Python, CUDA, PyTorch, debugging
---
# Troubleshooting Common YOLO Issues
<p align="center">
<img width="800" src="https://github.com/ultralytics/docs/releases/download/0/yolo-common-issues.avif" alt="YOLO Common Issues Image">
</p>
## Introduction
This guide serves as a comprehensive aid for troubleshooting common issues encountered while working with YOLO11 on your Ultralytics projects. Navigating through these issues can be a breeze with the right guidance, ensuring your projects remain on track without unnecessary delays.
<strong>Watch:</strong> Ultralytics YOLO11 Common Issues | Installation Errors, Model Training Issues
</p>
## Common Issues
### Installation Errors
Installation errors can arise due to various reasons, such as incompatible versions, missing dependencies, or incorrect environment setups. First, make sure you are doing the following:
- You're using Python 3.8 or later as recommended.
- Ensure that you have the correct version of [PyTorch](https://www.ultralytics.com/glossary/pytorch) (1.8 or later) installed.
- Consider using virtual environments to avoid conflicts.
- Follow the [official installation guide](../quickstart.md) step by step.
Additionally, here are some common installation issues users have encountered, along with their respective solutions:
- Import Errors or Dependency Issues - If you're getting errors during the import of YOLO11, or you're having issues related to dependencies, consider the following troubleshooting steps:
- **Fresh Installation**: Sometimes, starting with a fresh installation can resolve unexpected issues. Especially with libraries like Ultralytics, where updates might introduce changes to the file tree structure or functionalities.
- **Update Regularly**: Ensure you're using the latest version of the library. Older versions might not be compatible with recent updates, leading to potential conflicts or issues.
- **Check Dependencies**: Verify that all required dependencies are correctly installed and are of the compatible versions.
- **Review Changes**: If you initially cloned or installed an older version, be aware that significant updates might affect the library's structure or functionalities. Always refer to the official documentation or changelogs to understand any major changes.
- Remember, keeping your libraries and dependencies up-to-date is crucial for a smooth and error-free experience.
- Running YOLO11 on GPU - If you're having trouble running YOLO11 on GPU, consider the following troubleshooting steps:
- **Verify CUDA Compatibility and Installation**: Ensure your GPU is CUDA compatible and that CUDA is correctly installed. Use the `nvidia-smi` command to check the status of your NVIDIA GPU and CUDA version.
- **Check PyTorch and CUDA Integration**: Ensure PyTorch can utilize CUDA by running `import torch; print(torch.cuda.is_available())` in a Python terminal. If it returns 'True', PyTorch is set up to use CUDA.
- **Environment Activation**: Ensure you're in the correct environment where all necessary packages are installed.
- **Update Your Packages**: Outdated packages might not be compatible with your GPU. Keep them updated.
- **Program Configuration**: Check if the program or code specifies GPU usage. In YOLO11, this might be in the settings or configuration.
### Model Training Issues
This section will address common issues faced while training and their respective explanations and solutions.
#### Verification of Configuration Settings
**Issue**: You are unsure whether the configuration settings in the `.yaml` file are being applied correctly during model training.
**Solution**: The configuration settings in the `.yaml` file should be applied when using the `model.train()` function. To ensure that these settings are correctly applied, follow these steps:
- Confirm that the path to your `.yaml` configuration file is correct.
- Make sure you pass the path to your `.yaml` file as the `data` argument when calling `model.train()`, as shown below:
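A minimal sketch, assuming your dataset configuration lives at a placeholder path `path/to/your_dataset.yaml`:

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")

# Pass the dataset .yaml path via the `data` argument so its settings are applied
results = model.train(data="path/to/your_dataset.yaml", epochs=100, imgsz=640)
```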
**Issue**: Training is slow on a single GPU, and you want to speed up the process using multiple GPUs.
**Solution**: Increasing the [batch size](https://www.ultralytics.com/glossary/batch-size) can accelerate training, but it's essential to consider GPU memory capacity. To speed up training with multiple GPUs, follow these steps:
- Ensure that you have multiple GPUs available.
- Modify your .yaml configuration file to specify the number of GPUs to use, e.g., gpus: 4.
- Increase the batch size accordingly to fully utilize the multiple GPUs without exceeding memory limits.
- Modify your training command to utilize multiple GPUs:
```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")

# Adjust the batch size and other settings as needed to optimize training speed
results = model.train(data="path/to/your_dataset.yaml", epochs=100, imgsz=640, device=[0, 1, 2, 3], batch=64)
```
#### Continuous Monitoring Parameters
**Issue**: You want to know which parameters should be continuously monitored during training, apart from loss.
**Solution**: While loss is a crucial metric to monitor, it's also essential to track other metrics for model performance optimization. Some key metrics to monitor during training include:
- Precision
- Recall
- [Mean Average Precision](https://www.ultralytics.com/glossary/mean-average-precision-map) (mAP)
You can access these metrics from the training logs or by using tools like TensorBoard or wandb for visualization. Implementing early stopping based on these metrics can help you achieve better results.
#### Tools for Tracking Training Progress
**Issue**: You are looking for recommendations on tools to track training progress.
**Solution**: To track and visualize training progress, you can consider using the following tools:
- [TensorBoard](https://www.tensorflow.org/tensorboard): TensorBoard is a popular choice for visualizing training metrics, including loss, [accuracy](https://www.ultralytics.com/glossary/accuracy), and more. You can integrate it with your YOLO11 training process.
- [Comet](https://bit.ly/yolov8-readme-comet): Comet provides an extensive toolkit for experiment tracking and comparison. It allows you to track metrics, hyperparameters, and even model weights. Integration with YOLO models is also straightforward, providing you with a complete overview of your experiment cycle.
- [Ultralytics HUB](https://hub.ultralytics.com/): Ultralytics HUB offers a specialized environment for tracking YOLO models, giving you a one-stop platform to manage metrics, datasets, and even collaborate with your team. Given its tailored focus on YOLO, it offers more customized tracking options.
Each of these tools offers its own set of advantages, so you may want to consider the specific needs of your project when making a choice.
#### How to Check if Training is Happening on the GPU
**Issue**: The 'device' value in the training logs is 'null,' and you're unsure if training is happening on the GPU.
**Solution**: The 'device' value being 'null' typically means that the training process is set to automatically use an available GPU, which is the default behavior. To ensure training occurs on a specific GPU, you can manually set the 'device' value to the GPU index (e.g., '0' for the first GPU) in your .yaml configuration file:
```yaml
device: 0
```
This will explicitly assign the training process to the specified GPU. If you wish to train on the CPU, set 'device' to 'cpu'.
Keep an eye on the 'runs' folder for logs and metrics to monitor training progress effectively.
#### Key Considerations for Effective Model Training
Here are some things to keep in mind if you are facing issues related to model training.
**Dataset Format and Labels**
- Importance: The foundation of any [machine learning](https://www.ultralytics.com/glossary/machine-learning-ml) model lies in the quality and format of the data it is trained on.
- Recommendation: Ensure that your custom dataset and its associated labels adhere to the expected format. It's crucial to verify that annotations are accurate and of high quality. Incorrect or subpar annotations can derail the model's learning process, leading to unpredictable outcomes.
**Model Convergence**
- Importance: Achieving model convergence ensures that the model has sufficiently learned from the [training data](https://www.ultralytics.com/glossary/training-data).
- Recommendation: When training a model 'from scratch', it's vital to ensure that the model reaches a satisfactory level of convergence. This might necessitate a longer training duration, with more [epochs](https://www.ultralytics.com/glossary/epoch), compared to when you're fine-tuning an existing model.
**[Learning Rate](https://www.ultralytics.com/glossary/learning-rate) and Batch Size**
- Importance: These hyperparameters play a pivotal role in determining how the model updates its weights during training.
- Recommendation: Regularly evaluate if the chosen learning rate and batch size are optimal for your specific dataset. Parameters that are not in harmony with the dataset's characteristics can hinder the model's performance.
**Class Distribution**
- Importance: The distribution of classes in your dataset can influence the model's prediction tendencies.
- Recommendation: Regularly assess the distribution of classes within your dataset. If there's a class imbalance, there's a risk that the model will develop a bias towards the more prevalent class. This bias can be evident in the confusion matrix, where the model might predominantly predict the majority class.
**Cross-Check with Pretrained Weights**
- Importance: Leveraging pretrained weights can provide a solid starting point for model training, especially when data is limited.
- Recommendation: As a diagnostic step, consider training your model using the same data but initializing it with pretrained weights. If this approach yields a well-formed confusion matrix, it could suggest that the 'from scratch' model might require further training or adjustments.
### Issues Related to Model Predictions
This section will address common issues faced during model prediction.
#### Getting Bounding Box Predictions With Your YOLO11 Custom Model
**Issue**: When running predictions with a custom YOLO11 model, there are challenges with the format and visualization of the bounding box coordinates.
**Solution**:
- Coordinate Format: YOLO11 provides bounding box coordinates in absolute pixel values. To convert these to relative coordinates (ranging from 0 to 1), you need to divide by the image dimensions. For example, let's say your image size is 640x640. Then you would do the following:
```python
# Convert absolute coordinates to relative coordinates
x1 = x1 / 640  # Divide x-coordinates by image width
x2 = x2 / 640
y1 = y1 / 640  # Divide y-coordinates by image height
y2 = y2 / 640
```
- File Name: To obtain the file name of the image you're predicting on, access the image file path directly from the result object within your prediction loop.
#### Filtering Objects in YOLO11 Predictions
**Issue**: Facing issues with how to filter and display only specific objects in the prediction results when running YOLO11 using the Ultralytics library.
**Solution**: To detect specific classes, use the `classes` argument to specify which classes you want to include in the output. For instance, to detect only cars (assuming 'cars' have class index 2):
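A minimal sketch (the image path and class index are placeholders for this example):

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")

# Keep only detections whose class index is 2
results = model.predict(source="path/to/image.jpg", classes=[2])
```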
**Issue**: Confusion regarding the difference between box precision, mask precision, and [confusion matrix](https://www.ultralytics.com/glossary/confusion-matrix) precision in YOLO11.
**Solution**: Box precision measures the accuracy of predicted bounding boxes compared to the actual ground truth boxes using IoU (Intersection over Union) as the metric. Mask precision assesses the agreement between predicted segmentation masks and ground truth masks in pixel-wise object classification. Confusion matrix precision, on the other hand, focuses on overall classification accuracy across all classes and does not consider the geometric accuracy of predictions. It's important to note that a [bounding box](https://www.ultralytics.com/glossary/bounding-box) can be geometrically accurate (true positive) even if the class prediction is wrong, leading to differences between box precision and confusion matrix precision. These metrics evaluate distinct aspects of a model's performance, reflecting the need for different evaluation metrics in various tasks.
#### Extracting Object Dimensions in YOLO11
**Issue**: Difficulty in retrieving the length and height of detected objects in YOLO11, especially when multiple objects are detected in an image.
**Solution**: To retrieve the bounding box dimensions, first use the Ultralytics YOLO11 model to predict objects in an image. Then, extract the width and height information of bounding boxes from the prediction results.
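A minimal sketch, assuming a pretrained `yolo11n.pt` model and a local image path:

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")
results = model.predict(source="path/to/image.jpg")

# Each row of boxes.xywh is (center_x, center_y, width, height) in pixels
for box in results[0].boxes.xywh:
    width, height = box[2].item(), box[3].item()
    print(f"Width: {width:.1f}px, Height: {height:.1f}px")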
**Issue:** Deploying models in a multi-GPU environment can sometimes lead to unexpected behaviors like unexpected memory usage, inconsistent results across GPUs, etc.
**Solution:** Check for default GPU initialization. Some frameworks, like PyTorch, might initialize CUDA operations on a default GPU before transitioning to the designated GPUs. To bypass unexpected default initializations, specify the GPU directly during deployment and prediction. Then, use tools to monitor GPU utilization and memory usage to identify any anomalies in real-time. Also, ensure you're using the latest version of the framework or library.
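For example, you can pin prediction to a single GPU with the `device` argument (the GPU index used here is an assumption; adjust it to your setup):

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")

# Explicitly run inference on GPU 0 instead of relying on default initialization
results = model.predict(source="path/to/image.jpg", device=0)
```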
#### Model Conversion/Exporting Issues
**Issue:** During the process of converting or exporting machine learning models to different formats or platforms, users might encounter errors or unexpected behaviors.
**Solution:**
- Compatibility Check: Ensure that you are using versions of libraries and frameworks that are compatible with each other. Mismatched versions can lead to unexpected errors during conversion.
- Environment Reset: If you're using an interactive environment like Jupyter or Colab, consider restarting your environment after making significant changes or installations. A fresh start can sometimes resolve underlying issues.
- Official Documentation: Always refer to the official documentation of the tool or library you are using for conversion. It often contains specific guidelines and best practices for model exporting.
- Community Support: Check the library or framework's official repository for similar issues reported by other users. The maintainers or community might have provided solutions or workarounds in discussion threads.
- Update Regularly: Ensure that you are using the latest version of the tool or library. Developers frequently release updates that fix known bugs or improve functionality.
- Test Incrementally: Before performing a full conversion, test the process with a smaller model or dataset to identify potential issues early on.
## Community and Support
Engaging with a community of like-minded individuals can significantly enhance your experience and success in working with YOLO11. Below are some channels and resources you may find helpful.
### Forums and Channels for Getting Help
**GitHub Issues:** The YOLO11 repository on GitHub has an [Issues tab](https://github.com/ultralytics/ultralytics/issues) where you can ask questions, report bugs, and suggest new features. The community and maintainers are active here, and it's a great place to get help with specific problems.
**Ultralytics Discord Server:** Ultralytics has a [Discord server](https://discord.com/invite/ultralytics) where you can interact with other users and the developers.
### Official Documentation and Resources
**Ultralytics YOLO11 Docs**: The [official documentation](../index.md) provides a comprehensive overview of YOLO11, along with guides on installation, usage, and troubleshooting.
These resources should provide a solid foundation for troubleshooting and improving your YOLO11 projects, as well as connecting with others in the YOLO11 community.
## Conclusion
Troubleshooting is an integral part of any development process, and being equipped with the right knowledge can significantly reduce the time and effort spent in resolving issues. This guide aimed to address the most common challenges faced by users of the YOLO11 model within the Ultralytics ecosystem. By understanding and addressing these common issues, you can ensure smoother project progress and achieve better results with your [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) tasks.
Remember, the Ultralytics community is a valuable resource. Engaging with fellow developers and experts can provide additional insights and solutions that might not be covered in standard documentation. Always keep learning, experimenting, and sharing your experiences to contribute to the collective knowledge of the community.
Happy troubleshooting!
## FAQ
### How do I resolve installation errors with YOLO11?
Installation errors can often be due to compatibility issues or missing dependencies. Ensure you use Python 3.8 or later and have PyTorch 1.8 or later installed. It's beneficial to use virtual environments to avoid conflicts. For a step-by-step installation guide, follow our [official installation guide](../quickstart.md). If you encounter import errors, try a fresh installation or update the library to the latest version.
### Why is my YOLO11 model training slow on a single GPU?
Training on a single GPU might be slow due to large batch sizes or insufficient memory. To speed up training, use multiple GPUs. Ensure your system has multiple GPUs available and adjust your `.yaml` configuration file to specify the number of GPUs, e.g., `gpus: 4`. Increase the batch size accordingly to fully utilize the GPUs without exceeding memory limits. Example command:
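A minimal sketch, assuming four available GPUs and a dataset `.yaml` at a placeholder path:

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")

# Spread training across four GPUs; adjust the device list and batch size to your hardware
results = model.train(data="path/to/your_dataset.yaml", epochs=100, imgsz=640, device=[0, 1, 2, 3], batch=64)
```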
### How can I ensure my YOLO11 model is training on the GPU?
If the 'device' value shows 'null' in the training logs, it generally means the training process is set to automatically use an available GPU. To explicitly assign a specific GPU, set the 'device' value in your `.yaml` configuration file. For instance:
```yaml
device: 0
```
This sets the training process to the first GPU. Consult the `nvidia-smi` command to confirm your CUDA setup.
### How can I monitor and track my YOLO11 model training progress?
Tracking and visualizing training progress can be efficiently managed through tools like [TensorBoard](https://www.tensorflow.org/tensorboard), [Comet](https://bit.ly/yolov8-readme-comet), and [Ultralytics HUB](https://hub.ultralytics.com/). These tools allow you to log and visualize metrics such as loss, [precision](https://www.ultralytics.com/glossary/precision), [recall](https://www.ultralytics.com/glossary/recall), and mAP. Implementing [early stopping](#continuous-monitoring-parameters) based on these metrics can also help achieve better training outcomes.
### What should I do if YOLO11 is not recognizing my dataset format?
Ensure your dataset and labels conform to the expected format. Verify that annotations are accurate and of high quality. If you face any issues, refer to the [Data Collection and Annotation](https://docs.ultralytics.com/guides/data-collection-and-annotation/) guide for best practices. For more dataset-specific guidance, check the [Datasets](https://docs.ultralytics.com/datasets/) section in the documentation.
description: Explore essential YOLO11 performance metrics like mAP, IoU, F1 Score, Precision, and Recall. Learn how to calculate and interpret them for model evaluation.
Performance metrics are key tools to evaluate the [accuracy](https://www.ultralytics.com/glossary/accuracy) and efficiency of [object detection](https://www.ultralytics.com/glossary/object-detection) models. They shed light on how effectively a model can identify and localize objects within images. Additionally, they help in understanding the model's handling of false positives and false negatives. These insights are crucial for evaluating and enhancing the model's performance. In this guide, we will explore various performance metrics associated with YOLO11, their significance, and how to interpret them.
Let's start by discussing some metrics that are not only important to YOLO11 but are broadly applicable across different object detection models.
- **[Intersection over Union](https://www.ultralytics.com/glossary/intersection-over-union-iou) (IoU):** IoU is a measure that quantifies the overlap between a predicted [bounding box](https://www.ultralytics.com/glossary/bounding-box) and a ground truth bounding box. It plays a fundamental role in evaluating the accuracy of object localization.
- **Average Precision (AP):** AP computes the area under the precision-recall curve, providing a single value that encapsulates the model's precision and recall performance.
- **Mean Average Precision (mAP):** mAP extends the concept of AP by calculating the average AP values across multiple object classes. This is useful in multi-class object detection scenarios to provide a comprehensive evaluation of the model's performance.
- **Precision and Recall:** Precision quantifies the proportion of true positives among all positive predictions, assessing the model's capability to avoid false positives. On the other hand, Recall calculates the proportion of true positives among all actual positives, measuring the model's ability to detect all instances of a class.
- **F1 Score:** The F1 Score is the harmonic mean of precision and recall, providing a balanced assessment of a model's performance while considering both false positives and false negatives.
## How to Calculate Metrics for YOLO11 Model
Now, we can explore [YOLO11's Validation mode](../modes/val.md) that can be used to compute the above discussed evaluation metrics.
Using the validation mode is simple. Once you have a trained model, you can invoke the `model.val()` function. This function will then process the validation dataset and return a variety of performance metrics. But what do these metrics mean? And how should you interpret them?
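A minimal sketch, assuming a trained `yolo11n.pt` checkpoint (substitute your own weights):

```python
from ultralytics import YOLO

# Load a trained model (placeholder checkpoint name)
model = YOLO("yolo11n.pt")

# Run validation; the dataset and settings recorded with the model are used by default
metrics = model.val()

print(metrics.box.map)  # mAP50-95
print(metrics.box.map50)  # mAP50
```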
### Interpreting the Output
Let's break down the output of the `model.val()` function and understand each segment of the output.
#### Class-wise Metrics
One of the sections of the output is the class-wise breakdown of performance metrics. This granular information is useful when you are trying to understand how well the model is doing for each specific class, especially in datasets with a diverse range of object categories. For each class in the dataset the following is provided:
- **Class**: This denotes the name of the object class, such as "person", "car", or "dog".
- **Images**: This metric tells you the number of images in the validation set that contain the object class.
- **Instances**: This provides the count of how many times the class appears across all images in the validation set.
- **Box(P, R, mAP50, mAP50-95)**: This metric provides insights into the model's performance in detecting objects:
- **P (Precision)**: The accuracy of the detected objects, indicating how many detections were correct.
- **R (Recall)**: The ability of the model to identify all instances of objects in the images.
- **mAP50**: Mean average precision calculated at an intersection over union (IoU) threshold of 0.50. It's a measure of the model's accuracy considering only the "easy" detections.
- **mAP50-95**: The average of the mean average precision calculated at varying IoU thresholds, ranging from 0.50 to 0.95. It gives a comprehensive view of the model's performance across different levels of detection difficulty.
#### Speed Metrics
The speed of inference can be as critical as accuracy, especially in real-time object detection scenarios. This section breaks down the time taken for various stages of the validation process, from preprocessing to post-processing.
#### COCO Metrics Evaluation
For users validating on the COCO dataset, additional metrics are calculated using the COCO evaluation script. These metrics give insights into precision and recall at different IoU thresholds and for objects of different sizes.
#### Visual Outputs
The `model.val()` function, apart from producing numeric metrics, also yields visual outputs that can provide a more intuitive understanding of the model's performance. Here's a breakdown of the visual outputs you can expect:
- **F1 Score Curve (`F1_curve.png`)**: This curve represents the [F1 score](https://www.ultralytics.com/glossary/f1-score) across various thresholds. Interpreting this curve can offer insights into the model's balance between false positives and false negatives over different thresholds.
- **Precision-Recall Curve (`PR_curve.png`)**: An integral visualization for any classification problem, this curve showcases the trade-offs between precision and [recall](https://www.ultralytics.com/glossary/recall) at varied thresholds. It becomes especially significant when dealing with imbalanced classes.
- **Precision Curve (`P_curve.png`)**: A graphical representation of precision values at different thresholds. This curve helps in understanding how precision varies as the threshold changes.
- **Recall Curve (`R_curve.png`)**: Correspondingly, this graph illustrates how the recall values change across different thresholds.
- **[Confusion Matrix](https://www.ultralytics.com/glossary/confusion-matrix) (`confusion_matrix.png`)**: The confusion matrix provides a detailed view of the outcomes, showcasing the counts of true positives, true negatives, false positives, and false negatives for each class.
- **Normalized Confusion Matrix (`confusion_matrix_normalized.png`)**: This visualization is a normalized version of the confusion matrix. It represents the data in proportions rather than raw counts. This format makes it simpler to compare the performance across classes.
- **Validation Batch Labels (`val_batchX_labels.jpg`)**: These images depict the ground truth labels for distinct batches from the validation dataset. They provide a clear picture of what the objects are and their respective locations as per the dataset.
- **Validation Batch Predictions (`val_batchX_pred.jpg`)**: Contrasting the label images, these visuals display the predictions made by the YOLO11 model for the respective batches. By comparing these to the label images, you can easily assess how well the model detects and classifies objects visually.
#### Results Storage
For future reference, the results are saved to a directory, typically named `runs/detect/val`.
## Choosing the Right Metrics
Choosing the right metrics to evaluate often depends on the specific application.
- **mAP:** Suitable for a broad assessment of model performance.
- **IoU:** Essential when precise object location is crucial.
- **Precision:** Important when minimizing false detections is a priority.
- **Recall:** Vital when it's important to detect every instance of an object.
- **F1 Score:** Useful when a balance between precision and recall is needed.
For real-time applications, speed metrics like FPS (Frames Per Second) and latency are crucial to ensure timely results.
## Interpretation of Results
It's important to understand the metrics. Here's what some of the commonly observed lower scores might suggest:
- **Low mAP:** Indicates the model may need general refinements.
- **Low IoU:** The model might be struggling to pinpoint objects accurately. Different bounding box methods could help.
- **Low Precision:** The model may be detecting too many non-existent objects. Adjusting confidence thresholds might reduce this.
- **Low Recall:** The model could be missing real objects. Improving [feature extraction](https://www.ultralytics.com/glossary/feature-extraction) or using more data might help.
- **Imbalanced F1 Score:** There's a disparity between precision and recall.
- **Class-specific AP:** Low scores here can highlight classes the model struggles with; see the per-class sketch below.
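To see which classes drag the overall score down, per-class AP can be read from the validation results. A minimal sketch, assuming the model and dataset share the same class indices:

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")
metrics = model.val(data="coco8.yaml")

# metrics.box.maps is an array of per-class mAP@0.50:0.95 values, indexed by class id
for class_id, class_map in enumerate(metrics.box.maps):
    print(f"{model.names[class_id]}: {class_map:.3f}")
```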
## Case Studies
Real-world examples can help clarify how these metrics work in practice.
### Case 1
- **Situation:** mAP and F1 Score are suboptimal; Recall is good, but Precision isn't.
- **Interpretation & Action:** There might be too many incorrect detections. Tightening confidence thresholds could reduce these, though it might also slightly decrease recall (see the sketch below).
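For example, a hedged sketch of raising the confidence threshold at inference time (the value below is purely illustrative):

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")

# A higher conf filters out low-confidence detections, trading some recall for precision
results = model.predict("path/to/image.jpg", conf=0.5)
```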
### Case 2
- **Situation:** mAP and Recall are acceptable, but IoU is lacking.
- **Interpretation & Action:** The model detects objects well but might not be localizing them precisely. Refining bounding box predictions might help.
### Case 3
- **Situation:** Some classes have a much lower AP than others, even with a decent overall mAP.
- **Interpretation & Action:** These classes might be more challenging for the model. Using more data for these classes or adjusting class weights during training could be beneficial.
## Connect and Collaborate
Tapping into a community of enthusiasts and experts can amplify your journey with YOLO11. Here are some avenues that can facilitate learning, troubleshooting, and networking.
### Engage with the Broader Community
- **GitHub Issues:** The YOLO11 repository on GitHub has an [Issues tab](https://github.com/ultralytics/ultralytics/issues) where you can ask questions, report bugs, and suggest new features. The community and maintainers are active here, and it's a great place to get help with specific problems.
- **Ultralytics Discord Server:** Ultralytics has a [Discord server](https://discord.com/invite/ultralytics) where you can interact with other users and the developers.
### Official Documentation and Resources
- **Ultralytics YOLO11 Docs:** The [official documentation](../index.md) provides a comprehensive overview of YOLO11, along with guides on installation, usage, and troubleshooting.
Using these resources will not only guide you through any challenges but also keep you updated with the latest trends and best practices in the YOLO11 community.
## Conclusion
In this guide, we've taken a close look at the essential performance metrics for YOLO11. These metrics are key to understanding how well a model is performing and are vital for anyone aiming to fine-tune their models. They offer the necessary insights for improvements and to make sure the model works effectively in real-life situations.
Remember, the YOLO11 and Ultralytics community is an invaluable asset. Engaging with fellow developers and experts can open doors to insights and solutions not found in standard documentation. As you journey through object detection, keep the spirit of learning alive, experiment with new strategies, and share your findings. By doing so, you contribute to the community's collective wisdom and ensure its growth.
Happy object detecting!
## FAQ
### What is the significance of [Mean Average Precision](https://www.ultralytics.com/glossary/mean-average-precision-map) (mAP) in evaluating YOLO11 model performance?
Mean Average Precision (mAP) is crucial for evaluating YOLO11 models as it provides a single metric encapsulating precision and recall across multiple classes. mAP@0.50 measures precision at an IoU threshold of 0.50, focusing on the model's ability to detect objects correctly. mAP@0.50:0.95 averages precision across a range of IoU thresholds, offering a comprehensive assessment of detection performance. High mAP scores indicate that the model effectively balances precision and recall, essential for applications like autonomous driving and surveillance.
### How do I interpret the Intersection over Union (IoU) value for YOLO11 object detection?
Intersection over Union (IoU) measures the overlap between the predicted and ground truth bounding boxes. IoU values range from 0 to 1, where higher values indicate better localization accuracy. An IoU of 1.0 means perfect alignment. Typically, an IoU threshold of 0.50 is used to define true positives in metrics like mAP. Lower IoU values suggest that the model struggles with precise object localization, which can be improved by refining bounding box regression or increasing annotation accuracy.
### Why is the F1 Score important for evaluating YOLO11 models in object detection?
The F1 Score is important for evaluating YOLO11 models because it provides a harmonic mean of precision and recall, balancing both false positives and false negatives. It is particularly valuable when dealing with imbalanced datasets or applications where either precision or recall alone is insufficient. A high F1 Score indicates that the model effectively detects objects while minimizing both missed detections and false alarms, making it suitable for critical applications like security systems and medical imaging.
### What are the key advantages of using Ultralytics YOLO11 for real-time object detection?
Ultralytics YOLO11 offers multiple advantages for real-time object detection:
- **Speed and Efficiency**: Optimized for high-speed inference, suitable for applications requiring low latency.
- **High Accuracy**: Advanced algorithms ensure high mAP and IoU scores, balancing precision and recall.
- **Flexibility**: Supports various tasks including object detection, segmentation, and classification.
- **Ease of Use**: User-friendly interfaces, extensive documentation, and seamless integration with platforms like Ultralytics HUB ([HUB Quickstart](../hub/quickstart.md)).
This makes YOLO11 ideal for diverse applications from autonomous vehicles to smart city solutions.
### How can validation metrics from YOLO11 help improve model performance?
Validation metrics from YOLO11 like precision, recall, mAP, and IoU help diagnose and improve model performance by providing insights into different aspects of detection:
- **Precision**: Helps identify and minimize false positives.
- **Recall**: Ensures all relevant objects are detected.
- **mAP**: Offers an overall performance snapshot, guiding general improvements.
By analyzing these metrics, specific weaknesses can be targeted, such as adjusting confidence thresholds to improve precision or gathering more diverse data to enhance recall. For detailed explanations of these metrics and how to interpret them, check [Object Detection Metrics](#object-detection-metrics).
---
description: Learn how to ensure thread-safe YOLO model inference in Python. Avoid race conditions and run your multi-threaded tasks reliably with best practices.
keywords: YOLO models, thread-safe, Python threading, model inference, concurrency, race conditions, multi-threaded, parallelism, Python GIL
---
# Thread-Safe Inference with YOLO Models
Running YOLO models in a multi-threaded environment requires careful consideration to ensure thread safety. Python's `threading` module allows you to run several threads concurrently, but when it comes to using YOLO models across these threads, there are important safety issues to be aware of. This page will guide you through creating thread-safe YOLO model inference.
## Understanding Python Threading
Python threads are a form of parallelism that allow your program to run multiple operations at once. However, Python's Global Interpreter Lock (GIL) means that only one thread can execute Python bytecode at a time.
<p align="center">
  <img width="800" src="https://github.com/ultralytics/docs/releases/download/0/single-vs-multi-thread-examples.avif" alt="Single vs Multi-Thread Examples">
</p>
While this sounds like a limitation, threads can still provide concurrency, especially for I/O-bound operations or when using operations that release the GIL, like those performed by YOLO's underlying C libraries.
## The Danger of Shared Model Instances
Instantiating a YOLO model outside your threads and sharing this instance across multiple threads can lead to race conditions, where the internal state of the model is inconsistently modified due to concurrent accesses. This is particularly problematic when the model or its components hold state that is not designed to be thread-safe.
### Non-Thread-Safe Example: Single Model Instance
When using threads in Python, it's important to recognize patterns that can lead to concurrency issues. Here is what you should avoid: sharing a single YOLO model instance across multiple threads.
```python
# Unsafe: Sharing a single model instance across threads
from threading import Thread

from ultralytics import YOLO

# Instantiate the model outside the thread
shared_model = YOLO("yolo11n.pt")


def predict(image_path):
    """Predicts objects in an image using a preloaded YOLO model; takes a path string to an image as its argument."""
    results = shared_model.predict(image_path)
    # Process results


# Starting threads that share the same model instance
Thread(target=predict, args=("image1.jpg",)).start()
Thread(target=predict, args=("image2.jpg",)).start()
```
In the example above, the `shared_model` is used by multiple threads, which can lead to unpredictable results because `predict` could be executed simultaneously by multiple threads.
### Non-Thread-Safe Example: Multiple Model Instances
Similarly, here is an unsafe pattern with multiple YOLO model instances:
```python
# Unsafe: Sharing multiple model instances across threads can still lead to issues
from threading import Thread

from ultralytics import YOLO

# Instantiate multiple models outside the thread
shared_model_1 = YOLO("yolo11n_1.pt")
shared_model_2 = YOLO("yolo11n_2.pt")


def predict(model, image_path):
    """Runs prediction on an image using a specified YOLO model, returning the results."""
    results = model.predict(image_path)
    # Process results


# Starting threads with individual model instances
Thread(target=predict, args=(shared_model_1, "image1.jpg")).start()
Thread(target=predict, args=(shared_model_2, "image2.jpg")).start()
```
Even though there are two separate model instances, the risk of concurrency issues still exists. If the internal implementation of `YOLO` is not thread-safe, using separate instances might not prevent race conditions, especially if these instances share any underlying resources or states that are not thread-local.
## Thread-Safe Inference
To perform thread-safe inference, you should instantiate a separate YOLO model within each thread. This ensures that each thread has its own isolated model instance, eliminating the risk of race conditions.
### Thread-Safe Example
Here's how to instantiate a YOLO model inside each thread for safe parallel inference:
```python
# Safe: Instantiating a single model inside each thread
from threading import Thread

from ultralytics import YOLO


def thread_safe_predict(image_path):
    """Predict on an image using a new YOLO model instance in a thread-safe manner; takes image path as input."""
    local_model = YOLO("yolo11n.pt")
    results = local_model.predict(image_path)
    # Process results


# Starting threads that each have their own model instance
Thread(target=thread_safe_predict, args=("image1.jpg",)).start()
Thread(target=thread_safe_predict, args=("image2.jpg",)).start()
```
In this example, each thread creates its own `YOLO` instance. This prevents any thread from interfering with the model state of another, thus ensuring that each thread performs inference safely and without unexpected interactions with the other threads.
## Conclusion
When using YOLO models with Python's `threading`, always instantiate your models within the thread that will use them to ensure thread safety. This practice avoids race conditions and makes sure that your inference tasks run reliably.
For more advanced scenarios and to further optimize your multi-threaded inference performance, consider using process-based parallelism with `multiprocessing` or leveraging a task queue with dedicated worker processes.
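As a starting point, here is a minimal sketch of process-based parallelism with `multiprocessing`; each worker process loads its own model, and the image paths are placeholders:

```python
from multiprocessing import Pool

from ultralytics import YOLO


def predict_in_process(image_path):
    """Run inference inside a worker process; nothing is shared between processes."""
    model = YOLO("yolo11n.pt")
    results = model.predict(image_path)
    return len(results[0].boxes)  # return something small and picklable


if __name__ == "__main__":
    image_paths = ["image1.jpg", "image2.jpg", "image3.jpg"]
    with Pool(processes=2) as pool:
        detections = pool.map(predict_in_process, image_paths)
    print(detections)
```

Loading the model on every call keeps the sketch simple; for sustained workloads you would typically load it once per process, for example via a `Pool` initializer.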
## FAQ
### How can I avoid race conditions when using YOLO models in a multi-threaded Python environment?
To prevent race conditions when using Ultralytics YOLO models in a multi-threaded Python environment, instantiate a separate YOLO model within each thread. This ensures that each thread has its own isolated model instance, avoiding concurrent modification of the model state.
Example:
```python
from threading import Thread

from ultralytics import YOLO


def thread_safe_predict(image_path):
    """Predict on an image in a thread-safe manner."""
    local_model = YOLO("yolo11n.pt")
    results = local_model.predict(image_path)
    # Process results
```
For additional context, refer to the section on [Thread-Safe Inference](#thread-safe-inference).
### Why should each thread have its own YOLO model instance?
Each thread should have its own YOLO model instance to prevent race conditions. When a single model instance is shared among multiple threads, concurrent accesses can lead to unpredictable behavior and modifications of the model's internal state. By using separate instances, you ensure thread isolation, making your multi-threaded tasks reliable and safe.
For detailed guidance, check the [Non-Thread-Safe Example: Single Model Instance](#non-thread-safe-example-single-model-instance) and [Thread-Safe Example](#thread-safe-example) sections.
### How does Python's Global Interpreter Lock (GIL) affect YOLO model inference?
Python's Global Interpreter Lock (GIL) allows only one thread to execute Python bytecode at a time, which can limit the performance of CPU-bound multi-threading tasks. However, for I/O-bound operations or processes that use libraries releasing the GIL, like YOLO's C libraries, you can still achieve concurrency. For enhanced performance, consider using process-based parallelism with Python's `multiprocessing` module.
For more about threading in Python, see the [Understanding Python Threading](#understanding-python-threading) section.
### Is it safer to use process-based parallelism instead of threading for YOLO model inference?
Yes, using Python's `multiprocessing` module is safer and often more efficient for running YOLO model inference in parallel. Process-based parallelism creates separate memory spaces, avoiding the Global Interpreter Lock (GIL) and reducing the risk of concurrency issues. Each process will operate independently with its own YOLO model instance.
For further details on process-based parallelism with YOLO models, refer to the page on [Thread-Safe Inference](#thread-safe-inference).