Commit 7a650e36 authored by mashun1's avatar mashun1
Browse files

yolov5-qat

parents
Pipeline #821 canceled with stages
<img src="https://cdn.comet.ml/img/notebook_logo.png">
# YOLOv5 with Comet
This guide will cover how to use YOLOv5 with [Comet](https://bit.ly/yolov5-readme-comet2)
# About Comet
Comet builds tools that help data scientists, engineers, and team leaders accelerate and optimize machine learning and deep learning models.
Track and visualize model metrics in real time, save your hyperparameters, datasets, and model checkpoints, and visualize your model predictions with [Comet Custom Panels](https://www.comet.com/docs/v2/guides/comet-dashboard/code-panels/about-panels/?utm_source=yolov5&utm_medium=partner&utm_campaign=partner_yolov5_2022&utm_content=github)! Comet makes sure you never lose track of your work and makes it easy to share results and collaborate across teams of all sizes!
# Getting Started
## Install Comet
```shell
pip install comet_ml
```
## Configure Comet Credentials
There are two ways to configure Comet with YOLOv5.
You can either set your credentials through environment variables
**Environment Variables**
```shell
export COMET_API_KEY=<Your Comet API Key>
export COMET_PROJECT_NAME=<Your Comet Project Name> # This will default to 'yolov5'
```
Or create a `.comet.config` file in your working directory and set your credentials there.
**Comet Configuration File**
```
[comet]
api_key=<Your Comet API Key>
project_name=<Your Comet Project Name> # This will default to 'yolov5'
```
## Run the Training Script
```shell
# Train YOLOv5s on COCO128 for 5 epochs
python train.py --img 640 --batch 16 --epochs 5 --data coco128.yaml --weights yolov5s.pt
```
That's it! Comet will automatically log your hyperparameters, command line arguments, training and validation metrics. You can visualize and analyze your runs in the Comet UI
<img width="1920" alt="yolo-ui" src="https://user-images.githubusercontent.com/26833433/202851203-164e94e1-2238-46dd-91f8-de020e9d6b41.png">
# Try out an Example!
Check out an example of a [completed run here](https://www.comet.com/examples/comet-example-yolov5/a0e29e0e9b984e4a822db2a62d0cb357?experiment-tab=chart&showOutliers=true&smoothing=0&transformY=smoothing&xAxis=step&utm_source=yolov5&utm_medium=partner&utm_campaign=partner_yolov5_2022&utm_content=github)
Or better yet, try it out yourself in this Colab Notebook
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/comet-ml/comet-examples/blob/master/integrations/model-training/yolov5/notebooks/Comet_and_YOLOv5.ipynb)
# Log automatically
By default, Comet will log the following items
## Metrics
- Box Loss, Object Loss, Classification Loss for the training and validation data
- mAP_0.5, mAP_0.5:0.95 metrics for the validation data.
- Precision and Recall for the validation data
## Parameters
- Model Hyperparameters
- All parameters passed through the command line options
## Visualizations
- Confusion Matrix of the model predictions on the validation data
- Plots for the PR and F1 curves across all classes
- Correlogram of the Class Labels
# Configure Comet Logging
Comet can be configured to log additional data either through command line flags passed to the training script or through environment variables.
```shell
export COMET_MODE=online # Set whether to run Comet in 'online' or 'offline' mode. Defaults to online
export COMET_MODEL_NAME=<your model name> #Set the name for the saved model. Defaults to yolov5
export COMET_LOG_CONFUSION_MATRIX=false # Set to disable logging a Comet Confusion Matrix. Defaults to true
export COMET_MAX_IMAGE_UPLOADS=<number of allowed images to upload to Comet> # Controls how many total image predictions to log to Comet. Defaults to 100.
export COMET_LOG_PER_CLASS_METRICS=true # Set to log evaluation metrics for each detected class at the end of training. Defaults to false
export COMET_DEFAULT_CHECKPOINT_FILENAME=<your checkpoint filename> # Set this if you would like to resume training from a different checkpoint. Defaults to 'last.pt'
export COMET_LOG_BATCH_LEVEL_METRICS=true # Set this if you would like to log training metrics at the batch level. Defaults to false.
export COMET_LOG_PREDICTIONS=true # Set this to false to disable logging model predictions
```
## Logging Checkpoints with Comet
Logging Models to Comet is disabled by default. To enable it, pass the `save-period` argument to the training script. This will save the logged checkpoints to Comet based on the interval value provided by `save-period`
```shell
python train.py \
--img 640 \
--batch 16 \
--epochs 5 \
--data coco128.yaml \
--weights yolov5s.pt \
--save-period 1
```
## Logging Model Predictions
By default, model predictions (images, ground truth labels and bounding boxes) will be logged to Comet.
You can control the frequency of logged predictions and the associated images by passing the `bbox_interval` command line argument. Predictions can be visualized using Comet's Object Detection Custom Panel. This frequency corresponds to every Nth batch of data per epoch. In the example below, we are logging every 2nd batch of data for each epoch.
**Note:** The YOLOv5 validation dataloader will default to a batch size of 32, so you will have to set the logging frequency accordingly.
Here is an [example project using the Panel](https://www.comet.com/examples/comet-example-yolov5?shareable=YcwMiJaZSXfcEXpGOHDD12vA1&utm_source=yolov5&utm_medium=partner&utm_campaign=partner_yolov5_2022&utm_content=github)
```shell
python train.py \
--img 640 \
--batch 16 \
--epochs 5 \
--data coco128.yaml \
--weights yolov5s.pt \
--bbox_interval 2
```
### Controlling the number of Prediction Images logged to Comet
When logging predictions from YOLOv5, Comet will log the images associated with each set of predictions. By default a maximum of 100 validation images are logged. You can increase or decrease this number using the `COMET_MAX_IMAGE_UPLOADS` environment variable.
```shell
env COMET_MAX_IMAGE_UPLOADS=200 python train.py \
--img 640 \
--batch 16 \
--epochs 5 \
--data coco128.yaml \
--weights yolov5s.pt \
--bbox_interval 1
```
### Logging Class Level Metrics
Use the `COMET_LOG_PER_CLASS_METRICS` environment variable to log mAP, precision, recall, f1 for each class.
```shell
env COMET_LOG_PER_CLASS_METRICS=true python train.py \
--img 640 \
--batch 16 \
--epochs 5 \
--data coco128.yaml \
--weights yolov5s.pt
```
## Uploading a Dataset to Comet Artifacts
If you would like to store your data using [Comet Artifacts](https://www.comet.com/docs/v2/guides/data-management/using-artifacts/#learn-more?utm_source=yolov5&utm_medium=partner&utm_campaign=partner_yolov5_2022&utm_content=github), you can do so using the `upload_dataset` flag.
The dataset be organized in the way described in the [YOLOv5 documentation](https://docs.ultralytics.com/yolov5/tutorials/train_custom_data/). The dataset config `yaml` file must follow the same format as that of the `coco128.yaml` file.
```shell
python train.py \
--img 640 \
--batch 16 \
--epochs 5 \
--data coco128.yaml \
--weights yolov5s.pt \
--upload_dataset
```
You can find the uploaded dataset in the Artifacts tab in your Comet Workspace <img width="1073" alt="artifact-1" src="https://user-images.githubusercontent.com/7529846/186929193-162718bf-ec7b-4eb9-8c3b-86b3763ef8ea.png">
You can preview the data directly in the Comet UI. <img width="1082" alt="artifact-2" src="https://user-images.githubusercontent.com/7529846/186929215-432c36a9-c109-4eb0-944b-84c2786590d6.png">
Artifacts are versioned and also support adding metadata about the dataset. Comet will automatically log the metadata from your dataset `yaml` file <img width="963" alt="artifact-3" src="https://user-images.githubusercontent.com/7529846/186929256-9d44d6eb-1a19-42de-889a-bcbca3018f2e.png">
### Using a saved Artifact
If you would like to use a dataset from Comet Artifacts, set the `path` variable in your dataset `yaml` file to point to the following Artifact resource URL.
```
# contents of artifact.yaml file
path: "comet://<workspace name>/<artifact name>:<artifact version or alias>"
```
Then pass this file to your training script in the following way
```shell
python train.py \
--img 640 \
--batch 16 \
--epochs 5 \
--data artifact.yaml \
--weights yolov5s.pt
```
Artifacts also allow you to track the lineage of data as it flows through your Experimentation workflow. Here you can see a graph that shows you all the experiments that have used your uploaded dataset. <img width="1391" alt="artifact-4" src="https://user-images.githubusercontent.com/7529846/186929264-4c4014fa-fe51-4f3c-a5c5-f6d24649b1b4.png">
## Resuming a Training Run
If your training run is interrupted for any reason, e.g. disrupted internet connection, you can resume the run using the `resume` flag and the Comet Run Path.
The Run Path has the following format `comet://<your workspace name>/<your project name>/<experiment id>`.
This will restore the run to its state before the interruption, which includes restoring the model from a checkpoint, restoring all hyperparameters and training arguments and downloading Comet dataset Artifacts if they were used in the original run. The resumed run will continue logging to the existing Experiment in the Comet UI
```shell
python train.py \
--resume "comet://<your run path>"
```
## Hyperparameter Search with the Comet Optimizer
YOLOv5 is also integrated with Comet's Optimizer, making is simple to visualize hyperparameter sweeps in the Comet UI.
### Configuring an Optimizer Sweep
To configure the Comet Optimizer, you will have to create a JSON file with the information about the sweep. An example file has been provided in `utils/loggers/comet/optimizer_config.json`
```shell
python utils/loggers/comet/hpo.py \
--comet_optimizer_config "utils/loggers/comet/optimizer_config.json"
```
The `hpo.py` script accepts the same arguments as `train.py`. If you wish to pass additional arguments to your sweep simply add them after the script.
```shell
python utils/loggers/comet/hpo.py \
--comet_optimizer_config "utils/loggers/comet/optimizer_config.json" \
--save-period 1 \
--bbox_interval 1
```
### Running a Sweep in Parallel
```shell
comet optimizer -j <set number of workers> utils/loggers/comet/hpo.py \
utils/loggers/comet/optimizer_config.json"
```
### Visualizing Results
Comet provides a number of ways to visualize the results of your sweep. Take a look at a [project with a completed sweep here](https://www.comet.com/examples/comet-example-yolov5/view/PrlArHGuuhDTKC1UuBmTtOSXD/panels?utm_source=yolov5&utm_medium=partner&utm_campaign=partner_yolov5_2022&utm_content=github)
<img width="1626" alt="hyperparameter-yolo" src="https://user-images.githubusercontent.com/7529846/186914869-7dc1de14-583f-4323-967b-c9a66a29e495.png">
import glob
import json
import logging
import os
import sys
from pathlib import Path
logger = logging.getLogger(__name__)
FILE = Path(__file__).resolve()
ROOT = FILE.parents[3] # YOLOv5 root directory
if str(ROOT) not in sys.path:
sys.path.append(str(ROOT)) # add ROOT to PATH
try:
import comet_ml
# Project Configuration
config = comet_ml.config.get_config()
COMET_PROJECT_NAME = config.get_string(os.getenv("COMET_PROJECT_NAME"), "comet.project_name", default="yolov5")
except ImportError:
comet_ml = None
COMET_PROJECT_NAME = None
import PIL
import torch
import torchvision.transforms as T
import yaml
from utils.dataloaders import img2label_paths
from utils.general import check_dataset, scale_boxes, xywh2xyxy
from utils.metrics import box_iou
COMET_PREFIX = "comet://"
COMET_MODE = os.getenv("COMET_MODE", "online")
# Model Saving Settings
COMET_MODEL_NAME = os.getenv("COMET_MODEL_NAME", "yolov5")
# Dataset Artifact Settings
COMET_UPLOAD_DATASET = os.getenv("COMET_UPLOAD_DATASET", "false").lower() == "true"
# Evaluation Settings
COMET_LOG_CONFUSION_MATRIX = os.getenv("COMET_LOG_CONFUSION_MATRIX", "true").lower() == "true"
COMET_LOG_PREDICTIONS = os.getenv("COMET_LOG_PREDICTIONS", "true").lower() == "true"
COMET_MAX_IMAGE_UPLOADS = int(os.getenv("COMET_MAX_IMAGE_UPLOADS", 100))
# Confusion Matrix Settings
CONF_THRES = float(os.getenv("CONF_THRES", 0.001))
IOU_THRES = float(os.getenv("IOU_THRES", 0.6))
# Batch Logging Settings
COMET_LOG_BATCH_METRICS = os.getenv("COMET_LOG_BATCH_METRICS", "false").lower() == "true"
COMET_BATCH_LOGGING_INTERVAL = os.getenv("COMET_BATCH_LOGGING_INTERVAL", 1)
COMET_PREDICTION_LOGGING_INTERVAL = os.getenv("COMET_PREDICTION_LOGGING_INTERVAL", 1)
COMET_LOG_PER_CLASS_METRICS = os.getenv("COMET_LOG_PER_CLASS_METRICS", "false").lower() == "true"
RANK = int(os.getenv("RANK", -1))
to_pil = T.ToPILImage()
class CometLogger:
"""Log metrics, parameters, source code, models and much more with Comet."""
def __init__(self, opt, hyp, run_id=None, job_type="Training", **experiment_kwargs) -> None:
self.job_type = job_type
self.opt = opt
self.hyp = hyp
# Comet Flags
self.comet_mode = COMET_MODE
self.save_model = opt.save_period > -1
self.model_name = COMET_MODEL_NAME
# Batch Logging Settings
self.log_batch_metrics = COMET_LOG_BATCH_METRICS
self.comet_log_batch_interval = COMET_BATCH_LOGGING_INTERVAL
# Dataset Artifact Settings
self.upload_dataset = self.opt.upload_dataset or COMET_UPLOAD_DATASET
self.resume = self.opt.resume
# Default parameters to pass to Experiment objects
self.default_experiment_kwargs = {
"log_code": False,
"log_env_gpu": True,
"log_env_cpu": True,
"project_name": COMET_PROJECT_NAME,
}
self.default_experiment_kwargs.update(experiment_kwargs)
self.experiment = self._get_experiment(self.comet_mode, run_id)
self.experiment.set_name(self.opt.name)
self.data_dict = self.check_dataset(self.opt.data)
self.class_names = self.data_dict["names"]
self.num_classes = self.data_dict["nc"]
self.logged_images_count = 0
self.max_images = COMET_MAX_IMAGE_UPLOADS
if run_id is None:
self.experiment.log_other("Created from", "YOLOv5")
if not isinstance(self.experiment, comet_ml.OfflineExperiment):
workspace, project_name, experiment_id = self.experiment.url.split("/")[-3:]
self.experiment.log_other(
"Run Path",
f"{workspace}/{project_name}/{experiment_id}",
)
self.log_parameters(vars(opt))
self.log_parameters(self.opt.hyp)
self.log_asset_data(
self.opt.hyp,
name="hyperparameters.json",
metadata={"type": "hyp-config-file"},
)
self.log_asset(
f"{self.opt.save_dir}/opt.yaml",
metadata={"type": "opt-config-file"},
)
self.comet_log_confusion_matrix = COMET_LOG_CONFUSION_MATRIX
if hasattr(self.opt, "conf_thres"):
self.conf_thres = self.opt.conf_thres
else:
self.conf_thres = CONF_THRES
if hasattr(self.opt, "iou_thres"):
self.iou_thres = self.opt.iou_thres
else:
self.iou_thres = IOU_THRES
self.log_parameters({"val_iou_threshold": self.iou_thres, "val_conf_threshold": self.conf_thres})
self.comet_log_predictions = COMET_LOG_PREDICTIONS
if self.opt.bbox_interval == -1:
self.comet_log_prediction_interval = 1 if self.opt.epochs < 10 else self.opt.epochs // 10
else:
self.comet_log_prediction_interval = self.opt.bbox_interval
if self.comet_log_predictions:
self.metadata_dict = {}
self.logged_image_names = []
self.comet_log_per_class_metrics = COMET_LOG_PER_CLASS_METRICS
self.experiment.log_others(
{
"comet_mode": COMET_MODE,
"comet_max_image_uploads": COMET_MAX_IMAGE_UPLOADS,
"comet_log_per_class_metrics": COMET_LOG_PER_CLASS_METRICS,
"comet_log_batch_metrics": COMET_LOG_BATCH_METRICS,
"comet_log_confusion_matrix": COMET_LOG_CONFUSION_MATRIX,
"comet_model_name": COMET_MODEL_NAME,
}
)
# Check if running the Experiment with the Comet Optimizer
if hasattr(self.opt, "comet_optimizer_id"):
self.experiment.log_other("optimizer_id", self.opt.comet_optimizer_id)
self.experiment.log_other("optimizer_objective", self.opt.comet_optimizer_objective)
self.experiment.log_other("optimizer_metric", self.opt.comet_optimizer_metric)
self.experiment.log_other("optimizer_parameters", json.dumps(self.hyp))
def _get_experiment(self, mode, experiment_id=None):
"""Returns a new or existing Comet.ml experiment based on mode and optional experiment_id."""
if mode == "offline":
return (
comet_ml.ExistingOfflineExperiment(
previous_experiment=experiment_id,
**self.default_experiment_kwargs,
)
if experiment_id is not None
else comet_ml.OfflineExperiment(
**self.default_experiment_kwargs,
)
)
try:
if experiment_id is not None:
return comet_ml.ExistingExperiment(
previous_experiment=experiment_id,
**self.default_experiment_kwargs,
)
return comet_ml.Experiment(**self.default_experiment_kwargs)
except ValueError:
logger.warning(
"COMET WARNING: "
"Comet credentials have not been set. "
"Comet will default to offline logging. "
"Please set your credentials to enable online logging."
)
return self._get_experiment("offline", experiment_id)
return
def log_metrics(self, log_dict, **kwargs):
"""Logs metrics to the current experiment, accepting a dictionary of metric names and values."""
self.experiment.log_metrics(log_dict, **kwargs)
def log_parameters(self, log_dict, **kwargs):
"""Logs parameters to the current experiment, accepting a dictionary of parameter names and values."""
self.experiment.log_parameters(log_dict, **kwargs)
def log_asset(self, asset_path, **kwargs):
"""Logs a file or directory as an asset to the current experiment."""
self.experiment.log_asset(asset_path, **kwargs)
def log_asset_data(self, asset, **kwargs):
"""Logs in-memory data as an asset to the current experiment, with optional kwargs."""
self.experiment.log_asset_data(asset, **kwargs)
def log_image(self, img, **kwargs):
"""Logs an image to the current experiment with optional kwargs."""
self.experiment.log_image(img, **kwargs)
def log_model(self, path, opt, epoch, fitness_score, best_model=False):
"""Logs model checkpoint to experiment with path, options, epoch, fitness, and best model flag."""
if not self.save_model:
return
model_metadata = {
"fitness_score": fitness_score[-1],
"epochs_trained": epoch + 1,
"save_period": opt.save_period,
"total_epochs": opt.epochs,
}
model_files = glob.glob(f"{path}/*.pt")
for model_path in model_files:
name = Path(model_path).name
self.experiment.log_model(
self.model_name,
file_or_folder=model_path,
file_name=name,
metadata=model_metadata,
overwrite=True,
)
def check_dataset(self, data_file):
"""Validates the dataset configuration by loading the YAML file specified in `data_file`."""
with open(data_file) as f:
data_config = yaml.safe_load(f)
path = data_config.get("path")
if path and path.startswith(COMET_PREFIX):
path = data_config["path"].replace(COMET_PREFIX, "")
return self.download_dataset_artifact(path)
self.log_asset(self.opt.data, metadata={"type": "data-config-file"})
return check_dataset(data_file)
def log_predictions(self, image, labelsn, path, shape, predn):
"""Logs predictions with IOU filtering, given image, labels, path, shape, and predictions."""
if self.logged_images_count >= self.max_images:
return
detections = predn[predn[:, 4] > self.conf_thres]
iou = box_iou(labelsn[:, 1:], detections[:, :4])
mask, _ = torch.where(iou > self.iou_thres)
if len(mask) == 0:
return
filtered_detections = detections[mask]
filtered_labels = labelsn[mask]
image_id = path.split("/")[-1].split(".")[0]
image_name = f"{image_id}_curr_epoch_{self.experiment.curr_epoch}"
if image_name not in self.logged_image_names:
native_scale_image = PIL.Image.open(path)
self.log_image(native_scale_image, name=image_name)
self.logged_image_names.append(image_name)
metadata = [
{
"label": f"{self.class_names[int(cls)]}-gt",
"score": 100,
"box": {"x": xyxy[0], "y": xyxy[1], "x2": xyxy[2], "y2": xyxy[3]},
}
for cls, *xyxy in filtered_labels.tolist()
]
metadata.extend(
{
"label": f"{self.class_names[int(cls)]}",
"score": conf * 100,
"box": {"x": xyxy[0], "y": xyxy[1], "x2": xyxy[2], "y2": xyxy[3]},
}
for *xyxy, conf, cls in filtered_detections.tolist()
)
self.metadata_dict[image_name] = metadata
self.logged_images_count += 1
return
def preprocess_prediction(self, image, labels, shape, pred):
"""Processes prediction data, resizing labels and adding dataset metadata."""
nl, _ = labels.shape[0], pred.shape[0]
# Predictions
if self.opt.single_cls:
pred[:, 5] = 0
predn = pred.clone()
scale_boxes(image.shape[1:], predn[:, :4], shape[0], shape[1])
labelsn = None
if nl:
tbox = xywh2xyxy(labels[:, 1:5]) # target boxes
scale_boxes(image.shape[1:], tbox, shape[0], shape[1]) # native-space labels
labelsn = torch.cat((labels[:, 0:1], tbox), 1) # native-space labels
scale_boxes(image.shape[1:], predn[:, :4], shape[0], shape[1]) # native-space pred
return predn, labelsn
def add_assets_to_artifact(self, artifact, path, asset_path, split):
"""Adds image and label assets to a wandb artifact given dataset split and paths."""
img_paths = sorted(glob.glob(f"{asset_path}/*"))
label_paths = img2label_paths(img_paths)
for image_file, label_file in zip(img_paths, label_paths):
image_logical_path, label_logical_path = map(lambda x: os.path.relpath(x, path), [image_file, label_file])
try:
artifact.add(
image_file,
logical_path=image_logical_path,
metadata={"split": split},
)
artifact.add(
label_file,
logical_path=label_logical_path,
metadata={"split": split},
)
except ValueError as e:
logger.error("COMET ERROR: Error adding file to Artifact. Skipping file.")
logger.error(f"COMET ERROR: {e}")
continue
return artifact
def upload_dataset_artifact(self):
"""Uploads a YOLOv5 dataset as an artifact to the Comet.ml platform."""
dataset_name = self.data_dict.get("dataset_name", "yolov5-dataset")
path = str((ROOT / Path(self.data_dict["path"])).resolve())
metadata = self.data_dict.copy()
for key in ["train", "val", "test"]:
split_path = metadata.get(key)
if split_path is not None:
metadata[key] = split_path.replace(path, "")
artifact = comet_ml.Artifact(name=dataset_name, artifact_type="dataset", metadata=metadata)
for key in metadata.keys():
if key in ["train", "val", "test"]:
if isinstance(self.upload_dataset, str) and (key != self.upload_dataset):
continue
asset_path = self.data_dict.get(key)
if asset_path is not None:
artifact = self.add_assets_to_artifact(artifact, path, asset_path, key)
self.experiment.log_artifact(artifact)
return
def download_dataset_artifact(self, artifact_path):
"""Downloads a dataset artifact to a specified directory using the experiment's logged artifact."""
logged_artifact = self.experiment.get_artifact(artifact_path)
artifact_save_dir = str(Path(self.opt.save_dir) / logged_artifact.name)
logged_artifact.download(artifact_save_dir)
metadata = logged_artifact.metadata
data_dict = metadata.copy()
data_dict["path"] = artifact_save_dir
metadata_names = metadata.get("names")
if isinstance(metadata_names, dict):
data_dict["names"] = {int(k): v for k, v in metadata.get("names").items()}
elif isinstance(metadata_names, list):
data_dict["names"] = {int(k): v for k, v in zip(range(len(metadata_names)), metadata_names)}
else:
raise "Invalid 'names' field in dataset yaml file. Please use a list or dictionary"
return self.update_data_paths(data_dict)
def update_data_paths(self, data_dict):
"""Updates data paths in the dataset dictionary, defaulting 'path' to an empty string if not present."""
path = data_dict.get("path", "")
for split in ["train", "val", "test"]:
if data_dict.get(split):
split_path = data_dict.get(split)
data_dict[split] = (
f"{path}/{split_path}" if isinstance(split, str) else [f"{path}/{x}" for x in split_path]
)
return data_dict
def on_pretrain_routine_end(self, paths):
"""Called at the end of pretraining routine to handle paths if training is not being resumed."""
if self.opt.resume:
return
for path in paths:
self.log_asset(str(path))
if self.upload_dataset and not self.resume:
self.upload_dataset_artifact()
return
def on_train_start(self):
"""Logs hyperparameters at the start of training."""
self.log_parameters(self.hyp)
def on_train_epoch_start(self):
"""Called at the start of each training epoch."""
return
def on_train_epoch_end(self, epoch):
"""Updates the current epoch in the experiment tracking at the end of each epoch."""
self.experiment.curr_epoch = epoch
return
def on_train_batch_start(self):
"""Called at the start of each training batch."""
return
def on_train_batch_end(self, log_dict, step):
"""Callback function that updates and logs metrics at the end of each training batch if conditions are met."""
self.experiment.curr_step = step
if self.log_batch_metrics and (step % self.comet_log_batch_interval == 0):
self.log_metrics(log_dict, step=step)
return
def on_train_end(self, files, save_dir, last, best, epoch, results):
"""Logs metadata and optionally saves model files at the end of training."""
if self.comet_log_predictions:
curr_epoch = self.experiment.curr_epoch
self.experiment.log_asset_data(self.metadata_dict, "image-metadata.json", epoch=curr_epoch)
for f in files:
self.log_asset(f, metadata={"epoch": epoch})
self.log_asset(f"{save_dir}/results.csv", metadata={"epoch": epoch})
if not self.opt.evolve:
model_path = str(best if best.exists() else last)
name = Path(model_path).name
if self.save_model:
self.experiment.log_model(
self.model_name,
file_or_folder=model_path,
file_name=name,
overwrite=True,
)
# Check if running Experiment with Comet Optimizer
if hasattr(self.opt, "comet_optimizer_id"):
metric = results.get(self.opt.comet_optimizer_metric)
self.experiment.log_other("optimizer_metric_value", metric)
self.finish_run()
def on_val_start(self):
"""Called at the start of validation, currently a placeholder with no functionality."""
return
def on_val_batch_start(self):
"""Placeholder called at the start of a validation batch with no current functionality."""
return
def on_val_batch_end(self, batch_i, images, targets, paths, shapes, outputs):
"""Callback executed at the end of a validation batch, conditionally logs predictions to Comet ML."""
if not (self.comet_log_predictions and ((batch_i + 1) % self.comet_log_prediction_interval == 0)):
return
for si, pred in enumerate(outputs):
if len(pred) == 0:
continue
image = images[si]
labels = targets[targets[:, 0] == si, 1:]
shape = shapes[si]
path = paths[si]
predn, labelsn = self.preprocess_prediction(image, labels, shape, pred)
if labelsn is not None:
self.log_predictions(image, labelsn, path, shape, predn)
return
def on_val_end(self, nt, tp, fp, p, r, f1, ap, ap50, ap_class, confusion_matrix):
"""Logs per-class metrics to Comet.ml after validation if enabled and more than one class exists."""
if self.comet_log_per_class_metrics and self.num_classes > 1:
for i, c in enumerate(ap_class):
class_name = self.class_names[c]
self.experiment.log_metrics(
{
"mAP@.5": ap50[i],
"mAP@.5:.95": ap[i],
"precision": p[i],
"recall": r[i],
"f1": f1[i],
"true_positives": tp[i],
"false_positives": fp[i],
"support": nt[c],
},
prefix=class_name,
)
if self.comet_log_confusion_matrix:
epoch = self.experiment.curr_epoch
class_names = list(self.class_names.values())
class_names.append("background")
num_classes = len(class_names)
self.experiment.log_confusion_matrix(
matrix=confusion_matrix.matrix,
max_categories=num_classes,
labels=class_names,
epoch=epoch,
column_label="Actual Category",
row_label="Predicted Category",
file_name=f"confusion-matrix-epoch-{epoch}.json",
)
def on_fit_epoch_end(self, result, epoch):
"""Logs metrics at the end of each training epoch."""
self.log_metrics(result, epoch=epoch)
def on_model_save(self, last, epoch, final_epoch, best_fitness, fi):
"""Callback to save model checkpoints periodically if conditions are met."""
if ((epoch + 1) % self.opt.save_period == 0 and not final_epoch) and self.opt.save_period != -1:
self.log_model(last.parent, self.opt, epoch, fi, best_model=best_fitness == fi)
def on_params_update(self, params):
"""Logs updated parameters during training."""
self.log_parameters(params)
def finish_run(self):
"""Ends the current experiment and logs its completion."""
self.experiment.end()
import logging
import os
from urllib.parse import urlparse
try:
import comet_ml
except ImportError:
comet_ml = None
import yaml
logger = logging.getLogger(__name__)
COMET_PREFIX = "comet://"
COMET_MODEL_NAME = os.getenv("COMET_MODEL_NAME", "yolov5")
COMET_DEFAULT_CHECKPOINT_FILENAME = os.getenv("COMET_DEFAULT_CHECKPOINT_FILENAME", "last.pt")
def download_model_checkpoint(opt, experiment):
"""Downloads YOLOv5 model checkpoint from Comet ML experiment, updating `opt.weights` with download path."""
model_dir = f"{opt.project}/{experiment.name}"
os.makedirs(model_dir, exist_ok=True)
model_name = COMET_MODEL_NAME
model_asset_list = experiment.get_model_asset_list(model_name)
if len(model_asset_list) == 0:
logger.error(f"COMET ERROR: No checkpoints found for model name : {model_name}")
return
model_asset_list = sorted(
model_asset_list,
key=lambda x: x["step"],
reverse=True,
)
logged_checkpoint_map = {asset["fileName"]: asset["assetId"] for asset in model_asset_list}
resource_url = urlparse(opt.weights)
checkpoint_filename = resource_url.query
if checkpoint_filename:
asset_id = logged_checkpoint_map.get(checkpoint_filename)
else:
asset_id = logged_checkpoint_map.get(COMET_DEFAULT_CHECKPOINT_FILENAME)
checkpoint_filename = COMET_DEFAULT_CHECKPOINT_FILENAME
if asset_id is None:
logger.error(f"COMET ERROR: Checkpoint {checkpoint_filename} not found in the given Experiment")
return
try:
logger.info(f"COMET INFO: Downloading checkpoint {checkpoint_filename}")
asset_filename = checkpoint_filename
model_binary = experiment.get_asset(asset_id, return_type="binary", stream=False)
model_download_path = f"{model_dir}/{asset_filename}"
with open(model_download_path, "wb") as f:
f.write(model_binary)
opt.weights = model_download_path
except Exception as e:
logger.warning("COMET WARNING: Unable to download checkpoint from Comet")
logger.exception(e)
def set_opt_parameters(opt, experiment):
"""
Update the opts Namespace with parameters from Comet's ExistingExperiment when resuming a run.
Args:
opt (argparse.Namespace): Namespace of command line options
experiment (comet_ml.APIExperiment): Comet API Experiment object
"""
asset_list = experiment.get_asset_list()
resume_string = opt.resume
for asset in asset_list:
if asset["fileName"] == "opt.yaml":
asset_id = asset["assetId"]
asset_binary = experiment.get_asset(asset_id, return_type="binary", stream=False)
opt_dict = yaml.safe_load(asset_binary)
for key, value in opt_dict.items():
setattr(opt, key, value)
opt.resume = resume_string
# Save hyperparameters to YAML file
# Necessary to pass checks in training script
save_dir = f"{opt.project}/{experiment.name}"
os.makedirs(save_dir, exist_ok=True)
hyp_yaml_path = f"{save_dir}/hyp.yaml"
with open(hyp_yaml_path, "w") as f:
yaml.dump(opt.hyp, f)
opt.hyp = hyp_yaml_path
def check_comet_weights(opt):
"""
Downloads model weights from Comet and updates the weights path to point to saved weights location.
Args:
opt (argparse.Namespace): Command Line arguments passed
to YOLOv5 training script
Returns:
None/bool: Return True if weights are successfully downloaded
else return None
"""
if comet_ml is None:
return
if isinstance(opt.weights, str) and opt.weights.startswith(COMET_PREFIX):
api = comet_ml.API()
resource = urlparse(opt.weights)
experiment_path = f"{resource.netloc}{resource.path}"
experiment = api.get(experiment_path)
download_model_checkpoint(opt, experiment)
return True
return None
def check_comet_resume(opt):
"""
Restores run parameters to its original state based on the model checkpoint and logged Experiment parameters.
Args:
opt (argparse.Namespace): Command Line arguments passed
to YOLOv5 training script
Returns:
None/bool: Return True if the run is restored successfully
else return None
"""
if comet_ml is None:
return
if isinstance(opt.resume, str) and opt.resume.startswith(COMET_PREFIX):
api = comet_ml.API()
resource = urlparse(opt.resume)
experiment_path = f"{resource.netloc}{resource.path}"
experiment = api.get(experiment_path)
set_opt_parameters(opt, experiment)
download_model_checkpoint(opt, experiment)
return True
return None
import argparse
import json
import logging
import os
import sys
from pathlib import Path
import comet_ml
logger = logging.getLogger(__name__)
FILE = Path(__file__).resolve()
ROOT = FILE.parents[3] # YOLOv5 root directory
if str(ROOT) not in sys.path:
sys.path.append(str(ROOT)) # add ROOT to PATH
from train import train
from utils.callbacks import Callbacks
from utils.general import increment_path
from utils.torch_utils import select_device
# Project Configuration
config = comet_ml.config.get_config()
COMET_PROJECT_NAME = config.get_string(os.getenv("COMET_PROJECT_NAME"), "comet.project_name", default="yolov5")
def get_args(known=False):
"""Parses command-line arguments for YOLOv5 training, supporting configuration of weights, data paths,
hyperparameters, and more.
"""
parser = argparse.ArgumentParser()
parser.add_argument("--weights", type=str, default=ROOT / "yolov5s.pt", help="initial weights path")
parser.add_argument("--cfg", type=str, default="", help="model.yaml path")
parser.add_argument("--data", type=str, default=ROOT / "data/coco128.yaml", help="dataset.yaml path")
parser.add_argument("--hyp", type=str, default=ROOT / "data/hyps/hyp.scratch-low.yaml", help="hyperparameters path")
parser.add_argument("--epochs", type=int, default=300, help="total training epochs")
parser.add_argument("--batch-size", type=int, default=16, help="total batch size for all GPUs, -1 for autobatch")
parser.add_argument("--imgsz", "--img", "--img-size", type=int, default=640, help="train, val image size (pixels)")
parser.add_argument("--rect", action="store_true", help="rectangular training")
parser.add_argument("--resume", nargs="?", const=True, default=False, help="resume most recent training")
parser.add_argument("--nosave", action="store_true", help="only save final checkpoint")
parser.add_argument("--noval", action="store_true", help="only validate final epoch")
parser.add_argument("--noautoanchor", action="store_true", help="disable AutoAnchor")
parser.add_argument("--noplots", action="store_true", help="save no plot files")
parser.add_argument("--evolve", type=int, nargs="?", const=300, help="evolve hyperparameters for x generations")
parser.add_argument("--bucket", type=str, default="", help="gsutil bucket")
parser.add_argument("--cache", type=str, nargs="?", const="ram", help='--cache images in "ram" (default) or "disk"')
parser.add_argument("--image-weights", action="store_true", help="use weighted image selection for training")
parser.add_argument("--device", default="", help="cuda device, i.e. 0 or 0,1,2,3 or cpu")
parser.add_argument("--multi-scale", action="store_true", help="vary img-size +/- 50%%")
parser.add_argument("--single-cls", action="store_true", help="train multi-class data as single-class")
parser.add_argument("--optimizer", type=str, choices=["SGD", "Adam", "AdamW"], default="SGD", help="optimizer")
parser.add_argument("--sync-bn", action="store_true", help="use SyncBatchNorm, only available in DDP mode")
parser.add_argument("--workers", type=int, default=8, help="max dataloader workers (per RANK in DDP mode)")
parser.add_argument("--project", default=ROOT / "runs/train", help="save to project/name")
parser.add_argument("--name", default="exp", help="save to project/name")
parser.add_argument("--exist-ok", action="store_true", help="existing project/name ok, do not increment")
parser.add_argument("--quad", action="store_true", help="quad dataloader")
parser.add_argument("--cos-lr", action="store_true", help="cosine LR scheduler")
parser.add_argument("--label-smoothing", type=float, default=0.0, help="Label smoothing epsilon")
parser.add_argument("--patience", type=int, default=100, help="EarlyStopping patience (epochs without improvement)")
parser.add_argument("--freeze", nargs="+", type=int, default=[0], help="Freeze layers: backbone=10, first3=0 1 2")
parser.add_argument("--save-period", type=int, default=-1, help="Save checkpoint every x epochs (disabled if < 1)")
parser.add_argument("--seed", type=int, default=0, help="Global training seed")
parser.add_argument("--local_rank", type=int, default=-1, help="Automatic DDP Multi-GPU argument, do not modify")
# Weights & Biases arguments
parser.add_argument("--entity", default=None, help="W&B: Entity")
parser.add_argument("--upload_dataset", nargs="?", const=True, default=False, help='W&B: Upload data, "val" option')
parser.add_argument("--bbox_interval", type=int, default=-1, help="W&B: Set bounding-box image logging interval")
parser.add_argument("--artifact_alias", type=str, default="latest", help="W&B: Version of dataset artifact to use")
# Comet Arguments
parser.add_argument("--comet_optimizer_config", type=str, help="Comet: Path to a Comet Optimizer Config File.")
parser.add_argument("--comet_optimizer_id", type=str, help="Comet: ID of the Comet Optimizer sweep.")
parser.add_argument("--comet_optimizer_objective", type=str, help="Comet: Set to 'minimize' or 'maximize'.")
parser.add_argument("--comet_optimizer_metric", type=str, help="Comet: Metric to Optimize.")
parser.add_argument(
"--comet_optimizer_workers",
type=int,
default=1,
help="Comet: Number of Parallel Workers to use with the Comet Optimizer.",
)
return parser.parse_known_args()[0] if known else parser.parse_args()
def run(parameters, opt):
"""Executes YOLOv5 training with given hyperparameters and options, setting up device and training directories."""
hyp_dict = {k: v for k, v in parameters.items() if k not in ["epochs", "batch_size"]}
opt.save_dir = str(increment_path(Path(opt.project) / opt.name, exist_ok=opt.exist_ok or opt.evolve))
opt.batch_size = parameters.get("batch_size")
opt.epochs = parameters.get("epochs")
device = select_device(opt.device, batch_size=opt.batch_size)
train(hyp_dict, opt, device, callbacks=Callbacks())
if __name__ == "__main__":
opt = get_args(known=True)
opt.weights = str(opt.weights)
opt.cfg = str(opt.cfg)
opt.data = str(opt.data)
opt.project = str(opt.project)
optimizer_id = os.getenv("COMET_OPTIMIZER_ID")
if optimizer_id is None:
with open(opt.comet_optimizer_config) as f:
optimizer_config = json.load(f)
optimizer = comet_ml.Optimizer(optimizer_config)
else:
optimizer = comet_ml.Optimizer(optimizer_id)
opt.comet_optimizer_id = optimizer.id
status = optimizer.status()
opt.comet_optimizer_objective = status["spec"]["objective"]
opt.comet_optimizer_metric = status["spec"]["metric"]
logger.info("COMET INFO: Starting Hyperparameter Sweep")
for parameter in optimizer.get_parameters():
run(parameter["parameters"], opt)
# YOLOv5 🚀 by Ultralytics, AGPL-3.0 license
# WARNING ⚠️ wandb is deprecated and will be removed in future release.
# See supported integrations at https://github.com/ultralytics/yolov5#integrations
import logging
import os
import sys
from contextlib import contextmanager
from pathlib import Path
from utils.general import LOGGER, colorstr
FILE = Path(__file__).resolve()
ROOT = FILE.parents[3] # YOLOv5 root directory
if str(ROOT) not in sys.path:
sys.path.append(str(ROOT)) # add ROOT to PATH
RANK = int(os.getenv("RANK", -1))
DEPRECATION_WARNING = (
f"{colorstr('wandb')}: WARNING ⚠️ wandb is deprecated and will be removed in a future release. "
f'See supported integrations at https://github.com/ultralytics/yolov5#integrations.'
)
try:
import wandb
assert hasattr(wandb, "__version__") # verify package import not local dir
LOGGER.warning(DEPRECATION_WARNING)
except (ImportError, AssertionError):
wandb = None
class WandbLogger:
"""
Log training runs, datasets, models, and predictions to Weights & Biases.
This logger sends information to W&B at wandb.ai. By default, this information includes hyperparameters, system
configuration and metrics, model metrics, and basic data metrics and analyses.
By providing additional command line arguments to train.py, datasets, models and predictions can also be logged.
For more on how this logger is used, see the Weights & Biases documentation:
https://docs.wandb.com/guides/integrations/yolov5
"""
def __init__(self, opt, run_id=None, job_type="Training"):
"""
- Initialize WandbLogger instance
- Upload dataset if opt.upload_dataset is True
- Setup training processes if job_type is 'Training'
arguments:
opt (namespace) -- Commandline arguments for this run
run_id (str) -- Run ID of W&B run to be resumed
job_type (str) -- To set the job_type for this run
"""
# Pre-training routine --
self.job_type = job_type
self.wandb, self.wandb_run = wandb, wandb.run if wandb else None
self.val_artifact, self.train_artifact = None, None
self.train_artifact_path, self.val_artifact_path = None, None
self.result_artifact = None
self.val_table, self.result_table = None, None
self.max_imgs_to_log = 16
self.data_dict = None
if self.wandb:
self.wandb_run = wandb.run or wandb.init(
config=opt,
resume="allow",
project="YOLOv5" if opt.project == "runs/train" else Path(opt.project).stem,
entity=opt.entity,
name=opt.name if opt.name != "exp" else None,
job_type=job_type,
id=run_id,
allow_val_change=True,
)
if self.wandb_run and self.job_type == "Training":
if isinstance(opt.data, dict):
# This means another dataset manager has already processed the dataset info (e.g. ClearML)
# and they will have stored the already processed dict in opt.data
self.data_dict = opt.data
self.setup_training(opt)
def setup_training(self, opt):
"""
Setup the necessary processes for training YOLO models:
- Attempt to download model checkpoint and dataset artifacts if opt.resume stats with WANDB_ARTIFACT_PREFIX
- Update data_dict, to contain info of previous run if resumed and the paths of dataset artifact if downloaded
- Setup log_dict, initialize bbox_interval
arguments:
opt (namespace) -- commandline arguments for this run
"""
self.log_dict, self.current_epoch = {}, 0
self.bbox_interval = opt.bbox_interval
if isinstance(opt.resume, str):
model_dir, _ = self.download_model_artifact(opt)
if model_dir:
self.weights = Path(model_dir) / "last.pt"
config = self.wandb_run.config
opt.weights, opt.save_period, opt.batch_size, opt.bbox_interval, opt.epochs, opt.hyp, opt.imgsz = (
str(self.weights),
config.save_period,
config.batch_size,
config.bbox_interval,
config.epochs,
config.hyp,
config.imgsz,
)
if opt.bbox_interval == -1:
self.bbox_interval = opt.bbox_interval = (opt.epochs // 10) if opt.epochs > 10 else 1
if opt.evolve or opt.noplots:
self.bbox_interval = opt.bbox_interval = opt.epochs + 1 # disable bbox_interval
def log_model(self, path, opt, epoch, fitness_score, best_model=False):
"""
Log the model checkpoint as W&B artifact.
arguments:
path (Path) -- Path of directory containing the checkpoints
opt (namespace) -- Command line arguments for this run
epoch (int) -- Current epoch number
fitness_score (float) -- fitness score for current epoch
best_model (boolean) -- Boolean representing if the current checkpoint is the best yet.
"""
model_artifact = wandb.Artifact(
f"run_{wandb.run.id}_model",
type="model",
metadata={
"original_url": str(path),
"epochs_trained": epoch + 1,
"save period": opt.save_period,
"project": opt.project,
"total_epochs": opt.epochs,
"fitness_score": fitness_score,
},
)
model_artifact.add_file(str(path / "last.pt"), name="last.pt")
wandb.log_artifact(
model_artifact,
aliases=[
"latest",
"last",
f"epoch {str(self.current_epoch)}",
"best" if best_model else "",
],
)
LOGGER.info(f"Saving model artifact on epoch {epoch + 1}")
def val_one_image(self, pred, predn, path, names, im):
"""Evaluates model prediction for a single image, returning metrics and visualizations."""
pass
def log(self, log_dict):
"""
Save the metrics to the logging dictionary.
arguments:
log_dict (Dict) -- metrics/media to be logged in current step
"""
if self.wandb_run:
for key, value in log_dict.items():
self.log_dict[key] = value
def end_epoch(self):
"""
Commit the log_dict, model artifacts and Tables to W&B and flush the log_dict.
arguments:
best_result (boolean): Boolean representing if the result of this evaluation is best or not
"""
if self.wandb_run:
with all_logging_disabled():
try:
wandb.log(self.log_dict)
except BaseException as e:
LOGGER.info(
f"An error occurred in wandb logger. The training will proceed without interruption. More info\n{e}"
)
self.wandb_run.finish()
self.wandb_run = None
self.log_dict = {}
def finish_run(self):
"""Log metrics if any and finish the current W&B run."""
if self.wandb_run:
if self.log_dict:
with all_logging_disabled():
wandb.log(self.log_dict)
wandb.run.finish()
LOGGER.warning(DEPRECATION_WARNING)
@contextmanager
def all_logging_disabled(highest_level=logging.CRITICAL):
"""source - https://gist.github.com/simon-weber/7853144
A context manager that will prevent any logging messages triggered during the body from being processed.
:param highest_level: the maximum logging level in use.
This would only need to be changed if a custom level greater than CRITICAL is defined.
"""
previous_level = logging.root.manager.disable
logging.disable(highest_level)
try:
yield
finally:
logging.disable(previous_level)
# YOLOv5 🚀 by Ultralytics, AGPL-3.0 license
"""Loss functions."""
import torch
import torch.nn as nn
from utils.metrics import bbox_iou
from utils.torch_utils import de_parallel
def smooth_BCE(eps=0.1):
"""Returns label smoothing BCE targets for reducing overfitting; pos: `1.0 - 0.5*eps`, neg: `0.5*eps`. For details see ttps://github.com/ultralytics/yolov3/issues/238#issuecomment-598028441"""
return 1.0 - 0.5 * eps, 0.5 * eps
class BCEBlurWithLogitsLoss(nn.Module):
# BCEwithLogitLoss() with reduced missing label effects.
def __init__(self, alpha=0.05):
"""Initializes a modified BCEWithLogitsLoss with reduced missing label effects, taking optional alpha smoothing
parameter.
"""
super().__init__()
self.loss_fcn = nn.BCEWithLogitsLoss(reduction="none") # must be nn.BCEWithLogitsLoss()
self.alpha = alpha
def forward(self, pred, true):
"""Computes modified BCE loss for YOLOv5 with reduced missing label effects, taking pred and true tensors,
returns mean loss.
"""
loss = self.loss_fcn(pred, true)
pred = torch.sigmoid(pred) # prob from logits
dx = pred - true # reduce only missing label effects
# dx = (pred - true).abs() # reduce missing label and false label effects
alpha_factor = 1 - torch.exp((dx - 1) / (self.alpha + 1e-4))
loss *= alpha_factor
return loss.mean()
class FocalLoss(nn.Module):
# Wraps focal loss around existing loss_fcn(), i.e. criteria = FocalLoss(nn.BCEWithLogitsLoss(), gamma=1.5)
def __init__(self, loss_fcn, gamma=1.5, alpha=0.25):
"""Initializes FocalLoss with specified loss function, gamma, and alpha values; modifies loss reduction to
'none'.
"""
super().__init__()
self.loss_fcn = loss_fcn # must be nn.BCEWithLogitsLoss()
self.gamma = gamma
self.alpha = alpha
self.reduction = loss_fcn.reduction
self.loss_fcn.reduction = "none" # required to apply FL to each element
def forward(self, pred, true):
"""Calculates the focal loss between predicted and true labels using a modified BCEWithLogitsLoss."""
loss = self.loss_fcn(pred, true)
# p_t = torch.exp(-loss)
# loss *= self.alpha * (1.000001 - p_t) ** self.gamma # non-zero power for gradient stability
# TF implementation https://github.com/tensorflow/addons/blob/v0.7.1/tensorflow_addons/losses/focal_loss.py
pred_prob = torch.sigmoid(pred) # prob from logits
p_t = true * pred_prob + (1 - true) * (1 - pred_prob)
alpha_factor = true * self.alpha + (1 - true) * (1 - self.alpha)
modulating_factor = (1.0 - p_t) ** self.gamma
loss *= alpha_factor * modulating_factor
if self.reduction == "mean":
return loss.mean()
elif self.reduction == "sum":
return loss.sum()
else: # 'none'
return loss
class QFocalLoss(nn.Module):
# Wraps Quality focal loss around existing loss_fcn(), i.e. criteria = FocalLoss(nn.BCEWithLogitsLoss(), gamma=1.5)
def __init__(self, loss_fcn, gamma=1.5, alpha=0.25):
"""Initializes Quality Focal Loss with given loss function, gamma, alpha; modifies reduction to 'none'."""
super().__init__()
self.loss_fcn = loss_fcn # must be nn.BCEWithLogitsLoss()
self.gamma = gamma
self.alpha = alpha
self.reduction = loss_fcn.reduction
self.loss_fcn.reduction = "none" # required to apply FL to each element
def forward(self, pred, true):
"""Computes the focal loss between `pred` and `true` using BCEWithLogitsLoss, adjusting for imbalance with
`gamma` and `alpha`.
"""
loss = self.loss_fcn(pred, true)
pred_prob = torch.sigmoid(pred) # prob from logits
alpha_factor = true * self.alpha + (1 - true) * (1 - self.alpha)
modulating_factor = torch.abs(true - pred_prob) ** self.gamma
loss *= alpha_factor * modulating_factor
if self.reduction == "mean":
return loss.mean()
elif self.reduction == "sum":
return loss.sum()
else: # 'none'
return loss
class ComputeLoss:
sort_obj_iou = False
# Compute losses
def __init__(self, model, autobalance=False):
"""Initializes ComputeLoss with model and autobalance option, autobalances losses if True."""
device = next(model.parameters()).device # get model device
h = model.hyp # hyperparameters
# Define criteria
BCEcls = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([h["cls_pw"]], device=device))
BCEobj = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([h["obj_pw"]], device=device))
# Class label smoothing https://arxiv.org/pdf/1902.04103.pdf eqn 3
self.cp, self.cn = smooth_BCE(eps=h.get("label_smoothing", 0.0)) # positive, negative BCE targets
# Focal loss
g = h["fl_gamma"] # focal loss gamma
if g > 0:
BCEcls, BCEobj = FocalLoss(BCEcls, g), FocalLoss(BCEobj, g)
m = de_parallel(model).model[-1] # Detect() module
self.balance = {3: [4.0, 1.0, 0.4]}.get(m.nl, [4.0, 1.0, 0.25, 0.06, 0.02]) # P3-P7
self.ssi = list(m.stride).index(16) if autobalance else 0 # stride 16 index
self.BCEcls, self.BCEobj, self.gr, self.hyp, self.autobalance = BCEcls, BCEobj, 1.0, h, autobalance
self.na = m.na # number of anchors
self.nc = m.nc # number of classes
self.nl = m.nl # number of layers
self.anchors = m.anchors
self.device = device
def __call__(self, p, targets): # predictions, targets
"""Performs forward pass, calculating class, box, and object loss for given predictions and targets."""
lcls = torch.zeros(1, device=self.device) # class loss
lbox = torch.zeros(1, device=self.device) # box loss
lobj = torch.zeros(1, device=self.device) # object loss
tcls, tbox, indices, anchors = self.build_targets(p, targets) # targets
# Losses
for i, pi in enumerate(p): # layer index, layer predictions
b, a, gj, gi = indices[i] # image, anchor, gridy, gridx
tobj = torch.zeros(pi.shape[:4], dtype=pi.dtype, device=self.device) # target obj
n = b.shape[0] # number of targets
if n:
# pxy, pwh, _, pcls = pi[b, a, gj, gi].tensor_split((2, 4, 5), dim=1) # faster, requires torch 1.8.0
pxy, pwh, _, pcls = pi[b, a, gj, gi].split((2, 2, 1, self.nc), 1) # target-subset of predictions
# Regression
pxy = pxy.sigmoid() * 2 - 0.5
pwh = (pwh.sigmoid() * 2) ** 2 * anchors[i]
pbox = torch.cat((pxy, pwh), 1) # predicted box
iou = bbox_iou(pbox, tbox[i], CIoU=True).squeeze() # iou(prediction, target)
lbox += (1.0 - iou).mean() # iou loss
# Objectness
iou = iou.detach().clamp(0).type(tobj.dtype)
if self.sort_obj_iou:
j = iou.argsort()
b, a, gj, gi, iou = b[j], a[j], gj[j], gi[j], iou[j]
if self.gr < 1:
iou = (1.0 - self.gr) + self.gr * iou
tobj[b, a, gj, gi] = iou # iou ratio
# Classification
if self.nc > 1: # cls loss (only if multiple classes)
t = torch.full_like(pcls, self.cn, device=self.device) # targets
t[range(n), tcls[i]] = self.cp
lcls += self.BCEcls(pcls, t) # BCE
# Append targets to text file
# with open('targets.txt', 'a') as file:
# [file.write('%11.5g ' * 4 % tuple(x) + '\n') for x in torch.cat((txy[i], twh[i]), 1)]
obji = self.BCEobj(pi[..., 4], tobj)
lobj += obji * self.balance[i] # obj loss
if self.autobalance:
self.balance[i] = self.balance[i] * 0.9999 + 0.0001 / obji.detach().item()
if self.autobalance:
self.balance = [x / self.balance[self.ssi] for x in self.balance]
lbox *= self.hyp["box"]
lobj *= self.hyp["obj"]
lcls *= self.hyp["cls"]
bs = tobj.shape[0] # batch size
return (lbox + lobj + lcls) * bs, torch.cat((lbox, lobj, lcls)).detach()
def build_targets(self, p, targets):
"""Prepares model targets from input targets (image,class,x,y,w,h) for loss computation, returning class, box,
indices, and anchors.
"""
na, nt = self.na, targets.shape[0] # number of anchors, targets
tcls, tbox, indices, anch = [], [], [], []
gain = torch.ones(7, device=self.device) # normalized to gridspace gain
ai = torch.arange(na, device=self.device).float().view(na, 1).repeat(1, nt) # same as .repeat_interleave(nt)
targets = torch.cat((targets.repeat(na, 1, 1), ai[..., None]), 2) # append anchor indices
g = 0.5 # bias
off = (
torch.tensor(
[
[0, 0],
[1, 0],
[0, 1],
[-1, 0],
[0, -1], # j,k,l,m
# [1, 1], [1, -1], [-1, 1], [-1, -1], # jk,jm,lk,lm
],
device=self.device,
).float()
* g
) # offsets
for i in range(self.nl):
anchors, shape = self.anchors[i], p[i].shape
gain[2:6] = torch.tensor(shape)[[3, 2, 3, 2]] # xyxy gain
# Match targets to anchors
t = targets * gain # shape(3,n,7)
if nt:
# Matches
r = t[..., 4:6] / anchors[:, None] # wh ratio
j = torch.max(r, 1 / r).max(2)[0] < self.hyp["anchor_t"] # compare
# j = wh_iou(anchors, t[:, 4:6]) > model.hyp['iou_t'] # iou(3,n)=wh_iou(anchors(3,2), gwh(n,2))
t = t[j] # filter
# Offsets
gxy = t[:, 2:4] # grid xy
gxi = gain[[2, 3]] - gxy # inverse
j, k = ((gxy % 1 < g) & (gxy > 1)).T
l, m = ((gxi % 1 < g) & (gxi > 1)).T
j = torch.stack((torch.ones_like(j), j, k, l, m))
t = t.repeat((5, 1, 1))[j]
offsets = (torch.zeros_like(gxy)[None] + off[:, None])[j]
else:
t = targets[0]
offsets = 0
# Define
bc, gxy, gwh, a = t.chunk(4, 1) # (image, class), grid xy, grid wh, anchors
a, (b, c) = a.long().view(-1), bc.long().T # anchors, image, class
gij = (gxy - offsets).long()
gi, gj = gij.T # grid indices
# Append
indices.append((b, a, gj.clamp_(0, shape[2] - 1), gi.clamp_(0, shape[3] - 1))) # image, anchor, grid
tbox.append(torch.cat((gxy - gij, gwh), 1)) # box
anch.append(anchors[a]) # anchors
tcls.append(c) # class
return tcls, tbox, indices, anch
# YOLOv5 🚀 by Ultralytics, AGPL-3.0 license
"""Model validation metrics."""
import math
import warnings
from pathlib import Path
import matplotlib.pyplot as plt
import numpy as np
import torch
from utils import TryExcept, threaded
def fitness(x):
"""Calculates fitness of a model using weighted sum of metrics P, R, mAP@0.5, mAP@0.5:0.95."""
w = [0.0, 0.0, 0.1, 0.9] # weights for [P, R, mAP@0.5, mAP@0.5:0.95]
return (x[:, :4] * w).sum(1)
def smooth(y, f=0.05):
"""Applies box filter smoothing to array `y` with fraction `f`, yielding a smoothed array."""
nf = round(len(y) * f * 2) // 2 + 1 # number of filter elements (must be odd)
p = np.ones(nf // 2) # ones padding
yp = np.concatenate((p * y[0], y, p * y[-1]), 0) # y padded
return np.convolve(yp, np.ones(nf) / nf, mode="valid") # y-smoothed
def ap_per_class(tp, conf, pred_cls, target_cls, plot=False, save_dir=".", names=(), eps=1e-16, prefix=""):
"""
Compute the average precision, given the recall and precision curves.
Source: https://github.com/rafaelpadilla/Object-Detection-Metrics.
# Arguments
tp: True positives (nparray, nx1 or nx10).
conf: Objectness value from 0-1 (nparray).
pred_cls: Predicted object classes (nparray).
target_cls: True object classes (nparray).
plot: Plot precision-recall curve at mAP@0.5
save_dir: Plot save directory
# Returns
The average precision as computed in py-faster-rcnn.
"""
# Sort by objectness
i = np.argsort(-conf)
tp, conf, pred_cls = tp[i], conf[i], pred_cls[i]
# Find unique classes
unique_classes, nt = np.unique(target_cls, return_counts=True)
nc = unique_classes.shape[0] # number of classes, number of detections
# Create Precision-Recall curve and compute AP for each class
px, py = np.linspace(0, 1, 1000), [] # for plotting
ap, p, r = np.zeros((nc, tp.shape[1])), np.zeros((nc, 1000)), np.zeros((nc, 1000))
for ci, c in enumerate(unique_classes):
i = pred_cls == c
n_l = nt[ci] # number of labels
n_p = i.sum() # number of predictions
if n_p == 0 or n_l == 0:
continue
# Accumulate FPs and TPs
fpc = (1 - tp[i]).cumsum(0)
tpc = tp[i].cumsum(0)
# Recall
recall = tpc / (n_l + eps) # recall curve
r[ci] = np.interp(-px, -conf[i], recall[:, 0], left=0) # negative x, xp because xp decreases
# Precision
precision = tpc / (tpc + fpc) # precision curve
p[ci] = np.interp(-px, -conf[i], precision[:, 0], left=1) # p at pr_score
# AP from recall-precision curve
for j in range(tp.shape[1]):
ap[ci, j], mpre, mrec = compute_ap(recall[:, j], precision[:, j])
if plot and j == 0:
py.append(np.interp(px, mrec, mpre)) # precision at mAP@0.5
# Compute F1 (harmonic mean of precision and recall)
f1 = 2 * p * r / (p + r + eps)
names = [v for k, v in names.items() if k in unique_classes] # list: only classes that have data
names = dict(enumerate(names)) # to dict
if plot:
plot_pr_curve(px, py, ap, Path(save_dir) / f"{prefix}PR_curve.png", names)
plot_mc_curve(px, f1, Path(save_dir) / f"{prefix}F1_curve.png", names, ylabel="F1")
plot_mc_curve(px, p, Path(save_dir) / f"{prefix}P_curve.png", names, ylabel="Precision")
plot_mc_curve(px, r, Path(save_dir) / f"{prefix}R_curve.png", names, ylabel="Recall")
i = smooth(f1.mean(0), 0.1).argmax() # max F1 index
p, r, f1 = p[:, i], r[:, i], f1[:, i]
tp = (r * nt).round() # true positives
fp = (tp / (p + eps) - tp).round() # false positives
return tp, fp, p, r, f1, ap, unique_classes.astype(int)
def compute_ap(recall, precision):
"""Compute the average precision, given the recall and precision curves
# Arguments
recall: The recall curve (list)
precision: The precision curve (list)
# Returns
Average precision, precision curve, recall curve
"""
# Append sentinel values to beginning and end
mrec = np.concatenate(([0.0], recall, [1.0]))
mpre = np.concatenate(([1.0], precision, [0.0]))
# Compute the precision envelope
mpre = np.flip(np.maximum.accumulate(np.flip(mpre)))
# Integrate area under curve
method = "interp" # methods: 'continuous', 'interp'
if method == "interp":
x = np.linspace(0, 1, 101) # 101-point interp (COCO)
ap = np.trapz(np.interp(x, mrec, mpre), x) # integrate
else: # 'continuous'
i = np.where(mrec[1:] != mrec[:-1])[0] # points where x axis (recall) changes
ap = np.sum((mrec[i + 1] - mrec[i]) * mpre[i + 1]) # area under curve
return ap, mpre, mrec
class ConfusionMatrix:
# Updated version of https://github.com/kaanakan/object_detection_confusion_matrix
def __init__(self, nc, conf=0.25, iou_thres=0.45):
"""Initializes ConfusionMatrix with given number of classes, confidence, and IoU threshold."""
self.matrix = np.zeros((nc + 1, nc + 1))
self.nc = nc # number of classes
self.conf = conf
self.iou_thres = iou_thres
def process_batch(self, detections, labels):
"""
Return intersection-over-union (Jaccard index) of boxes.
Both sets of boxes are expected to be in (x1, y1, x2, y2) format.
Arguments:
detections (Array[N, 6]), x1, y1, x2, y2, conf, class
labels (Array[M, 5]), class, x1, y1, x2, y2
Returns:
None, updates confusion matrix accordingly
"""
if detections is None:
gt_classes = labels.int()
for gc in gt_classes:
self.matrix[self.nc, gc] += 1 # background FN
return
detections = detections[detections[:, 4] > self.conf]
gt_classes = labels[:, 0].int()
detection_classes = detections[:, 5].int()
iou = box_iou(labels[:, 1:], detections[:, :4])
x = torch.where(iou > self.iou_thres)
if x[0].shape[0]:
matches = torch.cat((torch.stack(x, 1), iou[x[0], x[1]][:, None]), 1).cpu().numpy()
if x[0].shape[0] > 1:
matches = matches[matches[:, 2].argsort()[::-1]]
matches = matches[np.unique(matches[:, 1], return_index=True)[1]]
matches = matches[matches[:, 2].argsort()[::-1]]
matches = matches[np.unique(matches[:, 0], return_index=True)[1]]
else:
matches = np.zeros((0, 3))
n = matches.shape[0] > 0
m0, m1, _ = matches.transpose().astype(int)
for i, gc in enumerate(gt_classes):
j = m0 == i
if n and sum(j) == 1:
self.matrix[detection_classes[m1[j]], gc] += 1 # correct
else:
self.matrix[self.nc, gc] += 1 # true background
if n:
for i, dc in enumerate(detection_classes):
if not any(m1 == i):
self.matrix[dc, self.nc] += 1 # predicted background
def tp_fp(self):
"""Calculates true positives (tp) and false positives (fp) excluding the background class from the confusion
matrix.
"""
tp = self.matrix.diagonal() # true positives
fp = self.matrix.sum(1) - tp # false positives
# fn = self.matrix.sum(0) - tp # false negatives (missed detections)
return tp[:-1], fp[:-1] # remove background class
@TryExcept("WARNING ⚠️ ConfusionMatrix plot failure")
def plot(self, normalize=True, save_dir="", names=()):
"""Plots confusion matrix using seaborn, optional normalization; can save plot to specified directory."""
import seaborn as sn
array = self.matrix / ((self.matrix.sum(0).reshape(1, -1) + 1e-9) if normalize else 1) # normalize columns
array[array < 0.005] = np.nan # don't annotate (would appear as 0.00)
fig, ax = plt.subplots(1, 1, figsize=(12, 9), tight_layout=True)
nc, nn = self.nc, len(names) # number of classes, names
sn.set(font_scale=1.0 if nc < 50 else 0.8) # for label size
labels = (0 < nn < 99) and (nn == nc) # apply names to ticklabels
ticklabels = (names + ["background"]) if labels else "auto"
with warnings.catch_warnings():
warnings.simplefilter("ignore") # suppress empty matrix RuntimeWarning: All-NaN slice encountered
sn.heatmap(
array,
ax=ax,
annot=nc < 30,
annot_kws={"size": 8},
cmap="Blues",
fmt=".2f",
square=True,
vmin=0.0,
xticklabels=ticklabels,
yticklabels=ticklabels,
).set_facecolor((1, 1, 1))
ax.set_xlabel("True")
ax.set_ylabel("Predicted")
ax.set_title("Confusion Matrix")
fig.savefig(Path(save_dir) / "confusion_matrix.png", dpi=250)
plt.close(fig)
def print(self):
"""Prints the confusion matrix row-wise, with each class and its predictions separated by spaces."""
for i in range(self.nc + 1):
print(" ".join(map(str, self.matrix[i])))
def bbox_iou(box1, box2, xywh=True, GIoU=False, DIoU=False, CIoU=False, eps=1e-7):
"""
Calculates IoU, GIoU, DIoU, or CIoU between two boxes, supporting xywh/xyxy formats.
Input shapes are box1(1,4) to box2(n,4).
"""
# Get the coordinates of bounding boxes
if xywh: # transform from xywh to xyxy
(x1, y1, w1, h1), (x2, y2, w2, h2) = box1.chunk(4, -1), box2.chunk(4, -1)
w1_, h1_, w2_, h2_ = w1 / 2, h1 / 2, w2 / 2, h2 / 2
b1_x1, b1_x2, b1_y1, b1_y2 = x1 - w1_, x1 + w1_, y1 - h1_, y1 + h1_
b2_x1, b2_x2, b2_y1, b2_y2 = x2 - w2_, x2 + w2_, y2 - h2_, y2 + h2_
else: # x1, y1, x2, y2 = box1
b1_x1, b1_y1, b1_x2, b1_y2 = box1.chunk(4, -1)
b2_x1, b2_y1, b2_x2, b2_y2 = box2.chunk(4, -1)
w1, h1 = b1_x2 - b1_x1, (b1_y2 - b1_y1).clamp(eps)
w2, h2 = b2_x2 - b2_x1, (b2_y2 - b2_y1).clamp(eps)
# Intersection area
inter = (b1_x2.minimum(b2_x2) - b1_x1.maximum(b2_x1)).clamp(0) * (
b1_y2.minimum(b2_y2) - b1_y1.maximum(b2_y1)
).clamp(0)
# Union Area
union = w1 * h1 + w2 * h2 - inter + eps
# IoU
iou = inter / union
if CIoU or DIoU or GIoU:
cw = b1_x2.maximum(b2_x2) - b1_x1.minimum(b2_x1) # convex (smallest enclosing box) width
ch = b1_y2.maximum(b2_y2) - b1_y1.minimum(b2_y1) # convex height
if CIoU or DIoU: # Distance or Complete IoU https://arxiv.org/abs/1911.08287v1
c2 = cw**2 + ch**2 + eps # convex diagonal squared
rho2 = ((b2_x1 + b2_x2 - b1_x1 - b1_x2) ** 2 + (b2_y1 + b2_y2 - b1_y1 - b1_y2) ** 2) / 4 # center dist ** 2
if CIoU: # https://github.com/Zzh-tju/DIoU-SSD-pytorch/blob/master/utils/box/box_utils.py#L47
v = (4 / math.pi**2) * (torch.atan(w2 / h2) - torch.atan(w1 / h1)).pow(2)
with torch.no_grad():
alpha = v / (v - iou + (1 + eps))
return iou - (rho2 / c2 + v * alpha) # CIoU
return iou - rho2 / c2 # DIoU
c_area = cw * ch + eps # convex area
return iou - (c_area - union) / c_area # GIoU https://arxiv.org/pdf/1902.09630.pdf
return iou # IoU
def box_iou(box1, box2, eps=1e-7):
# https://github.com/pytorch/vision/blob/master/torchvision/ops/boxes.py
"""
Return intersection-over-union (Jaccard index) of boxes.
Both sets of boxes are expected to be in (x1, y1, x2, y2) format.
Arguments:
box1 (Tensor[N, 4])
box2 (Tensor[M, 4])
Returns:
iou (Tensor[N, M]): the NxM matrix containing the pairwise
IoU values for every element in boxes1 and boxes2
"""
# inter(N,M) = (rb(N,M,2) - lt(N,M,2)).clamp(0).prod(2)
(a1, a2), (b1, b2) = box1.unsqueeze(1).chunk(2, 2), box2.unsqueeze(0).chunk(2, 2)
inter = (torch.min(a2, b2) - torch.max(a1, b1)).clamp(0).prod(2)
# IoU = inter / (area1 + area2 - inter)
return inter / ((a2 - a1).prod(2) + (b2 - b1).prod(2) - inter + eps)
def bbox_ioa(box1, box2, eps=1e-7):
"""
Returns the intersection over box2 area given box1, box2.
Boxes are x1y1x2y2
box1: np.array of shape(4)
box2: np.array of shape(nx4)
returns: np.array of shape(n)
"""
# Get the coordinates of bounding boxes
b1_x1, b1_y1, b1_x2, b1_y2 = box1
b2_x1, b2_y1, b2_x2, b2_y2 = box2.T
# Intersection area
inter_area = (np.minimum(b1_x2, b2_x2) - np.maximum(b1_x1, b2_x1)).clip(0) * (
np.minimum(b1_y2, b2_y2) - np.maximum(b1_y1, b2_y1)
).clip(0)
# box2 area
box2_area = (b2_x2 - b2_x1) * (b2_y2 - b2_y1) + eps
# Intersection over box2 area
return inter_area / box2_area
def wh_iou(wh1, wh2, eps=1e-7):
"""Calculates the Intersection over Union (IoU) for two sets of widths and heights; `wh1` and `wh2` should be nx2
and mx2 tensors.
"""
wh1 = wh1[:, None] # [N,1,2]
wh2 = wh2[None] # [1,M,2]
inter = torch.min(wh1, wh2).prod(2) # [N,M]
return inter / (wh1.prod(2) + wh2.prod(2) - inter + eps) # iou = inter / (area1 + area2 - inter)
# Plots ----------------------------------------------------------------------------------------------------------------
@threaded
def plot_pr_curve(px, py, ap, save_dir=Path("pr_curve.png"), names=()):
"""Plots precision-recall curve, optionally per class, saving to `save_dir`; `px`, `py` are lists, `ap` is Nx2
array, `names` optional.
"""
fig, ax = plt.subplots(1, 1, figsize=(9, 6), tight_layout=True)
py = np.stack(py, axis=1)
if 0 < len(names) < 21: # display per-class legend if < 21 classes
for i, y in enumerate(py.T):
ax.plot(px, y, linewidth=1, label=f"{names[i]} {ap[i, 0]:.3f}") # plot(recall, precision)
else:
ax.plot(px, py, linewidth=1, color="grey") # plot(recall, precision)
ax.plot(px, py.mean(1), linewidth=3, color="blue", label="all classes %.3f mAP@0.5" % ap[:, 0].mean())
ax.set_xlabel("Recall")
ax.set_ylabel("Precision")
ax.set_xlim(0, 1)
ax.set_ylim(0, 1)
ax.legend(bbox_to_anchor=(1.04, 1), loc="upper left")
ax.set_title("Precision-Recall Curve")
fig.savefig(save_dir, dpi=250)
plt.close(fig)
@threaded
def plot_mc_curve(px, py, save_dir=Path("mc_curve.png"), names=(), xlabel="Confidence", ylabel="Metric"):
"""Plots a metric-confidence curve for model predictions, supporting per-class visualization and smoothing."""
fig, ax = plt.subplots(1, 1, figsize=(9, 6), tight_layout=True)
if 0 < len(names) < 21: # display per-class legend if < 21 classes
for i, y in enumerate(py):
ax.plot(px, y, linewidth=1, label=f"{names[i]}") # plot(confidence, metric)
else:
ax.plot(px, py.T, linewidth=1, color="grey") # plot(confidence, metric)
y = smooth(py.mean(0), 0.05)
ax.plot(px, y, linewidth=3, color="blue", label=f"all classes {y.max():.2f} at {px[y.argmax()]:.3f}")
ax.set_xlabel(xlabel)
ax.set_ylabel(ylabel)
ax.set_xlim(0, 1)
ax.set_ylim(0, 1)
ax.legend(bbox_to_anchor=(1.04, 1), loc="upper left")
ax.set_title(f"{ylabel}-Confidence Curve")
fig.savefig(save_dir, dpi=250)
plt.close(fig)
# YOLOv5 🚀 by Ultralytics, AGPL-3.0 license
"""Plotting utils."""
import contextlib
import math
import os
from copy import copy
from pathlib import Path
import cv2
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sn
import torch
from PIL import Image, ImageDraw
from scipy.ndimage.filters import gaussian_filter1d
from ultralytics.utils.plotting import Annotator
from utils import TryExcept, threaded
from utils.general import LOGGER, clip_boxes, increment_path, xywh2xyxy, xyxy2xywh
from utils.metrics import fitness
# Settings
RANK = int(os.getenv("RANK", -1))
matplotlib.rc("font", **{"size": 11})
matplotlib.use("Agg") # for writing to files only
class Colors:
# Ultralytics color palette https://ultralytics.com/
def __init__(self):
"""
Initializes the Colors class with a palette derived from Ultralytics color scheme, converting hex codes to RGB.
Colors derived from `hex = matplotlib.colors.TABLEAU_COLORS.values()`.
"""
hexs = (
"FF3838",
"FF9D97",
"FF701F",
"FFB21D",
"CFD231",
"48F90A",
"92CC17",
"3DDB86",
"1A9334",
"00D4BB",
"2C99A8",
"00C2FF",
"344593",
"6473FF",
"0018EC",
"8438FF",
"520085",
"CB38FF",
"FF95C8",
"FF37C7",
)
self.palette = [self.hex2rgb(f"#{c}") for c in hexs]
self.n = len(self.palette)
def __call__(self, i, bgr=False):
"""Returns color from palette by index `i`, in BGR format if `bgr=True`, else RGB; `i` is an integer index."""
c = self.palette[int(i) % self.n]
return (c[2], c[1], c[0]) if bgr else c
@staticmethod
def hex2rgb(h):
"""Converts hexadecimal color `h` to an RGB tuple (PIL-compatible) with order (R, G, B)."""
return tuple(int(h[1 + i : 1 + i + 2], 16) for i in (0, 2, 4))
colors = Colors() # create instance for 'from utils.plots import colors'
def feature_visualization(x, module_type, stage, n=32, save_dir=Path("runs/detect/exp")):
"""
x: Features to be visualized
module_type: Module type
stage: Module stage within model
n: Maximum number of feature maps to plot
save_dir: Directory to save results
"""
if ("Detect" not in module_type) and (
"Segment" not in module_type
): # 'Detect' for Object Detect task,'Segment' for Segment task
batch, channels, height, width = x.shape # batch, channels, height, width
if height > 1 and width > 1:
f = save_dir / f"stage{stage}_{module_type.split('.')[-1]}_features.png" # filename
blocks = torch.chunk(x[0].cpu(), channels, dim=0) # select batch index 0, block by channels
n = min(n, channels) # number of plots
fig, ax = plt.subplots(math.ceil(n / 8), 8, tight_layout=True) # 8 rows x n/8 cols
ax = ax.ravel()
plt.subplots_adjust(wspace=0.05, hspace=0.05)
for i in range(n):
ax[i].imshow(blocks[i].squeeze()) # cmap='gray'
ax[i].axis("off")
LOGGER.info(f"Saving {f}... ({n}/{channels})")
plt.savefig(f, dpi=300, bbox_inches="tight")
plt.close()
np.save(str(f.with_suffix(".npy")), x[0].cpu().numpy()) # npy save
def hist2d(x, y, n=100):
"""
Generates a logarithmic 2D histogram, useful for visualizing label or evolution distributions.
Used in used in labels.png and evolve.png.
"""
xedges, yedges = np.linspace(x.min(), x.max(), n), np.linspace(y.min(), y.max(), n)
hist, xedges, yedges = np.histogram2d(x, y, (xedges, yedges))
xidx = np.clip(np.digitize(x, xedges) - 1, 0, hist.shape[0] - 1)
yidx = np.clip(np.digitize(y, yedges) - 1, 0, hist.shape[1] - 1)
return np.log(hist[xidx, yidx])
def butter_lowpass_filtfilt(data, cutoff=1500, fs=50000, order=5):
"""Applies a low-pass Butterworth filter to `data` with specified `cutoff`, `fs`, and `order`."""
from scipy.signal import butter, filtfilt
# https://stackoverflow.com/questions/28536191/how-to-filter-smooth-with-scipy-numpy
def butter_lowpass(cutoff, fs, order):
nyq = 0.5 * fs
normal_cutoff = cutoff / nyq
return butter(order, normal_cutoff, btype="low", analog=False)
b, a = butter_lowpass(cutoff, fs, order=order)
return filtfilt(b, a, data) # forward-backward filter
def output_to_target(output, max_det=300):
"""Converts YOLOv5 model output to [batch_id, class_id, x, y, w, h, conf] format for plotting, limiting detections
to `max_det`.
"""
targets = []
for i, o in enumerate(output):
box, conf, cls = o[:max_det, :6].cpu().split((4, 1, 1), 1)
j = torch.full((conf.shape[0], 1), i)
targets.append(torch.cat((j, cls, xyxy2xywh(box), conf), 1))
return torch.cat(targets, 0).numpy()
@threaded
def plot_images(images, targets, paths=None, fname="images.jpg", names=None):
"""Plots an image grid with labels from YOLOv5 predictions or targets, saving to `fname`."""
if isinstance(images, torch.Tensor):
images = images.cpu().float().numpy()
if isinstance(targets, torch.Tensor):
targets = targets.cpu().numpy()
max_size = 1920 # max image size
max_subplots = 16 # max image subplots, i.e. 4x4
bs, _, h, w = images.shape # batch size, _, height, width
bs = min(bs, max_subplots) # limit plot images
ns = np.ceil(bs**0.5) # number of subplots (square)
if np.max(images[0]) <= 1:
images *= 255 # de-normalise (optional)
# Build Image
mosaic = np.full((int(ns * h), int(ns * w), 3), 255, dtype=np.uint8) # init
for i, im in enumerate(images):
if i == max_subplots: # if last batch has fewer images than we expect
break
x, y = int(w * (i // ns)), int(h * (i % ns)) # block origin
im = im.transpose(1, 2, 0)
mosaic[y : y + h, x : x + w, :] = im
# Resize (optional)
scale = max_size / ns / max(h, w)
if scale < 1:
h = math.ceil(scale * h)
w = math.ceil(scale * w)
mosaic = cv2.resize(mosaic, tuple(int(x * ns) for x in (w, h)))
# Annotate
fs = int((h + w) * ns * 0.01) # font size
annotator = Annotator(mosaic, line_width=round(fs / 10), font_size=fs, pil=True, example=names)
for i in range(i + 1):
x, y = int(w * (i // ns)), int(h * (i % ns)) # block origin
annotator.rectangle([x, y, x + w, y + h], None, (255, 255, 255), width=2) # borders
if paths:
annotator.text([x + 5, y + 5], text=Path(paths[i]).name[:40], txt_color=(220, 220, 220)) # filenames
if len(targets) > 0:
ti = targets[targets[:, 0] == i] # image targets
boxes = xywh2xyxy(ti[:, 2:6]).T
classes = ti[:, 1].astype("int")
labels = ti.shape[1] == 6 # labels if no conf column
conf = None if labels else ti[:, 6] # check for confidence presence (label vs pred)
if boxes.shape[1]:
if boxes.max() <= 1.01: # if normalized with tolerance 0.01
boxes[[0, 2]] *= w # scale to pixels
boxes[[1, 3]] *= h
elif scale < 1: # absolute coords need scale if image scales
boxes *= scale
boxes[[0, 2]] += x
boxes[[1, 3]] += y
for j, box in enumerate(boxes.T.tolist()):
cls = classes[j]
color = colors(cls)
cls = names[cls] if names else cls
if labels or conf[j] > 0.25: # 0.25 conf thresh
label = f"{cls}" if labels else f"{cls} {conf[j]:.1f}"
annotator.box_label(box, label, color=color)
annotator.im.save(fname) # save
def plot_lr_scheduler(optimizer, scheduler, epochs=300, save_dir=""):
"""Plots learning rate schedule for given optimizer and scheduler, saving plot to `save_dir`."""
optimizer, scheduler = copy(optimizer), copy(scheduler) # do not modify originals
y = []
for _ in range(epochs):
scheduler.step()
y.append(optimizer.param_groups[0]["lr"])
plt.plot(y, ".-", label="LR")
plt.xlabel("epoch")
plt.ylabel("LR")
plt.grid()
plt.xlim(0, epochs)
plt.ylim(0)
plt.savefig(Path(save_dir) / "LR.png", dpi=200)
plt.close()
def plot_val_txt():
"""
Plots 2D and 1D histograms of bounding box centers from 'val.txt' using matplotlib, saving as 'hist2d.png' and
'hist1d.png'.
Example: from utils.plots import *; plot_val()
"""
x = np.loadtxt("val.txt", dtype=np.float32)
box = xyxy2xywh(x[:, :4])
cx, cy = box[:, 0], box[:, 1]
fig, ax = plt.subplots(1, 1, figsize=(6, 6), tight_layout=True)
ax.hist2d(cx, cy, bins=600, cmax=10, cmin=0)
ax.set_aspect("equal")
plt.savefig("hist2d.png", dpi=300)
fig, ax = plt.subplots(1, 2, figsize=(12, 6), tight_layout=True)
ax[0].hist(cx, bins=600)
ax[1].hist(cy, bins=600)
plt.savefig("hist1d.png", dpi=200)
def plot_targets_txt():
"""
Plots histograms of object detection targets from 'targets.txt', saving the figure as 'targets.jpg'.
Example: from utils.plots import *; plot_targets_txt()
"""
x = np.loadtxt("targets.txt", dtype=np.float32).T
s = ["x targets", "y targets", "width targets", "height targets"]
fig, ax = plt.subplots(2, 2, figsize=(8, 8), tight_layout=True)
ax = ax.ravel()
for i in range(4):
ax[i].hist(x[i], bins=100, label=f"{x[i].mean():.3g} +/- {x[i].std():.3g}")
ax[i].legend()
ax[i].set_title(s[i])
plt.savefig("targets.jpg", dpi=200)
def plot_val_study(file="", dir="", x=None):
"""
Plots validation study results from 'study*.txt' files in a directory or a specific file, comparing model
performance and speed.
Example: from utils.plots import *; plot_val_study()
"""
save_dir = Path(file).parent if file else Path(dir)
plot2 = False # plot additional results
if plot2:
ax = plt.subplots(2, 4, figsize=(10, 6), tight_layout=True)[1].ravel()
fig2, ax2 = plt.subplots(1, 1, figsize=(8, 4), tight_layout=True)
# for f in [save_dir / f'study_coco_{x}.txt' for x in ['yolov5n6', 'yolov5s6', 'yolov5m6', 'yolov5l6', 'yolov5x6']]:
for f in sorted(save_dir.glob("study*.txt")):
y = np.loadtxt(f, dtype=np.float32, usecols=[0, 1, 2, 3, 7, 8, 9], ndmin=2).T
x = np.arange(y.shape[1]) if x is None else np.array(x)
if plot2:
s = ["P", "R", "mAP@.5", "mAP@.5:.95", "t_preprocess (ms/img)", "t_inference (ms/img)", "t_NMS (ms/img)"]
for i in range(7):
ax[i].plot(x, y[i], ".-", linewidth=2, markersize=8)
ax[i].set_title(s[i])
j = y[3].argmax() + 1
ax2.plot(
y[5, 1:j],
y[3, 1:j] * 1e2,
".-",
linewidth=2,
markersize=8,
label=f.stem.replace("study_coco_", "").replace("yolo", "YOLO"),
)
ax2.plot(
1e3 / np.array([209, 140, 97, 58, 35, 18]),
[34.6, 40.5, 43.0, 47.5, 49.7, 51.5],
"k.-",
linewidth=2,
markersize=8,
alpha=0.25,
label="EfficientDet",
)
ax2.grid(alpha=0.2)
ax2.set_yticks(np.arange(20, 60, 5))
ax2.set_xlim(0, 57)
ax2.set_ylim(25, 55)
ax2.set_xlabel("GPU Speed (ms/img)")
ax2.set_ylabel("COCO AP val")
ax2.legend(loc="lower right")
f = save_dir / "study.png"
print(f"Saving {f}...")
plt.savefig(f, dpi=300)
@TryExcept() # known issue https://github.com/ultralytics/yolov5/issues/5395
def plot_labels(labels, names=(), save_dir=Path("")):
"""Plots dataset labels, saving correlogram and label images, handles classes, and visualizes bounding boxes."""
LOGGER.info(f"Plotting labels to {save_dir / 'labels.jpg'}... ")
c, b = labels[:, 0], labels[:, 1:].transpose() # classes, boxes
nc = int(c.max() + 1) # number of classes
x = pd.DataFrame(b.transpose(), columns=["x", "y", "width", "height"])
# seaborn correlogram
sn.pairplot(x, corner=True, diag_kind="auto", kind="hist", diag_kws=dict(bins=50), plot_kws=dict(pmax=0.9))
plt.savefig(save_dir / "labels_correlogram.jpg", dpi=200)
plt.close()
# matplotlib labels
matplotlib.use("svg") # faster
ax = plt.subplots(2, 2, figsize=(8, 8), tight_layout=True)[1].ravel()
y = ax[0].hist(c, bins=np.linspace(0, nc, nc + 1) - 0.5, rwidth=0.8)
with contextlib.suppress(Exception): # color histogram bars by class
[y[2].patches[i].set_color([x / 255 for x in colors(i)]) for i in range(nc)] # known issue #3195
ax[0].set_ylabel("instances")
if 0 < len(names) < 30:
ax[0].set_xticks(range(len(names)))
ax[0].set_xticklabels(list(names.values()), rotation=90, fontsize=10)
else:
ax[0].set_xlabel("classes")
sn.histplot(x, x="x", y="y", ax=ax[2], bins=50, pmax=0.9)
sn.histplot(x, x="width", y="height", ax=ax[3], bins=50, pmax=0.9)
# rectangles
labels[:, 1:3] = 0.5 # center
labels[:, 1:] = xywh2xyxy(labels[:, 1:]) * 2000
img = Image.fromarray(np.ones((2000, 2000, 3), dtype=np.uint8) * 255)
for cls, *box in labels[:1000]:
ImageDraw.Draw(img).rectangle(box, width=1, outline=colors(cls)) # plot
ax[1].imshow(img)
ax[1].axis("off")
for a in [0, 1, 2, 3]:
for s in ["top", "right", "left", "bottom"]:
ax[a].spines[s].set_visible(False)
plt.savefig(save_dir / "labels.jpg", dpi=200)
matplotlib.use("Agg")
plt.close()
def imshow_cls(im, labels=None, pred=None, names=None, nmax=25, verbose=False, f=Path("images.jpg")):
"""Displays a grid of images with optional labels and predictions, saving to a file."""
from utils.augmentations import denormalize
names = names or [f"class{i}" for i in range(1000)]
blocks = torch.chunk(
denormalize(im.clone()).cpu().float(), len(im), dim=0
) # select batch index 0, block by channels
n = min(len(blocks), nmax) # number of plots
m = min(8, round(n**0.5)) # 8 x 8 default
fig, ax = plt.subplots(math.ceil(n / m), m) # 8 rows x n/8 cols
ax = ax.ravel() if m > 1 else [ax]
# plt.subplots_adjust(wspace=0.05, hspace=0.05)
for i in range(n):
ax[i].imshow(blocks[i].squeeze().permute((1, 2, 0)).numpy().clip(0.0, 1.0))
ax[i].axis("off")
if labels is not None:
s = names[labels[i]] + (f"—{names[pred[i]]}" if pred is not None else "")
ax[i].set_title(s, fontsize=8, verticalalignment="top")
plt.savefig(f, dpi=300, bbox_inches="tight")
plt.close()
if verbose:
LOGGER.info(f"Saving {f}")
if labels is not None:
LOGGER.info("True: " + " ".join(f"{names[i]:3s}" for i in labels[:nmax]))
if pred is not None:
LOGGER.info("Predicted:" + " ".join(f"{names[i]:3s}" for i in pred[:nmax]))
return f
def plot_evolve(evolve_csv="path/to/evolve.csv"):
"""
Plots hyperparameter evolution results from a given CSV, saving the plot and displaying best results.
Example: from utils.plots import *; plot_evolve()
"""
evolve_csv = Path(evolve_csv)
data = pd.read_csv(evolve_csv)
keys = [x.strip() for x in data.columns]
x = data.values
f = fitness(x)
j = np.argmax(f) # max fitness index
plt.figure(figsize=(10, 12), tight_layout=True)
matplotlib.rc("font", **{"size": 8})
print(f"Best results from row {j} of {evolve_csv}:")
for i, k in enumerate(keys[7:]):
v = x[:, 7 + i]
mu = v[j] # best single result
plt.subplot(6, 5, i + 1)
plt.scatter(v, f, c=hist2d(v, f, 20), cmap="viridis", alpha=0.8, edgecolors="none")
plt.plot(mu, f.max(), "k+", markersize=15)
plt.title(f"{k} = {mu:.3g}", fontdict={"size": 9}) # limit to 40 characters
if i % 5 != 0:
plt.yticks([])
print(f"{k:>15}: {mu:.3g}")
f = evolve_csv.with_suffix(".png") # filename
plt.savefig(f, dpi=200)
plt.close()
print(f"Saved {f}")
def plot_results(file="path/to/results.csv", dir=""):
"""
Plots training results from a 'results.csv' file; accepts file path and directory as arguments.
Example: from utils.plots import *; plot_results('path/to/results.csv')
"""
save_dir = Path(file).parent if file else Path(dir)
fig, ax = plt.subplots(2, 5, figsize=(12, 6), tight_layout=True)
ax = ax.ravel()
files = list(save_dir.glob("results*.csv"))
assert len(files), f"No results.csv files found in {save_dir.resolve()}, nothing to plot."
for f in files:
try:
data = pd.read_csv(f)
s = [x.strip() for x in data.columns]
x = data.values[:, 0]
for i, j in enumerate([1, 2, 3, 4, 5, 8, 9, 10, 6, 7]):
y = data.values[:, j].astype("float")
# y[y == 0] = np.nan # don't show zero values
ax[i].plot(x, y, marker=".", label=f.stem, linewidth=2, markersize=8) # actual results
ax[i].plot(x, gaussian_filter1d(y, sigma=3), ":", label="smooth", linewidth=2) # smoothing line
ax[i].set_title(s[j], fontsize=12)
# if j in [8, 9, 10]: # share train and val loss y axes
# ax[i].get_shared_y_axes().join(ax[i], ax[i - 5])
except Exception as e:
LOGGER.info(f"Warning: Plotting error for {f}: {e}")
ax[1].legend()
fig.savefig(save_dir / "results.png", dpi=200)
plt.close()
def profile_idetection(start=0, stop=0, labels=(), save_dir=""):
"""
Plots per-image iDetection logs, comparing metrics like storage and performance over time.
Example: from utils.plots import *; profile_idetection()
"""
ax = plt.subplots(2, 4, figsize=(12, 6), tight_layout=True)[1].ravel()
s = ["Images", "Free Storage (GB)", "RAM Usage (GB)", "Battery", "dt_raw (ms)", "dt_smooth (ms)", "real-world FPS"]
files = list(Path(save_dir).glob("frames*.txt"))
for fi, f in enumerate(files):
try:
results = np.loadtxt(f, ndmin=2).T[:, 90:-30] # clip first and last rows
n = results.shape[1] # number of rows
x = np.arange(start, min(stop, n) if stop else n)
results = results[:, x]
t = results[0] - results[0].min() # set t0=0s
results[0] = x
for i, a in enumerate(ax):
if i < len(results):
label = labels[fi] if len(labels) else f.stem.replace("frames_", "")
a.plot(t, results[i], marker=".", label=label, linewidth=1, markersize=5)
a.set_title(s[i])
a.set_xlabel("time (s)")
# if fi == len(files) - 1:
# a.set_ylim(bottom=0)
for side in ["top", "right"]:
a.spines[side].set_visible(False)
else:
a.remove()
except Exception as e:
print(f"Warning: Plotting error for {f}; {e}")
ax[1].legend()
plt.savefig(Path(save_dir) / "idetection_profile.png", dpi=200)
def save_one_box(xyxy, im, file=Path("im.jpg"), gain=1.02, pad=10, square=False, BGR=False, save=True):
"""Crops and saves an image from bounding box `xyxy`, applied with `gain` and `pad`, optionally squares and adjusts
for BGR.
"""
xyxy = torch.tensor(xyxy).view(-1, 4)
b = xyxy2xywh(xyxy) # boxes
if square:
b[:, 2:] = b[:, 2:].max(1)[0].unsqueeze(1) # attempt rectangle to square
b[:, 2:] = b[:, 2:] * gain + pad # box wh * gain + pad
xyxy = xywh2xyxy(b).long()
clip_boxes(xyxy, im.shape)
crop = im[int(xyxy[0, 1]) : int(xyxy[0, 3]), int(xyxy[0, 0]) : int(xyxy[0, 2]), :: (1 if BGR else -1)]
if save:
file.parent.mkdir(parents=True, exist_ok=True) # make directory
f = str(increment_path(file).with_suffix(".jpg"))
# cv2.imwrite(f, crop) # save BGR, https://github.com/ultralytics/yolov5/issues/7007 chroma subsampling issue
Image.fromarray(crop[..., ::-1]).save(f, quality=95, subsampling=0) # save RGB
return crop
# YOLOv5 🚀 by Ultralytics, AGPL-3.0 license
"""Image augmentation functions."""
import math
import random
import cv2
import numpy as np
from ..augmentations import box_candidates
from ..general import resample_segments, segment2box
def mixup(im, labels, segments, im2, labels2, segments2):
"""
Applies MixUp augmentation blending two images, labels, and segments with a random ratio.
See https://arxiv.org/pdf/1710.09412.pdf
"""
r = np.random.beta(32.0, 32.0) # mixup ratio, alpha=beta=32.0
im = (im * r + im2 * (1 - r)).astype(np.uint8)
labels = np.concatenate((labels, labels2), 0)
segments = np.concatenate((segments, segments2), 0)
return im, labels, segments
def random_perspective(
im, targets=(), segments=(), degrees=10, translate=0.1, scale=0.1, shear=10, perspective=0.0, border=(0, 0)
):
# torchvision.transforms.RandomAffine(degrees=(-10, 10), translate=(.1, .1), scale=(.9, 1.1), shear=(-10, 10))
# targets = [cls, xyxy]
height = im.shape[0] + border[0] * 2 # shape(h,w,c)
width = im.shape[1] + border[1] * 2
# Center
C = np.eye(3)
C[0, 2] = -im.shape[1] / 2 # x translation (pixels)
C[1, 2] = -im.shape[0] / 2 # y translation (pixels)
# Perspective
P = np.eye(3)
P[2, 0] = random.uniform(-perspective, perspective) # x perspective (about y)
P[2, 1] = random.uniform(-perspective, perspective) # y perspective (about x)
# Rotation and Scale
R = np.eye(3)
a = random.uniform(-degrees, degrees)
# a += random.choice([-180, -90, 0, 90]) # add 90deg rotations to small rotations
s = random.uniform(1 - scale, 1 + scale)
# s = 2 ** random.uniform(-scale, scale)
R[:2] = cv2.getRotationMatrix2D(angle=a, center=(0, 0), scale=s)
# Shear
S = np.eye(3)
S[0, 1] = math.tan(random.uniform(-shear, shear) * math.pi / 180) # x shear (deg)
S[1, 0] = math.tan(random.uniform(-shear, shear) * math.pi / 180) # y shear (deg)
# Translation
T = np.eye(3)
T[0, 2] = random.uniform(0.5 - translate, 0.5 + translate) * width # x translation (pixels)
T[1, 2] = random.uniform(0.5 - translate, 0.5 + translate) * height # y translation (pixels)
# Combined rotation matrix
M = T @ S @ R @ P @ C # order of operations (right to left) is IMPORTANT
if (border[0] != 0) or (border[1] != 0) or (M != np.eye(3)).any(): # image changed
if perspective:
im = cv2.warpPerspective(im, M, dsize=(width, height), borderValue=(114, 114, 114))
else: # affine
im = cv2.warpAffine(im, M[:2], dsize=(width, height), borderValue=(114, 114, 114))
# Visualize
# import matplotlib.pyplot as plt
# ax = plt.subplots(1, 2, figsize=(12, 6))[1].ravel()
# ax[0].imshow(im[:, :, ::-1]) # base
# ax[1].imshow(im2[:, :, ::-1]) # warped
# Transform label coordinates
n = len(targets)
new_segments = []
if n:
new = np.zeros((n, 4))
segments = resample_segments(segments) # upsample
for i, segment in enumerate(segments):
xy = np.ones((len(segment), 3))
xy[:, :2] = segment
xy = xy @ M.T # transform
xy = xy[:, :2] / xy[:, 2:3] if perspective else xy[:, :2] # perspective rescale or affine
# clip
new[i] = segment2box(xy, width, height)
new_segments.append(xy)
# filter candidates
i = box_candidates(box1=targets[:, 1:5].T * s, box2=new.T, area_thr=0.01)
targets = targets[i]
targets[:, 1:5] = new[i]
new_segments = np.array(new_segments)[i]
return im, targets, new_segments
# YOLOv5 🚀 by Ultralytics, AGPL-3.0 license
"""Dataloaders."""
import os
import random
import cv2
import numpy as np
import torch
from torch.utils.data import DataLoader, distributed
from ..augmentations import augment_hsv, copy_paste, letterbox
from ..dataloaders import InfiniteDataLoader, LoadImagesAndLabels, SmartDistributedSampler, seed_worker
from ..general import LOGGER, xyn2xy, xywhn2xyxy, xyxy2xywhn
from ..torch_utils import torch_distributed_zero_first
from .augmentations import mixup, random_perspective
RANK = int(os.getenv("RANK", -1))
def create_dataloader(
path,
imgsz,
batch_size,
stride,
single_cls=False,
hyp=None,
augment=False,
cache=False,
pad=0.0,
rect=False,
rank=-1,
workers=8,
image_weights=False,
quad=False,
prefix="",
shuffle=False,
mask_downsample_ratio=1,
overlap_mask=False,
seed=0,
):
if rect and shuffle:
LOGGER.warning("WARNING ⚠️ --rect is incompatible with DataLoader shuffle, setting shuffle=False")
shuffle = False
with torch_distributed_zero_first(rank): # init dataset *.cache only once if DDP
dataset = LoadImagesAndLabelsAndMasks(
path,
imgsz,
batch_size,
augment=augment, # augmentation
hyp=hyp, # hyperparameters
rect=rect, # rectangular batches
cache_images=cache,
single_cls=single_cls,
stride=int(stride),
pad=pad,
image_weights=image_weights,
prefix=prefix,
downsample_ratio=mask_downsample_ratio,
overlap=overlap_mask,
rank=rank,
)
batch_size = min(batch_size, len(dataset))
nd = torch.cuda.device_count() # number of CUDA devices
nw = min([os.cpu_count() // max(nd, 1), batch_size if batch_size > 1 else 0, workers]) # number of workers
sampler = None if rank == -1 else SmartDistributedSampler(dataset, shuffle=shuffle)
loader = DataLoader if image_weights else InfiniteDataLoader # only DataLoader allows for attribute updates
generator = torch.Generator()
generator.manual_seed(6148914691236517205 + seed + RANK)
return loader(
dataset,
batch_size=batch_size,
shuffle=shuffle and sampler is None,
num_workers=nw,
sampler=sampler,
pin_memory=True,
collate_fn=LoadImagesAndLabelsAndMasks.collate_fn4 if quad else LoadImagesAndLabelsAndMasks.collate_fn,
worker_init_fn=seed_worker,
generator=generator,
), dataset
class LoadImagesAndLabelsAndMasks(LoadImagesAndLabels): # for training/testing
def __init__(
self,
path,
img_size=640,
batch_size=16,
augment=False,
hyp=None,
rect=False,
image_weights=False,
cache_images=False,
single_cls=False,
stride=32,
pad=0,
min_items=0,
prefix="",
downsample_ratio=1,
overlap=False,
rank=-1,
seed=0,
):
super().__init__(
path,
img_size,
batch_size,
augment,
hyp,
rect,
image_weights,
cache_images,
single_cls,
stride,
pad,
min_items,
prefix,
rank,
seed,
)
self.downsample_ratio = downsample_ratio
self.overlap = overlap
def __getitem__(self, index):
"""Returns a transformed item from the dataset at the specified index, handling indexing and image weighting."""
index = self.indices[index] # linear, shuffled, or image_weights
hyp = self.hyp
mosaic = self.mosaic and random.random() < hyp["mosaic"]
masks = []
if mosaic:
# Load mosaic
img, labels, segments = self.load_mosaic(index)
shapes = None
# MixUp augmentation
if random.random() < hyp["mixup"]:
img, labels, segments = mixup(img, labels, segments, *self.load_mosaic(random.randint(0, self.n - 1)))
else:
# Load image
img, (h0, w0), (h, w) = self.load_image(index)
# Letterbox
shape = self.batch_shapes[self.batch[index]] if self.rect else self.img_size # final letterboxed shape
img, ratio, pad = letterbox(img, shape, auto=False, scaleup=self.augment)
shapes = (h0, w0), ((h / h0, w / w0), pad) # for COCO mAP rescaling
labels = self.labels[index].copy()
# [array, array, ....], array.shape=(num_points, 2), xyxyxyxy
segments = self.segments[index].copy()
if len(segments):
for i_s in range(len(segments)):
segments[i_s] = xyn2xy(
segments[i_s],
ratio[0] * w,
ratio[1] * h,
padw=pad[0],
padh=pad[1],
)
if labels.size: # normalized xywh to pixel xyxy format
labels[:, 1:] = xywhn2xyxy(labels[:, 1:], ratio[0] * w, ratio[1] * h, padw=pad[0], padh=pad[1])
if self.augment:
img, labels, segments = random_perspective(
img,
labels,
segments=segments,
degrees=hyp["degrees"],
translate=hyp["translate"],
scale=hyp["scale"],
shear=hyp["shear"],
perspective=hyp["perspective"],
)
nl = len(labels) # number of labels
if nl:
labels[:, 1:5] = xyxy2xywhn(labels[:, 1:5], w=img.shape[1], h=img.shape[0], clip=True, eps=1e-3)
if self.overlap:
masks, sorted_idx = polygons2masks_overlap(
img.shape[:2], segments, downsample_ratio=self.downsample_ratio
)
masks = masks[None] # (640, 640) -> (1, 640, 640)
labels = labels[sorted_idx]
else:
masks = polygons2masks(img.shape[:2], segments, color=1, downsample_ratio=self.downsample_ratio)
masks = (
torch.from_numpy(masks)
if len(masks)
else torch.zeros(
1 if self.overlap else nl, img.shape[0] // self.downsample_ratio, img.shape[1] // self.downsample_ratio
)
)
# TODO: albumentations support
if self.augment:
# Albumentations
# there are some augmentation that won't change boxes and masks,
# so just be it for now.
img, labels = self.albumentations(img, labels)
nl = len(labels) # update after albumentations
# HSV color-space
augment_hsv(img, hgain=hyp["hsv_h"], sgain=hyp["hsv_s"], vgain=hyp["hsv_v"])
# Flip up-down
if random.random() < hyp["flipud"]:
img = np.flipud(img)
if nl:
labels[:, 2] = 1 - labels[:, 2]
masks = torch.flip(masks, dims=[1])
# Flip left-right
if random.random() < hyp["fliplr"]:
img = np.fliplr(img)
if nl:
labels[:, 1] = 1 - labels[:, 1]
masks = torch.flip(masks, dims=[2])
# Cutouts # labels = cutout(img, labels, p=0.5)
labels_out = torch.zeros((nl, 6))
if nl:
labels_out[:, 1:] = torch.from_numpy(labels)
# Convert
img = img.transpose((2, 0, 1))[::-1] # HWC to CHW, BGR to RGB
img = np.ascontiguousarray(img)
return (torch.from_numpy(img), labels_out, self.im_files[index], shapes, masks)
def load_mosaic(self, index):
"""Loads 1 image + 3 random images into a 4-image YOLOv5 mosaic, adjusting labels and segments accordingly."""
labels4, segments4 = [], []
s = self.img_size
yc, xc = (int(random.uniform(-x, 2 * s + x)) for x in self.mosaic_border) # mosaic center x, y
# 3 additional image indices
indices = [index] + random.choices(self.indices, k=3) # 3 additional image indices
for i, index in enumerate(indices):
# Load image
img, _, (h, w) = self.load_image(index)
# place img in img4
if i == 0: # top left
img4 = np.full((s * 2, s * 2, img.shape[2]), 114, dtype=np.uint8) # base image with 4 tiles
x1a, y1a, x2a, y2a = max(xc - w, 0), max(yc - h, 0), xc, yc # xmin, ymin, xmax, ymax (large image)
x1b, y1b, x2b, y2b = w - (x2a - x1a), h - (y2a - y1a), w, h # xmin, ymin, xmax, ymax (small image)
elif i == 1: # top right
x1a, y1a, x2a, y2a = xc, max(yc - h, 0), min(xc + w, s * 2), yc
x1b, y1b, x2b, y2b = 0, h - (y2a - y1a), min(w, x2a - x1a), h
elif i == 2: # bottom left
x1a, y1a, x2a, y2a = max(xc - w, 0), yc, xc, min(s * 2, yc + h)
x1b, y1b, x2b, y2b = w - (x2a - x1a), 0, w, min(y2a - y1a, h)
elif i == 3: # bottom right
x1a, y1a, x2a, y2a = xc, yc, min(xc + w, s * 2), min(s * 2, yc + h)
x1b, y1b, x2b, y2b = 0, 0, min(w, x2a - x1a), min(y2a - y1a, h)
img4[y1a:y2a, x1a:x2a] = img[y1b:y2b, x1b:x2b] # img4[ymin:ymax, xmin:xmax]
padw = x1a - x1b
padh = y1a - y1b
labels, segments = self.labels[index].copy(), self.segments[index].copy()
if labels.size:
labels[:, 1:] = xywhn2xyxy(labels[:, 1:], w, h, padw, padh) # normalized xywh to pixel xyxy format
segments = [xyn2xy(x, w, h, padw, padh) for x in segments]
labels4.append(labels)
segments4.extend(segments)
# Concat/clip labels
labels4 = np.concatenate(labels4, 0)
for x in (labels4[:, 1:], *segments4):
np.clip(x, 0, 2 * s, out=x) # clip when using random_perspective()
# img4, labels4 = replicate(img4, labels4) # replicate
# Augment
img4, labels4, segments4 = copy_paste(img4, labels4, segments4, p=self.hyp["copy_paste"])
img4, labels4, segments4 = random_perspective(
img4,
labels4,
segments4,
degrees=self.hyp["degrees"],
translate=self.hyp["translate"],
scale=self.hyp["scale"],
shear=self.hyp["shear"],
perspective=self.hyp["perspective"],
border=self.mosaic_border,
) # border to remove
return img4, labels4, segments4
@staticmethod
def collate_fn(batch):
"""Custom collation function for DataLoader, batches images, labels, paths, shapes, and segmentation masks."""
img, label, path, shapes, masks = zip(*batch) # transposed
batched_masks = torch.cat(masks, 0)
for i, l in enumerate(label):
l[:, 0] = i # add target image index for build_targets()
return torch.stack(img, 0), torch.cat(label, 0), path, shapes, batched_masks
def polygon2mask(img_size, polygons, color=1, downsample_ratio=1):
"""
Args:
img_size (tuple): The image size.
polygons (np.ndarray): [N, M], N is the number of polygons,
M is the number of points(Be divided by 2).
"""
mask = np.zeros(img_size, dtype=np.uint8)
polygons = np.asarray(polygons)
polygons = polygons.astype(np.int32)
shape = polygons.shape
polygons = polygons.reshape(shape[0], -1, 2)
cv2.fillPoly(mask, polygons, color=color)
nh, nw = (img_size[0] // downsample_ratio, img_size[1] // downsample_ratio)
# NOTE: fillPoly firstly then resize is trying the keep the same way
# of loss calculation when mask-ratio=1.
mask = cv2.resize(mask, (nw, nh))
return mask
def polygons2masks(img_size, polygons, color, downsample_ratio=1):
"""
Args:
img_size (tuple): The image size.
polygons (list[np.ndarray]): each polygon is [N, M],
N is the number of polygons,
M is the number of points(Be divided by 2).
"""
masks = []
for si in range(len(polygons)):
mask = polygon2mask(img_size, [polygons[si].reshape(-1)], color, downsample_ratio)
masks.append(mask)
return np.array(masks)
def polygons2masks_overlap(img_size, segments, downsample_ratio=1):
"""Return a (640, 640) overlap mask."""
masks = np.zeros(
(img_size[0] // downsample_ratio, img_size[1] // downsample_ratio),
dtype=np.int32 if len(segments) > 255 else np.uint8,
)
areas = []
ms = []
for si in range(len(segments)):
mask = polygon2mask(
img_size,
[segments[si].reshape(-1)],
downsample_ratio=downsample_ratio,
color=1,
)
ms.append(mask)
areas.append(mask.sum())
areas = np.asarray(areas)
index = np.argsort(-areas)
ms = np.array(ms)[index]
for i in range(len(segments)):
mask = ms[i] * (i + 1)
masks = masks + mask
masks = np.clip(masks, a_min=0, a_max=i + 1)
return masks, index
import cv2
import numpy as np
import torch
import torch.nn.functional as F
def crop_mask(masks, boxes):
"""
"Crop" predicted masks by zeroing out everything not in the predicted bbox. Vectorized by Chong (thanks Chong).
Args:
- masks should be a size [n, h, w] tensor of masks
- boxes should be a size [n, 4] tensor of bbox coords in relative point form
"""
n, h, w = masks.shape
x1, y1, x2, y2 = torch.chunk(boxes[:, :, None], 4, 1) # x1 shape(1,1,n)
r = torch.arange(w, device=masks.device, dtype=x1.dtype)[None, None, :] # rows shape(1,w,1)
c = torch.arange(h, device=masks.device, dtype=x1.dtype)[None, :, None] # cols shape(h,1,1)
return masks * ((r >= x1) * (r < x2) * (c >= y1) * (c < y2))
def process_mask_upsample(protos, masks_in, bboxes, shape):
"""
Crop after upsample.
protos: [mask_dim, mask_h, mask_w]
masks_in: [n, mask_dim], n is number of masks after nms
bboxes: [n, 4], n is number of masks after nms
shape: input_image_size, (h, w)
return: h, w, n
"""
c, mh, mw = protos.shape # CHW
masks = (masks_in @ protos.float().view(c, -1)).sigmoid().view(-1, mh, mw)
masks = F.interpolate(masks[None], shape, mode="bilinear", align_corners=False)[0] # CHW
masks = crop_mask(masks, bboxes) # CHW
return masks.gt_(0.5)
def process_mask(protos, masks_in, bboxes, shape, upsample=False):
"""
Crop before upsample.
proto_out: [mask_dim, mask_h, mask_w]
out_masks: [n, mask_dim], n is number of masks after nms
bboxes: [n, 4], n is number of masks after nms
shape:input_image_size, (h, w)
return: h, w, n
"""
c, mh, mw = protos.shape # CHW
ih, iw = shape
masks = (masks_in @ protos.float().view(c, -1)).sigmoid().view(-1, mh, mw) # CHW
downsampled_bboxes = bboxes.clone()
downsampled_bboxes[:, 0] *= mw / iw
downsampled_bboxes[:, 2] *= mw / iw
downsampled_bboxes[:, 3] *= mh / ih
downsampled_bboxes[:, 1] *= mh / ih
masks = crop_mask(masks, downsampled_bboxes) # CHW
if upsample:
masks = F.interpolate(masks[None], shape, mode="bilinear", align_corners=False)[0] # CHW
return masks.gt_(0.5)
def process_mask_native(protos, masks_in, bboxes, shape):
"""
Crop after upsample.
protos: [mask_dim, mask_h, mask_w]
masks_in: [n, mask_dim], n is number of masks after nms
bboxes: [n, 4], n is number of masks after nms
shape: input_image_size, (h, w)
return: h, w, n
"""
c, mh, mw = protos.shape # CHW
masks = (masks_in @ protos.float().view(c, -1)).sigmoid().view(-1, mh, mw)
gain = min(mh / shape[0], mw / shape[1]) # gain = old / new
pad = (mw - shape[1] * gain) / 2, (mh - shape[0] * gain) / 2 # wh padding
top, left = int(pad[1]), int(pad[0]) # y, x
bottom, right = int(mh - pad[1]), int(mw - pad[0])
masks = masks[:, top:bottom, left:right]
masks = F.interpolate(masks[None], shape, mode="bilinear", align_corners=False)[0] # CHW
masks = crop_mask(masks, bboxes) # CHW
return masks.gt_(0.5)
def scale_image(im1_shape, masks, im0_shape, ratio_pad=None):
"""
img1_shape: model input shape, [h, w]
img0_shape: origin pic shape, [h, w, 3]
masks: [h, w, num]
"""
# Rescale coordinates (xyxy) from im1_shape to im0_shape
if ratio_pad is None: # calculate from im0_shape
gain = min(im1_shape[0] / im0_shape[0], im1_shape[1] / im0_shape[1]) # gain = old / new
pad = (im1_shape[1] - im0_shape[1] * gain) / 2, (im1_shape[0] - im0_shape[0] * gain) / 2 # wh padding
else:
pad = ratio_pad[1]
top, left = int(pad[1]), int(pad[0]) # y, x
bottom, right = int(im1_shape[0] - pad[1]), int(im1_shape[1] - pad[0])
if len(masks.shape) < 2:
raise ValueError(f'"len of masks shape" should be 2 or 3, but got {len(masks.shape)}')
masks = masks[top:bottom, left:right]
# masks = masks.permute(2, 0, 1).contiguous()
# masks = F.interpolate(masks[None], im0_shape[:2], mode='bilinear', align_corners=False)[0]
# masks = masks.permute(1, 2, 0).contiguous()
masks = cv2.resize(masks, (im0_shape[1], im0_shape[0]))
if len(masks.shape) == 2:
masks = masks[:, :, None]
return masks
def mask_iou(mask1, mask2, eps=1e-7):
"""
mask1: [N, n] m1 means number of predicted objects
mask2: [M, n] m2 means number of gt objects
Note: n means image_w x image_h
return: masks iou, [N, M]
"""
intersection = torch.matmul(mask1, mask2.t()).clamp(0)
union = (mask1.sum(1)[:, None] + mask2.sum(1)[None]) - intersection # (area1 + area2) - intersection
return intersection / (union + eps)
def masks_iou(mask1, mask2, eps=1e-7):
"""
mask1: [N, n] m1 means number of predicted objects
mask2: [N, n] m2 means number of gt objects
Note: n means image_w x image_h
return: masks iou, (N, )
"""
intersection = (mask1 * mask2).sum(1).clamp(0) # (N, )
union = (mask1.sum(1) + mask2.sum(1))[None] - intersection # (area1 + area2) - intersection
return intersection / (union + eps)
def masks2segments(masks, strategy="largest"):
"""Converts binary (n,160,160) masks to polygon segments with options for concatenation or selecting the largest
segment.
"""
segments = []
for x in masks.int().cpu().numpy().astype("uint8"):
c = cv2.findContours(x, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[0]
if c:
if strategy == "concat": # concatenate all segments
c = np.concatenate([x.reshape(-1, 2) for x in c])
elif strategy == "largest": # select largest segment
c = np.array(c[np.array([len(x) for x in c]).argmax()]).reshape(-1, 2)
else:
c = np.zeros((0, 2)) # no segments found
segments.append(c.astype("float32"))
return segments
import torch
import torch.nn as nn
import torch.nn.functional as F
from ..general import xywh2xyxy
from ..loss import FocalLoss, smooth_BCE
from ..metrics import bbox_iou
from ..torch_utils import de_parallel
from .general import crop_mask
class ComputeLoss:
# Compute losses
def __init__(self, model, autobalance=False, overlap=False):
"""Initializes the compute loss function for YOLOv5 models with options for autobalancing and overlap
handling.
"""
self.sort_obj_iou = False
self.overlap = overlap
device = next(model.parameters()).device # get model device
h = model.hyp # hyperparameters
# Define criteria
BCEcls = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([h["cls_pw"]], device=device))
BCEobj = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([h["obj_pw"]], device=device))
# Class label smoothing https://arxiv.org/pdf/1902.04103.pdf eqn 3
self.cp, self.cn = smooth_BCE(eps=h.get("label_smoothing", 0.0)) # positive, negative BCE targets
# Focal loss
g = h["fl_gamma"] # focal loss gamma
if g > 0:
BCEcls, BCEobj = FocalLoss(BCEcls, g), FocalLoss(BCEobj, g)
m = de_parallel(model).model[-1] # Detect() module
self.balance = {3: [4.0, 1.0, 0.4]}.get(m.nl, [4.0, 1.0, 0.25, 0.06, 0.02]) # P3-P7
self.ssi = list(m.stride).index(16) if autobalance else 0 # stride 16 index
self.BCEcls, self.BCEobj, self.gr, self.hyp, self.autobalance = BCEcls, BCEobj, 1.0, h, autobalance
self.na = m.na # number of anchors
self.nc = m.nc # number of classes
self.nl = m.nl # number of layers
self.nm = m.nm # number of masks
self.anchors = m.anchors
self.device = device
def __call__(self, preds, targets, masks): # predictions, targets, model
"""Evaluates YOLOv5 model's loss for given predictions, targets, and masks; returns total loss components."""
p, proto = preds
bs, nm, mask_h, mask_w = proto.shape # batch size, number of masks, mask height, mask width
lcls = torch.zeros(1, device=self.device)
lbox = torch.zeros(1, device=self.device)
lobj = torch.zeros(1, device=self.device)
lseg = torch.zeros(1, device=self.device)
tcls, tbox, indices, anchors, tidxs, xywhn = self.build_targets(p, targets) # targets
# Losses
for i, pi in enumerate(p): # layer index, layer predictions
b, a, gj, gi = indices[i] # image, anchor, gridy, gridx
tobj = torch.zeros(pi.shape[:4], dtype=pi.dtype, device=self.device) # target obj
n = b.shape[0] # number of targets
if n:
pxy, pwh, _, pcls, pmask = pi[b, a, gj, gi].split((2, 2, 1, self.nc, nm), 1) # subset of predictions
# Box regression
pxy = pxy.sigmoid() * 2 - 0.5
pwh = (pwh.sigmoid() * 2) ** 2 * anchors[i]
pbox = torch.cat((pxy, pwh), 1) # predicted box
iou = bbox_iou(pbox, tbox[i], CIoU=True).squeeze() # iou(prediction, target)
lbox += (1.0 - iou).mean() # iou loss
# Objectness
iou = iou.detach().clamp(0).type(tobj.dtype)
if self.sort_obj_iou:
j = iou.argsort()
b, a, gj, gi, iou = b[j], a[j], gj[j], gi[j], iou[j]
if self.gr < 1:
iou = (1.0 - self.gr) + self.gr * iou
tobj[b, a, gj, gi] = iou # iou ratio
# Classification
if self.nc > 1: # cls loss (only if multiple classes)
t = torch.full_like(pcls, self.cn, device=self.device) # targets
t[range(n), tcls[i]] = self.cp
lcls += self.BCEcls(pcls, t) # BCE
# Mask regression
if tuple(masks.shape[-2:]) != (mask_h, mask_w): # downsample
masks = F.interpolate(masks[None], (mask_h, mask_w), mode="nearest")[0]
marea = xywhn[i][:, 2:].prod(1) # mask width, height normalized
mxyxy = xywh2xyxy(xywhn[i] * torch.tensor([mask_w, mask_h, mask_w, mask_h], device=self.device))
for bi in b.unique():
j = b == bi # matching index
if self.overlap:
mask_gti = torch.where(masks[bi][None] == tidxs[i][j].view(-1, 1, 1), 1.0, 0.0)
else:
mask_gti = masks[tidxs[i]][j]
lseg += self.single_mask_loss(mask_gti, pmask[j], proto[bi], mxyxy[j], marea[j])
obji = self.BCEobj(pi[..., 4], tobj)
lobj += obji * self.balance[i] # obj loss
if self.autobalance:
self.balance[i] = self.balance[i] * 0.9999 + 0.0001 / obji.detach().item()
if self.autobalance:
self.balance = [x / self.balance[self.ssi] for x in self.balance]
lbox *= self.hyp["box"]
lobj *= self.hyp["obj"]
lcls *= self.hyp["cls"]
lseg *= self.hyp["box"] / bs
loss = lbox + lobj + lcls + lseg
return loss * bs, torch.cat((lbox, lseg, lobj, lcls)).detach()
def single_mask_loss(self, gt_mask, pred, proto, xyxy, area):
"""Calculates and normalizes single mask loss for YOLOv5 between predicted and ground truth masks."""
pred_mask = (pred @ proto.view(self.nm, -1)).view(-1, *proto.shape[1:]) # (n,32) @ (32,80,80) -> (n,80,80)
loss = F.binary_cross_entropy_with_logits(pred_mask, gt_mask, reduction="none")
return (crop_mask(loss, xyxy).mean(dim=(1, 2)) / area).mean()
def build_targets(self, p, targets):
"""Prepares YOLOv5 targets for loss computation; inputs targets (image, class, x, y, w, h), output target
classes/boxes.
"""
na, nt = self.na, targets.shape[0] # number of anchors, targets
tcls, tbox, indices, anch, tidxs, xywhn = [], [], [], [], [], []
gain = torch.ones(8, device=self.device) # normalized to gridspace gain
ai = torch.arange(na, device=self.device).float().view(na, 1).repeat(1, nt) # same as .repeat_interleave(nt)
if self.overlap:
batch = p[0].shape[0]
ti = []
for i in range(batch):
num = (targets[:, 0] == i).sum() # find number of targets of each image
ti.append(torch.arange(num, device=self.device).float().view(1, num).repeat(na, 1) + 1) # (na, num)
ti = torch.cat(ti, 1) # (na, nt)
else:
ti = torch.arange(nt, device=self.device).float().view(1, nt).repeat(na, 1)
targets = torch.cat((targets.repeat(na, 1, 1), ai[..., None], ti[..., None]), 2) # append anchor indices
g = 0.5 # bias
off = (
torch.tensor(
[
[0, 0],
[1, 0],
[0, 1],
[-1, 0],
[0, -1], # j,k,l,m
# [1, 1], [1, -1], [-1, 1], [-1, -1], # jk,jm,lk,lm
],
device=self.device,
).float()
* g
) # offsets
for i in range(self.nl):
anchors, shape = self.anchors[i], p[i].shape
gain[2:6] = torch.tensor(shape)[[3, 2, 3, 2]] # xyxy gain
# Match targets to anchors
t = targets * gain # shape(3,n,7)
if nt:
# Matches
r = t[..., 4:6] / anchors[:, None] # wh ratio
j = torch.max(r, 1 / r).max(2)[0] < self.hyp["anchor_t"] # compare
# j = wh_iou(anchors, t[:, 4:6]) > model.hyp['iou_t'] # iou(3,n)=wh_iou(anchors(3,2), gwh(n,2))
t = t[j] # filter
# Offsets
gxy = t[:, 2:4] # grid xy
gxi = gain[[2, 3]] - gxy # inverse
j, k = ((gxy % 1 < g) & (gxy > 1)).T
l, m = ((gxi % 1 < g) & (gxi > 1)).T
j = torch.stack((torch.ones_like(j), j, k, l, m))
t = t.repeat((5, 1, 1))[j]
offsets = (torch.zeros_like(gxy)[None] + off[:, None])[j]
else:
t = targets[0]
offsets = 0
# Define
bc, gxy, gwh, at = t.chunk(4, 1) # (image, class), grid xy, grid wh, anchors
(a, tidx), (b, c) = at.long().T, bc.long().T # anchors, image, class
gij = (gxy - offsets).long()
gi, gj = gij.T # grid indices
# Append
indices.append((b, a, gj.clamp_(0, shape[2] - 1), gi.clamp_(0, shape[3] - 1))) # image, anchor, grid
tbox.append(torch.cat((gxy - gij, gwh), 1)) # box
anch.append(anchors[a]) # anchors
tcls.append(c) # class
tidxs.append(tidx)
xywhn.append(torch.cat((gxy, gwh), 1) / gain[2:6]) # xywh normalized
return tcls, tbox, indices, anch, tidxs, xywhn
# YOLOv5 🚀 by Ultralytics, AGPL-3.0 license
"""Model validation metrics."""
import numpy as np
from ..metrics import ap_per_class
def fitness(x):
"""Evaluates model fitness by a weighted sum of 8 metrics, `x`: [N,8] array, weights: [0.1, 0.9] for mAP and F1."""
w = [0.0, 0.0, 0.1, 0.9, 0.0, 0.0, 0.1, 0.9]
return (x[:, :8] * w).sum(1)
def ap_per_class_box_and_mask(
tp_m,
tp_b,
conf,
pred_cls,
target_cls,
plot=False,
save_dir=".",
names=(),
):
"""
Args:
tp_b: tp of boxes.
tp_m: tp of masks.
other arguments see `func: ap_per_class`.
"""
results_boxes = ap_per_class(
tp_b, conf, pred_cls, target_cls, plot=plot, save_dir=save_dir, names=names, prefix="Box"
)[2:]
results_masks = ap_per_class(
tp_m, conf, pred_cls, target_cls, plot=plot, save_dir=save_dir, names=names, prefix="Mask"
)[2:]
return {
"boxes": {
"p": results_boxes[0],
"r": results_boxes[1],
"ap": results_boxes[3],
"f1": results_boxes[2],
"ap_class": results_boxes[4],
},
"masks": {
"p": results_masks[0],
"r": results_masks[1],
"ap": results_masks[3],
"f1": results_masks[2],
"ap_class": results_masks[4],
},
}
class Metric:
def __init__(self) -> None:
self.p = [] # (nc, )
self.r = [] # (nc, )
self.f1 = [] # (nc, )
self.all_ap = [] # (nc, 10)
self.ap_class_index = [] # (nc, )
@property
def ap50(self):
"""
AP@0.5 of all classes.
Return:
(nc, ) or [].
"""
return self.all_ap[:, 0] if len(self.all_ap) else []
@property
def ap(self):
"""AP@0.5:0.95
Return:
(nc, ) or [].
"""
return self.all_ap.mean(1) if len(self.all_ap) else []
@property
def mp(self):
"""
Mean precision of all classes.
Return:
float.
"""
return self.p.mean() if len(self.p) else 0.0
@property
def mr(self):
"""
Mean recall of all classes.
Return:
float.
"""
return self.r.mean() if len(self.r) else 0.0
@property
def map50(self):
"""
Mean AP@0.5 of all classes.
Return:
float.
"""
return self.all_ap[:, 0].mean() if len(self.all_ap) else 0.0
@property
def map(self):
"""
Mean AP@0.5:0.95 of all classes.
Return:
float.
"""
return self.all_ap.mean() if len(self.all_ap) else 0.0
def mean_results(self):
"""Mean of results, return mp, mr, map50, map."""
return (self.mp, self.mr, self.map50, self.map)
def class_result(self, i):
"""Class-aware result, return p[i], r[i], ap50[i], ap[i]"""
return (self.p[i], self.r[i], self.ap50[i], self.ap[i])
def get_maps(self, nc):
"""Calculates and returns mean Average Precision (mAP) for each class given number of classes `nc`."""
maps = np.zeros(nc) + self.map
for i, c in enumerate(self.ap_class_index):
maps[c] = self.ap[i]
return maps
def update(self, results):
"""
Args:
results: tuple(p, r, ap, f1, ap_class)
"""
p, r, all_ap, f1, ap_class_index = results
self.p = p
self.r = r
self.all_ap = all_ap
self.f1 = f1
self.ap_class_index = ap_class_index
class Metrics:
"""Metric for boxes and masks."""
def __init__(self) -> None:
self.metric_box = Metric()
self.metric_mask = Metric()
def update(self, results):
"""
Args:
results: Dict{'boxes': Dict{}, 'masks': Dict{}}
"""
self.metric_box.update(list(results["boxes"].values()))
self.metric_mask.update(list(results["masks"].values()))
def mean_results(self):
"""Computes and returns the mean results for both box and mask metrics by summing their individual means."""
return self.metric_box.mean_results() + self.metric_mask.mean_results()
def class_result(self, i):
"""Returns the sum of box and mask metric results for a specified class index `i`."""
return self.metric_box.class_result(i) + self.metric_mask.class_result(i)
def get_maps(self, nc):
"""Calculates and returns the sum of mean average precisions (mAPs) for both box and mask metrics for `nc`
classes.
"""
return self.metric_box.get_maps(nc) + self.metric_mask.get_maps(nc)
@property
def ap_class_index(self):
"""Returns the class index for average precision, shared by both box and mask metrics."""
return self.metric_box.ap_class_index
KEYS = [
"train/box_loss",
"train/seg_loss", # train loss
"train/obj_loss",
"train/cls_loss",
"metrics/precision(B)",
"metrics/recall(B)",
"metrics/mAP_0.5(B)",
"metrics/mAP_0.5:0.95(B)", # metrics
"metrics/precision(M)",
"metrics/recall(M)",
"metrics/mAP_0.5(M)",
"metrics/mAP_0.5:0.95(M)", # metrics
"val/box_loss",
"val/seg_loss", # val loss
"val/obj_loss",
"val/cls_loss",
"x/lr0",
"x/lr1",
"x/lr2",
]
BEST_KEYS = [
"best/epoch",
"best/precision(B)",
"best/recall(B)",
"best/mAP_0.5(B)",
"best/mAP_0.5:0.95(B)",
"best/precision(M)",
"best/recall(M)",
"best/mAP_0.5(M)",
"best/mAP_0.5:0.95(M)",
]
import contextlib
import math
from pathlib import Path
import cv2
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import torch
from .. import threaded
from ..general import xywh2xyxy
from ..plots import Annotator, colors
@threaded
def plot_images_and_masks(images, targets, masks, paths=None, fname="images.jpg", names=None):
"""Plots a grid of images, their labels, and masks with optional resizing and annotations, saving to fname."""
if isinstance(images, torch.Tensor):
images = images.cpu().float().numpy()
if isinstance(targets, torch.Tensor):
targets = targets.cpu().numpy()
if isinstance(masks, torch.Tensor):
masks = masks.cpu().numpy().astype(int)
max_size = 1920 # max image size
max_subplots = 16 # max image subplots, i.e. 4x4
bs, _, h, w = images.shape # batch size, _, height, width
bs = min(bs, max_subplots) # limit plot images
ns = np.ceil(bs**0.5) # number of subplots (square)
if np.max(images[0]) <= 1:
images *= 255 # de-normalise (optional)
# Build Image
mosaic = np.full((int(ns * h), int(ns * w), 3), 255, dtype=np.uint8) # init
for i, im in enumerate(images):
if i == max_subplots: # if last batch has fewer images than we expect
break
x, y = int(w * (i // ns)), int(h * (i % ns)) # block origin
im = im.transpose(1, 2, 0)
mosaic[y : y + h, x : x + w, :] = im
# Resize (optional)
scale = max_size / ns / max(h, w)
if scale < 1:
h = math.ceil(scale * h)
w = math.ceil(scale * w)
mosaic = cv2.resize(mosaic, tuple(int(x * ns) for x in (w, h)))
# Annotate
fs = int((h + w) * ns * 0.01) # font size
annotator = Annotator(mosaic, line_width=round(fs / 10), font_size=fs, pil=True, example=names)
for i in range(i + 1):
x, y = int(w * (i // ns)), int(h * (i % ns)) # block origin
annotator.rectangle([x, y, x + w, y + h], None, (255, 255, 255), width=2) # borders
if paths:
annotator.text([x + 5, y + 5], text=Path(paths[i]).name[:40], txt_color=(220, 220, 220)) # filenames
if len(targets) > 0:
idx = targets[:, 0] == i
ti = targets[idx] # image targets
boxes = xywh2xyxy(ti[:, 2:6]).T
classes = ti[:, 1].astype("int")
labels = ti.shape[1] == 6 # labels if no conf column
conf = None if labels else ti[:, 6] # check for confidence presence (label vs pred)
if boxes.shape[1]:
if boxes.max() <= 1.01: # if normalized with tolerance 0.01
boxes[[0, 2]] *= w # scale to pixels
boxes[[1, 3]] *= h
elif scale < 1: # absolute coords need scale if image scales
boxes *= scale
boxes[[0, 2]] += x
boxes[[1, 3]] += y
for j, box in enumerate(boxes.T.tolist()):
cls = classes[j]
color = colors(cls)
cls = names[cls] if names else cls
if labels or conf[j] > 0.25: # 0.25 conf thresh
label = f"{cls}" if labels else f"{cls} {conf[j]:.1f}"
annotator.box_label(box, label, color=color)
# Plot masks
if len(masks):
if masks.max() > 1.0: # mean that masks are overlap
image_masks = masks[[i]] # (1, 640, 640)
nl = len(ti)
index = np.arange(nl).reshape(nl, 1, 1) + 1
image_masks = np.repeat(image_masks, nl, axis=0)
image_masks = np.where(image_masks == index, 1.0, 0.0)
else:
image_masks = masks[idx]
im = np.asarray(annotator.im).copy()
for j, box in enumerate(boxes.T.tolist()):
if labels or conf[j] > 0.25: # 0.25 conf thresh
color = colors(classes[j])
mh, mw = image_masks[j].shape
if mh != h or mw != w:
mask = image_masks[j].astype(np.uint8)
mask = cv2.resize(mask, (w, h))
mask = mask.astype(bool)
else:
mask = image_masks[j].astype(bool)
with contextlib.suppress(Exception):
im[y : y + h, x : x + w, :][mask] = (
im[y : y + h, x : x + w, :][mask] * 0.4 + np.array(color) * 0.6
)
annotator.fromarray(im)
annotator.im.save(fname) # save
def plot_results_with_masks(file="path/to/results.csv", dir="", best=True):
"""
Plots training results from CSV files, plotting best or last result highlights based on `best` parameter.
Example: from utils.plots import *; plot_results('path/to/results.csv')
"""
save_dir = Path(file).parent if file else Path(dir)
fig, ax = plt.subplots(2, 8, figsize=(18, 6), tight_layout=True)
ax = ax.ravel()
files = list(save_dir.glob("results*.csv"))
assert len(files), f"No results.csv files found in {save_dir.resolve()}, nothing to plot."
for f in files:
try:
data = pd.read_csv(f)
index = np.argmax(
0.9 * data.values[:, 8] + 0.1 * data.values[:, 7] + 0.9 * data.values[:, 12] + 0.1 * data.values[:, 11]
)
s = [x.strip() for x in data.columns]
x = data.values[:, 0]
for i, j in enumerate([1, 2, 3, 4, 5, 6, 9, 10, 13, 14, 15, 16, 7, 8, 11, 12]):
y = data.values[:, j]
# y[y == 0] = np.nan # don't show zero values
ax[i].plot(x, y, marker=".", label=f.stem, linewidth=2, markersize=2)
if best:
# best
ax[i].scatter(index, y[index], color="r", label=f"best:{index}", marker="*", linewidth=3)
ax[i].set_title(s[j] + f"\n{round(y[index], 5)}")
else:
# last
ax[i].scatter(x[-1], y[-1], color="r", label="last", marker="*", linewidth=3)
ax[i].set_title(s[j] + f"\n{round(y[-1], 5)}")
# if j in [8, 9, 10]: # share train and val loss y axes
# ax[i].get_shared_y_axes().join(ax[i], ax[i - 5])
except Exception as e:
print(f"Warning: Plotting error for {f}: {e}")
ax[1].legend()
fig.savefig(save_dir / "results.png", dpi=200)
plt.close()
# YOLOv5 🚀 by Ultralytics, AGPL-3.0 license
"""PyTorch utils."""
import math
import os
import platform
import subprocess
import time
import warnings
from contextlib import contextmanager
from copy import deepcopy
from pathlib import Path
import torch
import torch.distributed as dist
import torch.nn as nn
import torch.nn.functional as F
from torch.nn.parallel import DistributedDataParallel as DDP
from utils.general import LOGGER, check_version, colorstr, file_date, git_describe
LOCAL_RANK = int(os.getenv("LOCAL_RANK", -1)) # https://pytorch.org/docs/stable/elastic/run.html
RANK = int(os.getenv("RANK", -1))
WORLD_SIZE = int(os.getenv("WORLD_SIZE", 1))
try:
import thop # for FLOPs computation
except ImportError:
thop = None
# Suppress PyTorch warnings
warnings.filterwarnings("ignore", message="User provided device_type of 'cuda', but CUDA is not available. Disabling")
warnings.filterwarnings("ignore", category=UserWarning)
def smart_inference_mode(torch_1_9=check_version(torch.__version__, "1.9.0")):
"""Applies torch.inference_mode() if torch>=1.9.0, else torch.no_grad() as a decorator for functions."""
def decorate(fn):
return (torch.inference_mode if torch_1_9 else torch.no_grad)()(fn)
return decorate
def smartCrossEntropyLoss(label_smoothing=0.0):
"""Returns a CrossEntropyLoss with optional label smoothing for torch>=1.10.0; warns if smoothing on lower
versions.
"""
if check_version(torch.__version__, "1.10.0"):
return nn.CrossEntropyLoss(label_smoothing=label_smoothing)
if label_smoothing > 0:
LOGGER.warning(f"WARNING ⚠️ label smoothing {label_smoothing} requires torch>=1.10.0")
return nn.CrossEntropyLoss()
def smart_DDP(model):
"""Initializes DistributedDataParallel (DDP) for model training, respecting torch version constraints."""
assert not check_version(torch.__version__, "1.12.0", pinned=True), (
"torch==1.12.0 torchvision==0.13.0 DDP training is not supported due to a known issue. "
"Please upgrade or downgrade torch to use DDP. See https://github.com/ultralytics/yolov5/issues/8395"
)
if check_version(torch.__version__, "1.11.0"):
return DDP(model, device_ids=[LOCAL_RANK], output_device=LOCAL_RANK, static_graph=True)
else:
return DDP(model, device_ids=[LOCAL_RANK], output_device=LOCAL_RANK)
def reshape_classifier_output(model, n=1000):
"""Reshapes last layer of model to match class count 'n', supporting Classify, Linear, Sequential types."""
from models.common import Classify
name, m = list((model.model if hasattr(model, "model") else model).named_children())[-1] # last module
if isinstance(m, Classify): # YOLOv5 Classify() head
if m.linear.out_features != n:
m.linear = nn.Linear(m.linear.in_features, n)
elif isinstance(m, nn.Linear): # ResNet, EfficientNet
if m.out_features != n:
setattr(model, name, nn.Linear(m.in_features, n))
elif isinstance(m, nn.Sequential):
types = [type(x) for x in m]
if nn.Linear in types:
i = types.index(nn.Linear) # nn.Linear index
if m[i].out_features != n:
m[i] = nn.Linear(m[i].in_features, n)
elif nn.Conv2d in types:
i = types.index(nn.Conv2d) # nn.Conv2d index
if m[i].out_channels != n:
m[i] = nn.Conv2d(m[i].in_channels, n, m[i].kernel_size, m[i].stride, bias=m[i].bias is not None)
@contextmanager
def torch_distributed_zero_first(local_rank: int):
"""Context manager ensuring ordered operations in distributed training by making all processes wait for the leading
process.
"""
if local_rank not in [-1, 0]:
dist.barrier(device_ids=[local_rank])
yield
if local_rank == 0:
dist.barrier(device_ids=[0])
def device_count():
"""Returns the number of available CUDA devices; works on Linux and Windows by invoking `nvidia-smi`."""
assert platform.system() in ("Linux", "Windows"), "device_count() only supported on Linux or Windows"
try:
cmd = "nvidia-smi -L | wc -l" if platform.system() == "Linux" else 'nvidia-smi -L | find /c /v ""' # Windows
return int(subprocess.run(cmd, shell=True, capture_output=True, check=True).stdout.decode().split()[-1])
except Exception:
return 0
def select_device(device="", batch_size=0, newline=True):
"""Selects computing device (CPU, CUDA GPU, MPS) for YOLOv5 model deployment, logging device info."""
s = f"YOLOv5 🚀 {git_describe() or file_date()} Python-{platform.python_version()} torch-{torch.__version__} "
device = str(device).strip().lower().replace("cuda:", "").replace("none", "") # to string, 'cuda:0' to '0'
cpu = device == "cpu"
mps = device == "mps" # Apple Metal Performance Shaders (MPS)
if cpu or mps:
os.environ["CUDA_VISIBLE_DEVICES"] = "-1" # force torch.cuda.is_available() = False
elif device: # non-cpu device requested
os.environ["CUDA_VISIBLE_DEVICES"] = device # set environment variable - must be before assert is_available()
assert torch.cuda.is_available() and torch.cuda.device_count() >= len(
device.replace(",", "")
), f"Invalid CUDA '--device {device}' requested, use '--device cpu' or pass valid CUDA device(s)"
if not cpu and not mps and torch.cuda.is_available(): # prefer GPU if available
devices = device.split(",") if device else "0" # range(torch.cuda.device_count()) # i.e. 0,1,6,7
n = len(devices) # device count
if n > 1 and batch_size > 0: # check batch_size is divisible by device_count
assert batch_size % n == 0, f"batch-size {batch_size} not multiple of GPU count {n}"
space = " " * (len(s) + 1)
for i, d in enumerate(devices):
p = torch.cuda.get_device_properties(i)
s += f"{'' if i == 0 else space}CUDA:{d} ({p.name}, {p.total_memory / (1 << 20):.0f}MiB)\n" # bytes to MB
arg = "cuda:0"
elif mps and getattr(torch, "has_mps", False) and torch.backends.mps.is_available(): # prefer MPS if available
s += "MPS\n"
arg = "mps"
else: # revert to CPU
s += "CPU\n"
arg = "cpu"
if not newline:
s = s.rstrip()
LOGGER.info(s)
return torch.device(arg)
def time_sync():
"""Synchronizes PyTorch for accurate timing, leveraging CUDA if available, and returns the current time."""
if torch.cuda.is_available():
torch.cuda.synchronize()
return time.time()
def profile(input, ops, n=10, device=None):
"""YOLOv5 speed/memory/FLOPs profiler
Usage:
input = torch.randn(16, 3, 640, 640)
m1 = lambda x: x * torch.sigmoid(x)
m2 = nn.SiLU()
profile(input, [m1, m2], n=100) # profile over 100 iterations
"""
results = []
if not isinstance(device, torch.device):
device = select_device(device)
print(
f"{'Params':>12s}{'GFLOPs':>12s}{'GPU_mem (GB)':>14s}{'forward (ms)':>14s}{'backward (ms)':>14s}"
f"{'input':>24s}{'output':>24s}"
)
for x in input if isinstance(input, list) else [input]:
x = x.to(device)
x.requires_grad = True
for m in ops if isinstance(ops, list) else [ops]:
m = m.to(device) if hasattr(m, "to") else m # device
m = m.half() if hasattr(m, "half") and isinstance(x, torch.Tensor) and x.dtype is torch.float16 else m
tf, tb, t = 0, 0, [0, 0, 0] # dt forward, backward
try:
flops = thop.profile(m, inputs=(x,), verbose=False)[0] / 1e9 * 2 # GFLOPs
except Exception:
flops = 0
try:
for _ in range(n):
t[0] = time_sync()
y = m(x)
t[1] = time_sync()
try:
_ = (sum(yi.sum() for yi in y) if isinstance(y, list) else y).sum().backward()
t[2] = time_sync()
except Exception: # no backward method
# print(e) # for debug
t[2] = float("nan")
tf += (t[1] - t[0]) * 1000 / n # ms per op forward
tb += (t[2] - t[1]) * 1000 / n # ms per op backward
mem = torch.cuda.memory_reserved() / 1e9 if torch.cuda.is_available() else 0 # (GB)
s_in, s_out = (tuple(x.shape) if isinstance(x, torch.Tensor) else "list" for x in (x, y)) # shapes
p = sum(x.numel() for x in m.parameters()) if isinstance(m, nn.Module) else 0 # parameters
print(f"{p:12}{flops:12.4g}{mem:>14.3f}{tf:14.4g}{tb:14.4g}{str(s_in):>24s}{str(s_out):>24s}")
results.append([p, flops, mem, tf, tb, s_in, s_out])
except Exception as e:
print(e)
results.append(None)
torch.cuda.empty_cache()
return results
def is_parallel(model):
"""Checks if the model is using Data Parallelism (DP) or Distributed Data Parallelism (DDP)."""
return type(model) in (nn.parallel.DataParallel, nn.parallel.DistributedDataParallel)
def de_parallel(model):
"""Returns a single-GPU model by removing Data Parallelism (DP) or Distributed Data Parallelism (DDP) if applied."""
return model.module if is_parallel(model) else model
def initialize_weights(model):
"""Initializes weights of Conv2d, BatchNorm2d, and activations (Hardswish, LeakyReLU, ReLU, ReLU6, SiLU) in the
model.
"""
for m in model.modules():
t = type(m)
if t is nn.Conv2d:
pass # nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
elif t is nn.BatchNorm2d:
m.eps = 1e-3
m.momentum = 0.03
elif t in [nn.Hardswish, nn.LeakyReLU, nn.ReLU, nn.ReLU6, nn.SiLU]:
m.inplace = True
def find_modules(model, mclass=nn.Conv2d):
"""Finds and returns list of layer indices in `model.module_list` matching the specified `mclass`."""
return [i for i, m in enumerate(model.module_list) if isinstance(m, mclass)]
def sparsity(model):
"""Calculates and returns the global sparsity of a model as the ratio of zero-valued parameters to total
parameters.
"""
a, b = 0, 0
for p in model.parameters():
a += p.numel()
b += (p == 0).sum()
return b / a
def prune(model, amount=0.3):
"""Prunes Conv2d layers in a model to a specified sparsity using L1 unstructured pruning."""
import torch.nn.utils.prune as prune
for name, m in model.named_modules():
if isinstance(m, nn.Conv2d):
prune.l1_unstructured(m, name="weight", amount=amount) # prune
prune.remove(m, "weight") # make permanent
LOGGER.info(f"Model pruned to {sparsity(model):.3g} global sparsity")
def fuse_conv_and_bn(conv, bn):
"""
Fuses Conv2d and BatchNorm2d layers into a single Conv2d layer.
See https://tehnokv.com/posts/fusing-batchnorm-and-conv/.
"""
fusedconv = (
nn.Conv2d(
conv.in_channels,
conv.out_channels,
kernel_size=conv.kernel_size,
stride=conv.stride,
padding=conv.padding,
dilation=conv.dilation,
groups=conv.groups,
bias=True,
)
.requires_grad_(False)
.to(conv.weight.device)
)
# Prepare filters
w_conv = conv.weight.clone().view(conv.out_channels, -1)
w_bn = torch.diag(bn.weight.div(torch.sqrt(bn.eps + bn.running_var)))
fusedconv.weight.copy_(torch.mm(w_bn, w_conv).view(fusedconv.weight.shape))
# Prepare spatial bias
b_conv = torch.zeros(conv.weight.size(0), device=conv.weight.device) if conv.bias is None else conv.bias
b_bn = bn.bias - bn.weight.mul(bn.running_mean).div(torch.sqrt(bn.running_var + bn.eps))
fusedconv.bias.copy_(torch.mm(w_bn, b_conv.reshape(-1, 1)).reshape(-1) + b_bn)
return fusedconv
def model_info(model, verbose=False, imgsz=640):
"""
Prints model summary including layers, parameters, gradients, and FLOPs; imgsz may be int or list.
Example: img_size=640 or img_size=[640, 320]
"""
n_p = sum(x.numel() for x in model.parameters()) # number parameters
n_g = sum(x.numel() for x in model.parameters() if x.requires_grad) # number gradients
if verbose:
print(f"{'layer':>5} {'name':>40} {'gradient':>9} {'parameters':>12} {'shape':>20} {'mu':>10} {'sigma':>10}")
for i, (name, p) in enumerate(model.named_parameters()):
name = name.replace("module_list.", "")
print(
"%5g %40s %9s %12g %20s %10.3g %10.3g"
% (i, name, p.requires_grad, p.numel(), list(p.shape), p.mean(), p.std())
)
try: # FLOPs
p = next(model.parameters())
stride = max(int(model.stride.max()), 32) if hasattr(model, "stride") else 32 # max stride
im = torch.empty((1, p.shape[1], stride, stride), device=p.device) # input image in BCHW format
flops = thop.profile(deepcopy(model), inputs=(im,), verbose=False)[0] / 1e9 * 2 # stride GFLOPs
imgsz = imgsz if isinstance(imgsz, list) else [imgsz, imgsz] # expand if int/float
fs = f", {flops * imgsz[0] / stride * imgsz[1] / stride:.1f} GFLOPs" # 640x640 GFLOPs
except Exception:
fs = ""
name = Path(model.yaml_file).stem.replace("yolov5", "YOLOv5") if hasattr(model, "yaml_file") else "Model"
LOGGER.info(f"{name} summary: {len(list(model.modules()))} layers, {n_p} parameters, {n_g} gradients{fs}")
def scale_img(img, ratio=1.0, same_shape=False, gs=32): # img(16,3,256,416)
"""Scales an image tensor `img` of shape (bs,3,y,x) by `ratio`, optionally maintaining the original shape, padded to
multiples of `gs`.
"""
if ratio == 1.0:
return img
h, w = img.shape[2:]
s = (int(h * ratio), int(w * ratio)) # new size
img = F.interpolate(img, size=s, mode="bilinear", align_corners=False) # resize
if not same_shape: # pad/crop img
h, w = (math.ceil(x * ratio / gs) * gs for x in (h, w))
return F.pad(img, [0, w - s[1], 0, h - s[0]], value=0.447) # value = imagenet mean
def copy_attr(a, b, include=(), exclude=()):
"""Copies attributes from object b to a, optionally filtering with include and exclude lists."""
for k, v in b.__dict__.items():
if (len(include) and k not in include) or k.startswith("_") or k in exclude:
continue
else:
setattr(a, k, v)
def smart_optimizer(model, name="Adam", lr=0.001, momentum=0.9, decay=1e-5):
"""
Initializes YOLOv5 smart optimizer with 3 parameter groups for different decay configurations.
Groups are 0) weights with decay, 1) weights no decay, 2) biases no decay.
"""
g = [], [], [] # optimizer parameter groups
bn = tuple(v for k, v in nn.__dict__.items() if "Norm" in k) # normalization layers, i.e. BatchNorm2d()
for v in model.modules():
for p_name, p in v.named_parameters(recurse=0):
if p_name == "bias": # bias (no decay)
g[2].append(p)
elif p_name == "weight" and isinstance(v, bn): # weight (no decay)
g[1].append(p)
else:
g[0].append(p) # weight (with decay)
if name == "Adam":
optimizer = torch.optim.Adam(g[2], lr=lr, betas=(momentum, 0.999)) # adjust beta1 to momentum
elif name == "AdamW":
optimizer = torch.optim.AdamW(g[2], lr=lr, betas=(momentum, 0.999), weight_decay=0.0)
elif name == "RMSProp":
optimizer = torch.optim.RMSprop(g[2], lr=lr, momentum=momentum)
elif name == "SGD":
optimizer = torch.optim.SGD(g[2], lr=lr, momentum=momentum, nesterov=True)
else:
raise NotImplementedError(f"Optimizer {name} not implemented.")
optimizer.add_param_group({"params": g[0], "weight_decay": decay}) # add g0 with weight_decay
optimizer.add_param_group({"params": g[1], "weight_decay": 0.0}) # add g1 (BatchNorm2d weights)
LOGGER.info(
f"{colorstr('optimizer:')} {type(optimizer).__name__}(lr={lr}) with parameter groups "
f'{len(g[1])} weight(decay=0.0), {len(g[0])} weight(decay={decay}), {len(g[2])} bias'
)
return optimizer
def smart_hub_load(repo="ultralytics/yolov5", model="yolov5s", **kwargs):
"""YOLOv5 torch.hub.load() wrapper with smart error handling, adjusting torch arguments for compatibility."""
if check_version(torch.__version__, "1.9.1"):
kwargs["skip_validation"] = True # validation causes GitHub API rate limit errors
if check_version(torch.__version__, "1.12.0"):
kwargs["trust_repo"] = True # argument required starting in torch 0.12
try:
return torch.hub.load(repo, model, **kwargs)
except Exception:
return torch.hub.load(repo, model, force_reload=True, **kwargs)
def smart_resume(ckpt, optimizer, ema=None, weights="yolov5s.pt", epochs=300, resume=True):
"""Resumes training from a checkpoint, updating optimizer, ema, and epochs, with optional resume verification."""
best_fitness = 0.0
start_epoch = ckpt["epoch"] + 1
if ckpt["optimizer"] is not None:
optimizer.load_state_dict(ckpt["optimizer"]) # optimizer
best_fitness = ckpt["best_fitness"]
if ema and ckpt.get("ema"):
ema.ema.load_state_dict(ckpt["ema"].float().state_dict()) # EMA
ema.updates = ckpt["updates"]
if resume:
assert start_epoch > 0, (
f"{weights} training to {epochs} epochs is finished, nothing to resume.\n"
f"Start a new training without --resume, i.e. 'python train.py --weights {weights}'"
)
LOGGER.info(f"Resuming training from {weights} from epoch {start_epoch} to {epochs} total epochs")
if epochs < start_epoch:
LOGGER.info(f"{weights} has been trained for {ckpt['epoch']} epochs. Fine-tuning for {epochs} more epochs.")
epochs += ckpt["epoch"] # finetune additional epochs
return best_fitness, start_epoch, epochs
class EarlyStopping:
# YOLOv5 simple early stopper
def __init__(self, patience=30):
"""Initializes simple early stopping mechanism for YOLOv5, with adjustable patience for non-improving epochs."""
self.best_fitness = 0.0 # i.e. mAP
self.best_epoch = 0
self.patience = patience or float("inf") # epochs to wait after fitness stops improving to stop
self.possible_stop = False # possible stop may occur next epoch
def __call__(self, epoch, fitness):
"""Evaluates if training should stop based on fitness improvement and patience, returning a boolean."""
if fitness >= self.best_fitness: # >= 0 to allow for early zero-fitness stage of training
self.best_epoch = epoch
self.best_fitness = fitness
delta = epoch - self.best_epoch # epochs without improvement
self.possible_stop = delta >= (self.patience - 1) # possible stop may occur next epoch
stop = delta >= self.patience # stop training if patience exceeded
if stop:
LOGGER.info(
f"Stopping training early as no improvement observed in last {self.patience} epochs. "
f"Best results observed at epoch {self.best_epoch}, best model saved as best.pt.\n"
f"To update EarlyStopping(patience={self.patience}) pass a new patience value, "
f"i.e. `python train.py --patience 300` or use `--patience 0` to disable EarlyStopping."
)
return stop
class ModelEMA:
"""Updated Exponential Moving Average (EMA) from https://github.com/rwightman/pytorch-image-models
Keeps a moving average of everything in the model state_dict (parameters and buffers)
For EMA details see https://www.tensorflow.org/api_docs/python/tf/train/ExponentialMovingAverage
"""
def __init__(self, model, decay=0.9999, tau=2000, updates=0):
"""Initializes EMA with model parameters, decay rate, tau for decay adjustment, and update count; sets model to
evaluation mode.
"""
self.ema = deepcopy(de_parallel(model)).eval() # FP32 EMA
self.updates = updates # number of EMA updates
self.decay = lambda x: decay * (1 - math.exp(-x / tau)) # decay exponential ramp (to help early epochs)
for p in self.ema.parameters():
p.requires_grad_(False)
def update(self, model):
"""Updates the Exponential Moving Average (EMA) parameters based on the current model's parameters."""
self.updates += 1
d = self.decay(self.updates)
msd = de_parallel(model).state_dict() # model state_dict
for k, v in self.ema.state_dict().items():
if v.dtype.is_floating_point: # true for FP16 and FP32
v *= d
v += (1 - d) * msd[k].detach()
# assert v.dtype == msd[k].dtype == torch.float32, f'{k}: EMA {v.dtype} and model {msd[k].dtype} must be FP32'
def update_attr(self, model, include=(), exclude=("process_group", "reducer")):
"""Updates EMA attributes by copying specified attributes from model to EMA, excluding certain attributes by
default.
"""
copy_attr(self.ema, model, include, exclude)
# YOLOv5 🚀 by Ultralytics, AGPL-3.0 license
"""Utils to interact with the Triton Inference Server."""
import typing
from urllib.parse import urlparse
import torch
class TritonRemoteModel:
"""
A wrapper over a model served by the Triton Inference Server.
It can be configured to communicate over GRPC or HTTP. It accepts Torch Tensors as input and returns them as
outputs.
"""
def __init__(self, url: str):
"""
Keyword arguments:
url: Fully qualified address of the Triton server - for e.g. grpc://localhost:8000
"""
parsed_url = urlparse(url)
if parsed_url.scheme == "grpc":
from tritonclient.grpc import InferenceServerClient, InferInput
self.client = InferenceServerClient(parsed_url.netloc) # Triton GRPC client
model_repository = self.client.get_model_repository_index()
self.model_name = model_repository.models[0].name
self.metadata = self.client.get_model_metadata(self.model_name, as_json=True)
def create_input_placeholders() -> typing.List[InferInput]:
return [
InferInput(i["name"], [int(s) for s in i["shape"]], i["datatype"]) for i in self.metadata["inputs"]
]
else:
from tritonclient.http import InferenceServerClient, InferInput
self.client = InferenceServerClient(parsed_url.netloc) # Triton HTTP client
model_repository = self.client.get_model_repository_index()
self.model_name = model_repository[0]["name"]
self.metadata = self.client.get_model_metadata(self.model_name)
def create_input_placeholders() -> typing.List[InferInput]:
return [
InferInput(i["name"], [int(s) for s in i["shape"]], i["datatype"]) for i in self.metadata["inputs"]
]
self._create_input_placeholders_fn = create_input_placeholders
@property
def runtime(self):
"""Returns the model runtime."""
return self.metadata.get("backend", self.metadata.get("platform"))
def __call__(self, *args, **kwargs) -> typing.Union[torch.Tensor, typing.Tuple[torch.Tensor, ...]]:
"""
Invokes the model.
Parameters can be provided via args or kwargs. args, if provided, are assumed to match the order of inputs of
the model. kwargs are matched with the model input names.
"""
inputs = self._create_inputs(*args, **kwargs)
response = self.client.infer(model_name=self.model_name, inputs=inputs)
result = []
for output in self.metadata["outputs"]:
tensor = torch.as_tensor(response.as_numpy(output["name"]))
result.append(tensor)
return result[0] if len(result) == 1 else result
def _create_inputs(self, *args, **kwargs):
"""Creates input tensors from args or kwargs, not both; raises error if none or both are provided."""
args_len, kwargs_len = len(args), len(kwargs)
if not args_len and not kwargs_len:
raise RuntimeError("No inputs provided.")
if args_len and kwargs_len:
raise RuntimeError("Cannot specify args and kwargs at the same time")
placeholders = self._create_input_placeholders_fn()
if args_len:
if args_len != len(placeholders):
raise RuntimeError(f"Expected {len(placeholders)} inputs, got {args_len}.")
for input, value in zip(placeholders, args):
input.set_data_from_numpy(value.cpu().numpy())
else:
for input in placeholders:
value = kwargs[input.name]
input.set_data_from_numpy(value.cpu().numpy())
return placeholders
# YOLOv5 🚀 by Ultralytics, AGPL-3.0 license
"""
Validate a trained YOLOv5 detection model on a detection dataset.
Usage:
$ python val.py --weights yolov5s.pt --data coco128.yaml --img 640
Usage - formats:
$ python val.py --weights yolov5s.pt # PyTorch
yolov5s.torchscript # TorchScript
yolov5s.onnx # ONNX Runtime or OpenCV DNN with --dnn
yolov5s_openvino_model # OpenVINO
yolov5s.engine # TensorRT
yolov5s.mlmodel # CoreML (macOS-only)
yolov5s_saved_model # TensorFlow SavedModel
yolov5s.pb # TensorFlow GraphDef
yolov5s.tflite # TensorFlow Lite
yolov5s_edgetpu.tflite # TensorFlow Edge TPU
yolov5s_paddle_model # PaddlePaddle
"""
import argparse
import json
import os
import subprocess
import sys
from pathlib import Path
import numpy as np
import torch
from tqdm import tqdm
FILE = Path(__file__).resolve()
ROOT = FILE.parents[0] # YOLOv5 root directory
if str(ROOT) not in sys.path:
sys.path.append(str(ROOT)) # add ROOT to PATH
ROOT = Path(os.path.relpath(ROOT, Path.cwd())) # relative
from models.common import DetectMultiBackend
from utils.callbacks import Callbacks
from utils.dataloaders import create_dataloader
from utils.general import (
LOGGER,
TQDM_BAR_FORMAT,
Profile,
check_dataset,
check_img_size,
check_requirements,
check_yaml,
coco80_to_coco91_class,
colorstr,
increment_path,
non_max_suppression,
print_args,
scale_boxes,
xywh2xyxy,
xyxy2xywh,
)
from utils.metrics import ConfusionMatrix, ap_per_class, box_iou
from utils.plots import output_to_target, plot_images, plot_val_study
from utils.torch_utils import select_device, smart_inference_mode
def save_one_txt(predn, save_conf, shape, file):
"""Saves one detection result to a txt file in normalized xywh format, optionally including confidence."""
gn = torch.tensor(shape)[[1, 0, 1, 0]] # normalization gain whwh
for *xyxy, conf, cls in predn.tolist():
xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist() # normalized xywh
line = (cls, *xywh, conf) if save_conf else (cls, *xywh) # label format
with open(file, "a") as f:
f.write(("%g " * len(line)).rstrip() % line + "\n")
def save_one_json(predn, jdict, path, class_map):
"""
Saves one JSON detection result with image ID, category ID, bounding box, and score.
Example: {"image_id": 42, "category_id": 18, "bbox": [258.15, 41.29, 348.26, 243.78], "score": 0.236}
"""
image_id = int(path.stem) if path.stem.isnumeric() else path.stem
box = xyxy2xywh(predn[:, :4]) # xywh
box[:, :2] -= box[:, 2:] / 2 # xy center to top-left corner
for p, b in zip(predn.tolist(), box.tolist()):
jdict.append(
{
"image_id": image_id,
"category_id": class_map[int(p[5])],
"bbox": [round(x, 3) for x in b],
"score": round(p[4], 5),
}
)
def process_batch(detections, labels, iouv):
"""
Return correct prediction matrix.
Arguments:
detections (array[N, 6]), x1, y1, x2, y2, conf, class
labels (array[M, 5]), class, x1, y1, x2, y2
Returns:
correct (array[N, 10]), for 10 IoU levels
"""
correct = np.zeros((detections.shape[0], iouv.shape[0])).astype(bool)
iou = box_iou(labels[:, 1:], detections[:, :4])
correct_class = labels[:, 0:1] == detections[:, 5]
for i in range(len(iouv)):
x = torch.where((iou >= iouv[i]) & correct_class) # IoU > threshold and classes match
if x[0].shape[0]:
matches = torch.cat((torch.stack(x, 1), iou[x[0], x[1]][:, None]), 1).cpu().numpy() # [label, detect, iou]
if x[0].shape[0] > 1:
matches = matches[matches[:, 2].argsort()[::-1]]
matches = matches[np.unique(matches[:, 1], return_index=True)[1]]
# matches = matches[matches[:, 2].argsort()[::-1]]
matches = matches[np.unique(matches[:, 0], return_index=True)[1]]
correct[matches[:, 1].astype(int), i] = True
return torch.tensor(correct, dtype=torch.bool, device=iouv.device)
@smart_inference_mode()
def run(
data,
weights=None, # model.pt path(s)
batch_size=32, # batch size
imgsz=640, # inference size (pixels)
conf_thres=0.001, # confidence threshold
iou_thres=0.6, # NMS IoU threshold
max_det=300, # maximum detections per image
task="val", # train, val, test, speed or study
device="", # cuda device, i.e. 0 or 0,1,2,3 or cpu
workers=8, # max dataloader workers (per RANK in DDP mode)
single_cls=False, # treat as single-class dataset
augment=False, # augmented inference
verbose=False, # verbose output
save_txt=False, # save results to *.txt
save_hybrid=False, # save label+prediction hybrid results to *.txt
save_conf=False, # save confidences in --save-txt labels
save_json=False, # save a COCO-JSON results file
project=ROOT / "runs/val", # save to project/name
name="exp", # save to project/name
exist_ok=False, # existing project/name ok, do not increment
half=True, # use FP16 half-precision inference
dnn=False, # use OpenCV DNN for ONNX inference
model=None,
dataloader=None,
save_dir=Path(""),
plots=True,
callbacks=Callbacks(),
compute_loss=None,
):
# Initialize/load model and set device
training = model is not None
if training: # called by train.py
device, pt, jit, engine = next(model.parameters()).device, True, False, False # get model device, PyTorch model
half &= device.type != "cpu" # half precision only supported on CUDA
model.half() if half else model.float()
else: # called directly
device = select_device(device, batch_size=batch_size)
# Directories
save_dir = increment_path(Path(project) / name, exist_ok=exist_ok) # increment run
(save_dir / "labels" if save_txt else save_dir).mkdir(parents=True, exist_ok=True) # make dir
# Load model
model = DetectMultiBackend(weights, device=device, dnn=dnn, data=data, fp16=half)
stride, pt, jit, engine = model.stride, model.pt, model.jit, model.engine
imgsz = check_img_size(imgsz, s=stride) # check image size
half = model.fp16 # FP16 supported on limited backends with CUDA
if engine:
batch_size = model.batch_size
else:
device = model.device
if not (pt or jit):
batch_size = 1 # export.py models default to batch-size 1
LOGGER.info(f"Forcing --batch-size 1 square inference (1,3,{imgsz},{imgsz}) for non-PyTorch models")
# Data
data = check_dataset(data) # check
# Configure
model.eval()
cuda = device.type != "cpu"
is_coco = isinstance(data.get("val"), str) and data["val"].endswith(f"coco{os.sep}val2017.txt") # COCO dataset
nc = 1 if single_cls else int(data["nc"]) # number of classes
iouv = torch.linspace(0.5, 0.95, 10, device=device) # iou vector for mAP@0.5:0.95
niou = iouv.numel()
# Dataloader
if not training:
if pt and not single_cls: # check --weights are trained on --data
ncm = model.model.nc
assert ncm == nc, (
f"{weights} ({ncm} classes) trained on different --data than what you passed ({nc} "
f"classes). Pass correct combination of --weights and --data that are trained together."
)
model.warmup(imgsz=(1 if pt else batch_size, 3, imgsz, imgsz)) # warmup
pad, rect = (0.0, False) if task == "speed" else (0.5, pt) # square inference for benchmarks
task = task if task in ("train", "val", "test") else "val" # path to train/val/test images
dataloader = create_dataloader(
data[task],
imgsz,
batch_size,
stride,
single_cls,
pad=pad,
rect=rect,
workers=workers,
prefix=colorstr(f"{task}: "),
)[0]
seen = 0
confusion_matrix = ConfusionMatrix(nc=nc)
names = model.names if hasattr(model, "names") else model.module.names # get class names
if isinstance(names, (list, tuple)): # old format
names = dict(enumerate(names))
class_map = coco80_to_coco91_class() if is_coco else list(range(1000))
s = ("%22s" + "%11s" * 6) % ("Class", "Images", "Instances", "P", "R", "mAP50", "mAP50-95")
tp, fp, p, r, f1, mp, mr, map50, ap50, map = 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0
dt = Profile(device=device), Profile(device=device), Profile(device=device) # profiling times
loss = torch.zeros(3, device=device)
jdict, stats, ap, ap_class = [], [], [], []
callbacks.run("on_val_start")
pbar = tqdm(dataloader, desc=s, bar_format=TQDM_BAR_FORMAT) # progress bar
for batch_i, (im, targets, paths, shapes) in enumerate(pbar):
callbacks.run("on_val_batch_start")
with dt[0]:
if cuda:
im = im.to(device, non_blocking=True)
targets = targets.to(device)
im = im.half() if half else im.float() # uint8 to fp16/32
im /= 255 # 0 - 255 to 0.0 - 1.0
nb, _, height, width = im.shape # batch size, channels, height, width
# Inference
with dt[1]:
preds, train_out = model(im) if compute_loss else (model(im, augment=augment), None)
# Loss
if compute_loss:
loss += compute_loss(train_out, targets)[1] # box, obj, cls
# NMS
targets[:, 2:] *= torch.tensor((width, height, width, height), device=device) # to pixels
lb = [targets[targets[:, 0] == i, 1:] for i in range(nb)] if save_hybrid else [] # for autolabelling
with dt[2]:
preds = non_max_suppression(
preds, conf_thres, iou_thres, labels=lb, multi_label=True, agnostic=single_cls, max_det=max_det
)
# Metrics
for si, pred in enumerate(preds):
labels = targets[targets[:, 0] == si, 1:]
nl, npr = labels.shape[0], pred.shape[0] # number of labels, predictions
path, shape = Path(paths[si]), shapes[si][0]
correct = torch.zeros(npr, niou, dtype=torch.bool, device=device) # init
seen += 1
if npr == 0:
if nl:
stats.append((correct, *torch.zeros((2, 0), device=device), labels[:, 0]))
if plots:
confusion_matrix.process_batch(detections=None, labels=labels[:, 0])
continue
# Predictions
if single_cls:
pred[:, 5] = 0
predn = pred.clone()
scale_boxes(im[si].shape[1:], predn[:, :4], shape, shapes[si][1]) # native-space pred
# Evaluate
if nl:
tbox = xywh2xyxy(labels[:, 1:5]) # target boxes
scale_boxes(im[si].shape[1:], tbox, shape, shapes[si][1]) # native-space labels
labelsn = torch.cat((labels[:, 0:1], tbox), 1) # native-space labels
correct = process_batch(predn, labelsn, iouv)
if plots:
confusion_matrix.process_batch(predn, labelsn)
stats.append((correct, pred[:, 4], pred[:, 5], labels[:, 0])) # (correct, conf, pcls, tcls)
# Save/log
if save_txt:
(save_dir / "labels").mkdir(parents=True, exist_ok=True)
save_one_txt(predn, save_conf, shape, file=save_dir / "labels" / f"{path.stem}.txt")
if save_json:
save_one_json(predn, jdict, path, class_map) # append to COCO-JSON dictionary
callbacks.run("on_val_image_end", pred, predn, path, names, im[si])
# Plot images
if plots and batch_i < 3:
plot_images(im, targets, paths, save_dir / f"val_batch{batch_i}_labels.jpg", names) # labels
plot_images(im, output_to_target(preds), paths, save_dir / f"val_batch{batch_i}_pred.jpg", names) # pred
callbacks.run("on_val_batch_end", batch_i, im, targets, paths, shapes, preds)
# Compute metrics
stats = [torch.cat(x, 0).cpu().numpy() for x in zip(*stats)] # to numpy
if len(stats) and stats[0].any():
tp, fp, p, r, f1, ap, ap_class = ap_per_class(*stats, plot=plots, save_dir=save_dir, names=names)
ap50, ap = ap[:, 0], ap.mean(1) # AP@0.5, AP@0.5:0.95
mp, mr, map50, map = p.mean(), r.mean(), ap50.mean(), ap.mean()
nt = np.bincount(stats[3].astype(int), minlength=nc) # number of targets per class
# Print results
pf = "%22s" + "%11i" * 2 + "%11.3g" * 4 # print format
LOGGER.info(pf % ("all", seen, nt.sum(), mp, mr, map50, map))
if nt.sum() == 0:
LOGGER.warning(f"WARNING ⚠️ no labels found in {task} set, can not compute metrics without labels")
# Print results per class
if (verbose or (nc < 50 and not training)) and nc > 1 and len(stats):
for i, c in enumerate(ap_class):
LOGGER.info(pf % (names[c], seen, nt[c], p[i], r[i], ap50[i], ap[i]))
# Print speeds
t = tuple(x.t / seen * 1e3 for x in dt) # speeds per image
if not training:
shape = (batch_size, 3, imgsz, imgsz)
LOGGER.info(f"Speed: %.1fms pre-process, %.1fms inference, %.1fms NMS per image at shape {shape}" % t)
# Plots
if plots:
confusion_matrix.plot(save_dir=save_dir, names=list(names.values()))
callbacks.run("on_val_end", nt, tp, fp, p, r, f1, ap, ap50, ap_class, confusion_matrix)
# Save JSON
if save_json and len(jdict):
w = Path(weights[0] if isinstance(weights, list) else weights).stem if weights is not None else "" # weights
anno_json = str(Path("../datasets/coco/annotations/instances_val2017.json")) # annotations
if not os.path.exists(anno_json):
anno_json = os.path.join(data["path"], "annotations", "instances_val2017.json")
pred_json = str(save_dir / f"{w}_predictions.json") # predictions
LOGGER.info(f"\nEvaluating pycocotools mAP... saving {pred_json}...")
with open(pred_json, "w") as f:
json.dump(jdict, f)
try: # https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocoEvalDemo.ipynb
check_requirements("pycocotools>=2.0.6")
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval
anno = COCO(anno_json) # init annotations api
pred = anno.loadRes(pred_json) # init predictions api
eval = COCOeval(anno, pred, "bbox")
if is_coco:
eval.params.imgIds = [int(Path(x).stem) for x in dataloader.dataset.im_files] # image IDs to evaluate
eval.evaluate()
eval.accumulate()
eval.summarize()
map, map50 = eval.stats[:2] # update results (mAP@0.5:0.95, mAP@0.5)
except Exception as e:
LOGGER.info(f"pycocotools unable to run: {e}")
# Return results
model.float() # for training
if not training:
s = f"\n{len(list(save_dir.glob('labels/*.txt')))} labels saved to {save_dir / 'labels'}" if save_txt else ""
LOGGER.info(f"Results saved to {colorstr('bold', save_dir)}{s}")
maps = np.zeros(nc) + map
for i, c in enumerate(ap_class):
maps[c] = ap[i]
return (mp, mr, map50, map, *(loss.cpu() / len(dataloader)).tolist()), maps, t
def parse_opt():
"""Parses command-line options for YOLOv5 model inference configuration."""
parser = argparse.ArgumentParser()
parser.add_argument("--data", type=str, default=ROOT / "data/coco128.yaml", help="dataset.yaml path")
parser.add_argument("--weights", nargs="+", type=str, default=ROOT / "yolov5s.pt", help="model path(s)")
parser.add_argument("--batch-size", type=int, default=32, help="batch size")
parser.add_argument("--imgsz", "--img", "--img-size", type=int, default=640, help="inference size (pixels)")
parser.add_argument("--conf-thres", type=float, default=0.001, help="confidence threshold")
parser.add_argument("--iou-thres", type=float, default=0.6, help="NMS IoU threshold")
parser.add_argument("--max-det", type=int, default=300, help="maximum detections per image")
parser.add_argument("--task", default="val", help="train, val, test, speed or study")
parser.add_argument("--device", default="", help="cuda device, i.e. 0 or 0,1,2,3 or cpu")
parser.add_argument("--workers", type=int, default=8, help="max dataloader workers (per RANK in DDP mode)")
parser.add_argument("--single-cls", action="store_true", help="treat as single-class dataset")
parser.add_argument("--augment", action="store_true", help="augmented inference")
parser.add_argument("--verbose", action="store_true", help="report mAP by class")
parser.add_argument("--save-txt", action="store_true", help="save results to *.txt")
parser.add_argument("--save-hybrid", action="store_true", help="save label+prediction hybrid results to *.txt")
parser.add_argument("--save-conf", action="store_true", help="save confidences in --save-txt labels")
parser.add_argument("--save-json", action="store_true", help="save a COCO-JSON results file")
parser.add_argument("--project", default=ROOT / "runs/val", help="save to project/name")
parser.add_argument("--name", default="exp", help="save to project/name")
parser.add_argument("--exist-ok", action="store_true", help="existing project/name ok, do not increment")
parser.add_argument("--half", action="store_true", help="use FP16 half-precision inference")
parser.add_argument("--dnn", action="store_true", help="use OpenCV DNN for ONNX inference")
opt = parser.parse_args()
opt.data = check_yaml(opt.data) # check YAML
opt.save_json |= opt.data.endswith("coco.yaml")
opt.save_txt |= opt.save_hybrid
print_args(vars(opt))
return opt
def main(opt):
"""Executes YOLOv5 tasks like training, validation, testing, speed, and study benchmarks based on provided
options.
"""
check_requirements(ROOT / "requirements.txt", exclude=("tensorboard", "thop"))
if opt.task in ("train", "val", "test"): # run normally
if opt.conf_thres > 0.001: # https://github.com/ultralytics/yolov5/issues/1466
LOGGER.info(f"WARNING ⚠️ confidence threshold {opt.conf_thres} > 0.001 produces invalid results")
if opt.save_hybrid:
LOGGER.info("WARNING ⚠️ --save-hybrid will return high mAP from hybrid labels, not from predictions alone")
run(**vars(opt))
else:
weights = opt.weights if isinstance(opt.weights, list) else [opt.weights]
opt.half = torch.cuda.is_available() and opt.device != "cpu" # FP16 for fastest results
if opt.task == "speed": # speed benchmarks
# python val.py --task speed --data coco.yaml --batch 1 --weights yolov5n.pt yolov5s.pt...
opt.conf_thres, opt.iou_thres, opt.save_json = 0.25, 0.45, False
for opt.weights in weights:
run(**vars(opt), plots=False)
elif opt.task == "study": # speed vs mAP benchmarks
# python val.py --task study --data coco.yaml --iou 0.7 --weights yolov5n.pt yolov5s.pt...
for opt.weights in weights:
f = f"study_{Path(opt.data).stem}_{Path(opt.weights).stem}.txt" # filename to save to
x, y = list(range(256, 1536 + 128, 128)), [] # x axis (image sizes), y axis
for opt.imgsz in x: # img-size
LOGGER.info(f"\nRunning {f} --imgsz {opt.imgsz}...")
r, _, t = run(**vars(opt), plots=False)
y.append(r + t) # results and times
np.savetxt(f, y, fmt="%10.4g") # save
subprocess.run(["zip", "-r", "study.zip", "study_*.txt"])
plot_val_study(x=x) # plot
else:
raise NotImplementedError(f'--task {opt.task} not in ("train", "val", "test", "speed", "study")')
if __name__ == "__main__":
opt = parse_opt()
main(opt)
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment