Commit af155c51 authored by chenzk's avatar chenzk
Browse files

v1.0

parents
Pipeline #2732 failed with stages
in 0 seconds
# Ultralytics YOLO 🚀, AGPL-3.0 license
# Builds ultralytics/ultralytics:latest-cpu image on DockerHub https://hub.docker.com/r/ultralytics/ultralytics
# Image is CPU-optimized for ONNX, OpenVINO and PyTorch YOLO11 deployments
# Use official Python base image for reproducibility (3.11.10 for export and 3.12.6 for inference)
FROM python:3.11.10-slim-bookworm
# Set environment variables
ENV PYTHONUNBUFFERED=1 \
PYTHONDONTWRITEBYTECODE=1 \
PIP_NO_CACHE_DIR=1 \
PIP_BREAK_SYSTEM_PACKAGES=1
# Downloads to user config dir
ADD https://github.com/ultralytics/assets/releases/download/v0.0.0/Arial.ttf \
https://github.com/ultralytics/assets/releases/download/v0.0.0/Arial.Unicode.ttf \
/root/.config/Ultralytics/
# Install linux packages
# g++ required to build 'tflite_support' and 'lap' packages, libusb-1.0-0 required for 'tflite_support' package
RUN apt-get update && \
apt-get install -y --no-install-recommends \
python3-pip git zip unzip wget curl htop libgl1 libglib2.0-0 libpython3-dev gnupg g++ libusb-1.0-0 \
&& rm -rf /var/lib/apt/lists/*
# Create working directory
WORKDIR /ultralytics
# Copy contents and configure git
COPY . .
RUN sed -i '/^\[http "https:\/\/github\.com\/"\]/,+1d' .git/config
ADD https://github.com/ultralytics/assets/releases/download/v8.3.0/yolo11n.pt .
# Install pip packages
RUN python3 -m pip install --upgrade pip wheel
RUN pip install -e ".[export]" --extra-index-url https://download.pytorch.org/whl/cpu
# Run exports to AutoInstall packages
RUN yolo export model=tmp/yolo11n.pt format=edgetpu imgsz=32
RUN yolo export model=tmp/yolo11n.pt format=ncnn imgsz=32
# Requires Python<=3.10, bug with paddlepaddle==2.5.0 https://github.com/PaddlePaddle/X2Paddle/issues/991
RUN pip install "paddlepaddle>=2.6.0" x2paddle
# Remove extra build files
RUN rm -rf tmp /root/.config/Ultralytics/persistent_cache.json
# Usage Examples -------------------------------------------------------------------------------------------------------
# Build and Push
# t=ultralytics/ultralytics:latest-python && sudo docker build -f docker/Dockerfile-python -t $t . && sudo docker push $t
# Run
# t=ultralytics/ultralytics:latest-python && sudo docker run -it --ipc=host $t
# Pull and Run
# t=ultralytics/ultralytics:latest-python && sudo docker pull $t && sudo docker run -it --ipc=host $t
# Pull and Run with local volume mounted
# t=ultralytics/ultralytics:latest-python && sudo docker pull $t && sudo docker run -it --ipc=host -v "$(pwd)"/shared/datasets:/datasets $t
# Ultralytics YOLO 🚀, AGPL-3.0 license
# Builds GitHub actions CI runner image for deployment to DockerHub https://hub.docker.com/r/ultralytics/ultralytics
# Image is CUDA-optimized for YOLO11 single/multi-GPU training and inference tests
# Start FROM Ultralytics GPU image
FROM ultralytics/ultralytics:latest
# Set environment variables
ENV PYTHONUNBUFFERED=1 \
PYTHONDONTWRITEBYTECODE=1 \
PIP_NO_CACHE_DIR=1 \
PIP_BREAK_SYSTEM_PACKAGES=1 \
RUNNER_ALLOW_RUNASROOT=1 \
DEBIAN_FRONTEND=noninteractive
# Set the working directory
WORKDIR /actions-runner
# Download and unpack the latest runner from https://github.com/actions/runner
RUN FILENAME=actions-runner-linux-x64-2.320.0.tar.gz && \
curl -o $FILENAME -L https://github.com/actions/runner/releases/download/v2.320.0/$FILENAME && \
tar xzf $FILENAME && \
rm $FILENAME
# Install runner dependencies
RUN pip install pytest-cov
RUN ./bin/installdependencies.sh && \
apt-get -y install libicu-dev
# Inline ENTRYPOINT command to configure and start runner with default TOKEN and NAME
ENTRYPOINT sh -c './config.sh --url https://github.com/ultralytics/ultralytics \
--token ${GITHUB_RUNNER_TOKEN:-TOKEN} \
--name ${GITHUB_RUNNER_NAME:-NAME} \
--labels gpu-latest \
--replace && \
./run.sh'
# Usage Examples -------------------------------------------------------------------------------------------------------
# Build and Push
# t=ultralytics/ultralytics:latest-runner && sudo docker build -f docker/Dockerfile-runner -t $t . && sudo docker push $t
# Pull and Run in detached mode with access to GPUs 0 and 1
# t=ultralytics/ultralytics:latest-runner && sudo docker run -d -e GITHUB_RUNNER_TOKEN=TOKEN -e GITHUB_RUNNER_NAME=NAME --ipc=host --gpus '"device=0,1"' $t
docker run -it --shm-size=64G -v $PWD/yoloe:/home/yoloe -v /public/DL_DATA/AI:/home/AI -v /opt/hyhal:/opt/hyhal:ro --privileged=true --device=/dev/kfd --device=/dev/dri/ --group-add video --name ye 6063b673703a bash
# python -m torch.utils.collect_env
<br>
<a href="https://www.ultralytics.com/" target="_blank"><img src="https://raw.githubusercontent.com/ultralytics/assets/main/logo/Ultralytics_Logotype_Original.svg" width="320" alt="Ultralytics logo"></a>
# 📚 Ultralytics Docs
[Ultralytics](https://www.ultralytics.com/) Docs are the gateway to understanding and utilizing our cutting-edge machine learning tools. These documents are deployed to [https://docs.ultralytics.com](https://docs.ultralytics.com/) for your convenience.
[![pages-build-deployment](https://github.com/ultralytics/docs/actions/workflows/pages/pages-build-deployment/badge.svg)](https://github.com/ultralytics/docs/actions/workflows/pages/pages-build-deployment)
[![Check Broken links](https://github.com/ultralytics/docs/actions/workflows/links.yml/badge.svg)](https://github.com/ultralytics/docs/actions/workflows/links.yml)
[![Check Domains](https://github.com/ultralytics/docs/actions/workflows/check_domains.yml/badge.svg)](https://github.com/ultralytics/docs/actions/workflows/check_domains.yml)
[![Ultralytics Actions](https://github.com/ultralytics/docs/actions/workflows/format.yml/badge.svg)](https://github.com/ultralytics/docs/actions/workflows/format.yml)
<a href="https://discord.com/invite/ultralytics"><img alt="Discord" src="https://img.shields.io/discord/1089800235347353640?logo=discord&logoColor=white&label=Discord&color=blue"></a> <a href="https://community.ultralytics.com/"><img alt="Ultralytics Forums" src="https://img.shields.io/discourse/users?server=https%3A%2F%2Fcommunity.ultralytics.com&logo=discourse&label=Forums&color=blue"></a> <a href="https://reddit.com/r/ultralytics"><img alt="Ultralytics Reddit" src="https://img.shields.io/reddit/subreddit-subscribers/ultralytics?style=flat&logo=reddit&logoColor=white&label=Reddit&color=blue"></a>
## 🛠️ Installation
[![PyPI - Version](https://img.shields.io/pypi/v/ultralytics?logo=pypi&logoColor=white)](https://pypi.org/project/ultralytics/)
[![Downloads](https://static.pepy.tech/badge/ultralytics)](https://www.pepy.tech/projects/ultralytics)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/ultralytics?logo=python&logoColor=gold)](https://pypi.org/project/ultralytics/)
To install the ultralytics package in developer mode, ensure you have Git and Python 3 installed on your system. Then, follow these steps:
1. Clone the ultralytics repository to your local machine using Git:
```bash
git clone https://github.com/ultralytics/ultralytics.git
```
2. Navigate to the cloned repository's root directory:
```bash
cd ultralytics
```
3. Install the package in developer mode using pip (or pip3 for Python 3):
```bash
pip install -e '.[dev]'
```
- This command installs the ultralytics package along with all development dependencies, allowing you to modify the package code and have the changes immediately reflected in your Python environment.
## 🚀 Building and Serving Locally
The `mkdocs serve` command builds and serves a local version of your MkDocs documentation, ideal for development and testing:
```bash
mkdocs serve
```
- #### Command Breakdown:
- `mkdocs` is the main MkDocs command-line interface.
- `serve` is the subcommand to build and locally serve your documentation.
- 🧐 Note:
- Grasp changes to the docs in real-time as `mkdocs serve` supports live reloading.
- To stop the local server, press `CTRL+C`.
## 🌍 Building and Serving Multi-Language
Supporting multi-language documentation? Follow these steps:
1. Stage all new language \*.md files with Git:
```bash
git add docs/**/*.md -f
```
2. Build all languages to the `/site` folder, ensuring relevant root-level files are present:
```bash
# Clear existing /site directory
rm -rf site
# Loop through each language config file and build
mkdocs build -f docs/mkdocs.yml
for file in docs/mkdocs_*.yml; do
echo "Building MkDocs site with $file"
mkdocs build -f "$file"
done
```
3. To preview your site, initiate a simple HTTP server:
```bash
cd site
python -m http.server
# Open in your preferred browser
```
- 🖥️ Access the live site at `http://localhost:8000`.
## 📤 Deploying Your Documentation Site
Choose a hosting provider and deployment method for your MkDocs documentation:
- Configure `mkdocs.yml` with deployment settings.
- Use `mkdocs deploy` to build and deploy your site.
* ### GitHub Pages Deployment Example:
```bash
mkdocs gh-deploy
```
- Update the "Custom domain" in your repository's settings for a personalized URL.
![MkDocs deployment example](https://github.com/ultralytics/docs/releases/download/0/mkdocs-deployment-example.avif)
- For detailed deployment guidance, consult the [MkDocs documentation](https://www.mkdocs.org/user-guide/deploying-your-docs/).
## 💡 Contribute
We cherish the community's input as it drives Ultralytics open-source initiatives. Dive into the [Contributing Guide](https://docs.ultralytics.com/help/contributing/) and share your thoughts via our [Survey](https://www.ultralytics.com/survey?utm_source=github&utm_medium=social&utm_campaign=Survey). A heartfelt thank you 🙏 to each contributor!
![Ultralytics open-source contributors](https://github.com/ultralytics/docs/releases/download/0/ultralytics-open-source-contributors.avif)
## 📜 License
Ultralytics Docs presents two licensing options:
- **AGPL-3.0 License**: Perfect for academia and open collaboration. Details are in the [LICENSE](https://github.com/ultralytics/docs/blob/main/LICENSE) file.
- **Enterprise License**: Tailored for commercial usage, offering a seamless blend of Ultralytics technology in your products. Learn more at [Ultralytics Licensing](https://www.ultralytics.com/license).
## ✉️ Contact
For Ultralytics bug reports and feature requests please visit [GitHub Issues](https://github.com/ultralytics/ultralytics/issues). Become a member of the Ultralytics [Discord](https://discord.com/invite/ultralytics), [Reddit](https://www.reddit.com/r/ultralytics/), or [Forums](https://community.ultralytics.com/) for asking questions, sharing projects, learning discussions, or for help with all things Ultralytics!
<br>
<div align="center">
<a href="https://github.com/ultralytics"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-github.png" width="3%" alt="Ultralytics GitHub"></a>
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%" alt="space">
<a href="https://www.linkedin.com/company/ultralytics/"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-linkedin.png" width="3%" alt="Ultralytics LinkedIn"></a>
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%" alt="space">
<a href="https://twitter.com/ultralytics"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-twitter.png" width="3%" alt="Ultralytics Twitter"></a>
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%" alt="space">
<a href="https://youtube.com/ultralytics?sub_confirmation=1"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-youtube.png" width="3%" alt="Ultralytics YouTube"></a>
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%" alt="space">
<a href="https://www.tiktok.com/@ultralytics"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-tiktok.png" width="3%" alt="Ultralytics TikTok"></a>
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%" alt="space">
<a href="https://ultralytics.com/bilibili"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-bilibili.png" width="3%" alt="Ultralytics BiliBili"></a>
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%" alt="space">
<a href="https://discord.com/invite/ultralytics"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-discord.png" width="3%" alt="Ultralytics Discord"></a>
</div>
# Ultralytics YOLO 🚀, AGPL-3.0 license
"""
Automates the building and post-processing of MkDocs documentation, particularly for projects with multilingual content.
It streamlines the workflow for generating localized versions of the documentation and updating HTML links to ensure
they are correctly formatted.
Key Features:
- Automated building of MkDocs documentation: The script compiles both the main documentation and
any localized versions specified in separate MkDocs configuration files.
- Post-processing of generated HTML files: After the documentation is built, the script updates all
HTML files to remove the '.md' extension from internal links. This ensures that links in the built
HTML documentation correctly point to other HTML pages rather than Markdown files, which is crucial
for proper navigation within the web-based documentation.
Usage:
- Run the script from the root directory of your MkDocs project.
- Ensure that MkDocs is installed and that all MkDocs configuration files (main and localized versions)
are present in the project directory.
- The script first builds the documentation using MkDocs, then scans the generated HTML files in the 'site'
directory to update the internal links.
- It's ideal for projects where the documentation is written in Markdown and needs to be served as a static website.
Note:
- This script is built to be run in an environment where Python and MkDocs are installed and properly configured.
"""
import os
import re
import shutil
import subprocess
from pathlib import Path
from bs4 import BeautifulSoup
from tqdm import tqdm
os.environ["JUPYTER_PLATFORM_DIRS"] = "1" # fix DeprecationWarning: Jupyter is migrating to use standard platformdirs
DOCS = Path(__file__).parent.resolve()
SITE = DOCS.parent / "site"
def prepare_docs_markdown(clone_repos=True):
"""Build docs using mkdocs."""
if SITE.exists():
print(f"Removing existing {SITE}")
shutil.rmtree(SITE)
# Get hub-sdk repo
if clone_repos:
repo = "https://github.com/ultralytics/hub-sdk"
local_dir = DOCS.parent / Path(repo).name
if not local_dir.exists():
os.system(f"git clone {repo} {local_dir}")
os.system(f"git -C {local_dir} pull") # update repo
shutil.rmtree(DOCS / "en/hub/sdk", ignore_errors=True) # delete if exists
shutil.copytree(local_dir / "docs", DOCS / "en/hub/sdk") # for docs
shutil.rmtree(DOCS.parent / "hub_sdk", ignore_errors=True) # delete if exists
shutil.copytree(local_dir / "hub_sdk", DOCS.parent / "hub_sdk") # for mkdocstrings
print(f"Cloned/Updated {repo} in {local_dir}")
# Add frontmatter
for file in tqdm((DOCS / "en").rglob("*.md"), desc="Adding frontmatter"):
update_markdown_files(file)
def update_page_title(file_path: Path, new_title: str):
"""Update the title of an HTML file."""
# Read the content of the file
with open(file_path, encoding="utf-8") as file:
content = file.read()
# Replace the existing title with the new title
updated_content = re.sub(r"<title>.*?</title>", f"<title>{new_title}</title>", content)
# Write the updated content back to the file
with open(file_path, "w", encoding="utf-8") as file:
file.write(updated_content)
def update_html_head(script=""):
"""Update the HTML head section of each file."""
html_files = Path(SITE).rglob("*.html")
for html_file in tqdm(html_files, desc="Processing HTML files"):
with html_file.open("r", encoding="utf-8") as file:
html_content = file.read()
if script in html_content: # script already in HTML file
return
head_end_index = html_content.lower().rfind("</head>")
if head_end_index != -1:
# Add the specified JavaScript to the HTML file just before the end of the head tag.
new_html_content = html_content[:head_end_index] + script + html_content[head_end_index:]
with html_file.open("w", encoding="utf-8") as file:
file.write(new_html_content)
def update_subdir_edit_links(subdir="", docs_url=""):
"""Update the HTML head section of each file."""
if str(subdir[0]) == "/":
subdir = str(subdir[0])[1:]
html_files = (SITE / subdir).rglob("*.html")
for html_file in tqdm(html_files, desc="Processing subdir files"):
with html_file.open("r", encoding="utf-8") as file:
soup = BeautifulSoup(file, "html.parser")
# Find the anchor tag and update its href attribute
a_tag = soup.find("a", {"class": "md-content__button md-icon"})
if a_tag and a_tag["title"] == "Edit this page":
a_tag["href"] = f"{docs_url}{a_tag['href'].split(subdir)[-1]}"
# Write the updated HTML back to the file
with open(html_file, "w", encoding="utf-8") as file:
file.write(str(soup))
def update_markdown_files(md_filepath: Path):
"""Creates or updates a Markdown file, ensuring frontmatter is present."""
if md_filepath.exists():
content = md_filepath.read_text().strip()
# Replace apostrophes
content = content.replace("‘", "'").replace("’", "'")
# Add frontmatter if missing
if not content.strip().startswith("---\n") and "macros" not in md_filepath.parts: # skip macros directory
header = "---\ncomments: true\ndescription: TODO ADD DESCRIPTION\nkeywords: TODO ADD KEYWORDS\n---\n\n"
content = header + content
# Ensure MkDocs admonitions "=== " lines are preceded and followed by empty newlines
lines = content.split("\n")
new_lines = []
for i, line in enumerate(lines):
stripped_line = line.strip()
if stripped_line.startswith("=== "):
if i > 0 and new_lines[-1] != "":
new_lines.append("")
new_lines.append(line)
if i < len(lines) - 1 and lines[i + 1].strip() != "":
new_lines.append("")
else:
new_lines.append(line)
content = "\n".join(new_lines)
# Add EOF newline if missing
if not content.endswith("\n"):
content += "\n"
# Save page
md_filepath.write_text(content)
return
def update_docs_html():
"""Updates titles, edit links, head sections, and converts plaintext links in HTML documentation."""
# Update 404 titles
update_page_title(SITE / "404.html", new_title="Ultralytics Docs - Not Found")
# Update edit links
update_subdir_edit_links(
subdir="hub/sdk/", # do not use leading slash
docs_url="https://github.com/ultralytics/hub-sdk/tree/main/docs/",
)
# Convert plaintext links to HTML hyperlinks
files_modified = 0
for html_file in tqdm(SITE.rglob("*.html"), desc="Converting plaintext links"):
with open(html_file, encoding="utf-8") as file:
content = file.read()
updated_content = convert_plaintext_links_to_html(content)
if updated_content != content:
with open(html_file, "w", encoding="utf-8") as file:
file.write(updated_content)
files_modified += 1
print(f"Modified plaintext links in {files_modified} files.")
# Update HTML file head section
script = ""
if any(script):
update_html_head(script)
# Delete the /macros directory from the built site
macros_dir = SITE / "macros"
if macros_dir.exists():
print(f"Removing /macros directory from site: {macros_dir}")
shutil.rmtree(macros_dir)
def convert_plaintext_links_to_html(content):
"""Convert plaintext links to HTML hyperlinks in the main content area only."""
soup = BeautifulSoup(content, "html.parser")
# Find the main content area (adjust this selector based on your HTML structure)
main_content = soup.find("main") or soup.find("div", class_="md-content")
if not main_content:
return content # Return original content if main content area not found
modified = False
for paragraph in main_content.find_all(["p", "li"]): # Focus on paragraphs and list items
for text_node in paragraph.find_all(string=True, recursive=False):
if text_node.parent.name not in {"a", "code"}: # Ignore links and code blocks
new_text = re.sub(
r"(https?://[^\s()<>]*[^\s()<>.,:;!?\'\"])",
r'<a href="\1">\1</a>',
str(text_node),
)
if "<a href=" in new_text:
# Parse the new text with BeautifulSoup to handle HTML properly
new_soup = BeautifulSoup(new_text, "html.parser")
text_node.replace_with(new_soup)
modified = True
return str(soup) if modified else content
def remove_macros():
"""Removes the /macros directory and related entries in sitemap.xml from the built site."""
shutil.rmtree(SITE / "macros", ignore_errors=True)
(SITE / "sitemap.xml.gz").unlink(missing_ok=True)
# Process sitemap.xml
sitemap = SITE / "sitemap.xml"
lines = sitemap.read_text(encoding="utf-8").splitlines(keepends=True)
# Find indices of '/macros/' lines
macros_indices = [i for i, line in enumerate(lines) if "/macros/" in line]
# Create a set of indices to remove (including lines before and after)
indices_to_remove = set()
for i in macros_indices:
indices_to_remove.update(range(i - 1, i + 3)) # i-1, i, i+1, i+2, i+3
# Create new list of lines, excluding the ones to remove
new_lines = [line for i, line in enumerate(lines) if i not in indices_to_remove]
# Write the cleaned content back to the file
sitemap.write_text("".join(new_lines), encoding="utf-8")
print(f"Removed {len(macros_indices)} URLs containing '/macros/' from {sitemap}")
def minify_files(html=True, css=True, js=True):
"""Minifies HTML, CSS, and JS files and prints total reduction stats."""
minify, compress, jsmin = None, None, None
try:
if html:
from minify_html import minify
if css:
from csscompressor import compress
if js:
import jsmin
except ImportError as e:
print(f"Missing required package: {str(e)}")
return
stats = {}
for ext, minifier in {
"html": (lambda x: minify(x, keep_closing_tags=True, minify_css=True, minify_js=True)) if html else None,
"css": compress if css else None,
"js": jsmin.jsmin if js else None,
}.items():
if not minifier:
continue
stats[ext] = {"original": 0, "minified": 0}
directory = "" # "stylesheets" if ext == css else "javascript" if ext == "js" else ""
for f in tqdm((SITE / directory).rglob(f"*.{ext}"), desc=f"Minifying {ext.upper()}"):
content = f.read_text(encoding="utf-8")
minified = minifier(content)
stats[ext]["original"] += len(content)
stats[ext]["minified"] += len(minified)
f.write_text(minified, encoding="utf-8")
for ext, data in stats.items():
if data["original"]:
r = data["original"] - data["minified"] # reduction
print(f"Total {ext.upper()} reduction: {(r / data['original']) * 100:.2f}% ({r / 1024:.2f} KB saved)")
def main():
"""Builds docs, updates titles and edit links, minifies HTML, and prints local server command."""
prepare_docs_markdown()
# Build the main documentation
print(f"Building docs from {DOCS}")
subprocess.run(f"mkdocs build -f {DOCS.parent}/mkdocs.yml --strict", check=True, shell=True)
remove_macros()
print(f"Site built at {SITE}")
# Update docs HTML pages
update_docs_html()
# Minify files
minify_files(html=False, css=False, js=False)
# Show command to serve built website
print('Docs built correctly ✅\nServe site at http://localhost:8000 with "python -m http.server --directory site"')
if __name__ == "__main__":
main()
# Ultralytics YOLO 🚀, AGPL-3.0 license
"""
Helper file to build Ultralytics Docs reference section. Recursively walks through ultralytics dir and builds an MkDocs
reference section of *.md files composed of classes and functions, and also creates a nav menu for use in mkdocs.yaml.
Note: Must be run from repository root directory. Do not run from docs directory.
"""
import re
import subprocess
from collections import defaultdict
from pathlib import Path
# Constants
hub_sdk = False
if hub_sdk:
PACKAGE_DIR = Path("/Users/glennjocher/PycharmProjects/hub-sdk/hub_sdk")
REFERENCE_DIR = PACKAGE_DIR.parent / "docs/reference"
GITHUB_REPO = "ultralytics/hub-sdk"
else:
FILE = Path(__file__).resolve()
PACKAGE_DIR = FILE.parents[1] / "ultralytics" # i.e. /Users/glennjocher/PycharmProjects/ultralytics/ultralytics
REFERENCE_DIR = PACKAGE_DIR.parent / "docs/en/reference"
GITHUB_REPO = "ultralytics/ultralytics"
def extract_classes_and_functions(filepath: Path) -> tuple:
"""Extracts class and function names from a given Python file."""
content = filepath.read_text()
class_pattern = r"(?:^|\n)class\s(\w+)(?:\(|:)"
func_pattern = r"(?:^|\n)def\s(\w+)\("
classes = re.findall(class_pattern, content)
functions = re.findall(func_pattern, content)
return classes, functions
def create_markdown(py_filepath: Path, module_path: str, classes: list, functions: list):
"""Creates a Markdown file containing the API reference for the given Python module."""
md_filepath = py_filepath.with_suffix(".md")
exists = md_filepath.exists()
# Read existing content and keep header content between first two ---
header_content = ""
if exists:
existing_content = md_filepath.read_text()
header_parts = existing_content.split("---")
for part in header_parts:
if "description:" in part or "comments:" in part:
header_content += f"---{part}---\n\n"
if not any(header_content):
header_content = "---\ndescription: TODO ADD DESCRIPTION\nkeywords: TODO ADD KEYWORDS\n---\n\n"
module_name = module_path.replace(".__init__", "")
module_path = module_path.replace(".", "/")
url = f"https://github.com/{GITHUB_REPO}/blob/main/{module_path}.py"
edit = f"https://github.com/{GITHUB_REPO}/edit/main/{module_path}.py"
pretty = url.replace("__init__.py", "\\_\\_init\\_\\_.py") # properly display __init__.py filenames
title_content = (
f"# Reference for `{module_path}.py`\n\n"
f"!!! note\n\n"
f" This file is available at [{pretty}]({url}). If you spot a problem please help fix it by [contributing]"
f"(https://docs.ultralytics.com/help/contributing/) a [Pull Request]({edit}) 🛠️. Thank you 🙏!\n\n"
)
md_content = ["<br>\n"] + [f"## ::: {module_name}.{class_name}\n\n<br><br><hr><br>\n" for class_name in classes]
md_content.extend(f"## ::: {module_name}.{func_name}\n\n<br><br><hr><br>\n" for func_name in functions)
md_content[-1] = md_content[-1].replace("<hr><br>", "") # remove last horizontal line
md_content = header_content + title_content + "\n".join(md_content)
if not md_content.endswith("\n"):
md_content += "\n"
md_filepath.parent.mkdir(parents=True, exist_ok=True)
md_filepath.write_text(md_content)
if not exists:
# Add new markdown file to the git staging area
print(f"Created new file '{md_filepath}'")
subprocess.run(["git", "add", "-f", str(md_filepath)], check=True, cwd=PACKAGE_DIR)
return md_filepath.relative_to(PACKAGE_DIR.parent)
def nested_dict() -> defaultdict:
"""Creates and returns a nested defaultdict."""
return defaultdict(nested_dict)
def sort_nested_dict(d: dict) -> dict:
"""Sorts a nested dictionary recursively."""
return {key: sort_nested_dict(value) if isinstance(value, dict) else value for key, value in sorted(d.items())}
def create_nav_menu_yaml(nav_items: list, save: bool = False):
"""Creates a YAML file for the navigation menu based on the provided list of items."""
nav_tree = nested_dict()
for item_str in nav_items:
item = Path(item_str)
parts = item.parts
current_level = nav_tree["reference"]
for part in parts[2:-1]: # skip the first two parts (docs and reference) and the last part (filename)
current_level = current_level[part]
md_file_name = parts[-1].replace(".md", "")
current_level[md_file_name] = item
nav_tree_sorted = sort_nested_dict(nav_tree)
def _dict_to_yaml(d, level=0):
"""Converts a nested dictionary to a YAML-formatted string with indentation."""
yaml_str = ""
indent = " " * level
for k, v in d.items():
if isinstance(v, dict):
yaml_str += f"{indent}- {k}:\n{_dict_to_yaml(v, level + 1)}"
else:
yaml_str += f"{indent}- {k}: {str(v).replace('docs/en/', '')}\n"
return yaml_str
# Print updated YAML reference section
print("Scan complete, new mkdocs.yaml reference section is:\n\n", _dict_to_yaml(nav_tree_sorted))
# Save new YAML reference section
if save:
(PACKAGE_DIR.parent / "nav_menu_updated.yml").write_text(_dict_to_yaml(nav_tree_sorted))
def main():
"""Main function to extract class and function names, create Markdown files, and generate a YAML navigation menu."""
nav_items = []
for py_filepath in PACKAGE_DIR.rglob("*.py"):
classes, functions = extract_classes_and_functions(py_filepath)
if classes or functions:
py_filepath_rel = py_filepath.relative_to(PACKAGE_DIR)
md_filepath = REFERENCE_DIR / py_filepath_rel
module_path = f"{PACKAGE_DIR.name}.{py_filepath_rel.with_suffix('').as_posix().replace('/', '.')}"
md_rel_filepath = create_markdown(md_filepath, module_path, classes, functions)
nav_items.append(str(md_rel_filepath))
create_nav_menu_yaml(nav_items)
if __name__ == "__main__":
main()
---
description: Discover what's next for Ultralytics with our under-construction page, previewing new, groundbreaking AI and ML features coming soon.
keywords: Ultralytics, coming soon, under construction, new features, AI updates, ML advancements, YOLO, technology preview
---
# Under Construction 🏗️🌟
Welcome to the [Ultralytics](https://www.ultralytics.com/) "Under Construction" page! Here, we're hard at work developing the next generation of AI and ML innovations. This page serves as a teaser for the exciting updates and new features we're eager to share with you!
## Exciting New Features on the Way 🎉
- **Innovative Breakthroughs:** Get ready for advanced features and services that will transform your AI and ML experience.
- **New Horizons:** Anticipate novel products that redefine AI and ML capabilities.
- **Enhanced Services:** We're upgrading our services for greater efficiency and user-friendliness.
## Stay Updated 🚧
This placeholder page is your first stop for upcoming developments. Keep an eye out for:
- **Newsletter:** Subscribe [here](https://www.ultralytics.com/#newsletter) for the latest news.
- **Social Media:** Follow us [here](https://www.linkedin.com/company/ultralytics) for updates and teasers.
- **Blog:** Visit our [blog](https://www.ultralytics.com/blog) for detailed insights.
## We Value Your Input 🗣️
Your feedback shapes our future releases. Share your thoughts and suggestions [here](https://www.ultralytics.com/survey).
## Thank You, Community! 🌍
Your [contributions](https://docs.ultralytics.com/help/contributing/) inspire our continuous [innovation](https://github.com/ultralytics/ultralytics). Stay tuned for the big reveal of what's next in AI and ML at Ultralytics!
---
Excited for what's coming? Bookmark this page and get ready for a transformative AI and ML journey with Ultralytics! 🛠️🤖
docs.ultralytics.com
---
comments: true
description: Explore the widely-used Caltech-101 dataset with 9,000 images across 101 categories. Ideal for object recognition tasks in machine learning and computer vision.
keywords: Caltech-101, dataset, object recognition, machine learning, computer vision, YOLO, deep learning, research, AI
---
# Caltech-101 Dataset
The [Caltech-101](https://data.caltech.edu/records/mzrjq-6wc02) dataset is a widely used dataset for object recognition tasks, containing around 9,000 images from 101 object categories. The categories were chosen to reflect a variety of real-world objects, and the images themselves were carefully selected and annotated to provide a challenging benchmark for object recognition algorithms.
## Key Features
- The Caltech-101 dataset comprises around 9,000 color images divided into 101 categories.
- The categories encompass a wide variety of objects, including animals, vehicles, household items, and people.
- The number of images per category varies, with about 40 to 800 images in each category.
- Images are of variable sizes, with most images being medium resolution.
- Caltech-101 is widely used for training and testing in the field of machine learning, particularly for object recognition tasks.
## Dataset Structure
Unlike many other datasets, the Caltech-101 dataset is not formally split into training and testing sets. Users typically create their own splits based on their specific needs. However, a common practice is to use a random subset of images for training (e.g., 30 images per category) and the remaining images for testing.
## Applications
The Caltech-101 dataset is extensively used for training and evaluating [deep learning](https://www.ultralytics.com/glossary/deep-learning-dl) models in object recognition tasks, such as [Convolutional Neural Networks](https://www.ultralytics.com/glossary/convolutional-neural-network-cnn) (CNNs), Support Vector Machines (SVMs), and various other machine learning algorithms. Its wide variety of categories and high-quality images make it an excellent dataset for research and development in the field of machine learning and [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv).
## Usage
To train a YOLO model on the Caltech-101 dataset for 100 epochs, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-cls.pt") # load a pretrained model (recommended for training)
# Train the model
results = model.train(data="caltech101", epochs=100, imgsz=416)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo classify train data=caltech101 model=yolo11n-cls.pt epochs=100 imgsz=416
```
## Sample Images and Annotations
The Caltech-101 dataset contains high-quality color images of various objects, providing a well-structured dataset for object recognition tasks. Here are some examples of images from the dataset:
![Dataset sample image](https://github.com/ultralytics/docs/releases/download/0/caltech101-sample-image.avif)
The example showcases the variety and complexity of the objects in the Caltech-101 dataset, emphasizing the significance of a diverse dataset for training robust object recognition models.
## Citations and Acknowledgments
If you use the Caltech-101 dataset in your research or development work, please cite the following paper:
!!! quote ""
=== "BibTeX"
```bibtex
@article{fei2007learning,
title={Learning generative visual models from few training examples: An incremental Bayesian approach tested on 101 object categories},
author={Fei-Fei, Li and Fergus, Rob and Perona, Pietro},
journal={Computer vision and Image understanding},
volume={106},
number={1},
pages={59--70},
year={2007},
publisher={Elsevier}
}
```
We would like to acknowledge Li Fei-Fei, Rob Fergus, and Pietro Perona for creating and maintaining the Caltech-101 dataset as a valuable resource for the machine learning and computer vision research community. For more information about the Caltech-101 dataset and its creators, visit the [Caltech-101 dataset website](https://data.caltech.edu/records/mzrjq-6wc02).
## FAQ
### What is the Caltech-101 dataset used for in machine learning?
The [Caltech-101](https://data.caltech.edu/records/mzrjq-6wc02) dataset is widely used in machine learning for object recognition tasks. It contains around 9,000 images across 101 categories, providing a challenging benchmark for evaluating object recognition algorithms. Researchers leverage it to train and test models, especially Convolutional [Neural Networks](https://www.ultralytics.com/glossary/neural-network-nn) (CNNs) and [Support Vector Machines](https://www.ultralytics.com/glossary/support-vector-machine-svm) (SVMs), in computer vision.
### How can I train an Ultralytics YOLO model on the Caltech-101 dataset?
To train an Ultralytics YOLO model on the Caltech-101 dataset, you can use the provided code snippets. For example, to train for 100 [epochs](https://www.ultralytics.com/glossary/epoch):
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-cls.pt") # load a pretrained model (recommended for training)
# Train the model
results = model.train(data="caltech101", epochs=100, imgsz=416)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo classify train data=caltech101 model=yolo11n-cls.pt epochs=100 imgsz=416
```
For more detailed arguments and options, refer to the model [Training](../../modes/train.md) page.
### What are the key features of the Caltech-101 dataset?
The Caltech-101 dataset includes:
- Around 9,000 color images across 101 categories.
- Categories covering a diverse range of objects, including animals, vehicles, and household items.
- Variable number of images per category, typically between 40 and 800.
- Variable image sizes, with most being medium resolution.
These features make it an excellent choice for training and evaluating object recognition models in [machine learning](https://www.ultralytics.com/glossary/machine-learning-ml) and computer vision.
### Why should I cite the Caltech-101 dataset in my research?
Citing the Caltech-101 dataset in your research acknowledges the creators' contributions and provides a reference for others who might use the dataset. The recommended citation is:
!!! quote ""
=== "BibTeX"
```bibtex
@article{fei2007learning,
title={Learning generative visual models from few training examples: An incremental Bayesian approach tested on 101 object categories},
author={Fei-Fei, Li and Fergus, Rob and Perona, Pietro},
journal={Computer vision and Image understanding},
volume={106},
number={1},
pages={59--70},
year={2007},
publisher={Elsevier}
}
```
Citing helps in maintaining the integrity of academic work and assists peers in locating the original resource.
### Can I use Ultralytics HUB for training models on the Caltech-101 dataset?
Yes, you can use Ultralytics HUB for training models on the Caltech-101 dataset. Ultralytics HUB provides an intuitive platform for managing datasets, training models, and deploying them without extensive coding. For a detailed guide, refer to the [how to train your custom models with Ultralytics HUB](https://www.ultralytics.com/blog/how-to-train-your-custom-models-with-ultralytics-hub) blog post.
---
comments: true
description: Explore the Caltech-256 dataset, featuring 30,000 images across 257 categories, ideal for training and testing object recognition algorithms.
keywords: Caltech-256 dataset, object classification, image dataset, machine learning, computer vision, deep learning, YOLO, training dataset
---
# Caltech-256 Dataset
The [Caltech-256](https://data.caltech.edu/records/nyy15-4j048) dataset is an extensive collection of images used for object classification tasks. It contains around 30,000 images divided into 257 categories (256 object categories and 1 background category). The images are carefully curated and annotated to provide a challenging and diverse benchmark for object recognition algorithms.
<p align="center">
<br>
<iframe loading="lazy" width="720" height="405" src="https://www.youtube.com/embed/isc06_9qnM0"
title="YouTube video player" frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
allowfullscreen>
</iframe>
<br>
<strong>Watch:</strong> How to Train <a href="https://www.ultralytics.com/glossary/image-classification">Image Classification</a> Model using Caltech-256 Dataset with Ultralytics HUB
</p>
## Key Features
- The Caltech-256 dataset comprises around 30,000 color images divided into 257 categories.
- Each category contains a minimum of 80 images.
- The categories encompass a wide variety of real-world objects, including animals, vehicles, household items, and people.
- Images are of variable sizes and resolutions.
- Caltech-256 is widely used for training and testing in the field of machine learning, particularly for object recognition tasks.
## Dataset Structure
Like Caltech-101, the Caltech-256 dataset does not have a formal split between training and testing sets. Users typically create their own splits according to their specific needs. A common practice is to use a random subset of images for training and the remaining images for testing.
## Applications
The Caltech-256 dataset is extensively used for training and evaluating [deep learning](https://www.ultralytics.com/glossary/deep-learning-dl) models in object recognition tasks, such as [Convolutional Neural Networks](https://www.ultralytics.com/glossary/convolutional-neural-network-cnn) (CNNs), Support Vector Machines (SVMs), and various other machine learning algorithms. Its diverse set of categories and high-quality images make it an invaluable dataset for research and development in the field of machine learning and [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv).
## Usage
To train a YOLO model on the Caltech-256 dataset for 100 epochs, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-cls.pt") # load a pretrained model (recommended for training)
# Train the model
results = model.train(data="caltech256", epochs=100, imgsz=416)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo classify train data=caltech256 model=yolo11n-cls.pt epochs=100 imgsz=416
```
## Sample Images and Annotations
The Caltech-256 dataset contains high-quality color images of various objects, providing a comprehensive dataset for object recognition tasks. Here are some examples of images from the dataset ([credit](https://ml4a.github.io/demos/tsne_viewer.html)):
![Dataset sample image](https://github.com/ultralytics/docs/releases/download/0/caltech256-sample-image.avif)
The example showcases the diversity and complexity of the objects in the Caltech-256 dataset, emphasizing the importance of a varied dataset for training robust object recognition models.
## Citations and Acknowledgments
If you use the Caltech-256 dataset in your research or development work, please cite the following paper:
!!! quote ""
=== "BibTeX"
```bibtex
@article{griffin2007caltech,
title={Caltech-256 object category dataset},
author={Griffin, Gregory and Holub, Alex and Perona, Pietro},
year={2007}
}
```
We would like to acknowledge Gregory Griffin, Alex Holub, and Pietro Perona for creating and maintaining the Caltech-256 dataset as a valuable resource for the [machine learning](https://www.ultralytics.com/glossary/machine-learning-ml) and computer vision research community. For more information about the
Caltech-256 dataset and its creators, visit the [Caltech-256 dataset website](https://data.caltech.edu/records/nyy15-4j048).
## FAQ
### What is the Caltech-256 dataset and why is it important for machine learning?
The [Caltech-256](https://data.caltech.edu/records/nyy15-4j048) dataset is a large image dataset used primarily for object classification tasks in machine learning and computer vision. It consists of around 30,000 color images divided into 257 categories, covering a wide range of real-world objects. The dataset's diverse and high-quality images make it an excellent benchmark for evaluating object recognition algorithms, which is crucial for developing robust machine learning models.
### How can I train a YOLO model on the Caltech-256 dataset using Python or CLI?
To train a YOLO model on the Caltech-256 dataset for 100 [epochs](https://www.ultralytics.com/glossary/epoch), you can use the following code snippets. Refer to the model [Training](../../modes/train.md) page for additional options.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-cls.pt") # load a pretrained model
# Train the model
results = model.train(data="caltech256", epochs=100, imgsz=416)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo classify train data=caltech256 model=yolo11n-cls.pt epochs=100 imgsz=416
```
### What are the most common use cases for the Caltech-256 dataset?
The Caltech-256 dataset is widely used for various object recognition tasks such as:
- Training Convolutional [Neural Networks](https://www.ultralytics.com/glossary/neural-network-nn) (CNNs)
- Evaluating the performance of [Support Vector Machines](https://www.ultralytics.com/glossary/support-vector-machine-svm) (SVMs)
- Benchmarking new deep learning algorithms
- Developing [object detection](https://www.ultralytics.com/glossary/object-detection) models using frameworks like Ultralytics YOLO
Its diversity and comprehensive annotations make it ideal for research and development in machine learning and computer vision.
### How is the Caltech-256 dataset structured and split for training and testing?
The Caltech-256 dataset does not come with a predefined split for training and testing. Users typically create their own splits according to their specific needs. A common approach is to randomly select a subset of images for training and use the remaining images for testing. This flexibility allows users to tailor the dataset to their specific project requirements and experimental setups.
### Why should I use Ultralytics YOLO for training models on the Caltech-256 dataset?
Ultralytics YOLO models offer several advantages for training on the Caltech-256 dataset:
- **High Accuracy**: YOLO models are known for their state-of-the-art performance in object detection tasks.
- **Speed**: They provide real-time inference capabilities, making them suitable for applications requiring quick predictions.
- **Ease of Use**: With Ultralytics HUB, users can train, validate, and deploy models without extensive coding.
- **Pretrained Models**: Starting from pretrained models, like `yolo11n-cls.pt`, can significantly reduce training time and improve model [accuracy](https://www.ultralytics.com/glossary/accuracy).
For more details, explore our [comprehensive training guide](../../modes/train.md).
---
comments: true
description: Explore the CIFAR-10 dataset, featuring 60,000 color images in 10 classes. Learn about its structure, applications, and how to train models using YOLO.
keywords: CIFAR-10, dataset, machine learning, computer vision, image classification, YOLO, deep learning, neural networks
---
# CIFAR-10 Dataset
The [CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html) (Canadian Institute For Advanced Research) dataset is a collection of images used widely for [machine learning](https://www.ultralytics.com/glossary/machine-learning-ml) and computer vision algorithms. It was developed by researchers at the CIFAR institute and consists of 60,000 32x32 color images in 10 different classes.
<p align="center">
<br>
<iframe loading="lazy" width="720" height="405" src="https://www.youtube.com/embed/fLBbyhPbWzY"
title="YouTube video player" frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
allowfullscreen>
</iframe>
<br>
<strong>Watch:</strong> How to Train an <a href="https://www.ultralytics.com/glossary/image-classification">Image Classification</a> Model with CIFAR-10 Dataset using Ultralytics YOLO11
</p>
## Key Features
- The CIFAR-10 dataset consists of 60,000 images, divided into 10 classes.
- Each class contains 6,000 images, split into 5,000 for training and 1,000 for testing.
- The images are colored and of size 32x32 pixels.
- The 10 different classes represent airplanes, cars, birds, cats, deer, dogs, frogs, horses, ships, and trucks.
- CIFAR-10 is commonly used for training and testing in the field of machine learning and computer vision.
## Dataset Structure
The CIFAR-10 dataset is split into two subsets:
1. **Training Set**: This subset contains 50,000 images used for training machine learning models.
2. **Testing Set**: This subset consists of 10,000 images used for testing and benchmarking the trained models.
## Applications
The CIFAR-10 dataset is widely used for training and evaluating [deep learning](https://www.ultralytics.com/glossary/deep-learning-dl) models in image classification tasks, such as [Convolutional Neural Networks](https://www.ultralytics.com/glossary/convolutional-neural-network-cnn) (CNNs), Support Vector Machines (SVMs), and various other machine learning algorithms. The diversity of the dataset in terms of classes and the presence of color images make it a well-rounded dataset for research and development in the field of machine learning and computer vision.
## Usage
To train a YOLO model on the CIFAR-10 dataset for 100 epochs with an image size of 32x32, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-cls.pt") # load a pretrained model (recommended for training)
# Train the model
results = model.train(data="cifar10", epochs=100, imgsz=32)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo classify train data=cifar10 model=yolo11n-cls.pt epochs=100 imgsz=32
```
## Sample Images and Annotations
The CIFAR-10 dataset contains color images of various objects, providing a well-structured dataset for image classification tasks. Here are some examples of images from the dataset:
![Dataset sample image](https://github.com/ultralytics/docs/releases/download/0/cifar10-sample-image.avif)
The example showcases the variety and complexity of the objects in the CIFAR-10 dataset, highlighting the importance of a diverse dataset for training robust image classification models.
## Citations and Acknowledgments
If you use the CIFAR-10 dataset in your research or development work, please cite the following paper:
!!! quote ""
=== "BibTeX"
```bibtex
@TECHREPORT{Krizhevsky09learningmultiple,
author={Alex Krizhevsky},
title={Learning multiple layers of features from tiny images},
institution={},
year={2009}
}
```
We would like to acknowledge Alex Krizhevsky for creating and maintaining the CIFAR-10 dataset as a valuable resource for the machine learning and [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) research community. For more information about the CIFAR-10 dataset and its creator, visit the [CIFAR-10 dataset website](https://www.cs.toronto.edu/~kriz/cifar.html).
## FAQ
### How can I train a YOLO model on the CIFAR-10 dataset?
To train a YOLO model on the CIFAR-10 dataset using Ultralytics, you can follow the examples provided for both Python and CLI. Here is a basic example to train your model for 100 [epochs](https://www.ultralytics.com/glossary/epoch) with an image size of 32x32 pixels:
!!! example
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-cls.pt") # load a pretrained model (recommended for training)
# Train the model
results = model.train(data="cifar10", epochs=100, imgsz=32)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo classify train data=cifar10 model=yolo11n-cls.pt epochs=100 imgsz=32
```
For more details, refer to the model [Training](../../modes/train.md) page.
### What are the key features of the CIFAR-10 dataset?
The CIFAR-10 dataset consists of 60,000 color images divided into 10 classes. Each class contains 6,000 images, with 5,000 for training and 1,000 for testing. The images are 32x32 pixels in size and vary across the following categories:
- Airplanes
- Cars
- Birds
- Cats
- Deer
- Dogs
- Frogs
- Horses
- Ships
- Trucks
This diverse dataset is essential for training image classification models in fields such as machine learning and computer vision. For more information, visit the CIFAR-10 sections on [dataset structure](#dataset-structure) and [applications](#applications).
### Why use the CIFAR-10 dataset for image classification tasks?
The CIFAR-10 dataset is an excellent benchmark for image classification due to its diversity and structure. It contains a balanced mix of 60,000 labeled images across 10 different categories, which helps in training robust and generalized models. It is widely used for evaluating deep learning models, including Convolutional [Neural Networks](https://www.ultralytics.com/glossary/neural-network-nn) (CNNs) and other machine learning algorithms. The dataset is relatively small, making it suitable for quick experimentation and algorithm development. Explore its numerous applications in the [applications](#applications) section.
### How is the CIFAR-10 dataset structured?
The CIFAR-10 dataset is structured into two main subsets:
1. **Training Set**: Contains 50,000 images used for training machine learning models.
2. **Testing Set**: Consists of 10,000 images for testing and benchmarking the trained models.
Each subset comprises images categorized into 10 classes, with their annotations readily available for model training and evaluation. For more detailed information, refer to the [dataset structure](#dataset-structure) section.
### How can I cite the CIFAR-10 dataset in my research?
If you use the CIFAR-10 dataset in your research or development projects, make sure to cite the following paper:
!!! quote ""
=== "BibTeX"
```bibtex
@TECHREPORT{Krizhevsky09learningmultiple,
author={Alex Krizhevsky},
title={Learning multiple layers of features from tiny images},
institution={},
year={2009}
}
```
Acknowledging the dataset's creators helps support continued research and development in the field. For more details, see the [citations and acknowledgments](#citations-and-acknowledgments) section.
### What are some practical examples of using the CIFAR-10 dataset?
The CIFAR-10 dataset is often used for training image classification models, such as Convolutional Neural Networks (CNNs) and [Support Vector Machines](https://www.ultralytics.com/glossary/support-vector-machine-svm) (SVMs). These models can be employed in various computer vision tasks including [object detection](https://www.ultralytics.com/glossary/object-detection), [image recognition](https://www.ultralytics.com/glossary/image-recognition), and automated tagging. To see some practical examples, check the code snippets in the [usage](#usage) section.
---
comments: true
description: Explore the CIFAR-100 dataset, consisting of 60,000 32x32 color images across 100 classes. Ideal for machine learning and computer vision tasks.
keywords: CIFAR-100, dataset, machine learning, computer vision, image classification, deep learning, YOLO, training, testing, Alex Krizhevsky
---
# CIFAR-100 Dataset
The [CIFAR-100](https://www.cs.toronto.edu/~kriz/cifar.html) (Canadian Institute For Advanced Research) dataset is a significant extension of the CIFAR-10 dataset, composed of 60,000 32x32 color images in 100 different classes. It was developed by researchers at the CIFAR institute, offering a more challenging dataset for more complex machine learning and [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) tasks.
## Key Features
- The CIFAR-100 dataset consists of 60,000 images, divided into 100 classes.
- Each class contains 600 images, split into 500 for training and 100 for testing.
- The images are colored and of size 32x32 pixels.
- The 100 different classes are grouped into 20 coarse categories for higher level classification.
- CIFAR-100 is commonly used for training and testing in the field of machine learning and computer vision.
## Dataset Structure
The CIFAR-100 dataset is split into two subsets:
1. **Training Set**: This subset contains 50,000 images used for training machine learning models.
2. **Testing Set**: This subset consists of 10,000 images used for testing and benchmarking the trained models.
## Applications
The CIFAR-100 dataset is extensively used for training and evaluating deep learning models in image classification tasks, such as [Convolutional Neural Networks](https://www.ultralytics.com/glossary/convolutional-neural-network-cnn) (CNNs), Support Vector Machines (SVMs), and various other machine learning algorithms. The diversity of the dataset in terms of classes and the presence of color images make it a more challenging and comprehensive dataset for research and development in the field of machine learning and computer vision.
## Usage
To train a YOLO model on the CIFAR-100 dataset for 100 [epochs](https://www.ultralytics.com/glossary/epoch) with an image size of 32x32, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-cls.pt") # load a pretrained model (recommended for training)
# Train the model
results = model.train(data="cifar100", epochs=100, imgsz=32)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo classify train data=cifar100 model=yolo11n-cls.pt epochs=100 imgsz=32
```
## Sample Images and Annotations
The CIFAR-100 dataset contains color images of various objects, providing a well-structured dataset for [image classification](https://www.ultralytics.com/glossary/image-classification) tasks. Here are some examples of images from the dataset:
![Dataset sample image](https://github.com/ultralytics/docs/releases/download/0/cifar100-sample-image.avif)
The example showcases the variety and complexity of the objects in the CIFAR-100 dataset, highlighting the importance of a diverse dataset for training robust image classification models.
## Citations and Acknowledgments
If you use the CIFAR-100 dataset in your research or development work, please cite the following paper:
!!! quote ""
=== "BibTeX"
```bibtex
@TECHREPORT{Krizhevsky09learningmultiple,
author={Alex Krizhevsky},
title={Learning multiple layers of features from tiny images},
institution={},
year={2009}
}
```
We would like to acknowledge Alex Krizhevsky for creating and maintaining the CIFAR-100 dataset as a valuable resource for the [machine learning](https://www.ultralytics.com/glossary/machine-learning-ml) and computer vision research community. For more information about the CIFAR-100 dataset and its creator, visit the [CIFAR-100 dataset website](https://www.cs.toronto.edu/~kriz/cifar.html).
## FAQ
### What is the CIFAR-100 dataset and why is it significant?
The [CIFAR-100 dataset](https://www.cs.toronto.edu/~kriz/cifar.html) is a large collection of 60,000 32x32 color images classified into 100 classes. Developed by the Canadian Institute For Advanced Research (CIFAR), it provides a challenging dataset ideal for complex machine learning and computer vision tasks. Its significance lies in the diversity of classes and the small size of the images, making it a valuable resource for training and testing deep learning models, like Convolutional [Neural Networks](https://www.ultralytics.com/glossary/neural-network-nn) (CNNs), using frameworks such as Ultralytics YOLO.
### How do I train a YOLO model on the CIFAR-100 dataset?
You can train a YOLO model on the CIFAR-100 dataset using either Python or CLI commands. Here's how:
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-cls.pt") # load a pretrained model (recommended for training)
# Train the model
results = model.train(data="cifar100", epochs=100, imgsz=32)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo classify train data=cifar100 model=yolo11n-cls.pt epochs=100 imgsz=32
```
For a comprehensive list of available arguments, please refer to the model [Training](../../modes/train.md) page.
### What are the primary applications of the CIFAR-100 dataset?
The CIFAR-100 dataset is extensively used in training and evaluating [deep learning](https://www.ultralytics.com/glossary/deep-learning-dl) models for image classification. Its diverse set of 100 classes, grouped into 20 coarse categories, provides a challenging environment for testing algorithms such as Convolutional Neural Networks (CNNs), [Support Vector Machines](https://www.ultralytics.com/glossary/support-vector-machine-svm) (SVMs), and various other machine learning approaches. This dataset is a key resource in research and development within machine learning and computer vision fields.
### How is the CIFAR-100 dataset structured?
The CIFAR-100 dataset is split into two main subsets:
1. **Training Set**: Contains 50,000 images used for training machine learning models.
2. **Testing Set**: Consists of 10,000 images used for testing and benchmarking the trained models.
Each of the 100 classes contains 600 images, with 500 images for training and 100 for testing, making it uniquely suited for rigorous academic and industrial research.
### Where can I find sample images and annotations from the CIFAR-100 dataset?
The CIFAR-100 dataset includes a variety of color images of various objects, making it a structured dataset for image classification tasks. You can refer to the documentation page to see [sample images and annotations](#sample-images-and-annotations). These examples highlight the dataset's diversity and complexity, important for training robust image classification models.
---
comments: true
description: Explore the Fashion-MNIST dataset, a modern replacement for MNIST with 70,000 Zalando article images. Ideal for benchmarking machine learning models.
keywords: Fashion-MNIST, image classification, Zalando dataset, machine learning, deep learning, CNN, dataset overview
---
# Fashion-MNIST Dataset
The [Fashion-MNIST](https://github.com/zalandoresearch/fashion-mnist) dataset is a database of Zalando's article images—consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes. Fashion-MNIST is intended to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking [machine learning](https://www.ultralytics.com/glossary/machine-learning-ml) algorithms.
<p align="center">
<br>
<iframe loading="lazy" width="720" height="405" src="https://www.youtube.com/embed/eX5ad6udQ9Q"
title="YouTube video player" frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
allowfullscreen>
</iframe>
<br>
<strong>Watch:</strong> How to do <a href="https://www.ultralytics.com/glossary/image-classification">Image Classification</a> on Fashion MNIST Dataset using Ultralytics YOLO11
</p>
## Key Features
- Fashion-MNIST contains 60,000 training images and 10,000 testing images of Zalando's article images.
- The dataset comprises grayscale images of size 28x28 pixels.
- Each pixel has a single pixel-value associated with it, indicating the lightness or darkness of that pixel, with higher numbers meaning darker. This pixel-value is an integer between 0 and 255.
- Fashion-MNIST is widely used for training and testing in the field of machine learning, especially for image classification tasks.
## Dataset Structure
The Fashion-MNIST dataset is split into two subsets:
1. **Training Set**: This subset contains 60,000 images used for training machine learning models.
2. **Testing Set**: This subset consists of 10,000 images used for testing and benchmarking the trained models.
## Labels
Each training and test example is assigned to one of the following labels:
0. T-shirt/top
1. Trouser
2. Pullover
3. Dress
4. Coat
5. Sandal
6. Shirt
7. Sneaker
8. Bag
9. Ankle boot
## Applications
The Fashion-MNIST dataset is widely used for training and evaluating deep learning models in image classification tasks, such as [Convolutional Neural Networks](https://www.ultralytics.com/glossary/convolutional-neural-network-cnn) (CNNs), [Support Vector Machines](https://www.ultralytics.com/glossary/support-vector-machine-svm) (SVMs), and various other machine learning algorithms. The dataset's simple and well-structured format makes it an essential resource for researchers and practitioners in the field of machine learning and computer vision.
## Usage
To train a CNN model on the Fashion-MNIST dataset for 100 [epochs](https://www.ultralytics.com/glossary/epoch) with an image size of 28x28, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-cls.pt") # load a pretrained model (recommended for training)
# Train the model
results = model.train(data="fashion-mnist", epochs=100, imgsz=28)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo classify train data=fashion-mnist model=yolo11n-cls.pt epochs=100 imgsz=28
```
## Sample Images and Annotations
The Fashion-MNIST dataset contains grayscale images of Zalando's article images, providing a well-structured dataset for image classification tasks. Here are some examples of images from the dataset:
![Dataset sample image](https://github.com/ultralytics/docs/releases/download/0/fashion-mnist-sample.avif)
The example showcases the variety and complexity of the images in the Fashion-MNIST dataset, highlighting the importance of a diverse dataset for training robust image classification models.
## Acknowledgments
If you use the Fashion-MNIST dataset in your research or development work, please acknowledge the dataset by linking to the [GitHub repository](https://github.com/zalandoresearch/fashion-mnist). This dataset was made available by Zalando Research.
## FAQ
### What is the Fashion-MNIST dataset and how is it different from MNIST?
The [Fashion-MNIST](https://github.com/zalandoresearch/fashion-mnist) dataset is a collection of 70,000 grayscale images of Zalando's article images, intended as a modern replacement for the original MNIST dataset. It serves as a benchmark for machine learning models in the context of image classification tasks. Unlike MNIST, which contains handwritten digits, Fashion-MNIST consists of 28x28-pixel images categorized into 10 fashion-related classes, such as T-shirt/top, trouser, and ankle boot.
### How can I train a YOLO model on the Fashion-MNIST dataset?
To train an Ultralytics YOLO model on the Fashion-MNIST dataset, you can use both Python and CLI commands. Here's a quick example to get you started:
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a pretrained model
model = YOLO("yolo11n-cls.pt")
# Train the model on Fashion-MNIST
results = model.train(data="fashion-mnist", epochs=100, imgsz=28)
```
=== "CLI"
```bash
yolo classify train data=fashion-mnist model=yolo11n-cls.pt epochs=100 imgsz=28
```
For more detailed training parameters, refer to the [Training page](../../modes/train.md).
### Why should I use the Fashion-MNIST dataset for benchmarking my machine learning models?
The [Fashion-MNIST](https://github.com/zalandoresearch/fashion-mnist) dataset is widely recognized in the [deep learning](https://www.ultralytics.com/glossary/deep-learning-dl) community as a robust alternative to MNIST. It offers a more complex and varied set of images, making it an excellent choice for benchmarking image classification models. The dataset's structure, comprising 60,000 training images and 10,000 testing images, each labeled with one of 10 classes, makes it ideal for evaluating the performance of different machine learning algorithms in a more challenging context.
### Can I use Ultralytics YOLO for image classification tasks like Fashion-MNIST?
Yes, Ultralytics YOLO models can be used for image classification tasks, including those involving the Fashion-MNIST dataset. YOLO11, for example, supports various vision tasks such as detection, segmentation, and classification. To get started with image classification tasks, refer to the [Classification page](https://docs.ultralytics.com/tasks/classify/).
### What are the key features and structure of the Fashion-MNIST dataset?
The Fashion-MNIST dataset is divided into two main subsets: 60,000 training images and 10,000 testing images. Each image is a 28x28-pixel grayscale picture representing one of 10 fashion-related classes. The simplicity and well-structured format make it ideal for training and evaluating models in machine learning and [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) tasks. For more details on the dataset structure, see the [Dataset Structure section](#dataset-structure).
### How can I acknowledge the use of the Fashion-MNIST dataset in my research?
If you utilize the Fashion-MNIST dataset in your research or development projects, it's important to acknowledge it by linking to the [GitHub repository](https://github.com/zalandoresearch/fashion-mnist). This helps in attributing the data to Zalando Research, who made the dataset available for public use.
---
comments: true
description: Explore the extensive ImageNet dataset and discover its role in advancing deep learning in computer vision. Access pretrained models and training examples.
keywords: ImageNet, deep learning, visual recognition, computer vision, pretrained models, YOLO, dataset, object detection, image classification
---
# ImageNet Dataset
[ImageNet](https://www.image-net.org/) is a large-scale database of annotated images designed for use in visual object recognition research. It contains over 14 million images, with each image annotated using WordNet synsets, making it one of the most extensive resources available for training [deep learning](https://www.ultralytics.com/glossary/deep-learning-dl) models in [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) tasks.
## ImageNet Pretrained Models
{% include "macros/yolo-cls-perf.md" %}
## Key Features
- ImageNet contains over 14 million high-resolution images spanning thousands of object categories.
- The dataset is organized according to the WordNet hierarchy, with each synset representing a category.
- ImageNet is widely used for training and benchmarking in the field of computer vision, particularly for [image classification](https://www.ultralytics.com/glossary/image-classification) and [object detection](https://www.ultralytics.com/glossary/object-detection) tasks.
- The annual ImageNet Large Scale Visual Recognition Challenge (ILSVRC) has been instrumental in advancing computer vision research.
## Dataset Structure
The ImageNet dataset is organized using the WordNet hierarchy. Each node in the hierarchy represents a category, and each category is described by a synset (a collection of synonymous terms). The images in ImageNet are annotated with one or more synsets, providing a rich resource for training models to recognize various objects and their relationships.
## ImageNet Large Scale Visual Recognition Challenge (ILSVRC)
The annual [ImageNet Large Scale Visual Recognition Challenge (ILSVRC)](https://image-net.org/challenges/LSVRC/) has been an important event in the field of computer vision. It has provided a platform for researchers and developers to evaluate their algorithms and models on a large-scale dataset with standardized evaluation metrics. The ILSVRC has led to significant advancements in the development of deep learning models for image classification, object detection, and other computer vision tasks.
## Applications
The ImageNet dataset is widely used for training and evaluating deep learning models in various computer vision tasks, such as image classification, object detection, and object localization. Some popular deep learning architectures, such as AlexNet, VGG, and ResNet, were developed and benchmarked using the ImageNet dataset.
## Usage
To train a deep learning model on the ImageNet dataset for 100 [epochs](https://www.ultralytics.com/glossary/epoch) with an image size of 224x224, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-cls.pt") # load a pretrained model (recommended for training)
# Train the model
results = model.train(data="imagenet", epochs=100, imgsz=224)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo classify train data=imagenet model=yolo11n-cls.pt epochs=100 imgsz=224
```
## Sample Images and Annotations
The ImageNet dataset contains high-resolution images spanning thousands of object categories, providing a diverse and extensive dataset for training and evaluating computer vision models. Here are some examples of images from the dataset:
![Dataset sample images](https://github.com/ultralytics/docs/releases/download/0/imagenet-sample-images.avif)
The example showcases the variety and complexity of the images in the ImageNet dataset, highlighting the importance of a diverse dataset for training robust computer vision models.
## Citations and Acknowledgments
If you use the ImageNet dataset in your research or development work, please cite the following paper:
!!! quote ""
=== "BibTeX"
```bibtex
@article{ILSVRC15,
author = {Olga Russakovsky and Jia Deng and Hao Su and Jonathan Krause and Sanjeev Satheesh and Sean Ma and Zhiheng Huang and Andrej Karpathy and Aditya Khosla and Michael Bernstein and Alexander C. Berg and Li Fei-Fei},
title={ImageNet Large Scale Visual Recognition Challenge},
year={2015},
journal={International Journal of Computer Vision (IJCV)},
volume={115},
number={3},
pages={211-252}
}
```
We would like to acknowledge the ImageNet team, led by Olga Russakovsky, Jia Deng, and Li Fei-Fei, for creating and maintaining the ImageNet dataset as a valuable resource for the [machine learning](https://www.ultralytics.com/glossary/machine-learning-ml) and computer vision research community. For more information about the ImageNet dataset and its creators, visit the [ImageNet website](https://www.image-net.org/).
## FAQ
### What is the ImageNet dataset and how is it used in computer vision?
The [ImageNet dataset](https://www.image-net.org/) is a large-scale database consisting of over 14 million high-resolution images categorized using WordNet synsets. It is extensively used in visual object recognition research, including image classification and object detection. The dataset's annotations and sheer volume provide a rich resource for training deep learning models. Notably, models like AlexNet, VGG, and ResNet have been trained and benchmarked using ImageNet, showcasing its role in advancing computer vision.
### How can I use a pretrained YOLO model for image classification on the ImageNet dataset?
To use a pretrained Ultralytics YOLO model for image classification on the ImageNet dataset, follow these steps:
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-cls.pt") # load a pretrained model (recommended for training)
# Train the model
results = model.train(data="imagenet", epochs=100, imgsz=224)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo classify train data=imagenet model=yolo11n-cls.pt epochs=100 imgsz=224
```
For more in-depth training instruction, refer to our [Training page](../../modes/train.md).
### Why should I use the Ultralytics YOLO11 pretrained models for my ImageNet dataset projects?
Ultralytics YOLO11 pretrained models offer state-of-the-art performance in terms of speed and [accuracy](https://www.ultralytics.com/glossary/accuracy) for various computer vision tasks. For example, the YOLO11n-cls model, with a top-1 accuracy of 69.0% and a top-5 accuracy of 88.3%, is optimized for real-time applications. Pretrained models reduce the computational resources required for training from scratch and accelerate development cycles. Learn more about the performance metrics of YOLO11 models in the [ImageNet Pretrained Models section](#imagenet-pretrained-models).
### How is the ImageNet dataset structured, and why is it important?
The ImageNet dataset is organized using the WordNet hierarchy, where each node in the hierarchy represents a category described by a synset (a collection of synonymous terms). This structure allows for detailed annotations, making it ideal for training models to recognize a wide variety of objects. The diversity and annotation richness of ImageNet make it a valuable dataset for developing robust and generalizable deep learning models. More about this organization can be found in the [Dataset Structure](#dataset-structure) section.
### What role does the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) play in computer vision?
The annual [ImageNet Large Scale Visual Recognition Challenge (ILSVRC)](https://image-net.org/challenges/LSVRC/) has been pivotal in driving advancements in computer vision by providing a competitive platform for evaluating algorithms on a large-scale, standardized dataset. It offers standardized evaluation metrics, fostering innovation and development in areas such as image classification, object detection, and [image segmentation](https://www.ultralytics.com/glossary/image-segmentation). The challenge has continuously pushed the boundaries of what is possible with deep learning and computer vision technologies.
---
comments: true
description: Discover ImageNet10 a compact version of ImageNet for rapid model testing and CI checks. Perfect for quick evaluations in computer vision tasks.
keywords: ImageNet10, ImageNet, Ultralytics, CI tests, sanity checks, training pipelines, computer vision, deep learning, dataset
---
# ImageNet10 Dataset
The [ImageNet10](https://github.com/ultralytics/assets/releases/download/v0.0.0/imagenet10.zip) dataset is a small-scale subset of the [ImageNet](https://www.image-net.org/) database, developed by [Ultralytics](https://www.ultralytics.com/) and designed for CI tests, sanity checks, and fast testing of training pipelines. This dataset is composed of the first image in the training set and the first image from the validation set of the first 10 classes in ImageNet. Although significantly smaller, it retains the structure and diversity of the original ImageNet dataset.
## Key Features
- ImageNet10 is a compact version of ImageNet, with 20 images representing the first 10 classes of the original dataset.
- The dataset is organized according to the WordNet hierarchy, mirroring the structure of the full ImageNet dataset.
- It is ideally suited for CI tests, sanity checks, and rapid testing of training pipelines in [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) tasks.
- Although not designed for model benchmarking, it can provide a quick indication of a model's basic functionality and correctness.
## Dataset Structure
The ImageNet10 dataset, like the original ImageNet, is organized using the WordNet hierarchy. Each of the 10 classes in ImageNet10 is described by a synset (a collection of synonymous terms). The images in ImageNet10 are annotated with one or more synsets, providing a compact resource for testing models to recognize various objects and their relationships.
## Applications
The ImageNet10 dataset is useful for quickly testing and debugging computer vision models and pipelines. Its small size allows for rapid iteration, making it ideal for continuous integration tests and sanity checks. It can also be used for fast preliminary testing of new models or changes to existing models before moving on to full-scale testing with the complete ImageNet dataset.
## Usage
To test a deep learning model on the ImageNet10 dataset with an image size of 224x224, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Test Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-cls.pt") # load a pretrained model (recommended for training)
# Train the model
results = model.train(data="imagenet10", epochs=5, imgsz=224)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo classify train data=imagenet10 model=yolo11n-cls.pt epochs=5 imgsz=224
```
## Sample Images and Annotations
The ImageNet10 dataset contains a subset of images from the original ImageNet dataset. These images are chosen to represent the first 10 classes in the dataset, providing a diverse yet compact dataset for quick testing and evaluation.
![Dataset sample images](https://github.com/ultralytics/docs/releases/download/0/imagenet10-sample-images.avif) The example showcases the variety and complexity of the images in the ImageNet10 dataset, highlighting its usefulness for sanity checks and quick testing of computer vision models.
## Citations and Acknowledgments
If you use the ImageNet10 dataset in your research or development work, please cite the original ImageNet paper:
!!! quote ""
=== "BibTeX"
```bibtex
@article{ILSVRC15,
author = {Olga Russakovsky and Jia Deng and Hao Su and Jonathan Krause and Sanjeev Satheesh and Sean Ma and Zhiheng Huang and Andrej Karpathy and Aditya Khosla and Michael Bernstein and Alexander C. Berg and Li Fei-Fei},
title={ImageNet Large Scale Visual Recognition Challenge},
year={2015},
journal={International Journal of Computer Vision (IJCV)},
volume={115},
number={3},
pages={211-252}
}
```
We would like to acknowledge the ImageNet team, led by Olga Russakovsky, Jia Deng, and Li Fei-Fei, for creating and maintaining the ImageNet dataset. The ImageNet10 dataset, while a compact subset, is a valuable resource for quick testing and debugging in the [machine learning](https://www.ultralytics.com/glossary/machine-learning-ml) and computer vision research community. For more information about the ImageNet dataset and its creators, visit the [ImageNet website](https://www.image-net.org/).
## FAQ
### What is the ImageNet10 dataset and how is it different from the full ImageNet dataset?
The [ImageNet10](https://github.com/ultralytics/assets/releases/download/v0.0.0/imagenet10.zip) dataset is a compact subset of the original [ImageNet](https://www.image-net.org/) database, created by Ultralytics for rapid CI tests, sanity checks, and training pipeline evaluations. ImageNet10 comprises only 20 images, representing the first image in the training and validation sets of the first 10 classes in ImageNet. Despite its small size, it maintains the structure and diversity of the full dataset, making it ideal for quick testing but not for benchmarking models.
### How can I use the ImageNet10 dataset to test my deep learning model?
To test your deep learning model on the ImageNet10 dataset with an image size of 224x224, use the following code snippets.
!!! example "Test Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-cls.pt") # load a pretrained model (recommended for training)
# Train the model
results = model.train(data="imagenet10", epochs=5, imgsz=224)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo classify train data=imagenet10 model=yolo11n-cls.pt epochs=5 imgsz=224
```
Refer to the [Training](../../modes/train.md) page for a comprehensive list of available arguments.
### Why should I use the ImageNet10 dataset for CI tests and sanity checks?
The ImageNet10 dataset is designed specifically for CI tests, sanity checks, and quick evaluations in [deep learning](https://www.ultralytics.com/glossary/deep-learning-dl) pipelines. Its small size allows for rapid iteration and testing, making it perfect for continuous integration processes where speed is crucial. By maintaining the structural complexity and diversity of the original ImageNet dataset, ImageNet10 provides a reliable indication of a model's basic functionality and correctness without the overhead of processing a large dataset.
### What are the main features of the ImageNet10 dataset?
The ImageNet10 dataset has several key features:
- **Compact Size**: With only 20 images, it allows for rapid testing and debugging.
- **Structured Organization**: Follows the WordNet hierarchy, similar to the full ImageNet dataset.
- **CI and Sanity Checks**: Ideally suited for continuous integration tests and sanity checks.
- **Not for Benchmarking**: While useful for quick model evaluations, it is not designed for extensive benchmarking.
### Where can I download the ImageNet10 dataset?
You can download the ImageNet10 dataset from the [Ultralytics GitHub releases page](https://github.com/ultralytics/assets/releases/download/v0.0.0/imagenet10.zip). For more detailed information about its structure and applications, refer to the [ImageNet10 Dataset](imagenet10.md) page.
---
comments: true
description: Explore the ImageNette dataset, a subset of ImageNet with 10 classes for efficient training and evaluation of image classification models. Ideal for ML and CV projects.
keywords: ImageNette dataset, ImageNet subset, image classification, machine learning, deep learning, YOLO, Convolutional Neural Networks, ML dataset, education, training
---
# ImageNette Dataset
The [ImageNette](https://github.com/fastai/imagenette) dataset is a subset of the larger [Imagenet](https://www.image-net.org/) dataset, but it only includes 10 easily distinguishable classes. It was created to provide a quicker, easier-to-use version of Imagenet for software development and education.
## Key Features
- ImageNette contains images from 10 different classes such as tench, English springer, cassette player, chain saw, church, French horn, garbage truck, gas pump, golf ball, parachute.
- The dataset comprises colored images of varying dimensions.
- ImageNette is widely used for training and testing in the field of machine learning, especially for image classification tasks.
## Dataset Structure
The ImageNette dataset is split into two subsets:
1. **Training Set**: This subset contains several thousands of images used for training machine learning models. The exact number varies per class.
2. **Validation Set**: This subset consists of several hundreds of images used for validating and benchmarking the trained models. Again, the exact number varies per class.
## Applications
The ImageNette dataset is widely used for training and evaluating [deep learning](https://www.ultralytics.com/glossary/deep-learning-dl) models in image classification tasks, such as [Convolutional Neural Networks](https://www.ultralytics.com/glossary/convolutional-neural-network-cnn) (CNNs), and various other machine learning algorithms. The dataset's straightforward format and well-chosen classes make it a handy resource for both beginner and experienced practitioners in the field of machine learning and computer vision.
## Usage
To train a model on the ImageNette dataset for 100 epochs with a standard image size of 224x224, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-cls.pt") # load a pretrained model (recommended for training)
# Train the model
results = model.train(data="imagenette", epochs=100, imgsz=224)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo classify train data=imagenette model=yolo11n-cls.pt epochs=100 imgsz=224
```
## Sample Images and Annotations
The ImageNette dataset contains colored images of various objects and scenes, providing a diverse dataset for [image classification](https://www.ultralytics.com/glossary/image-classification) tasks. Here are some examples of images from the dataset:
![Dataset sample image](https://github.com/ultralytics/docs/releases/download/0/imagenette-sample-image.avif)
The example showcases the variety and complexity of the images in the ImageNette dataset, highlighting the importance of a diverse dataset for training robust image classification models.
## ImageNette160 and ImageNette320
For faster prototyping and training, the ImageNette dataset is also available in two reduced sizes: ImageNette160 and ImageNette320. These datasets maintain the same classes and structure as the full ImageNette dataset, but the images are resized to a smaller dimension. As such, these versions of the dataset are particularly useful for preliminary model testing, or when computational resources are limited.
To use these datasets, simply replace 'imagenette' with 'imagenette160' or 'imagenette320' in the training command. The following code snippets illustrate this:
!!! example "Train Example with ImageNette160"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-cls.pt") # load a pretrained model (recommended for training)
# Train the model with ImageNette160
results = model.train(data="imagenette160", epochs=100, imgsz=160)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model with ImageNette160
yolo classify train data=imagenette160 model=yolo11n-cls.pt epochs=100 imgsz=160
```
!!! example "Train Example with ImageNette320"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-cls.pt") # load a pretrained model (recommended for training)
# Train the model with ImageNette320
results = model.train(data="imagenette320", epochs=100, imgsz=320)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model with ImageNette320
yolo classify train data=imagenette320 model=yolo11n-cls.pt epochs=100 imgsz=320
```
These smaller versions of the dataset allow for rapid iterations during the development process while still providing valuable and realistic image classification tasks.
## Citations and Acknowledgments
If you use the ImageNette dataset in your research or development work, please acknowledge it appropriately. For more information about the ImageNette dataset, visit the [ImageNette dataset GitHub page](https://github.com/fastai/imagenette).
## FAQ
### What is the ImageNette dataset?
The [ImageNette dataset](https://github.com/fastai/imagenette) is a simplified subset of the larger [ImageNet dataset](https://www.image-net.org/), featuring only 10 easily distinguishable classes such as tench, English springer, and French horn. It was created to offer a more manageable dataset for efficient training and evaluation of image classification models. This dataset is particularly useful for quick software development and educational purposes in [machine learning](https://www.ultralytics.com/glossary/machine-learning-ml) and computer vision.
### How can I use the ImageNette dataset for training a YOLO model?
To train a YOLO model on the ImageNette dataset for 100 [epochs](https://www.ultralytics.com/glossary/epoch), you can use the following commands. Make sure to have the Ultralytics YOLO environment set up.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-cls.pt") # load a pretrained model (recommended for training)
# Train the model
results = model.train(data="imagenette", epochs=100, imgsz=224)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo classify train data=imagenette model=yolo11n-cls.pt epochs=100 imgsz=224
```
For more details, see the [Training](../../modes/train.md) documentation page.
### Why should I use ImageNette for image classification tasks?
The ImageNette dataset is advantageous for several reasons:
- **Quick and Simple**: It contains only 10 classes, making it less complex and time-consuming compared to larger datasets.
- **Educational Use**: Ideal for learning and teaching the basics of image classification since it requires less computational power and time.
- **Versatility**: Widely used to train and benchmark various machine learning models, especially in image classification.
For more details on model training and dataset management, explore the [Dataset Structure](#dataset-structure) section.
### Can the ImageNette dataset be used with different image sizes?
Yes, the ImageNette dataset is also available in two resized versions: ImageNette160 and ImageNette320. These versions help in faster prototyping and are especially useful when computational resources are limited.
!!! example "Train Example with ImageNette160"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-cls.pt")
# Train the model with ImageNette160
results = model.train(data="imagenette160", epochs=100, imgsz=160)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model with ImageNette160
yolo detect train data=imagenette160 model=yolo11n-cls.pt epochs=100 imgsz=160
```
For more information, refer to [Training with ImageNette160 and ImageNette320](#imagenette160-and-imagenette320).
### What are some practical applications of the ImageNette dataset?
The ImageNette dataset is extensively used in:
- **Educational Settings**: To educate beginners in machine learning and [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv).
- **Software Development**: For rapid prototyping and development of image classification models.
- **Deep Learning Research**: To evaluate and benchmark the performance of various deep learning models, especially Convolutional [Neural Networks](https://www.ultralytics.com/glossary/neural-network-nn) (CNNs).
Explore the [Applications](#applications) section for detailed use cases.
---
comments: true
description: Explore the ImageWoof dataset, a challenging subset of ImageNet focusing on 10 dog breeds, designed to enhance image classification models. Learn more on Ultralytics Docs.
keywords: ImageWoof dataset, ImageNet subset, dog breeds, image classification, deep learning, machine learning, Ultralytics, training dataset, noisy labels
---
# ImageWoof Dataset
The [ImageWoof](https://github.com/fastai/imagenette) dataset is a subset of the ImageNet consisting of 10 classes that are challenging to classify, since they're all dog breeds. It was created as a more difficult task for [image classification](https://www.ultralytics.com/glossary/image-classification) algorithms to solve, aiming at encouraging development of more advanced models.
## Key Features
- ImageWoof contains images of 10 different dog breeds: Australian terrier, Border terrier, Samoyed, Beagle, Shih-Tzu, English foxhound, Rhodesian ridgeback, Dingo, Golden retriever, and Old English sheepdog.
- The dataset provides images at various resolutions (full size, 320px, 160px), accommodating for different computational capabilities and research needs.
- It also includes a version with noisy labels, providing a more realistic scenario where labels might not always be reliable.
## Dataset Structure
The ImageWoof dataset structure is based on the dog breed classes, with each breed having its own directory of images.
## Applications
The ImageWoof dataset is widely used for training and evaluating deep learning models in image classification tasks, especially when it comes to more complex and similar classes. The dataset's challenge lies in the subtle differences between the dog breeds, pushing the limits of model's performance and generalization.
## Usage
To train a CNN model on the ImageWoof dataset for 100 [epochs](https://www.ultralytics.com/glossary/epoch) with an image size of 224x224, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-cls.pt") # load a pretrained model (recommended for training)
# Train the model
results = model.train(data="imagewoof", epochs=100, imgsz=224)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo classify train data=imagewoof model=yolo11n-cls.pt epochs=100 imgsz=224
```
## Dataset Variants
ImageWoof dataset comes in three different sizes to accommodate various research needs and computational capabilities:
1. **Full Size (imagewoof)**: This is the original version of the ImageWoof dataset. It contains full-sized images and is ideal for final training and performance benchmarking.
2. **Medium Size (imagewoof320)**: This version contains images resized to have a maximum edge length of 320 pixels. It's suitable for faster training without significantly sacrificing model performance.
3. **Small Size (imagewoof160)**: This version contains images resized to have a maximum edge length of 160 pixels. It's designed for rapid prototyping and experimentation where training speed is a priority.
To use these variants in your training, simply replace 'imagewoof' in the dataset argument with 'imagewoof320' or 'imagewoof160'. For example:
!!! example
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-cls.pt") # load a pretrained model (recommended for training)
# For medium-sized dataset
model.train(data="imagewoof320", epochs=100, imgsz=224)
# For small-sized dataset
model.train(data="imagewoof160", epochs=100, imgsz=224)
```
=== "CLI"
```bash
# Load a pretrained model and train on the small-sized dataset
yolo classify train model=yolo11n-cls.pt data=imagewoof320 epochs=100 imgsz=224
```
It's important to note that using smaller images will likely yield lower performance in terms of classification accuracy. However, it's an excellent way to iterate quickly in the early stages of model development and prototyping.
## Sample Images and Annotations
The ImageWoof dataset contains colorful images of various dog breeds, providing a challenging dataset for image classification tasks. Here are some examples of images from the dataset:
![Dataset sample image](https://github.com/ultralytics/docs/releases/download/0/imagewoof-dataset-sample.avif)
The example showcases the subtle differences and similarities among the different dog breeds in the ImageWoof dataset, highlighting the complexity and difficulty of the classification task.
## Citations and Acknowledgments
If you use the ImageWoof dataset in your research or development work, please make sure to acknowledge the creators of the dataset by linking to the [official dataset repository](https://github.com/fastai/imagenette).
We would like to acknowledge the FastAI team for creating and maintaining the ImageWoof dataset as a valuable resource for the [machine learning](https://www.ultralytics.com/glossary/machine-learning-ml) and [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) research community. For more information about the ImageWoof dataset, visit the [ImageWoof dataset repository](https://github.com/fastai/imagenette).
## FAQ
### What is the ImageWoof dataset in Ultralytics?
The [ImageWoof](https://github.com/fastai/imagenette) dataset is a challenging subset of ImageNet focusing on 10 specific dog breeds. Created to push the limits of image classification models, it features breeds like Beagle, Shih-Tzu, and Golden Retriever. The dataset includes images at various resolutions (full size, 320px, 160px) and even noisy labels for more realistic training scenarios. This complexity makes ImageWoof ideal for developing more advanced deep learning models.
### How can I train a model using the ImageWoof dataset with Ultralytics YOLO?
To train a [Convolutional Neural Network](https://www.ultralytics.com/glossary/convolutional-neural-network-cnn) (CNN) model on the ImageWoof dataset using Ultralytics YOLO for 100 epochs at an image size of 224x224, you can use the following code:
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
model = YOLO("yolo11n-cls.pt") # Load a pretrained model
results = model.train(data="imagewoof", epochs=100, imgsz=224)
```
=== "CLI"
```bash
yolo classify train data=imagewoof model=yolo11n-cls.pt epochs=100 imgsz=224
```
For more details on available training arguments, refer to the [Training](../../modes/train.md) page.
### What versions of the ImageWoof dataset are available?
The ImageWoof dataset comes in three sizes:
1. **Full Size (imagewoof)**: Ideal for final training and benchmarking, containing full-sized images.
2. **Medium Size (imagewoof320)**: Resized images with a maximum edge length of 320 pixels, suited for faster training.
3. **Small Size (imagewoof160)**: Resized images with a maximum edge length of 160 pixels, perfect for rapid prototyping.
Use these versions by replacing 'imagewoof' in the dataset argument accordingly. Note, however, that smaller images may yield lower classification [accuracy](https://www.ultralytics.com/glossary/accuracy) but can be useful for quicker iterations.
### How do noisy labels in the ImageWoof dataset benefit training?
Noisy labels in the ImageWoof dataset simulate real-world conditions where labels might not always be accurate. Training models with this data helps develop robustness and generalization in image classification tasks. This prepares the models to handle ambiguous or mislabeled data effectively, which is often encountered in practical applications.
### What are the key challenges of using the ImageWoof dataset?
The primary challenge of the ImageWoof dataset lies in the subtle differences among the dog breeds it includes. Since it focuses on 10 closely related breeds, distinguishing between them requires more advanced and fine-tuned image classification models. This makes ImageWoof an excellent benchmark to test the capabilities and improvements of [deep learning](https://www.ultralytics.com/glossary/deep-learning-dl) models.
---
comments: true
description: Learn how to structure datasets for YOLO classification tasks. Detailed folder structure and usage examples for effective training.
keywords: YOLO, image classification, dataset structure, CIFAR-10, Ultralytics, machine learning, training data, model evaluation
---
# Image Classification Datasets Overview
### Dataset Structure for YOLO Classification Tasks
For [Ultralytics](https://www.ultralytics.com/) YOLO classification tasks, the dataset must be organized in a specific split-directory structure under the `root` directory to facilitate proper training, testing, and optional validation processes. This structure includes separate directories for training (`train`) and testing (`test`) phases, with an optional directory for validation (`val`).
Each of these directories should contain one subdirectory for each class in the dataset. The subdirectories are named after the corresponding class and contain all the images for that class. Ensure that each image file is named uniquely and stored in a common format such as JPEG or PNG.
**Folder Structure Example**
Consider the CIFAR-10 dataset as an example. The folder structure should look like this:
```
cifar-10-/
|
|-- train/
| |-- airplane/
| | |-- 10008_airplane.png
| | |-- 10009_airplane.png
| | |-- ...
| |
| |-- automobile/
| | |-- 1000_automobile.png
| | |-- 1001_automobile.png
| | |-- ...
| |
| |-- bird/
| | |-- 10014_bird.png
| | |-- 10015_bird.png
| | |-- ...
| |
| |-- ...
|
|-- test/
| |-- airplane/
| | |-- 10_airplane.png
| | |-- 11_airplane.png
| | |-- ...
| |
| |-- automobile/
| | |-- 100_automobile.png
| | |-- 101_automobile.png
| | |-- ...
| |
| |-- bird/
| | |-- 1000_bird.png
| | |-- 1001_bird.png
| | |-- ...
| |
| |-- ...
|
|-- val/ (optional)
| |-- airplane/
| | |-- 105_airplane.png
| | |-- 106_airplane.png
| | |-- ...
| |
| |-- automobile/
| | |-- 102_automobile.png
| | |-- 103_automobile.png
| | |-- ...
| |
| |-- bird/
| | |-- 1045_bird.png
| | |-- 1046_bird.png
| | |-- ...
| |
| |-- ...
```
This structured approach ensures that the model can effectively learn from well-organized classes during the training phase and accurately evaluate performance during testing and validation phases.
## Usage
!!! example
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-cls.pt") # load a pretrained model (recommended for training)
# Train the model
results = model.train(data="path/to/dataset", epochs=100, imgsz=640)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo detect train data=path/to/data model=yolo11n-cls.pt epochs=100 imgsz=640
```
## Supported Datasets
Ultralytics supports the following datasets with automatic download:
- [Caltech 101](caltech101.md): A dataset containing images of 101 object categories for [image classification](https://www.ultralytics.com/glossary/image-classification) tasks.
- [Caltech 256](caltech256.md): An extended version of Caltech 101 with 256 object categories and more challenging images.
- [CIFAR-10](cifar10.md): A dataset of 60K 32x32 color images in 10 classes, with 6K images per class.
- [CIFAR-100](cifar100.md): An extended version of CIFAR-10 with 100 object categories and 600 images per class.
- [Fashion-MNIST](fashion-mnist.md): A dataset consisting of 70,000 grayscale images of 10 fashion categories for image classification tasks.
- [ImageNet](imagenet.md): A large-scale dataset for [object detection](https://www.ultralytics.com/glossary/object-detection) and image classification with over 14 million images and 20,000 categories.
- [ImageNet-10](imagenet10.md): A smaller subset of ImageNet with 10 categories for faster experimentation and testing.
- [Imagenette](imagenette.md): A smaller subset of ImageNet that contains 10 easily distinguishable classes for quicker training and testing.
- [Imagewoof](imagewoof.md): A more challenging subset of ImageNet containing 10 dog breed categories for image classification tasks.
- [MNIST](mnist.md): A dataset of 70,000 grayscale images of handwritten digits for image classification tasks.
- [MNIST160](mnist.md): First 8 images of each MNIST category from the MNIST dataset. Dataset contains 160 images total.
### Adding your own dataset
If you have your own dataset and would like to use it for training classification models with Ultralytics, ensure that it follows the format specified above under "Dataset format" and then point your `data` argument to the dataset directory.
## FAQ
### How do I structure my dataset for YOLO classification tasks?
To structure your dataset for Ultralytics YOLO classification tasks, you should follow a specific split-directory format. Organize your dataset into separate directories for `train`, `test`, and optionally `val`. Each of these directories should contain subdirectories named after each class, with the corresponding images inside. This facilitates smooth training and evaluation processes. For an example, consider the CIFAR-10 dataset format:
```
cifar-10-/
|-- train/
| |-- airplane/
| |-- automobile/
| |-- bird/
| ...
|-- test/
| |-- airplane/
| |-- automobile/
| |-- bird/
| ...
|-- val/ (optional)
| |-- airplane/
| |-- automobile/
| |-- bird/
| ...
```
For more details, visit [Dataset Structure for YOLO Classification Tasks](#dataset-structure-for-yolo-classification-tasks).
### What datasets are supported by Ultralytics YOLO for image classification?
Ultralytics YOLO supports automatic downloading of several datasets for image classification, including:
- [Caltech 101](caltech101.md)
- [Caltech 256](caltech256.md)
- [CIFAR-10](cifar10.md)
- [CIFAR-100](cifar100.md)
- [Fashion-MNIST](fashion-mnist.md)
- [ImageNet](imagenet.md)
- [ImageNet-10](imagenet10.md)
- [Imagenette](imagenette.md)
- [Imagewoof](imagewoof.md)
- [MNIST](mnist.md)
These datasets are structured in a way that makes them easy to use with YOLO. Each dataset's page provides further details about its structure and applications.
### How do I add my own dataset for YOLO image classification?
To use your own dataset with Ultralytics YOLO, ensure it follows the specified directory format required for the classification task, with separate `train`, `test`, and optionally `val` directories, and subdirectories for each class containing the respective images. Once your dataset is structured correctly, point the `data` argument to your dataset's root directory when initializing the training script. Here's an example in Python:
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-cls.pt") # load a pretrained model (recommended for training)
# Train the model
results = model.train(data="path/to/your/dataset", epochs=100, imgsz=640)
```
More details can be found in the [Adding your own dataset](#adding-your-own-dataset) section.
### Why should I use Ultralytics YOLO for image classification?
Ultralytics YOLO offers several benefits for image classification, including:
- **Pretrained Models**: Load pretrained models like `yolo11n-cls.pt` to jump-start your training process.
- **Ease of Use**: Simple API and CLI commands for training and evaluation.
- **High Performance**: State-of-the-art [accuracy](https://www.ultralytics.com/glossary/accuracy) and speed, ideal for real-time applications.
- **Support for Multiple Datasets**: Seamless integration with various popular datasets like CIFAR-10, ImageNet, and more.
- **Community and Support**: Access to extensive documentation and an active community for troubleshooting and improvements.
For additional insights and real-world applications, you can explore [Ultralytics YOLO](https://www.ultralytics.com/yolo).
### How can I train a model using Ultralytics YOLO?
Training a model using Ultralytics YOLO can be done easily in both Python and CLI. Here's an example:
!!! example
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-cls.pt") # load a pretrained model
# Train the model
results = model.train(data="path/to/dataset", epochs=100, imgsz=640)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo detect train data=path/to/data model=yolo11n-cls.pt epochs=100 imgsz=640
```
These examples demonstrate the straightforward process of training a YOLO model using either approach. For more information, visit the [Usage](#usage) section.
---
comments: true
description: Explore the MNIST dataset, a cornerstone in machine learning for handwritten digit recognition. Learn about its structure, features, and applications.
keywords: MNIST, dataset, handwritten digits, image classification, deep learning, machine learning, training set, testing set, NIST
---
# MNIST Dataset
The [MNIST](https://en.wikipedia.org/wiki/MNIST_database) (Modified National Institute of Standards and Technology) dataset is a large database of handwritten digits that is commonly used for training various image processing systems and machine learning models. It was created by "re-mixing" the samples from NIST's original datasets and has become a benchmark for evaluating the performance of image classification algorithms.
## Key Features
- MNIST contains 60,000 training images and 10,000 testing images of handwritten digits.
- The dataset comprises grayscale images of size 28x28 pixels.
- The images are normalized to fit into a 28x28 pixel [bounding box](https://www.ultralytics.com/glossary/bounding-box) and anti-aliased, introducing grayscale levels.
- MNIST is widely used for training and testing in the field of machine learning, especially for image classification tasks.
## Dataset Structure
The MNIST dataset is split into two subsets:
1. **Training Set**: This subset contains 60,000 images of handwritten digits used for training machine learning models.
2. **Testing Set**: This subset consists of 10,000 images used for testing and benchmarking the trained models.
## Extended MNIST (EMNIST)
Extended MNIST (EMNIST) is a newer dataset developed and released by NIST to be the successor to MNIST. While MNIST included images only of handwritten digits, EMNIST includes all the images from NIST Special Database 19, which is a large database of handwritten uppercase and lowercase letters as well as digits. The images in EMNIST were converted into the same 28x28 pixel format, by the same process, as were the MNIST images. Accordingly, tools that work with the older, smaller MNIST dataset will likely work unmodified with EMNIST.
## Applications
The MNIST dataset is widely used for training and evaluating [deep learning](https://www.ultralytics.com/glossary/deep-learning-dl) models in image classification tasks, such as [Convolutional Neural Networks](https://www.ultralytics.com/glossary/convolutional-neural-network-cnn) (CNNs), [Support Vector Machines](https://www.ultralytics.com/glossary/support-vector-machine-svm) (SVMs), and various other machine learning algorithms. The dataset's simple and well-structured format makes it an essential resource for researchers and practitioners in the field of machine learning and computer vision.
## Usage
To train a CNN model on the MNIST dataset for 100 [epochs](https://www.ultralytics.com/glossary/epoch) with an image size of 32x32, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-cls.pt") # load a pretrained model (recommended for training)
# Train the model
results = model.train(data="mnist", epochs=100, imgsz=32)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo classify train data=mnist model=yolo11n-cls.pt epochs=100 imgsz=28
```
## Sample Images and Annotations
The MNIST dataset contains grayscale images of handwritten digits, providing a well-structured dataset for [image classification](https://www.ultralytics.com/glossary/image-classification) tasks. Here are some examples of images from the dataset:
![Dataset sample image](https://upload.wikimedia.org/wikipedia/commons/2/27/MnistExamples.png)
The example showcases the variety and complexity of the handwritten digits in the MNIST dataset, highlighting the importance of a diverse dataset for training robust image classification models.
## Citations and Acknowledgments
If you use the MNIST dataset in your
research or development work, please cite the following paper:
!!! quote ""
=== "BibTeX"
```bibtex
@article{lecun2010mnist,
title={MNIST handwritten digit database},
author={LeCun, Yann and Cortes, Corinna and Burges, CJ},
journal={ATT Labs [Online]. Available: http://yann.lecun.com/exdb/mnist},
volume={2},
year={2010}
}
```
We would like to acknowledge Yann LeCun, Corinna Cortes, and Christopher J.C. Burges for creating and maintaining the MNIST dataset as a valuable resource for the [machine learning](https://www.ultralytics.com/glossary/machine-learning-ml) and [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) research community. For more information about the MNIST dataset and its creators, visit the [MNIST dataset website](https://en.wikipedia.org/wiki/MNIST_database).
## FAQ
### What is the MNIST dataset, and why is it important in machine learning?
The [MNIST](https://en.wikipedia.org/wiki/MNIST_database) dataset, or Modified National Institute of Standards and Technology dataset, is a widely-used collection of handwritten digits designed for training and testing image classification systems. It includes 60,000 training images and 10,000 testing images, all of which are grayscale and 28x28 pixels in size. The dataset's importance lies in its role as a standard benchmark for evaluating image classification algorithms, helping researchers and engineers to compare methods and track progress in the field.
### How can I use Ultralytics YOLO to train a model on the MNIST dataset?
To train a model on the MNIST dataset using Ultralytics YOLO, you can follow these steps:
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-cls.pt") # load a pretrained model (recommended for training)
# Train the model
results = model.train(data="mnist", epochs=100, imgsz=32)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo classify train data=mnist model=yolo11n-cls.pt epochs=100 imgsz=28
```
For a detailed list of available training arguments, refer to the [Training](../../modes/train.md) page.
### What is the difference between the MNIST and EMNIST datasets?
The MNIST dataset contains only handwritten digits, whereas the Extended MNIST (EMNIST) dataset includes both digits and uppercase and lowercase letters. EMNIST was developed as a successor to MNIST and utilizes the same 28x28 pixel format for the images, making it compatible with tools and models designed for the original MNIST dataset. This broader range of characters in EMNIST makes it useful for a wider variety of machine learning applications.
### Can I use Ultralytics HUB to train models on custom datasets like MNIST?
Yes, you can use Ultralytics HUB to train models on custom datasets like MNIST. Ultralytics HUB offers a user-friendly interface for uploading datasets, training models, and managing projects without needing extensive coding knowledge. For more details on how to get started, check out the [Ultralytics HUB Quickstart](https://docs.ultralytics.com/hub/quickstart/) page.
---
comments: true
description: Explore our African Wildlife Dataset featuring images of buffalo, elephant, rhino, and zebra for training computer vision models. Ideal for research and conservation.
keywords: African Wildlife Dataset, South African animals, object detection, computer vision, YOLO11, wildlife research, conservation, dataset
---
# African Wildlife Dataset
This dataset showcases four common animal classes typically found in South African nature reserves. It includes images of African wildlife such as buffalo, elephant, rhino, and zebra, providing valuable insights into their characteristics. Essential for training [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) algorithms, this dataset aids in identifying animals in various habitats, from zoos to forests, and supports wildlife research.
<p align="center">
<br>
<iframe loading="lazy" width="720" height="405" src="https://www.youtube.com/embed/biIW5Z6GYl0"
title="YouTube video player" frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
allowfullscreen>
</iframe>
<br>
<strong>Watch:</strong> African Wildlife Animals Detection using Ultralytics YOLO11
</p>
## Dataset Structure
The African wildlife objects detection dataset is split into three subsets:
- **Training set**: Contains 1052 images, each with corresponding annotations.
- **Validation set**: Includes 225 images, each with paired annotations.
- **Testing set**: Comprises 227 images, each with paired annotations.
## Applications
This dataset can be applied in various computer vision tasks such as [object detection](https://www.ultralytics.com/glossary/object-detection), object tracking, and research. Specifically, it can be used to train and evaluate models for identifying African wildlife objects in images, which can have applications in wildlife conservation, ecological research, and monitoring efforts in natural reserves and protected areas. Additionally, it can serve as a valuable resource for educational purposes, enabling students and researchers to study and understand the characteristics and behaviors of different animal species.
## Dataset YAML
A YAML (Yet Another Markup Language) file defines the dataset configuration, including paths, classes, and other pertinent details. For the African wildlife dataset, the `african-wildlife.yaml` file is located at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/african-wildlife.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/african-wildlife.yaml).
!!! example "ultralytics/cfg/datasets/african-wildlife.yaml"
```yaml
--8<-- "ultralytics/cfg/datasets/african-wildlife.yaml"
```
## Usage
To train a YOLO11n model on the African wildlife dataset for 100 [epochs](https://www.ultralytics.com/glossary/epoch) with an image size of 640, use the provided code samples. For a comprehensive list of available parameters, refer to the model's [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n.pt") # load a pretrained model (recommended for training)
# Train the model
results = model.train(data="african-wildlife.yaml", epochs=100, imgsz=640)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo detect train data=african-wildlife.yaml model=yolo11n.pt epochs=100 imgsz=640
```
!!! example "Inference Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("path/to/best.pt") # load a brain-tumor fine-tuned model
# Inference using the model
results = model.predict("https://ultralytics.com/assets/african-wildlife-sample.jpg")
```
=== "CLI"
```bash
# Start prediction with a finetuned *.pt model
yolo detect predict model='path/to/best.pt' imgsz=640 source="https://ultralytics.com/assets/african-wildlife-sample.jpg"
```
## Sample Images and Annotations
The African wildlife dataset comprises a wide variety of images showcasing diverse animal species and their natural habitats. Below are examples of images from the dataset, each accompanied by its corresponding annotations.
![African wildlife dataset sample image](https://github.com/ultralytics/docs/releases/download/0/african-wildlife-dataset-sample.avif)
- **Mosaiced Image**: Here, we present a training batch consisting of mosaiced dataset images. Mosaicing, a training technique, combines multiple images into one, enriching batch diversity. This method helps enhance the model's ability to generalize across different object sizes, aspect ratios, and contexts.
This example illustrates the variety and complexity of images in the African wildlife dataset, emphasizing the benefits of including mosaicing during the training process.
## Citations and Acknowledgments
The dataset has been released available under the [AGPL-3.0 License](https://github.com/ultralytics/ultralytics/blob/main/LICENSE).
## FAQ
### What is the African Wildlife Dataset, and how can it be used in computer vision projects?
The African Wildlife Dataset includes images of four common animal species found in South African nature reserves: buffalo, elephant, rhino, and zebra. It is a valuable resource for training computer vision algorithms in object detection and animal identification. The dataset supports various tasks like object tracking, research, and conservation efforts. For more information on its structure and applications, refer to the [Dataset Structure](#dataset-structure) section and [Applications](#applications) of the dataset.
### How do I train a YOLO11 model using the African Wildlife Dataset?
You can train a YOLO11 model on the African Wildlife Dataset by using the `african-wildlife.yaml` configuration file. Below is an example of how to train the YOLO11n model for 100 epochs with an image size of 640:
!!! example
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n.pt") # load a pretrained model (recommended for training)
# Train the model
results = model.train(data="african-wildlife.yaml", epochs=100, imgsz=640)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo detect train data=african-wildlife.yaml model=yolo11n.pt epochs=100 imgsz=640
```
For additional training parameters and options, refer to the [Training](../../modes/train.md) documentation.
### Where can I find the YAML configuration file for the African Wildlife Dataset?
The YAML configuration file for the African Wildlife Dataset, named `african-wildlife.yaml`, can be found at [this GitHub link](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/african-wildlife.yaml). This file defines the dataset configuration, including paths, classes, and other details crucial for training [machine learning](https://www.ultralytics.com/glossary/machine-learning-ml) models. See the [Dataset YAML](#dataset-yaml) section for more details.
### Can I see sample images and annotations from the African Wildlife Dataset?
Yes, the African Wildlife Dataset includes a wide variety of images showcasing diverse animal species in their natural habitats. You can view sample images and their corresponding annotations in the [Sample Images and Annotations](#sample-images-and-annotations) section. This section also illustrates the use of mosaicing technique to combine multiple images into one for enriched batch diversity, enhancing the model's generalization ability.
### How can the African Wildlife Dataset be used to support wildlife conservation and research?
The African Wildlife Dataset is ideal for supporting wildlife conservation and research by enabling the training and evaluation of models to identify African wildlife in different habitats. These models can assist in monitoring animal populations, studying their behavior, and recognizing conservation needs. Additionally, the dataset can be utilized for educational purposes, helping students and researchers understand the characteristics and behaviors of different animal species. More details can be found in the [Applications](#applications) section.
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment