YOLO Format Guide

    Annotation format for YOLO object detection models

    Specification

    The YOLO annotation format is a simple text-based labeling format used for training YOLO (You Only Look Once) object detection models. Each image in the dataset has a corresponding .txt annotation file with the same base name. Every line in the annotation file represents one bounding box and contains five space-separated values: the class index (integer), the x-center coordinate, the y-center coordinate, the box width, and the box height. All coordinates are normalized to the range [0, 1] relative to the image dimensions.

    The normalization convention means that x-center and width are divided by the image width, while y-center and height are divided by the image height. This makes annotations resolution-independent — the same annotation file works correctly regardless of whether the image is resized. The class index is a zero-based integer that maps to class names defined in a separate configuration file (typically data.yaml). An image with no objects has an empty annotation file or no annotation file at all.
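    The normalization described above can be sketched as a small conversion helper. The function name and box convention (pixel-space corners in, one annotation line out) are illustrative, not part of any library:

```python
# Hypothetical helper: convert a pixel-space box (x_min, y_min, x_max, y_max)
# into a YOLO detection line. x/width are divided by image width,
# y/height by image height, per the normalization convention above.
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"

# A 100x200-pixel box for class 0 at (50, 80) in a 640x480 image:
print(to_yolo_line(0, 50, 80, 150, 280, 640, 480))
# → 0 0.156250 0.375000 0.156250 0.416667
```

    Because the output is normalized, resizing the image to any resolution leaves this line unchanged.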

    For segmentation tasks, the YOLO format extends to polygon annotations where each line contains the class index followed by pairs of x,y coordinates defining the polygon vertices. For oriented bounding boxes (OBB), the format uses class index followed by four x,y corner point pairs. For pose estimation, keypoints are appended after the bounding box as x,y,visibility triplets for each keypoint. The Ultralytics YOLOv8 framework has standardized these extended formats across detection, segmentation, classification, pose estimation, and oriented bounding box tasks.
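    A minimal parser can distinguish detection lines from polygon lines by token count alone — a sketch under the assumption that 5 tokens means a detection box and anything longer is a polygon. In practice the task type comes from the dataset configuration, not from guessing per line:

```python
# Sketch of a label-line parser. Assumes detection lines have exactly
# 4 coordinates and longer lines are segmentation polygons; real
# pipelines know the task (detect/segment/pose) from the dataset config.
def parse_label_line(line):
    tokens = line.split()
    class_id = int(tokens[0])
    coords = [float(t) for t in tokens[1:]]
    if len(coords) == 4:
        return {"task": "detect", "class_id": class_id, "bbox": coords}
    # Pair up x,y coordinates into polygon vertices.
    return {"task": "segment", "class_id": class_id,
            "polygon": list(zip(coords[0::2], coords[1::2]))}

print(parse_label_line("0 0.4531 0.3275 0.1200 0.4500"))
print(parse_label_line("1 0.1 0.1 0.5 0.1 0.5 0.5 0.1 0.5"))
```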

    When to Use YOLO Format

    Use the YOLO format when training any model in the YOLO family — YOLOv5, YOLOv7, YOLOv8, YOLOv9, YOLOv10, YOLO11, or RT-DETR through the Ultralytics framework. YOLO is the most popular object detection architecture for real-time applications, and the annotation format is supported by all major labeling tools including Label Studio, CVAT, Roboflow, Labelbox, and V7. If your use case involves real-time object detection, the YOLO format and training pipeline are likely the fastest path to deployment.

    Choose YOLO format over COCO format when your workflow is centered on YOLO-family models and you prefer the simplicity of one-text-file-per-image. YOLO format is easier to manually inspect, edit, and version control because each annotation is a small text file rather than a single large JSON. It also avoids the complexity of COCO's nested JSON structure with separate category, annotation, and image dictionaries.

    YOLO format is less suitable when you need to store rich annotation metadata (annotator ID, confidence scores, annotation timestamps), when your task requires non-rectangular annotations beyond what YOLO's polygon format supports, or when you are training models outside the YOLO family that expect COCO, VOC, or other annotation formats.

    Schema / Structure

    text
    YOLO Detection Format (per line):
    <class_id> <x_center> <y_center> <width> <height>
    
    Where:
      class_id  - Integer class index (0-based)
      x_center  - Bounding box center X (normalized 0.0-1.0)
      y_center  - Bounding box center Y (normalized 0.0-1.0)
      width     - Bounding box width (normalized 0.0-1.0)
      height    - Bounding box height (normalized 0.0-1.0)
    
    YOLO Segmentation Format (per line):
    <class_id> <x1> <y1> <x2> <y2> ... <xn> <yn>
    
    Dataset Directory Structure:
    dataset/
    ├── data.yaml            # Class names and paths
    ├── train/
    │   ├── images/
    │   │   ├── img001.jpg
    │   │   └── img002.jpg
    │   └── labels/
    │       ├── img001.txt
    │       └── img002.txt
    ├── val/
    │   ├── images/
    │   └── labels/
    └── test/
        ├── images/
        └── labels/
    YOLO annotation format specification with detection, segmentation, and directory structure

    Example Data

    yaml
    # data.yaml - Dataset configuration
    path: ./dataset
    train: train/images
    val: val/images
    test: test/images
    names:
      0: person
      1: car
      2: bicycle
      3: traffic_light
    
    # --- labels/img001.txt ---
    # Two people and one car detected in the image
    0 0.4531 0.3275 0.1200 0.4500
    0 0.7125 0.4100 0.0950 0.3800
    1 0.2800 0.5500 0.3200 0.2800
    
    # --- labels/img002.txt ---
    # One bicycle and one traffic light
    2 0.6200 0.6800 0.1500 0.2200
    3 0.1500 0.1200 0.0400 0.0800
    YOLO dataset configuration (data.yaml) and annotation files for object detection
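    Going the other direction — recovering pixel corners from a normalized line, for drawing or cropping — is the inverse of the normalization. The helper name here is illustrative:

```python
# Hypothetical helper: denormalize a YOLO line back to pixel corners
# (x_min, y_min, x_max, y_max) for a given image size.
def yolo_to_pixels(line, img_w, img_h):
    class_id, xc, yc, w, h = line.split()
    xc, w = float(xc) * img_w, float(w) * img_w
    yc, h = float(yc) * img_h, float(h) * img_h
    x_min, y_min = xc - w / 2, yc - h / 2
    return int(class_id), (round(x_min), round(y_min),
                           round(x_min + w), round(y_min + h))

# The car box from img001.txt above, applied to a 1920x1080 image:
print(yolo_to_pixels("1 0.2800 0.5500 0.3200 0.2800", 1920, 1080))
# → (1, (230, 443, 845, 745))
```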

    Ertas Support

    Ertas Data Suite supports YOLO format datasets for computer vision training data preparation. You can import YOLO-formatted annotation datasets, apply data quality checks including annotation validation (verifying coordinate ranges, class index validity, and bounding box sanity), and export cleaned datasets maintaining the YOLO directory structure. PII redaction can be applied to associated metadata files while preserving annotation integrity.
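    The kinds of checks described above can be sketched as follows. This is an illustrative validator, not Ertas's actual implementation; it takes an iterable of label lines so a file can be validated with `validate_labels(open(path), num_classes)`:

```python
# Sketch of detection-label validation: field count, class index range,
# coordinate range, and degenerate-box checks. Names are illustrative.
def validate_labels(lines, num_classes):
    errors = []
    for lineno, line in enumerate(lines, start=1):
        tokens = line.split()
        if not tokens:
            continue  # blank line; an empty file means "no objects"
        if len(tokens) != 5:
            errors.append((lineno, "expected 5 fields"))
            continue
        class_id = int(tokens[0])
        xc, yc, w, h = map(float, tokens[1:])
        if not 0 <= class_id < num_classes:
            errors.append((lineno, f"class index {class_id} out of range"))
        if not all(0.0 <= v <= 1.0 for v in (xc, yc, w, h)):
            errors.append((lineno, "coordinate outside [0, 1]"))
        if w <= 0 or h <= 0:
            errors.append((lineno, "degenerate box"))
    return errors
```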
