COCO Format Guide

Microsoft COCO annotation format for object detection and segmentation

Annotation

Specification

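COCO (Common Objects in Context) is a comprehensive JSON annotation standard developed by Microsoft Research for the COCO dataset and benchmark. It has become one of the most widely adopted annotation formats in computer vision, supporting object detection, instance segmentation, keypoint detection, panoptic segmentation, image captioning, and dense pose estimation. Unlike the simpler YOLO format, COCO stores all annotations for a dataset (typically one file per split) in a single JSON file with a rich relational structure.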

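The COCO format uses a relational data model built on three core collections: images (id, file_name, width, height), annotations (id, image_id, category_id, bbox, segmentation, area, iscrowd), and categories (id, name, supercategory), plus optional licenses and info metadata. Bounding boxes are stored as [x_min, y_min, width, height] in absolute pixel coordinates (not normalized coordinates as in YOLO). Segmentation masks are stored either as lists of polygon vertices or as RLE (run-length encoding), which may be compressed.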
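The difference between the two bounding-box conventions can be sketched as a pair of conversion helpers. This is a minimal illustration, assuming the common YOLO layout of [x_center, y_center, width, height] normalized by image size; the function names `coco_to_yolo` and `yolo_to_coco` are hypothetical and not part of any library.

```python
def coco_to_yolo(bbox, img_w, img_h):
    """Convert a COCO box [x_min, y_min, width, height] in pixels to an
    assumed YOLO box [x_center, y_center, width, height] in 0..1."""
    x_min, y_min, w, h = bbox
    return [
        (x_min + w / 2) / img_w,  # normalized x center
        (y_min + h / 2) / img_h,  # normalized y center
        w / img_w,                # normalized width
        h / img_h,                # normalized height
    ]


def yolo_to_coco(bbox, img_w, img_h):
    """Convert a normalized YOLO box back to COCO pixel coordinates."""
    x_c, y_c, w, h = bbox
    box_w = w * img_w
    box_h = h * img_h
    return [x_c * img_w - box_w / 2, y_c * img_h - box_h / 2, box_w, box_h]


# A 300x150 pixel box at (100, 200) in a 1920x1080 image.
yolo_box = coco_to_yolo([100.0, 200.0, 300.0, 150.0], 1920, 1080)
round_trip = yolo_to_coco(yolo_box, 1920, 1080)
```

The conversion needs the image width and height, which is why the COCO images collection stores them alongside file_name.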

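This relational structure makes querying straightforward: you can find all annotations for a given image, all instances of a category, or filter annotations by attributes such as area or the iscrowd flag. The pycocotools library provides a Python API (the COCO class) for loading, querying, and evaluating COCO-format datasets. The COCO evaluation metrics (AP, AP50, AP75, AP_small, AP_medium, AP_large) have become the standard benchmark for evaluating object detection and segmentation models.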
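To make the relational lookups concrete, here is a rough pure-Python sketch of the kind of indexing and filtering described above, without pycocotools. The helpers `build_index` and `filter_annotations` and the sample data are hypothetical; the 32x32 pixel threshold follows the usual COCO definition of a "small" object.

```python
from collections import defaultdict


def build_index(dataset):
    """Group annotations by image_id so per-image lookups are cheap."""
    anns_by_image = defaultdict(list)
    for ann in dataset["annotations"]:
        anns_by_image[ann["image_id"]].append(ann)
    return anns_by_image


def filter_annotations(anns, min_area=0.0, max_area=float("inf"), iscrowd=None):
    """Keep annotations with min_area <= area < max_area and, if given,
    a matching iscrowd flag."""
    return [
        a for a in anns
        if min_area <= a["area"] < max_area
        and (iscrowd is None or a["iscrowd"] == iscrowd)
    ]


# Hypothetical sample data, for illustration only.
sample = {
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1, "area": 900.0, "iscrowd": 0},
        {"id": 2, "image_id": 1, "category_id": 1, "area": 45000.0, "iscrowd": 0},
        {"id": 3, "image_id": 2, "category_id": 2, "area": 5000.0, "iscrowd": 1},
    ]
}

index = build_index(sample)
small_in_image_1 = filter_annotations(index[1], max_area=32 ** 2)
all_anns = [a for anns in index.values() for a in anns]
crowd_ids = [a["id"] for a in filter_annotations(all_anns, iscrowd=1)]
```

With pycocotools installed, `COCO.getAnnIds` accepts `imgIds`, `catIds`, `areaRng`, and `iscrowd` arguments that cover the same kinds of queries.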

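When to Use COCO Format

Use the COCO format when training models with frameworks that expect COCO annotations, including Detectron2, MMDetection, DETR, and many Hugging Face vision models. COCO is also the standard format for academic papers and benchmark comparisons: if you need to publish results or compare against published baselines, you will generally need the COCO format and its evaluation metrics. It is also a strong choice when your annotations include segmentation masks, keypoints, or captions in addition to bounding boxes.

Prefer COCO over YOLO when you need rich annotation metadata (area, the iscrowd flag, segmentation polygons), when you work on panoptic segmentation tasks that need both "stuff" and "thing" annotations, or when your evaluation workflow uses the standard COCO metrics. COCO is also the better fit when a single image carries several annotation types (bounding boxes plus segmentation plus keypoints) that should be stored together.

COCO is less convenient when the dataset is very large and you need version control over per-image annotations, because a single JSON file is hard to diff and merge. It is also more complex to parse and generate than YOLO, requiring either pycocotools or careful JSON handling. For simple bounding-box-only detection tasks trained with YOLO models, the YOLO format is simpler and equally effective.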

Schema / Structure

json

{
  "info": {
    "year": 2026,
    "version": "1.0",
    "description": "Custom object detection dataset",
    "contributor": "Ertas",
    "date_created": "2026-03-15"
  },
  "licenses": [
    {"id": 1, "name": "CC BY 4.0", "url": "https://creativecommons.org/licenses/by/4.0/"}
  ],
  "categories": [
    {"id": 1, "name": "car", "supercategory": "vehicle"},
    {"id": 2, "name": "person", "supercategory": "human"}
  ],
  "images": [
    {"id": 1, "file_name": "img001.jpg", "width": 1920, "height": 1080}
  ],
  "annotations": [
    {
      "id": 1,
      "image_id": 1,
      "category_id": 1,
      "bbox": [100.0, 200.0, 300.0, 150.0],
      "area": 45000.0,
      "segmentation": [[100,200, 400,200, 400,350, 100,350]],
      "iscrowd": 0
    }
  ]
}

COCO JSON annotation format with info, licenses, categories, images, and annotations sections
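The bbox, area, and segmentation fields in the annotation above are mutually consistent, which can be checked with a short sketch. It assumes a single flat polygon of the form [x1, y1, x2, y2, ...] and uses the shoelace formula; `polygon_area` and `polygon_bbox` are hypothetical helper names.

```python
def polygon_area(flat_xy):
    """Area of a simple polygon given as [x1, y1, x2, y2, ...],
    computed with the shoelace formula."""
    xs = flat_xy[0::2]
    ys = flat_xy[1::2]
    n = len(xs)
    twice_area = 0.0
    for i in range(n):
        j = (i + 1) % n  # next vertex, wrapping back to the first
        twice_area += xs[i] * ys[j] - xs[j] * ys[i]
    return abs(twice_area) / 2.0


def polygon_bbox(flat_xy):
    """Tight COCO-style box [x_min, y_min, width, height] around a polygon."""
    xs = flat_xy[0::2]
    ys = flat_xy[1::2]
    return [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)]


# The rectangle polygon from the schema example above.
polygon = [100, 200, 400, 200, 400, 350, 100, 350]
area = polygon_area(polygon)  # 45000.0
bbox = polygon_bbox(polygon)  # [100, 200, 300, 150]
```

Here the polygon is a rectangle, so the mask area equals the box area. For non-rectangular shapes the two differ, and in the official COCO annotations the area field refers to the segmentation area rather than width times height.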

Example Data

python

from pycocotools.coco import COCO
import json

# Load and query a COCO dataset
coco = COCO("annotations/instances_train.json")

# Get all images containing 'car'
car_id = coco.getCatIds(catNms=["car"])[0]
car_img_ids = coco.getImgIds(catIds=[car_id])
print(f"Found {len(car_img_ids)} images with cars")

# Get annotations for a specific image
img_info = coco.loadImgs(car_img_ids[0])[0]
ann_ids = coco.getAnnIds(imgIds=img_info["id"])
anns = coco.loadAnns(ann_ids)
for ann in anns:
    cat = coco.loadCats(ann["category_id"])[0]
    print(f"  {cat['name']}: bbox={ann['bbox']}, area={ann['area']}")

# Create a COCO dataset programmatically
dataset = {
    "info": {"version": "1.0", "description": "My dataset"},
    "categories": [
        {"id": 1, "name": "dog", "supercategory": "animal"},
        {"id": 2, "name": "cat", "supercategory": "animal"},
    ],
    "images": [
        {"id": 1, "file_name": "photo_001.jpg", "width": 640, "height": 480},
    ],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1,
         "bbox": [50, 100, 200, 180], "area": 36000, "iscrowd": 0,
         "segmentation": [[50,100, 250,100, 250,280, 50,280]]},
    ],
}
with open("annotations.json", "w") as f:
    json.dump(dataset, f, indent=2)

Loading and querying a COCO-format dataset with pycocotools, and creating one with the json module

Ertas Support

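Ertas Data Suite supports importing and exporting the COCO format for computer vision training data workflows. You can import a COCO JSON annotation file together with its image dataset, validate annotation integrity (checking for orphaned annotations, missing images, and invalid category references), and export the cleaned dataset in COCO format. The platform also supports conversion between COCO and YOLO for workflows that need both formats.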
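For readers who want to run similar integrity checks on their own, the three checks named above can be approximated in a few lines of plain Python. This is only an illustrative sketch, not a description of how Ertas Data Suite implements validation; `validate_coco` and the `available_files` parameter are hypothetical.

```python
def validate_coco(dataset, available_files=None):
    """Return ids or file names for three kinds of integrity problems.

    available_files is an optional set of image file names known to exist;
    when it is None, the missing-image check is skipped.
    """
    image_ids = {img["id"] for img in dataset.get("images", [])}
    category_ids = {cat["id"] for cat in dataset.get("categories", [])}
    problems = {
        "orphan_annotations": [],  # annotation points at an unknown image
        "invalid_categories": [],  # annotation points at an unknown category
        "missing_images": [],      # image entry has no file in available_files
    }
    for ann in dataset.get("annotations", []):
        if ann["image_id"] not in image_ids:
            problems["orphan_annotations"].append(ann["id"])
        if ann["category_id"] not in category_ids:
            problems["invalid_categories"].append(ann["id"])
    if available_files is not None:
        for img in dataset.get("images", []):
            if img["file_name"] not in available_files:
                problems["missing_images"].append(img["file_name"])
    return problems


# Hypothetical dataset with one problem of each kind.
broken = {
    "categories": [{"id": 1, "name": "dog", "supercategory": "animal"}],
    "images": [
        {"id": 1, "file_name": "photo_001.jpg", "width": 640, "height": 480},
        {"id": 2, "file_name": "photo_002.jpg", "width": 640, "height": 480},
    ],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1},
        {"id": 2, "image_id": 99, "category_id": 1},
        {"id": 3, "image_id": 2, "category_id": 7},
    ],
}
report = validate_coco(broken, available_files={"photo_001.jpg"})
```

In practice, `available_files` could be built from a directory listing of the image folder before calling the function.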
