COCO Format Guide

The Microsoft COCO annotation format for object detection and segmentation

Annotation

Specification

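The COCO (Common Objects in Context) annotation format is a comprehensive JSON annotation standard developed by Microsoft Research for the COCO dataset and benchmark. It has become one of the most widely adopted annotation formats in computer vision, supporting object detection, instance segmentation, keypoint detection, panoptic segmentation, image captioning, and dense pose estimation. Unlike the simpler YOLO format, which keeps one text file per image, COCO stores all annotations for a dataset (typically one file per split and task) in a single JSON file with a rich relational structure.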

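The COCO format uses a relational data model with three core entities: images (with id, file_name, width, height), annotations (with id, image_id, category_id, bbox, segmentation, area, iscrowd), and categories (with id, name, supercategory), plus optional licenses and info metadata. Annotations reference images and categories by id. Bounding boxes are stored as [x_min, y_min, width, height] in absolute pixel coordinates (not normalized as in YOLO). Segmentation masks are stored either as lists of polygon vertices or as binary masks in run-length encoding (RLE).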
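The bounding-box convention is a common source of off-by-a-format bugs, so a small conversion sketch may help. The helper names `coco_to_corners` and `coco_to_yolo` are illustrative and not part of any library; the box values are made up for the example.

```python
def coco_to_corners(bbox):
    """COCO [x_min, y_min, width, height] -> [x_min, y_min, x_max, y_max]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


def coco_to_yolo(bbox, img_width, img_height):
    """COCO pixel box -> YOLO-style normalized [x_center, y_center, width, height]."""
    x, y, w, h = bbox
    return [
        (x + w / 2) / img_width,
        (y + h / 2) / img_height,
        w / img_width,
        h / img_height,
    ]


# Illustrative box on a 1920x1080 image (absolute pixels, top-left origin).
box = [100.0, 200.0, 300.0, 150.0]
corners = coco_to_corners(box)
yolo = coco_to_yolo(box, 1920, 1080)

print(corners)
print([round(v, 4) for v in yolo])
```

The key difference is that COCO anchors the box at its top-left corner in pixels, while YOLO anchors it at the center and divides by the image size, so the image's width and height are needed for the conversion.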

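The relational structure supports powerful queries: finding all annotations for a given image, all instances of a given category, or filtering by annotation attributes such as area or crowd status. The pycocotools library provides a Python API (the COCO class) for loading, querying, and evaluating COCO-format datasets. The COCO evaluation metrics (AP, AP50, AP75, AP_small, AP_medium, AP_large) have become the standard benchmark for evaluating object detection and segmentation models.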
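As a rough illustration of these queries without pycocotools, the sketch below indexes annotations by image and filters them by crowd status and size. The dataset is a made-up in-memory example, and `size_bucket` is a hypothetical helper; its thresholds follow the 32² and 96² pixel-area boundaries used for AP_small, AP_medium, and AP_large, with boundary handling simplified.

```python
from collections import defaultdict

# Hypothetical in-memory dataset; normally this would come from json.load().
dataset = {
    "images": [{"id": 1, "file_name": "a.jpg", "width": 640, "height": 480}],
    "categories": [{"id": 1, "name": "car", "supercategory": "vehicle"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1,
         "bbox": [10, 10, 20, 20], "area": 400, "iscrowd": 0},
        {"id": 2, "image_id": 1, "category_id": 1,
         "bbox": [50, 50, 100, 50], "area": 5000, "iscrowd": 0},
        {"id": 3, "image_id": 1, "category_id": 1,
         "bbox": [0, 0, 300, 300], "area": 90000, "iscrowd": 1},
    ],
}


def size_bucket(area):
    """Classify an annotation by pixel area (simplified boundary handling)."""
    if area < 32 ** 2:
        return "small"
    if area < 96 ** 2:
        return "medium"
    return "large"


# Index: image_id -> list of annotations, the core lookup behind most queries.
anns_by_image = defaultdict(list)
for ann in dataset["annotations"]:
    anns_by_image[ann["image_id"]].append(ann)

# Query: non-crowd, medium-sized annotations on image 1.
medium_ids = [
    ann["id"]
    for ann in anns_by_image[1]
    if not ann["iscrowd"] and size_bucket(ann["area"]) == "medium"
]
print(medium_ids)
```

For real datasets the pycocotools COCO class builds comparable indexes on load, so this hand-rolled version is mainly useful for understanding the structure or for environments where the library is unavailable.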

When to Use COCO Format

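Use COCO format when training models with frameworks that expect COCO-style annotations, including Detectron2, MMDetection, DETR, and many Hugging Face vision models. COCO format is the standard for academic research papers and benchmark comparisons: if you are publishing results or comparing against published baselines, COCO format and its evaluation metrics are the expected choice. It is also the best choice when your annotations include segmentation masks, keypoints, or captions in addition to bounding boxes.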

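Choose COCO over YOLO format when you need rich annotation metadata (area, iscrowd flags, segmentation polygons), when you are working on panoptic segmentation tasks that require both "stuff" and "thing" annotations, or when your evaluation workflow uses the standard COCO metrics. COCO is also preferred when each image carries several annotation types (bounding boxes plus segmentation plus keypoints) that need to be stored in one unified format.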

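COCO format is less convenient when your dataset is very large and you want to version-control annotations per image, because a single JSON file is hard to diff and merge. It is also more complex to parse and generate than YOLO format, requiring either the pycocotools library or careful JSON manipulation. For simple bounding-box-only detection tasks trained with YOLO models, YOLO format is simpler and equally effective.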

Schema / Structure

json

{
  "info": {
    "year": 2026,
    "version": "1.0",
    "description": "Custom object detection dataset",
    "contributor": "Ertas",
    "date_created": "2026-03-15"
  },
  "licenses": [
    {"id": 1, "name": "CC BY 4.0", "url": "https://creativecommons.org/licenses/by/4.0/"}
  ],
  "categories": [
    {"id": 1, "name": "car", "supercategory": "vehicle"},
    {"id": 2, "name": "person", "supercategory": "human"}
  ],
  "images": [
    {"id": 1, "file_name": "img001.jpg", "width": 1920, "height": 1080}
  ],
  "annotations": [
    {
      "id": 1,
      "image_id": 1,
      "category_id": 1,
      "bbox": [100.0, 200.0, 300.0, 150.0],
      "area": 45000.0,
      "segmentation": [[100,200, 400,200, 400,350, 100,350]],
      "iscrowd": 0
    }
  ]
}

COCO JSON annotation format with info, licenses, categories, images, and annotations sections
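The bbox and area values in the schema above are consistent with its segmentation polygon, which can be checked with the shoelace formula. This is a minimal sketch with hypothetical helper names (`polygon_area`, `polygon_bbox`). It assumes a single simple polygon; for real datasets the area is usually derived from the mask itself, so values may differ slightly.

```python
def polygon_area(flat_xy):
    """Shoelace area of a polygon given as a flat [x1, y1, x2, y2, ...] list."""
    xs = flat_xy[0::2]
    ys = flat_xy[1::2]
    n = len(xs)
    twice_area = sum(
        xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


def polygon_bbox(flat_xy):
    """Tight COCO-style [x_min, y_min, width, height] box around the polygon."""
    xs = flat_xy[0::2]
    ys = flat_xy[1::2]
    return [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)]


# Polygon copied from the schema example above.
polygon = [100, 200, 400, 200, 400, 350, 100, 350]
area = polygon_area(polygon)
bbox = polygon_bbox(polygon)

print(area)  # 45000.0
print(bbox)  # [100, 200, 300, 150]
```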

Example Data

python

from pycocotools.coco import COCO
import json

# Load and query a COCO dataset
coco = COCO("annotations/instances_train.json")

# Get all images containing 'car'
car_id = coco.getCatIds(catNms=["car"])[0]
car_img_ids = coco.getImgIds(catIds=[car_id])
print(f"Found {len(car_img_ids)} images with cars")

# Get annotations for a specific image
img_info = coco.loadImgs(car_img_ids[0])[0]
ann_ids = coco.getAnnIds(imgIds=img_info["id"])
anns = coco.loadAnns(ann_ids)
for ann in anns:
    cat = coco.loadCats(ann["category_id"])[0]
    print(f"  {cat['name']}: bbox={ann['bbox']}, area={ann['area']}")

# Create a COCO dataset programmatically
dataset = {
    "info": {"version": "1.0", "description": "My dataset"},
    "categories": [
        {"id": 1, "name": "dog", "supercategory": "animal"},
        {"id": 2, "name": "cat", "supercategory": "animal"},
    ],
    "images": [
        {"id": 1, "file_name": "photo_001.jpg", "width": 640, "height": 480},
    ],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1,
         "bbox": [50, 100, 200, 180], "area": 36000, "iscrowd": 0,
         "segmentation": [[50,100, 250,100, 250,280, 50,280]]},
    ],
}
with open("annotations.json", "w") as f:
    json.dump(dataset, f, indent=2)

Loading, querying, and creating a COCO-format annotation dataset with pycocotools

Ertas Support

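Ertas Data Suite supports COCO format import and export for computer vision training data workflows. You can import COCO JSON annotation files together with their image datasets, validate annotation integrity (checking for orphaned annotations, missing images, and invalid category references), and export the cleaned dataset in COCO format. Conversion between COCO and YOLO is supported for workflows that need both formats.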
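To make these integrity checks concrete, here is a minimal sketch of what they can look like in plain Python. It is not Ertas's implementation: `validate_coco` is a hypothetical helper, and it only inspects the JSON structure. Checking whether image files exist on disk is left out.

```python
def validate_coco(dataset):
    """Return a list of human-readable problems found in a COCO-style dict."""
    image_ids = {img["id"] for img in dataset.get("images", [])}
    category_ids = {cat["id"] for cat in dataset.get("categories", [])}
    problems = []
    seen_ann_ids = set()

    for ann in dataset.get("annotations", []):
        ann_id = ann.get("id")

        # Annotation ids must be unique across the whole file.
        if ann_id in seen_ann_ids:
            problems.append(f"annotation {ann_id}: duplicate id")
        seen_ann_ids.add(ann_id)

        # Orphaned annotation: points at an image that is not listed.
        if ann.get("image_id") not in image_ids:
            problems.append(
                f"annotation {ann_id}: orphaned, image_id {ann.get('image_id')} not in images"
            )

        # Invalid category reference.
        if ann.get("category_id") not in category_ids:
            problems.append(
                f"annotation {ann_id}: category_id {ann.get('category_id')} not in categories"
            )

        # Boxes are [x_min, y_min, width, height] with positive size.
        bbox = ann.get("bbox", [])
        if len(bbox) != 4 or bbox[2] <= 0 or bbox[3] <= 0:
            problems.append(f"annotation {ann_id}: bbox must be [x, y, w, h] with positive size")

    return problems


# Made-up sample with one orphaned annotation and one bad category reference.
sample = {
    "images": [{"id": 1, "file_name": "img001.jpg", "width": 1920, "height": 1080}],
    "categories": [{"id": 1, "name": "car", "supercategory": "vehicle"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1, "bbox": [100, 200, 300, 150]},
        {"id": 2, "image_id": 99, "category_id": 1, "bbox": [0, 0, 10, 10]},
        {"id": 3, "image_id": 1, "category_id": 7, "bbox": [0, 0, 10, 10]},
    ],
}

problems = validate_coco(sample)
for problem in problems:
    print(problem)
```

Returning a list of messages rather than raising on the first error makes it easier to report every problem in a large annotation file in one pass.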
