    EU AI Act Article 10 vs. Article 30: What Your Data Team Needs to Know

    A detailed comparison of EU AI Act Articles 10 and 30 — the two most critical provisions for AI training data governance, documentation, and compliance.

    Ertas Team

    If your organization builds or deploys high-risk AI systems in the EU, two articles in the EU AI Act will directly shape how your data team operates: Article 10 (Data and Data Governance) and Article 30 (Technical Documentation). They're related but distinct — and confusing them leads to compliance gaps.

    This piece breaks down what each article requires, who's responsible, and how they interact in practice.

    Article 10: Data and Data Governance

    Article 10 is about the process of preparing training data. It sets requirements for how high-risk AI systems' training, validation, and testing datasets must be managed.

    What It Requires

    Data governance practices covering:

    • Design choices for data collection and origin
    • Data preparation operations (cleaning, labeling, aggregation)
    • Relevance and representativeness assessments
    • Examination for possible biases
    • Identification of data gaps or shortcomings

    Data quality criteria including:

    • Training data must be relevant, sufficiently representative, and as error-free as possible
    • Datasets must be appropriate for the intended purpose of the AI system
    • Statistical properties must be understood and documented

    Bias examination:

    • Datasets must be examined for biases that could lead to discriminatory outcomes
    • Where bias is identified, appropriate measures must be taken to address it
    • The examination process itself must be documented
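A bias examination does not have to start with heavy tooling: a first pass can simply compare subgroup representation and record which groups fall short. A minimal sketch in plain Python (the `region` field, the 0.8 threshold, and the report shape are illustrative choices of ours, not values prescribed by the Act):

```python
from collections import Counter

def representation_report(records, group_field, threshold=0.8):
    """Compare each group's share of the dataset against the largest
    group and flag any group below `threshold` of that size."""
    counts = Counter(r[group_field] for r in records)
    largest = max(counts.values())
    report = {}
    for group, n in counts.items():
        ratio = n / largest
        report[group] = {
            "count": n,
            "ratio": round(ratio, 2),
            "flagged": ratio < threshold,  # underrepresented vs. largest group
        }
    return report

# Illustrative dataset: records labeled by region
data = [{"region": "north"}] * 500 + [{"region": "south"}] * 120
print(representation_report(data, "region"))
```

Persisting the printed report alongside the dataset version is what turns the check into the documented examination Article 10 asks for.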

    Who's Responsible

    Article 10 obligations fall on the provider of the high-risk AI system — the entity that develops or commissions the AI system and places it on the market. In practice, this means the data team, ML engineers, and their management chain.

    The Practical Challenge

    Article 10 demands that your data preparation process be documented and auditable. This is where most enterprises struggle: not because they don't clean data or check for bias, but because these steps happen in scattered scripts, notebooks, and ad-hoc processes with no unified record.
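One low-cost way to pull those scattered steps into a unified record is to route every preparation function through a single logging wrapper. A hypothetical sketch (the decorator, log fields, and operator email are our own illustration, not an Article 10 schema):

```python
import json
import time
from functools import wraps

AUDIT_LOG = []  # in practice: an append-only store, not an in-memory list

def audited(step_name, operator):
    """Record step name, operator, timestamp, and row counts per step."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(records):
            out = fn(records)
            AUDIT_LOG.append({
                "step": step_name,
                "operator": operator,
                "timestamp": time.time(),
                "rows_in": len(records),
                "rows_out": len(out),
            })
            return out
        return wrapper
    return decorator

@audited("drop_empty_text", operator="data-team@example.com")
def drop_empty(records):
    # The cleaning logic itself stays ordinary; only the wrapper changes.
    return [r for r in records if r.get("text")]

cleaned = drop_empty([{"text": "ok"}, {"text": ""}])
print(json.dumps(AUDIT_LOG[-1], indent=2))
```

The point is architectural: each script keeps its own logic, but every run leaves a row in the same log.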

    Article 30: Technical Documentation

    Article 30 is about the output — the documentation you must produce and maintain for each high-risk AI system.

    What It Requires

    Technical documentation must include:

    • General description of the AI system, its intended purpose, and the provider
    • Detailed description of system elements including algorithms, data, training processes, and design choices
    • Information about training data: data sources, scope, main characteristics, collection methodology, labeling procedures, and data cleaning/preparation methods
    • Validation and testing procedures: metrics, test results, and performance benchmarks
    • Risk management measures: identified risks and mitigation steps
    • Monitoring and update plans: post-deployment monitoring approach

    Who's Responsible

    Same as Article 10 — the provider. But Article 30 documentation also needs to be made available to market surveillance authorities upon request. This means the documentation must be organized, complete, and accessible — not buried in a team wiki or scattered across Git commits.

    The Practical Challenge

    Article 30 requires you to produce a coherent document (or document set) that describes your entire AI system, including its training data lineage. If your data pipeline is a chain of disconnected tools, assembling this documentation retroactively is expensive and error-prone.

    How They Interact

    Think of Article 10 as the process requirements and Article 30 as the reporting requirements. They're complementary:

    | Aspect | Article 10 | Article 30 |
    | --- | --- | --- |
    | Focus | How you prepare data | What you document about it |
    | Scope | Data governance practices | Full system technical documentation |
    | Timing | During development | Maintained throughout lifecycle |
    | Audience | Internal teams | Regulators and authorities |
    | Key output | Governed data pipeline | Technical documentation package |

    Article 10 tells you what your data pipeline must do. Article 30 tells you what you must be able to prove it did.

    The Gap Most Enterprises Have

    The typical enterprise AI pipeline has a version of Article 10 compliance — teams do clean data, examine bias, and validate quality. What's missing is the connection to Article 30: the documentation that proves these steps happened, with what data, by whom, and with what results.

    This gap exists because most data pipelines are built from disconnected tools:

    1. Ingestion happens in one tool (Docling, Unstructured.io, custom parsers)
    2. Cleaning happens in Python scripts or notebooks
    3. Labeling happens in Label Studio or Prodigy
    4. Quality scoring happens in Cleanlab or custom code
    5. Export happens in yet another script

    At each boundary, audit trail continuity breaks. The ingestion tool doesn't know what the cleaning script did. The labeling tool doesn't know what was filtered out during cleaning. The quality scorer doesn't know the original provenance of the data it's evaluating.

    What a Compliant Pipeline Looks Like

    To satisfy both Article 10 and Article 30 simultaneously, a data pipeline needs:

    1. Unified logging: Every operation across every stage recorded in a single audit log
    2. Operator attribution: Who performed or approved each step, with timestamps
    3. Data lineage: Ability to trace any output record back to its original source through every transformation
    4. Quality metrics: Automated capture of quality scores, error rates, and bias assessments
    5. Export capability: One-click generation of documentation that satisfies Article 30's format requirements
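The lineage requirement (point 3) can be prototyped by attaching a stable ID to every record at ingestion and recording each transformation against that ID, so any surviving output row is traceable back through the chain. A minimal sketch (the in-memory store and event format are illustrative):

```python
import uuid

LINEAGE = {}  # record_id -> list of (step, outcome) events

def ingest(texts):
    """Assign each raw input a stable ID and open its lineage trail."""
    records = []
    for t in texts:
        rid = str(uuid.uuid4())
        LINEAGE[rid] = [("ingest", "raw import")]
        records.append({"id": rid, "text": t})
    return records

def transform(records, step, fn):
    """Apply fn to each record; log whether it was kept or dropped."""
    out = []
    for r in records:
        new = fn(r)
        if new is None:
            LINEAGE[r["id"]].append((step, "dropped"))
        else:
            LINEAGE[r["id"]].append((step, "kept"))
            out.append(new)
    return out

docs = ingest(["  Hello ", ""])
docs = transform(docs, "strip", lambda r: {**r, "text": r["text"].strip()})
docs = transform(docs, "drop_empty", lambda r: r if r["text"] else None)

# Full history of the surviving record, back to ingestion
print(LINEAGE[docs[0]["id"]])
```

Because dropped records keep their trail too, the log can also answer the harder audit question: what was filtered out, at which step, and why.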

    This is fundamentally an architecture problem, not a compliance bolt-on. Platforms that handle the full pipeline in a single system — like Ertas Data Suite — generate this documentation as a byproduct of normal operation, because every stage shares the same logging infrastructure.

    What Your Data Team Should Do Now

    1. Audit your current pipeline for Article 10 gaps: Is bias examination documented? Are data governance practices written down?
    2. Assess your Article 30 readiness: Could you produce complete technical documentation for your AI system today?
    3. Identify lineage breaks: Where does audit trail continuity fail in your current tool chain?
    4. Plan for August 2026: Build compliance into new pipelines rather than retrofitting existing ones

    The enforcement deadline is approaching. The cost of building documentation in from the start is a fraction of the cost of reconstructing it after the fact.

    Turn unstructured data into AI-ready datasets — without it leaving the building.

    On-premise data preparation with full audit trail. No data egress. No fragmented toolchains. EU AI Act Article 30 compliance built in.
