Ertas for Invoice & Receipt Processing

Fine-tune models that extract line items, amounts, dates, tax details, and vendor information from invoices and receipts in any format your organization receives.

The Challenge

Accounts payable teams process invoices from hundreds of vendors, each with a different layout, terminology, and level of detail. Extracting structured data — vendor name, invoice number, line items, quantities, unit prices, tax amounts, payment terms, and bank details — from these diverse formats is tedious manual work that is prone to errors. A single transposition error in an amount field can cascade into payment discrepancies, vendor disputes, and accounting reconciliation issues.

Template-based extraction tools work for high-volume vendors with consistent formats but fail on the long tail of vendors that send invoices in unique layouts. Machine learning-based extraction tools improve coverage but still struggle with handwritten invoices, scanned documents with OCR artifacts, multi-page invoices with complex line item tables, and invoices in multiple languages. The accuracy gap between what extraction tools deliver and what AP teams need means that every invoice still requires human verification — defeating the purpose of automation.

The Solution

Ertas enables AP teams to fine-tune extraction models on their specific invoice corpus, training the model to handle the exact vendor formats, layout variations, and field naming conventions they encounter. With Ertas Studio, teams upload annotated invoices as JSONL — each entry containing the OCR text from an invoice and the corresponding structured data fields — and train a model that maps unstructured invoice text to clean, structured output matching their ERP system's field schema.
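As a rough illustration of the JSONL layout described above, one annotated invoice might be serialized as follows. The key names (`ocr_text`, `fields`) and the individual field names are assumptions made for this sketch; the exact schema Ertas Studio expects is not specified here.

```python
import json
from typing import Dict, Iterable


def to_jsonl_line(ocr_text: str, fields: Dict[str, str]) -> str:
    """Serialize one annotated invoice as a single JSONL line.

    The "ocr_text" / "fields" keys are hypothetical placeholders.
    """
    record = {"ocr_text": ocr_text, "fields": fields}
    # json.dumps escapes embedded newlines, so the record stays on one line.
    return json.dumps(record, ensure_ascii=False)


def write_jsonl(path: str, lines: Iterable[str]) -> None:
    """Write one JSON object per line, as the JSONL format requires."""
    with open(path, "w", encoding="utf-8") as handle:
        for line in lines:
            handle.write(line + "\n")


example_line = to_jsonl_line(
    "ACME Supply Co.\nInvoice No: INV-1042\nNet Amount 1.250,00 EUR",
    {
        "vendor_name": "ACME Supply Co.",
        "invoice_number": "INV-1042",
        "subtotal": "1250.00",  # normalized from the vendor's "1.250,00"
        "currency": "EUR",
    },
)
```

Each annotated invoice becomes one such line, so a corpus of several thousand invoices is a single text file that can be appended to as new annotations arrive.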

The fine-tuned model handles vendor-specific quirks that generic tools miss: the vendor that lists 'Net Amount' instead of 'Subtotal,' the one that embeds tax in line item prices rather than listing it separately, the European vendor that uses a comma as the decimal separator. Because the model learns from the organization's actual invoices, it reflects the real distribution of formats and edge cases rather than a synthetic training set. Deployed through Ertas Cloud or locally via Ollama, the model processes invoices as they arrive and outputs structured data ready for ERP import. Invoices whose extraction confidence falls below a configured threshold are routed to human review, creating a feedback loop for continuous model improvement.
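The review routing and feedback loop can be sketched in a few lines. This is a minimal illustration, not Ertas's implementation: the 0.9 threshold, the function names, and the idea of storing reviewer corrections as new training records are all assumptions.

```python
from typing import Dict, List

REVIEW_THRESHOLD = 0.9  # assumed value; tune against your own error tolerance


def needs_review(field_confidence: Dict[str, float],
                 threshold: float = REVIEW_THRESHOLD) -> bool:
    """Return True when the invoice should go to a human reviewer."""
    if not field_confidence:
        # Nothing was extracted, so there is nothing to trust.
        return True
    # One weak field is enough to send the whole invoice to review.
    return any(score < threshold for score in field_confidence.values())


def correction_to_example(ocr_text: str,
                          corrected_fields: Dict[str, str]) -> Dict[str, object]:
    """Turn a reviewer's corrected fields into a new training record."""
    return {"ocr_text": ocr_text, "fields": corrected_fields}


retraining_queue: List[Dict[str, object]] = []

confidence = {"vendor_name": 0.99, "invoice_number": 0.97, "tax_amount": 0.71}
if needs_review(confidence):
    # In practice the corrected values would come from the review UI.
    retraining_queue.append(
        correction_to_example("...OCR text...", {"tax_amount": "237.50"})
    )
```

Routing on the weakest field rather than an average keeps a single doubtful amount from being hidden by many confident fields.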

Key Features

Studio

Custom Field Extraction Training

Train extraction models on your ERP system's specific field schema using Studio. Map any invoice format to your exact data structure, including custom fields, calculated totals, and multi-currency support.

Hub

Document Understanding Models

Start from models on Hub that already understand document layouts, table structures, and common financial terminology, so fine-tuning can focus on vendor-specific extraction accuracy.

Cloud

Invoice Processing API

Deploy through Cloud as an extraction API that accepts invoice text (post-OCR) and returns structured JSON matching your ERP import schema, with per-field confidence scores.
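To make the response shape concrete, here is a sketch of how a client might unpack structured JSON with per-field confidence scores. The payload layout (`fields`, `value`, `confidence`) is a hypothetical example, not the documented Ertas Cloud response format.

```python
import json
from typing import Dict, Tuple

# Hypothetical response body; real field names depend on your ERP schema.
SAMPLE_RESPONSE = """
{"fields": {
  "vendor_name":    {"value": "ACME Supply Co.", "confidence": 0.99},
  "invoice_number": {"value": "INV-1042",        "confidence": 0.97},
  "tax_amount":     {"value": "237.50",          "confidence": 0.71}
}}
"""


def split_response(raw: str) -> Tuple[Dict[str, str], Dict[str, float]]:
    """Separate extracted values from their confidence scores."""
    payload = json.loads(raw)
    values = {name: item["value"] for name, item in payload["fields"].items()}
    scores = {name: item["confidence"] for name, item in payload["fields"].items()}
    return values, scores


values, scores = split_response(SAMPLE_RESPONSE)
# The lowest-scoring field is the first one a reviewer should look at.
weakest_field = min(scores, key=scores.get)
```

Keeping values and scores in separate dictionaries lets the values pass straight to the ERP import while the scores drive review decisions.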

Vault

Financial Data Protection

Vault encrypts all invoice data, including vendor bank details, payment amounts, and account numbers, at rest and in transit, and supports configurable retention policies.

Example Workflow

A construction company's AP department processes 3,000 invoices monthly from 400+ vendors — material suppliers, subcontractors, equipment rental companies, and professional services firms. The team annotates 8,000 invoices with structured field mappings and uploads them to Ertas Vault. In Ertas Studio, they fine-tune a model targeting their ERP's 22-field invoice schema including project codes, cost categories, and retention amounts specific to construction billing. The model is deployed as an API endpoint integrated with their invoice intake workflow. Incoming invoices are OCR-processed and sent to the model, which outputs structured data matching the ERP import format. The model handles 78% of invoices at high confidence with no human intervention, routes 18% for quick verification (usually just one or two ambiguous fields), and flags 4% for full manual review. Monthly processing time drops from 120 staff-hours to 35, and data entry errors decrease by 90%.
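The figures in this example can be checked with a few lines of arithmetic. The tier names below are labels chosen for the sketch; the numbers come from the workflow above.

```python
MONTHLY_INVOICES = 3000
TIER_PERCENT = {"auto": 78, "quick_verification": 18, "full_review": 4}

# The three tiers should cover every invoice exactly once.
assert sum(TIER_PERCENT.values()) == 100

# Integer arithmetic avoids floating-point rounding on the counts.
tier_counts = {
    tier: MONTHLY_INVOICES * percent // 100
    for tier, percent in TIER_PERCENT.items()
}

hours_before, hours_after = 120, 35
hours_saved = hours_before - hours_after
reduction_percent = round(100 * hours_saved / hours_before, 1)

# Average handling time per invoice, in minutes.
minutes_before = hours_before * 60 / MONTHLY_INVOICES
minutes_after = hours_after * 60 / MONTHLY_INVOICES
```

That works out to 2,340 invoices processed without intervention, 540 sent for quick verification, and 120 for full review each month. Staff time falls by 85 hours (about 70.8%), or from an average of 2.4 minutes per invoice to 0.7 minutes.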