    n8n + Local LLMs: Building HIPAA-Compliant Automation Workflows

    How to architect HIPAA-compliant automation workflows using self-hosted n8n and local LLM inference — with practical examples for clinical note summarisation and appointment triage.

Ertas Team

    n8n has become the workflow automation platform of choice for agencies that need self-hosted infrastructure. When combined with locally running LLMs, it creates a fully self-contained automation stack where no data leaves the organisation's network — exactly what healthcare clients require.

    This guide walks through the architecture, the specific integration patterns, and a HIPAA compliance checklist for the complete stack.

    The Architecture

    The core architecture is straightforward:

    [EHR / Clinical System] → [n8n (self-hosted)] → [Local LLM (Ollama/vLLM)] → [Output destination]
    

    Every component runs on infrastructure the healthcare organisation controls. n8n orchestrates the workflow. The local LLM handles natural language processing. No external API calls are made for AI inference.

    Component Stack

Component | Role | Deployment
--- | --- | ---
n8n | Workflow orchestration | Docker container on org's server
Ollama or vLLM | LLM inference server | Same server or dedicated GPU machine
PostgreSQL | n8n workflow data + execution logs | Local database
Redis (optional) | Queue management for high-volume workflows | Local instance
Reverse proxy | TLS termination, access control | Nginx/Caddy on same network
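
Before building workflows, it is worth a quick smoke test that each component is reachable. A minimal sketch in TypeScript (Node 18+, built-in fetch), assuming default ports: n8n's /healthz endpoint on 5678 and Ollama's model list on 11434.

    // Smoke-test the stack. Ports are n8n/Ollama defaults; adjust to your deployment.
    const endpoints = [
      { name: "n8n", url: "http://localhost:5678/healthz" },
      { name: "Ollama", url: "http://localhost:11434/api/tags" },
    ];

    for (const { name, url } of endpoints) {
      try {
        const res = await fetch(url);
        console.log(`${name}: ${res.ok ? "OK" : `HTTP ${res.status}`}`);
      } catch {
        console.log(`${name}: unreachable at ${url}`);
      }
    }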

    Hardware Requirements

    For a typical healthcare deployment serving 20-50 concurrent workflow executions:

    • CPU: 8+ cores (for n8n and supporting services)
    • RAM: 32 GB minimum (16 GB for n8n/services, 16 GB for model loading)
    • GPU: RTX 5090 (32 GB VRAM) or RTX 4090 (24 GB VRAM)
    • Storage: 500 GB SSD (models, logs, workflow data)

    Total hardware cost: $4,000-6,000 for a complete server build. Compare this to the monthly cost of cloud n8n + cloud AI APIs, and the payback period is typically 2-4 months.
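
As a rough illustration: a practice paying $500/month for hosted workflow automation plus $1,500/month in cloud LLM API usage spends $2,000/month, so a $5,000 build breaks even in two and a half months. Actual figures depend heavily on inference volume.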

    Connecting n8n to Local LLM Endpoints

    n8n connects to local LLMs through the HTTP Request node or the OpenAI-compatible node. Both Ollama and vLLM expose OpenAI-compatible API endpoints, which makes integration straightforward.

    Using Ollama

    Ollama runs on localhost:11434 by default. In n8n:

    1. Add an HTTP Request node
    2. Set the URL to http://localhost:11434/api/chat
    3. Method: POST
    4. Body (JSON):
    {
      "model": "your-fine-tuned-model",
      "messages": [
        {"role": "system", "content": "You are a clinical note summariser..."},
        {"role": "user", "content": "{{$json.clinical_note}}"}
      ],
      "stream": false
    }
    

    Alternatively, use Ollama's OpenAI-compatible endpoint at http://localhost:11434/v1/chat/completions with n8n's built-in OpenAI node — just change the base URL in the credentials configuration.
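
Before dropping the node into a workflow, it is worth confirming the endpoint responds as expected from outside n8n. A minimal sketch (TypeScript, Node 18+); the model name is a placeholder for whatever you have pulled into Ollama:

    // Mirror the n8n HTTP Request body against Ollama's chat endpoint.
    const res = await fetch("http://localhost:11434/api/chat", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model: "your-fine-tuned-model", // placeholder
        messages: [
          { role: "system", content: "You are a clinical note summariser." },
          { role: "user", content: "Patient presents with..." },
        ],
        stream: false,
      }),
    });
    const data = await res.json();
    // Non-streaming responses carry the reply under message.content.
    console.log(data.message.content);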

    Using vLLM

    vLLM provides higher throughput for concurrent requests. It exposes an OpenAI-compatible API by default:

    python -m vllm.entrypoints.openai.api_server \
      --model /path/to/your/model \
      --host 0.0.0.0 --port 8000
    

    In n8n, configure the OpenAI credentials with base URL http://your-gpu-server:8000/v1 and any string as the API key (vLLM does not require authentication by default — add it via reverse proxy).
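
With an authenticating reverse proxy in front (or vLLM's own API-key option), the call looks like any other OpenAI-style request. A hedged sketch; the hostname, key, and model path are placeholders:

    // OpenAI-compatible chat completion against vLLM.
    const res = await fetch("http://your-gpu-server:8000/v1/chat/completions", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: "Bearer local-dev-key", // checked by the proxy; bare vLLM ignores it
      },
      body: JSON.stringify({
        model: "/path/to/your/model", // by default vLLM serves under the --model value
        messages: [{ role: "user", content: "Summarise this clinical note: ..." }],
      }),
    });
    const data = await res.json();
    console.log(data.choices[0].message.content);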

    Example Workflow 1: Clinical Note Summarisation

    Use case: Physicians dictate lengthy clinical notes. The workflow summarises them into structured discharge summaries.

    Workflow steps:

    1. Trigger: Webhook receives clinical note from EHR system (or n8n polls a shared folder)
    2. Pre-process: Extract patient identifiers, separate metadata from note body
    3. LLM inference: Send note body to local LLM with system prompt specifying output format (SOAP note structure)
    4. Post-process: Parse LLM output, validate required fields are present
    5. Quality check: If confidence indicators are below threshold, flag for human review
    6. Output: Write structured summary back to EHR via API, or deposit in review queue

    n8n node chain: Webhook → Function (pre-process) → HTTP Request (LLM) → Function (validate) → IF (quality check) → [EHR API / Review Queue]
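
A sketch of the validate step as an n8n Code node (the Code node runs JavaScript; the SOAP field names and response shape are illustrative and must match your prompt's output format):

    // Validate the LLM's summary before anything reaches the EHR.
    const required = ["subjective", "objective", "assessment", "plan"];

    return $input.all().map((item) => {
      let summary;
      try {
        summary = JSON.parse(item.json.message.content); // Ollama chat response shape
      } catch {
        return { json: { needsReview: true, reason: "unparseable model output" } };
      }
      const missing = required.filter((field) => !summary[field]);
      return { json: { summary, needsReview: missing.length > 0, missing } };
    });

The IF node then branches on needsReview, sending flagged items to the review queue instead of the EHR API.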

    The entire workflow executes in 3-8 seconds depending on note length. With a fine-tuned model, summary quality is comparable to a physician spending 15 minutes on the same task.

    Example Workflow 2: Appointment Triage

    Use case: Patient messages requesting appointments are classified by urgency and routed to the appropriate department.

    Workflow steps:

    1. Trigger: n8n polls patient message queue from the practice management system
    2. LLM inference: Send patient message to local LLM with classification prompt (urgent/routine/non-clinical, department assignment)
3. Parse response: Extract classification and confidence score (see the sketch after this list)
    4. Route: Based on classification, create appointment request in the appropriate department queue
    5. Notify: Send confirmation to patient via secure messaging
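
A sketch of steps 3-4 as an n8n Code node, assuming the classification prompt instructs the model to return JSON with urgency, department, and confidence fields (all of which are illustrative):

    // Parse the triage classification and decide the route.
    const CONFIDENCE_FLOOR = 0.8; // below this, hand off to human triage

    return $input.all().map((item) => {
      const c = JSON.parse(item.json.message.content);
      let route;
      if (c.confidence < CONFIDENCE_FLOOR) route = "human-triage";
      else if (c.urgency === "urgent") route = "urgent-queue";
      else if (c.urgency === "routine") route = `dept-${c.department}`;
      else route = "front-desk"; // non-clinical messages
      return { json: { ...c, route } };
    });

A Switch node downstream routes each item on the route field.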

    Key advantage of fine-tuning: A general-purpose model makes triage errors because it does not understand the specific practice's department structure, provider specialities, or triage protocols. A model fine-tuned on 2,000-3,000 historical triage decisions from that specific practice achieves 95%+ accuracy.

Example Workflow 3: Prior Authorisation Document Assembly

    Use case: Assemble prior authorisation packages by extracting relevant clinical information and matching it to payer requirements.

    Workflow steps:

    1. Trigger: Prior auth request initiated in practice management system
    2. Gather data: n8n queries EHR for relevant clinical notes, lab results, imaging reports
3. LLM extraction: Local LLM extracts clinically relevant information matching the payer's criteria (prompt assembly sketched below)
    4. Document assembly: Populate prior auth template with extracted data
    5. Review queue: Present assembled package to staff for final review and submission

    This workflow reduces prior auth preparation from 30-45 minutes to 5-10 minutes of review time.
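
Step 3 carries most of the nuance: the payer's criteria have to reach the model as part of the prompt. A sketch of assembling that system prompt in a Code node (the criteria source and field names are assumptions):

    // Build the extraction prompt from the payer's criteria list.
    return $input.all().map((item) => {
      const criteria = item.json.payerCriteria; // e.g. looked up from a payer rules table
      const systemPrompt = [
        "Extract only information relevant to these prior-authorisation criteria:",
        ...criteria.map((c, i) => `${i + 1}. ${c}`),
        "Return JSON keyed by criterion number; use null where the record is silent.",
      ].join("\n");
      return { json: { ...item.json, systemPrompt } };
    });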

    HIPAA Compliance Checklist for the Stack

    Use this checklist to validate that your n8n + local LLM deployment meets HIPAA requirements:

    Administrative Safeguards

    • Designated security officer responsible for the AI automation system
    • Workforce training on AI system use and PHI handling
    • Access authorisation policy — who can create/modify workflows that process PHI
    • Incident response procedure specific to AI workflow failures or unexpected outputs
    • Regular risk assessment including the AI automation components

    Physical Safeguards

    • Server hardware in a physically secured location (locked server room, data centre)
    • Access logs for physical access to the server
    • Environmental controls (power, cooling, fire suppression)

    Technical Safeguards

    • Unique user authentication for n8n access
    • Role-based access controls in n8n (admin vs. viewer vs. editor)
    • TLS encryption for all network communication (n8n ↔ LLM, n8n ↔ EHR)
    • Audit logging enabled in n8n (all workflow executions logged)
    • LLM inference logs captured and retained per policy
    • Automatic session timeout for n8n web interface
    • Encryption at rest for the database storing workflow execution data
    • Network segmentation — AI server on isolated VLAN
    • No outbound internet access from the AI server (or restricted to package updates only)

    Operational Safeguards

    • Backup procedures for n8n workflows, LLM models, and configuration
    • Disaster recovery plan including AI automation components
    • Change management process for workflow modifications
    • Regular testing of backup restoration
    • Model versioning — track which model version produced which outputs

    Getting Started

    The fastest path to a working deployment:

1. Set up n8n via Docker on a server with a GPU (see the n8n self-hosting documentation)
    2. Install Ollama on the same server and load your base model
    3. Fine-tune the model on task-specific data using Ertas Studio
    4. Build a simple proof-of-concept workflow (clinical note summarisation is the easiest starting point)
    5. Run the compliance checklist above
    6. Demonstrate to the healthcare client with synthetic data before connecting to production systems

    Ship AI that runs on your users' devices.

    Ertas early bird pricing starts at $14.50/mo — locked in for life. Plans for builders and agencies.
