    n8n + Local LLMs: Building HIPAA-Compliant Automation Workflows

    How to architect HIPAA-compliant automation workflows using self-hosted n8n and local LLM inference — with practical examples for clinical note summarisation and appointment triage.

Ertas Team

    n8n has become the workflow automation platform of choice for agencies that need self-hosted infrastructure. When combined with locally running LLMs, it creates a fully self-contained automation stack where no data leaves the organisation's network — exactly what healthcare clients require.

    This guide walks through the architecture, the specific integration patterns, and a HIPAA compliance checklist for the complete stack.

    The Architecture

    The core architecture is straightforward:

    [EHR / Clinical System] → [n8n (self-hosted)] → [Local LLM (Ollama/vLLM)] → [Output destination]
    

    Every component runs on infrastructure the healthcare organisation controls. n8n orchestrates the workflow. The local LLM handles natural language processing. No external API calls are made for AI inference.

    Component Stack

Component | Role | Deployment
--- | --- | ---
n8n | Workflow orchestration | Docker container on org's server
Ollama or vLLM | LLM inference server | Same server or dedicated GPU machine
PostgreSQL | n8n workflow data + execution logs | Local database
Redis (optional) | Queue management for high-volume workflows | Local instance
Reverse proxy | TLS termination, access control | Nginx/Caddy on same network
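
Before building workflows, it is worth a quick smoke test that each component is reachable. A minimal sketch in TypeScript (Node 18+, built-in fetch), assuming default ports: n8n's /healthz endpoint on 5678 and Ollama's model list on 11434.

    // Smoke-test the stack. Ports are n8n/Ollama defaults; adjust to your deployment.
    const endpoints = [
      { name: "n8n", url: "http://localhost:5678/healthz" },
      { name: "Ollama", url: "http://localhost:11434/api/tags" },
    ];

    for (const { name, url } of endpoints) {
      try {
        const res = await fetch(url);
        console.log(`${name}: ${res.ok ? "OK" : `HTTP ${res.status}`}`);
      } catch {
        console.log(`${name}: unreachable at ${url}`);
      }
    }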

    Hardware Requirements

    For a typical healthcare deployment serving 20-50 concurrent workflow executions:

    • CPU: 8+ cores (for n8n and supporting services)
    • RAM: 32 GB minimum (16 GB for n8n/services, 16 GB for model loading)
    • GPU: RTX 5090 (32 GB VRAM) or RTX 4090 (24 GB VRAM)
    • Storage: 500 GB SSD (models, logs, workflow data)

    Total hardware cost: $4,000-6,000 for a complete server build. Compare this to the monthly cost of cloud n8n + cloud AI APIs, and the payback period is typically 2-4 months.
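
As a rough illustration: a practice paying $500/month for hosted workflow automation plus $1,500/month in cloud LLM API usage spends $2,000/month, so a $5,000 build breaks even in two and a half months. Actual figures depend heavily on inference volume.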

    Connecting n8n to Local LLM Endpoints

    n8n connects to local LLMs through the HTTP Request node or the OpenAI-compatible node. Both Ollama and vLLM expose OpenAI-compatible API endpoints, which makes integration straightforward.

    Using Ollama

    Ollama runs on localhost:11434 by default. In n8n:

    1. Add an HTTP Request node
    2. Set the URL to http://localhost:11434/api/chat
    3. Method: POST
    4. Body (JSON):
    {
      "model": "your-fine-tuned-model",
      "messages": [
        {"role": "system", "content": "You are a clinical note summariser..."},
        {"role": "user", "content": "{{$json.clinical_note}}"}
      ],
      "stream": false
    }
    

    Alternatively, use Ollama's OpenAI-compatible endpoint at http://localhost:11434/v1/chat/completions with n8n's built-in OpenAI node — just change the base URL in the credentials configuration.
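
Before dropping the node into a workflow, it is worth confirming the endpoint responds as expected from outside n8n. A minimal sketch (TypeScript, Node 18+); the model name is a placeholder for whatever you have pulled into Ollama:

    // Mirror the n8n HTTP Request body against Ollama's chat endpoint.
    const res = await fetch("http://localhost:11434/api/chat", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model: "your-fine-tuned-model", // placeholder
        messages: [
          { role: "system", content: "You are a clinical note summariser." },
          { role: "user", content: "Patient presents with..." },
        ],
        stream: false,
      }),
    });
    const data = await res.json();
    // Non-streaming responses carry the reply under message.content.
    console.log(data.message.content);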

    Using vLLM

    vLLM provides higher throughput for concurrent requests. It exposes an OpenAI-compatible API by default:

    python -m vllm.entrypoints.openai.api_server \
      --model /path/to/your/model \
      --host 0.0.0.0 --port 8000
    

    In n8n, configure the OpenAI credentials with base URL http://your-gpu-server:8000/v1 and any string as the API key (vLLM does not require authentication by default — add it via reverse proxy).
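
With an authenticating reverse proxy in front (or vLLM's own API-key option), the call looks like any other OpenAI-style request. A hedged sketch; the hostname, key, and model path are placeholders:

    // OpenAI-compatible chat completion against vLLM.
    const res = await fetch("http://your-gpu-server:8000/v1/chat/completions", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: "Bearer local-dev-key", // checked by the proxy; bare vLLM ignores it
      },
      body: JSON.stringify({
        model: "/path/to/your/model", // by default vLLM serves under the --model value
        messages: [{ role: "user", content: "Summarise this clinical note: ..." }],
      }),
    });
    const data = await res.json();
    console.log(data.choices[0].message.content);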

    Example Workflow 1: Clinical Note Summarisation

    Use case: Physicians dictate lengthy clinical notes. The workflow summarises them into structured discharge summaries.

    Workflow steps:

    1. Trigger: Webhook receives clinical note from EHR system (or n8n polls a shared folder)
    2. Pre-process: Extract patient identifiers, separate metadata from note body
    3. LLM inference: Send note body to local LLM with system prompt specifying output format (SOAP note structure)
    4. Post-process: Parse LLM output, validate required fields are present
    5. Quality check: If confidence indicators are below threshold, flag for human review
    6. Output: Write structured summary back to EHR via API, or deposit in review queue

    n8n node chain: Webhook → Function (pre-process) → HTTP Request (LLM) → Function (validate) → IF (quality check) → [EHR API / Review Queue]
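
A sketch of the validate step as an n8n Code node (the Code node runs JavaScript; the SOAP field names and response shape are illustrative and must match your prompt's output format):

    // Validate the LLM's summary before anything reaches the EHR.
    const required = ["subjective", "objective", "assessment", "plan"];

    return $input.all().map((item) => {
      let summary;
      try {
        summary = JSON.parse(item.json.message.content); // Ollama chat response shape
      } catch {
        return { json: { needsReview: true, reason: "unparseable model output" } };
      }
      const missing = required.filter((field) => !summary[field]);
      return { json: { summary, needsReview: missing.length > 0, missing } };
    });

The IF node then branches on needsReview, sending flagged items to the review queue instead of the EHR API.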

    The entire workflow executes in 3-8 seconds depending on note length. With a fine-tuned model, summary quality is comparable to a physician spending 15 minutes on the same task.

    Example Workflow 2: Appointment Triage

    Use case: Patient messages requesting appointments are classified by urgency and routed to the appropriate department.

    Workflow steps:

    1. Trigger: n8n polls patient message queue from the practice management system
    2. LLM inference: Send patient message to local LLM with classification prompt (urgent/routine/non-clinical, department assignment)
3. Parse response: Extract classification and confidence score (see the sketch after this list)
    4. Route: Based on classification, create appointment request in the appropriate department queue
    5. Notify: Send confirmation to patient via secure messaging
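
A sketch of steps 3-4 as an n8n Code node, assuming the classification prompt instructs the model to return JSON with urgency, department, and confidence fields (all of which are illustrative):

    // Parse the triage classification and decide the route.
    const CONFIDENCE_FLOOR = 0.8; // below this, hand off to human triage

    return $input.all().map((item) => {
      const c = JSON.parse(item.json.message.content);
      let route;
      if (c.confidence < CONFIDENCE_FLOOR) route = "human-triage";
      else if (c.urgency === "urgent") route = "urgent-queue";
      else if (c.urgency === "routine") route = `dept-${c.department}`;
      else route = "front-desk"; // non-clinical messages
      return { json: { ...c, route } };
    });

A Switch node downstream routes each item on the route field.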

    Key advantage of fine-tuning: A general-purpose model makes triage errors because it does not understand the specific practice's department structure, provider specialities, or triage protocols. A model fine-tuned on 2,000-3,000 historical triage decisions from that specific practice achieves 95%+ accuracy.

Example Workflow 3: Prior Authorisation Document Assembly

    Use case: Assemble prior authorisation packages by extracting relevant clinical information and matching it to payer requirements.

    Workflow steps:

    1. Trigger: Prior auth request initiated in practice management system
    2. Gather data: n8n queries EHR for relevant clinical notes, lab results, imaging reports
3. LLM extraction: Local LLM extracts clinically relevant information matching the payer's criteria (prompt assembly sketched below)
    4. Document assembly: Populate prior auth template with extracted data
    5. Review queue: Present assembled package to staff for final review and submission

    This workflow reduces prior auth preparation from 30-45 minutes to 5-10 minutes of review time.
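
Step 3 carries most of the nuance: the payer's criteria have to reach the model as part of the prompt. A sketch of assembling that system prompt in a Code node (the criteria source and field names are assumptions):

    // Build the extraction prompt from the payer's criteria list.
    return $input.all().map((item) => {
      const criteria = item.json.payerCriteria; // e.g. looked up from a payer rules table
      const systemPrompt = [
        "Extract only information relevant to these prior-authorisation criteria:",
        ...criteria.map((c, i) => `${i + 1}. ${c}`),
        "Return JSON keyed by criterion number; use null where the record is silent.",
      ].join("\n");
      return { json: { ...item.json, systemPrompt } };
    });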

    HIPAA Compliance Checklist for the Stack

    Use this checklist to validate that your n8n + local LLM deployment meets HIPAA requirements:

    Administrative Safeguards

    • Designated security officer responsible for the AI automation system
    • Workforce training on AI system use and PHI handling
    • Access authorisation policy — who can create/modify workflows that process PHI
    • Incident response procedure specific to AI workflow failures or unexpected outputs
    • Regular risk assessment including the AI automation components

    Physical Safeguards

    • Server hardware in a physically secured location (locked server room, data centre)
    • Access logs for physical access to the server
    • Environmental controls (power, cooling, fire suppression)

    Technical Safeguards

    • Unique user authentication for n8n access
    • Role-based access controls in n8n (admin vs. viewer vs. editor)
    • TLS encryption for all network communication (n8n ↔ LLM, n8n ↔ EHR)
    • Audit logging enabled in n8n (all workflow executions logged)
    • LLM inference logs captured and retained per policy
    • Automatic session timeout for n8n web interface
    • Encryption at rest for the database storing workflow execution data
    • Network segmentation — AI server on isolated VLAN
    • No outbound internet access from the AI server (or restricted to package updates only)

    Operational Safeguards

    • Backup procedures for n8n workflows, LLM models, and configuration
    • Disaster recovery plan including AI automation components
    • Change management process for workflow modifications
    • Regular testing of backup restoration
    • Model versioning — track which model version produced which outputs

    Getting Started

    The fastest path to a working deployment:

1. Set up n8n via Docker on a server with a GPU (see the n8n self-hosting documentation)
    2. Install Ollama on the same server and load your base model
    3. Fine-tune the model on task-specific data using Ertas Studio
    4. Build a simple proof-of-concept workflow (clinical note summarisation is the easiest starting point)
    5. Run the compliance checklist above
    6. Demonstrate to the healthcare client with synthetic data before connecting to production systems

    Ship AI that runs on your users' devices.

    Ertas early bird pricing starts at $14.50/mo — locked in for life. Plans for builders and agencies.
