
Claude Desktop + Local Fine-Tuned Model: Complete Setup Guide
Run your fine-tuned model locally, connect it to Claude Desktop via MCP, and get a zero-cost domain AI assistant inside the Claude interface. Full step-by-step setup.
Claude Desktop is a powerful AI assistant. Your fine-tuned model is specialized for your domain. You can connect them via MCP (Model Context Protocol) to get Claude's interface with your model's domain expertise — and pay zero API costs for the specialized inference.
This guide walks through the complete setup: Ollama deployment, MCP server, Claude Desktop configuration, and testing.
Prerequisites
- Claude Desktop installed (download at claude.ai/download)
- A fine-tuned GGUF model from Ertas
- Node.js 18+ (for the MCP server)
- 8GB+ RAM on your machine (for running Ollama with a 7B model)
Step 1: Install and Configure Ollama
# macOS / Linux
curl -fsSL https://ollama.ai/install.sh | sh
# Windows: download installer from ollama.ai
Import your fine-tuned GGUF model:
# Create a Modelfile — this defines how Ollama will run your model
cat > Modelfile << 'EOF'
FROM /path/to/your-model.gguf
# System prompt baked into the model
SYSTEM """
You are a specialized AI assistant for [your domain]. [Add a brief description of what your model does and how it should behave.]
"""
# Optional: tune generation parameters
PARAMETER temperature 0.3
PARAMETER num_ctx 4096
EOF
# Create the model with a name you will use to call it
ollama create my-domain-model -f Modelfile
Verify the model loads correctly:
ollama run my-domain-model "Test prompt — describe what you do"
You should see a response from your fine-tuned model. If it returns an error about loading the GGUF, verify the file path in your Modelfile.
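As an extra sanity check (a sketch assuming Ollama's default port 11434 and the `my-domain-model` name used above), you can confirm the Ollama server is up and the model was actually registered:

```shell
# Check that the Ollama server is reachable and the model was registered.
# Assumes the default port (11434) and the model name from "ollama create" above.
curl -s http://localhost:11434/api/tags | grep -q my-domain-model \
  && echo "model registered" \
  || echo "model missing — rerun: ollama create my-domain-model -f Modelfile"
```

`ollama list` shows the same information in a table if you prefer a human-readable view.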
Step 2: Install the MCP SDK
Create a project directory for your MCP server:
mkdir claude-domain-mcp
cd claude-domain-mcp
npm init -y
npm install @modelcontextprotocol/sdk
Step 3: Write the MCP Server
Create server.mjs:
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import {
CallToolRequestSchema,
ListToolsRequestSchema,
} from '@modelcontextprotocol/sdk/types.js';
const OLLAMA_URL = 'http://localhost:11434/api/chat';
const MODEL = 'my-domain-model'; // Must match what you named it in Ollama
const server = new Server(
{ name: 'domain-assistant', version: '1.0.0' },
{ capabilities: { tools: {} } }
);
// Define tools that Claude can use
server.setRequestHandler(ListToolsRequestSchema, async () => {
return {
tools: [
{
name: 'ask_domain_model',
description:
'Use this for [describe your domain task precisely — e.g., "generating real estate listing descriptions", "classifying support tickets", "writing on-brand copy for ClientX"]. This model has specialized knowledge of [domain specifics].',
inputSchema: {
type: 'object',
properties: {
request: {
type: 'string',
description: 'The specific task or question for the domain model',
},
context: {
type: 'string',
description: 'Relevant context such as property details, product specs, customer info, etc.',
},
},
required: ['request'],
},
},
],
};
});
// Handle tool invocations from Claude
server.setRequestHandler(CallToolRequestSchema, async (request) => {
if (request.params.name !== 'ask_domain_model') {
throw new Error(`Unknown tool: ${request.params.name}`);
}
const { request: userRequest, context } = request.params.arguments ?? {};
const userMessage = context
? `${context}\n\n${userRequest}`
: userRequest;
let response;
try {
const res = await fetch(OLLAMA_URL, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
model: MODEL,
messages: [{ role: 'user', content: userMessage }],
stream: false,
}),
});
if (!res.ok) {
throw new Error(`Ollama returned ${res.status}: ${await res.text()}`);
}
const data = await res.json();
response = data.message?.content ?? 'No response from model';
} catch (err) {
return {
content: [
{
type: 'text',
text: `Error calling domain model: ${err.message}. Is Ollama running?`,
},
],
isError: true,
};
}
return {
content: [{ type: 'text', text: response }],
};
});
const transport = new StdioServerTransport();
await server.connect(transport);
Test the server works standalone:
echo '{"jsonrpc":"2.0","method":"tools/list","id":1}' | node server.mjs
You should see a JSON response listing your tools.
Step 4: Configure Claude Desktop
macOS: Edit ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: Edit %APPDATA%\Claude\claude_desktop_config.json
{
"mcpServers": {
"domain-assistant": {
"command": "node",
"args": ["/absolute/path/to/claude-domain-mcp/server.mjs"],
"env": {
"NODE_ENV": "production"
}
}
}
}
Important: Use the absolute path to server.mjs. Relative paths do not work reliably.
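A quick way to catch JSON syntax errors before relaunching (a sketch assuming python3 is installed; substitute the Windows path as needed):

```shell
# Parse the config file; python3 -m json.tool exits non-zero on invalid JSON
# (e.g. trailing commas). macOS path shown — use the %APPDATA%\Claude path on Windows.
python3 -m json.tool "$HOME/Library/Application Support/Claude/claude_desktop_config.json" \
  > /dev/null && echo "config OK"
```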
Quit Claude Desktop completely (Command+Q / right-click taskbar → Quit on Windows) and relaunch it.
Step 5: Verify the Connection
In Claude Desktop, look for the tools icon (hammer icon in the input bar). Click it to see available tools. You should see ask_domain_model listed under "domain-assistant."
Test with a prompt that should trigger your tool:
"Using the domain model, generate [your expected output type] for [test input]."
Claude should call your tool and return the response. If it does not call the tool, try being more explicit: "Please use the domain assistant tool to..."
Troubleshooting
Tool not appearing in Claude Desktop:
- Verify the config file JSON is valid (no trailing commas, correct syntax)
- Check the path to server.mjs is absolute and correct
- Look at Claude Desktop logs: macOS: ~/Library/Logs/Claude/; Windows: %APPDATA%\Claude\logs\
Tool called but returns error:
- Run ollama serve manually in a terminal and confirm ollama run my-domain-model "test" works
- Verify the model name in server.mjs matches exactly what you named it in Ollama
- Check Ollama is listening on port 11434:
curl http://localhost:11434/api/tags
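To isolate Ollama from the MCP layer, you can replay the exact request server.mjs sends (same endpoint and payload shape as the code above):

```shell
# POST the same payload server.mjs builds; a JSON reply containing a "message"
# field means Ollama is fine and the problem is in the MCP server or config.
curl -s http://localhost:11434/api/chat \
  -H 'Content-Type: application/json' \
  -d '{"model":"my-domain-model","messages":[{"role":"user","content":"test"}],"stream":false}'
```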
Model gives wrong or generic responses:
- Verify the correct GGUF file was loaded (the fine-tuned version, not the base model)
- Check that your Modelfile system prompt is appropriate
- Try different temperature settings in the Modelfile
Ship AI that runs on your users' devices.
Ertas early bird pricing starts at $14.50/mo — locked in for life. Plans for builders and agencies.
Further Reading
- MCP + Fine-Tuned Local Model — The architecture overview
- MCP Server Zero API Costs — The cost case for this setup
- OpenAI-Compatible Local API — Using Ollama's OpenAI API interface
- Cursor MCP Fine-Tuned Model — The same pattern in Cursor IDE