
MCP + Fine-Tuned Local Model: Connect Claude to Your Domain-Specific AI
Model Context Protocol (MCP) lets Claude Desktop talk to any server — including your own Ollama-hosted fine-tuned model. Here's the architecture and setup for routing Claude requests to a custom domain model.
Model Context Protocol (MCP) is Anthropic's open standard for connecting AI assistants to external tools, data sources, and services. Claude Desktop supports MCP out of the box — and because MCP servers can be any HTTP service, you can use it to connect Claude to your own Ollama-hosted fine-tuned model.
The result: Claude handles the conversation interface and general reasoning, while your fine-tuned model handles the domain-specific tasks it was trained for. Zero-cost domain inference, Claude's interface.
What MCP Actually Is
MCP is a protocol, not a product. It defines a standard way for an AI assistant (the client) to discover and call capabilities from external servers (MCP servers). An MCP server exposes:
- Tools — functions the AI can call ("search_database", "classify_document", "generate_listing")
- Resources — data sources the AI can read ("customer_records", "product_catalog")
- Prompts — reusable prompt templates the AI can reference
Claude Desktop reads an MCP configuration file and connects to the servers listed in it. When you ask Claude to do something, it can invoke the tools from those servers as part of its response.
Key insight: An MCP server is just an HTTP or stdio-based service. There is nothing special about what runs inside it. Your fine-tuned model, served by Ollama, can power an MCP tool that Claude calls.
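Concretely, the wire format is plain JSON-RPC 2.0. A sketch of the discovery exchange, showing message shapes only (the `generate_listing` tool here is a placeholder, not part of the spec):

```javascript
// Sketch of the JSON-RPC 2.0 messages behind MCP tool discovery.
// Message shapes follow the MCP spec; the tool itself is a placeholder.
const listToolsRequest = {
  jsonrpc: '2.0',
  id: 1,
  method: 'tools/list',
};

const listToolsResponse = {
  jsonrpc: '2.0',
  id: 1,
  result: {
    tools: [
      {
        name: 'generate_listing',
        description: 'Generate a property listing with the fine-tuned model',
        inputSchema: { type: 'object', properties: { prompt: { type: 'string' } } },
      },
    ],
  },
};

// The stdio transport frames each message as a single JSON line.
const wire = JSON.stringify(listToolsRequest);
console.log(wire);
```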
Architecture
```
User asks Claude Desktop: "Generate a listing for this property..."
        ↓
Claude Desktop recognizes this maps to the real_estate_tools MCP server
        ↓
Claude calls the generate_listing tool (via MCP)
        ↓
MCP server receives the request
        ↓
MCP server calls your Ollama API (fine-tuned listing model)
        ↓
Ollama returns the generated listing
        ↓
MCP server returns result to Claude
        ↓
Claude formats and presents the result to the user
```
Claude acts as the orchestrator and conversation interface. Your fine-tuned model does the specialized domain inference.
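The flow above boils down to a tiny dispatch step inside the MCP server. A minimal sketch, with hypothetical names (`routeToolCall`, and a stubbed `callOllama` standing in for the real HTTP request):

```javascript
// Orchestration sketch of the request flow.
// routeToolCall and callOllama are names introduced here, not SDK APIs.
async function callOllama(model, prompt) {
  // Stub standing in for the HTTP call to Ollama's /api/chat endpoint.
  return `[${model}] response for: ${prompt}`;
}

async function routeToolCall(toolName, args) {
  // Claude picks the tool; the MCP server maps it to a model call.
  if (toolName === 'generate_listing') {
    return callOllama('your-fine-tuned-model', args.prompt);
  }
  throw new Error(`Unknown tool: ${toolName}`);
}

routeToolCall('generate_listing', { prompt: '3-bed bungalow' })
  .then((text) => console.log(text));
```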
Building the MCP Server
An MCP server can be built with the @modelcontextprotocol/sdk npm package or the mcp Python package.
Minimal Node.js MCP server wrapping Ollama:
```javascript
// mcp-server.js
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { CallToolRequestSchema, ListToolsRequestSchema } from '@modelcontextprotocol/sdk/types.js';

const OLLAMA_BASE = 'http://localhost:11434';
const MODEL_NAME = 'your-fine-tuned-model'; // Name you gave it in Ollama

const server = new Server(
  { name: 'domain-model-server', version: '1.0.0' },
  { capabilities: { tools: {} } }
);

// Declare the tools Claude can call
server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [
    {
      name: 'generate_domain_content',
      description: 'Generate domain-specific content using the fine-tuned model. Use this for [your use case description].',
      inputSchema: {
        type: 'object',
        properties: {
          prompt: {
            type: 'string',
            description: 'The specific request for the domain model'
          },
          context: {
            type: 'string',
            description: 'Additional context (property details, product info, etc.)'
          }
        },
        required: ['prompt']
      }
    }
  ]
}));

// Handle tool calls
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  if (request.params.name === 'generate_domain_content') {
    const { prompt, context } = request.params.arguments;
    const fullPrompt = context
      ? `Context: ${context}\n\nRequest: ${prompt}`
      : prompt;

    const response = await fetch(`${OLLAMA_BASE}/api/chat`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        model: MODEL_NAME,
        messages: [{ role: 'user', content: fullPrompt }],
        stream: false
      })
    });

    // Surface Ollama failures instead of crashing on a missing field
    if (!response.ok) {
      throw new Error(`Ollama request failed: ${response.status} ${await response.text()}`);
    }

    const data = await response.json();
    return {
      content: [{ type: 'text', text: data.message.content }]
    };
  }

  throw new Error(`Unknown tool: ${request.params.name}`);
});

// Start server over stdio (Claude Desktop connects this way)
const transport = new StdioServerTransport();
await server.connect(transport);
```
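Before connecting Claude Desktop, it's worth sanity-checking the request body the handler sends to Ollama's /api/chat endpoint. This sketch factors out the prompt assembly from the handler above (`buildChatBody` is a name introduced here, not part of the SDK):

```javascript
// Build the same /api/chat body the tool handler sends to Ollama.
// buildChatBody is a helper introduced for this sketch.
function buildChatBody(model, prompt, context) {
  const fullPrompt = context
    ? `Context: ${context}\n\nRequest: ${prompt}`
    : prompt;
  return {
    model,
    messages: [{ role: 'user', content: fullPrompt }],
    stream: false, // one JSON response instead of a token stream
  };
}

const body = buildChatBody('your-fine-tuned-model', 'Write a listing', '3 bed, Portland');
console.log(JSON.stringify(body, null, 2));
```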
Claude Desktop Configuration
Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):
```json
{
  "mcpServers": {
    "domain-model": {
      "command": "node",
      "args": ["/path/to/your/mcp-server.js"],
      "env": {}
    }
  }
}
```
Restart Claude Desktop. The tools from your MCP server are now available to Claude.
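A stray comma in claude_desktop_config.json makes Claude Desktop silently skip the server, so a structural check before restarting saves a debugging round-trip. A sketch with the config embedded as a string (in practice you'd read the real file with fs.readFileSync):

```javascript
// Parse the config and confirm the expected mcpServers entry exists.
// Embedded as a string here; point fs.readFileSync at the real path in practice.
const raw = `{
  "mcpServers": {
    "domain-model": {
      "command": "node",
      "args": ["/path/to/your/mcp-server.js"],
      "env": {}
    }
  }
}`;

const config = JSON.parse(raw); // throws on any JSON syntax error
const entry = config.mcpServers && config.mcpServers['domain-model'];
if (!entry || entry.command !== 'node') {
  throw new Error('domain-model server entry missing or malformed');
}
console.log('config OK:', Object.keys(config.mcpServers));
```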
Testing the Connection
In Claude Desktop, type a prompt that should trigger your tool:
"Generate a listing description for a 3-bedroom craftsman bungalow in Portland with updated kitchen and original hardwood floors."
Claude should invoke your generate_domain_content tool and return output from your fine-tuned model. You can watch the MCP server's stderr for debug output to confirm the connection — stdout is reserved for the JSON-RPC protocol stream itself.
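Because the stdio transport uses stdout for the JSON-RPC stream, any debug logging inside the server must go to stderr; writing to stdout corrupts the protocol. A small helper you might add to the server (`logDebug` and `formatDebug` are names introduced here):

```javascript
// Debug logging for a stdio MCP server: stderr only, since stdout
// carries the JSON-RPC messages Claude Desktop is reading.
function formatDebug(event, details) {
  return `[mcp-debug] ${event} ${JSON.stringify(details)}`;
}

function logDebug(event, details) {
  // console.error writes to stderr, leaving the protocol stream untouched.
  console.error(formatDebug(event, details));
}

logDebug('tool_call', { tool: 'generate_domain_content', promptLength: 42 });
```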
Use Case Examples
Real estate brokerage: MCP server wrapping a listing description model. Agent asks Claude "write a listing for [property]," Claude calls the tool, fine-tuned model generates on-brand description.
E-commerce support: MCP server wrapping a support resolution model. Support agent asks Claude "how should I respond to this ticket about [issue]?", Claude calls the tool, returns a resolution draft.
Content agency: MCP server wrapping a brand voice model. Copywriter asks Claude "write a LinkedIn post about [topic] for [Brand]," Claude calls the brand-specific tool.
The general pattern: Claude for interface and reasoning, your fine-tuned model for the specialized domain task.
Ship AI that runs on your users' devices.
Ertas early bird pricing starts at $14.50/mo — locked in for life. Plans for builders and agencies.
Further Reading
- Claude Desktop Local Model Setup — Step-by-step Claude Desktop + Ollama configuration
- MCP Server Zero API Costs — The cost case for MCP + local models
- Cursor MCP Fine-Tuned Model — The same pattern in Cursor IDE
- MCP Tools for Agency Client Workflows — Agency delivery via MCP