
    MCP + Fine-Tuned Local Model: Connect Claude to Your Domain-Specific AI

    Model Context Protocol (MCP) lets Claude Desktop talk to any server — including your own Ollama-hosted fine-tuned model. Here's the architecture and setup for routing Claude requests to a custom domain model.

    Ertas Team

    Model Context Protocol (MCP) is Anthropic's open standard for connecting AI assistants to external tools, data sources, and services. Claude Desktop supports MCP out of the box — and because MCP servers can be any HTTP service, you can use it to connect Claude to your own Ollama-hosted fine-tuned model.

    The result: Claude handles the conversation interface and general reasoning, while your fine-tuned model handles the domain-specific tasks it was trained for. You get domain inference at zero marginal cost, behind Claude's interface.

    What MCP Actually Is

    MCP is a protocol, not a product. It defines a standard way for an AI assistant (the client) to discover and call capabilities from external servers (MCP servers). An MCP server exposes:

    • Tools — functions the AI can call ("search_database", "classify_document", "generate_listing")
    • Resources — data sources the AI can read ("customer_records", "product_catalog")
    • Prompts — reusable prompt templates the AI can reference

    Claude Desktop reads an MCP configuration file and connects to the servers listed in it. When you ask Claude to do something, it can invoke the tools from those servers as part of its response.

    Key insight: An MCP server is just an HTTP or stdio-based service. There is nothing special about what runs inside it. Your fine-tuned model, served by Ollama, can power an MCP tool that Claude calls.
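    Under the hood, this discovery step is plain JSON-RPC 2.0 over the transport. As a rough sketch (the tool shown here is illustrative, and optional fields are omitted), the exchange when Claude Desktop asks a server what it offers looks like:

```jsonc
// Request: Claude Desktop → MCP server
{ "jsonrpc": "2.0", "id": 1, "method": "tools/list" }

// Response: MCP server → Claude Desktop
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "tools": [
      {
        "name": "generate_listing",
        "description": "Generate a property listing with the fine-tuned model",
        "inputSchema": {
          "type": "object",
          "properties": { "prompt": { "type": "string" } },
          "required": ["prompt"]
        }
      }
    ]
  }
}
```

    Claude uses the `description` and `inputSchema` fields to decide when and how to call each tool, so writing them clearly matters as much as the implementation behind them.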

    Architecture

    User asks Claude Desktop: "Generate a listing for this property..."
            ↓
    Claude Desktop recognizes this maps to the real_estate_tools MCP server
            ↓
    Claude calls the generate_listing tool (via MCP)
            ↓
    MCP server receives the request
            ↓
    MCP server calls your Ollama API (fine-tuned listing model)
            ↓
    Ollama returns the generated listing
            ↓
    MCP server returns result to Claude
            ↓
    Claude formats and presents the result to the user
    

    Claude acts as the orchestrator and conversation interface. Your fine-tuned model does the specialized domain inference.

    Building the MCP Server

    An MCP server can be built with the @modelcontextprotocol/sdk npm package or the mcp Python package.

    Minimal Node.js MCP server wrapping Ollama:

    // mcp-server.js
    import { Server } from '@modelcontextprotocol/sdk/server/index.js';
    import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
    import { CallToolRequestSchema, ListToolsRequestSchema } from '@modelcontextprotocol/sdk/types.js';
    
    const OLLAMA_BASE = 'http://localhost:11434';
    const MODEL_NAME = 'your-fine-tuned-model'; // Name you gave it in Ollama
    
    const server = new Server(
      { name: 'domain-model-server', version: '1.0.0' },
      { capabilities: { tools: {} } }
    );
    
    // Declare the tools Claude can call
    server.setRequestHandler(ListToolsRequestSchema, async () => ({
      tools: [
        {
          name: 'generate_domain_content',
          description: 'Generate domain-specific content using the fine-tuned model. Use this for [your use case description].',
          inputSchema: {
            type: 'object',
            properties: {
              prompt: {
                type: 'string',
                description: 'The specific request for the domain model'
              },
              context: {
                type: 'string',
                description: 'Additional context (property details, product info, etc.)'
              }
            },
            required: ['prompt']
          }
        }
      ]
    }));
    
    // Handle tool calls
    server.setRequestHandler(CallToolRequestSchema, async (request) => {
      if (request.params.name === 'generate_domain_content') {
        // Default to an empty object so a missing arguments field doesn't throw on destructure
        const { prompt, context } = request.params.arguments ?? {};
    
        const fullPrompt = context
          ? `Context: ${context}\n\nRequest: ${prompt}`
          : prompt;
    
        const response = await fetch(`${OLLAMA_BASE}/api/chat`, {
          method: 'POST',
          headers: { 'Content-Type': 'application/json' },
          body: JSON.stringify({
            model: MODEL_NAME,
            messages: [{ role: 'user', content: fullPrompt }],
            stream: false // one complete response instead of streamed chunks
          })
        });

        if (!response.ok) {
          throw new Error(`Ollama request failed: ${response.status} ${await response.text()}`);
        }

        const data = await response.json();
        const content = data.message.content;
    
        return {
          content: [{ type: 'text', text: content }]
        };
      }
    
      throw new Error(`Unknown tool: ${request.params.name}`);
    });
    
    // Start server over stdio (Claude Desktop connects this way)
    const transport = new StdioServerTransport();
    await server.connect(transport);
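
    The two easiest places for this server to break are prompt assembly and pulling text out of Ollama's response. A minimal sketch of those pieces as pure helpers, which you can unit-test without Ollama running (function names are illustrative, not part of the SDK):

```javascript
// Combine optional context with the user's request, as the tool handler does.
function buildPrompt(prompt, context) {
  return context ? `Context: ${context}\n\nRequest: ${prompt}` : prompt;
}

// Extract generated text from a non-streaming Ollama /api/chat response,
// failing loudly if the shape is unexpected.
function extractContent(data) {
  if (!data?.message?.content) {
    throw new Error(`Unexpected Ollama response: ${JSON.stringify(data)}`);
  }
  return data.message.content;
}
```

    Keeping these separate from the transport code means a schema change on either side (your tool's inputs or Ollama's response format) shows up as a failing test rather than a silent empty reply in Claude.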
    

    Claude Desktop Configuration

    Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

    {
      "mcpServers": {
        "domain-model": {
          "command": "node",
          "args": ["/path/to/your/mcp-server.js"],
          "env": {}
        }
      }
    }
    

    Restart Claude Desktop. The tools from your MCP server are now available to Claude.
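
    If the Ollama host or model name varies between machines, you can pass them through the config's env block rather than hard-coding them in the server (the variable names here are illustrative; the server would read them via process.env.OLLAMA_BASE and process.env.OLLAMA_MODEL):

```json
{
  "mcpServers": {
    "domain-model": {
      "command": "node",
      "args": ["/path/to/your/mcp-server.js"],
      "env": {
        "OLLAMA_BASE": "http://localhost:11434",
        "OLLAMA_MODEL": "your-fine-tuned-model"
      }
    }
  }
}
```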

    Testing the Connection

    In Claude Desktop, type a prompt that should trigger your tool:

    "Generate a listing description for a 3-bedroom craftsman bungalow in Portland with updated kitchen and original hardwood floors."

    Claude should invoke your generate_domain_content tool and return output from your fine-tuned model. Note that with the stdio transport, stdout carries the JSON-RPC protocol itself — send any debug output to stderr (console.error), which Claude Desktop captures in its MCP log files, and watch those logs to confirm the connection.

    Use Case Examples

    Real estate brokerage: MCP server wrapping a listing description model. Agent asks Claude "write a listing for [property]," Claude calls the tool, fine-tuned model generates on-brand description.

    E-commerce support: MCP server wrapping a support resolution model. Support agent asks Claude "how should I respond to this ticket about [issue]?", Claude calls the tool, returns a resolution draft.

    Content agency: MCP server wrapping a brand voice model. Copywriter asks Claude "write a LinkedIn post about [topic] for [Brand]," Claude calls the brand-specific tool.

    The general pattern: Claude for interface and reasoning, your fine-tuned model for the specialized domain task.
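
    In practice, a single MCP server usually exposes several narrowly scoped tools rather than one generic one — one per fine-tuned capability — so Claude can route each request precisely. A sketch of what the declaration might look like for the brokerage case (tool names and descriptions are illustrative):

```javascript
// Illustrative multi-tool declaration for a real-estate MCP server.
// Each tool maps to one fine-tuned capability behind the same Ollama instance.
const tools = [
  {
    name: 'generate_listing',
    description: 'Generate an on-brand listing description from property details.',
    inputSchema: {
      type: 'object',
      properties: {
        property_details: { type: 'string', description: 'Beds, baths, location, features' }
      },
      required: ['property_details']
    }
  },
  {
    name: 'classify_document',
    description: 'Classify a transaction document (disclosure, inspection, contract).',
    inputSchema: {
      type: 'object',
      properties: {
        text: { type: 'string', description: 'Document text to classify' }
      },
      required: ['text']
    }
  }
];
```

    The call handler then dispatches on request.params.name the same way the single-tool server above does, with each branch selecting the appropriate model or prompt template.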


    Ship AI that runs on your users' devices.

    Ertas early bird pricing starts at $14.50/mo — locked in for life. Plans for builders and agencies.
