Building a Production MCP Server with FastMCP: A Step-by-Step Build Log

The bottom line: After building and shipping 6 MCP servers across different production use cases, this walkthrough distills what actually matters — tool design patterns, transport selection, error handling, resource templates, and the FastMCP patterns that survive in production. You’ll build a working MCP server from scratch in under 30 minutes, then harden it for real use.

Why FastMCP

The Model Context Protocol (MCP) defines how AI agents discover and call external tools. The raw protocol spec is clean but verbose — you write JSON-RPC handlers, manage transport state, and handle ping/heartbeat yourself. [1]
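
To make that boilerplate concrete, here is a rough, hand-rolled sketch of the dispatch a server would otherwise write for just two methods. It is an illustration under simplifying assumptions, not a complete MCP implementation: the initialize handshake, notifications, and transport framing are all omitted, and `TOOLS`, `_reply`, and `handle_message` are hypothetical names. The request and result shapes follow the tools/list and tools/call messages in the specification. [1]

```python
import json

# Hypothetical in-memory tool registry: name -> metadata plus a handler.
TOOLS = {
    "get_temperature": {
        "description": "Get the current temperature for a city.",
        "inputSchema": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
        "handler": lambda args: f"{args['city'].title()}: 22°C",  # mock value
    }
}


def _reply(request_id, *, result=None, error=None) -> str:
    """Serialize a JSON-RPC 2.0 response carrying either a result or an error."""
    message = {"jsonrpc": "2.0", "id": request_id}
    if error is not None:
        message["error"] = error
    else:
        message["result"] = result
    return json.dumps(message)


def handle_message(raw: str) -> str:
    """Dispatch one JSON-RPC request string and return the response string."""
    request = json.loads(raw)
    request_id = request.get("id")
    method = request.get("method")
    params = request.get("params") or {}

    if method == "tools/list":
        tools = [
            {"name": name, "description": t["description"], "inputSchema": t["inputSchema"]}
            for name, t in TOOLS.items()
        ]
        return _reply(request_id, result={"tools": tools})

    if method == "tools/call":
        tool = TOOLS.get(params.get("name"))
        if tool is None:
            return _reply(request_id, error={"code": -32602, "message": "Unknown tool"})
        text = tool["handler"](params.get("arguments") or {})
        content = [{"type": "text", "text": text}]
        return _reply(request_id, result={"content": content, "isError": False})

    # -32601 is the JSON-RPC "Method not found" code.
    return _reply(request_id, error={"code": -32601, "message": "Method not found"})
```

Every new tool means another registry entry and a hand-written schema, which is the repetition a decorator-based framework removes.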

FastMCP wraps all that boilerplate. Same protocol underneath, but you write Python functions and decorators instead of JSON-RPC message handlers. It handles:

JSON-RPC message parsing and response routing
Stdio and SSE transport negotiation
Heartbeat/ping keepalive
Tool discovery (list_tools) via introspection
Error serialization

The result: a working MCP server is ~30 lines of Python. Production-ready takes more, but the foundation costs nearly nothing.

Step 1: Project Setup

mkdir mcp-weather-server && cd mcp-weather-server
python3 -m venv .venv && source .venv/bin/activate
pip install "fastmcp[cli]>=0.5" httpx

FastMCP ships with a CLI for quick prototyping. Verify it works:

fastmcp --version
> fastmcp 0.5.2

The [cli] extras include a fastmcp dev command that watches your file and re-exposes tools on every save — useful during development. [2]

Step 2: Your First Tool

Create server.py:

from fastmcp import FastMCP

mcp = FastMCP("weather-server")

@mcp.tool()
def get_temperature(city: str, unit: str = "celsius") -> str:
    """Get the current temperature for a city."""
    # Mock data in degrees Celsius; in production, call a real weather API here
    cities = {"tokyo": 22, "london": 15, "new york": 25}
    temp = cities.get(city.lower())
    if temp is None:
        return f"City '{city}' not found in database"

    # Reject unknown units instead of silently falling back to Celsius
    unit = unit.lower()
    if unit not in ("celsius", "fahrenheit"):
        return f"Unknown unit '{unit}': use 'celsius' or 'fahrenheit'"
    if unit == "fahrenheit":
        temp = round(temp * 9 / 5 + 32)
        return f"{city.title()}: {temp}°F"
    return f"{city.title()}: {temp}°C"

if __name__ == "__main__":
    mcp.run()

Run it:

python3 server.py

That’s it. The server starts on the stdio transport by default (what Claude Desktop uses). FastMCP introspects the function signature (parameter names, type hints, and docstring) and advertises the tool automatically in response to MCP tools/list requests.
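
As a rough illustration of what signature introspection involves, the sketch below derives a tool description from a plain function using only the standard library. It approximates the idea and is not FastMCP's actual implementation; `describe_tool` and `JSON_TYPES` are hypothetical names, and the type mapping covers only a few primitives.

```python
import inspect
from typing import get_type_hints

# Minimal Python-to-JSON-Schema type mapping (a simplification).
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_tool(func) -> dict:
    """Build a tools/list-style description from a function's signature."""
    hints = get_type_hints(func)
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch.
        properties[name] = {"type": JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
        else:
            properties[name]["default"] = param.default
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "inputSchema": {"type": "object", "properties": properties, "required": required},
    }


def get_temperature(city: str, unit: str = "celsius") -> str:
    """Get the current temperature for a city."""
    return f"{city.title()}: 22°C"  # mock value


schema = describe_tool(get_temperature)
```

Here `schema["inputSchema"]["required"]` contains only `city`, because `unit` has a default. This is why precise type hints and docstrings matter: they are the only description of the tool the agent ever sees.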

Step 3: Test with the MCP Inspector

The MCP ecosystem includes a web-based inspector. Install and run it:

npx @modelcontextprotocol/inspector

But the faster path during development is FastMCP’s CLI:

fastmcp run server.py

This prints the available tools and lets you call them interactively. For CI/CD pipelines, write programmatic tests:

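import asyncio

from server import mcp

async def test_get_temperature():
    result = await mcp.call_tool("get_temperature", {"city": "Tokyo"})
    assert "22°C" in result.content[0].text
    print("PASS: get_temperature returns expected value")

if __name__ == "__main__":
    # Run the coroutine directly; under pytest, an asyncio plugin
    # such as pytest-asyncio is needed to collect async tests.
    asyncio.run(test_get_temperature())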

The mcp.call_tool() method bypasses transport entirely — it calls the handler function directly, which makes unit tests fast and deterministic. [3]
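
Where a tool's logic is plain Python, it can also be factored into a helper and checked without any MCP machinery at all. The sketch below shows one way to do that with a table of cases; `format_temperature` is a hypothetical helper introduced for illustration, not part of the server above.

```python
def format_temperature(city: str, celsius: float, unit: str = "celsius") -> str:
    """Format a Celsius reading in the requested unit (hypothetical helper)."""
    if unit == "fahrenheit":
        return f"{city.title()}: {round(celsius * 9 / 5 + 32)}°F"
    if unit == "celsius":
        return f"{city.title()}: {celsius}°C"
    raise ValueError(f"unknown unit: {unit!r}")


def test_format_temperature():
    # (arguments, expected output) pairs using the mock readings from server.py
    cases = [
        (("tokyo", 22), "Tokyo: 22°C"),
        (("tokyo", 22, "fahrenheit"), "Tokyo: 72°F"),
        (("london", 15, "fahrenheit"), "London: 59°F"),
        (("new york", 25, "fahrenheit"), "New York: 77°F"),
    ]
    for args, expected in cases:
        assert format_temperature(*args) == expected


test_format_temperature()
```

Keeping the conversion separate from the decorated function means the bulk of the test suite stays synchronous and needs no event loop.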

Step 4: Add Resources (Not Just Tools)

MCP servers expose three primitives: tools (actions that may have side effects), resources (readable data addressed by URI), and prompts (reusable message templates). Most tutorials stop at tools, but resources unlock the real MCP pattern: letting the agent read context without calling a tool.

@mcp.resource("weather://{city}/current")
def current_weather(city: str) -> str:
    """Get current weather data as a structured resource."""
    temps = {"tokyo": 22, "london": 15, "new york": 25}
    temp = temps.get(city.lower())
    if temp is None:
        return f"No data for {city}"
    return f"""# Current Weather: {city.title()}
Temperature: {temp}°C
Condition: Clear
Humidity: 65%
Wind: 12 km/h"""

The agent sees this as weather://tokyo/current — it can list template URIs and read the content directly. Resources are preferable to tools when the operation is purely read-only and the data is cacheable. [1]
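
To show how a template such as weather://{city}/current can be resolved against a concrete URI, here is a small standard-library sketch. It is only an approximation of the matching idea, not FastMCP's routing code; `compile_template` and `match_uri` are hypothetical names, and each placeholder is assumed to match a single path segment.

```python
import re
from urllib.parse import unquote


def compile_template(template: str) -> re.Pattern:
    """Turn 'weather://{city}/current' into a regex with one named group per placeholder."""
    # re.split with a capturing group yields: literal, name, literal, name, ...
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)          # literal text
        else:
            pattern += f"(?P<{part}>[^/]+)"     # placeholder: one path segment
    return re.compile(f"^{pattern}$")


def match_uri(template: str, uri: str):
    """Return the extracted parameters, or None if the URI does not fit the template."""
    match = compile_template(template).match(uri)
    if match is None:
        return None
    return {name: unquote(value) for name, value in match.groupdict().items()}


params = match_uri("weather://{city}/current", "weather://new%20york/current")
```

Percent-decoding matters for keys like "new york": the space has to be encoded in the URI, and the handler should receive the decoded value.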

Step 5: Error Handling That Survives Production

By default, FastMCP turns unhandled exceptions into error responses. That’s fine for demos, but agents interpret error messages literally and may retry with nonsensical parameters. You need structured error patterns:

# Continues server.py; mcp is the FastMCP instance created in Step 2.

class InvalidParamsError(ValueError):
    """Raised when a tool argument fails validation (defined locally)."""

@mcp.tool()
def get_forecast(city: str, days: int = 3) -> dict:
    """Get weather forecast for a city."""
    if days < 1 or days > 14:
        raise InvalidParamsError("days must be between 1 and 14")

    if not city or len(city.strip()) < 2:
        raise InvalidParamsError("city must be at least 2 characters")

    # ... forecast logic

Production pattern: Define a custom error class for each category — InvalidParamsError, NotFoundError, RateLimitError — so the agent can branch its behavior based on error type rather than guessing from free-text error messages. [4]
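
One possible shape for such a taxonomy is sketched below in plain Python. The base class, the `error_type` and `retryable` fields, and the `run_safely` wrapper are all hypothetical choices made for illustration; they are not part of FastMCP or the MCP specification.

```python
class ToolFailure(Exception):
    """Base class for classified tool errors (hypothetical)."""

    error_type = "internal_error"
    retryable = False

    def to_payload(self) -> dict:
        """Machine-readable form the agent can branch on."""
        return {
            "error": {
                "type": self.error_type,
                "message": str(self),
                "retryable": self.retryable,
            }
        }


class InvalidParamsError(ToolFailure):
    error_type = "invalid_params"


class NotFoundError(ToolFailure):
    error_type = "not_found"


class RateLimitError(ToolFailure):
    error_type = "rate_limited"
    retryable = True  # the only category here where retrying can help


def run_safely(func, **kwargs) -> dict:
    """Call a tool function and convert classified failures into structured payloads."""
    try:
        return {"ok": True, "data": func(**kwargs)}
    except ToolFailure as exc:
        return {"ok": False, **exc.to_payload()}


def get_forecast(city: str, days: int = 3) -> dict:
    """Mock forecast tool used only to exercise the error classes."""
    if days < 1 or days > 14:
        raise InvalidParamsError("days must be between 1 and 14")
    if city.lower() != "tokyo":
        raise NotFoundError(f"no forecast data for {city}")
    return {"city": city, "days": days}


outcome = run_safely(get_forecast, city="Tokyo", days=30)
```

The `retryable` flag gives the agent an explicit signal: retrying an invalid-parameter error with the same arguments is pointless, while a rate-limit error may succeed after a pause.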

Step 6: Connect to Claude Desktop

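Claude Desktop discovers MCP servers through its claude_desktop_config.json file, located at ~/Library/Application Support/Claude/claude_desktop_config.json on macOS and %APPDATA%\Claude\claude_desktop_config.json on Windows: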

{
  "mcpServers": {
    "weather-server": {
      "command": "python3",
      "args": ["/absolute/path/to/server.py"],
      "env": {
        "WEATHER_API_KEY": "your-key-here"
      }
    }
  }
}

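Key detail: use absolute paths. Claude Desktop runs each server from its own working directory, and relative paths will silently fail. The same applies to the interpreter: if your dependencies live in a virtual environment, point command at that environment’s Python rather than a bare python3.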
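
A small helper can generate the entry with absolute paths so they are never typed by hand. This is a sketch under the assumption that it is run with the same interpreter the server should use; `build_server_entry` is a hypothetical name, and the output still has to be merged into the config file manually.

```python
import json
import sys
from pathlib import Path


def build_server_entry(script: str, name: str = "weather-server") -> dict:
    """Return an mcpServers entry using the current interpreter and an absolute script path."""
    return {
        "mcpServers": {
            name: {
                "command": sys.executable,              # interpreter running this script
                "args": [str(Path(script).resolve())],  # absolute path to the server file
            }
        }
    }


entry = build_server_entry("server.py")
print(json.dumps(entry, indent=2))
```

Running it from inside the activated virtual environment makes `command` point at the environment's own Python, so the server starts with its dependencies available.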

The env field is how you inject API keys without hardcoding them in your source files — Claude Desktop passes them as environment variables on process spawn. [5]
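
On the server side, the key can be read once at startup and the process can fail fast if it is missing. The sketch below assumes the WEATHER_API_KEY variable from the config above; `load_api_key` is a hypothetical helper name.

```python
import os


def load_api_key(env=None) -> str:
    """Read WEATHER_API_KEY from the environment, raising early if it is absent."""
    env = os.environ if env is None else env
    key = env.get("WEATHER_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "WEATHER_API_KEY is not set; add it to the env block of the server config"
        )
    return key
```

Failing at startup is usually easier to diagnose than a tool call that errors minutes later, when the agent first tries to reach the upstream API.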

Step 7: Production Hardening Checklist

Before you ship an MCP server to production, run through this checklist:

Check	What to verify	Why
Timeout protection	Every tool call has an upper time limit (e.g. 30 s)	Agent loops can hang on slow API calls
Input sanitization	Strip/validate all string inputs	Agents may pass injection payloads through tool parameters
Rate limiting	100 req/min per client	Prevent runaway agent loops from exhausting upstream API quotas
Error classification	Every error has a type, not just text	Agents branch on error types, not messages
Idempotency	Write tools are safe to retry	Agents retry on transient failures by default
Logging	Structured logs per tool call	Debugging agent behavior requires tracing every tool invocation
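
Two of these rows, rate limiting and timeout protection, can be sketched with the standard library alone. The code below is one possible approach and not a FastMCP feature: `SlidingWindowLimiter` and `call_with_timeout` are hypothetical names, the limits mirror the figures in the table, and how a client is identified is left to the caller.

```python
import asyncio
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_calls per client within any window_s-second window."""

    def __init__(self, max_calls: int = 100, window_s: float = 60.0, clock=time.monotonic):
        self._max_calls = max_calls
        self._window_s = window_s
        self._clock = clock                  # injectable so tests can control time
        self._calls = defaultdict(deque)     # client_id -> timestamps of recent calls

    def allow(self, client_id: str) -> bool:
        now = self._clock()
        calls = self._calls[client_id]
        # Drop timestamps that have aged out of the window.
        while calls and now - calls[0] >= self._window_s:
            calls.popleft()
        if len(calls) >= self._max_calls:
            return False
        calls.append(now)
        return True


async def call_with_timeout(tool, *args, timeout_s: float = 30.0, **kwargs):
    """Await an async tool function, raising TimeoutError if it runs too long."""
    try:
        return await asyncio.wait_for(tool(*args, **kwargs), timeout=timeout_s)
    except asyncio.TimeoutError:
        raise TimeoutError(f"tool call exceeded {timeout_s}s") from None
```

A rejected call is best reported as a distinct rate-limit error so the agent backs off instead of retrying immediately.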

The NSA/FBI/CISA joint cybersecurity advisory on MCP (June 2026) specifically flags dynamic tool invocation and implicit trust as risks requiring mitigation. [4] Your MCP server should validate every input as if it comes from an untrusted source — because in practice, agent tool calls are generated by LLMs that can be prompted into producing unexpected parameter values.
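
As a minimal sketch of treating a parameter as untrusted, the function below normalizes and validates a city name before it reaches any upstream call. The allowed character set and the 64-character cap are assumptions chosen for this example, and `clean_city` is a hypothetical name; real rules depend on what the upstream API accepts.

```python
import unicodedata

# Punctuation permitted in addition to letters (an assumption for this sketch).
_ALLOWED_PUNCTUATION = " .'-"


def clean_city(raw: object) -> str:
    """Return a normalized city name or raise ValueError for anything suspicious."""
    if not isinstance(raw, str):
        raise ValueError("city must be a string")
    # NFKC folds look-alike Unicode forms into a canonical representation.
    value = unicodedata.normalize("NFKC", raw).strip()
    if not 2 <= len(value) <= 64:
        raise ValueError("city must be 2 to 64 characters")
    if not all(ch.isalpha() or ch in _ALLOWED_PUNCTUATION for ch in value):
        raise ValueError("city contains unsupported characters")
    return value
```

An allow-list like this rejects shell metacharacters, path separators, and control characters by construction, which is simpler to reason about than trying to enumerate every dangerous input.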

Step 8: Deploy as a Remote Server

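Stdio transport is fine for local development, but production deployments need remote MCP servers. FastMCP supports SSE (Server-Sent Events) transport, shown below. Be aware that the current specification [1] defines Streamable HTTP as the replacement for the older HTTP+SSE transport, so confirm which transport your clients expect: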

if __name__ == "__main__":
    mcp.run(transport="sse")
    # Or specify a port: mcp.run(transport="sse", port=8000)

Deploy behind a reverse proxy (Caddy, Nginx) with TLS termination. The agent client connects to https://your-server.com/sse and communicates over HTTP POST for tool calls.
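
For readers who have not worked with SSE, the sketch below shows what a single event looks like on the wire: an event line, a data line, and a blank line as terminator. It illustrates the general Server-Sent Events framing only; `format_sse_event` is a hypothetical helper, and the exact events a given FastMCP version emits are not shown here.

```python
import json


def format_sse_event(data: dict, event: str = "message") -> str:
    """Frame a JSON payload as one Server-Sent Event."""
    payload = json.dumps(data)  # compact JSON, so it fits on a single data line
    return f"event: {event}\ndata: {payload}\n\n"


frame = format_sse_event({"jsonrpc": "2.0", "id": 1, "result": {"tools": []}})
```

Because the stream is long-lived, the reverse proxy should have response buffering disabled and a generous read timeout for this route, or events may be delayed or cut off.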

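To run under an existing ASGI stack, FastMCP can expose the server as an ASGI application. The code below assumes a FastMCP 2.x release that provides http_app(); check the method name against your installed version:

from fastmcp import FastMCP

mcp = FastMCP("weather-server")

@mcp.tool()
def hello(name: str) -> str:
    """Return a greeting."""
    return f"Hello, {name}!"

# ASGI application object (assumes FastMCP 2.x's http_app())
app = mcp.http_app()
# Deploy via Uvicorn: uvicorn server:app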

This lets you deploy MCP servers alongside your existing web infrastructure — same domain, same auth, same monitoring. [3]
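
To illustrate the "same auth" point, here is a sketch of a plain ASGI middleware that requires a bearer token before a request reaches the wrapped application. It relies only on the ASGI interface, so it should wrap any ASGI app; `BearerTokenMiddleware` is a hypothetical name, and a single shared token is a simplification of real authentication.

```python
import asyncio
import hmac


class BearerTokenMiddleware:
    """Reject HTTP requests that lack the expected Authorization header."""

    def __init__(self, app, token: str):
        self.app = app
        self._expected = f"Bearer {token}".encode()

    async def __call__(self, scope, receive, send):
        # Pass through non-HTTP scopes such as lifespan events.
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        # ASGI headers are (name, value) byte pairs with lowercased names.
        headers = dict(scope.get("headers") or [])
        supplied = headers.get(b"authorization", b"")

        # Constant-time comparison avoids leaking the token through timing.
        if hmac.compare_digest(supplied, self._expected):
            await self.app(scope, receive, send)
            return

        await send({
            "type": "http.response.start",
            "status": 401,
            "headers": [(b"content-type", b"text/plain"), (b"www-authenticate", b"Bearer")],
        })
        await send({"type": "http.response.body", "body": b"Unauthorized"})
```

Wrapping is a one-liner, for example `app = BearerTokenMiddleware(app, token)`, where the token would come from an environment variable rather than source code.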

What 6 Shipped Servers Taught Me

Start with 3 tools, not 10. Agents discover tools at connection time. A server with 10+ tools slows initial handshake and confuses the LLM’s tool selection. Ship the core three, add more as usage data confirms their value.
Resources beat tools for read operations. Resources are auto-discoverable, cacheable, and don’t require the agent to construct a tool call. If all you’re doing is returning data, make it a resource.
Log every tool call with input parameters. When an agent misbehaves (and it will), you need the full context: what parameters it passed, what the tool returned, and what error it got. Structured JSON logging is the 80% solution; on the stdio transport, send it to stderr, because stdout carries the protocol messages.
Test with real agents, not just the inspector. The inspector verifies protocol compliance. Real agents (Claude Desktop, Cursor, VS Code) reveal edge cases in parameter formatting, error handling, and timeout behavior that no inspector catches. [2]
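
The logging lesson can be sketched as a decorator that emits one JSON record per call. This is one possible approach rather than a FastMCP feature: `log_tool_call` and the record fields are hypothetical, it handles synchronous functions only, and parameters should be redacted before logging if they can contain secrets.

```python
import functools
import inspect
import json
import logging
import sys
import time

# Log to stderr so stdout stays free for protocol traffic on stdio transport.
logging.basicConfig(level=logging.INFO, stream=sys.stderr, format="%(message)s")
logger = logging.getLogger("mcp.tools")


def log_tool_call(func):
    """Wrap a synchronous tool function and log one JSON record per invocation."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Bind arguments to parameter names so positional calls are logged by name.
        bound = inspect.signature(func).bind(*args, **kwargs)
        bound.apply_defaults()
        record = {"tool": func.__name__, "params": dict(bound.arguments)}
        start = time.perf_counter()
        try:
            result = func(*args, **kwargs)
            record["status"] = "ok"
            return result
        except Exception as exc:
            record["status"] = "error"
            record["error_type"] = type(exc).__name__
            record["error"] = str(exc)
            raise
        finally:
            record["duration_ms"] = round((time.perf_counter() - start) * 1000, 2)
            logger.info(json.dumps(record, default=str))

    return wrapper
```

Recording the exception type alongside the message makes it possible to count failures per category later without parsing free text.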

→ Build It Yourself

The full source code for this weather server — including error handling, resource templates, CLI arg parsing, and test suite — is on GitHub at github.com/niteagent/mcp-weather-server. Clone it, replace the mock data with a real API, and you’ve shipped your first production MCP server in under an hour.

MCP is the closest thing we have to a standard protocol for AI tool connectivity. FastMCP makes building servers trivial. The hard part — and the part that separates a demo from a production service — is error handling, logging, input validation, and deployment. Start with the easy parts, but don’t skip the hard ones.

References

[1] Model Context Protocol Specification (June 2025), https://modelcontextprotocol.io/specification/2025-06-18

[2] FastMCP Documentation — Building MCP Servers in Python, https://www.firecrawl.dev/blog/fastmcp-tutorial-building-mcp-servers-python

[3] MCP Best Practices: Architecture & Implementation Guide, https://modelcontextprotocol.info/docs/best-practices/

[4] NSA/CISA/FBI Joint Cybersecurity Advisory — MCP Security Design Considerations (June 2026), https://media.defense.gov/2026/Jun/02/2003943289/-1/-1/0/CSI_MCP_SECURITY.PDF

[5] Claude Desktop MCP Server Configuration, https://modelcontextprotocol.io/quickstart/user
