# MCP Hot-Reload Pattern for AI Agents

A language-agnostic guide for implementing hot-reloadable MCP (Model Context Protocol) servers that update tools without process restarts.

## The Problem

When building MCP servers for AI agents (Claude, GPT, etc.), code changes typically require:

1. Stopping the MCP server
2. Restarting the AI agent/IDE
3. Reconnecting everything

This breaks flow and wastes time during development.

## The Solution: CLI Subprocess Decoupling

**Core insight**: Decouple the MCP server from your codebase by having tools execute via CLI subprocess calls instead of direct function imports.

```
┌─────────────────────────────────────────────────────────────────┐
│ AI Agent (Claude, etc.)                                         │
│                                                                 │
│ Calls tools ──────────────────────────────────────────────┐     │
│ Receives tools/list_changed notification ◄────────────┐   │     │
│ Re-fetches tool list automatically                    │   │     │
└───────────────────────────────────────────────────────┼───┼─────┘
                                                        │   │
                 STDIO / SSE / WebSocket                │   │
                                                        │   │
┌───────────────────────────────────────────────────────┼───┼─────┐
│ MCP Server Process                                    │   │     │
│                                                       │   │     │
│  ┌─────────────────┐    ┌──────────────────────┐      │   │     │
│  │ File Watcher    │───▶│ On Change:           │      │   │     │
│  │                 │    │ 1. Fetch new schema  │──────┘   │     │
│  │ Watches:        │    │ 2. Update tools      │          │     │
│  │   src/**/*      │    │ 3. Notify client     │──────────┘     │
│  └─────────────────┘    └──────────────────────┘                │
│                                                                 │
│  Tool execution: subprocess.run(["your-cli", "command"])        │
│  Schema fetch:   subprocess.run(["your-cli", "schema"])         │
└─────────────────────────────────────────────────────────────────┘
                   │
                   │ subprocess (fresh process)
                   ▼
┌─────────────────────────────────────────────────────────────────┐
│ Your CLI / Codebase                                             │
│                                                                 │
│ your-cli schema    → Returns tool definitions as JSON           │
│ your-cli command1  → Executes tool logic, returns JSON          │
│ your-cli command2  → Executes tool logic, returns JSON          │
│                                                                 │
│ ✓ Each call is a fresh process (no module caching)              │
│ ✓ Code changes take effect immediately                          │
└─────────────────────────────────────────────────────────────────┘
```

## Why This Works

| Approach | Result |
|----------|--------|
| Direct imports | Module/class caching - changes not seen |
| Hot-reload libraries | Complex, fragile, language-specific |
| Process restart | Disrupts AI agent connection |
| **CLI subprocess** | ✓ Always fresh code, universal |

Each subprocess is a **fresh process** with:

- No cached modules/classes from previous runs
- Complete isolation from MCP server state
- Compatibility with any language (Python, Node, Go, Rust, etc.)

## Implementation Steps

### Step 1: Create a Schema Command

Your CLI should output tool definitions as JSON:

```bash
your-cli schema --json
```

Output format:

```json
{
  "tools": [
    {
      "name": "tool_name",
      "description": "What the tool does",
      "parameters": {
        "type": "object",
        "properties": {
          "param1": {"type": "string", "description": "..."},
          "param2": {"type": "integer", "description": "..."}
        },
        "required": ["param1"]
      },
      "cli_command": ["subcommand", "--flag", "{param1}"]
    }
  ]
}
```

The `cli_command` field maps MCP tool calls to CLI invocations.
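As a concrete sketch of Step 1, here is a minimal Python CLI exposing a `schema` subcommand. The program name `your-cli`, the `greet` tool, and its parameters are illustrative assumptions, not part of any real SDK:

```python
import argparse
import json
import sys

# Hypothetical tool registry; a real CLI would build this from its
# actual subcommands. Each entry maps an MCP tool to a CLI invocation.
TOOLS = {
    "tools": [
        {
            "name": "greet",
            "description": "Greets a user by name",
            "parameters": {
                "type": "object",
                "properties": {
                    "name": {"type": "string", "description": "Who to greet"}
                },
                "required": ["name"],
            },
            "cli_command": ["greet", "--name", "{name}"],
        }
    ]
}

def main(argv=None):
    parser = argparse.ArgumentParser(prog="your-cli")
    sub = parser.add_subparsers(dest="command", required=True)
    schema = sub.add_parser("schema", help="Print tool definitions as JSON")
    schema.add_argument("--json", action="store_true")
    args = parser.parse_args(argv)
    if args.command == "schema":
        # Emit the registry to stdout so the MCP server can parse it
        json.dump(TOOLS, sys.stdout, indent=2)

if __name__ == "__main__":
    main()
```

Because the MCP server re-runs this command on every file change, editing `TOOLS` (or whatever generates it) is all it takes to publish a new tool.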
### Step 2: Create Tool Commands

Each tool should be a CLI command that:

- Accepts arguments
- Returns JSON to stdout
- Exits with an appropriate code

```bash
your-cli do-something --param1 "value" --json
# Output: {"result": "success", "data": {...}}
```

### Step 3: Build the MCP Server

Your MCP server needs three components:

#### A. Schema Fetcher

```
function getSchema():
    result = subprocess.run(["your-cli", "schema", "--json"])
    return JSON.parse(result.stdout)
```

#### B. Dynamic Tool Registration

```
function registerTools(schema):
    for each tool in schema.tools:
        handler = createHandler(tool.cli_command)
        mcp.registerTool(tool.name, tool.description, handler)

function createHandler(cliCommand):
    return async function(params):
        args = buildArgs(cliCommand, params)
        result = subprocess.run(["your-cli"] + args + ["--json"])
        return JSON.parse(result.stdout)
```

#### C. File Watcher with Notification

```
function watchForChanges(path, onNotify):
    watcher.watch(path, "**/*.{py,ts,go,rs}")
    watcher.on("change", async () => {
        schema = getSchema()      // Fresh subprocess call
        registerTools(schema)     // Update tool registry
        onNotify()                // Notify AI agent
    })
```

### Step 4: Session Capture for Notifications

The MCP protocol's `tools/list_changed` notification requires a session reference. Capture it on the first tool call:

```
session = null

function onToolCall(context):
    if session == null:
        session = context.session
    // ... execute tool

function notifyToolsChanged():
    if session != null:
        session.send("notifications/tools/list_changed")
```

**Important**: The session is only available during an active request. Store it globally on the first tool call.
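The pseudocode for parts A and B above might look like this in Python. The executable name `your-cli` is a placeholder, and `build_args` is a hypothetical helper for the `{param}` substitution in `cli_command` templates:

```python
import json
import subprocess

def get_schema(cli="your-cli"):
    """Part A: fetch tool definitions via a fresh subprocess."""
    result = subprocess.run(
        [cli, "schema", "--json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)

def build_args(cli_command, params):
    """Fill {param} placeholders from the schema's cli_command template."""
    return [token.format(**params) for token in cli_command]

def create_handler(cli, cli_command):
    """Part B: return a handler that runs the tool in a fresh process,
    so edited tool code takes effect on the very next call."""
    def handler(params):
        args = build_args(cli_command, params)
        result = subprocess.run(
            [cli] + args + ["--json"],
            capture_output=True, text=True, check=True,
        )
        return json.loads(result.stdout)
    return handler
```

A registration loop would then iterate over `get_schema()["tools"]` and hand each tool's `cli_command` to `create_handler`; how the handler is attached to the server depends on your MCP SDK.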
## What Hot-Reloads vs What Requires Restart

| Change Type | Hot-Reloads | Requires Restart |
|-------------|:-----------:|:----------------:|
| Tool definitions (schema) | ✓ | |
| Tool implementation logic | ✓ | |
| CLI command code | ✓ | |
| MCP server code | | ✓ |
| File watcher code | | ✓ |

The MCP server itself is a thin layer - most of your code lives in the CLI and hot-reloads automatically.

## Key Design Principles

### 1. Thin MCP Server

The MCP server should be minimal:

- Fetch schema via subprocess
- Register tools dynamically
- Execute tools via subprocess
- Watch files and notify

No business logic in the MCP server itself.

### 2. CLI as Source of Truth

Your CLI defines everything:

- Tool names and descriptions
- Parameter schemas
- Command mappings
- Execution logic

The MCP server just translates between the MCP protocol and CLI calls.

### 3. JSON Everywhere

All communication uses JSON:

- Schema output
- Tool results
- Error messages

This ensures compatibility across languages and easy debugging.

### 4. Graceful Degradation

If the session isn't captured yet:

- Log a warning
- Skip the notification
- Tools still work, they just won't auto-refresh

## Example: Adding a New Tool

1. Add the tool definition to your schema command:

   ```json
   {
     "name": "new_tool",
     "description": "Does something new",
     "parameters": {...},
     "cli_command": ["new-command", "{param}"]
   }
   ```

2. Implement the CLI command:

   ```bash
   your-cli new-command --param "value" --json
   ```

3. Save the file

4. **Automatic**: File watcher detects change → fetches new schema → registers tool → notifies AI agent → AI agent sees new tool

No restart required!

## Debugging

### Recommended: Log to File

The MCP server's stdout/stderr are used for protocol communication.
Log to a file:

```
~/.your-app/mcp-server.log
```

### Key Log Messages

- `Server starting` - MCP server initialized
- `Registered N tools` - Tools loaded from schema
- `File change detected` - Watcher triggered
- `Refreshed N tools` - Schema re-fetched and tools updated
- `Session captured` - Ready to send notifications
- `Sent tools/list_changed` - AI agent notified

### Common Issues

**"No session - cannot send notification"**

- Call any tool first to capture the session
- The session is only available during tool execution

**Tools not updating**

- Verify the CLI schema command works: `your-cli schema --json`
- Check the file watcher is monitoring the correct path
- Look for errors in the log file

**Subprocess errors**

- Ensure the CLI is on PATH or use an absolute path
- Check that the CLI returns valid JSON
- Verify exit codes (0 = success)

## Language-Specific Notes

### Python

- Use `subprocess.run()` with `capture_output=True`
- For async: `asyncio.to_thread(subprocess.run, ...)`
- FastMCP: use the `ctx: Context` parameter for session access

### Node.js/TypeScript

- Use `child_process.execSync()` or `spawn()`
- For async: `util.promisify(exec)`
- MCP SDK: access the session via the request context

### Go

- Use `exec.Command()` with `Output()`
- For concurrency: goroutines with channels
- MCP SDK: session available in the handler context

### Rust

- Use `std::process::Command` with `output()`
- For async: `tokio::process::Command`
- MCP SDK: session in request state

## Summary

The CLI subprocess pattern provides:

1. **True hot-reload** - code changes work instantly
2. **Language agnostic** - works with any CLI
3. **Simple architecture** - thin MCP server, rich CLI
4. **Reliable** - no complex reload mechanisms
5. **Debuggable** - standard subprocess, JSON I/O

The trade-off is subprocess overhead per tool call, but for development workflows this is negligible compared to the productivity gains.
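For completeness, the file-watcher component (part C of Step 3) can even be sketched with nothing but the standard library. This is a polling watcher with an arbitrary interval and extension list; a production server would more likely use a native watcher library:

```python
import os
import time

WATCHED_EXTS = (".py", ".ts", ".go", ".rs")  # arbitrary example set

def snapshot(root, exts=WATCHED_EXTS):
    """Map every watched source file under root to its mtime."""
    state = {}
    for dirpath, _, files in os.walk(root):
        for name in files:
            if name.endswith(exts):
                path = os.path.join(dirpath, name)
                try:
                    state[path] = os.path.getmtime(path)
                except OSError:
                    pass  # file vanished between listing and stat
    return state

def watch_for_changes(root, on_change, interval=1.0):
    """Poll until interrupted; invoke on_change() whenever any
    watched file is added, removed, or modified."""
    last = snapshot(root)
    while True:
        time.sleep(interval)
        current = snapshot(root)
        if current != last:
            last = current
            on_change()  # re-fetch schema, re-register tools, notify agent
```

Wired together, `on_change` is exactly the refresh sequence described earlier: fetch the schema in a fresh subprocess, update the tool registry, and send `tools/list_changed`.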