Architecture¶

MCP Generator produces a modular, composable server architecture — not a monolithic single file.

Generated Structure¶

graph TD
    A[Main Composition Server] --> B[Pet Sub-Server]
    A --> C[Store Sub-Server]
    A --> D[User Sub-Server]
    A --> E[Middleware Stack]
    E --> F[Error Handling]
    E --> G[Authentication]
    E --> H[Timing]
    E --> I[Logging]

For a Petstore API, the generator produces:

generated_mcp/
├── swagger_petstore_openapi_mcp_generated.py   # Main entry point
├── servers/
│   ├── pet.py              # Pet tag → sub-server (8 tools)
│   ├── store.py            # Store tag → sub-server (4 tools)
│   └── user.py             # User tag → sub-server (7 tools)
├── middleware/
│   └── authentication.py   # JWT/OAuth2 middleware
├── fastmcp.json            # FastMCP configuration
├── pyproject.toml           # Package metadata
├── Dockerfile               # Multi-stage Docker build
├── docker-compose.yml
└── README.md                # Auto-generated documentation

Key Design Decisions¶

One Sub-Server per Tag¶

Each OpenAPI tag becomes a separate FastMCP server module under servers/. These are composed into the main server using mount():

# Main composition server
app = FastMCP("Swagger Petstore")
app.mount(pet_mcp, namespace="pet")
app.mount(store_mcp, namespace="store")
app.mount(user_mcp, namespace="user")

This keeps each module focused and independently maintainable.

Tool Tags and Timeouts¶

Generated tools automatically include FastMCP 3.1 features:

@mcp.tool(tags=["pet"], timeout=30)
async def list_pets(ctx: Context, limit: str | None = None) -> dict[str, Any]:
    """List all pets in the store."""
    ...

Tags — derived from the API tag (e.g., pet_api → ["pet"])
Timeouts — configurable per-tool (default 30s)
Deprecated detection — endpoints marked deprecated in OpenAPI get version="deprecated"

Tag Auto-Discovery¶

If your OpenAPI spec has endpoints without explicit tags, the generator automatically infers tags from the URL path structure. For example, /api/v1/users/{id} gets tagged as users.

FastMCP 3.x Native¶

The generated code targets FastMCP 3.x directly:

mount() with namespace= (not the deprecated import_server())
http_app() for Streamable HTTP transport
Built-in EventStore for SSE resumability
Native middleware pipeline
Transforms pipeline (SearchTools, CodeMode)

Transforms (FastMCP 3.1)¶

The generated server supports FastMCP 3.1 transforms — pipeline components that modify tool presentation:

SearchTools¶

BM25 text search over tool names and descriptions. Useful for servers with many tools:

{
  "features": {
    "search_tools": {
      "enabled": true
    }
  }
}

When enabled, LLMs can search for relevant tools by name/description before listing the full catalog.

CodeMode (Experimental)¶

Replaces the full tool catalog with three meta-tools:

search — find relevant tools by keyword
inspect — get detailed tool schema
execute — run a tool by name

{
  "features": {
    "code_mode": {
      "enabled": true
    }
  }
}

CodeMode is experimental and useful for very large tool catalogs where full enumeration is impractical.

Transport Modes¶

The generated server supports multiple transports:

Transport	Command	Use Case
STDIO	`--transport stdio`	Local AI clients (Claude Desktop)
HTTP	`--transport http --port 8000`	Network access, Streamable HTTP

In HTTP mode, the server uses FastMCP's Streamable HTTP implementation with an event store for SSE resumability.

Middleware Pipeline¶

Every generated server includes a configurable middleware stack:

Request → Error Handling → Authentication → Timing → Logging → Tool Execution

Error Handling — catches exceptions, returns structured errors
Authentication — JWT validation or backend token forwarding
Timing — tracks execution duration per tool call
Logging — structured request/response logging

Optional middleware (enabled via flags):

Caching — response caching with configurable TTL
Storage — persistent backend for tokens, sessions, cache

MCP Resources¶

When --enable-resources is used, GET endpoints are also exposed as MCP resources — allowing AI clients to read API data as context without calling tools.

OpenAPI Client¶

The generator includes a pure Python OpenAPI client generator that creates a typed API client from the spec. This client is used internally by the generated MCP tools to make HTTP requests to the backend API.

No Java or Node.js runtime is required.