Affiliate disclosure: Some links in this article are affiliate links. If you sign up through them, we may earn a commission at no extra cost to you. Recommendations are based on documented platform capabilities and official pricing as of May 2026.

How to Deploy an MCP Server on Fly.io in 2026 (Step-by-Step)

Fly.io is the right platform for MCP servers when Railway’s single-region limitation becomes a constraint. If your MCP clients are distributed across geographies (a team split across Tokyo, London, and San Francisco, or a product serving users worldwide), Fly.io’s anycast routing across 35+ regions is its main advantage over single-region PaaS options.

This guide covers deploying a Python or TypeScript MCP server to Fly.io with Streamable HTTP transport, persistent state, custom domain, and proper authentication. It assumes you have a working local MCP server and want it running in production.

Why Fly.io for MCP

Multi-region routing. Fly.io deploys your container to multiple datacenters simultaneously and routes each incoming connection to the nearest healthy instance. For an MCP server with a global user base, this reduces latency meaningfully: a client in Tokyo hitting an nrt (Tokyo) instance instead of a US West one can save on the order of 150 ms of round-trip time per tool call.

Machines can stay allocated or auto-suspend. Serverless platforms pay a cold start whenever an idle function is woken; Fly Machines can instead be configured to stay running 24/7 (matching Railway’s always-on behavior) or to suspend when no connections are active and resume in 300–500 ms. For low-traffic MCP servers, auto-suspend drops idle cost toward zero.

Persistent volumes are mature. Fly Volumes attach to a machine and survive redeploys. Unlike Railway’s volumes (which work but lack snapshot tooling), Fly volumes support snapshots and can be backed up to Fly’s Tigris object storage. For MCP servers that need to persist data between restarts, this matters.

Per-second billing. A Fly machine running 100% uptime on a shared-cpu-2x (512 MB) costs roughly $4–5/month. If your MCP server handles bursty traffic, auto-suspend drops that to near zero for idle periods.

Prerequisites

A working MCP server in Python or TypeScript using Streamable HTTP transport (not stdio)
Docker installed locally
Fly CLI (flyctl) installed: curl -L https://fly.io/install.sh | sh
A Fly.io account: [Sign up here](https://hostingpundit.com/go/fly-io) — the free tier includes 3 VMs and 3 GB storage

Step 1: Prepare your MCP server for Fly.io

Use Streamable HTTP transport

Your MCP server must use Streamable HTTP transport and bind to 0.0.0.0 on the port given in $PORT (8080 in this guide), which has to match internal_port in fly.toml.

Python (FastMCP):

import os
from mcp.server.fastmcp import FastMCP

# With the official SDK's FastMCP, host and port are constructor settings,
# not arguments to run().
mcp = FastMCP(
    "my-server",
    host="0.0.0.0",
    port=int(os.environ.get("PORT", 8080)),
)

# ... define your tools ...

if __name__ == "__main__":
    mcp.run(transport="streamable-http")

TypeScript:

import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";
import express from "express";

const app = express();
app.use(express.json());

// ... define your server and tools ...

const transport = new StreamableHTTPServerTransport({ sessionIdGenerator: undefined });
await server.connect(transport);

app.post("/mcp", (req, res) => transport.handleRequest(req, res, req.body));
app.get("/mcp", (req, res) => transport.handleRequest(req, res));
app.delete("/mcp", (req, res) => transport.handleRequest(req, res));

const port = parseInt(process.env.PORT ?? "8080");
app.listen(port, "0.0.0.0", () => {
  console.log(`MCP server listening on ${port}`);
});

Add a Dockerfile

Fly.io detects Dockerfiles automatically. Create one in your project root:

Python:

FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8080
CMD ["python", "server.py"]

Node.js:

FROM node:20-slim
WORKDIR /app
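COPY package*.json ./
# Install all dependencies; the build step usually needs devDependencies
RUN npm ci
COPY . .
RUN npm run build
# Drop devDependencies from the final image
RUN npm prune --omit=dev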
EXPOSE 8080
CMD ["node", "dist/index.js"]

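Gotcha: Fly.io routes traffic to the internal_port set in fly.toml; it does not choose a port for your app. This guide pins PORT to 8080 in the [env] section of fly.toml (Step 2), so keep the Dockerfile EXPOSE, the application’s listening port, and internal_port all on that same value.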

Step 2: Initialize the Fly.io app

Run flyctl launch from your project directory:

flyctl launch

flyctl will:

Detect your Dockerfile
Ask for an app name (or generate one)
Ask which region to deploy to (pick the one closest to most of your users — use fly platform regions to list options)
Create fly.toml in your project directory

Important: edit the generated fly.toml before deploying. The defaults need adjustment for an MCP server:

app = "your-mcp-server"
primary_region = "nrt"  # or your chosen region

[build]
  # Fly auto-detects your Dockerfile

[[services]]
  internal_port = 8080
  protocol = "tcp"

  [services.concurrency]
    type = "connections"
    hard_limit = 100
    soft_limit = 80

  [[services.ports]]
    handlers = ["tls", "http"]
    port = 443

  [[services.ports]]
    handlers = ["http"]
    port = 80
    force_https = true

  [[services.http_checks]]
    interval = "15s"
    timeout = "5s"
    grace_period = "10s"
    method = "get"
    path = "/health"

[env]
  PORT = "8080"

Key configuration points:

internal_port = 8080 must match the port your server binds to
[[services.http_checks]] with path /health — add a health endpoint to your server
force_https = true — redirect HTTP to HTTPS automatically
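
The /health path in the config above needs a handler that returns 200. One way to add it without touching tool code is a small ASGI wrapper. This is a minimal sketch, assuming your server is exposed as an ASGI app; with_health is a hypothetical helper name, not part of any SDK.

```python
def with_health(app, path="/health"):
    """Wrap an ASGI app so GET <path> returns 200 without reaching the app."""

    async def wrapped(scope, receive, send):
        is_health = (
            scope["type"] == "http"
            and scope["method"] == "GET"
            and scope["path"] == path
        )
        if not is_health:
            # Lifespan events and every other request go to the real app.
            await app(scope, receive, send)
            return
        body = b'{"status":"ok"}'
        await send({
            "type": "http.response.start",
            "status": 200,
            "headers": [
                (b"content-type", b"application/json"),
                (b"content-length", str(len(body)).encode()),
            ],
        })
        await send({"type": "http.response.body", "body": body})

    return wrapped
```

With the Python SDK, mcp.streamable_http_app() returns an ASGI app, so wrapping that and serving the result with an ASGI server such as uvicorn is one plausible way to wire this in. Treat that wiring as an assumption to check against your SDK version.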

Step 3: Set secrets (environment variables)

Never put credentials in fly.toml or Dockerfiles. Use Fly’s secrets system:

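MCP_AUTH_TOKEN=$(openssl rand -hex 32)
echo "$MCP_AUTH_TOKEN"  # copy this now for your client config; Fly will not show it again
fly secrets set MCP_AUTH_TOKEN="$MCP_AUTH_TOKEN"
fly secrets set MY_API_KEY=your_api_key_here
fly secrets set DATABASE_URL=your_database_url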

Fly injects these as environment variables at runtime. They are encrypted at rest and never appear in build logs.

To verify secrets are set (shows names but not values):

fly secrets list

Step 4: Deploy

fly deploy

flyctl will:

Build your Docker image
Push it to Fly’s image registry
Deploy to your configured region(s)
Run health checks
Print your app’s URL on success

A successful deploy looks like:

==> Verifying app config
==> Building image
...
==> Pushing image to registry
==> Creating release
==> Monitoring deployment
  Machine e784567d create started ... started
  ✓ Machine e784567d [app] is healthy [HTTP GET /health - 200]
==> Visit your newly deployed app at https://your-mcp-server.fly.dev

Your MCP endpoint is live at: https://your-mcp-server.fly.dev/mcp

Step 5: Add more regions (optional but powerful)

This is Fly.io’s killer feature for MCP. To deploy to additional regions:

fly scale count 1 --region fra  # Add one machine in Frankfurt
fly scale count 1 --region lax  # Add one machine in Los Angeles

Fly.io’s anycast routing automatically sends each user to the nearest healthy instance. Your MCP clients don’t need to know which region they’re hitting: every region answers on the same hostname and IP addresses, and Fly’s edge proxy picks the machine.

To see which machines are running and where:

fly status
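
To confirm from the client side which region actually answered, one option is a small diagnostic tool that reports the machine’s environment. Fly sets FLY_REGION and FLY_MACHINE_ID inside each running Machine. The function names below are hypothetical, and the sketch assumes you register region_report as an MCP tool yourself.

```python
import os


def serving_region(environ=os.environ):
    """Return the Fly region code for this machine, or "local" off-platform."""
    # FLY_REGION is set by Fly.io inside each running Machine (for example "nrt").
    return environ.get("FLY_REGION", "local")


def region_report(environ=os.environ):
    """Small payload a diagnostic MCP tool could return."""
    return {
        "region": serving_region(environ),
        "machine_id": environ.get("FLY_MACHINE_ID", "unknown"),
    }
```

Calling the tool from clients in different locations should then show different region codes if anycast routing is working as described.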

Step 6: Custom domain

Add your domain in the Fly dashboard: your-app → Certificates → Add Certificate
Fly provides a DNS record to add at your registrar (typically an A record or CNAME)
Fly provisions a Let’s Encrypt certificate automatically

Alternatively, via CLI:

fly certs create mcp.yourdomain.com
fly certs show mcp.yourdomain.com  # Shows required DNS records

Once propagated, your MCP endpoint is: https://mcp.yourdomain.com/mcp

Step 7: Connect to Claude Code / Claude Desktop

Claude Code (.mcp.json in the project root, or ~/.claude.json for all projects):

{
  "mcpServers": {
    "my-server": {
      "type": "http",
      "url": "https://your-mcp-server.fly.dev/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_MCP_AUTH_TOKEN"
      }
    }
  }
}

Claude Desktop (claude_desktop_config.json):

{
  "mcpServers": {
    "my-server": {
      "transport": {
        "type": "http",
        "url": "https://your-mcp-server.fly.dev/mcp"
      },
      "headers": {
        "Authorization": "Bearer YOUR_MCP_AUTH_TOKEN"
      }
    }
  }
}

Test without a client:

curl -X POST https://your-mcp-server.fly.dev/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -H "Authorization: Bearer YOUR_MCP_AUTH_TOKEN" \
  -d '{"jsonrpc":"2.0","method":"tools/list","id":1}'
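
The same check can be scripted, which is handy in a deploy pipeline. The sketch below only builds the request; build_tools_list_request is a hypothetical helper, and the /mcp path and bearer-token header are taken from this guide rather than from any SDK.

```python
import json
import urllib.request


def build_tools_list_request(base_url, token, request_id=1):
    """Build (but do not send) a JSON-RPC tools/list POST for an MCP endpoint."""
    payload = {"jsonrpc": "2.0", "method": "tools/list", "id": request_id}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/mcp",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP servers may answer with JSON or an SSE stream.
            "Accept": "application/json, text/event-stream",
            "Authorization": f"Bearer {token}",
        },
    )
```

Passing the result to urllib.request.urlopen(req, timeout=10) sends it. A server that enforces sessions may reject tools/list until an initialize request has been made, so a 4xx here does not necessarily mean the deploy is broken.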

Persistent storage (when you need it)

If your MCP server needs to persist data (embeddings cache, conversation history, tool state), create a Fly Volume:

fly volumes create mcp_data --size 10 --region nrt

Mount it in fly.toml:

[mounts]
  source = "mcp_data"
  destination = "/data"

Your MCP server can then write to /data/ and the data persists across restarts and redeploys.
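
As an illustration of writing to the mounted path safely, here is a minimal sketch of a JSON-file store that writes atomically, so a restart or redeploy in the middle of a write cannot leave a half-written file. JsonStateStore and the MCP_DATA_DIR variable are hypothetical names; /data matches the mount destination above.

```python
import json
import os
import tempfile
from pathlib import Path


class JsonStateStore:
    """Tiny JSON-file store with atomic writes."""

    def __init__(self, data_dir=None, filename="state.json"):
        # MCP_DATA_DIR is a hypothetical override; /data is the volume mount.
        root = data_dir or os.environ.get("MCP_DATA_DIR", "/data")
        self.path = Path(root) / filename

    def load(self):
        try:
            return json.loads(self.path.read_text(encoding="utf-8"))
        except FileNotFoundError:
            return {}

    def save(self, state):
        self.path.parent.mkdir(parents=True, exist_ok=True)
        # Write to a temp file in the same directory, then rename over the target.
        fd, tmp = tempfile.mkstemp(dir=self.path.parent, suffix=".tmp")
        try:
            with os.fdopen(fd, "w", encoding="utf-8") as handle:
                json.dump(state, handle)
                handle.flush()
                os.fsync(handle.fileno())
            os.replace(tmp, self.path)
        except BaseException:
            os.unlink(tmp)
            raise
```

The temp file lives in the same directory as the target because os.replace is only atomic within one filesystem. For anything beyond small tool state, SQLite or an external database is likely a better fit.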

Note on multi-region and volumes: Fly Volumes are attached to a single machine in a single region. If you run machines in multiple regions and each needs persistent storage, each region’s machine gets its own volume. For shared state across regions, use an external Postgres (Fly Postgres, Supabase, Neon) or Fly’s Tigris object storage.

Cost breakdown

Configuration	Monthly cost
Free tier (3 shared-cpu-1x, 256 MB, 1 region)	$0
Single shared-cpu-2x (512 MB), 1 region, always-on	~$4.50
Single shared-cpu-2x, 1 region, auto-suspend (low traffic)	~$0.50–2.00
3-region deployment (nrt, fra, sjc), shared-cpu-2x each	~$13–15
+ 10 GB volume	+$1.50/month

Gotcha: Fly’s free tier uses shared-cpu-1x machines with 256 MB RAM. Python MCP servers using FastMCP, LangChain, or similar libraries routinely exceed 256 MB at startup. Budget for at least a shared-cpu-2x (512 MB) if you’re running Python. Node.js MCP servers typically fit within 256 MB for simple tools.
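
The rows in the table combine as simple arithmetic. The sketch below uses this article’s approximate figures (about $4.50 per always-on shared-cpu-2x machine and $1.50 per 10 GB volume) as inputs; it is not an official pricing formula. It assumes one machine and one volume per region, and monthly_estimate is a hypothetical helper.

```python
def monthly_estimate(machine_cost, regions=1, volume_cost=0.0, active_fraction=1.0):
    """Rough monthly total in USD: one machine per region, optional volume each."""
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    # Compute scales with the share of the month the machines are running.
    compute = machine_cost * active_fraction * regions
    # Volumes are billed whether or not the machine is suspended.
    storage = volume_cost * regions
    return round(compute + storage, 2)
```

For example, monthly_estimate(4.50, regions=3) gives 13.5, in line with the three-region row, and adding volume_cost=1.50 for a volume in every region brings it to 18.0.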

Common gotchas

Wrong internal_port in fly.toml. If internal_port doesn’t match the port your server binds to, Fly’s health checks never pass and the deploy fails after timing out. Double-check that internal_port in fly.toml matches the port your app reads from $PORT.

Auth token required. Your Fly.io app URL is publicly reachable. Without bearer token authentication, anyone can invoke your MCP tools. Set MCP_AUTH_TOKEN as a secret and validate it on every /mcp request.
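
A minimal sketch of that validation as an ASGI wrapper follows, assuming the token is the MCP_AUTH_TOKEN secret set in Step 3 and that your server is exposed as an ASGI app. require_bearer is a hypothetical helper. The health path is exempted because Fly’s health check does not send the token.

```python
import hmac


def require_bearer(app, token, exempt=("/health",)):
    """Wrap an ASGI app; reject HTTP requests lacking the expected bearer token."""
    expected = f"Bearer {token}".encode()

    async def wrapped(scope, receive, send):
        if scope["type"] != "http" or scope["path"] in exempt:
            # Lifespan events and exempt paths pass straight through.
            await app(scope, receive, send)
            return
        headers = dict(scope.get("headers") or [])
        supplied = headers.get(b"authorization", b"")
        # compare_digest avoids leaking the token through timing differences.
        if hmac.compare_digest(supplied, expected):
            await app(scope, receive, send)
            return
        body = b'{"error":"unauthorized"}'
        await send({
            "type": "http.response.start",
            "status": 401,
            "headers": [
                (b"content-type", b"application/json"),
                (b"www-authenticate", b"Bearer"),
                (b"content-length", str(len(body)).encode()),
            ],
        })
        await send({"type": "http.response.body", "body": body})

    return wrapped
```

The token would typically come from os.environ["MCP_AUTH_TOKEN"] at startup, so a missing secret fails fast instead of leaving the endpoint open.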

Multi-region state. If you scale to multiple regions, avoid in-memory session state, because different requests may hit different machines. Fly Volumes are per-machine, so they suit caches that each instance can rebuild; for state that must be consistent across instances, use an external database or object storage.

Health check grace period. If your server takes >10 seconds to start (common with large Python dependencies), Fly may kill the machine before it’s ready. Set grace_period = "30s" in your health check config.

SSE connections and Fly’s idle timeout. Fly’s load balancer closes connections idle for over 75 seconds by default. For MCP clients holding long-lived SSE connections, send keepalive pings from the client or server at a shorter interval than that. Note that [services.tcp_checks] configures health checks, not the idle timeout, so it will not keep these connections open.
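
To illustrate the keepalive mechanism itself: SSE lines that start with a colon are comments, which clients ignore but which still count as traffic on the connection. The sketch below interleaves such comments into a stream whenever it has been idle for a given interval. It is a generic illustration, not MCP SDK code; the SDKs manage their own SSE streams, and with_keepalive is a hypothetical name. It assumes each queued item is a single-line string.

```python
import asyncio


async def with_keepalive(events, interval=30.0):
    """Yield SSE chunks from an asyncio.Queue, adding comment pings when idle.

    A queue item of None ends the stream.
    """
    while True:
        try:
            item = await asyncio.wait_for(events.get(), timeout=interval)
        except asyncio.TimeoutError:
            # Nothing arrived within the interval: emit an SSE comment line.
            yield ": keepalive\n\n"
            continue
        if item is None:
            return
        yield f"data: {item}\n\n"
```

An interval of 30 seconds leaves a comfortable margin under a 75-second idle timeout.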

Railway vs. Fly.io: when to choose each

If you’re deciding between these two platforms specifically for an MCP server:

Single region, fast deploys, minimal config → Railway
Multi-region, global users, per-second billing → Fly.io
Need GPU alongside MCP tools → Railway

For a full side-by-side, see Railway vs Fly.io for AI Agents.

Prices verified May 2026. Check official docs before committing, since hosting pricing changes frequently.
