FastMCP Server

Helm chart for deploying FastMCP Server on Kubernetes. FastMCP Server is an MCP (Model Context Protocol) server that dynamically loads tools, resources, prompts, and knowledge bases from multiple sources.

Key Features

  • Multi-source loading: tools, resources, prompts, and knowledge from inline ConfigMaps, S3-compatible storage (AWS S3, MinIO, R2), Git repositories, or OCI artifacts
  • Merge precedence: Inline (highest) > S3 > Git > OCI (lowest), so remote tools can be overridden locally without touching the upstream source
  • Bearer, JWT, and multi-auth: configure one auth mode or combine multiple providers through FastMCP
  • Knowledge base support: serve Markdown files as MCP resources for RAG and context injection
  • Extra pip packages: install additional Python packages at startup before loading tools
  • Tool metadata: optional __tags__, __timeout__, __annotations_mcp__ module variables for tool categorization and behavior hints
  • Resource templates: parameterized URIs like users://{user_id}/profile for dynamic resources
  • Multiple resources per file: RESOURCES dict maps multiple URIs to handler functions
  • Error masking: hide internal error details from clients via MCP_MASK_ERROR_DETAILS
  • Duplicate handling: control behavior when tools share names via MCP_ON_DUPLICATE_TOOLS
  • Built-in Web UI: dashboard at /ui with a tools/resources/prompts explorer (Alpine.js + Tailwind CDN)
  • Prometheus metrics: tool call counts, durations, errors, and source sync status at /metrics
  • Structured JSON logging: LOG_FORMAT=json for Loki, ELK, CloudWatch, Datadog
  • Dedicated health endpoints: /healthz (liveness/readiness by default), /readyz, /startupz
  • Diagnostic endpoint: GET /debug/info with full server introspection
  • Init container pattern: pre-sync sources before the server starts via initSync.enabled
  • Gateway API HTTPRoute: expose the MCP endpoint through existing Gateway listeners
  • Dual-stack Service controls: optional service.ipFamilyPolicy and service.ipFamilies
  • Namespace override: deploy chart-managed resources into a target namespace while keeping the Helm release separate
  • Restricted runtime defaults: non-root UID/GID 1000, seccomp RuntimeDefault, dropped Linux capabilities, and no service account token automount
  • Strict loading: MCP_STRICT_LOADING=true fails at boot if any tool or resource has errors
  • Hot reload: automatic tool/resource reload on filesystem changes via MCP_HOT_RELOAD=true
  • Periodic sync: poll S3/Git sources for changes at configurable intervals
  • Webhook reload: POST /reload endpoint for CI/CD-triggered reloads
  • OCI artifact source: pull tool bundles from OCI registries via ORAS with optional registry credentials
  • Selective sync: include/exclude glob patterns for source filtering
  • Gateway mode: compose multiple MCP servers via MCP_MODE=gateway and MCP_MOUNT_SERVERS
  • Tag visibility: enable/disable tools by tags with MCP_ENABLE_TAGS and MCP_DISABLE_TAGS
  • Multi-auth: combine bearer + JWT providers via MCP_AUTH_PROVIDERS
  • Tool-level scopes: __required_scopes__ module variable for authorization
  • Context integration: tools can use ctx: Context for progress, logging, sampling, elicitation, and session state
  • Rate limiting: __rate_limit__ module variable or MCP_RATE_LIMIT_DEFAULT env var (sliding window)
  • Caching: __cache_ttl__ module variable for idempotent tool result caching
  • Tool sandboxing: __max_memory_mb__ and __max_output_size_kb__ resource limits per tool
  • PodDisruptionBudget: pdb.enabled for zero-downtime rolling updates
  • HorizontalPodAutoscaler: autoscaling.enabled for auto-scaling based on CPU/memory
  • Security: Trivy vulnerability scan, CycloneDX SBOM, Cosign keyless signing, SLSA provenance
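
The merge precedence in the list above can be pictured as a dictionary merge keyed by filename, applied from the lowest-precedence source to the highest. This is an illustrative sketch only: merge_sources and the sample filenames are hypothetical and are not the chart's actual loader.

```python
# Illustrative sketch; merge_sources is a hypothetical helper,
# not part of FastMCP Server.
PRECEDENCE = ["oci", "git", "s3", "inline"]  # lowest to highest precedence


def merge_sources(sources: dict) -> dict:
    """Map each filename to the source that wins under Inline > S3 > Git > OCI."""
    winners = {}
    for source in PRECEDENCE:
        for filename in sources.get(source, {}):
            winners[filename] = source  # a higher-precedence source overwrites
    return winners


resolved = merge_sources({
    "oci": {"greet.py": "...", "math_ops.py": "..."},
    "git": {"greet.py": "..."},
    "inline": {"greet.py": "..."},
})
```

Here greet.py resolves to the inline copy, while math_ops.py, which exists only in the OCI bundle, is still loaded from there.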

Installation

HTTPS Repository

helm repo add helmforge https://repo.helmforge.dev
helm repo update
helm install fastmcp-server helmforge/fastmcp-server

OCI Registry

helm install fastmcp-server oci://ghcr.io/helmforgedev/helm/fastmcp-server

Basic Example — Inline Tools

# values.yaml
sources:
  inline:
    tools:
      greet.py: |
        def greet(name: str) -> str:
            """Greet someone by name."""
            return f"Hello, {name}!"
      math_ops.py: |
        def add(a: float, b: float) -> float:
            """Add two numbers."""
            return a + b
        def multiply(a: float, b: float) -> float:
            """Multiply two numbers."""
            return a * b
    knowledge:
      overview.md: |
        # Product Overview
        This document provides context for the AI assistant.

S3 Source (MinIO, AWS S3, Cloudflare R2)

sources:
  s3:
    enabled: true
    endpoint: 'https://minio.example.com'
    bucket: mcp-tools
    region: us-east-1
    prefix: production
    accessKey: '<access-key>'
    secretKey: '<secret-key>'
    include:
      - 'tools/**/*.py'
      - 'knowledge/**/*.md'
    exclude:
      - '**/*.tmp'
    syncInterval: 300
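
To reason about which objects the include and exclude lists select, the following sketch may help. It assumes that ** spans directories, that * stays within one path segment, that exclude wins over include, and that an empty include list selects everything. The chart's real matcher may differ, and glob_to_regex and is_selected are hypothetical names.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a glob where ** spans directories and * stays in one segment."""
    out, i = [], 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")      # zero or more leading directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")            # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")         # anything within one path segment
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")          # one character, not a slash
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def is_selected(key: str, include: list, exclude: list) -> bool:
    """Keep a key if it matches an include pattern and no exclude pattern."""
    included = not include or any(glob_to_regex(p).fullmatch(key) for p in include)
    excluded = any(glob_to_regex(p).fullmatch(key) for p in exclude)
    return included and not excluded


include = ["tools/**/*.py", "knowledge/**/*.md"]
exclude = ["**/*.tmp"]
selected = [k for k in ["tools/greet.py", "tools/net/ping.py", "notes/draft.tmp"]
            if is_selected(k, include, exclude)]
```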

Git Source

sources:
  git:
    enabled: true
    repository: 'https://github.com/your-org/mcp-tools.git'
    branch: main
    path: '' # optional subdirectory
    token: '<github-token>' # for private repos
    allowedRepositories:
      - 'https://github.com/your-org/mcp-tools.git'
    allowedBranches:
      - main
    include:
      - 'tools/**/*.py'
      - 'resources/**/*.py'
    exclude:
      - '**/private/**'
    syncInterval: 300

OCI Source

sources:
  oci:
    enabled: true
    registry: ghcr.io/your-org/mcp-bundle
    tag: '1.0.0'
    username: '<registry-user>'
    password: '<registry-token>'
    include:
      - 'tools/**/*.py'
      - 'knowledge/**/*.md'
    exclude:
      - '**/*.tmp'

Authentication

Bearer Token

auth:
  type: bearer
  bearer:
    token: my-secret-token
    # or use an existing Kubernetes secret:
    # existingSecret: my-auth-secret
    # existingSecretKey: token
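
With bearer auth enabled, a client presents the token in an Authorization header. The sketch below builds, but does not send, such a request against the POST /reload endpoint listed under Key Features. The in-cluster hostname is hypothetical, and it is an assumption that /reload accepts the same bearer token as the MCP endpoint.

```python
import urllib.request


def build_reload_request(base_url: str, token: str) -> urllib.request.Request:
    """Build (without sending) an authenticated POST to the /reload endpoint."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/reload",
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )


# Hypothetical in-cluster address; port 8000 is the chart's default server.port.
req = build_reload_request("http://fastmcp-server:8000", "my-secret-token")
# To send it from a CI job: urllib.request.urlopen(req, timeout=10)
```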

JWT

auth:
  type: jwt
  jwt:
    issuer: 'https://auth.example.com'
    audience: 'mcp-server'
    jwksUri: 'https://auth.example.com/.well-known/jwks.json'

Multi-Auth and Scopes

auth:
  type: multi
  providers:
    - bearer
    - jwt
  bearer:
    existingSecret: fastmcp-auth
    existingSecretKey: token
  jwt:
    issuer: 'https://auth.example.com'
    audience: 'mcp-server'
    jwksUri: 'https://auth.example.com/.well-known/jwks.json'
    algorithm: RS256
  scopes:
    - tools:read
    - tools:execute
  requiredScopes:
    - tools:execute
  clientId: fastmcp-server
  requireHumanApprovalForDestructive: true

Gateway Mode

gateway:
  enabled: true
  mountServers:
    github:
      transport: streamable-http
      url: 'https://github-mcp.example.com/mcp'
      auth:
        type: bearer
        tokenEnv: GITHUB_MCP_TOKEN
    internal:
      transport: streamable-http
      url: 'http://internal-mcp.default.svc.cluster.local:8000/mcp'

extraEnv:
  - name: GITHUB_MCP_TOKEN
    valueFrom:
      secretKeyRef:
        name: github-mcp-auth
        key: token

Visibility and Reload

hotReload:
  enabled: true

visibility:
  mode: allowlist
  enableTags:
    - public
    - approved
  disableTags:
    - destructive
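
One plausible reading of these settings is sketched below. It assumes that a disabled tag always hides a tool, that allowlist mode additionally requires at least one enabled tag, and that blocklist mode shows everything else. is_visible is a hypothetical helper, not the server's actual filter.

```python
def is_visible(tool_tags: set, mode: str, enable_tags: set, disable_tags: set) -> bool:
    """Decide tool visibility from its __tags__ (assumed semantics, see lead-in)."""
    if tool_tags & disable_tags:
        return False                       # any disabled tag hides the tool
    if mode == "allowlist":
        return bool(tool_tags & enable_tags)  # must carry an enabled tag
    return True                            # blocklist: visible unless disabled


enable = {"public", "approved"}
disable = {"destructive"}
shown = is_visible({"public"}, "allowlist", enable, disable)
hidden = is_visible({"public", "destructive"}, "allowlist", enable, disable)
```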

Observability

Prometheus Metrics

metrics:
  enabled: true
  serviceMonitor:
    enabled: true # requires Prometheus Operator
    interval: 30s

Metrics exposed at /metrics: tool call counts, durations, errors, source sync status, auth attempts.

Structured Logging

server:
  logFormat: json # JSON output for log aggregation

Health Endpoints

| Endpoint | Type | Returns 200 when |
| --- | --- | --- |
| /healthz | Liveness | Always (process running) |
| /readyz | Readiness | Sources synced + components loaded |
| /startupz | Startup | Full initialization complete |

Diagnostics

GET /debug/info returns server version, FastMCP version, uptime, registered components, source status, auth type, and configuration.

Gateway API and Networking

Use gatewayAPI.enabled when an existing Gateway controller owns ingress traffic and the FastMCP chart should only render an HTTPRoute:

gatewayAPI:
  enabled: true
  parentRefs:
    - name: public-gateway
      namespace: gateway-system
      sectionName: https
  hostnames:
    - mcp.example.com
  paths:
    - type: PathPrefix
      value: /mcp

For clusters running IPv4/IPv6 dual-stack, set the Service family fields explicitly:

service:
  ipFamilyPolicy: PreferDualStack
  ipFamilies:
    - IPv4
    - IPv6

If you need the Helm release metadata in one namespace and the FastMCP workload in another, create the target namespace first and set namespaceOverride:

namespaceOverride: fastmcp-runtime

Rate Limiting

rateLimiting:
  default: '100/min' # global default
  perTool:
    DEPLOY: '5/min' # per-tool override
    DELETE_DATA: '2/min'

Or via module-level variable:

__rate_limit__ = "5/min"

def deploy(service: str, version: str) -> str:
    """Deploy a service (rate limited)."""
    return f"Deployed {service}@{version}"
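
The feature list describes the limiter as a sliding window. A minimal sketch of that technique for a spec such as "5/min" follows. The parsing rules, the units other than min, and the SlidingWindowLimiter class are assumptions for illustration, not the server's implementation.

```python
import time
from collections import deque

UNIT_SECONDS = {"sec": 1, "min": 60, "hour": 3600}  # only "min" appears on this page


def parse_limit(spec: str) -> tuple:
    """Split a spec like '5/min' into (max_calls, window_seconds)."""
    count, unit = spec.split("/")
    return int(count), UNIT_SECONDS[unit]


class SlidingWindowLimiter:
    """Allow at most max_calls within any trailing window of window seconds."""

    def __init__(self, spec: str, clock=time.monotonic):
        self.max_calls, self.window = parse_limit(spec)
        self.clock = clock
        self.calls = deque()  # timestamps of accepted calls, oldest first

    def allow(self) -> bool:
        now = self.clock()
        # Drop timestamps that have slid out of the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            return False
        self.calls.append(now)
        return True
```

Unlike a fixed per-minute counter, the window here moves with each call, so a burst at the end of one minute still counts against calls made early in the next.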

Caching

__cache_ttl__ = 300  # cache results for 5 minutes

def get_exchange_rate(currency: str) -> float:
    """Get current exchange rate (cached 5min)."""
    ...

caching:
  enabled: true # default
  maxSize: 1000 # max entries per tool
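
As a rough mental model of how a TTL and a per-tool size cap interact, consider the sketch below. TTLCache is hypothetical: it assumes entries are keyed by the call arguments and that the oldest entry is evicted once max_size is exceeded, which may not match the server's actual policy.

```python
import time
from collections import OrderedDict


class TTLCache:
    """Cache results per argument tuple for ttl seconds, holding at most max_size."""

    def __init__(self, ttl: float, max_size: int = 1000, clock=time.monotonic):
        self.ttl = ttl
        self.max_size = max_size
        self.clock = clock
        self.entries = OrderedDict()  # args -> (stored_at, value)

    def get_or_call(self, func, *args):
        now = self.clock()
        hit = self.entries.get(args)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]                     # still fresh: skip the call
        value = func(*args)
        self.entries[args] = (now, value)
        self.entries.move_to_end(args)
        while len(self.entries) > self.max_size:
            self.entries.popitem(last=False)  # evict the oldest entry
        return value
```

Caching of this kind only suits idempotent tools, since a cached call never reaches the underlying function.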

Tool Sandboxing

__max_memory_mb__ = 256
__max_output_size_kb__ = 100

def process_data(data: str) -> str:
    """Process data with resource limits."""
    ...

Autoscaling

autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80

pdb:
  enabled: true
  minAvailable: 1

Key Values

| Key | Default | Description |
| --- | --- | --- |
| namespaceOverride | "" | Override namespace for chart-managed resources |
| image.repository | docker.io/helmforge/fastmcp-server | Container image |
| image.tag | 0.11.2 | Image tag |
| server.name | fastmcp-server | Server name in MCP responses |
| server.version | "" | Server version env; empty uses chart appVersion |
| server.environment | dev | Runtime environment (MCP_ENV) |
| server.host | 0.0.0.0 | Bind host |
| server.port | 8000 | HTTP port |
| server.path | /mcp | MCP endpoint path |
| server.workspace | /app/workspace | Runtime source workspace |
| server.logFormat | text | Log format: text or json |
| server.strictLoading | false | Fail on boot if component errors |
| server.maskErrorDetails | "" | Override runtime error masking behavior |
| server.onDuplicateTools | error | Duplicate tool policy: error, warn, replace |
| server.corsAllowedOrigins | [] | Allowed CORS origins |
| server.maxSourceFileSizeBytes | 1048576 | Max source file size |
| server.maxKnowledgeBytes | 10485760 | Max knowledge file size |
| server.allowedKnowledgeExtensions | [".md", ".txt"] | Knowledge file extensions |
| server.strategy.type | Recreate | Deployment strategy |
| server.revisionHistoryLimit | 10 | Deployment revision history |
| ui.enabled | true | Enable Web UI at /ui |
| metrics.enabled | false | Enable Prometheus metrics at /metrics |
| metrics.serviceMonitor.enabled | false | Create ServiceMonitor when metrics are enabled |
| auth.type | none | Authentication: none, bearer, jwt, multi |
| auth.allowNoAuth | false | Explicitly allow no-auth production deployments |
| auth.bearer.token | "" | Bearer token secret value |
| auth.bearer.existingSecret | "" | Existing bearer token secret |
| auth.jwt.issuer | "" | JWT issuer |
| auth.jwt.audience | "" | JWT audience |
| auth.jwt.jwksUri | "" | JWT JWKS URI |
| auth.jwt.algorithm | RS256 | JWT verification algorithm |
| auth.jwt.publicKeyExistingSecret | "" | Existing JWT public key secret |
| auth.scopes | [] | Advertised auth scopes |
| auth.requiredScopes | [] | Required request scopes |
| auth.clientId | "" | OAuth client identifier |
| auth.providers | [] | Providers used by multi auth |
| auth.reloadRequiredScopes | [] | Scopes required for reload operations |
| auth.requireHumanApprovalForDestructive | true | Require approval metadata for destructive tools |
| rateLimiting.default | "" | Default rate limit (e.g., 100/min) |
| caching.enabled | true | Enable result caching |
| sandboxing.maxMemoryMb | 0 | Default max memory per tool (MB) |
| sandboxing.maxOutputSizeKb | 0 | Default max output per tool (KB) |
| sources.blockedFileAllowlist | [] | Explicit allowlist for blocked file names |
| sources.inline.dir | /workspace/inline | Inline ConfigMap mount directory |
| sources.inline.tools | {} | Inline Python tool files |
| sources.inline.resources | {} | Inline resource files |
| sources.inline.prompts | {} | Inline prompt files |
| sources.inline.knowledge | {} | Inline knowledge base files |
| sources.s3.enabled | false | Enable S3 source |
| sources.s3.bucket | "" | S3 bucket name |
| sources.s3.include | [] | S3 include glob patterns |
| sources.s3.exclude | [] | S3 exclude glob patterns |
| sources.s3.syncInterval | 0 | S3 polling interval in seconds |
| sources.git.enabled | false | Enable Git source |
| sources.git.repository | "" | Git repository HTTPS URL |
| sources.git.username | "" | Optional Git username |
| sources.git.allowedRepositories | [] | Allowed Git repository URLs |
| sources.git.allowedBranches | [] | Allowed Git branches |
| sources.git.include | [] | Git include glob patterns |
| sources.git.exclude | [] | Git exclude glob patterns |
| sources.git.syncInterval | 0 | Git polling interval in seconds |
| sources.oci.enabled | false | Enable OCI artifact source |
| sources.oci.registry | "" | OCI artifact reference |
| sources.oci.tag | "" | OCI artifact tag; empty lets runtime decide |
| sources.oci.existingSecret | "" | Existing OCI registry credential secret |
| sources.oci.include | [] | OCI include glob patterns |
| sources.oci.exclude | [] | OCI exclude glob patterns |
| hotReload.enabled | false | Enable filesystem hot reload |
| gateway.enabled | false | Run server in gateway mode |
| gateway.mountServers | {} | Gateway mount server map |
| gateway.rawMountServersJson | "" | Raw MCP_MOUNT_SERVERS JSON override |
| visibility.mode | blocklist | Tool visibility mode |
| visibility.enableTags | [] | Allowlisted tags |
| visibility.disableTags | [] | Hidden tags |
| extraPipPackages | [] | Extra pip packages to install at startup |
| initSync.enabled | false | Run source sync as init container |
| persistence.enabled | false | Enable persistent workspace volume |
| serviceAccount.create | false | Create a dedicated Kubernetes ServiceAccount |
| serviceAccount.automountServiceAccountToken | false | Automount Kubernetes API token |
| service.ipFamilyPolicy | "" | Optional Service IP family policy |
| service.ipFamilies | [] | Optional Service IP families |
| gatewayAPI.enabled | false | Create Gateway API HTTPRoute |
| gatewayAPI.parentRefs | [] | Parent Gateway references |
| gatewayAPI.hostnames | [] | Hostnames attached to the HTTPRoute |
| networkPolicy.enabled | false | Create NetworkPolicy |
| networkPolicy.ingress | [] | Custom ingress rules; empty uses service port |
| networkPolicy.egress | [] | Custom egress rules; empty allows outbound sync |
| autoscaling.enabled | false | Enable HPA |
| pdb.enabled | false | Enable PodDisruptionBudget |
| ingress.enabled | false | Enable ingress |

Operational Notes

  • Merge precedence is Inline > S3 > Git > OCI — if a tool with the same filename exists in multiple sources, the highest-precedence version wins
  • Production-like environments require an auth mode unless auth.allowNoAuth=true is set explicitly
  • ServiceMonitor requires metrics.enabled=true
  • Gateway mode requires either gateway.mountServers or gateway.rawMountServersJson
  • Knowledge base files are served as MCP resources at knowledge://{filename} URIs
  • Tools are Python files with top-level functions; resources need a RESOURCE_URI or RESOURCES module-level variable
  • Tools support optional metadata: __tags__ (set), __timeout__ (float), __annotations_mcp__ (dict)
  • Resource URIs can use {param} placeholders for dynamic templates
  • The extraPipPackages list installs before tools load — use it when tools import external libraries
  • The Web UI auto-refreshes every 15 seconds and requires no external dependencies
  • Init container pattern (initSync.enabled) separates source syncing from server startup for better Kubernetes readiness semantics
  • Readiness uses /healthz by default so empty starter deployments become ready; use /readyz only when you want readiness gated on loaded content
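
The notes above on RESOURCES and {param} placeholders could translate into a resource file like the following. This is a sketch under assumptions: the file name, the status://service URI, and the idea that placeholder values reach the handler as function arguments are all hypothetical. Only the RESOURCES variable name and the users://{user_id}/profile template come from this page.

```python
# resources/users.py: hypothetical resource module (sketch)


def user_profile(user_id: str) -> str:
    """Return a profile document for one user."""
    return f"Profile for user {user_id}"


def service_status() -> str:
    """Return a static status document."""
    return "ok"


# Several URIs served from one file; {user_id} marks a template parameter.
RESOURCES = {
    "users://{user_id}/profile": user_profile,
    "status://service": service_status,
}
```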

More Information