FastMCP Server
Helm chart for deploying FastMCP Server on Kubernetes. FastMCP Server is an MCP (Model Context Protocol) server that dynamically loads tools, resources, prompts, and knowledge bases from multiple sources.
Key Features
- Multi-source loading: tools, resources, prompts, and knowledge from inline ConfigMaps, S3-compatible storage (AWS S3, MinIO, R2), Git repositories, or OCI artifacts
- Merge precedence: Inline (highest) > S3 > Git > OCI (lowest), so you can override remote tools locally without touching the upstream source
- Bearer, JWT, and multi-auth: configure one auth mode or combine multiple providers through FastMCP
- Knowledge base support: serve Markdown files as MCP resources for RAG and context injection
- Extra pip packages: install additional Python packages at startup before loading tools
- Tool metadata: optional __tags__, __timeout__, __annotations_mcp__ module variables for tool categorization and behavior hints
- Resource templates: parameterized URIs like users://{user_id}/profile for dynamic resources
- Multiple resources per file: RESOURCES dict maps multiple URIs to handler functions
- Error masking: hide internal error details from clients via MCP_MASK_ERROR_DETAILS
- Duplicate handling: control behavior when tools share names via MCP_ON_DUPLICATE_TOOLS
- Built-in Web UI: dashboard at /ui with tools/resources/prompts explorer (Alpine.js + Tailwind CDN)
- Prometheus metrics: tool call counts, durations, errors, source sync status at /metrics
- Structured JSON logging: LOG_FORMAT=json for Loki, ELK, CloudWatch, Datadog
- Dedicated health endpoints: /healthz (liveness/readiness by default), /readyz, /startupz
- Diagnostic endpoint: GET /debug/info with full server introspection
- Init container pattern: pre-sync sources before server starts via initSync.enabled
- Gateway API HTTPRoute: expose the MCP endpoint through existing Gateway listeners
- Dual-stack Service controls: optional service.ipFamilyPolicy and service.ipFamilies
- Namespace override: deploy chart-managed resources into a target namespace while keeping the Helm release separate
- Restricted runtime defaults: non-root UID/GID 1000, seccomp RuntimeDefault, dropped Linux capabilities, and no service account token automount
- Strict loading: MCP_STRICT_LOADING=true fails on boot if any tool/resource has errors
- Hot reload: automatic tool/resource reload on filesystem changes via MCP_HOT_RELOAD=true
- Periodic sync: poll S3/Git sources for changes at configurable intervals
- Webhook reload: POST /reload endpoint for CI/CD-triggered reloads
- OCI artifact source: pull tool bundles from OCI registries via ORAS with optional registry credentials
- Selective sync: include/exclude glob patterns for source filtering
- Gateway mode: compose multiple MCP servers via MCP_MODE=gateway and MCP_MOUNT_SERVERS
- Tag visibility: enable/disable tools by tags with MCP_ENABLE_TAGS and MCP_DISABLE_TAGS
- Multi-auth: combine bearer + JWT providers via MCP_AUTH_PROVIDERS
- Tool-level scopes: __required_scopes__ module variable for authorization
- Context integration: tools can use ctx: Context for progress, logging, sampling, elicitation, and session state
- Rate limiting: __rate_limit__ module variable or MCP_RATE_LIMIT_DEFAULT env var (sliding window)
- Caching: __cache_ttl__ module variable for idempotent tool result caching
- Tool sandboxing: __max_memory_mb__ and __max_output_size_kb__ resource limits per tool
- PodDisruptionBudget: pdb.enabled for zero-downtime rolling updates
- HorizontalPodAutoscaler: autoscaling.enabled for auto-scaling based on CPU/memory
- Security: Trivy vulnerability scan, CycloneDX SBOM, Cosign keyless signing, SLSA provenance
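The merge precedence in the feature list (Inline > S3 > Git > OCI) can be pictured as a layered dictionary merge keyed by filename. The sketch below is only an illustration of that rule, not the server's actual loader; the function name merge_sources and the shape of its input are hypothetical.

```python
# Sources ordered from lowest to highest precedence, per the feature list.
PRECEDENCE = ["oci", "git", "s3", "inline"]


def merge_sources(files_by_source: dict[str, dict[str, str]]) -> dict[str, tuple[str, str]]:
    """Return {filename: (winning_source, content)}.

    Later (higher-precedence) sources overwrite earlier ones when the
    same filename appears in more than one source.
    """
    merged: dict[str, tuple[str, str]] = {}
    for source in PRECEDENCE:
        for filename, content in files_by_source.get(source, {}).items():
            merged[filename] = (source, content)
    return merged


if __name__ == "__main__":
    result = merge_sources(
        {
            "git": {"greet.py": "# upstream version", "math_ops.py": "# upstream"},
            "inline": {"greet.py": "# local override"},
        }
    )
    for name, (source, _) in sorted(result.items()):
        print(f"{name} <- {source}")
```

Under this model, an inline greet.py shadows the Git copy while math_ops.py still comes from Git.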
Installation
HTTPS Repository
helm repo add helmforge https://repo.helmforge.dev
helm repo update
helm install fastmcp-server helmforge/fastmcp-server
OCI Registry
helm install fastmcp-server oci://ghcr.io/helmforgedev/helm/fastmcp-server
Basic Example — Inline Tools
# values.yaml
sources:
  inline:
    tools:
      greet.py: |
        def greet(name: str) -> str:
            """Greet someone by name."""
            return f"Hello, {name}!"
      math_ops.py: |
        def add(a: float, b: float) -> float:
            """Add two numbers."""
            return a + b

        def multiply(a: float, b: float) -> float:
            """Multiply two numbers."""
            return a * b
    knowledge:
      overview.md: |
        # Product Overview
        This document provides context for the AI assistant.
S3 Source (MinIO, AWS S3, Cloudflare R2)
sources:
  s3:
    enabled: true
    endpoint: 'https://minio.example.com'
    bucket: mcp-tools
    region: us-east-1
    prefix: production
    accessKey: '<access-key>'
    secretKey: '<secret-key>'
    include:
      - 'tools/**/*.py'
      - 'knowledge/**/*.md'
    exclude:
      - '**/*.tmp'
    syncInterval: 300
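To reason about which files the include and exclude lists above would select, it can help to model the filter. The sketch below is one plausible reading, not the chart's implementation: it assumes `**/` matches zero or more directories, `*` does not cross a `/`, an empty include list selects everything, and exclude wins over include.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a glob with ** support into a compiled regex (assumed semantics)."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:[^/]+/)*")  # zero or more directory levels
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")  # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")  # anything within one path segment
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out) + r"\Z")


def is_selected(path: str, include: list[str], exclude: list[str]) -> bool:
    """Return True if path passes the include list and is not excluded."""
    included = not include or any(glob_to_regex(p).match(path) for p in include)
    excluded = any(glob_to_regex(p).match(path) for p in exclude)
    return included and not excluded


if __name__ == "__main__":
    include = ["tools/**/*.py", "knowledge/**/*.md"]
    exclude = ["**/*.tmp"]
    for candidate in ["tools/greet.py", "tools/math/ops.py", "notes.txt", "tools/x.tmp"]:
        print(candidate, is_selected(candidate, include, exclude))
```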
Git Source
sources:
  git:
    enabled: true
    repository: 'https://github.com/your-org/mcp-tools.git'
    branch: main
    path: '' # optional subdirectory
    token: '<github-token>' # for private repos
    allowedRepositories:
      - 'https://github.com/your-org/mcp-tools.git'
    allowedBranches:
      - main
    include:
      - 'tools/**/*.py'
      - 'resources/**/*.py'
    exclude:
      - '**/private/**'
    syncInterval: 300
OCI Source
sources:
  oci:
    enabled: true
    registry: ghcr.io/your-org/mcp-bundle
    tag: '1.0.0'
    username: '<registry-user>'
    password: '<registry-token>'
    include:
      - 'tools/**/*.py'
      - 'knowledge/**/*.md'
    exclude:
      - '**/*.tmp'
Authentication
Bearer Token
auth:
  type: bearer
  bearer:
    token: my-secret-token
    # or use an existing Kubernetes secret:
    # existingSecret: my-auth-secret
    # existingSecretKey: token
JWT
auth:
  type: jwt
  jwt:
    issuer: 'https://auth.example.com'
    audience: 'mcp-server'
    jwksUri: 'https://auth.example.com/.well-known/jwks.json'
Multi-Auth and Scopes
auth:
  type: multi
  providers:
    - bearer
    - jwt
  bearer:
    existingSecret: fastmcp-auth
    existingSecretKey: token
  jwt:
    issuer: 'https://auth.example.com'
    audience: 'mcp-server'
    jwksUri: 'https://auth.example.com/.well-known/jwks.json'
    algorithm: RS256
  scopes:
    - tools:read
    - tools:execute
  requiredScopes:
    - tools:execute
  clientId: fastmcp-server
  requireHumanApprovalForDestructive: true
Gateway Mode
gateway:
  enabled: true
  mountServers:
    github:
      transport: streamable-http
      url: 'https://github-mcp.example.com/mcp'
      auth:
        type: bearer
        tokenEnv: GITHUB_MCP_TOKEN
    internal:
      transport: streamable-http
      url: 'http://internal-mcp.default.svc.cluster.local:8000/mcp'
extraEnv:
  - name: GITHUB_MCP_TOKEN
    valueFrom:
      secretKeyRef:
        name: github-mcp-auth
        key: token
Visibility and Reload
hotReload:
  enabled: true
visibility:
  mode: allowlist
  enableTags:
    - public
    - approved
  disableTags:
    - destructive
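The interaction between visibility.mode, enableTags, and disableTags is not spelled out here, so the following is a sketch of one plausible interpretation rather than documented behavior. It assumes that a disabled tag always hides a tool, that allowlist mode requires at least one enabled tag, and that blocklist mode shows everything not disabled. The function name is hypothetical.

```python
def is_visible(
    tool_tags: set[str],
    mode: str,
    enable_tags: set[str],
    disable_tags: set[str],
) -> bool:
    """Decide whether a tool is exposed, under the assumptions stated above."""
    # Assumption: any disabled tag hides the tool regardless of mode.
    if tool_tags & disable_tags:
        return False
    # Assumption: allowlist mode needs at least one enabled tag.
    if mode == "allowlist":
        return bool(tool_tags & enable_tags)
    # Assumption: blocklist mode shows everything that is not disabled.
    return True


if __name__ == "__main__":
    enable = {"public", "approved"}
    disable = {"destructive"}
    print(is_visible({"public"}, "allowlist", enable, disable))
    print(is_visible({"public", "destructive"}, "allowlist", enable, disable))
    print(is_visible({"internal"}, "allowlist", enable, disable))
```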
Observability
Prometheus Metrics
metrics:
  enabled: true
  serviceMonitor:
    enabled: true # requires Prometheus Operator
    interval: 30s
Metrics are exposed at /metrics and cover tool call counts, durations, errors, source sync status, and auth attempts.
Structured Logging
server:
  logFormat: json # JSON output for log aggregation
Health Endpoints
| Endpoint | Type | When 200 |
|---|---|---|
| /healthz | Liveness | Always (process running) |
| /readyz | Readiness | Sources synced + components loaded |
| /startupz | Startup | Full initialization complete |
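The table can be read as a small decision function over server state. The sketch below restates it in code. The ServerState fields and the 503 status for a failing probe are assumptions, since the table only states when each endpoint returns 200.

```python
from dataclasses import dataclass


@dataclass
class ServerState:
    """Hypothetical view of the state the probes depend on."""

    sources_synced: bool = False
    components_loaded: bool = False
    initialized: bool = False


def probe_status(path: str, state: ServerState) -> int:
    """Return the HTTP status a probe would see, per the table above."""
    if path == "/healthz":
        return 200  # always OK while the process is running
    if path == "/readyz":
        ready = state.sources_synced and state.components_loaded
        return 200 if ready else 503  # 503 is an assumed failure code
    if path == "/startupz":
        return 200 if state.initialized else 503
    return 404


if __name__ == "__main__":
    booting = ServerState(sources_synced=True)
    for endpoint in ("/healthz", "/readyz", "/startupz"):
        print(endpoint, probe_status(endpoint, booting))
```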
Diagnostics
GET /debug/info returns server version, FastMCP version, uptime, registered components, source status, auth type, and configuration.
Gateway API and Networking
Use gatewayAPI.enabled when an existing Gateway controller owns ingress traffic and the FastMCP chart should only render an HTTPRoute:
gatewayAPI:
  enabled: true
  parentRefs:
    - name: public-gateway
      namespace: gateway-system
      sectionName: https
  hostnames:
    - mcp.example.com
  paths:
    - type: PathPrefix
      value: /mcp
For clusters running IPv4/IPv6 dual-stack, set the Service family fields explicitly:
service:
  ipFamilyPolicy: PreferDualStack
  ipFamilies:
    - IPv4
    - IPv6
If you need the Helm release metadata in one namespace and the FastMCP workload in another, create the target namespace first and set namespaceOverride:
namespaceOverride: fastmcp-runtime
Rate Limiting
rateLimiting:
  default: '100/min' # global default
  perTool:
    DEPLOY: '5/min' # per-tool override
    DELETE_DATA: '2/min'
Or via module-level variable:
__rate_limit__ = "5/min"

def deploy(service: str, version: str) -> str:
    """Deploy a service (rate limited)."""
    return f"Deployed {service}@{version}"
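The feature list describes the limiter as a sliding window. A minimal sketch of that technique, assuming a "count/unit" spec like the "5/min" shown above, is given below. The class name, the injectable clock, and the sec and hour units are assumptions; only min appears in this document.

```python
import time
from collections import deque

# Only "min" appears in this document; the other units are assumed.
UNIT_SECONDS = {"sec": 1.0, "min": 60.0, "hour": 3600.0}


def parse_rate(spec: str) -> tuple[int, float]:
    """Parse a spec such as "5/min" into (max_calls, window_seconds)."""
    count, unit = spec.split("/")
    return int(count), UNIT_SECONDS[unit]


class SlidingWindowLimiter:
    """Allow at most N calls within any trailing window of W seconds."""

    def __init__(self, spec: str, clock=time.monotonic):
        self.limit, self.window = parse_rate(spec)
        self.clock = clock
        self.calls: deque[float] = deque()  # timestamps of accepted calls

    def allow(self) -> bool:
        now = self.clock()
        # Drop timestamps that have slid out of the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) >= self.limit:
            return False
        self.calls.append(now)
        return True


if __name__ == "__main__":
    limiter = SlidingWindowLimiter("5/min")
    print([limiter.allow() for _ in range(6)])
```

Unlike a fixed window that resets on the minute boundary, this approach never admits more than the limit within any trailing 60 seconds.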
Caching
__cache_ttl__ = 300 # cache results for 5 minutes

def get_exchange_rate(currency: str) -> float:
    """Get current exchange rate (cached 5min)."""
    ...

Chart-level cache settings:
caching:
  enabled: true # default
  maxSize: 1000 # max entries per tool
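To illustrate how a TTL and a per-tool entry cap could interact, here is a small cache sketch. It is not the server's cache. Keying by positional arguments, evicting the oldest entry first, and the TTLCache name are all assumptions made for the example.

```python
import time
from collections import OrderedDict


class TTLCache:
    """Cache results per argument tuple, expiring after ttl seconds."""

    def __init__(self, ttl: float, max_size: int = 1000, clock=time.monotonic):
        self.ttl = ttl
        self.max_size = max_size
        self.clock = clock
        self.entries: OrderedDict = OrderedDict()  # key -> (stored_at, value)

    def get_or_call(self, func, *args):
        now = self.clock()
        hit = self.entries.get(args)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # still fresh: skip the call
        value = func(*args)
        self.entries[args] = (now, value)
        self.entries.move_to_end(args)
        # Assumed policy: evict the oldest stored entries beyond max_size.
        while len(self.entries) > self.max_size:
            self.entries.popitem(last=False)
        return value


if __name__ == "__main__":
    def get_exchange_rate(currency: str) -> float:
        print(f"fetching {currency}")
        return 1.08

    cache = TTLCache(ttl=300, max_size=1000)
    cache.get_or_call(get_exchange_rate, "EUR")  # calls the function
    cache.get_or_call(get_exchange_rate, "EUR")  # served from cache
```

Caching like this only makes sense for idempotent tools, which matches the wording in the feature list.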
Tool Sandboxing
__max_memory_mb__ = 256
__max_output_size_kb__ = 100

def process_data(data: str) -> str:
    """Process data with resource limits."""
    ...
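How the server enforces these limits is not described here. As a rough sketch of the output-size side only, the check below measures the UTF-8 size of a result against a kilobyte cap. Raising an error rather than truncating, treating 0 as unlimited (matching the 0 defaults in the values table), and the names used are assumptions. The memory limit is not sketched.

```python
class OutputTooLargeError(Exception):
    """Raised when a tool result exceeds its configured size cap."""


def enforce_output_limit(result: str, max_output_size_kb: int) -> str:
    """Return result unchanged if it fits, otherwise raise."""
    if max_output_size_kb <= 0:
        return result  # assumption: 0 disables the check
    size_bytes = len(result.encode("utf-8"))
    limit_bytes = max_output_size_kb * 1024
    if size_bytes > limit_bytes:
        raise OutputTooLargeError(
            f"output is {size_bytes} bytes, limit is {limit_bytes} bytes"
        )
    return result


if __name__ == "__main__":
    print(len(enforce_output_limit("ok", 100)))
    try:
        enforce_output_limit("x" * (100 * 1024 + 1), 100)
    except OutputTooLargeError as exc:
        print("rejected:", exc)
```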
Autoscaling
autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80
pdb:
  enabled: true
  minAvailable: 1
Key Values
| Key | Default | Description |
|---|---|---|
| namespaceOverride | "" | Override namespace for chart-managed resources |
| image.repository | docker.io/helmforge/fastmcp-server | Container image |
| image.tag | 0.11.2 | Image tag |
| server.name | fastmcp-server | Server name in MCP responses |
| server.version | "" | Server version env; empty uses chart appVersion |
| server.environment | dev | Runtime environment (MCP_ENV) |
| server.host | 0.0.0.0 | Bind host |
| server.port | 8000 | HTTP port |
| server.path | /mcp | MCP endpoint path |
| server.workspace | /app/workspace | Runtime source workspace |
| server.logFormat | text | Log format: text or json |
| server.strictLoading | false | Fail on boot if component errors |
| server.maskErrorDetails | "" | Override runtime error masking behavior |
| server.onDuplicateTools | error | Duplicate tool policy: error, warn, replace |
| server.corsAllowedOrigins | [] | Allowed CORS origins |
| server.maxSourceFileSizeBytes | 1048576 | Max source file size |
| server.maxKnowledgeBytes | 10485760 | Max knowledge file size |
| server.allowedKnowledgeExtensions | [".md", ".txt"] | Knowledge file extensions |
| server.strategy.type | Recreate | Deployment strategy |
| server.revisionHistoryLimit | 10 | Deployment revision history |
| ui.enabled | true | Enable Web UI at /ui |
| metrics.enabled | false | Enable Prometheus metrics at /metrics |
| metrics.serviceMonitor.enabled | false | Create ServiceMonitor when metrics are enabled |
| auth.type | none | Authentication: none, bearer, jwt, multi |
| auth.allowNoAuth | false | Explicitly allow no-auth production deployments |
| auth.bearer.token | "" | Bearer token secret value |
| auth.bearer.existingSecret | "" | Existing bearer token secret |
| auth.jwt.issuer | "" | JWT issuer |
| auth.jwt.audience | "" | JWT audience |
| auth.jwt.jwksUri | "" | JWT JWKS URI |
| auth.jwt.algorithm | RS256 | JWT verification algorithm |
| auth.jwt.publicKeyExistingSecret | "" | Existing JWT public key secret |
| auth.scopes | [] | Advertised auth scopes |
| auth.requiredScopes | [] | Required request scopes |
| auth.clientId | "" | OAuth client identifier |
| auth.providers | [] | Providers used by multi auth |
| auth.reloadRequiredScopes | [] | Scopes required for reload operations |
| auth.requireHumanApprovalForDestructive | true | Require approval metadata for destructive tools |
| rateLimiting.default | "" | Default rate limit (e.g., 100/min) |
| caching.enabled | true | Enable result caching |
| sandboxing.maxMemoryMb | 0 | Default max memory per tool (MB) |
| sandboxing.maxOutputSizeKb | 0 | Default max output per tool (KB) |
| sources.blockedFileAllowlist | [] | Explicit allowlist for blocked file names |
| sources.inline.dir | /workspace/inline | Inline ConfigMap mount directory |
| sources.inline.tools | {} | Inline Python tool files |
| sources.inline.resources | {} | Inline resource files |
| sources.inline.prompts | {} | Inline prompt files |
| sources.inline.knowledge | {} | Inline knowledge base files |
| sources.s3.enabled | false | Enable S3 source |
| sources.s3.bucket | "" | S3 bucket name |
| sources.s3.include | [] | S3 include glob patterns |
| sources.s3.exclude | [] | S3 exclude glob patterns |
| sources.s3.syncInterval | 0 | S3 polling interval in seconds |
| sources.git.enabled | false | Enable Git source |
| sources.git.repository | "" | Git repository HTTPS URL |
| sources.git.username | "" | Optional Git username |
| sources.git.allowedRepositories | [] | Allowed Git repository URLs |
| sources.git.allowedBranches | [] | Allowed Git branches |
| sources.git.include | [] | Git include glob patterns |
| sources.git.exclude | [] | Git exclude glob patterns |
| sources.git.syncInterval | 0 | Git polling interval in seconds |
| sources.oci.enabled | false | Enable OCI artifact source |
| sources.oci.registry | "" | OCI artifact reference |
| sources.oci.tag | "" | OCI artifact tag; empty lets runtime decide |
| sources.oci.existingSecret | "" | Existing OCI registry credential secret |
| sources.oci.include | [] | OCI include glob patterns |
| sources.oci.exclude | [] | OCI exclude glob patterns |
| hotReload.enabled | false | Enable filesystem hot reload |
| gateway.enabled | false | Run server in gateway mode |
| gateway.mountServers | {} | Gateway mount server map |
| gateway.rawMountServersJson | "" | Raw MCP_MOUNT_SERVERS JSON override |
| visibility.mode | blocklist | Tool visibility mode |
| visibility.enableTags | [] | Allowlisted tags |
| visibility.disableTags | [] | Hidden tags |
| extraPipPackages | [] | Extra pip packages to install at startup |
| initSync.enabled | false | Run source sync as init container |
| persistence.enabled | false | Enable persistent workspace volume |
| serviceAccount.create | false | Create a dedicated Kubernetes ServiceAccount |
| serviceAccount.automountServiceAccountToken | false | Automount Kubernetes API token |
| service.ipFamilyPolicy | "" | Optional Service IP family policy |
| service.ipFamilies | [] | Optional Service IP families |
| gatewayAPI.enabled | false | Create Gateway API HTTPRoute |
| gatewayAPI.parentRefs | [] | Parent Gateway references |
| gatewayAPI.hostnames | [] | Hostnames attached to the HTTPRoute |
| networkPolicy.enabled | false | Create NetworkPolicy |
| networkPolicy.ingress | [] | Custom ingress rules; empty uses service port |
| networkPolicy.egress | [] | Custom egress rules; empty allows outbound sync |
| autoscaling.enabled | false | Enable HPA |
| pdb.enabled | false | Enable PodDisruptionBudget |
| ingress.enabled | false | Enable ingress |
Operational Notes
- Merge precedence is Inline > S3 > Git > OCI: if a tool with the same filename exists in multiple sources, the highest-precedence version wins
- Production-like environments require an auth mode unless auth.allowNoAuth=true is set explicitly
- ServiceMonitor requires metrics.enabled=true
- Gateway mode requires either gateway.mountServers or gateway.rawMountServersJson
- Knowledge base files are served as MCP resources at knowledge://{filename} URIs
- Tools are Python files with top-level functions; resources need a RESOURCE_URI or RESOURCES module-level variable
- Tools support optional metadata: __tags__ (set), __timeout__ (float), __annotations_mcp__ (dict)
- Resource URIs can use {param} placeholders for dynamic templates
- The extraPipPackages list installs before tools load; use it when tools import external libraries
- The Web UI auto-refreshes every 15 seconds and requires no external dependencies
- Init container pattern (initSync.enabled) separates source syncing from server startup for better Kubernetes readiness semantics
- Readiness uses /healthz by default so empty starter deployments become ready; use /readyz only when you want readiness gated on loaded content
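The notes on RESOURCES and {param} placeholders can be made concrete with a small resource module. The sketch below shows a RESOURCES dict mapping URI templates to handlers, plus a toy resolver that demonstrates how a placeholder could bind to a handler argument. The handler signatures, the resolve helper, and the rule that a placeholder matches one path segment are assumptions; the server's own matching may differ.

```python
import re


def get_profile(user_id: str) -> str:
    """Hypothetical handler for a user's profile."""
    return f"Profile for user {user_id}"


def get_settings(user_id: str) -> str:
    """Hypothetical handler for a user's settings."""
    return f"Settings for user {user_id}"


# Multiple URIs per file: each template maps to a handler function.
RESOURCES = {
    "users://{user_id}/profile": get_profile,
    "users://{user_id}/settings": get_settings,
}


def template_to_regex(template: str) -> re.Pattern:
    """Turn {param} placeholders into named groups (one path segment each)."""
    parts = re.split(r"\{(\w+)\}", template)
    # Even indexes are literal text, odd indexes are placeholder names.
    pattern = "".join(
        re.escape(part) if index % 2 == 0 else f"(?P<{part}>[^/]+)"
        for index, part in enumerate(parts)
    )
    return re.compile(pattern + r"\Z")


def resolve(uri: str) -> str:
    """Find the first matching template and call its handler (illustration only)."""
    for template, handler in RESOURCES.items():
        match = template_to_regex(template).match(uri)
        if match:
            return handler(**match.groupdict())
    raise KeyError(uri)


if __name__ == "__main__":
    print(resolve("users://42/profile"))
```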