Troubleshooting

Symptom-based guide to diagnosing and resolving common issues with HelmForge charts.


Pod stays in CrashLoopBackOff

Symptoms: Pod restarts repeatedly, kubectl get pods shows CrashLoopBackOff status.

Diagnosis:

# Check pod logs
kubectl logs <pod-name> --previous

# Check pod events
kubectl describe pod <pod-name>

Common causes:

| Cause | Fix |
|---|---|
| Missing or wrong database credentials | Check auth.existingSecret references and secret key names |
| Insufficient memory (OOMKilled) | Increase resources.limits.memory |
| Wrong image tag or missing image | Verify image.tag matches a valid published version |
| Config file syntax error | Check mounted ConfigMaps for YAML/JSON syntax |
| Dependency not ready | Ensure dependent services (database, Redis) are running first |

If the pod log shows exec format error, you may be running an AMD64 image on an ARM node (or vice versa). Check the image supports your node architecture.
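Most of the fixes above map to chart values. A minimal values.yaml sketch, assuming the key paths named in the table (the tag and secret name are placeholders, not real values):

```yaml
# values.yaml sketch -- key paths taken from the table above;
# adjust names to match your chart's actual schema.
image:
  tag: "1.2.3"                   # placeholder; must be a published tag for your node architecture

resources:
  limits:
    memory: 1Gi                  # raise if the pod is OOMKilled

auth:
  existingSecret: my-db-secret   # placeholder; the secret must contain the expected key names
```

After editing values, re-run helm upgrade and watch the pod restart count settle back to zero.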


PVC stuck in Pending

Symptoms: kubectl get pvc shows Pending status, pods cannot start.

Diagnosis:

kubectl describe pvc <pvc-name>
kubectl get storageclass

Common causes:

| Cause | Fix |
|---|---|
| No default StorageClass | Set a default: kubectl patch sc <name> -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}' |
| StorageClass doesn't exist | Create the StorageClass or change persistence.storageClass in values |
| Insufficient cluster storage | Free disk space or add nodes |
| WaitForFirstConsumer binding | The PVC binds only when a pod is scheduled; check pod scheduling issues instead |
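To point the chart at an existing StorageClass, a hedged values sketch (class name and size are placeholders; only persistence.storageClass is named in the table above):

```yaml
# values.yaml sketch -- set an explicit StorageClass so the PVC
# does not depend on a cluster-wide default existing.
persistence:
  storageClass: local-path   # placeholder; must appear in `kubectl get storageclass`
  size: 10Gi                 # assumed key; size must fit available cluster storage
```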

Database connection refused

Symptoms: Application pods log connection refused or could not connect to server when trying to reach a database.

Diagnosis:

# Check if the database pod is running
kubectl get pods -l app.kubernetes.io/name=<chart-name>

# Check the service exists
kubectl get svc -l app.kubernetes.io/name=<chart-name>

# Test connectivity from within the cluster
kubectl run debug --rm -it --image=busybox -- sh
# then: nc -zv <service-name> <port>

Common causes:

| Cause | Fix |
|---|---|
| Database pod not ready | Wait for the readiness probe to pass; check logs for startup errors |
| Wrong service name | Use the full service name: <release>-<chart>.<namespace>.svc.cluster.local |
| Wrong port | Check service.port in the chart's values |
| Network policy blocking traffic | Check NetworkPolicies in the namespace |
| Auth mismatch | Verify the application uses the same credentials as the database chart |
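When the application is configured through chart values, the fully qualified service name from the table above usually goes into a host setting. A hypothetical sketch (the externalDatabase block is illustrative; your chart's key names may differ):

```yaml
# values.yaml sketch -- hypothetical key layout; check your chart's
# values reference for the real database connection settings.
externalDatabase:
  host: my-release-postgresql.default.svc.cluster.local  # <release>-<chart>.<namespace>.svc.cluster.local
  port: 5432                                             # must match service.port on the database chart
```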

Ingress returns 404 or 503

Symptoms: Ingress resource exists but the application returns 404 or 503 errors.

Diagnosis:

# Check ingress resource
kubectl describe ingress <ingress-name>

# Check ingress controller logs
kubectl logs -n <ingress-namespace> -l app.kubernetes.io/name=<controller>

# Verify backend service
kubectl get endpoints <service-name>

Common causes:

| Cause | Fix |
|---|---|
| Wrong ingressClassName | Match the class to your installed controller (traefik, nginx, etc.) |
| No ingress controller installed | Install one: helm install traefik traefik/traefik |
| Service has no endpoints | Check if pods are running and passing readiness probes |
| Path mismatch | Verify pathType (Prefix vs Exact) matches your app's routing |
| TLS secret missing | Create the TLS secret or configure cert-manager |
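The class, path, and TLS causes above typically correspond to an ingress block in values. A sketch assuming a common chart layout (hostname and secret name are placeholders):

```yaml
# values.yaml sketch -- assumed ingress key layout; field names
# like className and pathType mirror the Kubernetes Ingress spec.
ingress:
  enabled: true
  className: traefik          # must match an installed IngressClass
  hosts:
    - host: app.example.com   # placeholder hostname
      paths:
        - path: /
          pathType: Prefix    # Prefix vs Exact changes route matching
  tls:
    - secretName: app-tls     # must exist, or be issued by cert-manager
      hosts:
        - app.example.com
```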

Backup CronJob never runs

Symptoms: Backup is enabled but no backup jobs appear.

Diagnosis:

# Check CronJob exists
kubectl get cronjob -l app.kubernetes.io/name=<chart-name>

# Check CronJob schedule
kubectl describe cronjob <cronjob-name>

# Check for failed jobs
kubectl get jobs -l app.kubernetes.io/name=<chart-name>

Common causes:

| Cause | Fix |
|---|---|
| backup.enabled not set to true | Set backup.enabled: true in values |
| Invalid cron schedule | Validate schedule syntax (5 fields, no seconds) |
| S3 credentials wrong | Test S3 connectivity manually with aws s3 ls --endpoint-url |
| Job deadline exceeded | Increase backup.activeDeadlineSeconds |
| Suspended CronJob | Check the spec.suspend field; set it to false |

Backup jobs use the same ServiceAccount as the main pod. If you have restrictive PodSecurityPolicies or PodSecurityStandards, ensure the backup container is allowed to run.
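A working backup configuration covering the first, second, and fourth causes above might look like this (the schedule shown is an example, not a required value):

```yaml
# values.yaml sketch -- backup.* keys taken from the table above.
backup:
  enabled: true                  # without this, no CronJob is rendered at all
  schedule: "0 3 * * *"          # standard 5-field cron (no seconds field): daily at 03:00
  activeDeadlineSeconds: 1800    # raise this if large backups exceed the deadline
```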


Helm upgrade fails with conflict

Symptoms: helm upgrade fails with cannot patch or field is immutable errors.

Common causes:

| Cause | Fix |
|---|---|
| Immutable field changed (e.g., StatefulSet volumeClaimTemplates) | Delete the StatefulSet with --cascade=orphan and re-run the upgrade |
| Resource owned by another release | Check the meta.helm.sh/release-name annotation |
| CRD version conflict | Manually update CRDs before upgrading |

# For immutable StatefulSet fields:
kubectl delete statefulset <name> --cascade=orphan
helm upgrade my-release helmforge/<chart-name> -f values.yaml

Using --cascade=orphan keeps the pods running while deleting the StatefulSet controller. The upgrade will recreate the StatefulSet and adopt the existing pods.


Helm install times out

Symptoms: helm install --wait times out before pods are ready.

Diagnosis:

kubectl get pods -l app.kubernetes.io/instance=<release>
kubectl describe pod <pod-name>
kubectl get events --sort-by=.metadata.creationTimestamp

Common causes:

| Cause | Fix |
|---|---|
| Image pull error | Check image name, tag, and pull secrets |
| Resource quota exceeded | Check namespace ResourceQuotas |
| Node scheduling issues | Check node taints, tolerations, and available resources |
| Slow startup (large DB init) | Increase the --timeout flag: helm install --wait --timeout 10m |
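For the pull-secret and scheduling causes, a hypothetical values sketch (both key layouts are assumptions; many charts use imagePullSecrets at the top level instead):

```yaml
# values.yaml sketch -- assumed key layout for pull secrets and
# tolerations; verify against your chart's values reference.
image:
  pullSecrets:
    - name: regcred          # placeholder; a docker-registry secret in the release namespace
tolerations:
  - key: dedicated           # placeholder taint key; must match the node's taint
    operator: Equal
    value: apps
    effect: NoSchedule
```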

General debugging commands

# Overview of release status
helm status <release-name>

# See what values are in use
helm get values <release-name>

# See rendered templates
helm template <release-name> helmforge/<chart-name> -f values.yaml

# Diff before upgrading (requires helm-diff plugin)
helm diff upgrade <release-name> helmforge/<chart-name> -f values.yaml

# Check all resources for a release
kubectl get all -l app.kubernetes.io/instance=<release-name>

Still stuck? Open an issue on GitHub with your chart version, Kubernetes version, and the output of kubectl describe pod and helm get values.