Advanced Helm Charts: Hooks, Tests & Dependencies Guide

2026-03-20 · 12 min read
#helm #kubernetes #devops #intermediate-tutorial #english #deployment-automation

Master Helm hooks, test suites, and dependency management for production-grade Kubernetes deployments. Build reliable charts with atomic migrations and validation.

Advanced Helm Charts: Mastering Hooks, Tests and Dependencies for Reliable Kubernetes Deployments

You’ve deployed your first Helm charts successfully. Your applications run in Kubernetes, and basic templating feels comfortable. Then reality hits: a database migration fails mid-deployment, leaving your production environment in an inconsistent state. Or worse, you discover a misconfigured service only after it’s handling live traffic.

These scenarios expose the gap between basic Helm usage and production-grade deployment orchestration. Standard charts handle simple workloads well, but complex applications demand more: automated pre-deployment validations, coordinated database migrations, graceful dependency management, and comprehensive testing before traffic reaches new pods.

This guide shows you how to build Helm charts that handle these challenges elegantly. You’ll implement hooks that execute migrations atomically with proper rollback handling, create test suites that validate your entire deployment before marking it successful, and architect dependency relationships that work across complex microservice topologies.

Prerequisites

Before diving in, ensure you have:

  • Kubernetes cluster (1.24+) with kubectl configured
  • Helm 3.12+ installed locally
  • Familiarity with basic Helm concepts: templates, values, releases
  • Working knowledge of Kubernetes resources: Deployments, Services, Jobs, ConfigMaps
  • A container registry accessible from your cluster (Docker Hub, ECR, GCR)

You should be comfortable creating a basic Helm chart with helm create and understand the template directory structure.

Architecture and Key Concepts

Helm’s advanced features operate on a lifecycle model that extends beyond simple resource creation. Understanding this lifecycle is crucial for building reliable deployment workflows.

flowchart TD
    subgraph "Helm Release Lifecycle"
        A[helm install/upgrade] --> B{Pre-install/upgrade Hooks}
        B -->|Success| C[Deploy Kubernetes Resources]
        B -->|Failure| R1["Abort and mark release failed"]
        C --> D{Post-install/upgrade Hooks}
        D -->|Success| F[Release Deployed]
        D -->|Failure| R2["Mark release failed, roll back only with --atomic"]
        F -.->|helm test, run on demand| E{Run Helm Tests}
        E -->|Success| T1[Tests Passed]
        E -->|Failure| T2[Tests Failed]
    end
    
    subgraph "Hook Execution Order"
        H1[pre-install weight=-5] --> H2[pre-install weight=0]
        H2 --> H3[pre-install weight=5]
    end
    
    subgraph "Dependency Resolution"
        P1[Parent Chart] --> P2{Condition Check}
        P2 -->|enabled| P3[Load Subchart]
        P2 -->|disabled| P4[Skip Subchart]
        P3 --> P5[Merge Values]
        P5 --> P6[Render Templates]
    end

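Hooks are Kubernetes resources with special annotations that tell Helm to execute them at specific lifecycle points. Unlike regular chart resources, hook Jobs and Pods must run to completion before the lifecycle continues. Helm also does not manage hooks as part of the release, so they stay in the cluster until a delete policy or a TTL removes them.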

Tests are post-deployment validation Pods or Jobs that verify your release works correctly. They run only when you invoke helm test; they are not part of helm install or helm upgrade, so call helm test from your pipeline if a failing test should block a rollout.

Dependencies define relationships between charts, allowing you to compose complex applications from smaller, reusable components with conditional inclusion and value overrides.

Step-by-Step Implementation

Implementing Pre-Install and Post-Upgrade Hooks for Database Migrations

Database migrations represent the canonical use case for Helm hooks. You need migrations to complete before your application starts, and you need them to roll back cleanly if they fail.

Let’s build a migration hook that handles PostgreSQL schema updates:

# templates/db-migration-job.yaml
{{- if .Values.migrations.enabled }}
apiVersion: batch/v1
kind: Job
metadata:
  name: {{ include "myapp.fullname" . }}-db-migrate
  labels:
    {{- include "myapp.labels" . | nindent 4 }}
    app.kubernetes.io/component: migration
  annotations:
    # Hook type: runs before install and before upgrade
    "helm.sh/hook": pre-install,pre-upgrade
    # Weight determines execution order (lower runs first)
    "helm.sh/hook-weight": "-5"
    # Delete previous hook job before creating new one
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  # Prevent infinite retry loops
  backoffLimit: {{ .Values.migrations.backoffLimit | default 3 }}
  # Auto-cleanup after completion
  ttlSecondsAfterFinished: {{ .Values.migrations.ttlSeconds | default 600 }}
  template:
    metadata:
      labels:
        # Deliberately not myapp.selectorLabels: if the app's Service selects on
        # those labels, it would treat the migration pod as a backend
        app.kubernetes.io/component: migration
    spec:
      restartPolicy: Never
      {{- with .Values.migrations.securityContext }}
      securityContext:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      initContainers:
        # Wait for database to be ready before running migrations
        - name: wait-for-db
          image: busybox:1.36
          command:
            - sh
            - -c
            - |
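              echo "Waiting for database at {{ .Values.database.host }}:{{ .Values.database.port }}"
              attempts=0
              until nc -z {{ .Values.database.host }} {{ .Values.database.port }}; do
                attempts=$((attempts + 1))
                # Give up after roughly two minutes instead of blocking the release
                if [ "$attempts" -ge 60 ]; then
                  echo "Database still unreachable, giving up"
                  exit 1
                fi
                echo "Database not ready, sleeping..."
                sleep 2
              done
              echo "Database is ready"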
      containers:
        - name: migrate
          image: "{{ .Values.migrations.image.repository }}:{{ .Values.migrations.image.tag }}"
          imagePullPolicy: {{ .Values.migrations.image.pullPolicy | default "IfNotPresent" }}
          command:
            - /bin/sh
            - -c
            - |
              set -e
              echo "Starting database migration..."
              echo "Current schema version:"
              
              # Run migration tool (example using golang-migrate)
              migrate -path /migrations -database "$DATABASE_URL" version || echo "No migrations applied yet"
              
              echo "Applying pending migrations..."
              migrate -path /migrations -database "$DATABASE_URL" up
              
              echo "Migration completed. New schema version:"
              migrate -path /migrations -database "$DATABASE_URL" version
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: {{ .Values.database.existingSecret | default (printf "%s-db-credentials" (include "myapp.fullname" .)) }}
                  key: url
          resources:
            {{- toYaml .Values.migrations.resources | nindent 12 }}
          {{- with .Values.migrations.volumeMounts }}
          volumeMounts:
            {{- toYaml . | nindent 12 }}
          {{- end }}
      {{- with .Values.migrations.volumes }}
      volumes:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.nodeSelector }}
      nodeSelector:
        {{- toYaml . | nindent 8 }}
      {{- end }}
{{- end }}

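💡 The hook-delete-policy: before-hook-creation,hook-succeeded annotation deletes the Job as soon as it succeeds and removes any leftover Job from an earlier run before a new one is created. A failed Job stays in place for debugging, but only until ttlSecondsAfterFinished expires (600 seconds by default here), so raise that value or collect the logs promptly.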
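
The template above reads a number of values that have not been shown yet. A minimal sketch of the matching values.yaml fragment might look like the following. The image name, tag, database host, and Secret name are placeholders invented for illustration, not values from a real chart. Because pre-install hooks run before any of the chart's regular resources are created, this sketch assumes the database and the credentials Secret already exist outside the release.

```yaml
# values.yaml (fragment) - illustrative values for templates/db-migration-job.yaml
migrations:
  enabled: true
  backoffLimit: 3          # Job retries before the hook is reported as failed
  ttlSeconds: 600          # how long a finished Job is kept before cleanup
  image:
    repository: registry.example.com/myapp-migrations   # hypothetical image
    tag: "1.2.0"                                         # hypothetical tag
    pullPolicy: IfNotPresent
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
  resources:
    requests:
      cpu: 100m
      memory: 128Mi
    limits:
      memory: 256Mi
  # Left empty on the assumption that migration files are baked into the
  # image under /migrations
  volumes: []
  volumeMounts: []

database:
  host: postgres.example.internal        # hypothetical; must be reachable before install
  port: 5432
  existingSecret: myapp-db-credentials   # hypothetical; must already hold a "url" key
```

Empty lists are falsy in Helm's with blocks, so the volumes and volumeMounts sections are simply omitted from the rendered Job when they are left as [].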

Now let’s add a seed data hook that runs after initial installation only:

# templates/db-seed-job.yaml
{{- if and .Values.seed.enabled (not .Release.IsUpgrade) }}
apiVersion: batch/v1
kind: Job
metadata:
  name: {{ include "myapp.fullname" . }}-db-seed
  labels:
    {{- include "myapp.labels" . | nindent 4 }}
    app.kubernetes.io/component: seed
  annotations:
    "helm.sh/hook": post-install
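    # Weights only order hooks that share the same event. This Job runs after
    # the pre-install migration because post-install fires once resources are deployed.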
    "helm.sh/hook-weight": "0"
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  backoffLimit: 1
  ttlSecondsAfterFinished: 300
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: seed
          image: "{{ .Values.seed.image.repository }}:{{ .Values.seed.image.tag }}"
          command:
            - /bin/sh
            - -c
            - |
              set -e
              echo "Seeding initial data..."
              
              # Check if data already exists to make seed idempotent
              EXISTING_COUNT=$(psql "$DATABASE_URL" -t -c "SELECT COUNT(*) FROM users WHERE email = 'admin@example.com'" | tr -d ' ')
              
              if [ "$EXISTING_COUNT" -eq "0" ]; then
                echo "Creating admin user..."
                psql "$DATABASE_URL" -c "INSERT INTO users (email, role, created_at) VALUES ('admin@example.com', 'admin', NOW())"
              else
                echo "Admin user already exists, skipping..."
              fi
              
              echo "Seed completed"
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: {{ .Values.database.existingSecret | default (printf "%s-db-credentials" (include "myapp.fullname" .)) }}
                  key: url
{{- end }}

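⚠️ Always make seed operations idempotent. Helm itself does not retry a failed hook, but the Job controller may start a new pod up to backoffLimit times after a failure, so a seed that was partially applied must not create duplicate data when it runs again.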

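For rollback handling, add a pre-rollback hook that reverts the most recent migration. It runs when helm rollback is executed (including a rollback triggered by --atomic after a failed upgrade), not whenever a migration fails. Note that migrate down 1 undoes exactly one migration, so adjust the step count if a release ships several: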

# templates/db-rollback-job.yaml
{{- if .Values.migrations.rollbackEnabled }}
apiVersion: batch/v1
kind: Job
metadata:
  name: {{ include "myapp.fullname" . }}-db-rollback
  labels:
    {{- include "myapp.labels" . | nindent 4 }}
  annotations:
    "helm.sh/hook": pre-rollback
    "helm.sh/hook-weight": "-10"
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  backoffLimit: 1
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: rollback
          image: "{{ .Values.migrations.image.repository }}:{{ .Values.migrations.image.tag }}"
          command:
            - /bin/sh
            - -c
            - |
              set -e
              echo "Rolling back last migration..."
              
              # Store current version for logging
              CURRENT=$(migrate -path /migrations -database "$DATABASE_URL" version 2>&1 | tail -1)
              echo "Current version: $CURRENT"
              
              # Roll back one migration
              migrate -path /migrations -database "$DATABASE_URL" down 1
              
              NEW=$(migrate -path /migrations -database "$DATABASE_URL" version 2>&1 | tail -1)
              echo "Rolled back to version: $NEW"
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: {{ .Values.database.existingSecret | default (printf "%s-db-credentials" (include "myapp.fullname" .)) }}
                  key: url
{{- end }}

Creating Comprehensive Helm Test Suites

Helm tests validate that your deployment actually works after all resources are created. They’re Pods with the helm.sh/hook: test annotation, executed via helm test <release>.

Build a multi-stage test suite that validates connectivity, health, and configuration:

# templates/tests/test-connection.yaml
apiVersion: v1
kind: Pod
metadata:
  name: {{ include "myapp.fullname" . }}-test-connection
  labels:
    {{- include "myapp.labels" . | nindent 4 }}
  annotations:
    "helm.sh/hook": test
    "helm.sh/hook-weight": "-5"
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  restartPolicy: Never
  containers:
    - name: test-api-connectivity
      image: curlimages/curl:8.4.0
      command:
        - /bin/sh
        - -c
        - |
          set -e
          echo "=== Testing API Connectivity ==="
          
          # Test internal service DNS resolution
          echo "Testing DNS resolution for {{ include "myapp.fullname" . }}..."
          nslookup {{ include "myapp.fullname" . }}.{{ .Release.Namespace }}.svc.cluster.local
          
          # Test HTTP connectivity to health endpoint
          echo "Testing HTTP connectivity..."
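          # --retry-connrefused keeps retrying while the pod is still starting;
          # "|| true" stops set -e from exiting before the status check below
          RESPONSE=$(curl -s -o /dev/null -w "%{http_code}" \
            --connect-timeout 10 \
            --max-time 30 \
            --retry 5 \
            --retry-delay 3 \
            --retry-connrefused \
            "http://{{ include "myapp.fullname" . }}:{{ .Values.service.port }}/health") || true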
          
          if [ "$RESPONSE" = "200" ]; then
            echo "✓ Health endpoint returned 200 OK"
          else
            echo "✗ Health endpoint returned $RESPONSE"
            exit 1
          fi
          
          # Test readiness endpoint
          echo "Testing readiness..."
          READY=$(curl -s -o /dev/null -w "%{http_code}" \
            --max-time 30 \
            "http://{{ include "myapp.fullname" . }}:{{ .Values.service.port }}/ready") || true
          
          if [ "$READY" = "200" ]; then
            echo "✓ Readiness endpoint returned 200 OK"
          else
            echo "✗ Readiness endpoint returned $READY"
            exit 1
          fi
          
          echo "=== Connectivity Tests Passed ==="
---
# templates/tests/test-database.yaml
apiVersion: v1
kind: Pod
metadata:
  name: {{ include "myapp.fullname" . }}-test-database
  labels:
    {{- include "myapp.labels" . | nindent 4 }}
  annotations:
    "helm.sh/hook": test
    "helm.sh/hook-weight": "0"
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  restartPolicy: Never
  containers:
    - name: test-db-connection
      image: postgres:15-alpine
      command:
        - /bin/sh
        - -c
        - |
          set -e
          echo "=== Testing Database Connectivity ==="
          
          # Test basic connection
          echo "Connecting to database..."
          if psql "$DATABASE_URL" -c "SELECT 1" > /dev/null 2>&1; then
            echo "✓ Database connection successful"
          else
            echo "✗ Database connection failed"
            exit 1
          fi
          
          # Verify schema version matches expected
          echo "Checking schema version..."
          SCHEMA_VERSION=$(psql "$DATABASE_URL" -t -c "SELECT version FROM schema_migrations ORDER BY version DESC LIMIT 1" 2>/dev/null | tr -d ' ')
          EXPECTED_VERSION="{{ .Values.migrations.expectedVersion | default "" }}"
          
          if [ -n "$EXPECTED_VERSION" ] && [ "$SCHEMA_VERSION" != "$EXPECTED_VERSION" ]; then
            echo "✗ Schema version mismatch. Expected: $EXPECTED_VERSION, Got: $SCHEMA_VERSION"
            exit 1
          fi
          echo "✓ Schema version: $SCHEMA_VERSION"
          
          # Verify critical tables exist
          echo "Verifying critical tables..."
          {{- range .Values.tests.requiredTables }}
          if psql "$DATABASE_URL" -c "SELECT 1 FROM {{ . }} LIMIT 1" > /dev/null 2>&1; then
            echo "✓ Table '{{ . }}' exists and is accessible"
          else
            echo "✗ Table '{{ . }}' missing or inaccessible"
            exit 1
          fi
          {{- end }}
          
          echo "=== Database Tests Passed ==="
      env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: {{ .Values.database.existingSecret | default (printf "%s-db-credentials" (include "myapp.fullname" .)) }}
              key: url
---
# templates/tests/test-config.yaml
apiVersion: v1
kind: Pod
metadata:
  name: {{ include "myapp.fullname" . }}-test-config
  labels:
    {{- include "myapp.labels" . | nindent 4 }}
  annotations:
    "helm.sh/hook": test
    "helm.sh/hook-weight": "5"
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  restartPolicy: Never
  containers:
    - name: test-configuration
      image: curlimages/curl:8.4.0
      command:
        - /bin/sh
        - -c
        - |
          set -e
          echo "=== Testing Application Configuration ==="
          
          # Fetch and validate configuration via API
          CONFIG=$(curl -s "http://{{ include "myapp.fullname" . }}:{{ .Values.service.port }}/api/config/public")
          
          # Verify environment is correct
          ENV=$(echo "$CONFIG" | grep -o '"environment":"[^"]*"' | cut -d'"' -f4)
          EXPECTED_ENV="{{ .Values.environment }}"
          
          if [ "$ENV" = "$EXPECTED_ENV" ]; then
            echo "✓ Environment correctly set to: $ENV"
          else
            echo "✗ Environment mismatch. Expected: $EXPECTED_ENV, Got: $ENV"
            exit 1
          fi
          
          # Verify feature flags if applicable
          {{- if .Values.featureFlags }}
          echo "Verifying feature flags..."
          {{- range $flag, $enabled := .Values.featureFlags }}
          FLAG_VALUE=$(echo "$CONFIG" | grep -o '"{{ $flag }}":[^,}]*' | cut -d':' -f2)
          if [ "$FLAG_VALUE" = "{{ $enabled }}" ]; then
            echo "✓ Feature flag '{{ $flag }}' = {{ $enabled }}"
          else
            echo "✗ Feature flag '{{ $flag }}' expected {{ $enabled }}, got $FLAG_VALUE"
            exit 1
          fi
          {{- end }}
          {{- end }}
          
          echo "=== Configuration Tests Passed ==="

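📝 Structure tests with hook-weight to control execution order. Run connectivity tests first (weight -5), then database tests (weight 0), then application-specific validations (weight 5+). Because these pods use hook-succeeded, they are deleted as soon as they pass, so helm test --logs may have nothing left to show; drop hook-succeeded from the delete policy while you are debugging a test.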
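
The test templates above depend on several values that drive what gets checked. As a rough sketch, the corresponding values.yaml fragment could look like this. The schema version, the orders table, and the flag names are hypothetical examples, and the port is an assumption about where the application listens.

```yaml
# values.yaml (fragment) - illustrative inputs for the test pods
service:
  port: 8080                 # assumed application port

environment: staging         # compared against the "environment" field from /api/config/public

migrations:
  # Quote the version: large unquoted integers can be rendered in
  # scientific notation and would then never match the database value
  expectedVersion: "20240115093000"   # hypothetical migration version

tests:
  requiredTables:
    - users
    - orders                 # hypothetical table name

featureFlags:
  newCheckout: true          # hypothetical flag names
  betaSearch: false
```

Leaving migrations.expectedVersion empty skips the version comparison, and an empty or missing featureFlags map skips the flag checks entirely, which keeps the same test templates usable across environments.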

Managing Multi-Service Dependencies with Conditions and Aliases

Complex microservice architectures require sophisticated dependency management. Let’s build a parent chart that orchestrates multiple services with conditional inclusion and value overrides.

First, define dependencies in Chart.yaml:

# Chart.yaml
apiVersion: v2
name: ecommerce-platform
description: Complete e-commerce platform with microservices
type: application
version: 1.0.0
appVersion: "2024.1"

dependencies:
  # Core services - always required
  - name: api-gateway
    version: "2.x.x"
    repository: "https://charts.example.com"
    
  # Database - can use external or deploy PostgreSQL
  - name: postgresql
    version: "13.x.x"
    repository: "https://charts.bitnami.com/bitnami"
    condition: postgresql.enabled
    alias: db
    
  # Cache - optional Redis deployment
  - name: redis
    version: "18.x.x"
    repository: "https://charts.bitnami.com/bitnami"
    condition: redis.enabled
    alias: cache
    
  # Message queue - choose between RabbitMQ or Kafka
  - name: rabbitmq
    version: "12.x.x"
    repository: "https://charts.bitnami.com/bitnami"
    condition: messageQueue.rabbitmq.enabled
    alias: mq-rabbit
    
  - name: kafka
    version: "26.x.x"
    repository: "https://charts.bitnami.com/bitnami"
    condition: messageQueue.kafka.enabled
    alias: mq-kafka
    
  # Microservices
  - name: user-service
    version: "1.x.x"
    repository: "file://../user-service"
    condition: services.user.enabled
    
  - name: order-service
    version: "1.x.x"
    repository: "file://../order-service"
    condition: services.order.enabled
    
  - name: payment-service
    version: "1.x.x"
    repository: "file://../payment-service"
    condition: services.payment.enabled
    
  # Observability stack
  - name: prometheus
    version: "25.x.x"
    repository: "https://prometheus-community.github.io/helm-charts"
    condition: observability.prometheus.enabled
    alias: metrics
    
  - name: grafana
    version: "7.x.x"
    repository: "https://grafana.github.io/helm-charts"
    condition: observability.grafana.enabled
    alias: dashboards

Now create a comprehensive values file that configures all dependencies:

# values.yaml
global:
  # Shared values available to all subcharts
  imageRegistry: "registry.example.com"
  imagePullSecrets:
    - name: registry-credentials
  storageClass: "fast-ssd"
  
  # Service mesh configuration
  serviceMesh:
    enabled: true
    istio:
      mtls: STRICT
      
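  # Shared database connection settings (subcharts reference these)
  database:
    # Must match the Service name the aliased subchart actually creates, which
    # usually includes the release name; confirm with `helm template`
    host: "db-postgresql"
    port: 5432
    name: "ecommerce"

  # Shared cache configuration
  cache:
    # Same caveat as database.host: verify the rendered Service name
    host: "cache-redis-master"
    port: 6379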

# PostgreSQL subchart configuration (aliased as 'db')
postgresql:
  enabled: true  # Set false to use external database

db:
  auth:
    postgresPassword: ""  # Set via --set or external secret
    username: "ecommerce"
    password: ""
    database: "ecommerce"
  primary:
    persistence:
      enabled: true
      size: 50Gi
      storageClass: "fast-ssd"
    resources:
      requests:
        memory: 512Mi
        cpu: 250m
      limits:
        memory: 2Gi
        cpu: 1000m
  metrics:
    enabled: true
    serviceMonitor:
      enabled: true  # Requires the Prometheus Operator ServiceMonitor CRD in the cluster; set false otherwise

# Redis subchart configuration (aliased as 'cache')
redis:
  enabled: true

cache:
  architecture: replication
  auth:
    enabled: true
    password: ""
  master:
    persistence:
      enabled: true
      size: 10Gi
  replica:
    replicaCount: 2
    persistence:
      enabled: true
      size: 10Gi
  metrics:
    enabled: true

# Message queue selection (enable only one; Helm itself does not enforce this)
messageQueue:
  rabbitmq:
    enabled: true
  kafka:
    enabled: false

mq-rabbit:
  auth:
    username: "ecommerce"
    password: ""
  replicaCount: 3
  persistence:
    enabled: true
    size: 20Gi
  metrics:
    enabled: true

# Microservices configuration
services:
  user:
    enabled: true
  order:
    enabled: true
  payment:
    enabled: true

user-service:
  replicaCount: 3
  image:
    tag: "v1.2.0"
  database:
    host: "db-postgresql"
    port: 5432
    name: