Running Stateful Applications in Kubernetes



Many workloads in Kubernetes are stateless: they keep no data that has to survive a restart. However, many real-world applications, such as databases, message queues, and analytics engines, need persistent storage and a consistent network identity. In this chapter, we’ll explore how to run stateful applications in Kubernetes using StatefulSets, PersistentVolumes (PVs), and PersistentVolumeClaims (PVCs).

Understanding Stateful Applications

A stateful application keeps track of its data even when containers restart. Examples include:

  • Databases (MySQL, PostgreSQL, MongoDB)
  • Message brokers (Kafka, RabbitMQ)
  • Distributed data stores (Elasticsearch, Cassandra)

Unlike stateless applications, stateful workloads require:

  • Stable network identities – Each pod must have a unique, persistent hostname.
  • Stable storage – Data should be retained even if the pod restarts.
  • Ordered scaling and rolling updates – Pods must start and stop in a defined order.

To meet these requirements, Kubernetes provides StatefulSets.

StatefulSets: Running Stateful Applications

A StatefulSet is a Kubernetes controller that manages stateful applications. It ensures:

  • Each pod gets a unique and stable hostname (e.g., db-0, db-1).
  • Pods are created in ordinal order (db-0, then db-1) and terminated in reverse order.
  • Each pod keeps its own persistent storage across restarts and rescheduling.

Deploying a Stateful Application (MySQL Example)

Let’s deploy a MySQL database using a StatefulSet.

Step 1: Create a Persistent Volume and Persistent Volume Claim

Since stateful apps need persistent storage, we’ll define a PersistentVolume (PV) and PersistentVolumeClaim (PVC):

Create mysql-pv.yaml:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: mysql-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual
  hostPath:
    path: "/mnt/data"  # Adjust based on your environment
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: manual

Apply the Persistent Volume (PV) and Claim (PVC):

$ kubectl apply -f mysql-pv.yaml

Output

persistentvolume/mysql-pv created
persistentvolumeclaim/mysql-pvc created

Step 2: Deploy the MySQL StatefulSet

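Now, we’ll create the StatefulSet. Note that it does not mount mysql-pvc from Step 1. Instead, its volumeClaimTemplates section makes Kubernetes create a separate PVC for each pod (mysql-storage-mysql-0, mysql-storage-mysql-1), and because the template names no storage class, those PVCs are provisioned through the cluster's default StorageClass.

Create mysql-statefulset.yaml: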

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  selector:
    matchLabels:
      app: mysql
  serviceName: mysql  # Headless Service that gives each pod its DNS name
  replicas: 2  # Two MySQL pods
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
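        image: mysql:latest  # Consider pinning a version tag so a restart cannot pull a different release
        ports:
        - containerPort: 3306
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: "mypassword"  # Plain text for demonstration only; prefer a Secret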
        volumeMounts:
        - name: mysql-storage
          mountPath: /var/lib/mysql
  volumeClaimTemplates:
  - metadata:
      name: mysql-storage
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi

Deploy the StatefulSet:

$ kubectl apply -f mysql-statefulset.yaml

Output

statefulset.apps/mysql created
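
The manifest above stores the root password in plain text. A common alternative is to keep it in a Secret and reference it from the container. The following is a minimal sketch; the Secret name mysql-secret and the key root-password are assumptions made for illustration and are not part of the original example:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: mysql-secret            # hypothetical name
type: Opaque
stringData:
  root-password: "mypassword"   # hypothetical key; use your own value
---
# Fragment (not a standalone manifest): replaces the plain-text
# "value" entry in the StatefulSet's container spec
env:
- name: MYSQL_ROOT_PASSWORD
  valueFrom:
    secretKeyRef:
      name: mysql-secret
      key: root-password
```

With this approach the password no longer appears in the StatefulSet manifest itself, though the Secret still has to be protected like any other credential.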

Step 3: Verify the StatefulSet and Storage

Check the StatefulSet. A READY value of 1/2 means that mysql-1 is still starting, since pods are created one at a time:

$ kubectl get statefulsets

Output

NAME    READY   AGE
mysql   1/2     33s
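
The READY column counts pods that report ready, and the StatefulSet controller waits for each pod to be Running and Ready before it creates the next one. Without a readiness probe, a pod counts as ready as soon as its container starts, which may be before MySQL accepts connections. As a sketch, a probe like the following could be added to the mysql container. It assumes the mysqladmin client is available in the image, and the timing values are illustrative rather than tuned:

```yaml
# Fragment: add under spec.template.spec.containers[0] in the StatefulSet
readinessProbe:
  exec:
    # mysqladmin ping exits 0 once the server answers on TCP,
    # even if the login itself is rejected
    command: ["mysqladmin", "ping", "-h", "127.0.0.1"]
  initialDelaySeconds: 10   # illustrative value
  periodSeconds: 5          # illustrative value
  timeoutSeconds: 3         # illustrative value
```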

Check the Persistent Volume Claims:

$ kubectl get pvc

Output

NAME                    STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
mysql-pvc               Bound    mysql-pv                                   10Gi       RWO            manual         <unset>                 4m19s
mysql-storage-mysql-0   Bound    pvc-d0c41e0f-4690-4530-ad3f-2dcda94b661f   10Gi       RWO            hostpath       <unset>                 3m11s
mysql-storage-mysql-1   Bound    pvc-3fca3d3e-5b99-4c3f-8154-7a33c8fe65f2   10Gi       RWO            hostpath       <unset>                 2m42s
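
In this output, mysql-pvc is bound to the manually created mysql-pv, while the two PVCs generated from volumeClaimTemplates were provisioned dynamically by the cluster's default StorageClass (hostpath in this environment). If you would rather have the StatefulSet consume statically provisioned volumes, one option is to name the storage class in the template. This is a sketch and assumes you have created one unclaimed PersistentVolume of class manual per replica (mysql-pv itself is already claimed by mysql-pvc):

```yaml
# Fragment of the StatefulSet spec
volumeClaimTemplates:
- metadata:
    name: mysql-storage
  spec:
    accessModes: ["ReadWriteOnce"]
    storageClassName: manual   # matches PVs declared with storageClassName: manual
    resources:
      requests:
        storage: 10Gi
```

Because volumeClaimTemplates cannot be changed on an existing StatefulSet, this choice has to be made when the StatefulSet is first created.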

Accessing the MySQL Database

Now that our MySQL pods are running, let’s connect to one of them:

$ kubectl exec -it mysql-0 -- mysql -u root -p

Output

Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 9
Server version: 9.2.0 MySQL Community Server - GPL

Enter the password (mypassword), and you’ll have access to MySQL. To test data persistence, create a new database:

CREATE DATABASE testdb;
SHOW DATABASES;

Output

mysql> CREATE DATABASE testdb;
Query OK, 1 row affected (0.01 sec)
mysql> SHOW DATABASES;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
| testdb             |
+--------------------+
5 rows in set (0.01 sec)
mysql> 

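If the pod restarts or is rescheduled, this data persists, because the pod's PVC (mysql-storage-mysql-0) is reattached to the replacement pod. Keep in mind that testdb exists only on mysql-0: each pod has its own volume, and this StatefulSet does not configure MySQL replication between the pods.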

Scaling a StatefulSet

To scale the StatefulSet up to three pods:

$ kubectl scale statefulset mysql --replicas=3

Output

statefulset.apps/mysql scaled

Check the new pod:

$ kubectl get pods -l app=mysql

Output

NAME      READY   STATUS    RESTARTS   AGE
mysql-0   1/1     Running   0          10m
mysql-1   1/1     Running   0          10m
mysql-2   1/1     Running   0          36s

Note: Kubernetes creates mysql-2 as the next pod in sequence, so existing hostnames stay stable. The new pod gets its own PVC (mysql-storage-mysql-2) and starts with an empty data directory.
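
The one-at-a-time ordering seen here comes from the StatefulSet's pod management policy. The sketch below shows the field with its default value; it has to be chosen when the StatefulSet is created and applies to scaling, not to rolling updates:

```yaml
# Fragment of the StatefulSet spec
spec:
  # OrderedReady (default): create pods 0..N-1 in order, waiting for each
  # to be Ready; remove them in reverse order when scaling down.
  # Parallel: create or delete pods without waiting for one another.
  podManagementPolicy: OrderedReady
```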

Scaling Down a StatefulSet

If we want to scale down the StatefulSet (e.g., from 3 replicas to 2), we'll run:

$ kubectl scale statefulset mysql --replicas=2

Output

statefulset.apps/mysql scaled

Kubernetes will remove the highest-numbered pod first (in this case, mysql-2) to maintain order.

To verify the new state:

$ kubectl get pods -l app=mysql

Output

NAME      READY   STATUS    RESTARTS   AGE
mysql-0   1/1     Running   0          15m
mysql-1   1/1     Running   0          15m

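Note: mysql-2 has been terminated to maintain the correct StatefulSet order. Its PVC (mysql-storage-mysql-2) is not deleted by default, so the data is reattached if you scale back up.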
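
Whether PVCs are kept after a scale-down or after the StatefulSet is deleted can also be stated explicitly. The fragment below is a sketch that spells out the default behavior; it assumes a cluster version that supports the persistentVolumeClaimRetentionPolicy field, which is a relatively recent addition, so check your Kubernetes version before relying on it:

```yaml
# Fragment of the StatefulSet spec
spec:
  persistentVolumeClaimRetentionPolicy:
    whenScaled: Retain    # keep PVCs of pods removed by a scale-down
    whenDeleted: Retain   # keep PVCs when the StatefulSet itself is deleted
```

Setting either value to Delete makes Kubernetes remove the corresponding PVCs, which is rarely what you want for a database unless the data is disposable.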

Rolling Updates for Stateful Applications

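When updating a StatefulSet, Kubernetes follows a rolling update strategy: it replaces pods one at a time, from the highest ordinal to the lowest, and waits for each updated pod to be Running and Ready before moving on to the next.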

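To update the MySQL image, point the StatefulSet at a newer tag. MySQL does not support in-place downgrades, so a tag older than the running server (9.2.0 in the session above) would fail to start against the existing data directory. The tag mysql:9.3 below is an illustrative choice; substitute a release newer than the one you are running:

$ kubectl set image statefulset/mysql mysql=mysql:9.3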

Output

statefulset.apps/mysql image updated

Monitor the update:

$ kubectl rollout status statefulset/mysql

Output

Waiting for 1 pods to be ready...
Waiting for partitioned roll out to finish: 1 out of 2 new pods have been updated...
Waiting for 1 pods to be ready...
Waiting for 2 pods to be ready...
Waiting for 2 pods to be ready...
Waiting for 2 pods to be ready...
Waiting for 1 pods to be ready...
partitioned roll out complete: 2 new pods have been updated...
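
The "partitioned roll out" wording in this output refers to the partition setting of the RollingUpdate strategy, which is 0 by default so that every pod is updated. As a sketch, raising it lets you try a new image on the highest-numbered pods first while the others keep the old template:

```yaml
# Fragment of the StatefulSet spec
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 1   # only pods with ordinal >= 1 (here mysql-1) get the new template
```

Once the updated pod looks healthy, lowering partition back to 0 lets the rollout continue to mysql-0.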

Headless Services: Ensuring Stable Network Identity

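Each pod in a StatefulSet gets its stable DNS name through a Headless Service, that is, a Service with clusterIP set to None. The StatefulSet above already refers to it through serviceName: mysql, so in practice this Service should be created before the StatefulSet: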

Create mysql-service.yaml:

apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  selector:
    app: mysql
  clusterIP: None
  ports:
    - protocol: TCP
      port: 3306

Apply the service:

$ kubectl apply -f mysql-service.yaml

Output

service/mysql created

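Now, each MySQL pod gets a stable DNS name that resolves from within the same namespace (the fully qualified form is <pod>.mysql.<namespace>.svc.cluster.local):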

  • mysql-0.mysql
  • mysql-1.mysql

These names stay the same when a pod is rescheduled, so clients and peer instances can address a specific MySQL pod reliably.
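
To see a stable name in use, a short-lived client pod can connect to one specific instance. The manifest below is a sketch: the pod name mysql-client is hypothetical, it assumes the client runs in the same namespace as the StatefulSet, and the password is passed on the command line only to keep the example short:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mysql-client            # hypothetical name
spec:
  restartPolicy: Never
  containers:
  - name: client
    image: mysql:latest
    # Connect to the first pod by its stable DNS name and list its databases
    command: ["mysql", "-h", "mysql-0.mysql", "-uroot", "-pmypassword", "-e", "SHOW DATABASES;"]
```

After the pod completes, kubectl logs mysql-client shows the result of the query.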

Best Practices for Running Stateful Applications

Apply the following practices when running stateful applications:

  • Use StatefulSets for Stateful Apps – Provides ordering, stable storage, and predictable names.
  • Define Persistent Volumes (PVs) and Persistent Volume Claims (PVCs) – Keeps data when pods are restarted or rescheduled.
  • Use Headless Services – Gives each pod a stable DNS name.
  • Enable Backups – Regularly back up stateful data using tools like Velero.
  • Use Rolling Updates – Replaces pods one at a time, so only one instance is unavailable at any moment.

Conclusion

In this chapter, we’ve learned how to:

  • Run stateful applications in Kubernetes using StatefulSets.
  • Create a Persistent Volume for storage.
  • Deploy a MySQL StatefulSet.
  • Use a Headless Service for stable networking.
  • Scale and update the StatefulSet safely.

With these tools, we can run databases, message queues, and other critical stateful workloads efficiently in Kubernetes.
