Storage in Kubernetes is a large and complex topic. We’ve chatted about the concept of persistent volumes in an earlier section, giving our WordPress and MariaDB database containers some persistent space for their data. However, in that section we had to go through the trouble of manually creating volumes on specific Kubernetes nodes, and were then limited to where we could run our pods.
Note: all code samples from this section are available on GitHub.
In this section we’ll address this problem with Dynamic Volume Provisioning. We’ll take a quick look at the OpenEBS project and how it can help us automatically provision local volumes. We’ll also cover replicated storage, allowing us to run our applications on any worker node in our Kubernetes cluster.
Dynamic Volume Provisioning
As a Kubernetes cluster user, you’ll almost never work with volumes directly, opting to use volume claims instead. Dynamic volume provisioning allows cluster users to automatically create volumes from a persistent volume claim, rather than having a cluster administrator manually create those volumes beforehand.
This is quite convenient, and the flexibility allows for different volume types, which are called Storage Classes. When operating on just local volumes this doesn’t matter much, as each volume is just some physical disk space on a particular node. However, in real-world clusters that type of storage is quite limited in use, and we’ll cover some more complex and interesting options later.
Think about cloud environments though, such as AWS or Google Cloud, where you have expensive fast storage, and cheap slow storage, read-optimized storage, multi-write storage, replicated storage and plenty of other options. The abstractions in Kubernetes are designed to work with all these options, and the main concept behind it is a Storage Class.
Provisioners
By itself, a storage class is just a declaration of some arbitrary type of storage. We’ve seen in a previous section a declaration of an arbitrary “local-storage” class which we then manually assigned to our volumes and volume claims.
A storage class is more useful when it comes along with a provisioner, which determines what happens when somebody claims a persistent volume of that class. The provisioner is defined in the StorageClass YAML manifest and is usually picked up by a Kubernetes CSI (container storage interface) plugin/driver.
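As a quick reminder, a storage class without a real provisioner looks roughly like the manual “local-storage” class from the earlier section (a minimal sketch; your exact manifest may have differed slightly):
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
The special kubernetes.io/no-provisioner value tells Kubernetes that volumes of this class are created manually, rather than dynamically.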
The unopinionated vanilla Kubernetes install doesn’t ship with any storage plugins, so there’s no provisioner you could use out of the box. However, you’ll find that cloud platforms like AWS, Azure, and others will often ship with existing provisioners or relatively straightforward ways to enable them, allowing users to claim cloud-specific storage, like Amazon’s EBS or EFS volumes.
For on-premise Kubernetes deployments you’ll find CSI drivers for things like NFS, GlusterFS, Ceph and other storage options. We’re going to explore Ceph in more detail in later sections of this guide, but to introduce you to provisioners, we’ll start with OpenEBS.
Installing OpenEBS
OpenEBS is a CNCF storage project that allows you to use existing storage devices or disk space in your Kubernetes cluster to dynamically provision local or replicated volumes.
Local volumes in OpenEBS are regular filesystem mounts, while replicated volumes are based on block storage attached over the network (NVMe over TCP in the case of the Mayastor engine we’ll use later), meaning it’s usually very fast and efficient, but does come with some limitations (we’ll look at shared storage in later sections).
Let’s first install a very minimal version of OpenEBS, one that only supports local volumes backed by a filesystem path on the host node. We’ll use Helm, a package manager for Kubernetes, which we typically run from the management node:
$ helm repo add openebs https://openebs.github.io/openebs
$ helm repo update
$ helm install openebs openebs/openebs \
--set engines.replicated.mayastor.enabled=false \
--set engines.local.lvm.enabled=false \
--set engines.local.zfs.enabled=false \
--namespace openebs \
--create-namespace
This tells Helm where to find the openebs repository, and then installs the openebs/openebs chart into the openebs namespace, which it creates.
By default, the current OpenEBS Helm chart will install support for multiple engines, but we’ll keep things simple for now and skip installing Mayastor, LVM and ZFS support. This will leave us with just the local HostPath provisioner:
$ kubectl -n openebs get pods
NAME READY STATUS RESTARTS AGE
openebs-localpv-provisioner-7cd9f85f8f-c479d 1/1 Running 0 11s
You might also notice that a new StorageClass is now available in our Kubernetes cluster:
$ kubectl get storageclass
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
openebs-hostpath openebs.io/local Delete WaitForFirstConsumer false 94s
Let’s rework our WordPress application manifests to work with this new storage class.
Provisioning in Action
We’ll continue building on our configuration from the previous section, where we had MariaDB and WordPress StatefulSets and an Nginx Deployment with three replicas. Instead of provisioning new volumes on a specific host, we’ll let our new provisioner handle that.
This means we’ll no longer need storage-class.yml and volumes.yml, but we do need to update the storage class in our volume-claims.yml file (feel free to remove the www-data claim and focus on just mariadb for now):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mariadb
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: openebs-hostpath
  resources:
    requests:
      storage: 2Gi
Let’s now provision the volume claim and see what happens:
$ kubectl apply -f volume-claims.yml
persistentvolumeclaim/mariadb created
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE
mariadb Pending openebs-hostpath <unset> 12s
We now have a pending volume claim. Let’s launch our MariaDB StatefulSet, not forgetting to apply the ConfigMap and Service:
$ kubectl apply \
-f mariadb.configmap.yml \
-f mariadb.statefulset.yml \
-f mariadb.service.yml
configmap/mariadb created
statefulset.apps/mariadb created
service/mariadb created
If you look at your PVCs list again, you’ll see that the MariaDB claim is now bound to a volume:
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE
mariadb Bound pvc-3f85e3cb-5cae-4163-aaf8-1a3003c6ca58 2Gi RWO openebs-hostpath <unset> 2m58s
Furthermore, you will see the new volume appear in the volumes list:
$ kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS VOLUMEATTRIBUTESCLASS REASON AGE
pvc-3f85e3cb-5cae-4163-aaf8-1a3003c6ca58 2Gi RWO Delete Bound default/mariadb openebs-hostpath <unset> 6m26s
And our MariaDB pod is up and running:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
mariadb-0 1/1 Running 0 4m42s
As you would expect, deleting the MariaDB pod will simply cause the StatefulSet to create a new pod with the same configuration, with the same volume attached.
$ kubectl delete pod mariadb-0
pod "mariadb-0" deleted
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
mariadb-0 1/1 Running 0 6s
$ kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS VOLUMEATTRIBUTESCLASS REASON AGE
pvc-3f85e3cb-5cae-4163-aaf8-1a3003c6ca58 2Gi RWO Delete Bound default/mariadb openebs-hostpath <unset> 9m26s
Interestingly, deleting the StatefulSet will also retain the persistent volume, as long as the claim continues to exist. Recreating the StatefulSet will re-attach that volume to any new pod that requires it.
$ kubectl delete -f mariadb.statefulset.yml
statefulset.apps "mariadb" deleted
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE
mariadb Bound pvc-3f85e3cb-5cae-4163-aaf8-1a3003c6ca58 2Gi RWO openebs-hostpath <unset> 12m
However, removing the volume claim (along with the StatefulSet) will cause the OpenEBS provisioner to delete the volume per the reclaim policy:
$ kubectl delete -f volume-claims.yml
persistentvolumeclaim "mariadb" deleted
$ kubectl get pv
No resources found
The provisioner takes care of the lifecycle of our volumes, creating new ones when a new claim is made, and deleting (or retaining, depending on policy) when a claim is removed.
The provisioner also takes care of placement for us, so we don’t really have to think about which nodes to create our volumes on. However, once a volume is created, it cannot (easily) be moved.
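If you’re curious where the data actually lives, the HostPath provisioner keeps it under /var/openebs/local/ on the node by default (we’ll see this path again when describing a volume below), so on the hosting node you can peek at it directly:
$ ls /var/openebs/local/
# one directory per provisioned volume, named after the PV (pvc-...)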
Node Affinity
We touched on this topic in an earlier section. Any local persistent volume exists on one specific node, and can only be mounted to containers running on that node.
This behavior will cause Kubernetes to reschedule a crashed (or deleted) pod on the same node when it requires a volume from that node. If that’s not possible, the pod will remain in a Pending state indefinitely. We can test this by tainting a node, which allows us to prevent scheduling new pods there.
Let’s create our volume claims and MariaDB StatefulSet, and observe where exactly the pod has landed:
$ kubectl apply -f volume-claims.yml -f mariadb.statefulset.yml
persistentvolumeclaim/mariadb created
statefulset.apps/mariadb created
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
mariadb-0 1/1 Running 0 15s 10.10.2.136 k2 <none> <none>
In the above case, the pod has landed on the k2 node, which means the local storage is also provisioned on that node. We can verify this by describing our volume:
$ kubectl describe pv pvc-3c90db59-2879-41f2-8b9a-f5ee80e44a64
Name: pvc-3c90db59-2879-41f2-8b9a-f5ee80e44a64
Labels: openebs.io/cas-type=local-hostpath
Annotations: pv.kubernetes.io/provisioned-by: openebs.io/local
Finalizers: [kubernetes.io/pv-protection]
StorageClass: openebs-hostpath
Status: Bound
Claim: default/mariadb
Reclaim Policy: Delete
Access Modes: RWO
VolumeMode: Filesystem
Capacity: 2Gi
Node Affinity:
Required Terms:
Term 0: kubernetes.io/hostname in [k2]
Message:
Source:
Type: LocalVolume (a persistent volume backed by local storage on a node)
Path: /var/openebs/local/pvc-3c90db59-2879-41f2-8b9a-f5ee80e44a64
Events: <none>
See the Node Affinity section in the output above. No matter how many times we delete the pod, it will be recreated on the k2 node because of this affinity. Let’s taint the k2 node, preventing the scheduling of new pods:
$ kubectl taint nodes k2 please:NoSchedule
node/k2 tainted
$ kubectl delete pod mariadb-0
pod "mariadb-0" deleted
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
mariadb-0 0/1 Pending 0 12s
The only other available node in our cluster for this workload is k1, but the volume doesn’t exist there, so Kubernetes may keep this pod in a Pending state indefinitely. Describing the pod will yield more details:
$ kubectl describe pod mariadb-0
# (output omitted ...)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 62s default-scheduler 0/3 nodes are available: 1 node(s) had untolerated taint {please: }. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling.
Describing the Persistent Volume Claim will also let you know which node the volume is provisioned on, if any, via annotations:
$ kubectl describe pvc mariadb
# (output omitted...)
Annotations: pv.kubernetes.io/bind-completed: yes
pv.kubernetes.io/bound-by-controller: yes
volume.beta.kubernetes.io/storage-provisioner: openebs.io/local
volume.kubernetes.io/selected-node: k2
volume.kubernetes.io/storage-provisioner: openebs.io/local
Note that more often than not, creating a volume claim does not cause a volume to be created right away. Most provisioners will only create the volume (and thus decide which node to place it on) when a pod first tries to mount it, so affinity/node matching is usually done at that time for new volumes.
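This is the WaitForFirstConsumer binding mode we saw in the storage class listing earlier; you can confirm it with a quick JSONPath query:
$ kubectl get storageclass openebs-hostpath -o jsonpath='{.volumeBindingMode}'
WaitForFirstConsumer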
Before proceeding, don’t forget to untaint the previously tainted node so that we can schedule workloads there:
$ kubectl taint nodes k2 please:NoSchedule-
node/k2 untainted
The Usefulness of Local Volumes
While having a provisioner is certainly helpful, the fact that it’s still creating local volumes bound to specific nodes isn’t all that great. If we run the rest of our YAML manifests you’ll see that the three Nginx replica pods and the WordPress pod still all end up on the same node:
$ kubectl apply -f .
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
mariadb-0 1/1 Running 0 7s 10.10.1.23 k1 <none> <none>
nginx-76876c5747-2bllb 1/1 Running 0 7s 10.10.2.135 k2 <none> <none>
nginx-76876c5747-cgdft 1/1 Running 0 7s 10.10.2.12 k2 <none> <none>
nginx-76876c5747-hkkl6 1/1 Running 0 7s 10.10.2.225 k2 <none> <none>
wordpress-0 1/1 Running 0 6s 10.10.2.157 k2 <none> <none>
This is because they’re bound to the same volume, which has been provisioned on, and is only available on, the k2 node. If this node suffers a hardware failure, Kubernetes will not be able to reschedule the pods on a different node, as we’ve seen with the taint example earlier.
The concept of a local volume is mostly useful for stateful applications that already have some kind of resiliency built in. Furthermore, locally attached storage tends to be significantly faster than network-attached block storage, and especially network filesystems.
Many databases are good candidates for this, and MySQL or MariaDB are no exception. When scaling out a MySQL service, your cluster will typically consist of one primary (or master) server and one or more replica (or slave) servers. These servers never share filesystems or other storage with each other, and only use MySQL’s replication mechanisms to keep the data in sync across the cluster.
If one replica server dies along with its storage, spinning up a new replica with a brand new volume elsewhere will not be a problem. If the primary server dies, along with its storage, promoting one of the existing replicas to primary, then adding another replica with a new volume is also not that hard.
In most circumstances when running a MySQL or MariaDB cluster, we don’t really need redundant storage, so a locally attached volume is perfectly fine and, as a bonus, will probably outperform the majority of network-attached solutions too. We’ll explore MySQL replication and getting WordPress to work with multiple databases in a later section.
However, if running a single MySQL or MariaDB server, then relying on a local volume is not a great idea. This is where replicated volumes may come in handy, and OpenEBS has a great engine just for that.
Before diving into replicated storage, let’s remove all our resources in the default namespace, as we’ll need to recreate them later with new volume claims:
$ kubectl delete -f .
OpenEBS Replicated Storage
On the surface, replicated storage with OpenEBS is not that different from local storage: you create a claim, you get some disk space mounted into your pod, and your application is happy.
Underneath, however, it’s a whole different story based on a storage engine called Mayastor, which brings synchronous replication, redundancy and other great features.
Mayastor creates persistent volumes which can be attached from any node in a Kubernetes cluster, solving our single MySQL/MariaDB server problem. If a node dies, you’ll be able to spin up a new MySQL/MariaDB pod on a different node and use the existing OpenEBS/Mayastor volume, where all your data is (hopefully) intact.
Before we install OpenEBS with all the bells and whistles, there are a couple of requirements we need to take care of on each node that will run Mayastor. First, let’s enable hugepages support in Linux via the /etc/sysctl.conf file:
vm.nr_hugepages = 1024
Run sysctl -p after modifying the file to apply the changes, then restart the kubelet service:
$ sysctl -p
$ systemctl restart kubelet.service
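To confirm the hugepages were actually reserved, check /proc/meminfo; with the setting above it should look something like this:
$ grep HugePages_ /proc/meminfo
HugePages_Total:    1024
HugePages_Free:     1024
# (other lines omitted ...)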
Then we’ll need to enable the NVMe over TCP kernel module (nvme_tcp), which is a requirement for Mayastor. Most Linux distributions will have this available.
You can activate the module using modprobe:
$ modprobe nvme_tcp
$ lsmod | grep nvme_tcp
nvme_tcp 53248 0
nvme_keyring 20480 1 nvme_tcp
nvme_fabrics 36864 1 nvme_tcp
nvme_core 212992 2 nvme_tcp,nvme_fabrics
To persist across reboots you’ll need to add a modules-load.d entry (non-Debian distributions should have something similar):
$ echo nvme_tcp > /etc/modules-load.d/nvme_tcp.conf
Installing OpenEBS with Mayastor
We’ve already used Helm to install the minimal OpenEBS version earlier. Let’s uninstall that, then run the install again, this time with all the default options:
$ helm uninstall openebs -n openebs
$ helm install openebs openebs/openebs -n openebs
$ kubectl -n openebs get pods
Don’t be alarmed by the number of pods you see! It’ll be around 30 for a three-node cluster with one node tainted for the control plane. It may take a few minutes for all the pods to reach a Running state.
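If you’d rather not refresh the pod list by hand, kubectl can wait for everything to become ready (adjust the timeout to taste):
$ kubectl -n openebs wait pods --all --for=condition=Ready --timeout=300s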
Next, we’ll need to set up some storage space for Mayastor. In a production environment you’d usually use separate disk drives for this, but for testing purposes we can create a file (that looks like a disk) on our existing nodes’ filesystems. Let’s do that on all three nodes:
$ mkdir -p /var/local/openebs/pools/
$ truncate -s 10G /var/local/openebs/pools/disk.img
As you might have guessed, this will create a 10 GB disk image on each node, giving us a total of 30 GB in our pool of three nodes. Next, we’ll need to make sure this new directory we created is mounted into the OpenEBS IO engine pods. To do this, we’ll have to modify the openebs-io-engine DaemonSet.
A DaemonSet is just another abstraction over Pods in Kubernetes. We already covered Deployments and StatefulSets earlier. DaemonSets are very similar, and ensure certain Pods are running across all (necessary) nodes in a cluster.
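You can list the DaemonSets the OpenEBS chart installed to see this for yourself (note that openebs-io-engine will report 0 desired pods until we label nodes for Mayastor a bit later):
$ kubectl -n openebs get daemonsets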
We can use the kubectl edit command to edit this component in our cluster:
$ kubectl -n openebs edit DaemonSet/openebs-io-engine
This will open up your text editor with the YAML of the DaemonSet. Note that saving the file and exiting the editor at this point will send the updated definition to the Kubernetes API server, replacing the previous definition. It’s quite a dangerous way to change things in your cluster, but it’s great for learning purposes.
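If you’d prefer a safer, reviewable workflow over live-editing, you can dump the definition to a file, make your changes there, and apply it back:
$ kubectl -n openebs get daemonset openebs-io-engine -o yaml > io-engine.yml
# edit io-engine.yml in your editor, then:
$ kubectl apply -f io-engine.yml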
In this openebs-io-engine DaemonSet definition you’ll find some existing volume definitions and mounts in the volumeMounts and volumes sections respectively.
Let’s add our new directory to the volumeMounts section:
volumeMounts:
  - mountPath: /dev
    name: device
  - mountPath: /run/udev
    name: udev
  - mountPath: /dev/shm
    name: dshm
  - mountPath: /var/local/openebs/io-engine/
    name: configlocation
  - mountPath: /dev/hugepages
    name: hugepage
  - mountPath: /var/local/openebs/pools/
    name: pools
Then define the pools volume in the volumes section:
volumes:
  - hostPath:
      path: /dev
      type: Directory
    name: device
  - hostPath:
      path: /run/udev
      type: Directory
    name: udev
  - emptyDir:
      medium: Memory
      sizeLimit: 1Gi
    name: dshm
  - emptyDir:
      medium: HugePages
    name: hugepage
  - hostPath:
      path: /var/local/openebs/io-engine/
      type: DirectoryOrCreate
    name: configlocation
  - hostPath:
      path: /var/local/openebs/pools/
      type: DirectoryOrCreate
    name: pools
You might notice that the /dev directory is already mounted in this DaemonSet, which means that if you’re planning to use an actual disk (and not an image like in our example) then this step will not be necessary.
Next, we’ll need to tell OpenEBS which nodes are okay to run Mayastor on. This is done with the openebs.io/engine: mayastor label:
$ kubectl label nodes {k0,k1,k2} openebs.io/engine=mayastor
node/k0 labeled
node/k1 labeled
node/k2 labeled
You’ll notice a new openebs-io-engine-* pod start in your cluster for every labelled node. These are the pods managed by the DaemonSet we edited earlier.
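You can watch the rollout and confirm there’s one io-engine pod per labelled node:
$ kubectl -n openebs rollout status daemonset/openebs-io-engine
$ kubectl -n openebs get pods -o wide | grep io-engine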
Finally, we’ll need to define some DiskPool resources to reference our fake (or maybe real, in your case) disks. This is a custom resource defined and used by OpenEBS. Let’s create a new diskpool.yml file and define our three pools. Since this pool may be shared between different applications and namespaces, it’s best to create this in a different directory (we chose cluster) so that it’s not deleted accidentally:
apiVersion: "openebs.io/v1beta2"
kind: DiskPool
metadata:
name: pool-k0
namespace: openebs
spec:
node: k0
disks: ["aio:///var/local/openebs/pools/disk.img"]
---
apiVersion: "openebs.io/v1beta2"
kind: DiskPool
metadata:
name: pool-k1
namespace: openebs
spec:
node: k1
disks: ["aio:///var/local/openebs/pools/disk.img"]
---
apiVersion: "openebs.io/v1beta2"
kind: DiskPool
metadata:
name: pool-k2
namespace: openebs
spec:
node: k2
disks: ["aio:///var/local/openebs/pools/disk.img"]
We’re defining three DiskPool objects in the openebs namespace, named pool-k0, pool-k1 and pool-k2. Each has a spec referencing a specific node and a disks array. You’d use the path to a physical disk device if you’re using one; in our example we’re using disk emulation (among other options) via the aio:// URI.
Let’s apply our diskpool.yml file and look at the results:
$ kubectl apply -f cluster/diskpool.yml
$ kubectl -n openebs get diskpool
NAME NODE STATE POOL_STATUS CAPACITY USED AVAILABLE
pool-k0 k0 Created Online 10724835328 0 10724835328
pool-k1 k1 Created Online 10724835328 0 10724835328
pool-k2 k2 Created Online 10724835328 0 10724835328
If everything worked flawlessly, you should see the pools with a Created state on each node. We can also see the capacity and usage of each pool. If your pools are stuck in a Pending or Error state, you can use kubectl describe to obtain more information and start debugging.
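For example, to dig into the pool on the first node:
$ kubectl -n openebs describe diskpool pool-k0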
Mounting Replicated Volumes
The default installation of OpenEBS ships with a StorageClass called openebs-single-replica. While this does use the Mayastor backend, the resulting volume will still only exist on a single node. To have a truly replicated volume, we’ll need to define a new StorageClass with our desired replica count.
Let’s create a storage-class.yml file for our replicated volumes (also kept separately from our application manifests):
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-replicated
parameters:
  protocol: nvmf
  repl: "3"
provisioner: io.openebs.csi-mayastor
reclaimPolicy: Delete
volumeBindingMode: Immediate
allowVolumeExpansion: true
The new StorageClass is named openebs-replicated. The nvmf protocol tells Mayastor to use NVMe over TCP, and the repl parameter tells Mayastor how many replicas we want. The provisioner attribute makes sure Mayastor handles all claims using this storage class. The rest of the attributes define some provisioning and management behavior; you can find more details about these in the StorageClass documentation.
Apply the storage-class.yml manifest:
$ kubectl apply -f cluster/storage-class.yml
storageclass.storage.k8s.io/openebs-replicated created
Next, let’s update our application’s volume-claims.yml file so the MariaDB claim uses this new storage class:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mariadb
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: openebs-replicated
  resources:
    requests:
      storage: 2Gi
Apply the manifest and list the claims:
$ kubectl apply -f volume-claims.yml
persistentvolumeclaim/mariadb created
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE
mariadb Bound pvc-343545ca-fcce-442c-bce4-184271f14d0b 2Gi RWO openebs-replicated <unset> 2s
This time around the volume is created immediately and bound to the claim, without having to wait for a pod to try and use it. This behavior is defined by the volumeBindingMode in the StorageClass.
Note that we’re requesting 2Gi of storage with this claim. Let’s look at how that affected our DiskPools:
$ kubectl -n openebs get diskpool
NAME NODE STATE POOL_STATUS CAPACITY USED AVAILABLE
pool-k0 k0 Created Online 10724835328 2147483648 8577351680
pool-k1 k1 Created Online 10724835328 2147483648 8577351680
pool-k2 k2 Created Online 10724835328 2147483648 8577351680
As you can see, each pool’s available space has shrunk by exactly 2GiB (2147483648 bytes), confirming that the space is allocated on all three pools, as expected with our replication count set to 3.
Let’s run some pods with these volumes, shall we?
MySQL/MariaDB
As mentioned earlier, a single-server MariaDB or MySQL database would greatly benefit from a replicated block store: if the underlying Kubernetes node crashes, we can (oftentimes) resume our pods on a different node. Let’s test some of that.
Unaltered from our previous examples we’ll need a MariaDB ConfigMap, Service and StatefulSet:
$ kubectl apply \
-f mariadb.configmap.yml \
-f mariadb.statefulset.yml \
-f mariadb.service.yml
configmap/mariadb created
statefulset.apps/mariadb created
service/mariadb created
Let’s take a look at which node the MariaDB pod was assigned to:
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
mariadb-0 1/1 Running 0 10s 10.10.1.217 k1 <none> <none>
In our example it’s node k1. Let’s add some data to the database on this node (you’ll find the credentials in the ConfigMap):
$ kubectl exec -it mariadb-0 -- mysql -uwordpress -psecret wordpress
> create table foo (bar text);
> insert into foo values ('foo'), ('bar'), ('baz');
Query OK, 3 rows affected (0.002 sec)
Records: 3 Duplicates: 0 Warnings: 0
Let’s now exit the MariaDB shell and drain the node:
$ kubectl drain k1 --ignore-daemonsets --delete-emptydir-data
# (omitted output ...)
node/k1 cordoned
node/k1 drained
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
mariadb-0 1/1 Running 0 93s 10.10.2.49 k2 <none> <none>
Looks like MariaDB is happily chilling on the k2 node now. Let’s make sure the data is still there:
$ kubectl exec -it mariadb-0 -- mysql -uwordpress -psecret wordpress
> select * from foo;
+------+
| bar |
+------+
| foo |
| bar |
| baz |
+------+
3 rows in set (0.007 sec)
If you want to be able to schedule things on the drained node again, you’ll need to uncordon it with kubectl:
$ kubectl uncordon k1
node/k1 uncordoned
In the example above we drained the node using kubectl
. This tells the node to evict all the pods from it, and does so in a graceful manner. This means that the MariaDB service will have some time to properly shutdown and write its data to disk before terminating. This is what usually happens with scheduled maintenance, when you need to replace nodes or disks, or shuffle things around.
However, it is not uncommon for a node to crash, lose connectivity, or suffer a hardware failure beyond recovery. These cases need a lot more involvement: replacing the faulty nodes, and possibly recovering (or permanently losing) unwritten data.
MySQL and MariaDB have some built-in crash recovery mechanisms that usually run on startup, but there’s never a guarantee, so even with this storage redundancy in place, you should continue doing proper database backups. We’ll cover more backup and disaster recovery topics in later sections in this guide.
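A minimal sketch of such a backup, piping mysqldump through kubectl exec (credentials from our ConfigMap; the wordpress-backup.sql destination is just an example name):
$ kubectl exec mariadb-0 -- mysqldump -uwordpress -psecret wordpress > wordpress-backup.sql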
Recap & Cleanup
In this section we covered storage provisioners in Kubernetes. We installed OpenEBS and looked at automatically provisioning local volumes using the HostPath provisioner.
We then looked at the OpenEBS Mayastor storage backend and configured our Kubernetes cluster with some DiskPools and a 3-replica StorageClass. We provisioned some replicated volumes using that new class and ran a MariaDB pod with the provisioned replicated volume attached. Finally we looked at how to move the MariaDB pod to a different node, while still having access to the replicated volume.
Feel free to nuke the MariaDB StatefulSet, Service and ConfigMap, along with any volume claims, before proceeding:
$ kubectl delete \
-f mariadb.configmap.yml \
-f mariadb.statefulset.yml \
-f mariadb.service.yml \
-f volume-claims.yml
Head over to the next section where we’ll look at different storage access modes, and why such block storage is still a bit problematic for a scalable WordPress application in a Kubernetes cluster.