The MariaDB Operator

In this section we’ll introduce the concept of operators in Kubernetes, and use the MariaDB Operator to create and manage a scalable MariaDB database service in our Kubernetes cluster.

Note: all code samples from this section are available on GitHub.

Operators

Kubernetes operators are designed to automate manual work that’s needed to run a specific application in the cluster. These are typically used with stateful applications, such as databases, where every replica has its own state and identity.

Operators can take care of a wide range of tasks related to the application, such as creating and restoring backups, performing upgrades, updating configurations, handling failure and more.

Many operators in Kubernetes are built around Custom Resources (defined through Custom Resource Definitions, or CRDs), which allow us to extend the Kubernetes API with our own objects alongside the built-in resources, such as Pods.

On their own, custom resources aren’t very useful, as they’re simply a place to store some structured data. Paired with a controller, however, they become very handy, as you will see with the MariaDB example.

The MariaDB Operator for Kubernetes is a set of custom resource definitions and controllers, which allows users to create and manage MariaDB clusters.

Given these CRDs, instead of using StatefulSets and Services in Kubernetes, we’ll be defining objects using various new abstractions, such as MariaDBs, Databases, Users and Backups. The operator controllers will then take care of mapping these objects into any relevant Pods, Services and other Kubernetes built-in objects.

The MariaDB operator has a ton of useful features and we’ll be exploring just a few of those in this section.

Installing the Operator

In our previous sections we’ve relied on mariadb.statefulset.yml for our StatefulSet declaration, mariadb.service.yml for the Kubernetes Service and mariadb.secrets.yml to store our credentials.

Before installing the MariaDB operator, let’s destroy our existing MariaDB objects to avoid any confusion:

$ kubectl delete -f mariadb.secrets.yml \
  -f mariadb.service.yml \
  -f mariadb.statefulset.yml
secret "mariadb-secrets" deleted
service "mariadb" deleted
statefulset.apps "mariadb" deleted

Now let’s use Helm to install the MariaDB Operator:

$ helm repo add mariadb-operator https://helm.mariadb.com/mariadb-operator
$ helm install mariadb-operator-crds mariadb-operator/mariadb-operator-crds
$ helm install mariadb-operator mariadb-operator/mariadb-operator

You’ll see a few new running pods prefixed with “mariadb-operator”. These are the operator’s components, including the controllers that will watch for changes in our custom resources.
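To double-check the installation, something along these lines may help. This is only a sketch: verify_operator is a made-up helper name, and it assumes the Helm release names used above and that the CRDs are registered under the k8s.mariadb.com API group (the group used by the manifests in this section).

```shell
# verify_operator: hypothetical helper, not part of the operator itself.
verify_operator() {
  # Helm releases installed above (both should be listed as "deployed").
  helm list --filter '^mariadb-operator' || return 1
  # Custom resource types registered under the operator's API group.
  kubectl api-resources --api-group=k8s.mariadb.com -o name || return 1
  # Operator pods; their names start with "mariadb-operator".
  kubectl get pods --no-headers | grep '^mariadb-operator'
}

# Example (not run here):
#   verify_operator
```

If the second command prints nothing, the CRDs chart most likely didn’t install correctly.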

Creating a MariaDB

Now that the operator is up and running, let’s re-create our MariaDB secrets first. This time we’ve removed the database name and user name, as those will come from our new resources later.

apiVersion: v1
kind: Secret
metadata:
  name: mariadb-secrets
  labels:
    k8s.mariadb.com/watch: ""
stringData:
  MARIADB_PASSWORD: secret
  MARIADB_ROOT_PASSWORD: verysecret

Note the new k8s.mariadb.com/watch label attached to this secret. It asks the operator to monitor the secret for changes and perform updates when necessary. Updating the root password will require jumping through some hoops, but updating the WordPress user password will work just fine!
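As a rough sketch of what a password update could look like later on, the helper below patches a single key of the Secret in place, which keeps the watch label intact. The name set_secret_value is made up for this example, and it assumes the new value contains no double quotes or backslashes, since it’s pasted straight into a JSON patch.

```shell
# set_secret_value: hypothetical helper. $1 = Secret name, $2 = key, $3 = new value.
# Assumes the value contains no double quotes or backslashes (it is pasted into JSON).
set_secret_value() {
  # A merge patch only touches the given key; labels and other keys are kept.
  kubectl patch secret "$1" --type merge \
    -p "{\"stringData\":{\"$2\":\"$3\"}}"
}

# Example (not run here):
#   set_secret_value mariadb-secrets MARIADB_PASSWORD newsecret
```

Editing mariadb.secrets.yml and running kubectl apply again achieves the same thing, and keeps the manifest as the source of truth.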

Let’s add this mariadb.secrets.yml to our cluster:

$ kubectl apply -f mariadb.secrets.yml
secret/mariadb-secrets created

Next, we’ll need a MariaDB resource. Let’s put it in a file called mariadb.yml:

apiVersion: k8s.mariadb.com/v1alpha1
kind: MariaDB
metadata:
  name: wordpress-mariadb
spec:
  rootPasswordSecretKeyRef:
    name: mariadb-secrets
    key: MARIADB_ROOT_PASSWORD
  storage:
    size: 1Gi
    storageClassName: openebs-hostpath

This is a custom resource with the kind MariaDB, which has a spec that supports a wide range of options and features. We used only a couple here, to specify the location of the root password, as well as a storage option to claim a local HostPath volume.
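If you’d like to browse the other options, kubectl explain can usually print the documentation embedded in a CRD. The wrapper below is just a convenience sketch (explain_mariadb is a made-up name) and assumes the installed CRD publishes a schema with field descriptions.

```shell
# explain_mariadb: hypothetical helper. $1 = optional dotted path under spec,
# e.g. "storage" or "rootPasswordSecretKeyRef".
explain_mariadb() {
  # With no argument this explains mariadb.spec itself.
  kubectl explain "mariadb.spec${1:+.$1}"
}

# Examples (not run here):
#   explain_mariadb
#   explain_mariadb storage
```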

Note that in previous sections we used an OpenEBS/Mayastor replicated volume for some redundancy with MariaDB, but given that we’ll eventually be running multiple replicas this time around, it’s totally fine to downgrade this storage to a single OpenEBS non-replicated volume or even a local HostPath volume, for better performance.

Let’s apply this manifest and ensure our MariaDB resource and its related pods and service are up and running:

$ kubectl apply -f mariadb.yml                  
mariadb.k8s.mariadb.com/wordpress-mariadb created

$ kubectl get mariadbs
NAME                READY   STATUS    PRIMARY POD           AGE
wordpress-mariadb   True    Running   wordpress-mariadb-0   42s

$ kubectl get pods                    
NAME                                                READY   STATUS    RESTARTS   AGE
mariadb-operator-769bb76896-5z5md                   1/1     Running   0          78m
mariadb-operator-cert-controller-5d849657f4-lccxs   1/1     Running   0          78m
mariadb-operator-webhook-5d455b84d4-wk87v           1/1     Running   0          78m
minio-0                                             1/1     Running   0          82m
wordpress-756d97c644-pgk5k                          2/2     Running   0          82m
wordpress-mariadb-0                                 1/1     Running   0          49s

Our new pod is wordpress-mariadb-0 and you should also see a persistent volume claim for the pod, as well as a couple of ClusterIP services for this MariaDB deployment.

If the pod fails to start you should be able to obtain some information from the mariadb-operator pod logs.
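A possible starting point for troubleshooting is sketched below. The helper name debug_mariadb is made up, and the Deployment name mariadb-operator is inferred from the pod names in the listing above, so it may differ in your installation.

```shell
# debug_mariadb: hypothetical helper. $1 = MariaDB resource name.
debug_mariadb() {
  name="${1:-wordpress-mariadb}"
  # Status and conditions recorded on the custom resource.
  kubectl describe mariadb "$name"
  # Recent events in the namespace, oldest first.
  kubectl get events --sort-by=.lastTimestamp | tail -n 20
  # Last lines of the operator log. The Deployment name is an assumption
  # based on the pod names shown earlier.
  kubectl logs deployment/mariadb-operator --tail=50
}

# Example (not run here):
#   debug_mariadb wordpress-mariadb
```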

Before we can use this with our WordPress installation, we’ll need to create a database, a user and a grant.

Databases, Users and Grants

As briefly mentioned earlier, the MariaDB operator includes custom resource definitions for databases, users and grants. This means that we can use YAML manifests to define and manage these resources.

Let’s squeeze them all into a single YAML file called mariadb.data.yml and separate the three resources with the --- document separator:

apiVersion: k8s.mariadb.com/v1alpha1
kind: Database
metadata:
  name: wordpress
spec:
  mariaDbRef:
    name: wordpress-mariadb

Our first resource is the Database named wordpress, linked to our MariaDB service using the mariaDbRef.name attribute.

---
apiVersion: k8s.mariadb.com/v1alpha1
kind: User
metadata:
  name: wordpress
spec:
  mariaDbRef:
    name: wordpress-mariadb
  passwordSecretKeyRef:
    name: mariadb-secrets
    key: MARIADB_PASSWORD
  maxUserConnections: 0

The next resource is a User also named wordpress, linked to our MariaDB service using the mariaDbRef.name attribute. We also set maxUserConnections to 0, which means unlimited (the default is only 10), since we don’t really know how many connections our WordPress pods will need.

---
apiVersion: k8s.mariadb.com/v1alpha1
kind: Grant
metadata:
  name: wordpress
spec:
  mariaDbRef:
    name: wordpress-mariadb
  privileges:
  - ALL PRIVILEGES
  database: wordpress
  username: wordpress

Finally, our Grant object is linked to the same MariaDB service and grants ALL PRIVILEGES on the wordpress database to the wordpress user.

If you’ve created a user in MySQL or MariaDB before, none of this should come as a surprise. There is a shorthand to create all three as part of the MariaDB resource definition. That’s okay for testing purposes, but you’ll be stuck with the default connection limit of 10, which might be tricky to change later.

Let’s add these resources to our Kubernetes cluster:

$ kubectl apply -f mariadb.data.yml
database.k8s.mariadb.com/wordpress created
user.k8s.mariadb.com/wordpress created
grant.k8s.mariadb.com/wordpress created

We can use kubectl exec to make sure we can connect to our MariaDB database using these credentials:

$ kubectl exec -it wordpress-mariadb-0 -- \
  mysql -uwordpress -psecret wordpress -e 'show databases;'
+--------------------+
| Database           |
+--------------------+
| information_schema |
| wordpress          |
+--------------------+
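Typing the password on the command line gets old quickly, so here is a small sketch that reads it back from the Secret instead. The helper name secret_value is made up, and it assumes a base64 binary that understands --decode (GNU coreutils and macOS both do).

```shell
# secret_value: hypothetical helper. $1 = Secret name, $2 = key.
# Prints the decoded value stored under that key.
secret_value() {
  # Secret data is stored base64-encoded, so decode it after fetching.
  kubectl get secret "$1" -o "jsonpath={.data.$2}" | base64 --decode
}

# Example (not run here): list the privileges of the wordpress user.
#   kubectl exec -it wordpress-mariadb-0 -- \
#     mysql -uwordpress "-p$(secret_value mariadb-secrets MARIADB_PASSWORD)" \
#     wordpress -e 'show grants;'
```

The show grants statement in the example lists the privileges of the connected user, which is a quick way to confirm that the Grant resource was applied.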

MariaDB Replication

Setting up replication with the MariaDB operator in Kubernetes is quite straightforward. We need a replication.enabled flag and a minimum of two replicas.

With our traditional primary/replica configuration, we’ll also need separate endpoints to reach the primary server and the replica servers.

Our new mariadb.yml file will now look like this:

apiVersion: k8s.mariadb.com/v1alpha1
kind: MariaDB
metadata:
  name: wordpress-mariadb
spec:
  rootPasswordSecretKeyRef:
    name: mariadb-secrets
    key: MARIADB_ROOT_PASSWORD

  storage:
    size: 1Gi
    storageClassName: openebs-hostpath

  replicas: 3
  replication:
    enabled: true

  primaryService:
    type: ClusterIP

  secondaryService:
    type: ClusterIP

The replicas and replication attributes tell the operator how many MariaDB pods we’d like to run, and that replication should be enabled between them. Note that the specified number of replicas includes the primary pod in a primary/replica configuration, so the minimum of two replicas means one primary pod and one replica. In the case above we’ll be running one primary and two replicas.

The primaryService and secondaryService attributes tell the operator what type of Services to create for our pods in the Kubernetes cluster. Let’s apply these changes to our Kubernetes cluster and observe the pods:

$ kubectl apply -f mariadb.yml
mariadb.k8s.mariadb.com/wordpress-mariadb configured

$ kubectl get pods
NAME                                                READY   STATUS    RESTARTS   AGE
mariadb-operator-769bb76896-5z5md                   1/1     Running   0          5h55m
mariadb-operator-cert-controller-5d849657f4-lccxs   1/1     Running   0          5h55m
mariadb-operator-webhook-5d455b84d4-wk87v           1/1     Running   0          5h55m
minio-0                                             1/1     Running   0          5h59m
wordpress-756d97c644-pgk5k                          2/2     Running   0          5h59m
wordpress-mariadb-0                                 1/1     Running   0          2m12s
wordpress-mariadb-1                                 1/1     Running   0          24s
wordpress-mariadb-2                                 1/1     Running   0          24s
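The new pods can take a little while to come up. Rather than polling kubectl get pods by hand, a helper along these lines could block until they’re all ready. It’s only a sketch: wait_for_mariadb_pods is a made-up name, and it relies on the pod naming seen above (the resource name followed by an index starting at 0).

```shell
# wait_for_mariadb_pods: hypothetical helper.
# $1 = MariaDB name, $2 = replica count, $3 = optional timeout (default 300s).
wait_for_mariadb_pods() {
  name="$1"; count="$2"; timeout="${3:-300s}"
  pods=""
  i=0
  while [ "$i" -lt "$count" ]; do
    pods="$pods pod/$name-$i"   # pods are numbered from 0
    i=$((i + 1))
  done
  # $pods is intentionally unquoted so each pod becomes its own argument.
  kubectl wait --for=condition=Ready $pods --timeout="$timeout"
}

# Example (not run here):
#   wait_for_mariadb_pods wordpress-mariadb 3
```

Keep in mind that kubectl wait fails right away for a pod that doesn’t exist yet, so you may need to run it again a few seconds after applying the manifest.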

We now have three MariaDB pods running, and quite a few services:

$ kubectl get svc
NAME                          TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
kubernetes                    ClusterIP   10.100.0.1       <none>        443/TCP          29d
mariadb-operator-webhook      ClusterIP   10.100.110.133   <none>        443/TCP          5h56m
minio                         ClusterIP   10.100.155.81    <none>        9000/TCP         12d
minio-console                 NodePort    10.100.192.4     <none>        9001:30008/TCP   12d
wordpress                     NodePort    10.100.152.147   <none>        80:30007/TCP     12d
wordpress-mariadb             ClusterIP   10.100.127.168   <none>        3306/TCP         3m1s
wordpress-mariadb-internal    ClusterIP   None             <none>        3306/TCP         3m1s
wordpress-mariadb-primary     ClusterIP   10.100.46.238    <none>        3306/TCP         3m8s
wordpress-mariadb-secondary   ClusterIP   10.100.77.21     <none>        3306/TCP         3m8s

The two services we’ll be using with WordPress are wordpress-mariadb-primary for writes and reads, and wordpress-mariadb-secondary for reads. Let’s try and query these services using the MySQL command line:

$ kubectl exec -it wordpress-mariadb-0 -- \
  mysql -uroot -pverysecret -hwordpress-mariadb-primary \
  -e 'show replica hosts;'
+-----------+-------------+------+-----------+
| Server_id | Host        | Port | Master_id |
+-----------+-------------+------+-----------+
|        12 | 10.10.2.235 | 3306 |        10 |
|        11 | 10.10.3.29  | 3306 |        10 |
+-----------+-------------+------+-----------+

We can also ensure that the secondary service connects us to a database that’s in read-only mode (i.e. a replica/slave):

$ kubectl exec -it wordpress-mariadb-0 -- \
  mysql -uroot -pverysecret -hwordpress-mariadb-secondary \
  -e "show variables like 'read_only';"
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| read_only     | ON    |
+---------------+-------+

Finally, let’s make sure replication is actually working by writing to the primary, then reading from a replica using our WordPress user credentials:

$ kubectl exec -it wordpress-mariadb-0 -- \
  mysql -uwordpress -psecret wordpress -hwordpress-mariadb-primary \
  -e 'create table foo (id integer);'

$ kubectl exec -it wordpress-mariadb-0 -- \
  mysql -uwordpress -psecret wordpress -hwordpress-mariadb-secondary \
  -e 'show tables;'
+---------------------+
| Tables_in_wordpress |
+---------------------+
| foo                 |
+---------------------+
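The same write-then-read check can be wrapped into a repeatable helper that also cleans up after itself. Treat this as a sketch: check_replication and the replication_check table are made-up names, the credentials are the ones from our Secret, and the one-second pause is an arbitrary allowance for replication lag.

```shell
# check_replication: hypothetical helper. $1 = pod to exec into,
# $2 = optional numeric marker (defaults to the current Unix time).
# The table name replication_check is made up for this example.
check_replication() {
  pod="${1:-wordpress-mariadb-0}"
  marker="${2:-$(date +%s)}"
  # Write the marker through the primary Service.
  kubectl exec "$pod" -- mysql -uwordpress -psecret wordpress \
    -hwordpress-mariadb-primary -e \
    "create table if not exists replication_check (marker bigint);
     insert into replication_check values ($marker);" || return 1
  sleep 1  # give the replicas a moment to catch up
  # Count matching rows through the secondary Service (-N hides the header).
  found=$(kubectl exec "$pod" -- mysql -uwordpress -psecret wordpress \
    -hwordpress-mariadb-secondary -N -e \
    "select count(*) from replication_check where marker = $marker;")
  # Clean up on the primary, then report the result via the exit status.
  kubectl exec "$pod" -- mysql -uwordpress -psecret wordpress \
    -hwordpress-mariadb-primary -e "drop table replication_check;"
  [ "$found" = "1" ]
}

# Example (not run here):
#   check_replication && echo "replication OK"
```

The foo table created above can be removed the same way, by running drop table foo; against the primary service.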

At this point we can point our WordPress deployment to the wordpress-mariadb-primary database service via the wordpress.configmap.yml manifest, restart our deployment and run through the installation using just the primary database (we’ll explore working with replicas in WordPress in the next section).
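Once wordpress.configmap.yml has been edited, the apply-and-restart step might look like the sketch below. The helper name repoint_wordpress is made up, and the Deployment name wordpress is inferred from the pod names in the earlier listings.

```shell
# repoint_wordpress: hypothetical helper. The Deployment name "wordpress"
# is an assumption based on the pod names shown earlier.
repoint_wordpress() {
  # Apply the updated database host setting.
  kubectl apply -f wordpress.configmap.yml || return 1
  # Restart the Deployment so the pods pick up the new settings...
  kubectl rollout restart deployment/wordpress || return 1
  # ...and block until the new pods are ready.
  kubectl rollout status deployment/wordpress --timeout=120s
}

# Example (not run here):
#   repoint_wordpress
```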

Failover

As mentioned briefly earlier, one of the things the MariaDB operator helps us with is failover. Let’s delete one of the database replicas and see this in action:

$ kubectl delete pod wordpress-mariadb-2
pod "wordpress-mariadb-2" deleted

The underlying StatefulSet will create a new pod bound to the same existing persistent volume. The operator isn’t doing much work here since we’re essentially restarting the container from the same persistent volume. What if we deleted the persistent volume together with the pod?

$ kubectl delete pvc storage-wordpress-mariadb-1 \
  & kubectl delete pod wordpress-mariadb-1
persistentvolumeclaim "storage-wordpress-mariadb-1" deleted
pod "wordpress-mariadb-1" deleted

A few moments later you will see our operator created a new pod, with a brand new and empty persistent volume claim (and volume). If we inspect the replica status on the new pod, we’ll see that it’s all up-to-date and synced with the primary:

$ kubectl exec -it wordpress-mariadb-1 -- \
  mysql -uroot -pverysecret \
  -e 'show all replicas status\G'
*************************** 1. row ***************************
Connection_name: mariadb-operator
Slave_SQL_State: Slave has read all relay log; waiting for more updates
Slave_IO_State: Waiting for master to send event
# output omitted...
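The full status output is long, so it can be handy to filter it down to the few fields that matter most. A minimal sketch, assuming the root password from our Secret (replica_health is a made-up name):

```shell
# replica_health: hypothetical helper. $1 = replica pod name.
# Prints the replication thread states and the lag reported by MariaDB.
replica_health() {
  kubectl exec "$1" -- mysql -uroot -pverysecret \
    -e 'show all replicas status\G' \
    | grep -E 'Slave_(IO|SQL)_Running:|Seconds_Behind_Master:'
}

# Example (not run here):
#   replica_health wordpress-mariadb-1
```

On a healthy replica both threads should report Yes, and the lag should be at or close to zero.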

A more interesting experiment is to delete the primary node:

$ kubectl delete pvc storage-wordpress-mariadb-0 \
  & kubectl delete pod wordpress-mariadb-0
persistentvolumeclaim "storage-wordpress-mariadb-0" deleted
pod "wordpress-mariadb-0" deleted

At this stage, if you observe the MariaDB operator logs, you’ll see something along the lines of:

Configuring new primary
Connecting replicas to new primary
Primary switched
Configuring replica

The pod will get re-created like any other pod in a StatefulSet, but this time around it will no longer be the primary. Note that by omitting the -h flag in mysql we’re connecting to the local database server inside the pod, rather than going through a Kubernetes Service:

$ kubectl exec -it wordpress-mariadb-0 -- \
  mysql -uroot -pverysecret -e 'show replica hosts'
# no output

$ kubectl exec -it wordpress-mariadb-0 -- \
  mysql -uroot -pverysecret -e "show variables like 'read_only';"       
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| read_only     | ON    |
+---------------+-------+

$ kubectl exec -it wordpress-mariadb-1 -- \
  mysql -uroot -pverysecret -e 'show replica hosts'
+-----------+-------------+------+-----------+
| Server_id | Host        | Port | Master_id |
+-----------+-------------+------+-----------+
|        10 | 10.10.3.254 | 3306 |        11 |
|        12 | 10.10.2.15  | 3306 |        11 |
+-----------+-------------+------+-----------+

In our case the wordpress-mariadb-1 pod was promoted to be the new primary and all replicas (including the new one) were configured to read from that instead.

Our Services have also been updated to make sure they’re continuing to point to the correct pods:

$ kubectl describe svc wordpress-mariadb-primary
Endpoints:         10.10.3.176:3306
# output omitted

In the case above, 10.10.3.176 is the IP address of our recently promoted wordpress-mariadb-1 pod that is now the primary. The secondary MariaDB Service has also been updated to point to the replica pods.
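Matching the IP address to a pod by eye is error-prone, so here is a sketch that does the lookup for us. The helper name primary_pod is made up, and it assumes the primary Service has a single endpoint address.

```shell
# primary_pod: hypothetical helper. $1 = optional primary Service name.
primary_pod() {
  svc="${1:-wordpress-mariadb-primary}"
  # IP address the primary Service currently routes to.
  ip=$(kubectl get endpoints "$svc" \
    -o 'jsonpath={.subsets[0].addresses[0].ip}') || return 1
  [ -n "$ip" ] || return 1
  # Name of the pod that owns that IP address.
  kubectl get pods --field-selector "status.podIP=$ip" -o name
}

# Example (not run here):
#   primary_pod
```

Running it before and after deleting the primary pod should show the name change as the operator promotes a replica.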

What’s next?

We encourage you to explore the examples directory of the MariaDB operator if you’d like to learn more about it. For our purposes here with WordPress, we have a fully working replicated MariaDB service in our Kubernetes cluster, with primary and secondary service endpoints that we can use to write to and read from our database.

As mentioned earlier, in order to now make good use of this highly available database cluster, we’ll need to split read and write queries in WordPress. This can be done using a special plugin called HyperDB, which we’ll deep dive into next.
