Clients connect to unready endpoints #20

TomHutter · 2018-06-12T10:25:18Z

Hi everybody,

Because the service carries the annotation service.alpha.kubernetes.io/tolerate-unready-endpoints: "true", it appears to route requests to nodes even when they are not ready. This causes connection or SQL errors when client requests reach nodes that are shutting down.
I therefore created a second service without the annotation and use it for client connections:

# create a service for clients which honors readiness
apiVersion: v1
kind: Service
metadata:
  name: "{{ template "dnsname" . }}-service"
  labels:
    app: {{ template "fullname" . }}
    chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
    release: "{{ .Release.Name }}"
    heritage: "{{ .Release.Service }}"
spec:
  ports:
  - name: mysql
    port: 3306
  clusterIP: None
  selector:
    app: {{ template "fullname" . }}

Additionally, I modified the readiness probe to check for a semaphore file and to verify that the node is in sync:

#!/bin/bash
#
# Adfinis SyGroup AG
# openshift-mariadb-galera: mysqld readinessProbe
#

MYSQL_USER="readinessProbe"
MYSQL_PASS="readinessProbe"
MYSQL_HOST="localhost"

# Report "not ready" as soon as the semaphore file exists.
if [ -f "/tmp/wsrep_off" ]; then
  exit 1
fi

# Check that mysqld accepts connections and answers a trivial query.
if ! mysql --protocol=socket --socket=/var/run/mysqld/mysqld.sock -u"${MYSQL_USER}" -p"${MYSQL_PASS}" -h"${MYSQL_HOST}" -e"SHOW DATABASES;"; then
  exit 1
fi

# Read the Galera node state; only a node reporting "Synced" counts as ready.
SYNCED=$( mysql -s --skip-column-names --protocol=socket --socket=/var/run/mysqld/mysqld.sock -u"${MYSQL_USER}" -p"${MYSQL_PASS}" -h"${MYSQL_HOST}" -e"SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';" | awk '{ print $2 }' )

if [ "${SYNCED}" != "Synced" ]; then
  exit 1
else
  exit 0
fi

Then I added a preStop command to the StatefulSet, increased terminationGracePeriodSeconds to 60 to give the nodes enough time to shut down, and set the readinessProbe period to 10 seconds:

....
terminationGracePeriodSeconds: 60
....
      containers:
        lifecycle:
          preStop:
            exec:
              command:
                - /bin/sh
                - -c
                - touch /tmp/wsrep_off && sleep 20
...
        readinessProbe:
          exec:
            command:
            - /usr/share/container-scripts/mysql/readiness-probe.sh
          timeoutSeconds: 5
          periodSeconds: 10
          failureThreshold: 1
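
As a rough sanity check of these timings, the sketch below estimates how long a pod may keep receiving client traffic after the preStop hook has created the semaphore file. The formula is a simplifying assumption (one probe period per required failure plus one probe timeout) and it ignores the time Kubernetes needs to propagate the endpoint change, so treat the result as an approximation and not a guarantee. The variable names are illustrative only.

```shell
#!/bin/bash
# Values copied from the StatefulSet fragment above.
period_seconds=10
timeout_seconds=5
failure_threshold=1
pre_stop_sleep=20
grace_period=60

# Assumed upper bound: wait for the next probe(s), plus one probe run.
worst_case_unready=$(( period_seconds * failure_threshold + timeout_seconds ))

# Time left in the preStop sleep after the pod has been marked not ready.
margin=$(( pre_stop_sleep - worst_case_unready ))

# The grace period starts counting before preStop runs, so mysqld gets the rest.
shutdown_budget=$(( grace_period - pre_stop_sleep ))

echo "worst-case time until not ready: ${worst_case_unready}s"  # 15s
echo "margin inside preStop sleep:     ${margin}s"              # 5s
echo "time left for mysqld shutdown:   ${shutdown_budget}s"     # 40s
```

Under these assumptions the 20 second sleep leaves only a small margin, so raising periodSeconds or failureThreshold would call for a longer sleep as well.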

Now the nodes can connect to each other through the galera-mdb-ga service, which tolerates nodes that are not ready, while clients connect through galera-mdb-ga-service, which routes requests only to ready nodes.
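
One way to check this behaviour is to compare the endpoint lists of the two services while a pod is shutting down. The helper below is a minimal sketch, assuming kubectl is configured for the cluster and the services live in the current namespace; show_endpoints is a hypothetical name and not part of the chart.

```shell
#!/bin/bash
# Print the ready and not-ready endpoint IPs of a Service.
# show_endpoints is a hypothetical helper; it only wraps "kubectl get endpoints".
show_endpoints() {
  local svc="$1"
  echo "service: ${svc}"
  echo "  ready:     $(kubectl get endpoints "${svc}" -o jsonpath='{.subsets[*].addresses[*].ip}')"
  echo "  not ready: $(kubectl get endpoints "${svc}" -o jsonpath='{.subsets[*].notReadyAddresses[*].ip}')"
}

# Usage (service names taken from the post above):
#   show_endpoints galera-mdb-ga
#   show_endpoints galera-mdb-ga-service
```

If the setup works as described, a terminating pod should be listed as not ready for galera-mdb-ga-service, while the annotated galera-mdb-ga service would be expected to keep listing it as ready.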

tongpu · 2018-06-12T10:57:11Z

The galera service is required by the StatefulSet and should not be used for client connections. Creating a second service for client access is the right way to go.

We've also discussed updating the readinessProbe in #8, but haven't worked on it yet. A PR with your changes would be very much appreciated.

tongpu added the enhancement label Jun 12, 2018

tongpu self-assigned this Jun 12, 2018
