Support Galera Replication #28

Open
FrederikNJS opened this issue Oct 30, 2015 · 51 comments · May be fixed by #377
Labels: Request (Request for image modification or feature)

Comments

@FrederikNJS

To use this docker image in production, it would be very nice to be able to run it with replication, to maintain some redundancy.

The bitnami/mariadb image already does this, but they don't have the 10.x versions.

@yosifkit
Contributor

yosifkit commented Nov 2, 2015

Would Galera cover that for you (#24) or are you looking at the autoconfiguration that is in their script (https://github.com/bitnami/bitnami-docker-mariadb/blob/master/rootfs/bitnami-utils-custom.sh)?

@krasi-georgiev

I am on the same path.

MariaDB has this built in from 10.1 onward; this is resolved in #29.
It would be nice to implement some sort of env variable to bootstrap only the first time the container is launched, using mysqld --wsrep-new-cluster. Any additional container could then join the cluster
with mysqld --wsrep_cluster_address=gcomm://container_name
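
A minimal sketch of that idea, under stated assumptions: the `GALERA_BOOTSTRAP` variable, the marker file name and the `galera_start_option` helper are all hypothetical and are not part of the official image. It only picks between the two mysqld options mentioned above.

```shell
# Hypothetical helper: print the extra mysqld option for this container.
# GALERA_BOOTSTRAP and the marker file are assumptions for illustration only.
galera_start_option() {
	local datadir="$1" peer="$2"
	local marker="$datadir/.galera_bootstrapped"
	if [ "${GALERA_BOOTSTRAP:-}" = "1" ] && [ ! -e "$marker" ]; then
		# First launch of the designated bootstrap container.
		echo "--wsrep-new-cluster"
	else
		# Every other launch joins through an existing member.
		echo "--wsrep_cluster_address=gcomm://$peer"
	fi
}
```

A wrapper could call `galera_start_option /var/lib/mysql container_name`, append the result to the mysqld arguments, and create the marker file once the first start has succeeded.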

@nazar-pc

nazar-pc commented Dec 2, 2015

If anyone is interested: this is not very straightforward, but doable.
You can find a scalable MariaDB image that requires no configuration here: https://github.com/nazar-pc/docker-webserver
Please read the readme and the advanced documentation carefully; it really works well.
I'm working on a GlusterFS image so that literally everything will scale with zero configuration, but MariaDB works fine already.
Compared to Bitnami's images, my images look a bit simpler and use this official MariaDB image as the base image. Take a look and let me know what you think.

BTW, I'm not sure that the PR with Galera support is really as good as it could be. I also have the feeling that the official image should not provide a Galera setup: it is tricky and will eventually not be flexible enough. The current image is good enough and contains most of the basic building blocks needed to build a Galera image yourself while nicely reusing what is already done here.

@krasi-georgiev

Thanks, I will check it.

I would not advise going for GlusterFS, especially for web apps. I haven't tested it myself, but the feedback is that it is quite slow when accessing many small files.

I have set up 2x lsyncd containers and so far it syncs really well. It is a kind of rsync daemon: it watches for changes and fires rsync.

The documentation says it is not fit for 2-way sync, but in my setup it works very, very well.

Here is the repo. It is not universal, but I don't have time right now to improve it.
https://github.com/vipconsult/dockerfiles/tree/master/lsyncd

I start it with the same compose file on both hosts

lsyncd:
  # privileged so that fs.inotify.max_user_watches can be increased
  privileged: true
  # without net host it cannot bind to the internal ip
  net: host
  volumes:
    # share the same keys on all containers so that no ssh config is needed
    - ../lsyncd/.ssh:/root/.ssh
    # the actual folders that will be synced
    - /folder1ToShare:/sync/f1
    - /folder2ToShare:/sync/.f2

It needs 2 env variables:
$INTERNAL_IP - it binds to this IP
$LB_SERVER - it connects to this IP on port 222

on host 1
$INTERNAL_IP - 172.0.0.10
$LB_SERVER - 172.0.0.20

on host 2
$INTERNAL_IP - 172.0.0.20
$LB_SERVER - 172.0.0.10

@macropin

macropin commented Jun 1, 2016

If you're going to support Galera, please be sure to include galera-arbitrator-3 (aka garbd) in future builds.

@activatedgeek

The current 10.x builds already have the Galera plugin. Doesn't that work? I have been trying to debug that for quite a while now and the second node after the bootstrap node crashes saying MySQL init process failed. Here is an SO post for anybody who knows about the issue: http://stackoverflow.com/questions/39744949/unable-to-create-mariadb-galera-cluster.

@klausenbusk
Contributor

klausenbusk commented Oct 12, 2016

> The current 10.x builds already have the Galera plugin. Doesn't that work? I have been trying to debug that for quite a while now and the second node after the bootstrap node crashes saying MySQL init process failed.

I have been using MariaDB 10.1 with Galera for a long time (3+ months) and it works perfectly.
Before that I used a home-crafted image with MariaDB Galera (before it was merged into regular MariaDB).
I start it as:

/usr/bin/docker run \
        --name mariadb-galera \
        --rm \
        -p 3306:3306 \
        -p ${COREOS_PRIVATE_IPV4}:4444:4444 \
        -p ${COREOS_PRIVATE_IPV4}:4567:4567/udp \
        -p ${COREOS_PRIVATE_IPV4}:4567-4568:4567-4568 \
        --dns 172.17.42.1 \
        -v /var/lib/mysql:/var/lib/mysql \
        -v /var/log/mysql:/var/log/mysql \
        mariadb:10.1 \
        --log-bin=mysqld-bin \
        --log-slave-updates \
        --binlog-format=row \
        --binlog-annotate-row-events \
        --innodb-autoinc-lock-mode=2 \
        --innodb-flush-log-at-trx-commit=0 \
        --slow-query-log \
        --wsrep-on="ON" \
        --wsrep-log-conflicts \
        --wsrep-slave-threads=4 \
        --wsrep-provider="/usr/lib/libgalera_smm.so" \
        --wsrep-cluster-address="gcomm://mysql.skydns.local,mysql.skydns.local" \
        --wsrep-node-address="${COREOS_PRIVATE_IPV4}" \
        --wsrep-node-name="%H" \
        --wsrep-sst-method="xtrabackup-v2" \
        --wsrep-sst-auth="${SST_AUTH}"

I run it on a CoreOS cluster with SkyDNS, and the server registers itself with SkyDNS/etcd so it is available at "mysql.skydns.local".

@activatedgeek

@klausenbusk Would you mind quickly reviewing the configs present at the SO link I have provided? http://stackoverflow.com/questions/39744949/unable-to-create-mariadb-galera-cluster

@klausenbusk
Contributor

> @klausenbusk Would you mind quickly reviewing the configs present at the SO link I have provided? http://stackoverflow.com/questions/39744949/unable-to-create-mariadb-galera-cluster

First, you should use 10.1; secondly, it doesn't seem like you expose any ports; thirdly, you need to set --wsrep-node-address. :)

@activatedgeek

activatedgeek commented Oct 12, 2016

@klausenbusk I am using inter-container networking, so I don't think port exposure is needed. I am trying to run it on my local machine; I can worry about multi-host communication later. Also, is --wsrep-node-address necessary? According to the logs of the second node, it looks like the SST completed successfully. Thanks! :)

@klausenbusk
Contributor

@activatedgeek Oh yes, I think you are correct :)
Do you have a full log of the secondary node?

@activatedgeek

activatedgeek commented Oct 12, 2016

@klausenbusk Have a look here: http://pastebin.com/3exPpqvc. If you have a look at lines 215-217 you can see that the SST was completed successfully, but the node crashed due to failed init. Also line 220 shows port: 0, which is rather odd as on the bootstrap node it shows port: 3306.

@klausenbusk
Contributor

klausenbusk commented Oct 12, 2016

> @klausenbusk Have a look here: http://pastebin.com/3exPpqvc. If you have a look at lines 215-217 you can see that the SST was completed successfully, but the node crashed due to failed init. Also line 220 shows port: 0, which is rather odd as on the bootstrap node it shows port: 3306.

The issue is that the image gets stuck in the initialization phase, which runs with --skip-networking (that explains port 0).
That is the reason why I run mkdir -p /var/lib/mysql/mysql before I start MariaDB, so that it doesn't run the initialization code.

So try something like:

mkdir -p /var/lib/mysql/mysql
docker run --rm -e MYSQL_ROOT_PASSWORD=123 \
  -v /var/lib/mysql:/var/lib/mysql \
  activatedgeek/mariadb:devel \
  --wsrep-cluster-name=test_cluster \
  --wsrep-cluster-address=gcomm://172.17.0.2,172.17.0.3,172.17.0.4

@activatedgeek

@klausenbusk So essentially, you mean starting all nodes such that the folder /var/lib/mysql (or the data dir) already contains a mysql folder? This could be done either within the container or in the mounted volume. Is that right?

@klausenbusk
Contributor

> @klausenbusk So essentially, you mean starting all nodes such that the folder /var/lib/mysql (or the data dir) already contains a mysql folder? This could be done either within the container or in the mounted volume. Is that right?

Correct, except for the bootstrap node. You should also consider using xtrabackup for SST.

@activatedgeek

@klausenbusk Perfect. This is working great now. I'll create a wrapper MariaDB image which takes an extra environment flag to create the dir for non-bootstrap nodes. Thank you so much for your inputs!
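
Such a wrapper could be sketched as follows. This is an illustration only: the `GALERA_JOINER` variable and the `prepare_joiner_datadir` function are hypothetical names, not something the image provides.

```shell
# Hypothetical wrapper logic; GALERA_JOINER is an assumed variable name.
# Per the discussion above, an existing $datadir/mysql directory stops the
# entrypoint from running its initialization code, so a joining node goes
# straight to mysqld and receives its data through SST.
prepare_joiner_datadir() {
	local datadir="${1:-/var/lib/mysql}"
	if [ "${GALERA_JOINER:-}" = "1" ]; then
		mkdir -p "$datadir/mysql"
	fi
}
```

The bootstrap node would leave `GALERA_JOINER` unset so that it still runs the normal initialization.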

@yosifkit
Contributor

I believe tanji had a working setup with this compose file: #57 (comment).

@wglambert wglambert added the question Usability question, not directly related to an error with the image label Apr 24, 2018
@strarsis

strarsis commented May 2, 2018

Not sure if this relates to this issue, but I get this warning on new database initialization:

[Warning] Failed to load slave replication state from table mysql.gtid_slave_pos: 1146: Table 'mysql.gtid_slave_pos' doesn't exist

@tanji

tanji commented May 3, 2018

@strarsis Unrelated, please open a different issue for that

@tianon
Contributor

tianon commented May 29, 2018

I think supporting replication out of the box is probably a bit too ambitious for this image (as noted by other folks above) -- there are too many edge cases and environmental configuration for us to do that really well in such a way that would satisfy not only existing replication users but also cover the use cases of new users.

@wglambert wglambert added Request Request for image modification or feature and removed question Usability question, not directly related to an error with the image labels Jul 11, 2018
@jan-hudec

jan-hudec commented Nov 8, 2018

Severalnines have an example for Docker Swarm in severalnines/galera-docker-mariadb. They also have an example for Kubernetes there, but I had problems with etcd (there is only an incubator chart for it and it seems broken), so I rewrote it to use labels in Kubernetes instead, and used this image.

Notable differences are that they are still using xtrabackup-v2, while this image (only) has mariabackup, and that the path to the galera_smm library differs. Version 10.2 also changes the state variables, so the healthcheck scripts need to be rewritten, as Severalnines still use 10.1.

I used this image unmodified, just injecting a script that wraps the default entrypoint from the Kubernetes manifest. It is Kubernetes-specific, so the appropriate place to host it would be helm/charts.

I had one issue with docker-entrypoint.sh though: I prefixed it with a bit that derives the appropriate --wsrep options, but passing them to docker-entrypoint.sh as is does not work, because they must not be passed to mysql_install_db. I hacked around it, but it would be better to have a way either to distinguish the options in the default script, or to tell it to just initialise the database without the final exec, so that the wrapper script would do that.
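
One way to distinguish the options, sketched here with hypothetical helper names (`is_wsrep_option`, `init_safe_args`) that do not exist in the entrypoint, is to filter out everything that starts with `--wsrep` before the initialization phase:

```shell
# Hypothetical helpers for a wrapper script: separate --wsrep* options from
# the remaining mysqld arguments so they can be withheld during initialization.
is_wsrep_option() {
	case "$1" in
		--wsrep[-_]*) return 0 ;;
		*) return 1 ;;
	esac
}

# Print only the arguments that are safe for the initialization phase,
# one per line.
init_safe_args() {
	local arg
	for arg in "$@"; do
		is_wsrep_option "$arg" || printf '%s\n' "$arg"
	done
}
```

This only covers options given on the command line; Galera settings that come from configuration files would need separate handling.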

@grooverdan
Member

Most discussion here seems Galera-related.

MDEV-25667 was recently raised to get better Galera support in the server, which will hopefully provide the necessary functionality with minimal entrypoint changes.

galera-arbitrator should be its own image rather than bloating this one.

@tymonx

tymonx commented May 19, 2021

I have had some success in this area. My experiences:

  • It is possible to have a fully automatic bootstrap for a new Galera cluster in a multi-container scenario without any manual action or interaction. In my case, I check for the presence of /var/lib/mysql/gvwstate.dat. To automatically determine the node used for bootstrapping, I compare the container host name with the provided list of Galera cluster nodes (hostnames).
  • It is also possible to have a fully automatic node join during Galera cluster creation. This requires two steps. The first step is the bootstrapping phase; the second is the node joining order, which can be achieved by looping through the provided Galera cluster nodes (hostnames), checking the Galera status of each node and waiting for the synced state of each previous node in order (like chaining, node1 -> node2 -> ... -> nodeN) before running mysqld (joining the cluster).
  • The default docker-entrypoint.sh script used in the official MariaDB image is not well suited to cluster replication. The main issues come from mysqld itself. For example, if someone provides a custom galera.cnf MariaDB configuration file, the Galera options will be used during the invocation of the mysql_install_db command (it creates a temporary mysqld instance) and during the pre-start MariaDB configuration/setup (it also creates a temporary mysqld instance) called by the default docker-entrypoint.sh script. This can result in strange behavior. To play it safe, I have created my own version of the docker-entrypoint.sh script.
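
The bootstrap decision from the first point could look roughly like the sketch below. It is not the code from the linked repository: the `should_bootstrap` name and its arguments are hypothetical, and choosing the first listed node as the bootstrap node is an assumption made for illustration.

```shell
# Sketch only: decide whether this container should bootstrap a new cluster.
# $1 = comma-separated node list (without gcomm://), $2 = this node's hostname,
# $3 = data directory. Returns 0 when the node should bootstrap.
should_bootstrap() {
	local nodes="$1" self="$2" datadir="$3"
	# Following the check described above: if gvwstate.dat is present, treat
	# the node as having cluster state already and do not bootstrap.
	[ ! -e "$datadir/gvwstate.dat" ] || return 1
	# Assumption: the first listed node is the designated bootstrap node.
	[ "${nodes%%,*}" = "$self" ]
}
```

An entrypoint could then add `--wsrep-new-cluster` only when `should_bootstrap "$nodes" "$(hostname)" /var/lib/mysql` succeeds.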

Working results can be found here: https://gitlab.com/tymonx/docker-mariadb. I have a simple Docker Compose example with automatic bootstrapping and automatic cluster node joining. This should work out of the box without requiring a third party (in particular, an additional cluster controller/manager). There are some TODOs, such as:

  • Automatically detecting the Galera cluster Initialized state on all nodes, determining the most advanced node and executing an automatic bootstrap.
  • Automatically detecting the Initialized state of node(s) when quorum is still valid and kicking the node back into the cluster.
  • My custom mysqld-entrypoint script should have the same features/capabilities as the default docker-entrypoint.sh script, or be merged into it.
  • Not fully tested, but results look promising.

@tymonx

tymonx commented May 20, 2021

It will definitely help.

This would make a user opt in - but setting the user setting mariadb config option for wsrep_notify_cmd (and other galera bits) which they'd be doing it anyway in their configuration.

This can be solved by detecting this and creating a wrapper script that executes both (or more) scripts: the current one (always forked) and the custom one provided by the user. It is similar to the observer pattern in software, where you register N custom callbacks and fire them in a loop for every event/notification. The same situation arises with the classic signal (C?) or trap (POSIX scripts) mechanisms in applications and scripts, and using the observer pattern helps to resolve it.

@tymonx

tymonx commented May 20, 2021

After some thinking, I have a proposal for improving wsrep_notify_cmd. This parameter should accept a list of scripts separated by commas (,) :) Or allow multiple invocations of the --wsrep-notify-cmd command line parameter. As I understand it, the current implementation of the wsrep_notify_cmd parameter handles only a single command/script.

@grooverdan
Member

yes, single command (implementation of execution)

@grooverdan
Member

I think you're over-engineering it (MDEV-25742). If a user wants their own notification script as well as what you develop, they can wrap both in a script themselves. If really needed, it's a rather simple shell script to fan out and reap; let's not burden the server with the added complexity.
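
For illustration, such a fan-out-and-reap wrapper might look like the sketch below. The `notify_fan_out` function and the idea of keeping the individual scripts in one directory are assumptions, not anything shipped with MariaDB or the image.

```shell
# Sketch: run every executable in a directory with the same arguments in
# parallel, then reap them all. Returns non-zero if any of them failed.
# $1 = directory holding the notification scripts (hypothetical layout),
# remaining arguments = whatever the server passed to wsrep_notify_cmd.
notify_fan_out() {
	local dir="$1" script pids="" pid rc=0
	shift
	for script in "$dir"/*; do
		[ -x "$script" ] || continue
		"$script" "$@" &
		pids="$pids $!"
	done
	for pid in $pids; do
		wait "$pid" || rc=1
	done
	return "$rc"
}
```

A small script that calls `notify_fan_out /some/dir "$@"` could then be the single command configured in wsrep_notify_cmd.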

@tymonx

tymonx commented May 20, 2021

If you think this feature will not help or make something easier, I can close it. At the beginning the idea sounded good :)

I have prepared a simple test. When I was using the official 10.6 image, initial auto-joining didn't work, as I mentioned before. But after replacing the default docker-entrypoint.sh from the official mariadb:10.6 image with the newest docker-entrypoint.sh from the master branch, it started working. So far I have run both (original and updated) several times. I suspect that the fix in #358 helped. The temporarily created mysqld instances were reading the Galera configuration settings and trying to apply them, which messed up the joining sequence. I think I can now drop my silly idea of keeping the initial sequence order of nodes when auto-creating a cluster :)

My tests can be found here: https://gitlab.com/tymonx/docker-mariadb/-/tree/dev-mariadb-docker-initial-auto-join-failure

If you want look into CI logs:

@tymonx

tymonx commented May 20, 2021

I think that if you finish implementing this feature (MDEV-25667) and add support for the GALERA_AUTO_BOOTSTRAP and maybe optionally the GALERA_AUTO_RECOVER environment variables, it should be quite enough :)

@tymonx

tymonx commented May 20, 2021

Good news! After updating the docker-entrypoint.sh script with the newest one, everything works beautifully :)

I have added only the auto bootstrap feature.

My working replication support https://gitlab.com/tymonx/docker-mariadb:

  • based on the original MariaDB 10.6 image with an updated docker-entrypoint.sh script
  • a simple custom mysqld-entrypoint script that can be merged later into the default docker-entrypoint.sh script. @grooverdan I can prepare a proper Pull Request for that
  • it automatically detects whether the wsrep_on parameter is enabled, based on the mysqld configuration files or the provided command line arguments
  • it parses the wsrep_cluster_address parameter from the mysqld configuration files, the provided command line arguments or the new CLUSTER_ADDRESS environment variable to automatically determine the node used for cluster bootstrapping
  • by default, the automatic cluster bootstrapping feature is enabled and uses the first node from the wsrep_cluster_address parameter. This can be disabled with the CLUSTER_AUTO_BOOTSTRAP=<0|NO|OFF|FALSE|DISABLE> environment variable. It is also possible to select another node with the CLUSTER_BOOTSTRAP_ADDRESS=<hostname|ip> environment variable.

@grooverdan
Member

grooverdan commented May 21, 2021

Quick note: 6f5d272 is the cause of the timezone initialization failures in your CI. A 10.6.1 release is imminent. You can work around it with MARIADB_INITDB_SKIP_TZINFO=1. Looking at other comments now.

Chat available on https://mariadb.zulipchat.com

@tymonx

tymonx commented May 21, 2021

Thanks! I have recently tested with Docker Compose and Docker Swarm. Current mysqld-entrypoint implementation.

Docker Compose scaling:

CLUSTER_ADDRESS="gcomm://db_node_1,db_node_2,db_node_3,db_node_4,db_node_5"
docker-compose --project-name db up --scale node="$(echo "${CLUSTER_ADDRESS}" | tr ',' ' ' | wc -w)"

@tymonx

tymonx commented May 21, 2021

Scaling using an external MySQL configuration file (it first needs to be added to the volumes: section):

docker-compose --project-name db up --scale node="$(grep -i wsrep_cluster_address <name>.cnf | tr -d ' ' | tr ',' ' ' | wc -w)"

@grooverdan
Member

ok. Looking forward to a PR.

Notes based on current entrypoint:

  • Please make better use of bash string functions rather than sed/grep -q, and obviously of the existing functions within the entrypoint.
  • For consistent naming, use WSREP_{AUTO_BOOTSTRAP,BOOTSTRAP_ADDRESS,*} if needed rather than GALERA/CLUSTER.
  • Existing variables are just checked with [ -n "$MARIADB_PASSWORD" ] for enabled, so let's keep it simple there rather than have a proliferation of accepted options that only apply to these new variables: "0\|no\|off\|false\|disable"
  • I haven't seen much ${@:+$@} use before. My bash knowledge isn't perfect, however "$@" is sufficient, I suspect.
  • When you have a PR ready, can you show the delta against a different branch that shows what it would look like with MDEV-25667 ready? The current simplicity of is_new_cluster makes it look like it's not needed.
  • I'm still worried about auto-enable messing with an existing user that has something working. Maybe a WSREP_AUTOBOOTSTRAP={nodename} to identify the node to wsrep-new-cluster on.
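
As an illustration of the first note, the node list in a wsrep_cluster_address value can be handled with shell parameter expansion alone. The function names below are hypothetical and are not taken from the entrypoint or from the PR.

```shell
# Sketch: parse a wsrep_cluster_address value without sed, grep, tr or wc.

# Strip the gcomm:// scheme and any trailing ?option=value part.
cluster_node_list() {
	local list="${1#gcomm://}"
	printf '%s\n' "${list%%[?]*}"
}

# Print the first node of the list.
cluster_first_node() {
	local list
	list="$(cluster_node_list "$1")"
	printf '%s\n' "${list%%,*}"
}

# Print the number of nodes in the list.
cluster_node_count() {
	local list node count=0
	list="$(cluster_node_list "$1")"
	local IFS=,
	for node in $list; do
		count=$((count + 1))
	done
	echo "$count"
}
```

With these, the `--scale node=...` value from the Compose examples could be computed as `cluster_node_count "$CLUSTER_ADDRESS"`.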

@tymonx

tymonx commented May 21, 2021

Sure :) I will fork this project and prepare a proper PR including your suggestions.

> I haven't seen much ${@:+$@} use before. My bash knowledge isn't perfect, however "$@" is sufficient, I suspect.

It is related to the SC2068 warning. The ${@:+$@} construct only silences the ShellCheck linter warning (or error) for $@, rather than putting # shellcheck disable=SC2068 everywhere. You are right, in this case using the "$@" construct is sufficient.
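
A small sketch of why the quoted form is the safer one; the three function names are made up for this demonstration. The unquoted construct re-splits arguments that contain whitespace, while "$@" forwards them unchanged.

```shell
# Print how many arguments were received.
count_args() { echo "$#"; }

# Quoted: each original argument is forwarded as exactly one word.
forward_quoted() { count_args "$@"; }

# Unquoted: arguments containing whitespace are split again.
forward_unquoted() { count_args ${@:+$@}; }
```

Calling both with the arguments `"a b"` and `c` shows the difference: the quoted variant forwards two arguments, the unquoted one forwards three.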

tymonx added a commit to tymonx/mariadb-docker that referenced this issue May 21, 2021
This patch adds support for Galera replication.

Features:
- it detects if Galera replication was enabled using `mysql`
  configuration files or provided `mysqld` command line arguments
- by default it enables the cluster auto bootstrap feature
- by default the first cluster node is used for cluster auto bootstrapping
  based on the wsrep_cluster_address parameter from `mysql`
  configuration files, `mysqld` command line arguments or by setting the
  `WSREP_CLUSTER_ADDRESS` environment variable
- cluster auto bootstrap feature can be disabled by setting the
  `WSREP_SKIP_AUTO_BOOTSTRAP` environment variable
- use the `WSREP_AUTO_BOOTSTRAP_ADDRESS` environment variable to explicitly
  choose another node for cluster bootstrapping
- cluster node hostnames or IP addresses must be valid to enable cluster
  auto bootstrapping

How to use it.

1. Prepare `mysql` configuration file `galera.cnf`:

```plaintext
[galera]
wsrep_on                       = ON
wsrep_sst_method               = rsync
wsrep_provider                 = /usr/lib/libgalera_smm.so
bind-address                   = 0.0.0.0
binlog_format                  = row
default_storage_engine         = InnoDB
innodb_doublewrite             = 1
innodb_autoinc_lock_mode       = 2
innodb_flush_log_at_trx_commit = 2
```

2. Make it read-only:

```plaintext
chmod 444 galera.cnf
```

3. Prepare Docker Compose file `docker-compose.yml`:

```yaml
services:
    node:
        image: mariadb
        restart: always
        security_opt:
            - label=disable
        environment:
            WSREP_CLUSTER_ADDRESS: "${WSREP_CLUSTER_ADDRESS:-}"
            MYSQL_ROOT_PASSWORD: example
        volumes:
            - ./galera.cnf:/etc/mysql/conf.d/10-galera.cnf:ro
        command:
            - --wsrep-cluster-address=gcomm://db_node_1,db_node_2,db_node_3
        deploy:
            replicas: 3
```

4. Start Docker Compose:

```plaintext
docker-compose --project-name db up
```

To start N MariaDB instances using environment variable:

```plaintext
WSREP_CLUSTER_ADDRESS="gcomm://db_node_1,db_node_2,db_node_3,db_node_4,db_node_5"
docker-compose --project-name db up --scale node="$(echo "${WSREP_CLUSTER_ADDRESS}" | tr ',' ' ' | wc -w)"
```

To start N MariaDB instances using `mysql` configuration file:

```plaintext
docker-compose --project-name db up --scale node="$(grep -i wsrep_cluster_address <name>.cnf | tr -d ' ' | tr ',' ' ' | wc -w)"
```
@tymonx

tymonx commented May 21, 2021

@grooverdan done. Ready for review :)

@tymonx

tymonx commented May 22, 2021

> I'm still worried about auto-enable messing with an existing user that has something working. Maybe a WSREP_AUTOBOOTSTRAP={nodename} to identify the node to wsrep-new-cluster on.

I have added two useful environment variables: WSREP_SKIP_AUTO_BOOTSTRAP and WSREP_AUTO_BOOTSTRAP_ADDRESS=<ip|hostname>.

I can reverse the logic and disable the auto bootstrap feature by default. Then I can replace the WSREP_SKIP_AUTO_BOOTSTRAP environment variable with WSREP_AUTO_BOOTSTRAP.
If a user provides a Galera configuration and sets the wsrep_cluster_address parameter (or uses --wsrep-cluster-address or WSREP_CLUSTER_ADDRESS), they must also explicitly set WSREP_AUTO_BOOTSTRAP=1.

grooverdan pushed a commit to tymonx/mariadb-docker that referenced this issue Feb 10, 2022
This patch adds support for Galera replication.

Features:
- It detects if Galera replication was enabled (wsrep_on=ON)
- By default it enables the cluster auto bootstrap feature
- By default the first cluster node is used for cluster auto bootstrapping
  based on the wsrep_cluster_address parameter or by setting the
  `WSREP_CLUSTER_ADDRESS` environment variable
- The cluster auto bootstrap feature can be disabled by setting the
  `WSREP_SKIP_AUTO_BOOTSTRAP` environment variable
- Use the `WSREP_AUTO_BOOTSTRAP_ADDRESS` environment variable to explicitly
  choose another node for cluster bootstrapping
- Cluster node hostnames or IP addresses must be valid to enable cluster
  auto bootstrapping

How to use it.

1. Prepare MariaDB configuration file `galera.cnf`:

```plaintext
[galera]
wsrep_on                       = ON
wsrep_sst_method               = mariabackup
wsrep_provider                 = /usr/lib/libgalera_smm.so
binlog_format                  = row
default_storage_engine         = InnoDB
innodb_doublewrite             = 1
innodb_autoinc_lock_mode       = 2
```

2. Make it read-only:

```plaintext
chmod 444 galera.cnf
```

3. Prepare Docker Compose file `docker-compose.yml`:

```yaml
services:
    node:
        image: mariadb
        restart: always
        security_opt:
            - label=disable
        environment:
            WSREP_CLUSTER_ADDRESS: "${WSREP_CLUSTER_ADDRESS:-}"
            MARIADB_ROOT_PASSWORD: example
        volumes:
            - ./galera.cnf:/etc/mysql/conf.d/10-galera.cnf:ro
        command:
            - --wsrep-cluster-address=gcomm://db_node_1,db_node_2,db_node_3
        deploy:
            replicas: 3
```

4. Start Docker Compose:

```plaintext
docker-compose --project-name db up
```

To start N MariaDB instances using environment variable:

```plaintext
WSREP_CLUSTER_ADDRESS="gcomm://db_node_1,db_node_2,db_node_3,db_node_4,db_node_5"
docker-compose --project-name db up --scale node="$(echo "${WSREP_CLUSTER_ADDRESS}" | tr ',' ' ' | wc -w)"
```

To start N MariaDB instances using MariaDB configuration file:

```plaintext
docker-compose --project-name db up --scale node="$(grep -i wsrep_cluster_address <name>.cnf | tr -d ' ' | tr ',' ' ' | wc -w)"
```

Closes: MariaDB#28
grooverdan pushed a commit to tymonx/mariadb-docker that referenced this issue Feb 15, 2022
@chengkuangan

The existing image supports Galera. The Galera library is included in the image. I have made it work for my NextCloud. I even have MaxScale running in front of these cluster nodes.

I am using k8s (on RPI4s :-) ), so I will just highlight the changes needed; I believe you can easily replicate them for Docker and so on.

You need to run the container with the following command. This is the k8s YAML syntax, which should be very similar to Docker Compose. Use the following to start the master node. Launch your slave nodes using your standard approach. It is best to wait for the master node to be ready and only then launch the slave nodes.

command: ["mariadbd"]
args: ["--user=mysql", "--wsrep-new-cluster"]

The following is the config required in mariadb.cnf on top of whatever you have now. This should be the same for all your MariaDB instances.

[galera]
    # Mandatory settings
    wsrep_on=ON
    # this is the correct path for the library. I am using image 10.7.3
    wsrep_provider=/usr/lib/galera/libgalera_smm.so
    #add your node ips here
    # make sure these are IPs or Resolvable DNS/domain names
    wsrep_cluster_address="gcomm://mariadb1,mariadb2,mariadb3"
    binlog_format=row
    default_storage_engine=InnoDB
    innodb_autoinc_lock_mode=2
    #Cluster name
    wsrep_cluster_name="nc-mariadb_cluster"
    # Allow server to accept connections on all interfaces.
    bind-address=0.0.0.0
    # this server ip, change for each server
    # this server name, change for each server
    wsrep_node_name="mariadb1"
    wsrep_sst_method=rsync
    innodb_doublewrite=1 

@chengkuangan

Screenshot 2022-03-11 at 7 52 26 PM

@skjnldsv

skjnldsv commented Mar 11, 2022

@chengkuangan But then you have to edit the config every time you want to add a new node, no?
How did you make server1 and server3 slaves? Is this just a multi-master Galera cluster with a MaxScale read/write split to create a master-slave setup?

If you stop the cluster, how do you restart it again? For me it always complains, since the cluster has already been initialized and --wsrep-new-cluster is not really relevant here, right?

Thanks for taking the time to share this with us :)

@chengkuangan

@skjnldsv
Galera configures a master-master cluster. The master-slave setup you see in MaxScale is because of the read/write split (which is required by NextCloud in my case).

From what I read, if we stop the cluster completely, it is considered terminated and we need to bootstrap the cluster again, as mentioned in the docs. It is a manual step.

Thanks for pointing out the removal of --wsrep-new-cluster. It is only needed for the first node bootstrap.
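
When doing that manual step, recent Galera versions record a `safe_to_bootstrap` field in the `grastate.dat` file of each node's data directory, which indicates the node that may be bootstrapped from after a full shutdown. A minimal sketch for checking it follows; the function name and the default path are assumptions, and this is not something the image does for you.

```shell
# Sketch: report whether Galera marked this node as safe to bootstrap from.
# $1 = data directory (defaults to /var/lib/mysql).
# Returns 0 only when grastate.dat contains "safe_to_bootstrap: 1".
safe_to_bootstrap() {
	local file="${1:-/var/lib/mysql}/grastate.dat" key value
	[ -r "$file" ] || return 1
	while IFS=': ' read -r key value; do
		if [ "$key" = "safe_to_bootstrap" ]; then
			[ "$value" = "1" ]
			return
		fi
	done < "$file"
	return 1
}
```

Only the node for which this check succeeds would be restarted with `--wsrep-new-cluster`; the others would join it as usual.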

To be frank, I am not an expert here... This is my first time setting up a MariaDB cluster and MaxScale, and it is an RPI4 playground, so there is no guarantee that I am 100% correct. I am just sharing what I have learned, and it is subject to mistakes. :-)

@chengkuangan

The existing image supports Galera; the Galera library is included in the image. I have made it work for my NextCloud, and I even have MaxScale running in front of these cluster nodes.

I am using k8s (on RPI4s :-) ), so I will just highlight the changes needed; I believe you can easily replicate them in Docker and so on.

You need to run the container with the following command. This is YAML syntax and should be very similar in Docker Compose. Use it to start the master node, and launch your slave nodes the standard way, as you always do. It is best to wait for the master node to be ready before launching the slave nodes.

command: ["mariadbd"]
args: ["--user=mysql", "--wsrep-new-cluster"]

The following is the config required in mariadb.cnf on top of whatever you have now. This should be the same for all your MariaDB instances, apart from the per-server values marked in the comments.

[galera]
    # Mandatory settings
    wsrep_on=ON
    # this is the correct path for the library. I am using image 10.7.3
    wsrep_provider=/usr/lib/galera/libgalera_smm.so
    #add your node ips here
    # make sure these are IPs or Resolvable DNS/domain names
    wsrep_cluster_address="gcomm://mariadb1,mariadb2,mariadb3"
    binlog_format=row
    default_storage_engine=InnoDB
    innodb_autoinc_lock_mode=2
    #Cluster name
    wsrep_cluster_name="nc-mariadb_cluster"
    # Allow server to accept connections on all interfaces.
    bind-address=0.0.0.0
    # this server ip, change for each server
    # this server name, change for each server
    wsrep_node_name="mariadb1"
    wsrep_sst_method=rsync
    innodb_doublewrite=1 
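
Since only the member list and the node name change between servers, those two lines could be generated instead of edited by hand. A minimal sketch, assuming a hypothetical helper named `galera_node_cnf` (not part of the image) that takes this node's name followed by all cluster member names:

```shell
#!/bin/sh
# Hypothetical helper: print the node-specific part of the [galera] section.
# $1 = this node's name, remaining arguments = names of all cluster members.
galera_node_cnf() {
    node="$1"
    shift
    # Join the remaining arguments with commas for the gcomm:// address.
    members=$(echo "$*" | tr ' ' ',')
    printf '[galera]\n'
    printf 'wsrep_cluster_address="gcomm://%s"\n' "$members"
    printf 'wsrep_node_name="%s"\n' "$node"
}

# Example: settings for the second of three nodes.
galera_node_cnf mariadb2 mariadb1 mariadb2 mariadb3
```

The output could be redirected into its own file next to the shared settings, for example under /etc/mysql/conf.d/ (the directory and any file name are assumptions about how your config is mounted), so the common part stays identical on every node.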

I forgot to mention: I have to start all the Pods the first time without Galera. When a Pod is ready, I change its config to enable Galera and restart it.

The reason is that if Galera is enabled on the first start, the instance fails with an error complaining that the mysql tables and other files are not available. I guess this may be because docker-entrypoint.sh is not designed for Galera on first boot, so the database server is not initialized.

@chengkuangan

Oh, another thing: maybe I have not looked hard enough, but so far I can't find the grastate.dat file mentioned in the doc, which I need in order to set safe_to_bootstrap to 1. So I am doomed if my cluster crashes or shuts down completely.

Any idea?

@grooverdan
Member

There are some notes started on grastate.dat, with documentation references, on MDEV-25855. Insights and corrections welcome.

@chengkuangan

grastate.dat

@grooverdan To verify ... grastate.dat will not be created if the cluster is gracefully shutdown, am I correct?

@skjnldsv

skjnldsv commented Mar 17, 2022

@grooverdan To verify ... grastate.dat will not be created if the cluster is gracefully shutdown, am I correct?

You can generate it with --wsrep-recover
Then edit the grastate.dat and start again without --wsrep-recover
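
The edit itself is a one-line change. A minimal sketch, assuming grastate.dat contains a line of the form `safe_to_bootstrap: 0`; the helper name `mark_safe_to_bootstrap` and the sample file contents are illustrative only:

```shell
#!/bin/sh
# Hypothetical helper: set safe_to_bootstrap to 1 in the given grastate.dat.
# Writes to a temporary file first instead of relying on "sed -i",
# whose syntax differs between GNU and BSD sed.
mark_safe_to_bootstrap() {
    file="$1"
    tmp="$file.tmp"
    sed 's/^safe_to_bootstrap:.*/safe_to_bootstrap: 1/' "$file" > "$tmp" && mv "$tmp" "$file"
}

# Demo on a throwaway file; on a real node the file lives in the data directory.
demo=$(mktemp)
printf '%s\n' \
    '# GALERA saved state' \
    'version: 2.1' \
    'uuid:    00000000-0000-0000-0000-000000000000' \
    'seqno:   -1' \
    'safe_to_bootstrap: 0' > "$demo"
mark_safe_to_bootstrap "$demo"
grep '^safe_to_bootstrap' "$demo"
rm -f "$demo"
```

Only do this on the one node you intend to bootstrap from; the other nodes should keep their value and simply rejoin.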

@grooverdan
Member

I'm still learning the mechanics of the mechanisms available

@chengkuangan

I recently had a crash. I followed the documented approach and was able to bring the cluster back without any problem. Thanks for all the tips and guides here. This is what I did:

  1. Change the YAML command and args to perform a recovery boot for each Pod. Once the process has completed, set the replicas to 0.
        command: ["mariadbd"]
        args: ["--user=mysql", "--wsrep-recover"]
  2. Check grastate.dat. It is located at the root of /data (in my case); mysql is at /data/mysql. In my case, maybe because my NextCloud DB is not busy (I am the only user), the seqno of all 3 instances is -1, so I guess it is fine to choose any instance to bootstrap again.

  3. Bootstrap one of the nodes with --wsrep-new-cluster. Once the Pod is ready, start all the remaining nodes. All are active and serving requests now without problems.

  4. Remove --wsrep-new-cluster from the command and args and reapply the YAML changes.
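
When the seqno values differ between nodes, the node to bootstrap from is normally the one with the highest value. A recovery boot reports the position in a log line of the form `WSREP: Recovered position: <uuid>:<seqno>`. The sketch below assumes you have collected that line from each node; the helper names, the log prefix and the sample numbers are illustrative, not taken from this cluster:

```shell
#!/bin/sh
# Hypothetical helper: read log text on stdin and print the seqno from the
# last "WSREP: Recovered position: <uuid>:<seqno>" line.
recovered_seqno() {
    sed -n 's/.*WSREP: Recovered position: *[0-9a-fA-F-]*:\(-\{0,1\}[0-9][0-9]*\).*/\1/p' | tail -n 1
}

# Hypothetical helper: read "<node> <seqno>" pairs on stdin and print the
# name of the node with the highest seqno.
most_advanced_node() {
    sort -k2,2n | tail -n 1 | cut -d' ' -f1
}

# Example with made-up values.
echo '[Note] WSREP: Recovered position: 37bb872a-ad73-11e6-819f-f3b71d9c5ada:345628' | recovered_seqno
printf '%s\n' 'mariadb1 345626' 'mariadb2 345628' 'mariadb3 345627' | most_advanced_node
```

If every node reports the same seqno, as in the case described above, any of them can be chosen.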
