[ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out) at gcomm/src/pc.cpp:connect():158

Another issue we can run into is the following set of messages:

2019-06-08T04:56:24.518538Z 0 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
at gcomm/src/pc.cpp:connect():158
2019-06-08T04:56:24.518591Z 0 [ERROR] WSREP: gcs/src/gcs_core.cpp:gcs_core_open():209: Failed to open backend connection: -110 (Connection timed out)
2019-06-08T04:56:24.518764Z 0 [ERROR] WSREP: gcs/src/gcs.cpp:gcs_open():1458: Failed to open channel 'galera_cluster1' at 'gcomm://192.168.0.126,192.168.0.107,192.168.0.114': -110 (Connection timed out)
2019-06-08T04:56:24.518793Z 0 [ERROR] WSREP: gcs connect failed: Connection timed out
2019-06-08T04:56:24.518812Z 0 [ERROR] WSREP: wsrep::connect(gcomm://192.168.0.126,192.168.0.107,192.168.0.114) failed: 7
2019-06-08T04:56:24.518835Z 0 [ERROR] Aborting

To solve this, take a look at your /etc/my.cnf file. The following fields have to match the server that you are on:

wsrep_node_address="192.168.0.126"
wsrep_node_name="mysql01"
server_id=1
bind-address=192.168.0.126

If they don't, the above error is thrown. server_id must be unique for each node. This mismatch can happen when you're restoring your database from one node to another, or during recovery steps where you copy /etc/my.cnf over to another host.
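For example, a matching /etc/my.cnf fragment for mysql02 might look like the following (the address 192.168.0.107 is taken from the gcomm:// list in the log above and is only assumed here to belong to mysql02; substitute your own values):

wsrep_node_address="192.168.0.107"
wsrep_node_name="mysql02"
server_id=2
bind-address=192.168.0.107

Each node gets its own address, name, and server_id, while the wsrep_cluster_address (gcomm://) list normally stays the same on every node.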

Another solution involves copying the data directory from the most current host to the target host we want to use as the new master. We did this because we wanted to test running with only 2 of the 3 nodes while the original master was offline. Here is our situation (see the check after the list below for one way to identify the most current node):

mysql01 GOOD
mysql02 BAD
mysql03 BAD
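One way to identify the most current node (a sketch; run it with mysqld stopped, and note that the exact binary invocation can differ between distributions):

mysql01 # grep seqno /var/lib/mysql/grastate.dat
mysql01 # mysqld --user=mysql --wsrep-recover

The node with the highest seqno, either in grastate.dat or in the "Recovered position" line that --wsrep-recover writes to the error log, is the safest one to copy from and bootstrap. A seqno of -1 means the node was not shut down cleanly and --wsrep-recover is needed to find its position.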

Copy the data dir from mysql01 to mysql02:

mysql02 # cd /var/lib/mysql; scp -rp 'mysql01:/var/lib/mysql/*' .
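A slightly fuller sketch of the same copy (assuming mysqld is stopped on mysql02, root SSH access to mysql01, and rsync being available; adjust the paths if your datadir differs):

mysql02 # systemctl stop mysqld
mysql02 # rsync -a --delete mysql01:/var/lib/mysql/ /var/lib/mysql/
mysql02 # chown -R mysql:mysql /var/lib/mysql

rsync with the trailing slashes copies the contents of the directory rather than nesting a second mysql/ folder inside it.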

Edit grastate.dat on mysql02 and set the safe_to_bootstrap flag to 1, so it looks like this:

cat grastate.dat
# GALERA saved state
version: 2.1
uuid:    f25fc12b-8a0b-11e9-b58d-bfb801e3b36d
seqno:   -1
safe_to_bootstrap: 1
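If the flag reads 0, a quick way to flip it (assuming the default /var/lib/mysql datadir):

mysql02 # sed -i 's/^safe_to_bootstrap: 0$/safe_to_bootstrap: 1/' /var/lib/mysql/grastate.dat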

(If grant tables are an issue) Start mysqld with grant tables disabled so you can reset the password, if need be:

[root@mysql03 mysql]# systemctl set-environment MYSQLD_OPTS="--wsrep-new-cluster --skip-grant-tables"

(Recommended) Or just set the following variable, indicating this is a newly bootstrapped cluster:

[root@mysql03 mysql]# systemctl set-environment MYSQLD_OPTS="--wsrep-new-cluster"

Bootstrap this node:

mysql02 # /usr/bin/mysqld_bootstrap
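Once it is up, a quick sanity check (assuming you can log in; these are standard Galera status variables) should show a primary component of one node:

mysql02 # mysql -e "SHOW STATUS LIKE 'wsrep_cluster_size'; SHOW STATUS LIKE 'wsrep_cluster_status';"

Expect wsrep_cluster_size = 1 and wsrep_cluster_status = Primary at this stage.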

On the third node, mysql03, remove all files from the /var/lib/mysql folder because we’ll let it sync up from mysql02:

mysql03 # cd /var/lib; mv mysql mysql-bk01; mkdir mysql; chown mysql:mysql mysql; cd mysql;
OR
mysql03 # cd /var/lib/mysql; rm -rf *

Start mysql on mysql03 so it syncs from mysql02:

mysql03 # systemctl start mysqld

Let it sync. You should have an accessible 2-of-3-node cluster at this point. If the original node you bootstrapped from exhibits the same issue, also clear its directory and restart it to allow data to be copied from the two good nodes.
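To watch the state transfer on mysql03 while it catches up (a sketch; assumes systemd journal logging and the mysqld service name):

mysql03 # journalctl -u mysqld -f
mysql03 # mysql -e "SHOW STATUS LIKE 'wsrep_local_state_comment';"

Once mysqld is accepting connections, wsrep_local_state_comment should move through Joining/Joined to Synced when the SST from mysql02 completes.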

Clear the MYSQLD_OPTS flag that was set earlier on the original bootstrapped node and restart it; after these two steps you can also reboot it to test:

[root@mysql01 mysql]# systemctl set-environment MYSQLD_OPTS="";
[root@mysql01 mysql]# systemctl restart mysqld;

Restart the cluster one node at a time at first, then all together, to verify the cluster comes back up properly.
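As a final check from any node (assuming you can log in), confirm all members rejoined and the cluster is primary:

[root@mysql01 mysql]# mysql -e "SHOW STATUS LIKE 'wsrep_cluster_size'; SHOW STATUS LIKE 'wsrep_cluster_status'; SHOW STATUS LIKE 'wsrep_incoming_addresses';"

wsrep_cluster_size should report 3 and wsrep_cluster_status should report Primary once everything is back.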

Thx,
TK
