Postgres SQL HA Cluster ( Quick Start Guide )
We will be keeping the Postgres SQL HA Cluster configuration / setup very very brief using only the bare essentials to get it up and running.
Before we begin, we need to plan things out on our CentOS 7.X servers. First, fill in this table of what your cluster is supposed to look like when done:
NAME | ADDRESS | HOSTNAME | SERVICES |
---|---|---|---|
psql01 | 192.168.0.108 | psql01.nix.mds.xyz | PostgreSQL, ETCD, Patroni, HAPROXY, Keepalived |
psql02 | 192.168.0.124 | psql02.nix.mds.xyz | PostgreSQL, ETCD, Patroni, HAPROXY, Keepalived |
psql03 | 192.168.0.118 | psql03.nix.mds.xyz | PostgreSQL, ETCD, Patroni, HAPROXY, Keepalived |
psql-c01 (VIP) | 192.168.0.112 | psql-c01.nix.mds.xyz |
OS:
Set the IP forward and IP nonlocal bind kernel parameters for Haproxy / Keepalived:
echo "net.ipv4.ip_nonlocal_bind = 1" >> /etc/sysctl.conf; echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.conf; sysctl -p
FIREWALL:
firewall-cmd –zone=public –permanent –add-port=2379/tcp
firewall-cmd –zone=public –permanent –add-port=2379/udp
firewall-cmd –zone=public –permanent –add-port=2380/tcp
firewall-cmd –zone=public –permanent –add-port=8008/tcp
firewall-cmd –zone=public –permanent –add-port=5432/tcp
firewall-cmd –zone=public –permanent –add-port=67/tcp
firewall-cmd –zone=public –permanent –add-port=68/tcp
firewall-cmd –reload
firewall-cmd –zone=public –list-all
[root@psql02 ~]# cat /etc/firewalld/zones/public.xml
<?xml version="1.0" encoding="utf-8"?>
<zone>
<port protocol="udp" port="20048"/>
<port protocol="tcp" port="2049"/>
<port protocol="tcp" port="111"/>
<port protocol="udp" port="111"/>
<port protocol="tcp" port="24007-24008"/>
<port protocol="tcp" port="38465-38469"/>
<port protocol="tcp" port="4501"/>
<port protocol="udp" port="4501"/>
<port protocol="tcp" port="22"/>
<port protocol="udp" port="22"/>
<port protocol="udp" port="49000-59999"/>
<port protocol="tcp" port="49000-59999"/>
<port protocol="tcp" port="9000"/>
<port protocol="udp" port="9000"/>
<port protocol="udp" port="137"/>
<port protocol="udp" port="138"/>
<port protocol="tcp" port="20048"/>
<port protocol="udp" port="2049"/>
<port protocol="tcp" port="2379"/>
<port protocol="udp" port="2379"/>
<port protocol="tcp" port="2380"/>
<port protocol="tcp" port="8008"/>
<port protocol="tcp" port="5432"/>
<port protocol="tcp" port="67"/>
<port protocol="tcp" port="68"/>
</zone>
PostgreSQL 10:
Install PostgreSQL 10 server:
rpm -e $(rpm -aq|grep -Ei postgres); rpm -Uvh https://yum.postgresql.org/10/redhat/rhel-7-x86_64/pgdg-centos10-10-2.noarch.rpm; yum -y install postgresql10 postgresql10-server postgresql10-contrib postgresql10-libs
Rocky / Alma Linux
rpm -e $(rpm -aq|grep -Ei postgres); rpm -Uvh https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm; yum -y install postgresql10 postgresql10-server postgresql10-contrib postgresql10-libs
PATRONI:
Install patroni on all 3 nodes.
yum install python python-pip -y
pip install –upgrade setuptools
pip install patroni
pip install patroni[etcd]
mkdir -p /data/patroni; chmod 700 /data/patroni
Mar 2022
yum install python python3-pip -y
yum install gcc python3-devel -y
pip3 install –upgrade setuptools
pip3 install patroni
pip3 install patroni[etcd]
mkdir -p /data/patroni
Alternatively, use yum to install needed packages, skipping pip3 entirely:
yum install gcc python3-pip python3-devel python3-psycopg2 etcd patroni python3-etcd python-urllib3 -y; mkdir -p /data/patroni; chown postgres.postgres /data/patroni; chmod 700 /data/patroni
Will still need the urllib3 pip3 install:
pip3 install urllib3 patroni[etcd]
This will result in the following being installed:
[root@psql06 ~]# rpm -aq|grep -Ei "patroni|etcd|psycopg"
python3-psycopg2-2.8.6-1.rhel7.x86_64
etcd-3.3.11-2.el7.centos.x86_64
python3-etcd-0.4.5-20.rhel7.noarch
patroni-2.1.3-1.rhel7.x86_64
[root@psql06 ~]# pip3 list | grep -Ei "patroni|psycopg|etcd"
patroni 2.1.3
psycopg2 2.8.6
python-etcd 0.4.5
[root@psql06 ~]#
Resolve any dependencies you may have. This site and Google are your friends. Add some needed binaries for Patroni. This should be completed by the above commandsso this can be skipped if already installed above:
yum install -y libpq5-devel # pg_config
pip3 install psycopg2 psycopg2-binary psycopg
Once installed, setup the Patroni configuration files as such:
psql01
# cat /etc/patroni.yml
scope: postgres
namespace: /db/
name: postgresql0
restapi:
listen: 192.168.0.108:8008
connect_address: 192.168.0.108:8008
etcd:
host: 192.168.0.108:2379
bootstrap:
dcs:
ttl: 30
loop_wait: 10
retry_timeout: 10
maximum_lag_on_failover: 1048576
postgresql:
use_pg_rewind: true
initdb:
– encoding: UTF8
– data-checksums
pg_hba:
– host replication replicator 127.0.0.1/32 md5
– host replication replicator 192.168.0.108/0 md5
– host replication replicator 192.168.0.124/0 md5
– host replication replicator 192.168.0.118/0 md5
– host all all 0.0.0.0/0 md5
users:
admin:
password: admin
options:
– createrole
– createdb
postgresql:
listen: 192.168.0.108:5432
bin_dir: /usr/pgsql-10/bin
connect_address: 192.168.0.108:5432
data_dir: /data/patroni
pgpass: /tmp/pgpass
unix_socket_directories: /data/patroni
authentication:
replication:
username: replicator
password: rep-pass
superuser:
username: postgres
password: secretpassword
parameters:
unix_socket_directories: '.'
tags:
nofailover: false
noloadbalance: false
clonefrom: false
nosync: false
psql02
scope: postgres
namespace: /db/
name: postgresql1
restapi:
listen: 192.168.0.124:8008
connect_address: 192.168.0.124:8008
etcd:
host: 192.168.0.124:2379
bootstrap:
dcs:
ttl: 30
loop_wait: 10
retry_timeout: 10
maximum_lag_on_failover: 1048576
postgresql:
use_pg_rewind: true
initdb:
– encoding: UTF8
– data-checksums
pg_hba:
– host replication replicator 127.0.0.1/32 md5
– host replication replicator 192.168.0.108/0 md5
– host replication replicator 192.168.0.124/0 md5
– host replication replicator 192.168.0.118/0 md5
– host all all 0.0.0.0/0 md5
users:
admin:
password: admin
options:
– createrole
– createdb
postgresql:
listen: 192.168.0.124:5432
bin_dir: /usr/pgsql-10/bin
connect_address: 192.168.0.124:5432
data_dir: /data/patroni
pgpass: /tmp/pgpass
unix_socket_directories: /data/patroni
authentication:
replication:
username: replicator
password: rep-pass
superuser:
username: postgres
password: secretpassword
parameters:
unix_socket_directories: '.'
tags:
nofailover: false
noloadbalance: false
clonefrom: false
nosync: false
psql03
# cat /etc/patroni.yml
scope: postgres
namespace: /db/
name: postgresql2
restapi:
listen: 192.168.0.118:8008
connect_address: 192.168.0.118:8008
etcd:
host: 192.168.0.118:2379
bootstrap:
dcs:
ttl: 30
loop_wait: 10
retry_timeout: 10
maximum_lag_on_failover: 1048576
postgresql:
use_pg_rewind: true
initdb:
– encoding: UTF8
– data-checksums
pg_hba:
– host replication replicator 127.0.0.1/32 md5
– host replication replicator 192.168.0.108/0 md5
– host replication replicator 192.168.0.124/0 md5
– host replication replicator 192.168.0.118/0 md5
– host all all 0.0.0.0/0 md5
users:
admin:
password: admin
options:
– createrole
– createdb
postgresql:
listen: 192.168.0.118:5432
bin_dir: /usr/pgsql-10/bin
connect_address: 192.168.0.118:5432
data_dir: /data/patroni
pgpass: /tmp/pgpass
unix_socket_directories: /data/patroni
authentication:
replication:
username: replicator
password: rep-pass
superuser:
username: postgres
password: secretpassword
parameters:
unix_socket_directories: '.'
tags:
nofailover: false
noloadbalance: false
clonefrom: false
nosync: false
ETCD:
Install the packages and configure as follows:
yum install etcd -y
psql01
# cat /etc/etcd/etcd.conf
ETCD_LISTEN_PEER_URLS="http://192.168.0.108:2380"
ETCD_LISTEN_CLIENT_URLS="http://localhost:2379,http://192.168.0.108:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.0.108:2380"
ETCD_INITIAL_CLUSTER="etcd01=http://192.168.0.108:2380,etcd02=http://192.168.0.124:2380,etcd03=http://192.168.0.118:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://192.168.0.108:2379"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-c01"
ETCD_INITIAL_CLUSTER_STATE="existing"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_NAME="etcd01"
psql02
# cat /etc/etcd/etcd.conf
ETCD_LISTEN_PEER_URLS="http://192.168.0.124:2380"
ETCD_LISTEN_CLIENT_URLS="http://localhost:2379,http://192.168.0.124:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.0.124:2380"
ETCD_INITIAL_CLUSTER="etcd01=http://192.168.0.108:2380,etcd02=http://192.168.0.124:2380,etcd03=http://192.168.0.118:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://192.168.0.124:2379"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-c01"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_NAME="etcd02"
psql03
# cat /etc/etcd/etcd.conf
ETCD_LISTEN_PEER_URLS="http://192.168.0.118:2380"
ETCD_LISTEN_CLIENT_URLS="http://localhost:2379,http://192.168.0.118:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.0.118:2380"
ETCD_INITIAL_CLUSTER="etcd01=http://192.168.0.108:2380,etcd02=http://192.168.0.124:2380,etcd03=http://192.168.0.118:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://192.168.0.118:2379"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-c01"
ETCD_INITIAL_CLUSTER_STATE="existing"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_NAME="etcd03"
HAPROXY:
Install haproxy using the following:
yum install haproxy -y
Throw down the config:
psql01
global
log 127.0.0.1 local0
stats socket /var/run/haproxy.sock mode 0600 level admin
user haproxy
group haproxy
daemon
debug
maxconn 1024
defaults
mode tcp
log global
option dontlognull
option redispatch
retries 2
timeout client 30m
timeout connect 4s
timeout server 30m
timeout check 5s
frontend psql-in
mode tcp
bind psql-c01:5432
option tcplog
default_backend psql-back
backend psql-back
mode tcp
option tcplog
option httpchk
http-check expect status 200
default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
server psql01.nix.mds.xyz psql01.nix.mds.xyz:5432 maxconn 1024 check port 8008
server psql02.nix.mds.xyz psql02.nix.mds.xyz:5432 maxconn 1024 check port 8008
server psql03.nix.mds.xyz psql03.nix.mds.xyz:5432 maxconn 1024 check port 8008
listen stats
bind :9000
mode http
stats enable
stats hide-version
stats realm Haproxy\ Statistics
stats uri /haproxy-stats
stats auth admin:sup3rs3cret
psql02
global
log 127.0.0.1 local0
stats socket /var/run/haproxy.sock mode 0600 level admin
user haproxy
group haproxy
daemon
debug
maxconn 1024
defaults
mode tcp
log global
option dontlognull
option redispatch
retries 2
timeout client 30m
timeout connect 4s
timeout server 30m
timeout check 5s
frontend psql-in
mode tcp
bind psql-c01:5432
option tcplog
default_backend psql-back
backend psql-back
mode tcp
option tcplog
option httpchk
http-check expect status 200
default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
server psql01.nix.mds.xyz psql01.nix.mds.xyz:5432 maxconn 1024 check port 8008
server psql02.nix.mds.xyz psql02.nix.mds.xyz:5432 maxconn 1024 check port 8008
server psql03.nix.mds.xyz psql03.nix.mds.xyz:5432 maxconn 1024 check port 8008
listen stats
bind :9000
mode http
stats enable
stats hide-version
stats realm Haproxy\ Statistics
stats uri /haproxy-stats
stats auth admin:sup3rs3cret
psql03
global
log 127.0.0.1 local0
stats socket /var/run/haproxy.sock mode 0600 level admin
user haproxy
group haproxy
daemon
debug
maxconn 1024
defaults
mode tcp
log global
option dontlognull
option redispatch
retries 2
timeout client 30m
timeout connect 4s
timeout server 30m
timeout check 5s
frontend psql-in
mode tcp
bind psql-c01:5432
option tcplog
default_backend psql-back
backend psql-back
mode tcp
option tcplog
option httpchk
http-check expect status 200
default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
server psql01.nix.mds.xyz psql01.nix.mds.xyz:5432 maxconn 1024 check port 8008
server psql02.nix.mds.xyz psql02.nix.mds.xyz:5432 maxconn 1024 check port 8008
server psql03.nix.mds.xyz psql03.nix.mds.xyz:5432 maxconn 1024 check port 8008
listen stats
bind :9000
mode http
stats enable
stats hide-version
stats realm Haproxy\ Statistics
stats uri /haproxy-stats
stats auth admin:sup3rs3cret
KEEPALIVED
Install keepalived using:
yum install keepalived -y
Throw down the keepalived configuration as follows:
psql01
# cat /etc/keepalived/keepalived.conf
vrrp_script chk_haproxy {
script "killall -0 haproxy" # check the haproxy process
interval 2 # every 2 seconds
weight 2 # add 2 points if OK
}
vrrp_instance VI_1 {
interface eth0 # interface to monitor
state MASTER # MASTER on haproxy1, BACKUP on haproxy2
virtual_router_id 112 # Set to last digit of IP
priority 101 # 101 on haproxy1, 100 on haproxy2 etc
virtual_ipaddress {
192.168.0.112 # virtual ip address
}
track_script {
chk_haproxy
}
}
psql02
vrrp_script chk_haproxy {
script "killall -0 haproxy" # check the haproxy process
interval 2 # every 2 seconds
weight 2 # add 2 points if OK
}
vrrp_instance VI_1 {
interface eth0 # interface to monitor
state BACKUP # MASTER on haproxy1, BACKUP on haproxy2
virtual_router_id 112 # Set to last digit of cluster IP.
priority 102 # 101 on haproxy1, 100 on haproxy2 etc
virtual_ipaddress {
192.168.0.112 # virtual ip address
}
track_script {
chk_haproxy
}
}
psql03
vrrp_script chk_haproxy {
script "killall -0 haproxy" # check the haproxy process
interval 2 # every 2 seconds
weight 2 # add 2 points if OK
}
vrrp_instance VI_1 {
interface eth0 # interface to monitor
state BACKUP # MASTER on haproxy1, BACKUP on haproxy2
virtual_router_id 112 # Set to last digit of cluster IP.
priority 103 # 101 on haproxy1, 100 on haproxy2 etc
virtual_ipaddress {
192.168.0.112 # virtual ip address
}
track_script {
chk_haproxy
}
}
Setup a start / stop service for Patroni (All Hosts):
# cat /etc/systemd/system/patroni.service
# This is an example systemd config file for Patroni
# You can copy it to "/etc/systemd/system/patroni.service",
[Unit]
Description=Runners to orchestrate a high-availability PostgreSQL
After=syslog.target network.target
[Service]
Type=simple
User=postgres
Group=postgres
# Read in configuration file if it exists, otherwise proceed
EnvironmentFile=-/etc/patroni_env.conf
WorkingDirectory=~
# Where to send early-startup messages from the server
# This is normally controlled by the global default set by systemd
#StandardOutput=syslog
# Pre-commands to start watchdog device
# Uncomment if watchdog is part of your patroni setup
#ExecStartPre=-/usr/bin/sudo /sbin/modprobe softdog
#ExecStartPre=-/usr/bin/sudo /bin/chown postgres /dev/watchdog
# Start the patroni process
ExecStart=/usr/bin/patroni /etc/patroni.yml
# Send HUP to reload from patroni.yml
ExecReload=/bin/kill -s HUP $MAINPID
# only kill the patroni process, not it's children, so it will gracefully stop postgres
KillMode=process
# Give a reasonable amount of time for the server to start up/shut down
TimeoutSec=30
# Do not restart the service if it crashes, we want to manually inspect database on failure
Restart=no
[Install]
WantedBy=multi-user.target
#
If you run into Auditd / Permission issues, use the following page for reference to resolve them. It may take a few attempts:
METHOD 1:
grep AVC /var/log/audit/audit.log | audit2allow -M systemd-allow
semodule -i systemd-allow.pp
METHOD 2:
audit2allow -a
audit2allow -a -M ganesha_<NUM>_port
semodule -i ganesha_<NUM>_port.pp
ONE LINER:
grep AVC /var/log/audit/audit.log* | grep -Ei denied >/var/log/audit/audit.previous; cat /var/log/audit/audit.previous | audit2allow -M systemd-allow; semodule -i systemd-allow.pp
USEFULL THINGS:
ausearch –interpret
aureport
BEST SELINUX SOLUTION FOR HAPROXY
semanage permissive -a haproxy_t
Last but not least:
systemctl enable etcd haproxy keepalived patroni
Congratulations! You've installed your three node Patroni HA Cluster.
Cheers,
TK
REF:
A few references below on our Journey to HA heaven:
https://medium.com/@tbeach/how-to-install-postgres-posgresql-on-centos-7-or-rhel-7-af32b26a79f0
https://coreos.com/etcd/docs/latest/v2/clustering.html
https://www.linode.com/docs/databases/postgresql/create-a-highly-available-postgresql-cluster-using-patroni-and-haproxy/