PANIC:  replication checkpoint has wrong magic 0 instead of 307747550

So we ran into a little problem getting our PostgreSQL Patroni w/ ETCD cluster going after a rather serious failure. To troubleshoot, we tried starting PostgreSQL by hand on each node:

# sudo su - postgres

$ /usr/pgsql-10/bin/postgres -D /data/patroni --config-file=/data/patroni/postgresql.conf --listen_addresses=192.168.0.118 --max_worker_processes=8 --max_locks_per_transaction=64 --wal_level=replica --track_commit_timestamp=off --max_prepared_transactions=0 --port=5432 --max_replication_slots=10 --max_connections=100 --hot_standby=on --cluster_name=postgres --wal_log_hints=on --max_wal_senders=10 -d 5

This resulted in one of the three messages below, hence this post.  If I can start a single instance, I should be fine, since I could then 1) replicate over to the other two or 2) simply take a dump, reinitialize all the databases, then restore the dump.

Using the above procedure I get one of three error messages when using the data files of each node:

[ PSQL01 ]
postgres: postgres: startup process waiting for 000000010000000000000008

[ PSQL02 ]
PANIC:  replication checkpoint has wrong magic 0 instead of 307747550

[ PSQL03 ]
FATAL:  syntax error in history file: f2W

 

Unfortunately, we couldn't do anything about PSQL03 and PSQL01, the standbys, since their base/ folders were way out of sync, meaning there were no tables there:

[ PSQL03 ]

[root@psql03 base]# ls -altri
total 40
    42424 drwx------.  2 postgres postgres 8192 Oct 29  2018 1
 67714749 drwx------.  2 postgres postgres 8192 Oct 29  2018 13805
202037206 drwx------.  5 postgres postgres   38 Oct 29  2018 .
134312175 drwx------.  2 postgres postgres 8192 May 22 01:55 13806
    89714 drwxr-xr-x. 20 root     root     4096 May 22 22:43 ..
[root@psql03 base]#

 

[ PSQL02 ]

 [root@psql02 base]# ls -altri

total 412
201426668 drwx------.  2 postgres postgres  8192 Oct 29  2018 1
   743426 drwx------.  2 postgres postgres  8192 Mar 24 03:47 13805
135326327 drwx------.  2 postgres postgres 16384 Mar 24 20:15 40970
   451699 drwx------.  2 postgres postgres 40960 Mar 25 19:47 16395
  1441696 drwx------.  2 postgres postgres  8192 Mar 31 15:09 131137
 68396137 drwx------.  2 postgres postgres  8192 Mar 31 15:09 131138
135671065 drwx------.  2 postgres postgres  8192 Mar 31 15:09 131139
204353100 drwx------.  2 postgres postgres  8192 Mar 31 15:09 131140
135326320 drwx------. 17 postgres postgres  4096 Apr 14 10:08 .
 68574415 drwx------.  2 postgres postgres 12288 Apr 28 06:06 131142
   288896 drwx------.  2 postgres postgres 16384 Apr 28 06:06 131141
203015232 drwx------.  2 postgres postgres  8192 Apr 28 06:06 131136
135326328 drwx------.  2 postgres postgres 40960 May  5 22:09 24586
 67282461 drwx------.  2 postgres postgres  8192 May  5 22:09 13806
 67640961 drwx------.  2 postgres postgres 20480 May  5 22:09 131134
203500274 drwx------.  2 postgres postgres 16384 May  5 22:09 155710
134438257 drwxr-xr-x. 20 root     root      4096 May 22 01:44 ..
[root@psql02 base]# pwd
/root/postgres-patroni-backup/base
[root@psql02 base]#

 

[ PSQL01 ]

[root@psql01 base]# ls -altri
total 148
134704615 drwx------.  2 postgres postgres  8192 Oct 29  2018 1
201547700 drwx------.  2 postgres postgres  8192 Oct 29  2018 13805
   160398 drwx------.  2 postgres postgres  8192 Feb 24 23:53 13806
 67482137 drwx------.  7 postgres postgres    62 Feb 24 23:54 .
135909671 drwx------.  2 postgres postgres 24576 Feb 24 23:54 24586
134444555 drwx------.  2 postgres postgres 24576 Feb 24 23:54 16395
 67178716 drwxr-xr-x. 20 root     root      4096 May 22 01:53 ..
[root@psql01 base]# pwd
/root/postgresql-patroni-etcd/base
[root@psql01 base]#

So we could only work with PSQL02, the original primary node.  Every other node has nothing.
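
The same comparison can be made without eyeballing the listings by counting the per-database directories under base/ on each node. This is only a minimal sketch: the helper name count_db_dirs is made up for illustration, and the path in the example assumes the /data/patroni data directory used elsewhere in this post.

```shell
# Count the per-database directories (one per database OID) under a base/ folder.
# count_db_dirs is a hypothetical helper, not part of PostgreSQL or Patroni.
count_db_dirs() {
    find "$1" -mindepth 1 -maxdepth 1 -type d | wc -l | tr -d ' '
}

# Example, run on each node:
#   count_db_dirs /data/patroni/base
```

A node that reports only 3 (the 1, 13805 and 13806 directories seen in the listings above, which is what a freshly initialized cluster has) holds no user databases.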

Looks like our replorigin_checkpoint is at fault, resulting in a rather nasty replication error. Here is the tail of a system call trace of the startup, followed by a look at the file itself:

open("pg_wal/000000BE000000000000004C", O_RDONLY) = 5
open("pg_wal/000000BE000000000000004C", O_RDONLY) = 5
openat(AT_FDCWD, "base", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 6
openat(AT_FDCWD, "pg_tblspc", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 6
openat(AT_FDCWD, "pg_replslot", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 6
openat(AT_FDCWD, "pg_replslot", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 6
open("pg_logical/replorigin_checkpoint", O_RDONLY) = 6
write(2, "2019-06-02 14:50:34.777 EDT [283"..., 106
2019-06-02 14:50:34.777 EDT [28362] PANIC:  replication checkpoint has wrong magic 0 instead of 307747550
-bash-4.2$ cat pg_logical/replorigin_checkpoint
cat: pg_logical/replorigin_checkpoint: No such file or directory
-bash-4.2$ pwd
/data/patroni/tmp
-bash-4.2$ cd ..
-bash-4.2$ cat pg_logical/replorigin_checkpoint
øÉKíÛ0ðð bø{nð- Ðð à Ð à4Ø4-bash-4.2$ PuTTY
-bash: PuTTY: command not found
-bash-4.2$
-bash-4.2$ strings pg_logical/replorigin_checkpoint
-bash-4.2$ ls -altri pg_logical/replorigin_checkpoint
67894871 -rw-------. 1 postgres postgres 16384 Oct 29  2018 pg_logical/replorigin_checkpoint
-bash-4.2$ ls -altri pg_logical/
total 20
 67894871 -rw-------.  1 postgres postgres 16384 Oct 29  2018 replorigin_checkpoint
136946383 drwx------.  2 postgres postgres     6 Oct 29  2018 snapshots
204367784 drwx------.  2 postgres postgres     6 Oct 29  2018 mappings
 67894870 drwx------.  4 postgres postgres    65 Apr 28 06:06 .
135326272 drwx------. 21 postgres postgres  4096 Jun  2 14:50 ..
-bash-4.2$
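
Since cat and strings are not much help on a binary file, a more direct check is to read the first four bytes as an integer and compare them with the value from the PANIC message (307747550, which is 0x1257DADE in hex). The sketch below assumes the file was written on the same little-endian x86 host it is read on; read_magic is a made-up helper name.

```shell
# Print the first 4 bytes of a file as an unsigned 32-bit integer (host byte order).
# read_magic is a hypothetical helper, not a PostgreSQL tool.
read_magic() {
    od -An -tu4 -N4 "$1" | tr -d ' \n'
}

# Example:
#   read_magic /data/patroni/pg_logical/replorigin_checkpoint
```

A file that PostgreSQL will accept should print 307747550; the damaged one here would be expected to print 0, matching the "wrong magic 0" in the error.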


So let's copy a good one from another host (I guess we could delete it but I haven't tried):


[root@psql03 pg_logical]#
[root@psql03 pg_logical]# ls -altri
total 8
 68994432 drwx------.  2 postgres postgres    6 Oct 29  2018 snapshots
134984156 drwx------.  2 postgres postgres    6 Oct 29  2018 mappings
   566745 -rw-------.  1 postgres postgres    8 May 22 01:55 replorigin_checkpoint
   566731 drwx------.  4 postgres postgres   65 May 22 01:55 .
    89714 drwxr-xr-x. 20 root     root     4096 May 22 22:43 ..
[root@psql03 pg_logical]#
[root@psql03 pg_logical]#
[root@psql03 pg_logical]#
[root@psql03 pg_logical]# scp replorigin_checkpoint psql02:/data/patroni/pg_logical/
Password:
replorigin_checkpoint                                                                                 100%    8    10.1KB/s   00:00
[root@psql03 pg_logical]#
[root@psql03 pg_logical]#
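
The scp above overwrites the damaged file in place. A slightly more cautious variant keeps a copy of the old file outside the data directory first, in case it is needed later. This is a sketch only: swap_checkpoint and its arguments are made up for illustration, and it assumes PostgreSQL is stopped and that it is run as the postgres user so that ownership stays correct.

```shell
# Keep a timestamped copy of the existing file, then put the replacement in place.
# swap_checkpoint is a hypothetical helper.
# Arguments: <replacement file> <data directory> <backup directory>
swap_checkpoint() {
    new=$1
    target=$2/pg_logical/replorigin_checkpoint
    if [ -e "$target" ]; then
        # Save the old file outside the data directory before touching it.
        cp -p "$target" "$3/replorigin_checkpoint.bad.$(date +%Y%m%d%H%M%S)" || return 1
    fi
    # Install the replacement with the same mode as the original (-rw-------).
    cp "$new" "$target" && chmod 600 "$target"
}

# Example, after copying the good file from psql03 to /tmp:
#   swap_checkpoint /tmp/replorigin_checkpoint /data/patroni /var/lib/pgsql
```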


Now we can get to the backend in standalone mode:


-bash-4.2$
-bash-4.2$ /usr/pgsql-10/bin/postgres --single -D /data/patroni --config-file=/data/patroni/postgresql.conf --hot_standby=off --listen_addresses=192.168.0.124 --max_worker_processes=8 --max_locks_per_transaction=64 --wal_level=replica --cluster_name=postgres --wal_log_hints=on --max_wal_senders=10 --track_commit_timestamp=off --max_prepared_transactions=0 --port=5432 --max_replication_slots=10 --max_connections=20 -d 5 2>&1
2019-06-02 15:00:48.981 EDT [29057] DEBUG:  invoking IpcMemoryCreate(size=144687104)
2019-06-02 15:00:48.982 EDT [29057] DEBUG:  mmap(144703488) with MAP_HUGETLB failed, huge pages disabled: Cannot allocate memory
2019-06-02 15:00:48.993 EDT [29057] DEBUG:  SlruScanDirectory invoking callback on pg_notify/0000
2019-06-02 15:00:48.993 EDT [29057] DEBUG:  removing file "pg_notify/0000"
2019-06-02 15:00:48.993 EDT [29057] DEBUG:  dynamic shared memory system will support 128 segments
2019-06-02 15:00:48.994 EDT [29057] DEBUG:  created dynamic shared memory control segment 1025202362 (3088 bytes)
2019-06-02 15:00:48.994 EDT [29057] DEBUG:  InitPostgres
2019-06-02 15:00:48.994 EDT [29057] DEBUG:  my backend ID is 1
2019-06-02 15:00:48.994 EDT [29057] LOG:  database system was interrupted; last known up at 2019-04-28 06:06:24 EDT
2019-06-02 15:00:49.265 EDT [29057] LOG:  invalid record length at 0/4C35CDF8: wanted 24, got 0
2019-06-02 15:00:49.266 EDT [29057] LOG:  invalid primary checkpoint record
2019-06-02 15:00:49.266 EDT [29057] LOG:  using previous checkpoint record at 0/4C34EDA8
2019-06-02 15:00:49.266 EDT [29057] DEBUG:  redo record is at 0/4C34ED70; shutdown FALSE
2019-06-02 15:00:49.266 EDT [29057] DEBUG:  next transaction ID: 0:1409831; next OID: 237578
2019-06-02 15:00:49.266 EDT [29057] DEBUG:  next MultiXactId: 48; next MultiXactOffset: 174
2019-06-02 15:00:49.266 EDT [29057] DEBUG:  oldest unfrozen transaction ID: 549, in database 1
2019-06-02 15:00:49.266 EDT [29057] DEBUG:  oldest MultiXactId: 1, in database 1
2019-06-02 15:00:49.266 EDT [29057] DEBUG:  commit timestamp Xid oldest/newest: 0/0
2019-06-02 15:00:49.266 EDT [29057] DEBUG:  transaction ID wrap limit is 2147484196, limited by database with OID 1
2019-06-02 15:00:49.266 EDT [29057] DEBUG:  MultiXactId wrap limit is 2147483648, limited by database with OID 1
2019-06-02 15:00:49.266 EDT [29057] DEBUG:  starting up replication slots
2019-06-02 15:00:49.266 EDT [29057] DEBUG:  starting up replication origin progress state
2019-06-02 15:00:49.266 EDT [29057] LOG:  database system was not properly shut down; automatic recovery in progress
2019-06-02 15:00:49.267 EDT [29057] DEBUG:  resetting unlogged relations: cleanup 1 init 0
2019-06-02 15:00:49.269 EDT [29057] LOG:  redo starts at 0/4C34ED70
2019-06-02 15:00:49.273 EDT [29057] DEBUG:  attempting to remove WAL segments newer than log file 000000BE000000000000004C
2019-06-02 15:00:49.273 EDT [29057] LOG:  invalid record length at 0/4C35CDC0: wanted 24, got 0
2019-06-02 15:00:49.273 EDT [29057] LOG:  redo done at 0/4C35CD90
2019-06-02 15:00:49.273 EDT [29057] LOG:  last completed transaction was at log time 2019-04-28 06:05:44.017446-04
2019-06-02 15:00:49.273 EDT [29057] DEBUG:  resetting unlogged relations: cleanup 0 init 1
2019-06-02 15:00:49.280 EDT [29057] DEBUG:  performing replication slot checkpoint
2019-06-02 15:00:49.288 EDT [29057] DEBUG:  attempting to remove WAL segments older than log file 000000000000000000000043
2019-06-02 15:00:49.289 EDT [29057] DEBUG:  MultiXactId wrap limit is 2147483648, limited by database with OID 1
2019-06-02 15:00:49.290 EDT [29057] DEBUG:  oldest MultiXactId member is at offset 1
2019-06-02 15:00:49.290 EDT [29057] DEBUG:  MultiXact member stop limit is now 4294914944 based on MultiXact 1
2019-06-02 15:00:49.292 EDT [29057] DEBUG:  StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0
2019-06-02 15:00:49.302 EDT [29057] DEBUG:  CommitTransaction(1) name: unnamed; blockState: STARTED; state: INPROGR, xid/subid/cid: 0/1/0

PostgreSQL stand-alone backend 10.5
backend>


But we chose not to use the stand-alone backend at this time.  Instead, we'll start the database as Patroni would, using the following command:


-bash-4.2$ /usr/pgsql-10/bin/postgres -D /data/patroni --config-file=/data/patroni/postgresql.conf --hot_standby=off --listen_addresses=192.168.0.124 --max_worker_processes=8 --max_locks_per_transaction=64 --wal_level=replica --cluster_name=postgres --wal_log_hints=on --max_wal_senders=10 --track_commit_timestamp=off --max_prepared_transactions=0 --port=5432 --max_replication_slots=10 --max_connections=20 -d 5 2>&1
2019-06-02 15:11:55.379 EDT [29789] DEBUG:  postgres: PostmasterMain: initial environment dump:
2019-06-02 15:11:55.380 EDT [29789] DEBUG:  -----------------------------------------
2019-06-02 15:11:55.380 EDT [29789] DEBUG:      XDG_SESSION_ID=171
2019-06-02 15:11:55.380 EDT [29789] DEBUG:      HOSTNAME=psql02.nix.mds.xyz
2019-06-02 15:11:55.380 EDT [29789] DEBUG:      SHELL=/bin/bash
2019-06-02 15:11:55.380 EDT [29789] DEBUG:      TERM=xterm
2019-06-02 15:11:55.380 EDT [29789] DEBUG:      HISTSIZE=1000
2019-06-02 15:11:55.380 EDT [29789] DEBUG:      USER=postgres
2019-06-02 15:11:55.380 EDT [29789] DEBUG:      LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=01;05;37;41:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=01;36:*.au=01;36:*.flac=01;36:*.mid=01;36:*.midi=01;36:*.mka=01;36:*.mp3=01;36:*.mpc=01;36:*.ogg=01;36:*.ra=01;36:*.wav=01;36:*.axa=01;36:*.oga=01;36:*.spx=01;36:*.xspf=01;36:
2019-06-02 15:11:55.380 EDT [29789] DEBUG:      MAIL=/var/spool/mail/postgres
2019-06-02 15:11:55.380 EDT [29789] DEBUG:      PATH=/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/pgsql-10/bin/
2019-06-02 15:11:55.380 EDT [29789] DEBUG:      PWD=/data/patroni/pg_logical
2019-06-02 15:11:55.380 EDT [29789] DEBUG:      LANG=en_US.UTF-8
2019-06-02 15:11:55.380 EDT [29789] DEBUG:      HISTCONTROL=ignoredups
2019-06-02 15:11:55.380 EDT [29789] DEBUG:      SHLVL=1
2019-06-02 15:11:55.380 EDT [29789] DEBUG:      HOME=/var/lib/pgsql
2019-06-02 15:11:55.380 EDT [29789] DEBUG:      LOGNAME=postgres
2019-06-02 15:11:55.380 EDT [29789] DEBUG:      PGDATA=/data/patroni
2019-06-02 15:11:55.380 EDT [29789] DEBUG:      LESSOPEN=||/usr/bin/lesspipe.sh %s
2019-06-02 15:11:55.380 EDT [29789] DEBUG:      _=/usr/pgsql-10/bin/postgres
2019-06-02 15:11:55.380 EDT [29789] DEBUG:      OLDPWD=/data/patroni
2019-06-02 15:11:55.380 EDT [29789] DEBUG:      PGLOCALEDIR=/usr/pgsql-10/share/locale
2019-06-02 15:11:55.380 EDT [29789] DEBUG:      PGSYSCONFDIR=/etc/sysconfig/pgsql
2019-06-02 15:11:55.380 EDT [29789] DEBUG:      LC_COLLATE=en_US.UTF-8
2019-06-02 15:11:55.380 EDT [29789] DEBUG:      LC_CTYPE=en_US.UTF-8
2019-06-02 15:11:55.380 EDT [29789] DEBUG:      LC_MESSAGES=en_US.UTF-8
2019-06-02 15:11:55.380 EDT [29789] DEBUG:      LC_MONETARY=C
2019-06-02 15:11:55.380 EDT [29789] DEBUG:      LC_NUMERIC=C
2019-06-02 15:11:55.380 EDT [29789] DEBUG:      LC_TIME=C
2019-06-02 15:11:55.380 EDT [29789] DEBUG:  -----------------------------------------
2019-06-02 15:11:55.383 EDT [29789] LOG:  listening on IPv4 address "192.168.0.124", port 5432
2019-06-02 15:11:55.385 EDT [29789] LOG:  listening on Unix socket "./.s.PGSQL.5432"
2019-06-02 15:11:55.386 EDT [29789] DEBUG:  invoking IpcMemoryCreate(size=144687104)
2019-06-02 15:11:55.387 EDT [29789] DEBUG:  mmap(144703488) with MAP_HUGETLB failed, huge pages disabled: Cannot allocate memory
2019-06-02 15:11:55.398 EDT [29789] DEBUG:  SlruScanDirectory invoking callback on pg_notify/0000
2019-06-02 15:11:55.398 EDT [29789] DEBUG:  removing file "pg_notify/0000"
2019-06-02 15:11:55.398 EDT [29789] DEBUG:  dynamic shared memory system will support 128 segments
2019-06-02 15:11:55.398 EDT [29789] DEBUG:  created dynamic shared memory control segment 721092148 (3088 bytes)
2019-06-02 15:11:55.401 EDT [29789] DEBUG:  max_safe_fds = 985, usable_fds = 1000, already_open = 5
2019-06-02 15:11:55.404 EDT [29789] LOG:  redirecting log output to logging collector process
2019-06-02 15:11:55.404 EDT [29789] HINT:  Future log output will appear in directory "log".

And voila!  We are in our database and can see all of our databases:

-bash-4.2$ psql -h psql02 -p 5432 -W
Password:
psql (10.5)
Type "help" for help.

postgres=# \l
                                          List of databases
      Name       |    Owner     | Encoding |   Collate   |    Ctype    |      Access privileges
-----------------+--------------+----------+-------------+-------------+-----------------------------
 amon_mws01      | amon_mws01   | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 awx             | awx          | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 confluence      | postgres     | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =Tc/postgres               +
                 |              |          |             |             | postgres=CTc/postgres      +
                 |              |          |             |             | confluenceuser=CTc/postgres
 hue_mws01       | hue_mws01    | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 metastore_mws01 | hive_mws01   | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 nav_mws01       | nav_mws01    | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 navms_mws01     | navms_mws01  | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 oozie_mws01     | oozie_mws01  | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 postgres        | postgres     | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 rman_mws01      | rman_mws01   | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 scm_mws01       | scm_mws01    | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 sentry_mws01    | sentry_mws01 | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 template0       | postgres     | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres                +
                 |              |          |             |             | postgres=CTc/postgres
 template1       | postgres     | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres                +
                 |              |          |             |             | postgres=CTc/postgres
 twr             | postgres     | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =Tc/postgres               +
                 |              |          |             |             | postgres=CTc/postgres      +
                 |              |          |             |             | twr=CTc/postgres
(15 rows)

postgres=#

Checking afterwards confirms that PostgreSQL / Patroni replicated everything to all three nodes just fine:

[root@psql01 base]# ls -altri
total 364
 67518653 drwx------.  2 postgres postgres  8192 Jun  2 20:35 1
134766555 drwx------.  2 postgres postgres  8192 Jun  2 20:35 13805
   152733 drwx------.  2 postgres postgres 24576 Jun  2 20:35 16395
 68698149 drwx------.  2 postgres postgres 24576 Jun  2 20:35 24586
134741283 drwx------.  2 postgres postgres 16384 Jun  2 20:35 40970
202922441 drwx------.  2 postgres postgres 16384 Jun  2 20:35 131134
   871098 drwx------.  2 postgres postgres  8192 Jun  2 20:35 131136
 68026687 drwx------.  2 postgres postgres  8192 Jun  2 20:35 131137
135079123 drwx------.  2 postgres postgres  8192 Jun  2 20:35 131138
202874795 drwx------.  2 postgres postgres  8192 Jun  2 20:35 131139
   871469 drwx------.  2 postgres postgres  8192 Jun  2 20:35 131140
 68280133 drwx------.  2 postgres postgres 16384 Jun  2 20:35 131141
135080185 drwx------.  2 postgres postgres 12288 Jun  2 20:35 131142
   152732 drwx------. 17 postgres postgres  4096 Jun  2 20:35 .
202879025 drwx------.  2 postgres postgres 16384 Jun  2 20:35 155710
 67482133 drwx------. 21 postgres postgres  4096 Jun  2 20:36 ..
201711623 drwx------.  2 postgres postgres  8192 Jun  2 20:36 13806
[root@psql01 base]#
[root@psql01 base]#
[root@psql01 base]#
[root@psql01 base]# pwd
/data/patroni/base
[root@psql01 base]#

 

[root@psql02 base]# ls -altri
total 368
204367267 drwx------.  2 postgres postgres  8192 Mar 24 03:47 13805
 68669097 drwx------. 17 postgres postgres  4096 Apr 14 10:08 .
204362619 drwx------.  2 postgres postgres 16384 Jun  2 20:31 40970
134473951 drwx------.  2 postgres postgres 24576 Jun  2 20:31 24586
 68669102 drwx------.  2 postgres postgres 24576 Jun  2 20:31 16395
138812710 drwx------.  2 postgres postgres  8192 Jun  2 20:31 1
204366769 drwx------.  2 postgres postgres 12288 Jun  2 20:31 131142
136945631 drwx------.  2 postgres postgres 16384 Jun  2 20:31 131141
 67894451 drwx------.  2 postgres postgres  8192 Jun  2 20:31 131140
  1403920 drwx------.  2 postgres postgres  8192 Jun  2 20:31 131139
204366412 drwx------.  2 postgres postgres  8192 Jun  2 20:31 131138
136945273 drwx------.  2 postgres postgres  8192 Jun  2 20:31 131137
 67894080 drwx------.  2 postgres postgres  8192 Jun  2 20:31 131136
  1403182 drwx------.  2 postgres postgres 16384 Jun  2 20:31 131134
  1404278 drwx------.  2 postgres postgres 16384 Jun  2 20:31 155710
 11395780 drwx------.  2 postgres postgres  8192 Jun  2 20:31 13806
135326272 drwx------. 21 postgres postgres  4096 Jun  2 20:31 ..
[root@psql02 base]# pwd
/data/patroni/base
[root@psql02 base]#

 

[root@psql03 audit]# cd /data/patroni/base/
[root@psql03 base]# ls -altri
total 372
 67130854 drwx------.  2 postgres postgres  8192 Jun  2 20:37 1
134446297 drwx------.  2 postgres postgres  8192 Jun  2 20:37 13805
    79298 drwx------.  2 postgres postgres 24576 Jun  2 20:37 16395
 69209007 drwx------.  2 postgres postgres 24576 Jun  2 20:37 24586
135152677 drwx------.  2 postgres postgres 16384 Jun  2 20:37 40970
201954381 drwx------.  2 postgres postgres 16384 Jun  2 20:37 131134
    80500 drwx------.  2 postgres postgres  8192 Jun  2 20:37 131136
 68241705 drwx------.  2 postgres postgres  8192 Jun  2 20:37 131137
134443358 drwx------.  2 postgres postgres  8192 Jun  2 20:37 131138
201808206 drwx------.  2 postgres postgres  8192 Jun  2 20:37 131139
    80871 drwx------.  2 postgres postgres  8192 Jun  2 20:37 131140
 68242063 drwx------.  2 postgres postgres 16384 Jun  2 20:37 131141
134443716 drwx------.  2 postgres postgres 12288 Jun  2 20:37 131142
    79297 drwx------. 17 postgres postgres  4096 Jun  2 20:37 .
201828372 drwx------.  2 postgres postgres 16384 Jun  2 20:37 155710
134812989 drwx------. 21 postgres postgres  4096 Jun  2 20:38 ..
201807458 drwx------.  2 postgres postgres  8192 Jun  2 20:38 13806
[root@psql03 base]# pwd
/data/patroni/base
[root@psql03 base]#

And that should get you up and running again.  Don't forget to get some decent PostgreSQL backups.
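
On the backup point, even a plain logical dump of the whole cluster would have given us a second way out here. A minimal sketch using pg_dumpall follows; the helper name backup_cluster and the output directory in the example are assumptions, not something we have in place, and the port and user simply mirror the ones used above.

```shell
# Dump all databases plus roles to a timestamped SQL file and print its path.
# backup_cluster is a hypothetical helper. Arguments: <host> <output directory>
backup_cluster() {
    out="$2/pg_dumpall-$1-$(date +%Y%m%d-%H%M%S).sql"
    pg_dumpall -h "$1" -p 5432 -U postgres -f "$out" && echo "$out"
}

# Example (the output directory is only an illustration):
#   backup_cluster psql02 /var/backups/postgres
```

A dump like this is restored with psql against a freshly initialized cluster, which is the "take a dump, reinitialize, restore" route mentioned at the top.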

ALTERNATE

We have not tried this but the above could potentially be resolved by pg_resetwal as well:

[root@psql03 ~]# find / -iname pg_resetwal
/usr/pgsql-10/bin/pg_resetwal
[root@psql03 ~]#
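
If you do want to look at this route, pg_resetwal has a -n option that only prints what it would change without modifying anything, which seems like the safe place to start. The sketch below is untested against our cluster, and we have not confirmed that pg_resetwal would address the replorigin_checkpoint problem at all. preview_resetwal and the PG_RESETWAL variable are made up for illustration. Keep in mind that a real (non -n) run can discard WAL and lose data, so take a copy of the data directory first.

```shell
# Show what pg_resetwal would do, without changing anything (-n = no update).
# preview_resetwal is a hypothetical helper. Argument: <data directory>
# PG_RESETWAL may be set to override the binary path found above.
preview_resetwal() {
    if [ -e "$1/postmaster.pid" ]; then
        echo "postmaster.pid exists in $1; stop PostgreSQL first" >&2
        return 1
    fi
    "${PG_RESETWAL:-/usr/pgsql-10/bin/pg_resetwal}" -n "$1"
}

# Example:
#   preview_resetwal /data/patroni
```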

 

Thx,
TK


     
  Copyright © 2003 - 2013 Tom Kacperski (microdevsys.com). All rights reserved.

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 Unported License