postgresql – Why did repmgr not switch automatically?

Installation

- name: Install repmgr repository
  get_url:
    url: https://dl.2ndquadrant.com/default/release/get/10/rpm
    dest: /var/lib/pgsql/repmgr

- name: Install repmgr rpm
  shell: bash /var/lib/pgsql/repmgr

- name: Install repmgr
  yum:
    name: repmgr10
    state: present

node1 – primary

$ cat /etc/repmgr/10/repmgr.conf
node_id=1
node_name=node1
conninfo='host=node1 user=repmgr dbname=repmgr connect_timeout=2'
data_directory='/var/lib/pgsql/10/data'
failover=automatic
promote_command='repmgr standby promote -f /etc/repmgr.conf --log-to-file'
follow_command='repmgr standby follow -f /etc/repmgr.conf --log-to-file'
log_file='/var/log/repmgr/repmgr.log'
log_level=NOTICE
reconnect_attempts=4
reconnect_interval=5
monitoring_history=yes

node2 – standby

$ cat /etc/repmgr/10/repmgr.conf
node_id=2
node_name=node2
conninfo='host=node2 user=repmgr dbname=repmgr connect_timeout=2'
data_directory='/var/lib/pgsql/10/data'
failover=automatic
promote_command='repmgr standby promote -f /etc/repmgr.conf --log-to-file'
follow_command='repmgr standby follow -f /etc/repmgr.conf --log-to-file'
log_file='/var/log/repmgr/repmgr.log'
log_level=NOTICE
reconnect_attempts=4
reconnect_interval=5
monitoring_history=yes

Start their services

$ sudo systemctl start repmgr10

Status

$ repmgr cluster show
 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+-------------------------------------------------------------------
 1  | node1 | primary | * running |          | default  | 100      | 3        | host=node1 user=repmgr dbname=repmgr connect_timeout=2
 2  | node2 | standby |   running | node1    | default  | 100      | 3        | host=node2 user=repmgr dbname=repmgr connect_timeout=2

Test

Stop the PostgreSQL service on node1

$ sudo systemctl stop postgresql-10

node1 then shows as disconnected, but node2 is not promoted to primary automatically.

After restarting the PostgreSQL service on node1, the repmgr cluster status returns to its original state.
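
One thing worth ruling out, given the configuration above: promote_command and follow_command pass -f /etc/repmgr.conf, while the file shown lives at /etc/repmgr/10/repmgr.conf. The sketch below pulls the -f argument out of such a command string so the path can be checked. The helper name config_path_from_command is hypothetical, and whether this mismatch is the actual cause here is an assumption, not something the post confirms.

```shell
# Print the argument that follows "-f" in a repmgr command string.
# config_path_from_command is a hypothetical helper, not part of repmgr.
# The body runs in a subshell so "set -f" and the variables stay local.
config_path_from_command() (
  set -f                      # no filename globbing while splitting words
  prev=""
  for word in $1; do          # simple whitespace split; quoted spaces are not handled
    if [ "$prev" = "-f" ]; then
      printf '%s\n' "$word"
      exit 0
    fi
    prev=$word
  done
  exit 1                      # no -f option found
)

# The value below is copied from the repmgr.conf shown above.
promote_command='repmgr standby promote -f /etc/repmgr.conf --log-to-file'
path=$(config_path_from_command "$promote_command")
if [ -r "$path" ]; then
  echo "config file found: $path"
else
  echo "config file missing or unreadable: $path"
fi
```

Running this as the postgres user on the standby shows whether the file that repmgrd would hand to "repmgr standby promote" is actually readable there.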

failover – How to automatically promote a PostgreSQL standby server by repmgr with pgbouncer and barman?

Now, using these servers:

  • Node1: pgbouncer1
  • Node2: pgbouncer2
  • Node3: primary repmgr
  • Node4: standby repmgr
  • Node5: standby repmgr
  • Node6: standby repmgr
  • Node7: primary barman
  • Node8: passive barman

pgbouncer handles connection pooling, repmgr handles replication and automatic failover, and barman handles backup and restore.

I now want to add a shell script to all repmgr servers that automatically switches to a new primary if the current one (Node3) goes down. The new primary may be any one of Node4, Node5 or Node6.

If I put a promotion script on all repmgr servers, with each one writing its own host IP into the config and copying it to the pgbouncer servers (Node1, Node2) and the barman server (Node7), which server's script will be executed?

The repmgr.conf:

# ...

service_promote_command = '/usr/local/bin/promote.sh'
promote_check_timeout = 15

promote_command='/usr/bin/repmgr standby promote -f /etc/repmgr.conf --log-to-file'
follow_command='/usr/bin/repmgr standby follow -f /etc/repmgr.conf --log-to-file --upstream-node-id=%n'

promote.sh as:

#!/usr/bin/env bash
set -e
set -u

PGBOUNCER_DATABASE_INI_NEW="/tmp/pgbouncer.database.ini"
PGBOUNCER_HOSTS="Node1_IP Node2_IP"
DATABASES="db1 db2"

# Pause pgbouncer
for h in ${PGBOUNCER_HOSTS}
do
  for d in ${DATABASES}
  do
    psql -U postgres -h ${h} -p 5432 pgbouncer -tc "pause ${d}"
  done
done

# Promote server
/usr/bin/repmgr standby promote -f /etc/repmgr.conf --log-to-file # Here possible?

# Generate new config file for pgbouncer
echo -e "[databases]\n" > ${PGBOUNCER_DATABASE_INI_NEW}
for d in ${DATABASES}
do
  echo -e "${d}= host=$(hostname -f)\n" >> ${PGBOUNCER_DATABASE_INI_NEW}
done

# Copy new config file, reload and resume pgbouncer
for h in ${PGBOUNCER_HOSTS}
do
  for d in ${DATABASES}
  do
    rsync -a ${PGBOUNCER_DATABASE_INI_NEW} ${h}:/etc/pgbouncer/pgbouncer.database.ini
    psql -U postgres -h ${h} -p 5432 pgbouncer -tc "reload"
    psql -U postgres -h ${h} -p 5432 pgbouncer -tc "resume ${d}"
  done
done

rm ${PGBOUNCER_DATABASE_INI_NEW}

# Copy new config file, reload and resume barman
# TODO

If I deploy it on all repmgr servers, will they all run it at the same time, or will only the one standby selected for promotion run it when the current primary goes down?
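
The "generate new config file" step of promote.sh can be isolated into a small function, which makes it easy to check the output before anything is copied to the pgbouncer hosts. This is only a sketch: render_pgbouncer_databases is a hypothetical name, node4.example.com is a placeholder host, and printf is used instead of echo -e so the result does not depend on the shell's echo behaviour.

```shell
# Print a pgbouncer [databases] section that points every database at one host.
# render_pgbouncer_databases is a hypothetical helper name.
# Usage: render_pgbouncer_databases HOST DB [DB...]
# The body runs in a subshell so the variables stay local.
render_pgbouncer_databases() (
  host=$1
  shift
  printf '[databases]\n'
  for db in "$@"; do
    printf '%s = host=%s\n' "$db" "$host"
  done
)

# Placeholder host name; in promote.sh this would be "$(hostname -f)".
render_pgbouncer_databases node4.example.com db1 db2
```

In promote.sh the call could be redirected into ${PGBOUNCER_DATABASE_INI_NEW} in place of the echo loop.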

ubuntu – install postgresql without creating an instance (for use with repmgr)

I'm trying to set up repmgr, and I'm following the steps at https://repmgr.org/docs/current/quickstart-standby-preparation.html to prepare the standby.

I noticed that it warns: "On the standby, do not create a PostgreSQL instance." However, I believe this happens automatically with the way I installed Postgres:

wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
echo "deb http://apt.postgresql.org/pub/repos/apt/ `lsb_release -cs`-pgdg main" |sudo tee  /etc/apt/sources.list.d/pgdg.list
sudo apt update
sudo apt -y install postgresql-12 postgresql-client-12

I believe so because when I try to clone the primary on the standby, as described in https://repmgr.org/docs/current/quickstart-standby-clone.html:
$ repmgr -h node1 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone --dry-run

I receive

postgres@empty2:~$ repmgr -h 192.168.1.102 -U repmgr -d repmgr -f /etc/repmgr.conf standby clone --dry-run
NOTICE: destination directory "/var/lib/postgresql/12/main" provided
ERROR: specified data directory "/var/lib/postgresql/12/main" appears to contain a running PostgreSQL instance
HINT: ensure the target data directory does not contain a running PostgreSQL instance

Now, I assume this is because a database instance was created when I installed Postgres on the standby. I also assume that I can just delete everything in the standby's data directory and everything will work fine.

But (assuming my assumptions are correct) what's the right way to install PostgreSQL 12 without creating an instance and the corresponding data files?
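
One approach I'm aware of on Debian/Ubuntu: the postgresql-common package reads /etc/postgresql-common/createcluster.conf, and setting create_main_cluster = false there before installing postgresql-12 keeps the package from creating the "main" cluster. A minimal sketch of flipping that setting follows. The helper name disable_main_cluster is hypothetical, and the default path is assumed to be unchanged on the system in question.

```shell
# Set "create_main_cluster = false" in a createcluster.conf-style file.
# disable_main_cluster is a hypothetical helper; the default path is the one
# postgresql-common normally uses on Debian/Ubuntu (an assumption here).
# The body runs in a subshell so the variables stay local.
disable_main_cluster() (
  conf=${1:-/etc/postgresql-common/createcluster.conf}
  [ -f "$conf" ] || exit 1
  tmp=$(mktemp) || exit 1
  # Drop any existing (possibly commented-out) setting, then append ours.
  grep -Ev '^[[:space:]]*#?[[:space:]]*create_main_cluster[[:space:]]*=' "$conf" > "$tmp" || true
  printf 'create_main_cluster = false\n' >> "$tmp"
  # Overwrite in place so ownership and permissions of the original are kept.
  cat "$tmp" > "$conf"
  rm -f "$tmp"
)

# Demonstration on a scratch copy rather than the real file.
demo=$(mktemp)
printf '#create_main_cluster = true\n' > "$demo"
disable_main_cluster "$demo"
cat "$demo"
rm -f "$demo"
```

On a real standby the order would be: install postgresql-common, run the function as root against the real file, then install postgresql-12. If the cluster already exists, "pg_dropcluster --stop 12 main" removes it.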

PostgreSQL, repmgr configuration questions – Database Administrators Stack Exchange

While configuring and building repmgr, the link step fails:
checking for pg_config... /opt/PostgreSQL/10/bin/pg_config
configure: building against PostgreSQL 10.9
configure: creating ./config.status
config.status: creating Makefile
config.status: creating Makefile.global
config.status: creating doc/Makefile
config.status: creating config.h
config.status: config.h is unchanged
Building against PostgreSQL 10
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -g2 -DMAP_HUGETLB=0x40000 repmgr-client.o repmgr-action-primary.o repmgr-action-standby.o repmgr-action-witness.o repmgr-action-bdr.o repmgr-action-cluster.o repmgr-action-node.o repmgr-action-daemon.o configfile.o log.o strutil.o controldata.o dirutil.o compat.o dbutils.o sysutils.o -L/opt/PostgreSQL/10/lib -lpgcommon -lpgport -L/opt/PostgreSQL/10/lib -lpq -L/opt/PostgreSQL/10/lib -L/opt/local/Current/lib -Wl,--as-needed -Wl,-rpath,'/opt/PostgreSQL/10/lib',--enable-new-dtags -lpgcommon -lpgport -lxslt -lxml2 -lpam -lssl -lcrypto -lgssapi_krb5 -lz -ledit -lrt -lcrypt -ldl -lm -o repmgr
/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/libgssapi_krb5.so: undefined reference to `k5_enctype_to_ssf@k5crypto_3_MIT'
/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/libgssapi_krb5.so: undefined reference to `k5_os_mutex_unlock@krb5support_0_MIT'
/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/libgssapi_krb5.so: undefined reference to `k5_os_mutex_init@krb5support_0_MIT'
/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/libgssapi_krb5.so: undefined reference to `k5_os_mutex_lock@krb5support_0_MIT'
/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/libgssapi_krb5.so: undefined reference to `k5_once@krb5support_0_MIT'
//usr/lib64/libldap_r-2.4.so.2: undefined reference to `ber_sockbuf_io_udp'
/usr/lib/gcc/x86_64-redhat-linux/4.8.5/../../../../lib64/libgssapi_krb5.so: undefined reference to `k5_os_mutex_destroy@krb5support_0_MIT'
collect2: error: ld returned 1 exit status
make: *** [repmgr] Error 1

replication – The Postgresql replica is synchronized but repmgr says that the node is not attached

I have set up a replica of a PostgreSQL instance with the help of repmgr. This replica is fully synchronized with the primary and has been for a few days. However, repmgr tells me that the replica is not attached.
Here's what the cluster looks like:

postgres@www:~$ repmgr cluster show
 ID | Name        | Role    | Status    | Upstream    | Location | Connection string
----+-------------+---------+-----------+-------------+----------+-------------------------------------------------------
 1  | orig_master | primary | * running |             | default  | host=MASTER user=USER dbname=repmgr connect_timeout=2
 2  | orig_slave  | standby |   running | orig_master | default  | host=SLAVE user=USER dbname=repmgr connect_timeout=2

Here's what we get on the replica instance:

postgres@db:~$ repmgr node status
Node "orig_slave":
    PostgreSQL version: 9.6.11
    Total data size: 5603 MB
    Conninfo: host=SLAVE user=USER dbname=repmgr connect_timeout=2
    Role: standby
    WAL archiving: disabled (on standbys "archive_mode" must be set to "always" to be effective)
    Archive command: rsync -a %p barman@BARMAN:/mnt/volume/prod/incoming/%f
    WALs pending archiving: 0 pending files
    Replication connections: 0 (of maximal 6)
    Replication slots: disabled
    Upstream node: orig_master (ID: 1)
    Replication lag: 0 seconds
    Last received LSN: 63/BB0E3C10
    Last replayed LSN: 63/BB0E3C10

This clearly shows that there is no replication lag, that the instances are synchronized and that the received and replayed LSNs match.
Here's what I get on the primary:

postgres@www:~$ repmgr node status
Node "orig_master":
    PostgreSQL version: 9.6.11
    Total data size: 5603 MB
    Conninfo: host=MASTER user=USER dbname=repmgr connect_timeout=2
    Role: primary
    WAL archiving: enabled
    Archive command: rsync -a %p barman@BARMAN:/mnt/volume/prod/incoming/%f
    WALs pending archiving: 0 pending files
    Replication connections: 1 (of maximal 6)
    Replication slots: disabled
    Replication lag: n/a

WARNING: following issue(s) were detected:
  - 1 of 1 downstream nodes not attached:
    - orig_slave (ID: 2)

HINT: execute "repmgr node check" for more details

This tells me that the downstream node is not attached, even though it clearly is!
My question is: is there really a problem in the replication process? If so, what is it and how can I solve it? If not, how can I convince repmgr that there is no problem?

P.S.: PostgreSQL 9.6.11, repmgr 4.2
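
A possible diagnostic, offered as a sketch rather than a confirmed explanation: as far as I understand, repmgr treats a standby as attached when the primary's pg_stat_replication has a row whose application_name equals the standby's node name. If the standby's primary_conninfo carries no application_name, or a different one, replication can work while repmgr still reports "not attached". The helper below extracts application_name from a conninfo string for comparison with node_name. The helper name and the sample primary_conninfo value are hypothetical, since the post does not show recovery.conf.

```shell
# Print the application_name value from a libpq key=value conninfo string.
# application_name_from_conninfo is a hypothetical helper; values quoted with
# single quotes or containing spaces are not handled by this simple split.
# The body runs in a subshell so "set -f" and the variables stay local.
application_name_from_conninfo() (
  set -f                      # no filename globbing while splitting words
  for pair in $1; do
    case $pair in
      application_name=*)
        printf '%s\n' "${pair#application_name=}"
        exit 0
        ;;
    esac
  done
  exit 1                      # no application_name present
)

node_name='orig_slave'
# Hypothetical value; on 9.6 the real one is in recovery.conf on the standby.
primary_conninfo='host=MASTER user=USER application_name=orig_slave connect_timeout=2'

if [ "$(application_name_from_conninfo "$primary_conninfo")" = "$node_name" ]; then
  echo "application_name matches node_name"
else
  echo "application_name missing or different from node_name"
fi

# What the primary currently sees can be checked with, for example:
#   psql -h MASTER -U USER -d repmgr -c "SELECT application_name, state FROM pg_stat_replication"
```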