Описание


PGVIP - служба аварийного однократного переключения (auto failover) с основной(master) на резервную(standby) базу данных PostgreSQL с использованием виртуального(плавающего) IP-адреса

Служба устанавливается на один сервер, являющийся единой точной отказа (nginx proxy, сервер приложений 1С), или на два любых сервера (далее по тексту web-сервера). На серверах баз данных устанавливать что-либо не требуется, web-сервер подключается к ним по ssh и выполняет команды из конфигурационного файла. Первичную настройку резервной базы данных и обратное переключение администратор выполняет вручную.

Схемы аварийного переключения:
для одного web-сервера
Failover schema for one web server


для нескольких web-серверов
Failover schema for multiple web servers


Технически служба PGVIP реализована как обертка(wrapper), которая выполняет команды операционной системы из конфигурационного файла. Изначально была реализована как bash-скрипт, потом переписана на Си.

Условия, положенные в основу разработки:
более подробно в разделе Ключевые особенности

Основные действия, выполняемые службой pgvip: более подробно в разделе Действия


Пример переключения

Состояние (pgvip status)
[root@web-server ~]# pgvip status
status: SUCCESS (duration: 0 00:00:32)

+---------+--------------+-------------------------+---------------+---------------+
| Role    | IP address   | Database state          | 192.168.56.10 | VIP auto down |
+---------+--------------+-------------------------+---------------+---------------+
| master  | 192.168.56.1 | read-write              | up on enp0s3  | executing     |
| standby | 192.168.56.2 | in recovery, lag 0 min. | down          | executing     |
+---------+--------------+-------------------------+---------------+---------------+


[root@web-server ~]# pgvip status
status: ERROR (duration: 0 00:00:14, cause: master db is not read-write)

+---------+--------------+-------------------------+---------------+---------------+
| Role    | IP address   | Database state          | 192.168.56.10 | VIP auto down |
+---------+--------------+-------------------------+---------------+---------------+
| master  | 192.168.56.1 | not available           | not available | not available |
| standby | 192.168.56.2 | in recovery, lag 0 min. | down          | executing     |
+---------+--------------+-------------------------+---------------+---------------+


[root@web-server ~]# pgvip status
status: STANDBY_PROMOTED (duration: 0 00:00:06)

+---------+--------------+----------------+---------------+---------------+
| Role    | IP address   | Database state | 192.168.56.10 | VIP auto down |
+---------+--------------+----------------+---------------+---------------+
| master  | 192.168.56.1 | not available  | not available | not available |
| standby | 192.168.56.2 | read-write     | up on enp0s3  | executing     |
+---------+--------------+----------------+---------------+---------------+


Системный журнал (journalctl -fu pgvip or tail -f /var/log/pgsuite/pgvip.log)
[root@web-server ~]# journalctl -fu pgvip
Feb 08 09:49:17 web-server systemd[1]: Started pgvip (PostgreSQL master-standby auto failover).
Feb 08 09:49:17 web-server systemd[1]: Starting pgvip (PostgreSQL master-standby auto failover)...
Feb 08 09:49:17 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      thread started, thread_id: 2718
Feb 08 09:49:17 web-server pgvip[2713]: INFO  ADMINISTRATION       thread started, thread_id: 2719
Feb 08 09:49:17 web-server pgvip[2713]: INFO  ADMINISTRATION       binded to unix socket "/tmp/.s.PGVIP.0"
Feb 08 09:49:17 web-server pgvip[2713]: INFO  ADMINISTRATION       listening
Feb 08 09:49:17 web-server pgvip[2713]: INFO  CHECKER_MASTER_DB    thread started, thread_id: 2714
Feb 08 09:49:17 web-server pgvip[2713]: INFO  CHECKER_MASTER_VIP   thread started, thread_id: 2715
Feb 08 09:49:17 web-server pgvip[2713]: INFO  CHECKER_STANDBY_DB   thread started, thread_id: 2716
Feb 08 09:49:17 web-server pgvip[2713]: INFO  CHECKER_STANDBY_VIP  thread started, thread_id: 2717
...
Feb 08 09:50:15 web-server pgvip[2713]: WARN  CHECKER_MASTER_DB    PGSUITE-905 OS command executed with error (errno: 124)
Feb 08 09:50:15 web-server pgvip[2713]: WARN  CHECKER_MASTER_DB    command:
Feb 08 09:50:15 web-server pgvip[2713]: WARN  CHECKER_MASTER_DB    timeout 5 ssh 192.168.56.1 "su - postgres -c "psql -Aqtc 'select pg_is_in_recovery() or pg_current_xact_id() is null, extract(epoch from now()-pg_last_xact_replay_timestamp())'""
Feb 08 09:50:15 web-server pgvip[2713]: WARN  CHECKER_MASTER_DB    output:
Feb 08 09:50:15 web-server pgvip[2713]: WARN  CHECKER_MASTER_DB
Feb 08 09:50:15 web-server pgvip[2713]: WARN  CHECKER_MASTER_VIP   PGSUITE-905 OS command executed with error (errno: 124)
Feb 08 09:50:15 web-server pgvip[2713]: WARN  CHECKER_MASTER_VIP   command:
Feb 08 09:50:15 web-server pgvip[2713]: WARN  CHECKER_MASTER_VIP   timeout 5 ssh 192.168.56.1 "flock /tmp/pgvip.time -c 'date +%s > /tmp/pgvip.time' ; flock -n /tmp/pgvip.vip_auto_down -c '' ; echo -n $?, ; ip address show | awk '/ inet 192.168.56.10\// {print $NF}'"
Feb 08 09:50:15 web-server pgvip[2713]: WARN  CHECKER_MASTER_VIP   output:
Feb 08 09:50:15 web-server pgvip[2713]: WARN  CHECKER_MASTER_VIP
...
Feb 08 09:50:56 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      executing action "action_standby_db_promote"
Feb 08 09:50:56 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      condition: (time_now()-status.value_time)>g_time_standby_promote_delay_int && status.master_db_state!=G_DB_STATE_READ_WRITE && status.standby_db_state==G_DB_STATE_IN_RECOVERY && status.standby_db_lag<=g_time_standby_allowable_lag_int
Feb 08 09:50:56 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      status: ERROR (duration: 0 00:00:41, cause: master db is not read-write)
Feb 08 09:50:56 web-server pgvip[2713]: INFO  ACTION_EXECUTOR
Feb 08 09:50:56 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      +---------+--------------+-------------------------+---------------+---------------+
Feb 08 09:50:56 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      | Role    | IP address   | Database state          | 192.168.56.10 | VIP auto down |
Feb 08 09:50:56 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      +---------+--------------+-------------------------+---------------+---------------+
Feb 08 09:50:56 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      | master  | 192.168.56.1 | not available           | not available | not available |
Feb 08 09:50:56 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      | standby | 192.168.56.2 | in recovery, lag 0 min. | down          | executing     |
Feb 08 09:50:56 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      +---------+--------------+-------------------------+---------------+---------------+
Feb 08 09:50:56 web-server pgvip[2713]: INFO  ACTION_EXECUTOR
Feb 08 09:50:56 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      executing command:
Feb 08 09:50:56 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      timeout 5 ssh 192.168.56.1 "ip address del 192.168.56.10/24 dev `ip address show | awk '/ inet 192.168.56.10\// {print $NF}'`"
Feb 08 09:50:59 web-server pgvip[2713]: WARN  ACTION_EXECUTOR      PGSUITE-905 OS command executed with error (errno: 255)
Feb 08 09:50:59 web-server pgvip[2713]: WARN  ACTION_EXECUTOR      output:
Feb 08 09:50:59 web-server pgvip[2713]: WARN  ACTION_EXECUTOR
Feb 08 09:50:59 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      executing command:
Feb 08 09:50:59 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      timeout 5 ssh 192.168.56.2 "su - postgres -c "\$(pg_config --bindir)/pg_ctl promote -W -D \$(psql -Aqtc 'show data_directory')""
Feb 08 09:50:59 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      output:
Feb 08 09:50:59 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      server promoting
Feb 08 09:50:59 web-server pgvip[2713]: INFO  ACTION_EXECUTOR
Feb 08 09:50:59 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      command executed successfully
Feb 08 09:50:59 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      executing command:
Feb 08 09:50:59 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      timeout 5 ssh 192.168.56.2 "ip address add 192.168.56.10/24 dev `ip address show | awk '/ inet 192.168.56.2\// {print $NF}'`"
Feb 08 09:51:00 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      output:
Feb 08 09:51:00 web-server pgvip[2713]: INFO  ACTION_EXECUTOR
Feb 08 09:51:00 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      command executed successfully
Feb 08 09:51:00 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      action "action_standby_db_promote" executed successfully
...
Feb 08 09:52:29 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      executing action "action_master_db_break"
Feb 08 09:52:29 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      condition: status.master_db_state==G_DB_STATE_READ_WRITE && status.standby_db_state==G_DB_STATE_READ_WRITE
Feb 08 09:52:29 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      status: STANDBY_PROMOTED (duration: 0 00:01:25)
Feb 08 09:52:29 web-server pgvip[2713]: INFO  ACTION_EXECUTOR
Feb 08 09:52:29 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      +---------+--------------+----------------+---------------+---------------+
Feb 08 09:52:29 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      | Role    | IP address   | Database state | 192.168.56.10 | VIP auto down |
Feb 08 09:52:29 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      +---------+--------------+----------------+---------------+---------------+
Feb 08 09:52:29 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      | master  | 192.168.56.1 | read-write     | down          | executing     |
Feb 08 09:52:29 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      | standby | 192.168.56.2 | read-write     | up on enp0s3  | executing     |
Feb 08 09:52:29 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      +---------+--------------+----------------+---------------+---------------+
Feb 08 09:52:29 web-server pgvip[2713]: INFO  ACTION_EXECUTOR
Feb 08 09:52:29 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      executing command:
Feb 08 09:52:29 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      timeout 5 ssh 192.168.56.1 "su - postgres -c "psql -c 'alter system set listen_addresses=pgvip_broken'" ; ps -e | awk '$4=="postmaster" || $4=="postgres" {print $1}' | xargs -r kill -9"
Feb 08 09:52:30 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      output:
Feb 08 09:52:30 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      ALTER SYSTEM
Feb 08 09:52:30 web-server pgvip[2713]: INFO  ACTION_EXECUTOR
Feb 08 09:52:30 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      command executed successfully
Feb 08 09:52:30 web-server pgvip[2713]: INFO  ACTION_EXECUTOR      action "action_master_db_break" executed successfully


Пример установки

[root@web-server ~]# mkdir /tmp/pgvip ; cd /tmp/pgvip

[root@web-server /tmp/pgvip]# curl -Os https://pgvip.org/files/download/pgvip.tar

[root@web-server /tmp/pgvip]# tar -xf pgvip.tar

[root@web-server /tmp/pgvip]# make install ip_master=192.168.56.1 ip_standby=192.168.56.2 ip_virtual=192.168.56.10 subnet_mask=24
# Configuration
if [ -z "24" ] ; then echo -e '
Usage with parameters: make install ip_master=[ip_master] ip_standby=[ip_standby] ip_virtual=[ip_virtual] subnet_mask=[subnet_mask]
Example: make install ip_master=192.168.56.1 ip_standby=192.168.56.2 ip_virtual=192.168.56.10 subnet_mask=24' ; exit 1; fi
if [ -e /etc/pgsuite/pgvip.conf ]   ; then echo -e '
Configuration file /etc/pgsuite/pgvip.conf already exists, change it' ; exit 1; fi
cp pgvip.conf /etc/pgsuite/pgvip.conf
sed -i "s/[ip_master]/192.168.56.1/;s/[ip_standby]/192.168.56.2/;s/[ip_virtual]/192.168.56.10/;s/[subnet_mask]/24/" /etc/pgsuite/pgvip.conf
# Service
if [ -e /etc/systemd/system/pgvip.service ] ; then echo -e '
Service file /etc/systemd/system/pgvip.service already exists, change it' ; exit 1; fi
cp pgvip.service /etc/systemd/system/pgvip.service
rm -f /usr/bin/pgvip
cp pgvip /usr/bin/pgvip
systemctl daemon-reload
echo -e '
pgvip service has been successfully created.
Use command "systemctl --now enable pgvip" to enable and start service,
"journalctl -fu pgvip" to view the journal'

pgvip service has been successfully created.
Use command "systemctl --now enable pgvip" to enable and start service,
"journalctl -fu pgvip" to view the journal

[root@web-server pgvip]# systemctl --now enable pgvip
Created symlink from /etc/systemd/system/multi-user.target.wants/pgvip.service to /etc/systemd/system/pgvip.service.

[root@web-server pgvip]# systemctl status pgvip
● pgvip.service - pgvip (PostgreSQL master-standby auto failover)
   Loaded: loaded (/etc/systemd/system/pgvip.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2025-02-15 10:51:33 EST; 2s ago
 Main PID: 1546 (pgvip)
   CGroup: /system.slice/pgvip.service
           ├─1546 /usr/bin/pgvip execute
           ├─1553 timeout 5 ssh 192.168.56.1 su - postgres -c "psql -Aqtc 'select pg_is_in_recovery() or pg_current_xact_id() is null, extract(epoch from now()-pg_last_xact_replay_timestamp())'"
           ├─1554 timeout 5 ssh 192.168.56.1 flock /tmp/pgvip.time -c 'date +%s > /tmp/pgvip.time' ; flock -n /tmp/pgvip.vip_auto_down -c '' ; echo -n $?, ; ip address show | awk '/ inet 192.168.5...
           ├─1555 timeout 5 ssh 192.168.56.2 su - postgres -c "psql -Aqtc 'select pg_is_in_recovery() or pg_current_xact_id() is null, extract(epoch from now()-pg_last_xact_replay_timestamp())'"
           ├─1556 timeout 5 ssh 192.168.56.2 flock /tmp/pgvip.time -c 'date +%s > /tmp/pgvip.time' ; flock -n /tmp/pgvip.vip_auto_down -c '' ; echo -n $?, ; ip address show | awk '/ inet 192.168.5...
           ├─1557 ssh 192.168.56.1 su - postgres -c "psql -Aqtc 'select pg_is_in_recovery() or pg_current_xact_id() is null, extract(epoch from now()-pg_last_xact_replay_timestamp())'"
           ├─1558 ssh 192.168.56.1 flock /tmp/pgvip.time -c 'date +%s > /tmp/pgvip.time' ; flock -n /tmp/pgvip.vip_auto_down -c '' ; echo -n $?, ; ip address show | awk '/ inet 192.168.56.10// {p...
           ├─1559 ssh 192.168.56.2 su - postgres -c "psql -Aqtc 'select pg_is_in_recovery() or pg_current_xact_id() is null, extract(epoch from now()-pg_last_xact_replay_timestamp())'"
           └─1560 ssh 192.168.56.2 flock /tmp/pgvip.time -c 'date +%s > /tmp/pgvip.time' ; flock -n /tmp/pgvip.vip_auto_down -c '' ; echo -n $?, ; ip address show | awk '/ inet 192.168.56.10// {p...

Feb 15 10:51:33 web-server systemd[1]: Started pgvip (PostgreSQL master-standby auto failover).
Feb 15 10:51:33 web-server systemd[1]: Starting pgvip (PostgreSQL master-standby auto failover)...
Feb 15 10:51:33 web-server pgvip[1546]: INFO  ACTION_EXECUTOR      thread started, thread_id: 1551
Feb 15 10:51:33 web-server pgvip[1546]: INFO  ADMINISTRATION       thread started, thread_id: 1552
Feb 15 10:51:33 web-server pgvip[1546]: INFO  ADMINISTRATION       binded to unix socket "/tmp/.s.PGVIP.0"
Feb 15 10:51:33 web-server pgvip[1546]: INFO  ADMINISTRATION       listening
Feb 15 10:51:33 web-server pgvip[1546]: INFO  CHECKER_MASTER_DB    thread started, thread_id: 1547
Feb 15 10:51:33 web-server pgvip[1546]: INFO  CHECKER_MASTER_VIP   thread started, thread_id: 1548
Feb 15 10:51:33 web-server pgvip[1546]: INFO  CHECKER_STANDBY_DB   thread started, thread_id: 1549
Feb 15 10:51:33 web-server pgvip[1546]: INFO  CHECKER_STANDBY_VIP  thread started, thread_id: 1550

[root@web-server pgvip]# pgvip status
status: SUCCESS (duration: 0 00:00:00)

+---------+--------------+-------------------------+---------------+---------------+
| Role    | IP address   | Database state          | 192.168.56.10 | VIP auto down |
+---------+--------------+-------------------------+---------------+---------------+
| master  | 192.168.56.1 | read-write              | up on enp0s3  | executing     |
| standby | 192.168.56.2 | in recovery, lag 0 min. | down          | executing     |
+---------+--------------+-------------------------+---------------+---------------+

[root@web-server pgvip]# journalctl -fu pgvip
Feb 15 10:54:50 web-server pgvip[1767]: INFO  ADMINISTRATION       connection accepted
Feb 15 10:54:50 web-server pgvip[1767]: INFO  ADMINISTRATION       received command "status"
Feb 15 10:54:50 web-server pgvip[1767]: INFO  ADMINISTRATION       info sent
Feb 15 10:54:50 web-server pgvip[1767]: INFO  ACTION_EXECUTOR      output:
Feb 15 10:54:50 web-server pgvip[1767]: INFO  ACTION_EXECUTOR
Feb 15 10:54:50 web-server pgvip[1767]: INFO  ACTION_EXECUTOR      command executed successfully
Feb 15 10:54:50 web-server pgvip[1767]: INFO  ACTION_EXECUTOR      action "action_master_vip_up" executed successfully
Feb 15 10:54:52 web-server pgvip[1767]: INFO  ADMINISTRATION       connection accepted
Feb 15 10:54:52 web-server pgvip[1767]: INFO  ADMINISTRATION       received command "status"
Feb 15 10:54:52 web-server pgvip[1767]: INFO  ADMINISTRATION       info sent

[root@web-server pgvip]# pgvip show config
PGVIP is a PostgreSQL master-standby auto failover service via a virtual IP address
version 25.1.1, linux 64 bits

Runtime configuration

IP addresses
  ip_master:      192.168.56.1
  ip_standby:     192.168.56.2
  ip_virtual:     192.168.56.10
  ip_subnet_mask: 24

OS command templates
  command_ssh:                   timeout ${time_command_ssh_timeout} ssh ${ip} "${command}"
  command_db_state:              su - postgres -c "psql -Aqtc 'select pg_is_in_recovery() or pg_current_xact_id() is null, extract(epoch from now()-pg_last_xact_replay_timestamp())'"
  command_db_break:              su - postgres -c "psql -c 'alter system set listen_addresses=pgvip_broken'" ; ps -e | awk '$4=="postmaster" || $4=="postgres" {print $1}' | xargs -r kill -9
  command_db_promote:            su - postgres -c "$(pg_config --bindir)/pg_ctl promote -W -D $(psql -Aqtc 'show data_directory')"
  command_ifname:                ip address show | awk '/ inet ${ip}// {print $NF}'
  command_vip_state:             flock /tmp/pgvip.time -c 'date +%s > /tmp/pgvip.time' ; flock -n /tmp/pgvip.vip_auto_down -c '' ; echo -n $?, ; ${command_ifname}
  command_vip_up:                ip address add ${ip_virtual}/${ip_subnet_mask} dev `${command_ifname}`
  command_vip_down:              ip address del ${ip_virtual}/${ip_subnet_mask} dev `${command_ifname}`
  command_vip_auto_down_script:  delta=0 ; while [ $delta -lt ${time_vip_auto_down_duration} ] ; do set -x; date +%Y-%m-%d_%H:%M:%S ; sleep ${time_check_interval} ; delta=$(($(date +%s)-$(flock /tmp/pgvip.time cat /tmp/pgvip.time))) ; if [ $delta -gt ${time_vip_auto_down_timeout} ] ; then ${command_vip_down} ; fi ; done
  command_vip_auto_down_execute: flock -w 1 /tmp/pgvip.vip_auto_down -c "${command_vip_auto_down_script} > /dev/null 2>&1 &"

Timings in seconds
  time_command_ssh_timeout:          5
  time_check_interval:               5
  time_check_not_available_delay:   15
  time_standby_allowable_lag:      300
  time_standby_promote_delay:       30
  time_vip_auto_down_timeout:       30
  time_vip_auto_down_duration:     600
  time_monitoring_interval:         15

Generated OS commands
  command_master_db_state:               timeout 5 ssh 192.168.56.1 "su - postgres -c "psql -Aqtc 'select pg_is_in_recovery() or pg_current_xact_id() is null, extract(epoch from now()-pg_last_xact_replay_timestamp())'""
  command_master_db_break:               timeout 5 ssh 192.168.56.1 "su - postgres -c "psql -c 'alter system set listen_addresses=pgvip_broken'" ; ps -e | awk '$4=="postmaster" || $4=="postgres" {print $1}' | xargs -r kill -9"
  command_master_vip_state:              timeout 5 ssh 192.168.56.1 "flock /tmp/pgvip.time -c 'date +%s > /tmp/pgvip.time' ; flock -n /tmp/pgvip.vip_auto_down -c '' ; echo -n $?, ; ip address show | awk '/ inet 192.168.56.10\// {print $NF}'"
  command_master_vip_up:                 timeout 5 ssh 192.168.56.1 "ip address add 192.168.56.10/24 dev `ip address show | awk '/ inet 192.168.56.1\// {print $NF}'`"
  command_master_vip_down:               timeout 5 ssh 192.168.56.1 "ip address del 192.168.56.10/24 dev `ip address show | awk '/ inet 192.168.56.10\// {print $NF}'`"
  command_master_vip_auto_down_execute:  timeout 5 ssh 192.168.56.1 "flock -w 1 /tmp/pgvip.vip_auto_down -c "delta=0 ; while [ \$delta -lt 600 ] ; do set -x; date +%Y-%m-%d_%H:%M:%S ; sleep 5 ; delta=\$((\$(date +%s)-\$(flock /tmp/pgvip.time cat /tmp/pgvip.time))) ; if [ \$delta -gt 30 ] ; then ip address del 192.168.56.10/24 dev \`ip address show | awk '/ inet 192.168.56.10\\// {print \$NF}'\` ; fi ; done > /dev/null 2>&1 &""
  command_standby_db_state:              timeout 5 ssh 192.168.56.2 "su - postgres -c "psql -Aqtc 'select pg_is_in_recovery() or pg_current_xact_id() is null, extract(epoch from now()-pg_last_xact_replay_timestamp())'""
  command_standby_db_promote:            timeout 5 ssh 192.168.56.2 "su - postgres -c "\$(pg_config --bindir)/pg_ctl promote -W -D \$(psql -Aqtc 'show data_directory')""
  command_standby_vip_state:             timeout 5 ssh 192.168.56.2 "flock /tmp/pgvip.time -c 'date +%s > /tmp/pgvip.time' ; flock -n /tmp/pgvip.vip_auto_down -c '' ; echo -n $?, ; ip address show | awk '/ inet 192.168.56.10\// {print $NF}'"
  command_standby_vip_up:                timeout 5 ssh 192.168.56.2 "ip address add 192.168.56.10/24 dev `ip address show | awk '/ inet 192.168.56.2\// {print $NF}'`"
  command_standby_vip_down:              timeout 5 ssh 192.168.56.2 "ip address del 192.168.56.10/24 dev `ip address show | awk '/ inet 192.168.56.10\// {print $NF}'`"
  command_standby_vip_auto_down_execute: timeout 5 ssh 192.168.56.2 "flock -w 1 /tmp/pgvip.vip_auto_down -c "delta=0 ; while [ \$delta -lt 600 ] ; do set -x; date +%Y-%m-%d_%H:%M:%S ; sleep 5 ; delta=\$((\$(date +%s)-\$(flock /tmp/pgvip.time cat /tmp/pgvip.time))) ; if [ \$delta -gt 30 ] ; then ip address del 192.168.56.10/24 dev \`ip address show | awk '/ inet 192.168.56.10\\// {print \$NF}'\` ; fi ; done > /dev/null 2>&1 &""
 
[root@web-server /tmp/pgvip]#


Сравнение с Patroni

Patroni является кластером и во всех аспектах превосходит PGVIP, который выполняет однократное переключение и кластером не является.

Однако рассмотреть использование PGVIP вместо Patroni можно в следующих ситуациях:

Сравнение с Keepalived

Настроить Keepalived на aвтопереключение master-standby возможно.

Keepalived выполняет перемещение виртуального IP адреса в зависимости доступности службы, поэтому при настройке необходимо рассматривать различные ситуации: Настройка получается с непрозрачной логикой и дополнительными скриптами. Часть действий выполняется Keepalived, часть - скриптами. Параметры (например, IP адреса) указываются в различных местах: keepalived.conf и скрипты. Из-за этого достаточно просто что-то упустить или ошибиться.

PGVIP выполняет действия на основании информации с основного и резервного серверов. Настройки выполняются в одном конфигурационном файле, который содержит все команды.


Split-brain

При установке PGVIP на один web-сервер split-brain невозможен по определению.

При установке на два web-сервера split-brain маловероятен, т.к. при обнаруженнии c любого web-сервера двух баз данных в состоянии read-write основная база данных останавливается без возможности запуска:

Ключевые особенности


Ограничения