Skip to content

HOWTO: PostgreSQL Replication with Automatic Fail-Over , using repmgr a.k.a. replication manager

January 21, 2016

In our scenario we only have two separate database servers. The OSes are Linux Centos 6.7. The Database Servers are PostgreSQL 9.4. The repmgr is repmgr94.

The Servers was been configured with streaming replication. So, this setup is not from bare-metal, in opposite it’s from production setup.

I stop the standby server, rename the recovery.conf file and start it again. So , now , we can addusers,databases,etc and we can setup the system with repmgr.

On master and slave nodes:

  1. create ssh keys and populated , so every server can connect to each other in the context of postgres user

ON THE MASTER NODE: on the command line

  1. Install repmgr on the master node
hostname# yum -y install repmgr94 repmgr94-debuginfo

…..

2. Just for info (if I forgot something) , I’m showing the  postgresql.conf key-values.

hostname# su - postgres
-bash-4.1$ grep ^[a-zA-Z0-9] /usr/local/pgsql/data/postgresql.conf

 

listen_addresses = ‘0.0.0.0’
port = 5432 # (change requires restart)
max_connections = 2048 # (change requires restart)
tcp_keepalives_idle = 0 # TCP_KEEPIDLE, in seconds;
tcp_keepalives_interval = 0 # TCP_KEEPINTVL, in seconds;
tcp_keepalives_count = 0 # TCP_KEEPCNT;
shared_buffers = 1024MB # min 128kB
temp_buffers = 128MB # min 800kB
work_mem = 16MB # min 64kB
dynamic_shared_memory_type = posix # the default is the first option
max_files_per_process = 1024 # min 25
wal_level = hot_standby
fsync = on # turns forced synchronization on or off
synchronous_commit = on # synchronization level;
checkpoint_segments = 3 # in logfile segments, min 1, 16MB each
checkpoint_timeout = 5min # range 30s-1h
archive_mode = on
archive_command = ‘cp %p /usr/local/pgsql/wal/%f’
archive_timeout = 2 # force a logfile segment switch after this
max_wal_senders = 8 # max number of walsender processes
wal_keep_segments = 64 # in logfile segments, 16MB each; 0 disables
wal_sender_timeout = 30s # in milliseconds; 0 disables
log_destination = ‘stderr’ # Valid values are combinations of
logging_collector = on # Enable capturing of stderr and csvlog
log_directory = ‘pg_log’ # directory where log files are written,
log_filename = ‘postgresql-%a.log’ # log file name pattern,
log_truncate_on_rotation = on # If on, an existing log file with the
log_rotation_age = 1d # Automatic rotation of logfiles will
log_rotation_size = 0 # Automatic rotation of logfiles will
log_line_prefix = ‘< %m >’ # special values:
log_timezone = ‘Europe/Berlin’
datestyle = ‘iso, mdy’
timezone = ‘Europe/Berlin’
lc_messages = ‘en_US.UTF-8’ # locale for system error message
lc_monetary = ‘en_US.UTF-8’ # locale for monetary formatting
lc_numeric = ‘en_US.UTF-8’ # locale for number formatting
lc_time = ‘en_US.UTF-8’ # locale for time formatting
default_text_search_config = ‘pg_catalog.english’

 

-bash-4.1$ createuser -s repmgr
-bash-4.1$ createdb repmgr -O repmgr
# touch /var/log/repmgr/repmgr-9.4.log
# grep -v ^\# /etc/repmgr/9.4/repmgr.conf

cluster=CLUSTER_NAME
node=1
node_name=MASTERNODE_HOSTNAME
conninfo=’host=127.0.0.1 dbname=repmgr user=repmgr’
loglevel=NOTICE
logfacility=STDERR
logfile=’/var/log/repmgr/repmgr-9.4.log’
pg_bindir=/usr/pgsql-9.4/bin
master_response_timeout=60
reconnect_attempts=6
reconnect_interval=10
failover=automatic # one of ‘automatic’, ‘manual’
priority=100 # a value of zero or less prevents the node being promoted to master
promote_command=’/usr/pgsql-9.4/bin/repmgr standby promote -f /etc/repmgr/9.4/repmgr.conf’


# repmgr -f /etc/repmgr/9.4/repmgr.conf --verbose master register

[2016-01-21 19:19:07] [NOTICE] opening configuration file: /etc/repmgr/9.4/repmgr.conf
[2016-01-21 19:19:07] [NOTICE] Redirecting logging output to ‘/var/log/repmgr/repmgr-9.4.log’

Execure cluster show to ensure everything is ok until now.

# repmgr -f /etc/repmgr/9.4/repmgr.conf --verbose cluster show

[2016-01-21 19:34:40] [NOTICE] Redirecting logging output to ‘/var/log/repmgr/repmgr-9.4.log’
Role | Connection String
* master | host=MASTERNODE_HOSTNAME dbname=repmgr user=repmgr

 

ON THE SLAVE NODE: on the command line

  1. First, clone the production database from primary/master database server: MASTERNODE_HOSTNAME
# su - postgres
-bash-4.1$ /usr/pgsql-9.4/bin/repmgr -f /etc/repmgr/9.4/repmgr.conf --force --rsync-only -d repmgr -U repmgr -h MASTERNODE_HOSTNAME --verbose  standby clone

 

……

pg_twophase/
pg_xlog/

sent 73062 bytes received 1674435 bytes 1164998.00 bytes/sec
total size is 62768776 speedup is 35.92
receiving incremental file list
pg_control
8192 100% 7.81MB/s 0:00:00 (xfer#1, to-check=0/1)

sent 102 bytes received 236 bytes 676.00 bytes/sec
total size is 8192 speedup is 24.24

 

# /etc/init.d/postgresql start
# su - postgres
-bash-4.1$ /usr/pgsql-9.4/bin/repmgr -f /etc/repmgr/9.4/repmgr.conf --verbose standby register

 

Finally ensure cluster status, with cluster show command:

-bash-4.1$ /usr/pgsql-9.4/bin/repmgr -f /etc/repmgr/9.4/repmgr.conf cluster show

 

[2016-01-21 20:28:41] [NOTICE] Redirecting logging output to ‘/var/log/repmgr/repmgr-9.4.log’
Role | Connection String
* master | host=MASTERNODE_HOSTNAME dbname=repmgr user=repmgr
standby | host=SLAVENODE_HOSTNAME dbname=repmgr user=repmgr

 

That’s All.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: