Replication, but synchronized!
Why built-in synchronous replication is an awesome feature of Postgres 9.1
This is an exciting year for Postgres. Large, enterprise-minded companies have been working on new features and getting them committed to the core of Postgres.
One of those features available for testing in the first 9.1 beta release is synchronous replication. In only a few hundred lines, the source code file that implements this feature is incredibly lightweight. But there’s some sweet and surprising functionality inside.
So, what exactly is synchronous replication?
The Postgres docs say this feature “extends the standard level of durability offered by a transaction commit. This level of protection is referred to as 2-safe replication in computer science theory.” (Here’s a paper that explains 1-safety, 2-safety and group-safety, if you’re into that type of thing.)
What this means is that synchronous replication ensures that a database transaction is written to a standby system before it’s considered committed. A transaction is anything in a BEGIN/COMMIT block, including CREATE/DROP statements.
You can then treat the two Postgres systems that are partnered as an up-to-the-latest-commit-synchronized cluster.
The system includes a simple queue that identifies which standby databases are partnered and promotes potential partners to sync status if an existing partner is unable to synchronize. If a higher synchronization priority system comes back online, and the changes can still be replayed, then the partnership will change again, back to the higher priority system.
Only two systems can be partnered at any time. Other candidates for synchronization wait in ‘potential’ status.
Under the hood, streaming replication is what ships data from database system to database system as changes occur. You can, of course, use streaming replication without synchronous commit. Changes will then be shipped between databases after data is committed locally.
A few tricky bits
Configuring replication continues to get simpler, but there may be a few speed bumps for those not familiar with Postgres.
The trickiest aspects include: the parameters which reside in recovery.conf, the pg_hba.conf and user-level permissions that are needed, as well as some unexplained dependancies on other postgresql.conf parameters.
With pre-packaged systems, utilizing the postgres user requires the addition of at least one pg_hba.conf configuration line to enable access for the postgres user to replication.
Another issue that tripped me up was the requirement that max_connections be the same value between master and standbys. It didn’t bite me until I restarted the slave systems, and they wouldn’t start up! You can restart a master system to change max_connections, and the slaves will just reconnect to it when it comes back online.
Digging in
Parameters for configuring synchronous replication on the master system are: synchronous_commit (on/off/local), and synchronous_standby_names. The synchronous_standby_names can be either ‘*’ or a list of names used to determine which standbys can synchronize in order of priority.
On the standby system you may configure the following features in recovery.conf:
primary_conninfo,standby_mode (on/off), trigger_file (trigger file that ends recovery mode), recovery_command. Only standby_mode and primary_conninfo are required.
Here’s a conninfo example:
primary_conninfo = ’host=localhost port=5432 user=sdeckelmann application_name=sync1’
Your replication user can be any user with superuser privileges. Standby systems will connect using the replication database, which is a reserved word in 9.0 and higher. Typically, you’ll need to update pg_hba.conf to allow connections to the replication database.
You may have noticed that I set application_name in my conninfo string. This can be used to explicitly control which systems are engaged in synchronization (per synchronous_standby_names), and name each standby database system in pg_stat_replication monitoring output.
Because synchronous replication was built as an extension to our WAL-based replication system, it can operate seamlessly in conjunction with hot standby. On the standby system you may also add hot_standby (on/off) to postgresql.conf, enabling hot standby *and* synchronous replication at the same time.
This is most likely not an ideal configuration for high performance, so please consider the performance impacts carefully before enabling.
Once everything’s up and running, you can view current status through the pg_stat_replication view:
application_name | state | sent_location | replay_location | sync_priority | sync_state ------------------+-----------+---------------+-----------------+---------------+------------- sync2 | STREAMING | 0/202BBB8 | 0/202BBB8 | 1 | SYNC sync1 | STREAMING | 0/202BBB8 | 0/202BBB8 | 1 | POTENTIAL (2 rows)
Here you can clearly see the sync_stat of async and sync, as well as the sync_priority of systems which are capable of receiving synchronous commits.
If I take sync1 offline, we see this:
application_name | state | sent_location | replay_location | sync_priority | sync_state ---------------+----------+------------+--------------+-----------+---------- sync2 | STREAMING | 0/202BBB8 | 0/202BBB8 | 1 | SYNC (1 row)
If instead, I configure the master system to *only* perform synchronous commits with sync1 (by configuring: synchronous_standby_names = 'sync1'), then once all systems are online and the master is restarted, we’ll see the following:
application_name | state | sent_location | replay_location | sync_priority | sync_state ---------------+----------+------------+-------------+-----------+---------- sync2 | STREAMING | 0/202BB08 | 0/202BB08 | 0 | ASYNC sync1 | STREAMING | 0/202BB08 | 0/202BB08 | 1 | SYNC
Pretty slick! This level of reporting is something that could only be partially coaxed out of calls to pg_controldata not too long ago in the past.
Try it out yourself
As I mentioned earlier, you can download 9.1 Beta 1 and try synchronous replication out today. We depend on beta testing to find corner cases and work out the kinks. So, your feedback is welcome.
I’m in Ottawa meeting with developers for PgCon, the yearly, international developers conference for the PostgreSQL Global Development Group. I’ll be giving a talk about Emma’s journey to terabyte-sized databases, leading another exciting round of Lightning Talks, and facilitating our first ever Procedural Language Summit on Saturday.
If you’re at PgCon, there will be at least one talk demonstrating sync rep, and the authors will be on hand for you to congratulate.
Also, come say hi to me and Mark Wong — we’re both here on behalf of Emma.
Leave a comment
