What is SQL Relay


Next:Replication of the ZODB Up:Replication Previous:Replication Contents

Data replication on PostgreSQL

With the Postgres there are basically two approaches to carry out the replication with the help of database resources or to use tools for doubling the storage space in the file system.
The most obvious means that the database itself provides are triggers, whereby in the event of data manipulation this must be communicated to the duplicate server and also applied to its database.
On the basis of the file system, there are various means, such as "` rcp "'," `scp"', the integration of network drives and then the use of "` cp "'. Of course, the "rsync" which can be used as a "scp framework" is particularly useful.
However, when copying at file level, there is always the problem of open files which, for example, the database server process does not release during its runtime and which therefore cannot be copied.
Products in the PostgreSQL environment that sounded interesting for replication and were therefore examined more closely are
  1. pgpool
  2. SQL relay
  3. Slony-1
The most promising product was pgpool, which promised both load balancing and replication via PostgreSQL databases.
The self-image of the product can be found on the website as follows:
It was easy to install and configure. It can manage two servers, although I can imagine cascading as the system behaves like a transparent proxy. The technique used consists of duplicating the requests on all backends. Two modes can be selected, the strict mode, in which the request is only duplicated on the slave when the operation on the master was successful, and the real parallel mode, where the duplication takes place immediately on both servers. The system also recognized when the master could no longer be reached and then carried out the operations exclusively on the slave.
Problems that the system brings with it are, among other things, that the promised load distribution can only be carried out on read operations and to secure this, it only works with SELECT statements. Even the BEGIN of a transaction, whether reading or writing, prevents the load sharing procedure, and the operations are carried out one after the other on the master and slave. Another and more important point is the fact that although a failure of the master is recognized and logged, the slave is not identified as a master, which means a return to its database when the master is restarted. There is no plausibility or real consistency check of the data.
The decisive exclusion criterion was the fact that a brief interruption in the availability of the slave from the master leads to data inconsistencies that are not recognized and rectified.

SQL Relay is a product, which is also like a proxy between the application and the database and was specially programmed to increase the performance of web applications.
A rough description of the product can be found on [sql04]:

In [srf04], the FAQ of the product, there is a reason why it is used with
ZOPE could be interesting:
However, this product specializes exclusively in load balancing and does not support any form of data replication, which in turn is left to other products, as can be found in another statement in the FAQ. According to the FAQ, SQL Relay supports writing functionality to be restricted to a "master" if this is required by replication. The only problem is that no guarantee can be given for the synchronization and reference is made to external tools. It is important to note that load balancing can make replication more difficult.

Jan Wieck released the first release of its tool "Slony-1" 'in the middle of this year, which provides a framework for replicating PostgreSQL databases in particular.
Slony-1 works in the form of a daemon that waits for events in the database and then executes scripts. In the Slony-1, a cluster is defined over several databases, usually on different servers, and access to these is regulated.
If this is done, data manipulations (so far no schema manipulations) are duplicated on all nodes of the server according to the definitions of the script.
To manage the replication, Slony-1 sets up an extra database in which the manipulations that still have to be transferred to the slaves are stored. The product works strictly according to the master-slave principle and does not offer any possibility of mutual synchronization, i.e. replication from all participating servers to all the others in the cluster.

Despite these minor shortcomings, this software is the cheapest solution for replicating the PostgreSQL database. This is due to the properties, such as support for multiple slaves. Another important point is that a brief failure of the slave to be reached by the master does not lead to the replication being aborted or data inconsistency. Rather, on the basis of the administration database that the Slony-1 sets up on the master, it is noted for each slave which data still needs to be replicated. In addition, the tool offers a feature that makes a lot of sense and that further increases its rating. Slony-1 supports a function for dynamic assignment of the roles master and slave. In the event of a maintenance window for the master, a slave (in this case there is only one) can transfer the role of master, after which the new slave and old master can be processed. In the case of failback, the old master is integrated as a slave, a data update is initiated using Slony-1 replication and then the role can be transferred again as the master.



Next:Replication of the ZODB Up:Replication Previous:Replication Contents Marko Damaschke 2005-06-11