Setting up a GoldenGate replication with multiple process groups
If your source system is a very busy OLTP system, with a high rate of data changes, then you will find that a single set of GoldenGate processes is not sufficient to replicate the load to the target environment in real time. In such a situation Oracle recommends splitting the load into multiple process groups.
Getting ready
For this recipe we will use the Order Entry demo schema. This schema has been created in both source and target databases, and is in the same state. The GoldenGate binaries, the GoldenGate Admin user, and manager instance have also been set up in the source and target environment.
How to do it...
In this recipe we will set up the following configuration for replicating the Order Entry schema:
As you can see from the preceding diagram, the replication setup performed here will consist of two Extract processes, two Datapumps, and four Replicat processes.
Perform the following steps in the source database:
Create the first Extract process EGGTEST1 in the source environment:
Add the Replicat processes to the target manager configuration:
ADD REPLICAT RGGTEST1, EXTTRAIL /u01/app/ggate/dirdat/PGGTEST1/rt, BEGIN NOW
ADD REPLICAT RGGTEST2, EXTTRAIL /u01/app/ggate/dirdat/PGGTEST1/rt, BEGIN NOW
ADD REPLICAT RGGTEST3, EXTTRAIL /u01/app/ggate/dirdat/PGGTEST2/rt, BEGIN NOW
ADD REPLICAT RGGTEST4, EXTTRAIL /u01/app/ggate/dirdat/PGGTEST2/rt, BEGIN NOW
Perform the following steps in the source database:
In this example, we have split the extract load into two streams using the FILTER clause with a RANGE option. The FILTER clause is used to refine the changes that a GoldenGate process looks for. The RANGE function distributes the load evenly into multiple streams by calculating the hash value of the key columns in the table. By using both of these options each Extract process only processes half the load. Each Extract process writes its trail files to a separate directory under $GG_HOME/dirdat/ and is configured with a dedicated Datapump process on the source server. These Datapump processes then transfer the trail files to the remote server and write them to separate directories there. The replicat load on the target server is split into four parallel streams. The first two Replicat processes (RGGTEST1 and RGGTEST2) apply the changes captured by the first Extract process (EGGTEST1) on the source server. The next two Replicat processes (RGGTEST3 and RGGTEST4) apply the changes captured by the second Extract process (EGGTEST2) on the source server.
There's more...
You would configure multiple processes to enhance the GoldenGate performance. In most cases you would find that it is the Replicat process that is causing the bottleneck and needs intervention. There are many performance enhancements in the latest release of Oracle GoldenGate. However, at times you would find that adding additional process groups is the only way of achieving the desired replication lag targets. There are various ways to determine how to split tables between various processes which are as follows:
You can specify different tables in different processes
General rule of thumb is to keep related tables together
Tables with referential integrity constraints should always be kept in the same process group
You can use the RANGE function to logically split the load of related tables into multiple streams
If a single process is not able to handle the load of high rate of change on a single table, then you should split its load using the RANGE clause
See also
See the Splitting the replication load into multiple process groups for optimal performance recipe in Chapter 6, Monitoring, Tuning, and Troubleshooting GoldenGate