Oracle Change Data Capture – The Technology Explained
Oracle Change Data Capture (CDC) monitors and tracks all changes made in a database so that the required action can be taken on them. Oracle CDC is primarily a set of software design patterns for identifying, capturing, and delivering the modifications made to source data in enterprise databases, which is the starting point for data integration. It also supports real-time data integration across enterprises and speeds up data warehousing, improving the performance and quality of database output.
Oracle CDC is an efficient, non-intrusive method for running replication activities with minimal drag on the performance of the source system. Typical tasks include migrating databases to the cloud without downtime and offloading analytics queries from production databases to data warehouses. It is also possible to extract incremental data from various sources and move it to a data warehouse.
Because one of the main functions of Oracle CDC is to capture and preserve the state of the data, it is usually found in a data warehouse environment, although it can be implemented in any database or data repository system. Users can set up Oracle CDC mechanisms at several system layers, from application logic down to physical storage, in a single layer or in a combination of them.
Development of Oracle CDC Software and Technology
Change Data Capture technology was first incorporated as a built-in feature of Oracle databases in Oracle 9i, where it was used to track and record all changes to user tables in a database. All modifications were stored in dedicated change tables for use by ETL applications, which then processed the data and loaded it into other databases and data warehouses. This first version of Oracle CDC worked through triggers placed on the source tables, which did not find favor with database administrators, who considered the technique very invasive.
Subsequently, in its 10g version, Oracle released a less intrusive, asynchronous form of Oracle CDC built on Oracle Streams. This process leveraged the redo logs of the source database along with Oracle's built-in replication tool. Streams was a very effective way to detect change data and move it to a target repository with little impact on the performance of the source system.
However, despite its efficiency, Oracle decided to phase out Streams starting with Oracle 12c, and that release no longer supports Oracle CDC. Users now have to either look for another Oracle replication or CDC solution or pay for Oracle GoldenGate, which has CDC as a built-in feature.
The Current Form of Oracle Change Data Capture
The concept of Oracle Change Data Capture is based on the premise that data on one system (the source database) changes and another system (the target database) has to take some action based on those changes. In some cases the source and the target are the same database, and Oracle CDC works equally well there, since several CDC solutions can exist in the same system.
Oracle Data Integrator helps Oracle CDC identify changes to the data at source, and it supports two modes.
- Synchronous Mode: In this mode, triggers are placed on the source tables so that any change made there is captured immediately. Each SQL statement that performs a DML (Data Manipulation Language) operation, that is, an insert, update, or delete, has its change data captured as part of the transaction that modifies the source. This form of Oracle CDC is available in the Oracle Enterprise Edition and the Oracle Standard Edition.
- Asynchronous Mode: In this mode, changes are captured from the redo log files after the SQL statement performing the DML operation has been committed. Because the change data is not captured as part of the transaction that modified the source tables, the capture has no effect on that transaction. Asynchronous CDC has three modes: HotLog, Distributed HotLog, and AutoLog. It is based on the now-discontinued Oracle Streams and provides a relational interface to it.
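The synchronous, trigger-based approach can be illustrated with a small sketch. This is not Oracle code: it uses Python's built-in sqlite3 module as a stand-in database, and the table names (customers, customers_changes) and trigger names are hypothetical. It only shows the general idea that a trigger writes each change to a change table inside the same transaction as the DML statement, so a rolled-back change is never captured.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a source table, a change table,
    and one capture trigger per DML operation (all names are hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);

        -- Change table: one row per captured insert, update, or delete.
        CREATE TABLE customers_changes (
            change_id INTEGER PRIMARY KEY AUTOINCREMENT,
            operation TEXT,
            id        INTEGER,
            name      TEXT
        );

        CREATE TRIGGER customers_ins AFTER INSERT ON customers
        BEGIN
            INSERT INTO customers_changes (operation, id, name)
            VALUES ('I', NEW.id, NEW.name);
        END;

        CREATE TRIGGER customers_upd AFTER UPDATE ON customers
        BEGIN
            INSERT INTO customers_changes (operation, id, name)
            VALUES ('U', NEW.id, NEW.name);
        END;

        CREATE TRIGGER customers_del AFTER DELETE ON customers
        BEGIN
            INSERT INTO customers_changes (operation, id, name)
            VALUES ('D', OLD.id, OLD.name);
        END;
    """)
    return conn


def captured_changes(conn):
    """Return the captured changes in the order they were recorded."""
    return conn.execute(
        "SELECT operation, id, name FROM customers_changes ORDER BY change_id"
    ).fetchall()


conn = build_demo_db()
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
conn.execute("UPDATE customers SET name = 'Ada L.' WHERE id = 1")
conn.execute("DELETE FROM customers WHERE id = 1")
conn.commit()

# The trigger runs inside the same transaction as the DML statement,
# so rolling back the insert also rolls back its change record.
conn.execute("INSERT INTO customers (id, name) VALUES (2, 'Bob')")
conn.rollback()

print(captured_changes(conn))
# [('I', 1, 'Ada'), ('U', 1, 'Ada L.'), ('D', 1, 'Ada L.')]
```

The same property explains the cost of this mode: every DML statement on the source table does extra work before it can finish, which is the overhead that asynchronous, log-based capture avoids.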
Setting up and configuring Oracle Data Integrator is easy and mostly automated.
Comparison of Database Extraction With and Without Oracle CDC
There are several benefits of database extraction with Oracle CDC.
- Extraction of data: Oracle CDC enables immediate extraction of Insert, Update, and Delete operations in real time, as soon as the changes are made in the source tables. Without CDC, database extraction is not optimized for Insert activities and is very difficult for Update and Delete, as the earlier data is no longer available in the table.
- Staging: Staging data is placed directly in relational tables with Oracle CDC and flat files are not required. Without CDC, entire tables have to be moved into flat files.
- Interface: An easy-to-use publish and subscribe interface is provided by Oracle CDC through the DBMS_LOGMNR_CDC_PUBLISH and DBMS_LOGMNR_CDC_SUBSCRIBE packages. Without CDC, this step requires extensive manual effort to administer and is prone to errors.
- Costs: Oracle CDC is built into the database from the Oracle 9i version and into Oracle GoldenGate, which is a paid product. Even so, the cost of extracting change data remains affordable. Without CDC, the process is expensive, as the software required for data capture has to be purchased from a third-party vendor or developed in-house.
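The difference between the two extraction styles can be sketched in a few lines of Python. This is a simplified in-memory model, not Oracle's interface: the function names (diff_snapshots, read_new_changes) and the shape of the change records are assumptions made for illustration. Without CDC, changes have to be inferred by comparing two full copies of a table; with a change table, a subscriber reads only the entries it has not seen yet.

```python
def diff_snapshots(before, after):
    """Without CDC: infer changes by comparing two full snapshots,
    each a dict mapping primary key -> row value."""
    changes = []
    for key in sorted(after.keys() - before.keys()):
        changes.append(("I", key, after[key]))
    for key in sorted(before.keys() & after.keys()):
        if before[key] != after[key]:
            changes.append(("U", key, after[key]))
    for key in sorted(before.keys() - after.keys()):
        changes.append(("D", key, before[key]))
    return changes


def read_new_changes(change_table, last_seen_id):
    """With a change table: return only the entries recorded after
    last_seen_id, plus the new high-water mark for the next call."""
    new_rows = [row for row in change_table if row[0] > last_seen_id]
    new_last = new_rows[-1][0] if new_rows else last_seen_id
    return new_rows, new_last


# Two full copies of the table are needed for the snapshot comparison.
before = {1: "Ada", 2: "Bob"}
after = {1: "Ada L.", 3: "Cy"}

# Hypothetical change table: (change_id, operation, key, value).
# Row 4 was inserted and deleted between the two snapshots.
change_table = [
    (1, "U", 1, "Ada L."),
    (2, "I", 4, "Dee"),
    (3, "D", 4, "Dee"),
    (4, "D", 2, "Bob"),
    (5, "I", 3, "Cy"),
]

print(diff_snapshots(before, after))
# [('I', 3, 'Cy'), ('U', 1, 'Ada L.'), ('D', 2, 'Bob')]

rows, mark = read_new_changes(change_table, last_seen_id=0)
print(len(rows), mark)
# 5 5
```

The snapshot comparison has to read both copies in full and never sees row 4, which was inserted and deleted between the snapshots, whereas the change table records every operation and lets the subscriber resume from its last high-water mark.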
These are some of the advantages of working with Oracle CDC.
This technology from Oracle has made database administration, migration, and replication quick, with a minimum of human intervention.