Saturday, September 13, 2014

Troubleshooting SQLServer CDC Service for Oracle Capturing Only Changed Column Values for Updates Issue

Recently for one of my datawarehouse projects we had a scenario which required getting data from a Oracle server. Since this Oracle server was a high end OLTP system with high frequency DML operations happening the impact on the system has to be kept a minimum.
For this purpose we made use of the CDC service for Oracle for replicating the changes from Oracle server to our SQLServer datawarehouse. This worked fine for us until we faced an issue with data modifications happening at the Oracle end.
We could see that the CDC only captured details of the changed columns for UPDATE operations happening in Oracle server leaving out all the other columns in the table with NULL values. This threw off our logic to capture the changed information from the Oracle delta data.
What we didn't understand was what made this issue to suddenly crop up as this was working well at the time of implementation for us. I did some analysis on this and found out the reason for this which helped us to fix this issue. I'm sharing it via this blog to benefit others who may get stuck up with similar issues.
The key requirement for CDC to work is that Oracle server should be setup in ARCHIVELOG mode with supplementary logging enabled. This is because CDC makes use of logs to capture the change data and replicate it to the SQLServer database.So my analysis was around the settings to be done for supplementary logging at Oracle server.
Through the below link
I learned that there are two level of logging and for CDC to get enough information from the redo logs and capture it outside of Oracle instance Database Level Identification Key logging has to be enabled. The information regarding this can be got from V$DATABASES catalog view in Oracle. It has a column corresponding to each of the options available like
I checked the value and found out that SUPPLEMENTAL_LOG_DATA_ALL had the value set as NO. So I tried setting it to YES using below statement

I did some changes in source tables to check the outcome and found out that CDC was able to pick up data in other columns too from then on which confirmed me that issue had been fixed by tweaking the setting. Apparently what had happened was that the ARCHIVELOG got full in the Oracle server and after fixing this the supplementary settings somehow got reset to the default values. Setting it back to original state fixed the issue. So whenever you face similar issues with CDC not able to get required data from redo logs make sure you check the V$DATABASES view for the supplementary log settings existing for various types of columns and fix them accordingly in case they got reset using ALTER DATABASE command.