Monday, February 3, 2020

SQLServer : Real Time Streaming RDBMS Data Using Apache Kafka Platform


Problem Statement 

There is an application database hosted in a SQLServer instance which is capturing data from a automation software including the readings as well as general health related details of multiple machines. 
The requirement asks for the creation of a close to real time reporting platform which can be used as a monitoring and health check mechanism for the machine as well as perform time series based analysis on it.

Solution

The requirements like above requires the usage of a real time streaming platform which is capable of transmitting huge amount of data close to real time from the database. There are many such streaming platforms available. 
One of the popular streaming tool which can be used in the above case is Apache Kafka. Apache Kafka is capable to connecting to variety of heterogeneous sources and transmit data from them in close to real time as streams. 
So in our above case we make use Kafka to connect to the SQLServer database to extract data from tables and stream them in real time.
You can learn more about Kafka from the below link


Kafka Connect is an open source component of Kafka which provides the required framework to connect Kafka to a variety of systems to act as source or sink.


For connecting to SQL Server we shall make use of either of the below connectors

1. JDBC based connector


This connect directly to the tables and extract data from them

2. Debezium connector for SQLServer


Debezium is a distributed platform which enables converting RDBMS data to event streams by means of set of source and sink connectors. The connectors enables Kafka Connect to systems as source to extract data and stream them as Kafka topics close to realtime. They also includes sink connectors which enables data from Kafka streams to be stored as data inside RDBMS table.


Illustration

A very detailed step by step illustration for the above scenario is explained in the latest article I've contributed to Technet community whose link is given below


The article explains from setting up of the required systems (SQLServer, Kafka, Kafka Connect, Debezium connector) to streaming data and analyzing the stream through command line and through K-SQL server.

The article should help someone who is starting on the above mentioned systems and requires implementing use case similar to that specified above

Wish you happy reading.

As always feel free to give your valuables feedback

No comments:

Post a Comment