In the ever-changing world of data and analytics, it can be challenging to assess how your organization compares to the rest of the market and how to frame your data strategy. To help with that comparison, Gartner defined “The Analytics Continuum,” which lays out four high-level tasks and plots them on a scale of analytics maturity and competitive advantage.
The Traditional ETL Challenge
The transition from “What happened?” to “Why did it happen?” on “The Analytics Continuum” is the most difficult step to take because answering these questions requires an entirely new approach to data management. This step requires access to data in real time, as events happen, and a traditional extract, transform, load (ETL) process that runs in scheduled batches can't deliver feedback quickly enough to act on. For example, let’s say you would like to send real-time targeted ads via push notifications to customers in your store. A traditional ETL process might only be kicked off once a day when the store is closing, well after the customer has left. So how do you solve this problem in today’s world?
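The timing problem in the push-notification example can be sketched in a few lines of Python. This is only an illustration: the store hours, the 30-minute visit, and the two-second streaming latency are assumed values, and the function names are hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical store hours and visit, chosen only for illustration.
STORE_CLOSE = datetime(2024, 1, 15, 21, 0)   # nightly ETL kicks off at closing time
visit_start = datetime(2024, 1, 15, 14, 0)   # customer walks in at 2:00 pm
visit_end = visit_start + timedelta(minutes=30)


def batch_available_at(event_time: datetime) -> datetime:
    """A once-a-day job makes the event usable only after the next store close."""
    if event_time <= STORE_CLOSE:
        return STORE_CLOSE
    return STORE_CLOSE + timedelta(days=1)


def stream_available_at(event_time: datetime, latency_s: float = 2.0) -> datetime:
    """A streaming pipeline makes the event usable after a short, assumed latency."""
    return event_time + timedelta(seconds=latency_s)


def can_send_push(available_at: datetime) -> bool:
    """A targeted push only helps if the customer is still in the store."""
    return available_at <= visit_end


batch_ok = can_send_push(batch_available_at(visit_start))
stream_ok = can_send_push(stream_available_at(visit_start))
print(f"batch ETL reaches the customer in store: {batch_ok}")
print(f"streaming reaches the customer in store: {stream_ok}")
```

Under these assumptions the nightly job sees the visit hours after it ended, while the per-event path sees it within seconds.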
The Modern Data Architecture Solution
A modern data architecture (MDA) allows you to process real-time streaming events in addition to more traditional data pipelines. Credera recommends two primary approaches when building an MDA for your organization, each with its own strengths and weaknesses. The first approach is called a Lambda architecture and has two different components: batch processing and stream processing. The second approach is called a Kappa architecture, where all data in your environment is treated as a stream.
Lambda Architecture Overview
The main advantage of implementing a Lambda-based MDA is that you can typically continue to use your existing batch ETL processes as the batch component. The only time this wouldn’t be true is if your existing systems are unable to handle the volume of data your organization produces. A well-known weakness of Lambda is that you now have to manage and maintain two separate systems to acquire data.
Lambda architecture example
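The two components described above can be sketched in plain Python as a minimal, in-memory illustration. The event shape, the `SpeedLayer` class, and the `query` function are hypothetical names for this sketch, not part of any specific product; a real deployment would back each layer with its own infrastructure.

```python
from collections import Counter
from typing import Iterable, Tuple

Event = Tuple[str, int]  # (customer_id, purchase_amount), a hypothetical event shape


def build_batch_view(events: Iterable[Event]) -> Counter:
    """Batch component: recompute totals from the full history (e.g. nightly ETL)."""
    view = Counter()
    for customer, amount in events:
        view[customer] += amount
    return view


class SpeedLayer:
    """Stream component: tracks events that arrived after the last batch run."""

    def __init__(self) -> None:
        self.view = Counter()

    def on_event(self, event: Event) -> None:
        customer, amount = event
        self.view[customer] += amount

    def reset(self) -> None:
        """Called once a fresh batch view covers these events."""
        self.view.clear()


def query(customer: str, batch_view: Counter, speed: SpeedLayer) -> int:
    """Serving step: merge the batch view with the real-time view."""
    return batch_view[customer] + speed.view[customer]


historical = [("alice", 20), ("bob", 5), ("alice", 10)]
batch_view = build_batch_view(historical)

speed = SpeedLayer()
speed.on_event(("alice", 7))  # arrives after the nightly batch run

total_alice = query("alice", batch_view, speed)
total_bob = query("bob", batch_view, speed)
print(total_alice, total_bob)
```

Note that the same aggregation logic appears twice, once in `build_batch_view` and once in `SpeedLayer.on_event`. That duplication is a small-scale picture of the maintenance cost of running two systems.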
Kappa Architecture Overview
The biggest advantage of a Kappa architecture is that it simplifies the Lambda architecture by making streaming services your only source of data. This reduces the number of services and the amount of code your organization has to maintain. Treating every data point in your organization as a streaming event also gives you the ability to “time travel” to any point and see the state of all data in your organization at that moment. One downside of Kappa is the need to re-process events when errors occur; however, access to affordable, elastic compute makes this a minor issue.
Kappa architecture example
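The “time travel” and re-processing ideas can be illustrated with a minimal sketch in which an in-memory list stands in for a durable event log. The `Event` fields, the sample data, and the `replay` helper are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

State = Dict[str, int]


@dataclass(frozen=True)
class Event:
    offset: int  # position in the log
    sku: str
    delta: int   # change in stock; negative for a sale


# The append-only log is the single source of truth (hypothetical data).
LOG: List[Event] = [
    Event(0, "shirt", 10),
    Event(1, "shirt", -3),
    Event(2, "hat", 5),
    Event(3, "shirt", -2),
]


def apply_event(state: State, event: Event) -> State:
    """Stream processor logic: fold one event into the current state."""
    new_state = dict(state)
    new_state[event.sku] = new_state.get(event.sku, 0) + event.delta
    return new_state


def replay(
    log: List[Event],
    processor: Callable[[State, Event], State],
    up_to_offset: Optional[int] = None,
) -> State:
    """Rebuild state by replaying the log from the start, optionally stopping early."""
    state: State = {}
    for event in log:
        if up_to_offset is not None and event.offset > up_to_offset:
            break
        state = processor(state, event)
    return state


current = replay(LOG, apply_event)                   # state after every event
as_of_1 = replay(LOG, apply_event, up_to_offset=1)   # "time travel" to offset 1
print(current, as_of_1)
```

Re-processing after an error follows the same path: deploy a corrected processor function and call `replay` again from the start of the log, which is where the elastic compute mentioned above comes in.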
Choosing the right modern data architecture is an important step in crafting your organization’s data strategy. This involves carefully assessing your organization’s current-state architecture and planning for maximum flexibility to best serve the consumers of this data. Both Kappa and Lambda architectures provide a strong foundation for building a broader data-oriented business.