In many organizations, data is ubiquitous and growing. Data can help companies improve customer experience, optimize sales pipelines, and measure marketing performance, among many other applications. The potential of data is boundless, but realizing it ultimately depends on providing the right data, at the right time, and in the right manner. Attaining this goal becomes more challenging as data usage increases, which can put your data operations at risk.
A modern data architecture acts as a framework to help alleviate these issues and expand your data capabilities. When the architecture is designed with scalability, flexibility, and availability in mind, your data can reach its full potential.
Two leaders in Credera’s Data & Analytics Practice, Gilbert Sharp and Phil Shon, shared their guidance on how a modern data architecture can help transform how companies leverage their data and run their business.
Gilbert Sharp – Principal Architect, Data & Analytics
Gilbert Sharp is a principal architect in the Data & Analytics Practice at Credera. Gilbert has more than 30 years of experience specializing in strategy, advisory, data modeling, and delivery roles with a focus on data architecture and engineering.
Phil Shon – Senior Architect, Data & Analytics
Phil is a senior architect in the Data & Analytics Practice at Credera with interests and experience in distributed data systems and data management. He enjoys spending time with his family and trying out new and interesting cooking techniques.
Five Principles of Modern Data Architecture
While each organization will have a different definition and implementation of a modern data architecture, most are rooted in the same principles. Ensuring each of these principles is met will lead to the best outcomes for using your data.
1. Solve a Business Problem
“Have a sponsoring business-side consumer with a business case,” Shon said. The most important principle of developing a modern data architecture is to understand what you’re building, and why you’re building it.
“Technology in itself is not a deliverable,” added Shon. Every modern data architecture will be unique, just as every problem it is meant to solve is unique. Defining the business use case will lead the organization to understand exactly what needs to be done and how to define success. Only then should you start to choose the tools for the job. Credera has experience working closely with our clients, utilizing design sprints, and refining data strategy to focus on the business problems at hand and how technology can help.
2. Scale Well
The demand for compute power and storage space on your data architecture will be elastic over time. Whether trends appear seasonally, on a weekly basis, or even throughout the course of a day, there will be times when more stress is placed on storing or processing data.
Data demand will increase over time as your organization uses more and more data, and it can eventually outpace your current architecture’s capacity. This manifests as higher query loads, slower performance, and more storage used. The natural solution is to purchase more hardware to increase capacity, but doing so always relies on what is at best an estimate of the needed capacity.
As demonstrated in the diagram below, excess capacity reflects an over-provision of resources and will result in higher hardware costs than were needed at the time. A capacity shortfall means you cannot serve all requests for data adequately, resulting in missed insights and lost revenue.
“The ability to scale up, enabling more resource-intensive queries, or alternatively scale out, to be able to run many queries at once, is needed to serve any modern architecture you hope to build out,” said Sharp. Cloud-based architectures can add or release capacity as demand changes. This is an absolute must to mitigate the risk of either overspending on hardware capacity or being unable to satisfy data operations.
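The trade-off between excess capacity and capacity shortfall described above can be illustrated with a little arithmetic. The following is a minimal sketch, not a sizing tool: the hourly demand figures, the "capacity units," and the scaling step are all made-up assumptions chosen only to show how a fixed capacity estimate compares with capacity that follows demand.

```python
# Hypothetical hourly demand in arbitrary "capacity units" (illustrative only).
hourly_demand = [40, 35, 30, 45, 80, 120, 150, 110, 70, 50]


def fixed_capacity_report(demand, capacity):
    """Return (excess, shortfall) in unit-hours for one fixed capacity level."""
    # Excess: capacity that was paid for but not used in a given hour.
    excess = sum(max(capacity - d, 0) for d in demand)
    # Shortfall: demand that could not be served in a given hour.
    shortfall = sum(max(d - capacity, 0) for d in demand)
    return excess, shortfall


def elastic_capacity_report(demand, step=10):
    """Return (excess, shortfall) when capacity tracks demand.

    Capacity is assumed to be provisioned in increments of `step`,
    rounded up, so there is never a shortfall and little excess.
    """
    provisioned = [-(-d // step) * step for d in demand]  # ceiling to next step
    excess = sum(p - d for p, d in zip(provisioned, demand))
    return excess, 0


print("fixed at 100:", fixed_capacity_report(hourly_demand, 100))
print("fixed at 150:", fixed_capacity_report(hourly_demand, 150))
print("elastic:     ", elastic_capacity_report(hourly_demand))
```

With these sample numbers, sizing for the average leaves peak demand unserved, sizing for the peak leaves most capacity idle, and elastic provisioning avoids both. Real workloads will also incur scaling delays and minimum billing increments that this sketch ignores.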
3. Keep Data Accessible
Making your data as available as possible is paramount, because the purpose of any data platform is to serve data to others. This means enabling your business to operate as needed and letting teams use data when required, with few restrictions on how they access it.
“Being able to get well-structured data to others easily will save time through the entire organization,” said Shon. Using data efficiently is what returns value and solves business problems. Many of Credera’s modern data architecture implementations have included reporting overhauls with business intelligence applications, the development of API endpoints, and data feeds for machine learning capabilities.
4. Be Flexible
It’s critical for your data architecture to ingest many different forms of data. New data sources may be in any number of formats: structured data that fits well in a relational database, like CSV files; semi-structured data that can fit into a relational table with some work, like XML files; and unstructured data that has no predefined data model, like PDFs, emails, or logs.
“You need the ability to natively support multiple data types, not just your standard characters and numbers,” said Sharp. “More and more use cases will require storing and querying JSON and XML data.” He learned this while working with a client to modernize their customer service process. This client, with most customer interactions and sales coming online, would commonly send surveys to customers asking about their experience with the site. When customers responded, sentiment analysis could quickly be run on the responses, which would be returned as JSON files. This capability enabled the company to strengthen customer relationships by better tailoring the online experience and recovering from bad experiences quickly.
While almost all databases will provide support for well-structured data, it’s important to find the right set of tools to handle semi-structured and unstructured data if needed. Modern data warehouses, like Google BigQuery, can accept semi-structured data natively, giving full query capabilities. A different toolset may be necessary for unstructured data, because it cannot be queried directly from a database. A properly structured data lake will allow for the storage and accessibility of images, text documents, and other unstructured data.
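To make the idea of semi-structured data "fitting into a relational table with some work" more concrete, here is a minimal sketch that flattens nested JSON survey results into flat rows. The field names (customer_id, sentiment, label, score, tags) and the 0.8 threshold are hypothetical and chosen for illustration; they are not the schema or logic from the client engagement described above.

```python
import json

# Hypothetical survey responses with sentiment results attached.
# The field names are assumptions for illustration only.
raw_responses = [
    '{"customer_id": 1, "sentiment": {"label": "negative", "score": 0.91}, "tags": ["checkout"]}',
    '{"customer_id": 2, "sentiment": {"label": "positive", "score": 0.78}}',
]


def flatten(record, prefix=""):
    """Flatten nested dicts into one level, joining keys with dots."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            # Lists and scalars are kept as-is; optional fields may be absent.
            flat[name] = value
    return flat


rows = [flatten(json.loads(line)) for line in raw_responses]

# Customers to follow up with: confidently negative responses.
needs_follow_up = [
    row["customer_id"]
    for row in rows
    if row["sentiment.label"] == "negative" and row["sentiment.score"] >= 0.8
]

print(rows)
print("follow up with:", needs_follow_up)
```

Note that the second record has no tags field. Optional and nested fields like this are what make the data semi-structured, and a warehouse with native JSON support can query them without a flattening step of this kind.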
5. Build for the Cloud
Data architecture should be “cloud-native.” Being cloud-native does not refer to the location of the architecture, but rather the design principles when developing it. Serverless, managed services position your platform to fully leverage the scalability the cloud offers, while also providing significant cost savings.
“Many cloud-based tools act almost as a pay-as-you-go service, only charging for the time actively being used,” said Sharp. “This allows for splitting budgets between teams and keeping costs much lower than keeping an environment always on.” Savings depend on workload and cloud provider, but they can reach up to 50% on compute costs when using serverless tools.
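The pay-as-you-go comparison can be sketched as simple arithmetic. Every number below is a made-up assumption (the hourly rates, the on-demand price premium, the team names, and their active hours), so the resulting percentage only shows how such a comparison might be set up. It is not a benchmark for any provider.

```python
HOURS_PER_MONTH = 730  # approximate average number of hours in a month

# Hypothetical prices: assume on-demand compute costs more per hour
# than an equivalent always-on environment.
ALWAYS_ON_RATE = 2.00   # cost per hour, billed for every hour of the month
ON_DEMAND_RATE = 3.00   # cost per hour, billed only while queries are running

# Hypothetical active compute hours per team for one month.
active_hours = {"marketing": 100, "sales": 120, "finance": 60}

# Always-on: one shared bill regardless of usage.
always_on_total = ALWAYS_ON_RATE * HOURS_PER_MONTH

# Pay-as-you-go: each team is charged only for the hours it actually used,
# which also makes it straightforward to split the budget by team.
team_costs = {team: ON_DEMAND_RATE * hours for team, hours in active_hours.items()}
on_demand_total = sum(team_costs.values())

savings_pct = 100 * (1 - on_demand_total / always_on_total)

for team, cost in team_costs.items():
    print(f"{team}: {cost:.2f}")
print(f"always-on: {always_on_total:.2f}, pay-as-you-go: {on_demand_total:.2f}")
print(f"savings: {savings_pct:.1f}%")
```

The comparison cuts both ways: if teams kept compute busy for most of the month, the assumed per-hour premium would make the always-on environment cheaper, which is why the savings depend so heavily on workload.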
As with all tasks, you need the right tool for the job, and oftentimes the toolset will be different when your platform is fully in the cloud. This includes several aspects of the data architecture pipelines, like serverless cloud functionality (AWS Lambda), managed Spark and Kafka-style messaging services (Cloud Dataproc and Cloud Pub/Sub), or the ability to separate compute from storage in data warehouses (Snowflake, BigQuery, Azure Synapse).
Putting the Principles into Practice
A modern data architecture will help your company fully leverage the most valuable data available and help answer your business questions. At Credera, we take a proven approach to understanding and unlocking organizations’ data to achieve better business outcomes and user experiences. To learn more, reach out to us at email@example.com.