Driving Growth with Data Mesh Architectures

Why read this article?

If you are perfectly happy with your company’s data analytics capabilities and you’re using data to effectively drive business decisions, you can stop reading now.

Company A: Sophisticated, but reaching the limits of current technology

Company A has multiple domain-specific applications streaming data into an ETL pipeline and then into a data lake. A team of highly skilled data engineers analyzes the data and generates reports for various internal consumers. The capabilities of Company A’s data team and data lake infrastructure are very impressive, but their data analytics system is becoming a victim of its own success. Their useful reports simply spur demand for even more reports. On one hand, more domains want to pump data into the lake; on the other, the demand for more variety and more reports keeps growing. The data analytics system cannot scale; lead time for new reports grows, and report quality declines. Shadow data analytics takes hold: data consumers who cannot wait for the data team create their own reports. These consumers get their data from other sources, causing a security and governance nightmare. In this case, a data mesh can be a low-cost way of siphoning off the more routine reporting tasks from the data lake, reserving the lake and the efforts of the highly skilled data engineers for the more analytic work they are meant for.

Company B: Data silos pose barrier to growth

Company B has a collection of domain-specific applications for HR, Legal, Sales, Shipping, and other departments. Each application has its own data store built on one of a number of available technologies: relational, NoSQL, and graph databases, flat files, and event streams, stored both on-premises and in a public cloud. While each domain-specific application adequately reports on its own data, Company B struggles to generate actionable insights by combining data across domains. For Company B, a data mesh is well suited to aggregating the data products generated by each domain in ways that provide insight into the business as a whole.

Company C: Small but growing rapidly

Company C is not yet using on-demand reporting to drive business decisions. Reports might be generated by hand at a weekly or longer cadence. This approach can work in the early phases of growth but the demand for actionable data will soon grow faster than the company’s ability to generate it. In this case, building a simple data mesh is an inexpensive way to enter the realm of on-demand analytics and reporting. Company C can get the immediate benefits of a data mesh and then let it grow and adapt in response to future demands. Best of all, the initial investment in a data mesh paves the way for more substantial investments in traditional data architectures if that becomes desirable.

What is a Data Mesh?

A data mesh is not a product. You cannot download data mesh software. While there is a common set of tooling used to build data meshes, there is no one-click install.

Data product combining data from events and serverless function.

Data Products and Data Nodes

Advantages of a Data Mesh

Data Meshes Scale Better than Traditional Data Pipelines or Data Lakes/Warehouses

The anti-patterns described for the companies above are typical. They are not the result of bad engineering or insufficient budgets; they are not even a drawback of traditional data architectures. We believe these anti-patterns arise from an attempt to utilize traditional data architectures to meet all the varied demands for data and analytics that come from even small businesses.

Stage One: This configuration is often successful with a small number of internal providers and consumers.
Stage Two: Both publishers and subscribers grow. Workload on small data team increases. The architectural and organizational model does not scale.
Stage Three: More providers and subscribers overwhelm pipelines. Shadow Data Analytics takes hold.

Data Mesh Encourages Domain Ownership of Data

A more traditional approach to data analytics might attempt to combine data from multiple sources into a single monolithic data lake or data warehouse. In a data mesh, domain experts own the data and package it in ways that maintain the data’s integrity and fitness for purpose.

Data Mesh Limits Coupling of Resources and Promotes Maximum Flexibility

Monolithic ETL and Data Lake Architecture with many Publishers and Subscribers

Data Mesh Facilitates Creation of Data Products through Domain Ownership of Data

In data mesh parlance, a data product is a node or portion of a node on the data mesh. It is the code and infrastructure required to deliver data to a consumer as well as the data itself. This is the smallest deliverable unit a data mesh can provide. Because the data product is created by domain experts who own the data, the quality of the product tends to be much higher than data provided by other architectures.

Meet Business Needs by Combining Data Products in Arbitrarily Complex Ways

Distributed data mesh nodes can call one another just like microservices call one another, and together they can generate and collate data products from multiple sources to deliver on-demand actionable reporting. A data mesh can even be used to increase observability of a company’s activities — a simple use case would be ordering more inventory depending on an analysis of sales patterns over the last few days. Though this is not a typical use of a mesh, it shows the flexibility of the solution.
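The inventory use case above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `sales_product` and `inventory_product` are hypothetical stand-ins for calls to two data products, the figures are made up, and the reorder rule (cover a few days of average sales) is an assumption rather than a recommended policy.

```python
import math
from statistics import mean

def sales_product(sku: str) -> list[int]:
    """Hypothetical Sales data product: units sold per day, most recent last."""
    return {"widget": [12, 15, 18, 20], "gadget": [3, 2, 4, 3]}[sku]

def inventory_product(sku: str) -> int:
    """Hypothetical Inventory data product: units currently in stock."""
    return {"widget": 30, "gadget": 40}[sku]

def reorder_quantity(sku: str, cover_days: int = 3) -> int:
    """Combine both data products: order enough stock to cover
    `cover_days` of average recent sales (an assumed rule)."""
    needed = mean(sales_product(sku)) * cover_days
    shortfall = needed - inventory_product(sku)
    return max(0, math.ceil(shortfall))
```

In a real mesh, each stand-in function would be a request to the owning domain's data product rather than a local lookup.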

Scale Each Data Node as Required

Some nodes on a data mesh may provide important data products but are accessed infrequently. Other nodes may provide access to event streams and be accessed constantly. Each data node needs to scale independently in the same way any microservice in a microservice architecture needs to scale independently of other microservices. Independent scalability of application components is a core tenet of cloud-native applications (see https://12factor.net/).

Data Mesh Governance is Implemented at the Data Product Level

In a data mesh, governance is federated within each data product. It’s best to have the most knowledgeable team responsible for implementing governance; this reduces centralized bottlenecks like data review meetings. Although some bottlenecks can be avoided, be aware that in most cases federated governance places an extra demand on the domain team.

Data Mesh is Built Using Industry-Standard Tooling

Data meshes are built using microservice technologies, patterns, and DevOps pipelines that have been well known and heavily used in industry for nearly ten years. A data mesh employs tools and principles commonly used in microservice architectures, such as containers, Kubernetes, and service meshes (Istio, Consul, Linkerd), along with zero-trust security measures such as continuous verification, identity-based segmentation, the principle of least privilege, and automated context collection and response.

Example data mesh using Kubernetes

Direct communication between Domain Experts and Data Consumers

Once parts of a business enjoy the benefit of data analytics, other parts will want to join in. Demand for new and more involved reports and analytics grows. Just as a small team of data engineers cannot understand all the sources of data in a company, they also cannot completely understand how the data is used. The demand for new and varied forms of data expands beyond the data team’s capability to understand the nuances of the data they provide to consumers. Errors in understanding data lead to errors in reports, delays in generating reports, or both.

Start Small and Grow Your Data Mesh According to Demand

A company can begin with a small data mesh of just a few nodes and see value almost immediately. Though a data mesh can be thought of as microservices for data, the reality is a little more complex than that. With the principles described in this article, you’ll be able to start your mesh small and let internal demand drive its growth.

Data Mesh adds business domains and data products as it shows value.

Common Challenges When Adopting a Data Mesh

Your company once prided itself as a place where data drives innovation, but as the company grows the reality seems different. Many symptoms of data-related problems loom on the horizon:

  • organizational silos and lack of data sharing
  • no shared understanding of what data means outside the context of its business domain
  • incompatible technologies prevent gaining actionable insights
  • data is increasingly difficult to push through ETL pipelines
  • a growing demand for ad hoc queries and shadow data analytics
Decentralized Data mesh topology showing relationship between data products and business domains.
A more traditional monolithic data lake architecture.

Follow DATSIS Principles

DATSIS stands for Discoverable, Addressable, Trustworthy, Self-describing, Interoperable and Secure. Failure to implement any part of DATSIS could doom your data mesh.

  • Discoverable: consumers are able to research and identify data products produced by different domains. This is typically done with a centralized tool like a data catalog.
  • Addressable: like microservices, data products are accessible via a unique address and a standard protocol (REST, AMQP, possibly SQL).
  • Trustworthy: domain owners provide high-quality data products that are useful and accurate.
  • Self-describing: data product metadata provides enough information that consumers do not need to query domain experts.
  • Interoperable: data products must be consumable by other data products.
  • Secure: access to data products is automatically regulated through access policies and security standards. This security is built into each data product.
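One way to make these principles concrete is to give every data product a small descriptor and check it before publication. The sketch below is an assumption-laden illustration, not a standard: the `DataProductDescriptor` fields and the `datsis_gaps` helper are hypothetical names. Discoverability and interoperability depend on the catalog and on other products, so they cannot be checked from a single descriptor and are left out.

```python
from dataclasses import dataclass, field

@dataclass
class DataProductDescriptor:
    """Hypothetical metadata a domain team publishes with a data product."""
    name: str
    domain: str
    address: str                  # Addressable: unique address
    protocol: str                 # Addressable: standard protocol
    description: str = ""         # Self-describing
    schema: dict[str, str] = field(default_factory=dict)  # Self-describing
    owner: str = ""               # Trustworthy: accountable domain owner
    access_policy: str = ""       # Secure: policy enforced by the product

def datsis_gaps(d: DataProductDescriptor) -> list[str]:
    """Return the DATSIS principles this descriptor does not yet satisfy."""
    gaps = []
    if not d.address or d.protocol not in {"REST", "AMQP", "SQL"}:
        gaps.append("addressable")
    if not d.owner:
        gaps.append("trustworthy")
    if not d.description or not d.schema:
        gaps.append("self-describing")
    if not d.access_policy:
        gaps.append("secure")
    return gaps
```

A check like this could run in the release pipeline so that a product with gaps is never published.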

Automatically Update Data Catalogs with every Release

Data product discoverability is part of DATSIS and a key element of data meshes. Most data meshes employ a data catalog or other ad hoc mechanisms to make their data products discoverable. A data catalog can be used as an inventory of data products in a data mesh, most often using metadata to help organizations support data discovery and governance.
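As a rough sketch of what updating the catalog with every release could look like, the function below upserts one catalog entry per data product at release time. The catalog here is a plain in-memory dictionary and `register_release` is a hypothetical helper; a real deployment would call the API of whichever catalog tool it uses, which is not shown.

```python
from datetime import datetime, timezone

def register_release(catalog: dict, name: str, version: str, metadata: dict) -> dict:
    """Create or overwrite the catalog entry for a data product.

    Intended to be called from the release pipeline so the catalog
    never lags behind what is actually deployed.
    """
    entry = {
        "name": name,
        "version": version,
        "metadata": dict(metadata),  # copy so later edits do not leak in
        "registered_at": datetime.now(timezone.utc).isoformat(),
    }
    catalog[name] = entry
    return entry
```

Keying the catalog by product name means a new release replaces the old entry instead of adding a stale duplicate.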

A simple data mesh using Azure Purview as a data catalog.

Invest in Automated Testing

A data mesh is by definition a decentralized collection of data. An important issue is how best to ensure consistent quality across data products owned by different teams that may not even be aware of one another.

  • Every domain team is responsible for the quality of their own data. The type of testing involved depends on the nature of that data and is decided upon by the team.
  • Take advantage of the fact that the data mesh is read-only. Tests can run not only against mock data but often repeatedly against live data as well. Take advantage of time-based reporting too: historical data is immutable, so testing against it is easy and detects problems such as changed data structures.
  • Run data quality tests against mock and live data. These tests can run on developer laptops, in CI/CD pipelines, or against live data accessed through specific data products or an orchestration layer. Typical data quality tests verify that a value falls between 0 and 60, that an alphanumeric value follows a specific format, or that the start date of a project is at or before the end date. Test-driven design is another approach that can be used successfully in a data mesh.
  • Include business domain subject matter experts (SMEs) when designing your tests.
  • Include data consumers when designing your tests. Data meshes should be driven by data consumers and it is important to make sure your data products meet their needs. Otherwise, why build the mesh in the first place?
  • Use automated test frameworks that specialize in API testing. We recommend the Karate framework (https://github.com/intuit/karate). Other useful tools are:
    - SoapUI: https://www.soapui.org/
    - Postman: https://www.getpostman.com/
    - Apigee: https://cloud.google.com/apigee/
    - Rest-Assured: http://rest-assured.io/
    - Swagger: https://swagger.io/
    - Fiddler: https://www.telerik.com/fiddler
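The three kinds of data quality checks mentioned above (a numeric range, a value format, and a date ordering) can be sketched as a single validation function. The field names and the `ABC-1234` code format are assumptions made for illustration; each domain team would substitute its own rules.

```python
import re
from datetime import date

def check_record(record: dict) -> list[str]:
    """Return a list of data quality problems found in one record."""
    errors = []
    # Range check: value must fall between 0 and 60 inclusive.
    if not 0 <= record["minutes"] <= 60:
        errors.append("minutes out of range 0-60")
    # Format check: hypothetical code format, three letters, dash, four digits.
    if not re.fullmatch(r"[A-Z]{3}-\d{4}", record["project_code"]):
        errors.append("project_code has wrong format")
    # Ordering check: start date must be at or before end date.
    if record["start_date"] > record["end_date"]:
        errors.append("start_date is after end_date")
    return errors
```

Because the function only reads the record, the same check can run against mock data in a CI/CD pipeline and against live, read-only data.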

To someone with a hammer, everything looks like a nail

When people become very proficient with one set of tools, they tend to use those tools even in situations where they are not appropriate. Many companies struggle with scaling data analytics because they try to use their data infrastructure to solve every need for information. An architecture where ETL pipelines pump data into a data lake is in many ways monolithic and has a finite capacity to deliver value. It simply does not scale well. A data lake excels at ad hoc queries and computationally intensive operations, but the centralized nature of the lake can make it hard to include pipelines from every domain in the company.

Data Mesh Requires Extra Work by Domain Teams

In a data mesh, domain teams maintain ownership of their data and create data products that expose that data to the rest of the company. If an engineering team handles the data mesh work, their capacity for other engineering work will decrease — at least at the beginning.

Tight Coupling Between Data Products

The design influence of microservices on a data mesh is apparent in its flexible nature. A data mesh can expand and contract to match your data topology as it grows in some areas and shrinks in others. Different technologies like streaming can be used where needed and data products can scale up and down to meet demand.

Shadow Data Analytics and Mandates

Stage One: This configuration is often successful with a small number of internal providers and consumers.
Stage Two: Both publishers and subscribers grow. Workload on the small data team increases. The architectural and organizational model does not scale.
Stage Three: More providers and subscribers overwhelm pipelines. Consumers bypass approved pipelines and Shadow Data Analytics takes hold.
  • Follow DATSIS principles and the independently deployable rule as described above
  • Enforce SLAs for minimum performance requirements for all data products
  • Employ a transparent process to request, schedule and implement new data products

Accurately Evolve Data Products

Data evolves as a company evolves, often in unpredictable ways. Changes often fall into two types:

  1. Changes in the domain structure of your company
  2. Changes in the structure and nature of the data itself within each domain

Accurately Version Data Products

Data products will need to be versioned as data changes at your company, and users of each data product (including maintainers of dashboards) must be notified about changes, both breaking and non-breaking. Consumed data products need to be managed like resources in Helm charts or Maven artifacts in Artifactory.
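A minimal sketch of how breaking and non-breaking changes might be told apart, assuming data products follow a `MAJOR.MINOR.PATCH` scheme in which only a major bump is breaking. That scheme is an assumption of this example, not something the mesh itself mandates, and `change_notice` is a hypothetical helper.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Split 'MAJOR.MINOR.PATCH' into integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch

def is_breaking(old: str, new: str) -> bool:
    """Under the assumed scheme, a higher major version is breaking."""
    return parse_version(new)[0] > parse_version(old)[0]

def change_notice(product: str, old: str, new: str) -> str:
    """Build the message sent to consumers of a data product."""
    kind = "BREAKING" if is_breaking(old, new) else "non-breaking"
    return f"{product}: {old} -> {new} ({kind} change)"
```

Consumers such as dashboard maintainers could pin a major version and treat a BREAKING notice as the signal to review their queries.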

Sync vs Async vs Pre-assembled Results

If a data mesh uses synchronous REST calls to package the output from a few data products, chances are the performance will be acceptable. But if the data mesh is used for more in-depth analytics combining a larger number of data products (such as the analysis typically done by a data lake), it is easy to see how synchronous communication might become a performance issue.

Data mesh using cached data
  • There are no ordering dependencies between the datasets you construct. In other words, if you concurrently build five datasets, the content of Dataset #2 cannot be dependent on the content of Dataset #1.
  • Most likely the caller will not receive an immediate response to their request. Instead, some sort of polling technique returns successfully only when all datasets are built and combined. If the dataset is very large, it may be stored somewhere and a link to the dataset provided to the user. This implies appropriate infrastructure and security is in place.
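The asynchronous pattern described in the two points above can be sketched with Python's `asyncio`: independent datasets are built concurrently, and the caller polls a job until the combined result is ready. `build_dataset` is a hypothetical stand-in for a call to a data product, and the short sleeps simulate latency.

```python
import asyncio

async def build_dataset(name: str, delay: float) -> dict:
    """Hypothetical stand-in for fetching one data product."""
    await asyncio.sleep(delay)  # simulated network and query time
    return {"name": name, "rows": len(name)}

class ReportJob:
    """Builds independent datasets concurrently; callers poll status()."""

    def __init__(self, names: list[str]):
        self.names = names
        self.result = None
        self._task = None

    def start(self) -> None:
        # Keep a reference so the running task is not garbage collected.
        self._task = asyncio.create_task(self._run())

    async def _run(self) -> None:
        # No dataset depends on another, so they can all run at once.
        parts = await asyncio.gather(*(build_dataset(n, 0.01) for n in self.names))
        self.result = {p["name"]: p["rows"] for p in parts}

    def status(self) -> str:
        return "done" if self.result is not None else "pending"

async def main() -> tuple[str, dict]:
    job = ReportJob(["sales", "shipping", "hr"])
    job.start()
    first_status = job.status()  # the caller gets no immediate result
    while job.status() != "done":
        await asyncio.sleep(0.005)  # polling interval
    return first_status, job.result
```

Running `asyncio.run(main())` drives the example. In production, the polling would typically happen over HTTP against a job endpoint, and a very large result would be stored and returned as a link instead of inline.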

Use the Right Tool for the Job

A data lake or data warehouse architecture can be the right tool for the job. It is just not the right tool for every job, and neither is a data mesh. In fact, it’s easy to see scenarios where a data mesh and a data lake coexist and make each other stronger. Data lakes or warehouses often require an investment of hundreds of thousands of dollars and the hiring of experienced data engineers before they show any return on investment, but they do have a place in today’s data architectures.
