Day 19/100
Data Mesh
Let's first understand what a data mesh is not:
- A data mesh is not a concrete tool or product, like a database, warehouse, or analytics engine. It has more to do with data architecture and governance.
- A data mesh is also not a specific set of architecture or governance rules. There is no definitive checklist that you can go through to see if something is or is not a data mesh. Rather, data mesh is a conceptual framework.
Background
- There's an ever-increasing need for decentralized data ownership and distributed data across an enterprise, united by central self-serve infrastructure.
- The resulting data products are made accessible across functional teams by adhering to high-quality standards, governance, and metadata.
The old way of doing it
- Usually an enterprise would store all its data in a monolithic data lake.
- In that lake, all the different types of data usage and all the paths from sources to client destinations are mixed together.
- The various data teams are each responsible for different pieces, so they end up siloed.
- The architectural quanta (units) are steps in the data lifecycle, for example: “ingest,” “process,” “serve.”
The Data Mesh way
- Data infrastructure is centralised and the organisation adheres to global standards. But that is the only thing that’s centralised.
- Architecture is domain-driven. Each team is responsible for a specific domain only.
- The respective teams are responsible for the entire data lifecycle for that domain. They host and serve the data themselves.
- The important quanta are now these business domains.
- The data mesh approach is first and foremost a change in perspective about the best way to conceptually break down enterprise data platforms into useful units.
- Data has now become ubiquitous in the enterprise at all levels.
- The idea is to manage data in a way that mirrors the way we manage businesses, teams, and people.
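To make the change of "quanta" concrete, here is a minimal Python sketch. It takes the same set of work items and slices it two ways: by lifecycle step (the monolithic lake view) and by business domain (the data mesh view). The domain names `sales` and `customer` and the helper `group_by` are made up for illustration; only the steps "ingest", "process", "serve" come from the notes above.

```python
from collections import defaultdict

# Hypothetical work items: (business domain, lifecycle step).
WORK_ITEMS = [
    ("sales", "ingest"), ("sales", "process"), ("sales", "serve"),
    ("customer", "ingest"), ("customer", "process"), ("customer", "serve"),
]


def group_by(items, key_index):
    """Group (domain, step) pairs by one element, collecting the other."""
    groups = defaultdict(list)
    for item in items:
        groups[item[key_index]].append(item[1 - key_index])
    return dict(groups)


# Old way: the unit of ownership is a lifecycle step, spanning every domain.
by_step = group_by(WORK_ITEMS, 1)

# Data mesh way: the unit of ownership is a domain, spanning every step.
by_domain = group_by(WORK_ITEMS, 0)

print(by_step)
print(by_domain)
```

The work itself does not change; only the axis along which ownership is cut does. In the second grouping each team holds the whole lifecycle for its own domain.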
There's no one-size-fits-all solution here. Your data strategy will be unique to you, even if it is inspired by others.
- Embrace flexibility and growth.
- Prioritize access. The most important thing is that everyone can access and use the data they need.
- Set the goal of creating usable, cross-functional data products.
High-level Data Mesh Plan
- Take stock of your current data stack, tooling, and most importantly, your people and team structure.
- With the goal in mind of central infrastructure and distributed products, evaluate your current organizational state. What would the data products be? Who should own each one? Who owns the self-serve infrastructure? Most importantly, how can you delegate new ownership without completely rearranging your teams?
- Which of your current tools are unusable in this new paradigm? Evaluate replacements, but don’t expect all changes to be one-to-one. Crowd-source from published use cases to get ideas.
- Decide on data product and metadata standards and document them meticulously before you actually do anything.
- Hold meetings, provide educational resources, and make sure you have buy-in from all stakeholders before you begin your migration.
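One way to "document" a data product and metadata standard is to write it down as code, so that drafts can be checked against it. The sketch below is only an illustration: the field names (`owner_team`, `refresh_schedule`, and so on) are assumptions, not a standard that data mesh prescribes, and your own list will differ.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DataProductMetadata:
    # Hypothetical standard: these field names are illustrative only.
    name: str
    domain: str
    owner_team: str
    description: str
    refresh_schedule: str


def missing_metadata(record: dict) -> list:
    """Return the standard fields that are absent or empty in a draft record."""
    return [f.name for f in fields(DataProductMetadata) if not record.get(f.name)]


# A draft submitted by a domain team, not yet complete.
draft = {"name": "orders_daily", "domain": "sales", "owner_team": "sales-data"}
print(missing_metadata(draft))  # ['description', 'refresh_schedule']
```

A check like this could run in the central self-serve infrastructure, so that each domain team keeps ownership of its product while the global standard is still enforced in one place.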