1. Introduction
This is May from the GLB Division Lakehouse Department.
We have posted the following related blogs for Data + AI Summit 2023.
- "Data + AI Summit 2023 special site"
- "Digital Twin"
- “[Will it be a hot keyword in 2023?] What is LLMOps?”
Today, I would like to introduce "Data Mesh Architecture".
2. Table of contents
- 1. Introduction
- 2. Table of contents
- 3. Data mesh in "Data + AI Summit 2023"
- 4. What is data mesh
- 5. Dimensional change of data mesh
- 6. Data mesh principle
- 7. Advantages of data mesh
- 8. Databricks function that realizes data mesh
- 9. APC Data + AI Summit 2023 Special Site
- 10. Conclusion
3. Data mesh in "Data + AI Summit 2023"
At "Data + AI Summit 2023", to be held in San Francisco, sessions on building a Databricks Lakehouse that realizes a data mesh are scheduled for June 28 and June 29.
4. What is data mesh
A data mesh is a socio-technical approach to sharing, accessing and managing analytical data within and across organizations in large-scale and complex environments.
The organizational theory (Team Topologies) related to data mesh is covered in the following blog post.
5. Dimensional change of data mesh
A data mesh represents an organizational and technological shift compared to previous methods of data management.
Reference: Data Mesh by Zhamak Dehghani, O'Reilly Media, Inc.
6. Data mesh principle
The data mesh architecture has four principles.
- Domain Ownership
- Data as a Product
- Self-Serve Data Platform
- Federated Computational Governance
6.1. Domain Ownership
Data ownership is distributed across business domains, to either the source of the data or its intended consumers. Each domain also independently manages the life cycle of the data that has been logically decomposed along business-domain lines.
6.2. Data as a Product
Data products are a logical extension of traditional product thinking and include information, knowledge, insight discovery, and intelligence. The purpose is not only to collect external data but also to develop more valuable products by embedding data into existing processes and systems (see the Jobs-To-Be-Done theory).
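To make the idea a little more concrete, a data product can be described by a small descriptor that names its owning domain and what consumers can expect from it. The following is only a minimal sketch: the `DataProduct` class, its fields, and the `is_publishable` rule are hypothetical names chosen for illustration, not part of Databricks or of the data mesh book.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProduct:
    """Hypothetical descriptor for a data product (illustrative only)."""

    name: str
    owner_domain: str                     # business domain accountable for the product
    description: str = ""                 # what consumers can expect from it
    output_tables: tuple[str, ...] = ()   # where consumers read the data

    def is_publishable(self) -> bool:
        # Assumed rule: a product needs an owner, a description,
        # and at least one output table before it is offered to consumers.
        return bool(self.owner_domain and self.description and self.output_tables)


orders = DataProduct(
    name="orders_daily",
    owner_domain="sales",
    description="Daily order totals per region",
    output_tables=("sales.gold.orders_daily",),
)
draft = DataProduct(name="clicks_raw", owner_domain="marketing")
```

Here `orders` would pass the assumed check, while `draft` would not, because it has no description or output table yet.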
6.3. Self-Serve Data Platform
A distributed data management platform enables cross-functional teams to share data and to manage the entire life cycle of data products, supported by a knowledge graph and data lineage.
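Data lineage can be pictured as a graph in which each data product points to the products it reads from. The sketch below assumes a hand-written dictionary (`LINEAGE`) and a hypothetical helper (`upstream_sources`); a real platform would collect this information automatically.

```python
from collections import deque

# Hypothetical lineage graph: each data product maps to the products it reads from.
LINEAGE = {
    "sales.orders_raw": [],
    "finance.fx_rates": [],
    "sales.orders_daily": ["sales.orders_raw"],
    "finance.revenue_report": ["sales.orders_daily", "finance.fx_rates"],
}


def upstream_sources(product: str, lineage: dict[str, list[str]]) -> set[str]:
    """Return every product that `product` depends on, directly or indirectly."""
    seen: set[str] = set()
    queue = deque(lineage.get(product, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited through another path
        seen.add(current)
        queue.extend(lineage.get(current, []))
    return seen
```

Calling `upstream_sources("finance.revenue_report", LINEAGE)` walks the graph breadth-first and collects both the direct inputs and the raw sources behind them, which is the kind of question a self-serve platform needs to answer when a source changes.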
6.4. Federated Computational Governance
Data governance standards (policies) are defined centrally, while local domain teams have the autonomy and resources to enforce those standards. This lets you manage risk and ensure data compliance and privacy across your organization.
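"Computational" governance means that policies are expressed as code that can be checked automatically. As a rough sketch, assume the central group publishes a list of policy functions and each domain team runs them against the metadata of its own data products. All names here (`CENTRAL_POLICIES`, `check_compliance`, the metadata keys) are hypothetical.

```python
from typing import Callable, Optional

# A policy inspects a product's metadata and returns an error message, or None if it passes.
Policy = Callable[[dict], Optional[str]]


def requires_owner(meta: dict) -> Optional[str]:
    return None if meta.get("owner") else "missing owner"


def pii_must_be_masked(meta: dict) -> Optional[str]:
    masked = meta.get("masked_columns", [])
    unmasked = [c for c in meta.get("pii_columns", []) if c not in masked]
    return f"unmasked PII columns: {unmasked}" if unmasked else None


# Defined once by the central governance group (hypothetical policy set).
CENTRAL_POLICIES: list[Policy] = [requires_owner, pii_must_be_masked]


def check_compliance(meta: dict, policies: list[Policy] = CENTRAL_POLICIES) -> list[str]:
    """Run by each domain team against its own data products."""
    return [msg for policy in policies if (msg := policy(meta)) is not None]


compliant = {"owner": "sales", "pii_columns": ["email"], "masked_columns": ["email"]}
violating = {"pii_columns": ["email"], "masked_columns": []}
```

With this split, the standards live in one place, but the check runs inside each domain's own pipeline, so `check_compliance(violating)` reports both the missing owner and the unmasked column without the central team touching the data.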
7. Advantages of data mesh
The data mesh architecture has advantages such as the following.
- It adapts as the company changes and grows in size
- Data from different systems can be collected, integrated, and analyzed simultaneously
- Domain teams can develop high-quality data products while maintaining complete control over their data
8. Databricks function that realizes data mesh
Many Databricks features help realize a data mesh.
- Flexible data sharing (Delta Sharing)
- Catalog and governance tools (Unity Catalog)
- Self-service data pipelines (Workflows, Delta Live Tables)
- Sharing and reuse between data science and machine learning teams (Feature Store)
- High-performance BI and SQL queries, and multiple copies of data products by data teams (Databricks SQL)
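As one small illustration of how these features relate to the principles above, Unity Catalog addresses tables with a three-level `catalog.schema.table` name and controls access with SQL `GRANT` statements. The sketch below assumes one catalog per domain, which is a design choice made for this example rather than something the sessions prescribe, and the helper `grant_select_statements` and all object names are hypothetical. It only builds the statements as strings; on Databricks they could then be executed, for example with `spark.sql`.

```python
def grant_select_statements(catalog: str, schema: str, tables: list[str], group: str) -> list[str]:
    """Build Unity Catalog GRANT statements giving `group` read access to each table."""

    def quote(identifier: str) -> str:
        # Backtick-quote identifiers and escape any embedded backticks.
        return "`" + identifier.replace("`", "``") + "`"

    # Reading a table also requires usage rights on its catalog and schema.
    statements = [
        f"GRANT USE CATALOG ON CATALOG {quote(catalog)} TO {quote(group)}",
        f"GRANT USE SCHEMA ON SCHEMA {quote(catalog)}.{quote(schema)} TO {quote(group)}",
    ]
    for table in tables:
        full_name = ".".join(quote(part) for part in (catalog, schema, table))
        statements.append(f"GRANT SELECT ON TABLE {full_name} TO {quote(group)}")
    return statements


# Example: the sales domain shares one gold table with a finance group.
stmts = grant_select_statements("sales", "gold", ["orders_daily"], "finance-analysts")
```

Generating the statements from a short description keeps access rules reviewable as code, which fits the self-serve and federated governance principles.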
9. APC Data + AI Summit 2023 Special Site
AP Communications, which has a partnership agreement with Databricks, plans to report on the keynotes and the latest updates from "Data + AI Summit 2023" directly from the venue, posting them one after another on our special site!
In the run-up to the event, we plan to post about the appeal and highlights of Databricks. We will also be collaborating with people from Databricks, so if you are even slightly interested, we would be glad to have you follow along for about a month.
10. Conclusion
Thank you for reading to the end. This time we introduced the data mesh architecture. Data mesh is not only a technical topic but also an organizational one, so if you are interested, please join the sessions!
We provide a wide range of support, from introducing a data analytics platform built on Databricks to helping you bring development in-house. If you are interested, please contact us.
We are also looking for people to work with us! We look forward to hearing from anyone who is interested in APC.
Translated by Johann