Neda Abolhassani & Teresa Tung | Accelerating Industry Data Integration
19m
A data supply chain is industry-specific, but many data prep tools are industry-agnostic. As part of this work, data engineers and domain experts apply their deep knowledge of how to transform raw data into a form that can address specific problems. In this way, the data supply chain is a domain like so many others to which AI is applied: it involves repeatedly making decisions informed by past experience.
Our work aims to capture and apply industry and domain experience, including our own, to accelerate industry data integration. To tackle this problem, we’re taking an approach that layers a Knowledge Graph on top of the Data Mesh to capture domain knowledge and to connect the different data domains.
Like many others, we’re excited about Zhamak Dehghani’s new Data Mesh architecture paradigm. Its distributed architecture follows many of the data integration strategies from the web, and includes mature rules around data products, quality, and governance.
This talk illustrates how some machine learning techniques can help in data profiling and mapping of siloed data sources for Enterprise Knowledge Graph construction. We present how this Knowledge Graph can be employed in a Data Mesh architecture for accelerating industry data integration for an Oil and Gas use case.
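
To make the idea of profiling and mapping siloed sources concrete, here is a minimal sketch in Python. It is not the method presented in the talk: it replaces any learned model with a simple heuristic (Jaccard similarity over column-name tokens and distinct values) and emits candidate Knowledge Graph edges as plain tuples. All names, the sample data, the `sameAs` label, the equal weighting, and the 0.4 threshold are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnProfile:
    """Hypothetical profile of one column in one siloed source."""
    source: str
    name: str
    dtype: str
    values: frozenset


def profile_column(source, name, raw_values):
    """Summarize a column: a coarse inferred type plus its distinct values."""
    values = [v for v in raw_values if v is not None]
    numeric = bool(values) and all(isinstance(v, (int, float)) for v in values)
    normalized = frozenset(str(v).strip().lower() for v in values)
    return ColumnProfile(source, name, "numeric" if numeric else "text", normalized)


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def name_tokens(name):
    """Split a column name such as 'well_id' into lowercase tokens."""
    return frozenset(name.lower().replace("-", "_").split("_"))


def match_score(left, right):
    """Heuristic stand-in for a learned matcher (assumed equal weights)."""
    if left.dtype != right.dtype:
        return 0.0
    name_sim = jaccard(name_tokens(left.name), name_tokens(right.name))
    value_sim = jaccard(left.values, right.values)
    return 0.5 * name_sim + 0.5 * value_sim


def propose_edges(left_cols, right_cols, threshold=0.4):
    """Return candidate (subject, predicate, object, score) edges."""
    edges = []
    for left in left_cols:
        for right in right_cols:
            score = match_score(left, right)
            if score >= threshold:
                edges.append((f"{left.source}.{left.name}", "sameAs",
                              f"{right.source}.{right.name}", round(score, 2)))
    return sorted(edges, key=lambda e: -e[3])


# Invented sample data for two siloed Oil and Gas sources.
drilling = [
    profile_column("drilling", "well_id", ["W-001", "W-002", "W-003"]),
    profile_column("drilling", "depth_m", [1200.5, 980.0]),
]
production = [
    profile_column("production", "well_identifier", ["w-001", "w-002", "w-004"]),
    profile_column("production", "oil_rate", [310, 295]),
]

edges = propose_edges(drilling, production)
for edge in edges:
    print(edge)
```

In a setup like this, each proposed edge would be a candidate for a domain expert to confirm before it is added to the Knowledge Graph that links data products across domains.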