Ben De Meester | PROV4ITDaTa: Flexible Knowledge Graph Generation Within Reach
KGC | The Complete Collection
•
16m
Personal Knowledge Graph generation is no longer a cumbersome technical endeavor. PROV4ITDaTa is an MIT open source platform to provide a smooth user experience for generating knowledge graphs from your online web services, such as Google, Flickr, and Imgur, into your personal data space. This brings your personal data back under your control, and as a graph, its true interlinking potential is unleashed. PROV4ITDaTa allows to configure and set up a web application where users can easily pick one or more web services to extract their data from, transform that data into best-practice knowledge graphs, and push those graphs to a personal data space, such as a Solid pod.
All heavy lifting is included in PROV4ITData: management of service authentication (e.g., OAuth 1.0/2.0 sessions), setting up the infrastructure to extract and transform your personal data from popular web services, directly loading those graphs into your personal data space, and generating a simple user interface. Developers only need to focus on providing custom connectors to more specialized web services, and configuring the data processing pipeline to generate knowledge graphs into any needed form or shape.
The data processing pipeline is based on RML.io, meaning that it is extensible to any data source, can integrate multiple data sources on the fly, includes data cleansing functions, supports any RDF graph structure and ontology, and its configuration is fully declarative. The pipelines are thus more maintainable and more transparent than hard-coded solutions such as the Data Transfer Project (DTP). Where existing platforms such as DTP provide one-to-one data transfer, PROV4ITDaTa enables extensible many-to-many data processing pipelines, beyond the web services and (personal) data spaces we currently provide. All these features can easily be included in the PROV4ITDaTa platform, and will be showcased during the presentation.
As a result, developers can more easily support custom web services. As users can generate personal knowledge graphs faster and easier, more knowledge graph applications can be built that rely on real-world data. With a click of a button, users can try out knowledge graph applications using their actual data from music streaming systems, fitness apps, address books, social media, etc.
Continuing this product, we are building a data processing workbench, where these different data processing pipeline configurations can be managed, scheduled, and orchestrated, giving companies more control, and allowing to upscale PROV4ITData more easily.
Introducing PROV4ITDaTa is Ben De Meester, a postdoctoral researcher at IDLab. In this talk he provides insight on the problems with current data that is messy and personal.Data is scattered across several service providers and is difficult to transfer over. The solution he provides is PROV4ITDaTa, an open source flexible tool for transparent and direct transfer of personal data to personal stores. He discusses mainly about PROV4ITDaTa explaining how it could knowledge scientists and allow users to easily transfer their data using PROV4ITDaTa. #knowledgegraphs #knowledgegraphconference #knowledgegraphtools #knowledgegraphandbigdataprocessing
Up Next in KGC | The Complete Collection
-
Bernhard Krabina | Semantic MediaWiki...
Semantic MediaWiki (SMW), which was introduced as early as in 2006, has since gone on to establish a vital community and is currently one of the few semantic wiki solutions still in existence. SMW is an extension of MediaWiki, the software used for Wikipedia and many other projects, resulting in ...
-
Branimir Rakic | OriginTrail: Decentr...
Knowledge graphs are powerful tools used by organizations to integrate their siloed data into useful, machine readable information for a wide range of purposes. The OriginTrail Decentralized Knowledge Graph (DKG) extends this approach to enable trusted knowledge exchange between multiple organiza...
-
Cedric Berger | Data Governance 4.0 A...
Driven by legacy paper-based approaches, the design, conduction and analysis of clinical studies requires the creation and transformation of many data in many different formats. This hinders the process and necessitates significant resources. Having metadata-driven transformation is not new, howe...