Antonin Delpeuch | Scaling & Maintaining OpenRefine
KGC 2021 Conference, Workshops and Tutorials
18m
OpenRefine is a data wrangling tool that celebrated its 10th birthday this year. Cleaning and importing data into knowledge graphs is its core use case, since it was originally designed to help populate Freebase. In this talk I want to give a broad overview of the latest developments in the tool and our efforts to consolidate it as a mature open source project. Please come along and tell us where you would like to see the tool in a few years!
Antonin Delpeuch describes how he helped develop OpenRefine into the open-source project it is today, and what it takes to scale and maintain it with the team from W3. In this video he presents OpenRefine, an accessible open-source data cleaning tool, and discusses features such as reconciliation and scalability. He also briefly covers the history of the tool, which was originally created for knowledge graphs, and describes how the original project grew into what it is today. #knowledgegraphs #knowledgegraphconference #knowledgegraphtools