A Distributed Vertex-Deduplication Framework for Large Graphs
May 11 | KGC 2023 • 26m
Our framework is built on a distributed Hadoop/MapReduce/HBase infrastructure and covers both “low-level” graph database operations and “higher-level” algorithmic aspects such as vertex-vertex similarity, graph clustering, and “robust” vertex ID stamping. We have been using the framework to de-duplicate the Goldman Sachs Knowledge Graph. In this talk, we will report a few experimental results from applying the framework to public datasets.
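
To make the three algorithmic steps named in the abstract concrete, here is a minimal single-machine sketch in Python. It is not the framework's implementation, which the abstract describes as distributed. Everything in it is an assumption made for illustration: Jaccard overlap of attribute tokens stands in for vertex-vertex similarity, union-find over above-threshold pairs stands in for graph clustering, and "robust" ID stamping is read as giving every vertex in a cluster the smallest original ID in that cluster, so the stamp does not depend on processing order. The names `jaccard`, `find`, `deduplicate`, and `threshold` are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two token collections (assumed similarity measure)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find(parent, x):
    """Return the root of x's cluster, halving paths along the way."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def deduplicate(vertices, threshold=0.5):
    """Map each vertex ID to a canonical (stamped) ID.

    vertices: dict of vertex ID -> iterable of attribute tokens.
    Two vertices are linked when their similarity is >= threshold;
    linked vertices end up in one cluster (connected component).
    """
    parent = {v: v for v in vertices}
    # All-pairs comparison is quadratic; acceptable only for a toy example.
    for u, v in combinations(sorted(vertices), 2):
        if jaccard(vertices[u], vertices[v]) >= threshold:
            ru, rv = find(parent, u), find(parent, v)
            if ru != rv:
                # Keep the smaller ID as root so the root is always the
                # minimum ID of its cluster, whatever the merge order.
                parent[max(ru, rv)] = min(ru, rv)
    return {v: find(parent, v) for v in vertices}


if __name__ == "__main__":
    sample = {
        "v1": ["goldman", "sachs", "ny"],
        "v2": ["goldman", "sachs", "nyc"],
        "v3": ["acme", "corp"],
    }
    print(deduplicate(sample))
```

In this sketch `v1` and `v2` share two of four distinct tokens, so they meet the default threshold and receive the same stamp, while `v3` keeps its own ID. A distributed version would need to avoid the all-pairs loop, for example by comparing only vertices that share a blocking key, but how the framework does this is not stated in the abstract.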