During the presentation, we will share our experience in building a knowledge graph leveraging Spark, NLP, and Machine Learning. We will start by explaining the business problems and challenges, then walk through our data pipeline, including text analytics processes, name similarity solutions, street address normalization, clustering algorithms, and confidence level building. At the end, we will discuss the business impact and the takeaways.
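To make the name similarity and clustering steps of such a pipeline more concrete, here is a minimal sketch in plain Python. It is not the speakers' implementation, which runs on Spark and is not described in detail here. The function names (`normalize`, `similarity`, `cluster_names`), the 0.85 threshold, and the choice of `difflib.SequenceMatcher` with union-find clustering are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(cleaned.split())


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two normalized names (illustrative metric)."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def cluster_names(names: list[str], threshold: float = 0.85) -> list[list[str]]:
    """Link names whose similarity meets the threshold, then group linked names.

    Clusters are the connected components of the link graph, found with union-find.
    The threshold value is a hypothetical choice, not one taken from the talk.
    """
    parent = list(range(len(names)))

    def find(i: int) -> int:
        # Follow parent pointers to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Entity-to-entity linking: compare every pair (quadratic, fine for a sketch).
    for i, j in combinations(range(len(names)), 2):
        if similarity(names[i], names[j]) >= threshold:
            parent[find(i)] = find(j)

    # Cluster building: collect names under their root.
    clusters: dict[int, list[str]] = {}
    for i, name in enumerate(names):
        clusters.setdefault(find(i), []).append(name)
    return list(clusters.values())
```

Calling `cluster_names(["John A. Smith", "John A Smith", "Jon A. Smith", "Maria Garcia"])` groups the three Smith variants into one cluster and leaves the fourth name on its own. At real scale, comparing all pairs is impractical, so a production pipeline would typically add a blocking step that restricts comparisons to plausible candidates.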
Dmytro Dolgopolov and Chen Zhang are from FINRA, an organization that protects investors and handles massive amounts of data: it can run up to 50,000 compute nodes per day and process up to 135 billion market events per day. This talk focuses on how FINRA uses knowledge graphs, mainly for enterprise search and higher-level analytics. To explain this, the speakers discuss entity disambiguation with knowledge graphs: entities are extracted from the data, and the extracted entities are then linked to one another. After linking, the entities are grouped into clusters, and these clusters form disambiguation graphs that help identify unique entities. Dmytro then hands the mic to Chen, who provides insight into these steps and lists the obstacles they had to overcome to create a system like this. #knowledgegraphs #knowledgegraphconference #knowledgegraphbigdataprocessing #knowledgegraphbusiness