MatKG: The largest Knowledge Graph in Material Science
23m
In this work, we present MatKG, the largest knowledge graph in the field of materials science. It contains over 80,000 unique entities and over 5 million statements covering topical fields such as inorganic oxides, functional materials, battery materials, metals and alloys, polymers, cements, high-entropy alloys, biomaterials, and catalysts. The triples are generated autonomously by data-driven natural language processing pipelines from a corpus of around 4 million published scientific articles. Entity types such as materials, properties, application areas, synthesis information, and characterization methods are integrated under a hierarchical ontological schema: the base relations are extracted through statistical correlations, and higher-level ontologies are then appended to them. We show that a graph representation model trained on MatKG can perform link prediction, which allows materials to be associated with novel properties and applications, and vice versa.
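To make the idea of base relations extracted through statistical correlations more concrete, the following is a minimal sketch that uses plain document-level co-occurrence counts. It is not the MatKG pipeline: the toy corpus, the entity tags, the relation naming scheme, the function name `cooccurrence_triples`, and the `min_count` threshold are all assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus: each "document" is the set of (entity, type) tags
# that an entity-recognition step might have produced. These values are
# illustrative and are not taken from MatKG.
DOCS = [
    {("LiCoO2", "material"), ("cathode", "application"), ("XRD", "characterization")},
    {("LiCoO2", "material"), ("cathode", "application")},
    {("LiFePO4", "material"), ("cathode", "application"), ("XRD", "characterization")},
    {("TiO2", "material"), ("photocatalysis", "application")},
]


def cooccurrence_triples(docs, min_count=2):
    """Return (head, relation, tail, count) tuples for pairs of entities of
    different types that appear together in at least `min_count` documents."""
    counts = Counter()
    for doc in docs:
        # Sorting gives every pair a stable orientation across documents.
        for a, b in combinations(sorted(doc), 2):
            if a[1] != b[1]:  # only relate entities of different types
                counts[(a, b)] += 1
    return [
        (a[0], f"{a[1]}-{b[1]}", b[0], n)
        for (a, b), n in sorted(counts.items())
        if n >= min_count
    ]


triples = cooccurrence_triples(DOCS)
for head, relation, tail, n in triples:
    print(head, relation, tail, n)
```

Raw counts are the simplest possible correlation measure. A real pipeline would more plausibly normalise by entity frequency so that very common terms do not dominate, and would pass the resulting triples to a graph embedding model for link prediction.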