Peter Rose | Integrating Heterogeneous Data Sources Into A COVID-19 Graph
24m
The COVID-19 pandemic has mobilized researchers worldwide to investigate many aspects of the outbreak, from case statistics, patient demographics, and transportation modeling to epidemiological studies and viral genome sequencing. Relevant data are produced and publicly shared at an unprecedented pace and updated daily. Given the urgency of the outbreak and the high velocity and variety of pandemic-related data, efforts have not focused on data interoperability across domains. The avalanche of COVID-19-related data streams from agencies and from public and private research teams, produced with little coordination and without reliance on interoperability best practices, creates enormous challenges for researchers attempting to analyze the pandemic in all its multidisciplinary complexity and to develop a comprehensive policy response. With data collection and analysis efforts largely fragmented and siloed, these challenges can be addressed by rolling out a comprehensive semantic integration platform that organizes available information into an easily queryable transdisciplinary knowledge system. We developed the COVID-19-Net Knowledge Graph, which integrates epidemiological, biological, and population characteristic data. The talk will discuss the challenges of integrating data across diverse domains, proposed solutions, and calls to action to prepare for future outbreaks.