DRUGS4COVID: KG about drugs used in the clinical control of the coronavirus
Natural Language Processing (NLP) Track | KGC 2023
26m
The main objective of Drugs4Covid is to create resources, following Open Science principles, that facilitate the extraction of knowledge from scientific literature on the coronavirus. These resources can be used by scientific communities researching SARS-CoV-2/COVID-19 and by therapeutic communities, laboratories, and others who wish to find and understand relationships between symptoms, drugs, and active principles, along with their documentary evidence. We have created a best-practice guide that reviews and documents the steps required to build knowledge graphs from sets of scientific articles. Because data collected as written text must be structured in the early stages, we have developed named entity recognition models that identify the drugs, diseases, genes, and proteins mentioned in scientific articles. The discovery of relationships between these entities, whether explicit (mentioned in the texts themselves) or implicit (derived from their semantic representations), has been approached through representation models based on probabilistic topics. Specifically, we studied how well representations based on dynamic topics capture the relevance of drugs in the treatment of coronavirus. To add meaning to the annotations and describe the evidence that we automatically extract when processing scientific articles, we have created the EBOCA ontology, which models the associations between biomedical concepts supported by scientific literature. As a final result, we published the Drugs4Covid Knowledge Graph, which contains evidence linking the drugs and diseases mentioned in the CORD-19 corpus, a collection of scientific publications on coronaviruses from the last 50 years.
Finally, to make all this information accessible to people who are not experts in semantic technologies, we have created a natural-language question-answering interface that queries the knowledge graph's content together with external knowledge bases such as DBpedia and Wikidata.
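To make the idea of "evidence between drugs and diseases" concrete, here is a minimal sketch of how co-mentions in article text could be collected as evidence records. It is not the Drugs4Covid pipeline: the project uses trained named entity recognition models and the EBOCA ontology, whereas this sketch substitutes a plain dictionary lookup. The lexicons `DRUGS` and `DISEASES`, the function names, the paper identifiers, and the sample sentences are all hypothetical and exist only for illustration.

```python
import re
from collections import defaultdict

# Hypothetical toy lexicons standing in for trained NER models.
DRUGS = {"remdesivir", "dexamethasone"}
DISEASES = {"covid-19", "pneumonia"}


def find_entities(sentence, lexicon):
    """Return the lexicon terms that appear as tokens in the sentence."""
    tokens = set(re.findall(r"[a-z0-9\-]+", sentence.lower()))
    return sorted(tokens & lexicon)


def extract_evidence(docs):
    """Map each (drug, disease) pair to the sentences where both co-occur.

    docs: dict of paper_id -> text.
    Returns: dict of (drug, disease) -> list of (paper_id, sentence).
    """
    evidence = defaultdict(list)
    for paper_id, text in docs.items():
        # Naive sentence split on terminal punctuation followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            for drug in find_entities(sentence, DRUGS):
                for disease in find_entities(sentence, DISEASES):
                    evidence[(drug, disease)].append((paper_id, sentence.strip()))
    return dict(evidence)


# Invented sample text, not taken from CORD-19.
docs = {
    "paper-1": "Remdesivir was evaluated in patients with COVID-19. Outcomes varied.",
    "paper-2": "Dexamethasone was administered to patients with COVID-19 pneumonia.",
}

evidence = extract_evidence(docs)
for (drug, disease), mentions in sorted(evidence.items()):
    for paper_id, sentence in mentions:
        print(f"{drug} -- {disease} [{paper_id}]: {sentence}")
```

Each record keeps the source paper and the supporting sentence next to the drug-disease pair, which is the kind of provenance an evidence-oriented graph needs. Sentence-level co-occurrence is only a rough proxy for an actual relationship, so a real system would add relation classification or the topic-based representations described above.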