Head and Tail Entity Fusion Model in Medical Knowledge Graph Construction: Case Study in Pituitary Adenoma (Preprint)

An Fang, Pei Lou, Jiahui Hu, Wanqing Zhao, Ming Feng, Huiling Ren, Xianlai Chen
2021 JMIR Medical Informatics  
Pituitary adenoma is one of the most common central nervous system tumors. Its diagnosis and treatment remain very difficult: misdiagnosis and recurrence often occur, and there is a serious shortage of experienced neurosurgeons. A knowledge graph can help interns quickly understand the medical knowledge related to pituitary tumors. This study presents a data fusion method suitable for medical data, integrates data on pituitary adenoma from different sources, constructs a pituitary adenoma knowledge graph, and uses the knowledge graph for knowledge discovery. A complete framework for constructing medical knowledge graphs is presented and used to build a knowledge graph for pituitary adenoma (KGPA). The schema of the KGPA was constructed manually. Information on pituitary adenoma was automatically extracted from Chinese electronic medical records (CEMRs) and medical websites using a CRF model and web wrappers that the authors designed. An entity fusion method based on the head and tail entity fusion model was proposed to fuse data from heterogeneous sources. Data were extracted from 300 CEMRs of pituitary adenoma and 4 health portals, and entity fusion was carried out using the proposed data fusion model. The F1 scores of head and tail entity fusion were 97.32% and 98.57%, respectively. Triples selected from the KGPA were evaluated, and the accuracy was 95.4%. This paper introduces an approach for fusing triples extracted from heterogeneous data sources, which can be used to build a knowledge graph. The evaluation results show that the data in the KGPA are of high quality. The constructed KGPA can help physicians in their clinical practice.
doi:10.2196/28218 pmid:34057414 pmcid:PMC8367125 fatcat:cptryunflfezngiyswddsnzccu