Mining Biomedical Ontologies and Data Using RDF Hypergraphs

Haishan Liu, Dejing Dou, Ruoming Jin, Paea Lependu, Nigam Shah
2013 2013 12th International Conference on Machine Learning and Applications  
As researchers analyze huge amounts of data annotated by large biomedical ontologies, one of the major challenges for data mining and machine learning is to leverage both ontologies and data together in a systematic and scalable way. In this paper, we address two interesting and related problems for mining biomedical ontologies and data: i) how to discover semantic associations with the help of formal ontologies; ii) how to identify potential errors in the ontologies with the help of data. By representing both ontologies and data using RDF hypergraphs, and subsequently transforming the hypergraphs to corresponding bipartite forms, we provide a generalized data mining method that scales beyond what existing ontology-based approaches can provide. By performing evaluations on a real-world electronic health dataset and NCBO ontologies, we show that the proposed method is indeed capable of capturing semantic associations while seamlessly incorporating domain knowledge in ontologies. We also show that our data mining methods can discover and suggest corrections for misinformation in biomedical ontologies.
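The hypergraph-to-bipartite transformation mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common reading in which each RDF triple is a hyperedge over its subject, predicate, and object, and the bipartite form adds one statement node per triple linked to those three values. This is not the authors' implementation; the function names and the toy triples are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def triples_to_bipartite(triples):
    """Turn (subject, predicate, object) triples into a bipartite graph.

    One side holds value nodes (every distinct term), the other holds one
    statement node per triple. Each edge records the role the value plays.
    """
    value_nodes = set()
    statement_nodes = []
    edges = []  # (statement_id, value, role)
    for i, (s, p, o) in enumerate(triples):
        stmt = f"st{i}"  # hypothetical naming scheme for statement nodes
        statement_nodes.append(stmt)
        for role, value in (("subject", s), ("predicate", p), ("object", o)):
            value_nodes.add(value)
            edges.append((stmt, value, role))
    return value_nodes, statement_nodes, edges


def co_occurring(edges, value):
    """Return the values that share at least one statement node with `value`."""
    by_stmt = defaultdict(set)
    for stmt, v, _role in edges:
        by_stmt[stmt].add(v)
    related = set()
    for members in by_stmt.values():
        if value in members:
            related |= members
    return related - {value}


# Toy data (hypothetical): two data triples and one ontology triple.
triples = [
    ("patient1", "hasDiagnosis", "Diabetes"),
    ("patient1", "takesDrug", "Metformin"),
    ("Diabetes", "subClassOf", "MetabolicDisease"),
]
values, statements, edges = triples_to_bipartite(triples)
```

In this sketch, data triples and ontology triples end up in the same graph, so a term such as `Diabetes` is connected both to patient records and to its place in the class hierarchy, which is the kind of joint representation the abstract describes.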
doi:10.1109/icmla.2013.31 dblp:conf/icmla/LiuDJLS13 fatcat:ftgbjka5lnb7vkc53utif4pkje