LINKING ONTOLOGICAL RESOURCES USING AGGREGATABLE SUBSTANCE IDENTIFIERS TO ORGANIZE EXTRACTED RELATIONS

BYRON MARSHALL, HUA SU, DANIEL MCDONALD, HSINCHUN CHEN
2004 Biocomputing 2005  
Systems that extract biological regulatory pathway relations from free-text sources are intended to help researchers leverage vast and growing collections of research literature. Several systems to extract such relations have been developed, but little work has focused on how those relations can be usefully organized (aggregated) to support visualization systems or analysis algorithms. Ontological resources that enumerate name strings for different types of biomedical objects should play a key role in the organization process. In this paper we delineate five potentially useful levels of relational granularity and propose the use of aggregatable substance identifiers to help reduce lexical ambiguity. An aggregatable substance identifier applies to a gene and its products. We merged four extensive lexicons and compared the extracted strings to the text of five million MEDLINE abstracts. We report on the ambiguity within and between name strings and common English words. Our results show an 89% reduction in ambiguity for the extracted human substance name strings when using an aggregatable substance approach.
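
The idea of an aggregatable substance identifier can be illustrated with a minimal sketch. It assumes a toy lexicon in which every name string carries both a fine-grained object identifier (gene or protein) and a shared substance identifier covering a gene and its products. The entries, the identifier scheme, and the helper `ambiguous_names` are hypothetical and are not taken from the paper's merged lexicons or its method.

```python
from collections import defaultdict

# Hypothetical toy lexicon: (name string, fine-grained object id, aggregatable substance id).
# These rows are invented for illustration; they are not the paper's data.
ENTRIES = [
    ("p53", "gene:TP53", "sub:TP53"),
    ("p53", "protein:P53_HUMAN", "sub:TP53"),
    ("tp53", "gene:TP53", "sub:TP53"),
    ("cat", "gene:CAT", "sub:CAT"),
    ("cat", "protein:CATA_HUMAN", "sub:CAT"),
    ("p21", "gene:CDKN1A", "sub:CDKN1A"),
    ("p21", "gene:HRAS", "sub:HRAS"),
]


def ambiguous_names(entries, id_column):
    """Return the name strings that map to more than one distinct identifier
    in the chosen column (1 = fine-grained id, 2 = substance id)."""
    targets = defaultdict(set)
    for entry in entries:
        targets[entry[0].lower()].add(entry[id_column])
    return {name for name, ids in targets.items() if len(ids) > 1}


# Names that are ambiguous when genes and their products are kept apart.
fine = ambiguous_names(ENTRIES, 1)
# Names that remain ambiguous after a gene and its products share one identifier.
aggregated = ambiguous_names(ENTRIES, 2)

print(sorted(fine))
print(sorted(aggregated))
```

In this toy data, a name shared by a gene and its own product stops counting as ambiguous once both map to the same substance identifier, while a name shared by two unrelated genes stays ambiguous. The sketch says nothing about the size of the reduction reported in the abstract.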
doi:10.1142/9789812702456_0016 fatcat:w2ow6nb2xvag7evp6fdwalj27q