Sofie Van Landeghem, Bernard De Baets, Yves Van de Peer, Yvan Saeys
2011 Computational Intelligence  
We have developed a machine learning framework to accurately extract complex genetic interactions from text. Employing type-specific classifiers, this framework processes research articles to extract various biological events. Subsequently, the algorithm identifies regulation events that take other events as arguments, allowing a nested structure of predictions. All predictions are merged into an integrated network, useful for visualization and for deduction of new biological knowledge. In this paper, we discuss several design choices for an event-based extraction framework. These detailed studies help improve on existing systems, as illustrated by the 10% relative performance gain of our system over the official results of the recent BioNLP'09 Shared Task. Our framework now achieves state-of-the-art performance with 37.43 recall, 54.81 precision and 44.48 F-score. We further present the first study of feature selection for bio-molecular event extraction from text. While producing more cost-effective models, feature selection can also lead to better insight into the complexity of the challenge. Finally, this paper tries to bridge the gap between theoretical relation extraction from text and experimental work on bio-molecular interactions by discussing opportunities to employ event-based text mining tools for real-life tasks such as hypothesis generation, database curation and knowledge discovery.
doi:10.1111/j.1467-8640.2011.00403.x fatcat:wmvblcmuvbaznjivph2by27ebm
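
The recall, precision and F-score quoted in the abstract can be cross-checked with a few lines of Python. This is a minimal sketch, assuming the reported F-score is the balanced F1 measure (the harmonic mean of precision and recall); the abstract does not state which variant is used, and the function name `f_score` is purely illustrative.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: harmonic mean of precision and recall.

    Assumption: the abstract's "F-score" is the balanced (beta = 1) variant.
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the abstract, in percent.
precision, recall = 54.81, 37.43
print(round(f_score(precision, recall), 2))  # → 44.48
```

Under that assumption, the three reported figures are mutually consistent.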