EDGAR: Extraction of Drugs, Genes And Relations from the Biomedical Literature

Thomas C. Rindflesch, Lorraine Tanabe, John N. Weinstein, Lawrence Hunter
1999 Biocomputing 2000  
EDGAR (Extraction of Drugs, Genes and Relations) is a natural language processing system that extracts information about drugs and genes relevant to cancer from the biomedical literature. This automatically extracted information has remarkable potential to facilitate computational analysis in the molecular biology of cancer, and the technology is straightforwardly generalizable to many areas of biomedicine. This paper reports on the mechanisms for automatically generating such assertions and on a simple application, conceptual clustering of documents. The system uses a stochastic part-of-speech tagger, generates an underspecified syntactic parse, and then uses semantic and pragmatic information to construct its assertions. The system builds on two important existing resources: the MEDLINE database of biomedical citations and abstracts, and the Unified Medical Language System, which provides syntactic and semantic information about the terms found in biomedical abstracts.
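The general idea of turning a sentence into a drug-gene assertion can be illustrated with a toy sketch. This is not EDGAR's implementation: the lexicon below is a hypothetical stand-in for the semantic types a resource like the Unified Medical Language System would supply, the cue verbs are invented for illustration, and a simple nearest-term rule replaces the tagger and underspecified parse described in the abstract.

```python
import re
from typing import List, Tuple

# Hypothetical toy lexicon standing in for semantic-type lookup.
# The entries are illustrative assumptions, not data from the paper.
LEXICON = {
    "cisplatin": "Drug",
    "tamoxifen": "Drug",
    "bcl-2": "Gene",
    "p53": "Gene",
}

# Hypothetical relation cue verbs (assumption, not from the paper).
RELATION_VERBS = {"inhibits", "induces", "activates"}


def tokenize(sentence: str) -> List[str]:
    """Lowercase the sentence and split it into tokens, keeping internal hyphens."""
    return re.findall(r"[A-Za-z0-9][A-Za-z0-9\-]*", sentence.lower())


def extract_assertions(sentence: str) -> List[Tuple[str, str, str]]:
    """Return (subject, verb, object) triples linking one drug and one gene."""
    tokens = tokenize(sentence)
    typed = [(tok, LEXICON.get(tok)) for tok in tokens]
    assertions = []
    for i, (tok, _) in enumerate(typed):
        if tok not in RELATION_VERBS:
            continue
        # Nearest typed term on each side of the verb (a crude proxy for parsing).
        left = [(t, s) for t, s in typed[:i] if s]
        right = [(t, s) for t, s in typed[i + 1:] if s]
        if left and right:
            subj, obj = left[-1], right[0]
            # Keep only pairs consisting of exactly one drug and one gene.
            if {subj[1], obj[1]} == {"Drug", "Gene"}:
                assertions.append((subj[0], tok, obj[0]))
    return assertions


if __name__ == "__main__":
    print(extract_assertions(
        "Cisplatin inhibits bcl-2 expression in ovarian cancer cells."
    ))
    # prints [('cisplatin', 'inhibits', 'bcl-2')]
```

A real system would need syntactic analysis to handle passives, negation, and coordination, which the nearest-term rule in this sketch ignores.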
doi:10.1142/9789814447331_0049 fatcat:dyebl3wchrfvfb3zukr2lypj3u